Subquery to select max(date) value

Started by Rich Shepardabout 7 years ago50 messagesgeneral
Jump to latest
#1Rich Shepard
rshepard@appl-ecosys.com

The query is to return the latest next_contact date for each person. Using
the max() aggregate function and modeling the example of lo_temp on page 13
of the rel. 10 manual I wrote this statement:

select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name, A.next_contact)
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
A.next_contact = select (max(A.next_contact) from A)
group by A.next_contact, P.person_id;

The syntax error returned by psql is:

psql:next_contact_date.sql:7: ERROR: syntax error at or near "select"
LINE 4: A.next_contact = select (max(A.next_contact) from A)
^
and I fail to see what I've done incorrectly.

Do I need to insert DISTINCT ON in the main or sub query? If so, what is the
correct syntax to extract all desired columns from each selected row?

If this is covered in the manual please point me to the proper section; if
not, please educate me on the appropriate syntax to produce the desired
output.

TIA,

Rich

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#1)
Re: Subquery to select max(date) value

On Tue, Feb 12, 2019 at 3:24 PM Rich Shepard <rshepard@appl-ecosys.com> wrote:

psql:next_contact_date.sql:7: ERROR: syntax error at or near "select"
LINE 4: A.next_contact = select (max(A.next_contact) from A)
^
and I fail to see what I've done incorrectly.

You put the open parenthesis after the word select instead of before.

A.next_contact = (SELECT max(A.next_contact) FROM A)

David J.

#3Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#2)
Re: Subquery to select max(date) value [RESOLVED]

On Tue, 12 Feb 2019, David G. Johnston wrote:

You put the open parenthesis after the word select instead of before.
A.next_contact = (SELECT max(A.next_contact) FROM A)

David.

Color me suitably embarrassed.

Thank you,

Rich

#4Rich Shepard
rshepard@appl-ecosys.com
In reply to: Rich Shepard (#1)
Re: Subquery to select max(date) value

On Tue, 12 Feb 2019, Jeff Ross wrote:

Try (select (max(A.next_contact) from A)

Thanks, Jeff.

The syntax accepted by psql is
A.next_contact = (select (max(A.next_contact)) from Activities as A)
but the date is not included in the output.

The revised statement is now:
select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name, A.next_contact)
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
A.next_contact = (select (max(A.next_contact)) from Activities as A)
group by A.next_contact, O.org_id, P.person_id;

This produces the first 5 colums in the outer select but no next_contact
date. When I move A.next_contact to the head of the select list each row
begins with 'infinity'.

I'm still missing the requisite knowledge.

Rich

#5Rich Shepard
rshepard@appl-ecosys.com
In reply to: Rich Shepard (#4)
Re: Subquery to select max(date) value

On Tue, 12 Feb 2019, Rich Shepard wrote:

A.next_contact = (select (max(A.next_contact)) from Activities as A)

Errata:

The parentheses around the max aggregate are not necessary.

A.next_contact now displays at the end of each returned row as 'infinity'.

Rich

#6Matt Zagrabelny
mzagrabe@d.umn.edu
In reply to: Rich Shepard (#1)
Re: Subquery to select max(date) value

Hey Rich,

On Tue, Feb 12, 2019 at 4:24 PM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

The query is to return the latest next_contact date for each person. Using
the max() aggregate function and modeling the example of lo_temp on page 13
of the rel. 10 manual I wrote this statement:

I use DISTINCT ON and ORDER BY to get the single latest value from a table
with multiple date entries:

https://stackoverflow.com/questions/9795660/postgresql-distinct-on-with-different-order-by

HTH,

-m

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#5)
Re: Subquery to select max(date) value

On 2/12/19 2:48 PM, Rich Shepard wrote:

On Tue, 12 Feb 2019, Rich Shepard wrote:

     A.next_contact = (select (max(A.next_contact)) from Activities as A)

Errata:

The parentheses around the max aggregate are not necessary.

A.next_contact now displays at the end of each returned row as 'infinity'.

'infinity' is the max date, so this is what you want?

Rich

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Ken Tanzer
ken.tanzer@gmail.com
In reply to: Rich Shepard (#5)
Re: Subquery to select max(date) value

On Tue, Feb 12, 2019 at 2:48 PM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

On Tue, 12 Feb 2019, Rich Shepard wrote:

A.next_contact = (select (max(A.next_contact)) from Activities as A)

Errata:

The parentheses around the max aggregate are not necessary.

A.next_contact now displays at the end of each returned row as 'infinity'.

Your subquery isn't doing anything to match on person_id, so it's going to

match all the records with the highest next_contact in activities.

I think you want something more like:

A.next_contact = (select (max(A.next_contact)) from Activities as A2 WHERE
A2.person_id=A.person_id)

Or, for that matter, since next_contact is all that you're drawing from
activities, you can also just put it in the select:

select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact) from Activities as A WHERE
p.person_id=A.person_id)
FROM ...

Cheers,
Ken
--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#9Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#7)
Re: Subquery to select max(date) value

On Tue, 12 Feb 2019, Adrian Klaver wrote:

'infinity' is the max date, so this is what you want?

Adrian,

Nope. When I went to make a cup of coffee I realized that I need the other
date constraints (such as IS NOT NULL), too. I'm re-wording the statement to
put everything in the correct order. Will probably try DISTINCT ON, too, if
that makes it simpler or faster.

Thanks,

Rich

#10Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ken Tanzer (#8)
Re: Subquery to select max(date) value

On Tue, 12 Feb 2019, Ken Tanzer wrote:

select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact) from Activities as A WHERE
p.person_id=A.person_id)
FROM ...

Ken,

Yes, cheers indeed. A bit of thinking and re-organizing resulted in a
working statement that's close to what I want:

select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact)))
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
/*A.next_contact = (select max(A.next_contact) from Activities as A) and */
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null
group by A.next_contact, O.org_id, P.person_id;

The two issues I now focus on resolving are the multiple rows per person
rather than only the most recent and the date displayed at the end of each
output row. DISTINCT ON will eliminate the first issue.

Thanks,

Rich

#11Ken Tanzer
ken.tanzer@gmail.com
In reply to: Rich Shepard (#10)
Re: Subquery to select max(date) value

Ken,

Yes, cheers indeed. A bit of thinking and re-organizing resulted in a
working statement that's close to what I want:

select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact)))
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
/*A.next_contact = (select max(A.next_contact) from Activities as
A) and */
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null
group by A.next_contact, O.org_id, P.person_id;

The two issues I now focus on resolving are the multiple rows per person
rather than only the most recent and the date displayed at the end of each
output row. DISTINCT ON will eliminate the first issue.

If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...

select
p.person_id,
p.lname,
p.fname,
p.direct_phone,
o.org_name,
a.next_contact
from
people as p
LEFT JOIN organizations o USING (person_id)
LEFT JOIN (
SELECT
DISTINCT ON (person_id)
person_id,
next_contact
FROM activities a
-- WHERE ???
ORDER BY person_id,next_contact DESC
) a USING (person_id)
;

Cheers,
Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#12Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ken Tanzer (#11)
Re: Subquery to select max(date) value

On Tue, 12 Feb 2019, Ken Tanzer wrote:

If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...

<clip>

Ken,

I'll work with your example. This looks most promising.

What I want is a report of folks I need to contact. Some have next_contact
dates in the past when I did not act so I need to do so as long as there is
a next_contact date (no NULLs) and the prospect is active.

When I have a fully working statement I'll post it to the list so others can
read the accepted working solution.

Thanks again,

Rich

#13Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Rich Shepard (#9)
Re: Subquery to select max(date) value

"Rich" == Rich Shepard <rshepard@appl-ecosys.com> writes:

Rich> Will probably try DISTINCT ON, too, if that makes it simpler or
Rich> faster.

You want LATERAL.

--
Andrew (irc:RhodiumToad)

#14Jan Kohnert
nospam001-lists@jan-kohnert.de
In reply to: Rich Shepard (#1)
Re: Subquery to select max(date) value

Am Dienstag, 12. Februar 2019, 23:23:49 CET schrieb Rich Shepard:

The query is to return the latest next_contact date for each person. Using
the max() aggregate function and modeling the example of lo_temp on page 13
of the rel. 10 manual I wrote this statement:

You don't even need a subselect to do so:

select
P.person_id, P.lname, P.fname, P.direct_phone,
O.org_name, max(A.next_contact) as next_contact
from People as P
join Organizations as O
on P.org_id = O.org_id
join Activities as A
on P.person_id = A.person_id
group by
P.person_id, P.lname, P.fname,
P.direct_phone, O.org_name;

--
Kind regards Jan

#15Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#13)
Re: Subquery to select max(date) value

On Wed, 13 Feb 2019, Andrew Gierth wrote:

Rich> Will probably try DISTINCT ON, too, if that makes it simpler or
Rich> faster.

You want LATERAL.

Andrew,

That's new to me so I'll read about it.

Thanks,

Rich

#16Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ken Tanzer (#11)
Re: Subquery to select max(date) value

On Tue, 12 Feb 2019, Ken Tanzer wrote:

If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...

select
p.person_id,
p.lname,
p.fname,
p.direct_phone,
o.org_name,
a.next_contact
from
people as p
LEFT JOIN organizations o USING (person_id)
LEFT JOIN (
SELECT
DISTINCT ON (person_id)
person_id,
next_contact
FROM activities a
-- WHERE ???
ORDER BY person_id,next_contact DESC
) a USING (person_id)
;

I modified this to restrict the time and ignore nulls by replacing the
question marks:

WHERE a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
a.next_contact is not null

For a reason I've not yet found, the last condition is not observed; i.e.,
those rows with null next_contact dates appear in the results. Position in
the sequence makes no difference. What might cause this?

Regards,

Rich

#17Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#16)
Re: Subquery to select max(date) value

On 2/13/19 6:28 AM, Rich Shepard wrote:

On Tue, 12 Feb 2019, Ken Tanzer wrote:

If that's getting you what you want, then great and more power to
you.  It
looks like you'll only get people who have a next_contact in your target
window there.  You might also consider something like this...

select
   p.person_id,
   p.lname,
   p.fname,
   p.direct_phone,
   o.org_name,
   a.next_contact
from
   people as p
   LEFT JOIN organizations o USING (person_id)
   LEFT JOIN (
       SELECT
           DISTINCT ON (person_id)
           person_id,
           next_contact
       FROM activities a
       -- WHERE ???
       ORDER BY person_id,next_contact DESC
   ) a USING (person_id)
;

I modified this to restrict the time and ignore nulls by replacing the
question marks:

WHERE a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
      a.next_contact is not null

For a reason I've not yet found, the last condition is not observed; i.e.,
those rows with null next_contact dates appear in the results. Position in
the sequence makes no difference. What might cause this?

The LEFT JOIN. There are rows in people for which there no records
coming from the sub-select on activities, so the row is 'padded' with
NULL values for the missing data.

Regards,

Rich

--
Adrian Klaver
adrian.klaver@aklaver.com

#18Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#17)
Re: Subquery to select max(date) value

On Wed, 13 Feb 2019, Adrian Klaver wrote:

The LEFT JOIN. There are rows in people for which there no records coming
from the sub-select on activities, so the row is 'padded' with NULL values
for the missing data.

Adrian,

I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.

Thanks,

Rich

#19Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#18)
Re: Subquery to select max(date) value

On 2/13/19 7:37 AM, Rich Shepard wrote:

On Wed, 13 Feb 2019, Adrian Klaver wrote:

The LEFT JOIN. There are rows in people for which there no records coming
from the sub-select on activities, so the row is 'padded' with NULL
values
for the missing data.

Adrian,

I assume it's the inner left join. I'll trace what's happening at each step

AFAIK there is no inner left join:
https://www.postgresql.org/docs/10/sql-select.html#SQL-FROM
"
join_type

One of

[ INNER ] JOIN

LEFT [ OUTER ] JOIN

RIGHT [ OUTER ] JOIN

FULL [ OUTER ] JOIN

CROSS JOIN
"

and learn where to specify no nulls.

You can't it is the nature of the join:

"LEFT OUTER JOIN returns all rows in the qualified Cartesian product
(i.e., all combined rows that pass its join condition), plus one copy of
each row in the left-hand table for which there was no right-hand row
that passed the join condition. This left-hand row is extended to the
full width of the joined table by inserting null values for the
right-hand columns. Note that only the JOIN clause's own condition is
considered while deciding which rows have matches. Outer conditions are
applied afterwards."

What you are seeing are records for which there is a person in the
people table that either:

1) Do not have a record under that person_id in the activities table

or

2) Do not have a record that meets the date filtering you are doing on
next_contact.

Thanks,

Rich

--
Adrian Klaver
adrian.klaver@aklaver.com

#20Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#18)
Re: Subquery to select max(date) value

On 2/13/19 7:37 AM, Rich Shepard wrote:

On Wed, 13 Feb 2019, Adrian Klaver wrote:

The LEFT JOIN. There are rows in people for which there no records coming
from the sub-select on activities, so the row is 'padded' with NULL
values
for the missing data.

Adrian,

I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.

Should have been clearer in my previous post, you can get rid of the
nulls by filtering out the entire row.

Thanks,

Rich

--
Adrian Klaver
adrian.klaver@aklaver.com

#21Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#19)
#22Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#20)
#23Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#22)
#24Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#23)
#25Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#18)
#26Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Adrian Klaver (#25)
#27Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#26)
#28Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#27)
#29Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#28)
#30Brent Wood
Brent.Wood@niwa.co.nz
In reply to: Rich Shepard (#29)
#31Rich Shepard
rshepard@appl-ecosys.com
In reply to: Brent Wood (#30)
#32Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#13)
#33Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Rich Shepard (#32)
#34Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#33)
#35Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#33)
#36Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Rich Shepard (#35)
#37Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#36)
#38Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#36)
#39Ken Tanzer
ken.tanzer@gmail.com
In reply to: Rich Shepard (#38)
#40David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#38)
#41David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#39)
#42Ken Tanzer
ken.tanzer@gmail.com
In reply to: David G. Johnston (#41)
#43Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ken Tanzer (#39)
#44Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#40)
#45Ken Tanzer
ken.tanzer@gmail.com
In reply to: Rich Shepard (#43)
#46David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#44)
#47Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Rich Shepard (#44)
#48Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#46)
#49Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ken Tanzer (#45)
#50Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrew Gierth (#47)