Subquery to select max(date) value
The query is to return the latest next_contact date for each person. Using
the max() aggregate function and modeling the example of lo_temp on page 13
of the rel. 10 manual I wrote this statement:
select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name, A.next_contact)
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
A.next_contact = select (max(A.next_contact) from A)
group by A.next_contact, P.person_id;
The syntax error returned by psql is:
psql:next_contact_date.sql:7: ERROR: syntax error at or near "select"
LINE 4: A.next_contact = select (max(A.next_contact) from A)
^
and I fail to see what I've done incorrectly.
Do I need to insert DISTINCT ON in the main or sub query? If so, what is the
correct syntax to extract all desired columns from each selected row?
If this is covered in the manual please point me to the proper section; if
not, please educate me on the appropriate syntax to produce the desired
output.
TIA,
Rich
On Tue, Feb 12, 2019 at 3:24 PM Rich Shepard <rshepard@appl-ecosys.com> wrote:
psql:next_contact_date.sql:7: ERROR: syntax error at or near "select"
LINE 4: A.next_contact = select (max(A.next_contact) from A)
^
and I fail to see what I've done incorrectly.
You put the open parenthesis after the word select instead of before.
A.next_contact = (SELECT max(A.next_contact) FROM A)
David J.
On Tue, 12 Feb 2019, David G. Johnston wrote:
You put the open parenthesis after the word select instead of before.
A.next_contact = (SELECT max(A.next_contact) FROM A)
David.
Color me suitably embarrassed.
Thank you,
Rich
On Tue, 12 Feb 2019, Jeff Ross wrote:
Try (select (max(A.next_contact) from A)
Thanks, Jeff.
The syntax accepted by psql is
A.next_contact = (select (max(A.next_contact)) from Activities as A)
but the date is not included in the output.
The revised statement is now:
select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name, A.next_contact)
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
A.next_contact = (select (max(A.next_contact)) from Activities as A)
group by A.next_contact, O.org_id, P.person_id;
This produces the first 5 colums in the outer select but no next_contact
date. When I move A.next_contact to the head of the select list each row
begins with 'infinity'.
I'm still missing the requisite knowledge.
Rich
Import Notes
Reply to msg id not found: 2780074d-3b87-b154-8bc6-a0f898fbdde2@openvistas.net
On Tue, 12 Feb 2019, Rich Shepard wrote:
A.next_contact = (select (max(A.next_contact)) from Activities as A)
Errata:
The parentheses around the max aggregate are not necessary.
A.next_contact now displays at the end of each returned row as 'infinity'.
Rich
Hey Rich,
On Tue, Feb 12, 2019 at 4:24 PM Rich Shepard <rshepard@appl-ecosys.com>
wrote:
The query is to return the latest next_contact date for each person. Using
the max() aggregate function and modeling the example of lo_temp on page 13
of the rel. 10 manual I wrote this statement:
I use DISTINCT ON and ORDER BY to get the single latest value from a table
with multiple date entries:
https://stackoverflow.com/questions/9795660/postgresql-distinct-on-with-different-order-by
HTH,
-m
On 2/12/19 2:48 PM, Rich Shepard wrote:
On Tue, 12 Feb 2019, Rich Shepard wrote:
A.next_contact = (select (max(A.next_contact)) from Activities as A)
Errata:
The parentheses around the max aggregate are not necessary.
A.next_contact now displays at the end of each returned row as 'infinity'.
'infinity' is the max date, so this is what you want?
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Feb 12, 2019 at 2:48 PM Rich Shepard <rshepard@appl-ecosys.com>
wrote:
On Tue, 12 Feb 2019, Rich Shepard wrote:
A.next_contact = (select (max(A.next_contact)) from Activities as A)
Errata:
The parentheses around the max aggregate are not necessary.
A.next_contact now displays at the end of each returned row as 'infinity'.
Your subquery isn't doing anything to match on person_id, so it's going to
match all the records with the highest next_contact in activities.
I think you want something more like:
A.next_contact = (select (max(A.next_contact)) from Activities as A2 WHERE
A2.person_id=A.person_id)
Or, for that matter, since next_contact is all that you're drawing from
activities, you can also just put it in the select:
select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact) from Activities as A WHERE
p.person_id=A.person_id)
FROM ...
Cheers,
Ken
--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client>*
ken.tanzer@agency-software.org
(253) 245-3801
Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.
On Tue, 12 Feb 2019, Adrian Klaver wrote:
'infinity' is the max date, so this is what you want?
Adrian,
Nope. When I went to make a cup of coffee I realized that I need the other
date constraints (such as IS NOT NULL), too. I'm re-wording the statement to
put everything in the correct order. Will probably try DISTINCT ON, too, if
that makes it simpler or faster.
Thanks,
Rich
On Tue, 12 Feb 2019, Ken Tanzer wrote:
select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact) from Activities as A WHERE
p.person_id=A.person_id)
FROM ...
Ken,
Yes, cheers indeed. A bit of thinking and re-organizing resulted in a
working statement that's close to what I want:
select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact)))
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
/*A.next_contact = (select max(A.next_contact) from Activities as A) and */
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null
group by A.next_contact, O.org_id, P.person_id;
The two issues I now focus on resolving are the multiple rows per person
rather than only the most recent and the date displayed at the end of each
output row. DISTINCT ON will eliminate the first issue.
Thanks,
Rich
Ken,
Yes, cheers indeed. A bit of thinking and re-organizing resulted in a
working statement that's close to what I want:select (P.person_id, P.lname, P.fname, P.direct_phone, O.org_name,
(select max(A.next_contact)))
from People as P, Organizations as O, Activities as A
where P.org_id = O.org_id and P.person_id = A.person_id and
/*A.next_contact = (select max(A.next_contact) from Activities as
A) and */
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null
group by A.next_contact, O.org_id, P.person_id;The two issues I now focus on resolving are the multiple rows per person
rather than only the most recent and the date displayed at the end of each
output row. DISTINCT ON will eliminate the first issue.
If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...
select
p.person_id,
p.lname,
p.fname,
p.direct_phone,
o.org_name,
a.next_contact
from
people as p
LEFT JOIN organizations o USING (person_id)
LEFT JOIN (
SELECT
DISTINCT ON (person_id)
person_id,
next_contact
FROM activities a
-- WHERE ???
ORDER BY person_id,next_contact DESC
) a USING (person_id)
;
Cheers,
Ken
--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client>*
ken.tanzer@agency-software.org
(253) 245-3801
Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.
On Tue, 12 Feb 2019, Ken Tanzer wrote:
If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...
<clip>
Ken,
I'll work with your example. This looks most promising.
What I want is a report of folks I need to contact. Some have next_contact
dates in the past when I did not act so I need to do so as long as there is
a next_contact date (no NULLs) and the prospect is active.
When I have a fully working statement I'll post it to the list so others can
read the accepted working solution.
Thanks again,
Rich
"Rich" == Rich Shepard <rshepard@appl-ecosys.com> writes:
Rich> Will probably try DISTINCT ON, too, if that makes it simpler or
Rich> faster.
You want LATERAL.
--
Andrew (irc:RhodiumToad)
Am Dienstag, 12. Februar 2019, 23:23:49 CET schrieb Rich Shepard:
The query is to return the latest next_contact date for each person. Using
the max() aggregate function and modeling the example of lo_temp on page 13
of the rel. 10 manual I wrote this statement:
You don't even need a subselect to do so:
select
P.person_id, P.lname, P.fname, P.direct_phone,
O.org_name, max(A.next_contact) as next_contact
from People as P
join Organizations as O
on P.org_id = O.org_id
join Activities as A
on P.person_id = A.person_id
group by
P.person_id, P.lname, P.fname,
P.direct_phone, O.org_name;
--
Kind regards Jan
On Wed, 13 Feb 2019, Andrew Gierth wrote:
Rich> Will probably try DISTINCT ON, too, if that makes it simpler or
Rich> faster.You want LATERAL.
Andrew,
That's new to me so I'll read about it.
Thanks,
Rich
On Tue, 12 Feb 2019, Ken Tanzer wrote:
If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...select
p.person_id,
p.lname,
p.fname,
p.direct_phone,
o.org_name,
a.next_contact
from
people as p
LEFT JOIN organizations o USING (person_id)
LEFT JOIN (
SELECT
DISTINCT ON (person_id)
person_id,
next_contact
FROM activities a
-- WHERE ???
ORDER BY person_id,next_contact DESC
) a USING (person_id)
;
I modified this to restrict the time and ignore nulls by replacing the
question marks:
WHERE a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
a.next_contact is not null
For a reason I've not yet found, the last condition is not observed; i.e.,
those rows with null next_contact dates appear in the results. Position in
the sequence makes no difference. What might cause this?
Regards,
Rich
On 2/13/19 6:28 AM, Rich Shepard wrote:
On Tue, 12 Feb 2019, Ken Tanzer wrote:
If that's getting you what you want, then great and more power to
you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...select
p.person_id,
p.lname,
p.fname,
p.direct_phone,
o.org_name,
a.next_contact
from
people as p
LEFT JOIN organizations o USING (person_id)
LEFT JOIN (
SELECT
DISTINCT ON (person_id)
person_id,
next_contact
FROM activities a
-- WHERE ???
ORDER BY person_id,next_contact DESC
) a USING (person_id)
;I modified this to restrict the time and ignore nulls by replacing the
question marks:WHERE a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
a.next_contact is not nullFor a reason I've not yet found, the last condition is not observed; i.e.,
those rows with null next_contact dates appear in the results. Position in
the sequence makes no difference. What might cause this?
The LEFT JOIN. There are rows in people for which there no records
coming from the sub-select on activities, so the row is 'padded' with
NULL values for the missing data.
Regards,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On Wed, 13 Feb 2019, Adrian Klaver wrote:
The LEFT JOIN. There are rows in people for which there no records coming
from the sub-select on activities, so the row is 'padded' with NULL values
for the missing data.
Adrian,
I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.
Thanks,
Rich
On 2/13/19 7:37 AM, Rich Shepard wrote:
On Wed, 13 Feb 2019, Adrian Klaver wrote:
The LEFT JOIN. There are rows in people for which there no records coming
from the sub-select on activities, so the row is 'padded' with NULL
values
for the missing data.Adrian,
I assume it's the inner left join. I'll trace what's happening at each step
AFAIK there is no inner left join:
https://www.postgresql.org/docs/10/sql-select.html#SQL-FROM
"
join_type
One of
[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
FULL [ OUTER ] JOIN
CROSS JOIN
"
and learn where to specify no nulls.
You can't it is the nature of the join:
"LEFT OUTER JOIN returns all rows in the qualified Cartesian product
(i.e., all combined rows that pass its join condition), plus one copy of
each row in the left-hand table for which there was no right-hand row
that passed the join condition. This left-hand row is extended to the
full width of the joined table by inserting null values for the
right-hand columns. Note that only the JOIN clause's own condition is
considered while deciding which rows have matches. Outer conditions are
applied afterwards."
What you are seeing are records for which there is a person in the
people table that either:
1) Do not have a record under that person_id in the activities table
or
2) Do not have a record that meets the date filtering you are doing on
next_contact.
Thanks,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On 2/13/19 7:37 AM, Rich Shepard wrote:
On Wed, 13 Feb 2019, Adrian Klaver wrote:
The LEFT JOIN. There are rows in people for which there no records coming
from the sub-select on activities, so the row is 'padded' with NULL
values
for the missing data.Adrian,
I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.
Should have been clearer in my previous post, you can get rid of the
nulls by filtering out the entire row.
Thanks,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com