Extract only maximum date from column

Started by Rich Shepard4 months ago21 messagesgeneral
Jump to latest
#1Rich Shepard
rshepard@appl-ecosys.com

I want the script to extract only the maximum `next_contact' date and
haven't learned how to do this from my reading of web sites.

The script:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr, c.next_contact
order by p.person_nbr, p.company_nbr, max(c.next_contact);

returns all contacts rather than only the latest one.

Is using a sub-select the proper way?

TIA,

Rich

#2Bryan Sayer
blslists@gmail.com
In reply to: Rich Shepard (#1)
Re: Extract only maximum date from column

I believe in general you need

having c.next_contact = max(c.next_contact)

(at least in ANSI SQL) Use having for after the join has occured

*Bryan Sayer*
Retired Demographer/Statistician
/In a world in which you can be anything, *be kind*/

Show quoted text

On 12/4/2025 2:55 PM, Rich Shepard wrote:

I want the script to extract only the maximum `next_contact' date and
haven't learned how to do this from my reading of web sites.

The script:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr, c.next_contact
order by p.person_nbr, p.company_nbr, max(c.next_contact);

returns all contacts rather than only the latest one.

Is using a sub-select the proper way?

TIA,

Rich

#3Ron
ronljohnsonjr@gmail.com
In reply to: Rich Shepard (#1)
Re: Extract only maximum date from column

On Thu, Dec 4, 2025 at 2:55 PM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

I want the script to extract only the maximum `next_contact' date and
haven't learned how to do this from my reading of web sites.

The script:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr, c.next_contact
order by p.person_nbr, p.company_nbr, max(c.next_contact);

returns all contacts rather than only the latest one.

Is using a sub-select the proper way?

May not be the only way, but it's how I do it:
SELECT *
FROM foo
WHERE some_dt = (SELECT MAX(some_dt) FROM foo);

It might return more than one row...

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#1)
Re: Extract only maximum date from column

On Thursday, December 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:

I want the script to extract only the maximum `next_contact' date and
haven't learned how to do this from my reading of web sites.

The script:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr, c.next_contact
order by p.person_nbr, p.company_nbr, max(c.next_contact);

returns all contacts rather than only the latest one.

Is using a sub-select the proper way?

I would go with a lateral join subquery of the contracts table. Using an
aggregates to perform ranking is an anti-pattern. You want the contract
ranked first when ordered by contract_date. Either use a window function
to explicitly rank the contracts or use a limit/fetch clause to simply
return the first ordered one.

You also seem to have completely missed joining people to contracts -
suggest using explicit “join” clauses to make that error less likely.

David J.

#5Rich Shepard
rshepard@appl-ecosys.com
In reply to: Bryan Sayer (#2)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, Bryan Sayer wrote:

I believe in general you need

having c.next_contact = max(c.next_contact)

(at least in ANSI SQL) Use having for after the join has occurred

Bryan,

Postgresql didn't like that regardless of where I inserted the `having'
stanza.

Thanks,

Rich

#6Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ron (#3)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, Ron Johnson wrote:

May not be the only way, but it's how I do it:
SELECT *
FROM foo
WHERE some_dt = (SELECT MAX(some_dt) FROM foo);

It might return more than one row...

Ron,

Didn't quite work for me this way:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact (select max(c.next_contact) from c.contacts) >= '2025-11-01'
group by p.person_nbr, p.company_nbr
order by p.person_nbr, p.company_nbr;

Thanks,

Rich

#7Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#4)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, David G. Johnston wrote:

I would go with a lateral join subquery of the contracts table. Using an
aggregates to perform ranking is an anti-pattern. You want the contract
ranked first when ordered by contract_date. Either use a window function
to explicitly rank the contracts or use a limit/fetch clause to simply
return the first ordered one.

David,

It's 'contacts' rather than 'contracts'; a minor typo.

I'll read on making a lateral join. Thanks.

You also seem to have completely missed joining people to contracts -
suggest using explicit “join” clauses to make that error less likely.

Oops! My bad. Fixed.

Regards,

Rich

#8Bryan Sayer
blslists@gmail.com
In reply to: Rich Shepard (#5)
Re: Extract only maximum date from column

You don't include the where clause, just the having clause after the
group by.

At least that is what I remember. But it has been awhile.

*Bryan Sayer*
Retired Demographer/Statistician
/In a world in which you can be anything, *be kind*/

Show quoted text

On 12/4/2025 3:13 PM, Rich Shepard wrote:

On Thu, 4 Dec 2025, Bryan Sayer wrote:

I believe in general you need

having c.next_contact = max(c.next_contact)

(at least in ANSI SQL) Use having for after the join has occurred

Bryan,

Postgresql didn't like that regardless of where I inserted the `having'
stanza.

Thanks,

Rich

#9Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#4)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, David G. Johnston wrote:

I would go with a lateral join subquery of the contracts table. Using an
aggregates to perform ranking is an anti-pattern. You want the contract
ranked first when ordered by contract_date. Either use a window function
to explicitly rank the contracts or use a limit/fetch clause to simply
return the first ordered one.

David,

I'm closer, but still missing the proper syntax:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
join lateral (select max(c.next_contact) as last_contact
where p.person_nbr = c.person_nbr and
last_contact >= '2025-11-01'
)
c on true;

resulting in:
psql:companies-contacted-2025.sql:9: ERROR: aggregate functions are not allowed in FROM clause of their own query level
LINE 3: join lateral (select max(c.next_contact) as last_contact

Regards,

Rich

#10Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#9)
Re: Extract only maximum date from column

On 12/4/25 1:39 PM, Rich Shepard wrote:

On Thu, 4 Dec 2025, David G. Johnston wrote:

I would go with a lateral join subquery of the contracts table. Using an
aggregates to perform ranking is an anti-pattern. You want the contract
ranked first when ordered by contract_date. Either use a window function
to explicitly rank the contracts or use a limit/fetch clause to simply
return the first ordered one.

David,

I'm closer, but still missing the proper syntax:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
join lateral (select max(c.next_contact) as last_contact
     where p.person_nbr = c.person_nbr and
     last_contact >= '2025-11-01'
     )
     c on true;

resulting in:
psql:companies-contacted-2025.sql:9: ERROR:  aggregate functions are not
allowed in FROM clause of their own query level
LINE 3: join lateral (select max(c.next_contact) as last_contact

Would the below work?:

WITH lc AS (SELECT person_nbr, max(next_contact) AS last_contact from
contacts where next_contact > '2025-11-01' group by c.person_nbr)
select p.person_nbr, p.company_nbr, lc.last_contact from people AS p
join lc on p.person.nbr = lc.person_nbr;

Regards,

Rich

--
Adrian Klaver
adrian.klaver@aklaver.com

#11David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#9)
Re: Extract only maximum date from column

On Thursday, December 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:

On Thu, 4 Dec 2025, David G. Johnston wrote:

I would go with a lateral join subquery of the contracts table. Using an

aggregates to perform ranking is an anti-pattern. You want the contract
ranked first when ordered by contract_date. Either use a window function
to explicitly rank the contracts or use a limit/fetch clause to simply
return the first ordered one.

David,

I'm closer, but still missing the proper syntax:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
join lateral (select max(c.next_contact) as last_contact
where p.person_nbr = c.person_nbr and
last_contact >= '2025-11-01'
)
c on true;

resulting in:
psql:companies-contacted-2025.sql:9: ERROR: aggregate functions are not
allowed in FROM clause of their own query level
LINE 3: join lateral (select max(c.next_contact) as last_contact

As mentioned, the aggregate max should be avoided - you aren’t doing
statistics, you are ranking.

Select person.*, lastcontact.* from person join lateral (select contact.*
from contact where contact.person_id=person.person_id order by
last_contact_date desc limit 1) as lastcontact on true;

David J.

#12Rich Shepard
rshepard@appl-ecosys.com
In reply to: Adrian Klaver (#10)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, Adrian Klaver wrote:

Would the below work?:

WITH lc AS (SELECT person_nbr, max(next_contact) AS last_contact from
contacts where next_contact > '2025-11-01' group by c.person_nbr)
select p.person_nbr, p.company_nbr, lc.last_contact from people AS p join lc
on p.person.nbr = lc.person_nbr;

Adrian,

Reformated and still has an error:
WITH lc AS (SELECT person_nbr, max(next_contact) AS last_contact
from contacts where next_contact >= '2025-11-01'
group by c.person_nbr)
select p.person_nbr, p.company_nbr, lc.last_contact
from people AS p
join lc on p.person.nbr = lc.person_nbr;

psql:companies-contacted-2025.sql:16: ERROR: missing FROM-clause entry for table "c"
LINE 3: group by c.person_nbr)

So, tweaking from reported errors:
WITH lc AS (SELECT p.person_nbr, max(c.next_contact) AS last_contact
from people as p, contacts as c
where next_contact >= '2025-11-01'
group by p.person_nbr)
select p.person_nbr, p.company_nbr, lc.last_contact
from people AS p
join lc on p.person.nbr = lc.person_nbr;

psql:companies-contacted-2025.sql:9: ERROR: missing FROM-clause entry for table "person"
LINE 7: join lc on p.person.nbr = lc.person_nbr;
^
This is obviously a much more complicated query than I expected.

Thanks,

Rich

#13Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#11)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, David G. Johnston wrote:

As mentioned, the aggregate max should be avoided - you aren’t doing
statistics, you are ranking.

David,

Got it.

Select person.*, lastcontact.* from person join lateral (select contact.*
from contact where contact.person_id=person.person_id order by
last_contact_date desc limit 1) as lastcontact on true;

Select person.*, lastcontact.*
from people
join lateral (select contact.*
from contacts
where contacts.person_nbr = people.person_nbr
order by last_contact_date
desc limit 1)
as lastcontact on true;

psql:companies-contacted-2025.sql:10: ERROR: missing FROM-clause entry for table "contact"
LINE 3: join lateral (select contact.*

So:
Select person.*, lastcontact.*
from people
join lateral (select contacts.*
from contacts
where contacts.person_nbr = people.person_nbr
order by last_contact_date
desc limit 1)
as lastcontact on true;

psql:companies-contacted-2025.sql:10: ERROR: column "last_contact_date" does not exist
LINE 6: order by last_contact_date
^
Regards,

Rich

#14David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#13)
Re: Extract only maximum date from column

On Thursday, December 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:

On Thu, 4 Dec 2025, David G. Johnston wrote:

As mentioned, the aggregate max should be avoided - you aren’t doing

statistics, you are ranking.

David,

Got it.

Select person.*, lastcontact.* from person join lateral (select contact.*

from contact where contact.person_id=person.person_id order by
last_contact_date desc limit 1) as lastcontact on true;

Select person.*, lastcontact.*
from people
join lateral (select contact.*
from contacts
where contacts.person_nbr = people.person_nbr
order by last_contact_date
desc limit 1)
as lastcontact on true;

psql:companies-contacted-2025.sql:10: ERROR: missing FROM-clause entry
for table "contact"
LINE 3: join lateral (select contact.*

So:
Select person.*, lastcontact.*
from people
join lateral (select contacts.*
from contacts
where contacts.person_nbr = people.person_nbr
order by last_contact_date
desc limit 1)
as lastcontact on true;

psql:companies-contacted-2025.sql:10: ERROR: column "last_contact_date"
does not exist
LINE 6: order by last_contact_date

I was giving you a query form. You should use the actual table and column
names in your schema…

David J.

#15Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rich Shepard (#12)
Re: Extract only maximum date from column

On 12/4/25 14:17, Rich Shepard wrote:

On Thu, 4 Dec 2025, Adrian Klaver wrote:

So, tweaking from reported errors:
WITH lc AS (SELECT p.person_nbr, max(c.next_contact) AS last_contact
     from people as p, contacts as c
     where next_contact >= '2025-11-01'
     group by p.person_nbr)
select p.person_nbr, p.company_nbr, lc.last_contact
from people AS p
join lc on p.person.nbr = lc.person_nbr;

psql:companies-contacted-2025.sql:9: ERROR:  missing FROM-clause entry
for table "person"
LINE 7: join lc on p.person.nbr = lc.person_nbr;

My mistake should be p.person_nbr

                   ^
This is obviously a much more complicated query than I expected.

Thanks,

Rich

--
Adrian Klaver
adrian.klaver@aklaver.com

#16Alban Hertroys
haramrae@gmail.com
In reply to: Rich Shepard (#1)
Re: Extract only maximum date from column

On 4 Dec 2025, at 20:55, Rich Shepard <rshepard@appl-ecosys.com> wrote:

I want the script to extract only the maximum `next_contact' date and
haven't learned how to do this from my reading of web sites.

The script:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr, c.next_contact
order by p.person_nbr, p.company_nbr, max(c.next_contact);

returns all contacts rather than only the latest one.

Is using a sub-select the proper way?

TIA,

Rich

That looks like a classical case for a correlated subquery with WHERE NOT EXISTS.

Something like:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p
join contacts as c
on -- I’m really missing some kind of relation between p and c here, I think that’s relevant
where c.next_contact >= ‘2025-11-01’
-- make sure there’s no later contact
and not exists (
select 1 -- the value is irrelevant, as long as there’s no later instance of a contact
from contacts c2
where c2.next_contact >= ‘2025-11-01’
and c2.next_contact > c.next_contact
)

P.S. My mail-client tried to outsmart me in this reply (in no small part) and I just got back from the pub, so I can’t exactly guarantee correctness of the above, but the principal idea should be solid.

Alban Hertroys
--
There is always an exception to always.

#17Bernice Southey
bernice.southey@gmail.com
In reply to: Rich Shepard (#1)
Re: Extract only maximum date from column

Rich Shepard <rshepard@appl-ecosys.com> wrote:

select p.person_nbr, p.company_nbr, c.next_contact
from people as p, contacts as c
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr, c.next_contact
order by p.person_nbr, p.company_nbr, max(c.next_contact);

returns all contacts rather than only the latest one.

DISTINCT ON might be what you're looking for. It's an extremely useful
feature in postgres and well worth understanding. Here's a nice
explanation, that's similar to your case:
https://www.geekytidbits.com/postgres-distinct-on/

select distinct on (p.person_nbr) p.person_nbr, p.company_nbr, c.next_contact
from people as p join contacts as c using (person_nbr)
where c.next_contact >= '2025-11-01' order by p.person_nbr, c.next_contact;

Using the following test data:
create table people(person_nbr, company_nbr) as values (1, 1), (2, 1), (3, 2);

create table contacts(person_nbr, next_contact) as values
(1, '2025-10-31'::date), (1, '2025-11-01'), (1, '2025, 11, 02'),
(3, '2025-11-02'::date), (3, '2025-11-03');

Here's the results:
person_nbr | company_nbr | next_contact
------------+-------------+--------------
1 | 1 | 2025-11-01
3 | 2 | 2025-11-02

DISTINCT ON can use an index matching the ORDER BY.
create index on contacts(person_nbr, next_contact);

Add some data to make the index worth it to the planner:
insert into contacts select i, '2025-11-05'::date + j from
generate_series(4, 100) i, generate_series(1, 100) j;
analyze contacts;

Check the query plan:
explain select distinct on (p.person_nbr) ...

This is what you want to see:
-> Index Only Scan using
contacts_person_nbr_next_contact_idx on contacts c
Index Cond: (next_contact >= '2025-11-01'::date)

Thanks, Bernice

#18Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Rich Shepard (#1)

Hi

Am I missing something?

select p.person_nbr
,p.company_nbr
,max(c.next_contact) -- the lasted contact in the group
from people as p
inner join contacts as c
on p.contact_id = c.id -- assumed join condition
where c.next_contact >= '2025-11-01'
group by p.person_nbr -- the group for which the latest contact is wanted
,p.company_nbr
order by p.person_nbr
,p.company_nbr;

Kind regards

Thiemo

#19Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#14)
Re: Extract only maximum date from column

On Thu, 4 Dec 2025, David G. Johnston wrote:

So:
Select person.*, lastcontact.*
from people
join lateral (select contacts.*
from contacts
where contacts.person_nbr = people.person_nbr
order by last_contact_date
desc limit 1)
as lastcontact on true;

I was giving you a query form. You should use the actual table and column
names in your schema…

David,

The two table names are 'people' and 'contacts', the columns are
'person_nbr' and 'last_contact.' I usually use aliases rather than the table
name to reference the column.

Rich

#20Rich Shepard
rshepard@appl-ecosys.com
In reply to: Thiemo Kellner (#18)
Re: Extract only maximum date from column [RESOLVED]

On Fri, 5 Dec 2025, Thiemo Kellner wrote:

Am I missing something?

Thiemo,

Nope. I was the one missing something:

select p.person_nbr, p.company_nbr, max(c.next_contact)
from people as p
inner join contacts as c
on p.person_nbr = c.person_nbr
where c.next_contact >= '2025-11-01'
group by p.person_nbr, p.company_nbr
order by p.person_nbr, p.company_nbr;

and,

person_nbr | company_nbr | max
------------+-------------+------------
226 | 736 | 2025-12-02
367 | 396 | 2026-01-06
464 | 2286 | 2025-11-05
465 | 702 | 2025-11-17
537 | 356 | 2025-11-03
553 | 367 | 2025-12-02
554 | 641 | 2025-12-02
...

Mea culpa!

Many thanks,

Rich

#21Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Rich Shepard (#20)