Rich Shepard <rshepard@appl-ecosys.com> wrote:
> select p.person_nbr, p.company_nbr, c.next_contact
> from people as p, contacts as c
> where c.next_contact >= '2025-11-01'
> group by p.person_nbr, p.company_nbr, c.next_contact
> order by p.person_nbr, p.company_nbr, max(c.next_contact);
>
> returns all contacts rather than only the latest one.
DISTINCT ON might be what you're looking for. It's an extremely useful
feature in postgres and well worth understanding. Here's a nice
explanation, that's similar to your case:
https://www.geekytidbits.com/postgres-distinct-on/
select distinct on (p.person_nbr) p.person_nbr, p.company_nbr, c.next_contact
from people as p join contacts as c using (person_nbr)
where c.next_contact >= '2025-11-01' order by p.person_nbr, c.next_contact;
Using the following test data:
create table people(person_nbr, company_nbr) as values (1, 1), (2, 1), (3, 2);
create table contacts(person_nbr, next_contact) as values
(1, '2025-10-31'::date), (1, '2025-11-01'), (1, '2025, 11, 02'),
(3, '2025-11-02'::date), (3, '2025-11-03');
Here's the results:
person_nbr | company_nbr | next_contact
------------+-------------+--------------
1 | 1 | 2025-11-01
3 | 2 | 2025-11-02
DISTINCT ON can use an index matching the ORDER BY.
create index on contacts(person_nbr, next_contact);
Add some data to make the index worth it to the planner:
insert into contacts select i, '2025-11-05'::date + j from
generate_series(4, 100) i, generate_series(1, 100) j;
analyze contacts;
Check the query plan:
explain select distinct on (p.person_nbr) ...
This is what you want to see:
-> Index Only Scan using
contacts_person_nbr_next_contact_idx on contacts c
Index Cond: (next_contact >= '2025-11-01'::date)
Thanks, Bernice