Thread: Group By Question
I have two tables and want to attach and return the most recent data from the second table. Table 1 has a counter ID and name. Table 2 has a counter ID, Reference (to Table 1 ID), Date, and Like. I want to do a query that gets each name and their most recent like. I have a unique key setup on likes for the reference and date so I know there is only 1 per day. I can do this query fine: SELECT test.people.id, test.people.name, test.likes.ref, MAX(test.likes.date) FROM test.people LEFT JOIN test.likes ON test.people.id = test.likes.ref GROUP BY test.people.id, test.people.name, test.likes.ref However, when I try to add in test.likes.id OR test.likes.likes I get an error that it has to be included in the Group By (do not want that) or has to be an aggregate function. I just want the value of those fields from whatever row it is getting the MAX(date) field. How can I return those values? Thanks, Andrew ============ VISUALLY ============ people ============ ID Name 1 Andrew 2 Bob 3 John ========== likes ========== ID Ref Date Likes 1 1 2010-09-30 Candy 2 3 2010-09-30 Fruit 3 3 2010-01-01 Nuts ======================== SQL in test schema ======================== CREATE TABLE test.likes ( id serial NOT NULL, ref integer NOT NULL, date date, likes character varying(255), CONSTRAINT like_pkey PRIMARY KEY (id), CONSTRAINT likes_ref_key UNIQUE (ref, date) ) WITH (OIDS=FALSE); CREATE TABLE test.people ( id serial NOT NULL, "name" character varying(255), CONSTRAINT people_pkey PRIMARY KEY (id) ) WITH (OIDS=FALSE); INSERT INTO people (id, name) VALUES (1, 'Andrew'); INSERT INTO people (id, name) VALUES (2, 'Bob'); INSERT INTO people (id, name) VALUES (3, 'John'); INSERT INTO likes (id, ref, date, likes) VALUES (1, 1, '2010-09-30', 'Candy'); INSERT INTO likes (id, ref, date, likes) VALUES (2, 3, '2010-09-30', 'Fruit'); INSERT INTO likes (id, ref, date, likes) VALUES (3, 3, '2010-01-01', 'Nuts');
Andrew E. Tegenkamp wrote: > I have two tables and want to attach and return the most recent data from > the second table. > > Table 1 has a counter ID and name. Table 2 has a counter ID, Reference (to > Table 1 ID), Date, and Like. I want to do a query that gets each name and > their most recent like. I have a unique key setup on likes for the reference > and date so I know there is only 1 per day. I can do this query fine: > > SELECT test.people.id, test.people.name, test.likes.ref, MAX(test.likes.date) > FROM test.people LEFT JOIN test.likes ON test.people.id = test.likes.ref > GROUP BY test.people.id, test.people.name, test.likes.ref > > However, when I try to add in test.likes.id OR test.likes.likes I get an > error that it has to be included in the Group By (do not want that) or has > to be an aggregate function. I just want the value of those fields from > whatever row it is getting the MAX(date) field. > > How can I return those values? <snip> I believe you'll need to use SQL subqueries to force a different order of operations, such as group/max before join, and so on. Something like this: SELECT test.people.id, test.people.name, filt_likes.ref, filt_likes.date, filt_likes.likes FROM test.people LEFT JOIN ( SELECT test.likes.* FROM test.likes INNER JOIN ( SELECT ref, MAX(date) AS max_date FROM test.likes GROUP BY ref ) AS filt ON test.likes.ref = filt.ref AND test_likes.date = filt.max_date ) AS filt_likes ON test.people.id = filt_likes.ref Try testing that. -- Darren Duncan
* Andrew E. Tegenkamp wrote: > I have two tables and want to attach and return the most recent data from > the second table. > > Table 1 has a counter ID and name. Table 2 has a counter ID, Reference (to > Table 1 ID), Date, and Like. I want to do a query that gets each name and > their most recent like. I have a unique key setup on likes for the reference > and date so I know there is only 1 per day. I can do this query fine: > > SELECT test.people.id, test.people.name, test.likes.ref, > MAX(test.likes.date) > FROM test.people LEFT JOIN test.likes ON test.people.id = test.likes.ref > GROUP BY test.people.id, test.people.name, test.likes.ref > > However, when I try to add in test.likes.id OR test.likes.likes I get an > error that it has to be included in the Group By (do not want that) or has > to be an aggregate function. I just want the value of those fields from > whatever row it is getting the MAX(date) field. SELECT p.name, l.date, l.likes FROM people p LEFT JOIN (SELECT l1.ref, l1.date, l1.likes FROM likes l1 GROUP BY l1.ref, l1.date, l1.likes HAVING l1.date = (SELECT max(date) FROM likes WHERE ref = l1.ref)) l ON (p.id = l.ref); Or the newfangled way, replacing the inner subselect with a window: SELECT p.id, p.name, l.likes FROM people p LEFT JOIN (SELECT l1.ref, l1.likes, l1.date, max(l1.date) OVER (PARTITION BY ref) AS maxdate FROM likes l1) l ON (p.id = l.ref AND l.date = l.maxdate); On this "dataset", the windowed version is estimated to be ~ 60% faster than the grouped one, and the actual execution time is ~ 20% lower. -- Christian
That's getting too complicated. It can be done simply as: SELECT DISTINCT(test.people.id) test.people.id, test.people.name, test.likes.ref FROM test.people LEFT JOIN test.likes ON test.people.id = test.likes.ref ORDER BY test.people.id, test.likes.date DESCENDING Assuming people.id is unique, the order by clause makes the first row of every group the most recent like and distinct expression keeps only the first row for each person. Chris -- Chris Velevitch Manager - Adobe Platform Users Group, Sydney m: 0415 469 095 www.apugs.org.au Adobe Platform Users Group, Sydney October 2010: Flash Builder for SalesForce Date: 25th October, 6pm for 6:30 start Details and RSVP coming soon