Thread: Complicated "group by" question
I have a table of people ("reviewers"), a table of review assignments ("assign"), and a table of review acceptances ("accept"). I would like to be able to write a query to return the latest (e.g., max(assign_date)) assignment for each reviewer, plus the acc_id field from "accept". I think I should be able to do this with a GROUP BY clause, but am having no luck. Table structure: reviewers assign accept ----------------------------------------- reviewer_id assign_id accept_id reviewer_id assign_id ... assign_date ... ... Thanks for any guidance. Andy ---------------------------------------------------------------------- Andrew J Perrin - http://www.unc.edu/~aperrin Assistant Professor of Sociology, U of North Carolina, Chapel Hill clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu
Andrew, > I have a table of people ("reviewers"), a table of review assignments > ("assign"), and a table of review acceptances ("accept"). I would like to > be able to write a query to return the latest (e.g., max(assign_date)) > assignment for each reviewer, plus the acc_id field from "accept". I > think I should be able to do this with a GROUP BY clause, but am having no > luck. Some vagueness: you didn't say whether you wanted to see two assignments if they have the same, latest date. Nor did you specify whether you wanted to see assignments that had not been accepted (the below assumes yes to both) Hmmm ... one way, SQL-standard: SELECT reviewer.name, assign_date, acc_id FROM reviewers JOIN assign ON reviewer.id = assign.reviewer_idLEFT OUTER JOIN accept ON assign.id = accept.assign_id WHERE assign_date IN (SELECT max(ass2.assign_date) FROM assign ass2WHERE ass2.reviewer_id = reviewers.id) or for a bit faster execution on PG you cann replace that WHERE clause with: WHERE assign_date IN (SELECT ass2.assign_date FROM assign ass2WHERE ass2.reviewer_id = reviewers.id ORDER BY ass2.assign_dateDESC LIMIT 1) -- Josh Berkus Aglio Database Solutions San Francisco
Andrew, If assing is not a many to many relation, why did you not fold accept_id into assign? Any way, here is the query you need: select assign.reviewer_id, ss.max_assign_date, accept.assign_id, accept.accept_id from ( select reviewer_id, max( assign_date) as max_assign_date from assign group by reviewer_id) as ss, assign, accept where ss.reviewer_id = assign.reviewer_id and ss.max_assign_date = assign.assign_date and assign.assign_id = accept.assign_id; Andrew Perrin wrote: > I have a table of people ("reviewers"), a table of review assignments > ("assign"), and a table of review acceptances ("accept"). I would like to > be able to write a query to return the latest (e.g., max(assign_date)) > assignment for each reviewer, plus the acc_id field from "accept". I > think I should be able to do this with a GROUP BY clause, but am having no > luck. > > Table structure: > > reviewers assign accept > ----------------------------------------- > reviewer_id assign_id accept_id > reviewer_id assign_id > ... assign_date > ... ... > > > Thanks for any guidance. > > Andy > > ---------------------------------------------------------------------- > Andrew J Perrin - http://www.unc.edu/~aperrin > Assistant Professor of Sociology, U of North Carolina, Chapel Hill > clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
Excellent - thanks, Josh! ---------------------------------------------------------------------- Andrew J Perrin - http://www.unc.edu/~aperrin Assistant Professor of Sociology, U of North Carolina, Chapel Hill clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu On Wed, 25 Aug 2004, Josh Berkus wrote: > Andrew, > > > I have a table of people ("reviewers"), a table of review assignments > > ("assign"), and a table of review acceptances ("accept"). I would like to > > be able to write a query to return the latest (e.g., max(assign_date)) > > assignment for each reviewer, plus the acc_id field from "accept". I > > think I should be able to do this with a GROUP BY clause, but am having no > > luck. > > Some vagueness: you didn't say whether you wanted to see two assignments if > they have the same, latest date. Nor did you specify whether you wanted to > see assignments that had not been accepted (the below assumes yes to both) > > Hmmm ... one way, SQL-standard: > > SELECT reviewer.name, assign_date, acc_id > FROM reviewers JOIN assign ON reviewer.id = assign.reviewer_id > LEFT OUTER JOIN accept ON assign.id = accept.assign_id > WHERE assign_date IN (SELECT max(ass2.assign_date) FROM assign ass2 > WHERE ass2.reviewer_id = reviewers.id) > > or for a bit faster execution on PG you cann replace that WHERE clause with: > > WHERE assign_date IN (SELECT ass2.assign_date FROM assign ass2 > WHERE ass2.reviewer_id = reviewers.id ORDER BY ass2.assign_date DESC LIMIT 1) > > -- > Josh Berkus > Aglio Database Solutions > San Francisco >
Andrew Perrin wrote: > I have a table of people ("reviewers"), a table of review assignments > ("assign"), and a table of review acceptances ("accept"). I would like to > be able to write a query to return the latest (e.g., max(assign_date)) > assignment for each reviewer, plus the acc_id field from "accept". I > think I should be able to do this with a GROUP BY clause, but am having no > luck. > > Table structure: > > reviewers assign accept > ----------------------------------------- > reviewer_id assign_id accept_id > reviewer_id assign_id > ... assign_date > ... ... > I think you want to write a non-GROUPed query using "DISTINCT ON". Something like this: SELECT DISTINCT ON (reviewer_id,assign_id) reviewer_id, assign_id, assign_date, accept_id FROM reviewers JOIN assign USING (reviewer_id) JOIN accept USING (accept_id) ORDER BY reviewer_id, assign_id, assign_date DESC; -- (Posted from an account used as a SPAM dump. If you really want to get in touch with me, dump the 'jboes' and substitute 'mur'.) ________ Jeffery Boes <>< jboes@qtm.net