Thread: Complicated "group by" question

Complicated "group by" question

From
Andrew Perrin
Date:
I have a table of people ("reviewers"), a table of review assignments
("assign"), and a table of review acceptances ("accept"). I would like to
be able to write a query to return the latest (e.g., max(assign_date))
assignment for each reviewer, plus the acc_id field from "accept".  I
think I should be able to do this with a GROUP BY clause, but am having no
luck.

Table structure:

reviewers    assign        accept
-----------------------------------------
reviewer_id    assign_id    accept_id    reviewer_id    assign_id
...        assign_date    ...        ...


Thanks for any guidance.

Andy

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu



Re: Complicated "group by" question

From
Josh Berkus
Date:
Andrew,

> I have a table of people ("reviewers"), a table of review assignments
> ("assign"), and a table of review acceptances ("accept"). I would like to
> be able to write a query to return the latest (e.g., max(assign_date))
> assignment for each reviewer, plus the acc_id field from "accept".  I
> think I should be able to do this with a GROUP BY clause, but am having no
> luck.

Some vagueness: you didn't say whether you wanted to see two assignments if 
they have the same, latest date.   Nor did you specify whether you wanted to 
see assignments that had not been accepted (the below assumes yes to both)

Hmmm ... one way, SQL-standard:

SELECT reviewer.name, assign_date, acc_id
FROM reviewers JOIN assign ON reviewer.id = assign.reviewer_idLEFT OUTER JOIN accept ON assign.id = accept.assign_id
WHERE assign_date IN (SELECT max(ass2.assign_date) FROM assign ass2WHERE ass2.reviewer_id = reviewers.id)

or for a bit faster execution on PG you cann replace that WHERE clause with:

WHERE assign_date IN (SELECT ass2.assign_date FROM assign ass2WHERE ass2.reviewer_id = reviewers.id ORDER BY
ass2.assign_dateDESC LIMIT 1)
 

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Complicated "group by" question

From
Jean-Luc Lachance
Date:
Andrew,

If assing is not a many to many relation,
why did you not fold accept_id into assign?

Any way, here is the query you need:

select assign.reviewer_id, ss.max_assign_date,  accept.assign_id, accept.accept_id
from (  select reviewer_id, max( assign_date) as max_assign_date  from assign group by reviewer_id) as ss, assign,
accept
where ss.reviewer_id = assign.reviewer_id  and ss.max_assign_date = assign.assign_date  and assign.assign_id =
accept.assign_id;



Andrew Perrin wrote:

> I have a table of people ("reviewers"), a table of review assignments
> ("assign"), and a table of review acceptances ("accept"). I would like to
> be able to write a query to return the latest (e.g., max(assign_date))
> assignment for each reviewer, plus the acc_id field from "accept".  I
> think I should be able to do this with a GROUP BY clause, but am having no
> luck.
> 
> Table structure:
> 
> reviewers    assign        accept
> -----------------------------------------
> reviewer_id    assign_id    accept_id
>         reviewer_id    assign_id
> ...        assign_date
>         ...        ...
> 
> 
> Thanks for any guidance.
> 
> Andy
> 
> ----------------------------------------------------------------------
> Andrew J Perrin - http://www.unc.edu/~aperrin
> Assistant Professor of Sociology, U of North Carolina, Chapel Hill
> clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 


Re: Complicated "group by" question

From
Andrew Perrin
Date:
Excellent - thanks, Josh!

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu


On Wed, 25 Aug 2004, Josh Berkus wrote:

> Andrew,
>
> > I have a table of people ("reviewers"), a table of review assignments
> > ("assign"), and a table of review acceptances ("accept"). I would like to
> > be able to write a query to return the latest (e.g., max(assign_date))
> > assignment for each reviewer, plus the acc_id field from "accept".  I
> > think I should be able to do this with a GROUP BY clause, but am having no
> > luck.
>
> Some vagueness: you didn't say whether you wanted to see two assignments if
> they have the same, latest date.   Nor did you specify whether you wanted to
> see assignments that had not been accepted (the below assumes yes to both)
>
> Hmmm ... one way, SQL-standard:
>
> SELECT reviewer.name, assign_date, acc_id
> FROM reviewers JOIN assign ON reviewer.id = assign.reviewer_id
>     LEFT OUTER JOIN accept ON assign.id = accept.assign_id
> WHERE assign_date IN (SELECT max(ass2.assign_date) FROM assign ass2
>     WHERE ass2.reviewer_id = reviewers.id)
>
> or for a bit faster execution on PG you cann replace that WHERE clause with:
>
> WHERE assign_date IN (SELECT ass2.assign_date FROM assign ass2
>     WHERE ass2.reviewer_id = reviewers.id ORDER BY ass2.assign_date DESC LIMIT 1)
>
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>


Re: Complicated "group by" question

From
Jeff Boes
Date:
Andrew Perrin wrote:
> I have a table of people ("reviewers"), a table of review assignments
> ("assign"), and a table of review acceptances ("accept"). I would like to
> be able to write a query to return the latest (e.g., max(assign_date))
> assignment for each reviewer, plus the acc_id field from "accept".  I
> think I should be able to do this with a GROUP BY clause, but am having no
> luck.
> 
> Table structure:
> 
> reviewers    assign        accept
> -----------------------------------------
> reviewer_id    assign_id    accept_id
>         reviewer_id    assign_id
> ...        assign_date
>         ...        ...
> 


I think you want to write a non-GROUPed query using "DISTINCT ON". 
Something like this:

SELECT DISTINCT ON (reviewer_id,assign_id) reviewer_id, assign_id, assign_date, accept_id
FROM reviewers
JOIN assign USING (reviewer_id)
JOIN accept USING (accept_id)
ORDER BY reviewer_id, assign_id, assign_date DESC;


-- 
(Posted from an account used as a SPAM dump. If you really want to get
in touch with me, dump the 'jboes' and substitute 'mur'.)
________
Jeffery Boes <>< jboes@qtm.net