Thread: LIMIT within UNION?

LIMIT within UNION?

From
Andrew Perrin
Date:
Greetings-

I have a table of participants to be contacted for a study. Some are in
the "exposure" group, others in the "control" group. This is designated by
a column, typenr, that contains 1 for exposure, 2 for control.

The complication is this: I need to select 200 total. The 200 number
should include *all* those eligible in the exposure group, plus enough
from the control group to bring the total number up to 200. (Yes, there is
a valid reason for this.) Furthermore, I need to sort the output of the
two groups *together* by zip code.

What I've done is to write a script that counts the number of eligible
exposure candidates:

SELECT count(*) FROM participants WHERE <eligibility>AND typenr=1

Then subtract that number (currently 28) from 200 to get 172 control
participants. Then the problem starts. 

SELECT ... FROM participants
WHERE typenr=1 AND <eligibility>
UNION
SELECT ... FROM participants
WHERE typenr=2 LIMIT 172
ORDER BY zip;

returns ERROR:  parser: parse error at or near "ORDER"

I've tried a variety of parentheses to no avail.

Can someone shed some light?

Thanks!

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu




Re: LIMIT within UNION?

From
Tom Lane
Date:
Andrew Perrin <clists@perrin.socsci.unc.edu> writes:
> SELECT ... FROM participants
> WHERE typenr=1 AND <eligibility>
> UNION
> SELECT ... FROM participants
> WHERE typenr=2 LIMIT 172
> ORDER BY zip;

I think you need

SELECT * FROM
(
SELECT ... FROM participants
WHERE typenr=1 AND <eligibility>
UNION
(SELECT ... FROM participants
WHERE typenr=2 LIMIT 172)
) ss
ORDER BY zip;

Not sure if the inner set of parens is essential, but it might be.
The outer SELECT superstructure is definitely necessary to give a
place to hang the ORDER BY on.
        regards, tom lane


Re: LIMIT within UNION?

From
Roland Roberts
Date:
>>>>> "Andrew" == Andrew Perrin <clists@perrin.socsci.unc.edu> writes:
   Andrew> What I've done is to write a script that counts the number   Andrew> of eligible exposure candidates:
   Andrew> SELECT count(*) FROM participants WHERE <eligibility>   Andrew>  AND typenr=1
   Andrew> Then subtract that number (currently 28) from 200 to get   Andrew> 172 control participants. Then the
problemstarts.
 
   [...]
   Andrew> returns ERROR:  parser: parse error at or near "ORDER"

Can you do this via a subselect:

SELECT * FROM    ( SELECT ... FROM participants     WHERE typenr=1 AND <eligibility>     UNION     SELECT ... FROM
participants    WHERE typenr=2 LIMIT 172 )
 
ORDER BY zip;

roland
--            PGP Key ID: 66 BC 3B CD
Roland B. Roberts, PhD                             RL Enterprises
roland@rlenter.com                     76-15 113th Street, Apt 3B
roland@astrofoto.org                       Forest Hills, NY 11375


Re: LIMIT within UNION?

From
Andrew Perrin
Date:
Thanks! That did it.

The inner parens are necessary - without them the ORDER BY seems to be
parsed as part of the second subquery and is therefore a syntax error.

Best,
Andy

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu


On Thu, 12 Sep 2002, Tom Lane wrote:

> Andrew Perrin <clists@perrin.socsci.unc.edu> writes:
> > SELECT ... FROM participants
> > WHERE typenr=1 AND <eligibility>
> > UNION
> > SELECT ... FROM participants
> > WHERE typenr=2 LIMIT 172
> > ORDER BY zip;
> 
> I think you need
> 
> SELECT * FROM
> (
> SELECT ... FROM participants
> WHERE typenr=1 AND <eligibility>
> UNION
> (SELECT ... FROM participants
> WHERE typenr=2 LIMIT 172)
> ) ss
> ORDER BY zip;
> 
> Not sure if the inner set of parens is essential, but it might be.
> The outer SELECT superstructure is definitely necessary to give a
> place to hang the ORDER BY on.
> 
>             regards, tom lane
> 



Re: LIMIT within UNION?

From
Andrew Perrin
Date:
On 12 Sep 2002, Roland Roberts wrote:

> >>>>> "Andrew" == Andrew Perrin <clists@perrin.socsci.unc.edu> writes:
> ... 
> Can you do this via a subselect:
> 
> SELECT * FROM 
>     ( SELECT ... FROM participants
>       WHERE typenr=1 AND <eligibility>
>       UNION
>       SELECT ... FROM participants
>       WHERE typenr=2 LIMIT 172 )
> ORDER BY zip;
> 

Unfortunately in this case the LIMIT is applied to the fully-UNIONed set,
limiting the total number of cases to 172 instead of just those from the
second subquery. Tom Lane's example worked, though.

Thanks,
Andy


----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu






Re: LIMIT within UNION?

From
Stephan Szabo
Date:
On Thu, 12 Sep 2002, Andrew Perrin wrote:

> Greetings-
>
> I have a table of participants to be contacted for a study. Some are in
> the "exposure" group, others in the "control" group. This is designated by
> a column, typenr, that contains 1 for exposure, 2 for control.
>
> The complication is this: I need to select 200 total. The 200 number
> should include *all* those eligible in the exposure group, plus enough
> from the control group to bring the total number up to 200. (Yes, there is
> a valid reason for this.) Furthermore, I need to sort the output of the
> two groups *together* by zip code.

Do you get more than 200 if there are more eligible people and does the
... ever include the same person in both sides of the union?

If not in the second case, union all would probably save the database
some extra work since it won't have to try to weed out duplicates.

If not in the first case, then wouldn't a limit 200 on the after union
result set work rather than a separate count and subtraction?




Re: LIMIT within UNION?

From
Andrew Perrin
Date:
On Thu, 12 Sep 2002, Stephan Szabo wrote:

> On Thu, 12 Sep 2002, Andrew Perrin wrote:
> 
> > Greetings-
> >
> > I have a table of participants to be contacted for a study. Some are in
> > the "exposure" group, others in the "control" group. This is designated by
> > a column, typenr, that contains 1 for exposure, 2 for control.
> >
> > The complication is this: I need to select 200 total. The 200 number
> > should include *all* those eligible in the exposure group, plus enough
> > from the control group to bring the total number up to 200. (Yes, there is
> > a valid reason for this.) Furthermore, I need to sort the output of the
> > two groups *together* by zip code.
> 
> Do you get more than 200 if there are more eligible people 

Yes - in the (rather rare) case that there are 200 or more eligible
exposure subjects, the result set should be the total number of eligible
exposure subjects.

> and does the
> ... ever include the same person in both sides of the union?

No; each person is only in one of the two sides.

> 
> If not in the second case, union all would probably save the database
> some extra work since it won't have to try to weed out duplicates.

I'll try that.

> 
> If not in the first case, then wouldn't a limit 200 on the after union
> result set work rather than a separate count and subtraction?
> 

Interesting - this would count on the UNION including all cases in the
first query before those in the second query. Are UNIONed records
presented in any predictable order?



----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu






Re: LIMIT within UNION?

From
Tom Lane
Date:
Andrew Perrin <clists@perrin.socsci.unc.edu> writes:
> Interesting - this would count on the UNION including all cases in the
> first query before those in the second query. Are UNIONed records
> presented in any predictable order?

If you can use UNION ALL then this would be pretty safe to rely on.
If you must use UNION (to eliminate dups) then it won't work.
        regards, tom lane


Re: LIMIT within UNION?

From
Josh Berkus
Date:
Andrew,

Another approach, one that does not require you to know before constructing
the query how many eligible subjects there are, is not to use a union at all:

SELECT * FROM (
SELECT ... , zip
FROM participants
WHERE (typenr = 1 and <eligibility criteria)
OR (typenr = 2)
ORDER BY (typenr = 1 and <eligibility criteria>) DESC
LIMIT 200 ) p1
ORDER BY zip;

The inner query gives you all of the records that meet the eligibility
criteria, plus all of the records that have typenr = 2, in the order of
whether or not they meet the criteria (as a boolean value) and truncates it
at 200 records.
The outer query then re-sorts this result in zip order.

This seems, to me, much more flexible than using a UNION query.

--
-Josh BerkusAglio Database SolutionsSan Francisco



Re: LIMIT within UNION?

From
Andrew Perrin
Date:
This is certainly the most elegant solution - thanks!

Andy

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu


On Thu, 12 Sep 2002, Josh Berkus wrote:

> Andrew,
> 
> Another approach, one that does not require you to know before constructing 
> the query how many eligible subjects there are, is not to use a union at all:
> 
> SELECT * FROM (
> SELECT ... , zip
> FROM participants
> WHERE (typenr = 1 and <eligibility criteria)
> OR (typenr = 2)
> ORDER BY (typenr = 1 and <eligibility criteria>) DESC
> LIMIT 200 ) p1
> ORDER BY zip;
> 
> The inner query gives you all of the records that meet the eligibility 
> criteria, plus all of the records that have typenr = 2, in the order of 
> whether or not they meet the criteria (as a boolean value) and truncates it 
> at 200 records.
> The outer query then re-sorts this result in zip order.
> 
> This seems, to me, much more flexible than using a UNION query.
> 
> -- 
> -Josh Berkus
>  Aglio Database Solutions
>  San Francisco
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
>