Re: Combining two SELECTs - Mailing list pgsql-general

From Tom Lane
Subject Re: Combining two SELECTs
Date
Msg-id 823.962781230@sss.pgh.pa.us
Whole thread Raw
In response to Combining two SELECTs  ("Eric Jain" <jain@gmx.net>)
Responses RE: Combining two SELECTs  ("Eric Jain" <jain@gmx.net>)
List pgsql-general
"Eric Jain" <jain@gmx.net> writes:
> Any ideas how the following two statements could be combined into a
> single one?

> SELECT DISTINCT host, url, id
> INTO TEMP
> FROM log
> WHERE
>   host IN (SELECT host FROM robots)
>   AND status IN (200, 304);

> SELECT host, COUNT(*) AS hits
> FROM TEMP
> GROUP BY host
> ORDER BY hits DESC;

Offhand I do not think you can do this in one "simple" SQL query,
because the SQL query semantics require that GROUP BY grouping occurs
before DISTINCT processing, whereas you want the other order.

(I'm assuming you need exactly these semantics, and not closely-
related ones as someone else suggested.)

By 7.2 or so, we hope to support sub-SELECTs in FROM, which'd let
you do this along the lines of

    SELECT host,COUNT(*) FROM (SELECT DISTINCT host, ...)
        GROUP BY ...

You might try to do it today by defining the SELECT DISTINCT as
a view and then selecting from the view with GROUP BY, but I
expect it won't work --- presently, views are implemented by
expanding the view macro-style, so they don't work for any case
that you couldn't write out as a single SQL-compliant query.
(Again, we hope to make this work better in 7.2.)

For now, the temp table seems like a good workaround.

            regards, tom lane

pgsql-general by date:

Previous
From: Philip Warner
Date:
Subject: Re: [HACKERS] Re: Revised Copyright: is this morepalatable?
Next
From: Guillaume Perréal
Date:
Subject: Re: Combining two SELECTs