Thread: Combining two SELECTs

Combining two SELECTs

From
"Eric Jain"
Date:
Any ideas how the following two statements could be combined into a
single one?

This would greatly simplify the integration of this query into an
existing web interface...

SELECT DISTINCT host, url, id
INTO TEMP
FROM log
WHERE
  host IN (SELECT host FROM robots)
  AND status IN (200, 304);

SELECT host, COUNT(*) AS hits
FROM TEMP
GROUP BY host
ORDER BY hits DESC;


--
Eric Jain


RE: Combining two SELECTs

From
"Eric Jain"
Date:
> I don't see why this wouldn't work:
>
> SELECT log.host,count(*) as hits  FROM log, robots WHERE
> log.host=robots.host AND status IN (200,304)
> GROUP BY log.host ORDER BY hits DESC ;
>
> Len Morgan

Thanks. This does work of course, however I forgot to mention that the
query must filter out any duplicates in the 'log' table. Simplified:

time | host    | url
-----+---------+--------
001  | dec.com | index
003  | dec.com | index
011  | dec.com | index
015  | dec.com | content
057  | xyz.com | index


desired result:

host    | hits
--------+-----
dec.com | 2
xyz.com | 1


--
Eric Jain


Re: Combining two SELECTs

From
Tom Lane
Date:
"Eric Jain" <jain@gmx.net> writes:
> Any ideas how the following two statements could be combined into a
> single one?

> SELECT DISTINCT host, url, id
> INTO TEMP
> FROM log
> WHERE
>   host IN (SELECT host FROM robots)
>   AND status IN (200, 304);

> SELECT host, COUNT(*) AS hits
> FROM TEMP
> GROUP BY host
> ORDER BY hits DESC;

Offhand I do not think you can do this in one "simple" SQL query,
because the SQL query semantics require that GROUP BY grouping occurs
before DISTINCT processing, whereas you want the other order.

(I'm assuming you need exactly these semantics, and not closely-
related ones as someone else suggested.)

By 7.2 or so, we hope to support sub-SELECTs in FROM, which'd let
you do this along the lines of

    SELECT host,COUNT(*) FROM (SELECT DISTINCT host, ...)
        GROUP BY ...

You might try to do it today by defining the SELECT DISTINCT as
a view and then selecting from the view with GROUP BY, but I
expect it won't work --- presently, views are implemented by
expanding the view macro-style, so they don't work for any case
that you couldn't write out as a single SQL-compliant query.
(Again, we hope to make this work better in 7.2.)

For now, the temp table seems like a good workaround.

            regards, tom lane

Re: Combining two SELECTs

From
Guillaume Perréal
Date:
Tom Lane wrote:
>
> "Eric Jain" <jain@gmx.net> writes:
> > Any ideas how the following two statements could be combined into a
> > single one?
>
> > SELECT DISTINCT host, url, id
> > INTO TEMP
> > FROM log
> > WHERE
> >   host IN (SELECT host FROM robots)
> >   AND status IN (200, 304);
>
> > SELECT host, COUNT(*) AS hits
> > FROM TEMP
> > GROUP BY host
> > ORDER BY hits DESC;
>
> Offhand I do not think you can do this in one "simple" SQL query,
> because the SQL query semantics require that GROUP BY grouping occurs
> before DISTINCT processing, whereas you want the other order.
>
>
> For now, the temp table seems like a good workaround.
>

And splitting some complex queries in simpler ones (using temp tables) can
increase performance, depending on the query.

Regards,
Guillaume Perréal - Stagiaire MIAG
Cemagref (URH), Lyon, France
Tél: (+33) 4.72.20.87.64

Re: Combining two SELECTs

From
Date:
SELECT count(*), host, url, id
from log
group by host, url, id

You want to include your where and order by clauses - but that should get
you the results your looking for.  I think :)

Steve

On Tue, 4 Jul 2000, Eric Jain wrote:

> Any ideas how the following two statements could be combined into a
> single one?
>
> This would greatly simplify the integration of this query into an
> existing web interface...
>
> SELECT DISTINCT host, url, id
> INTO TEMP
> FROM log
> WHERE
>   host IN (SELECT host FROM robots)
>   AND status IN (200, 304);
>
> SELECT host, COUNT(*) AS hits
> FROM TEMP
> GROUP BY host
> ORDER BY hits DESC;
>
>
> --
> Eric Jain
>


RE: Combining two SELECTs

From
"Eric Jain"
Date:
> By 7.2 or so, we hope to support sub-SELECTs in FROM, which'd let
> you do this along the lines of
>
>     SELECT host,COUNT(*) FROM (SELECT DISTINCT host, ...)
>         GROUP BY ...

That would be great. Would have come in handy quite a few times
already...


--
Eric Jain