Re: [SQL] Trouble with massive select statement. - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] Trouble with massive select statement.
Date
Msg-id 18337.930007761@sss.pgh.pa.us
Whole thread Raw
In response to Trouble with massive select statement.  (Darren Greer <dgreer@websightsolutions.com>)
List pgsql-sql
Darren Greer <dgreer@websightsolutions.com> writes:
> Now what I need to do is select all the users who have taken a test
> (for a particular station) within, say the last so many number of
> days.  The following select statement does that.

> select distinct username, station 
>     from testmaster t1 
>     where 1 < (
>         select count(t2.username) 
>             from testmaster t2
>             where t2.test_date > '05-14-1999' 
>             and t2.station = 'WZZZ' 
>             and t1.username = t2.username
>         )
> ;

The above doesn't seem to me to do what you claim you want to do ---
it seems to be looking for users who have taken the test *more than
once* in the given interval.

Assuming that the code is right and the comment wrong ;-), I agree
that this is the hard way to do it.  The inner select will be
re-evaluated from scratch for every tuple scanned by the outer select,
so your run time is proportional to the square of the number of tuples
in testmaster.  Not good.  I think you want to use a HAVING clause:

SELECT username, station FROM testmaster WHERE test_date > '05-14-1999' and station = 'WZZZ'  GROUP BY username,
stationHAVING count(*) > 1;
 

The WHERE selects only the tuples you care about, the GROUP BY collects
them into groups with the same username & station (thus serving the
purpose you were using DISTINCT for; you don't need DISTINCT with GROUP
BY); and finally the HAVING selects only the groups you care about, ie
those containing more than one tuple.  (When you use GROUP BY, count(*)
and other aggregates only aggregate over a group, not the whole table.)
        regards, tom lane


pgsql-sql by date:

Previous
From: Darren Greer
Date:
Subject: Trouble with massive select statement.
Next
From: "Hostmaster - Internet au Virtuel Inc."
Date:
Subject: ODBC SQL question