Darren Greer <dgreer@websightsolutions.com> writes:
> Now what I need to do is select all the users who have taken a test
> (for a particular station) within, say the last so many number of
> days. The following select statement does that.
> select distinct username, station
> from testmaster t1
> where 1 < (
> select count(t2.username)
> from testmaster t2
> where t2.test_date > '05-14-1999'
> and t2.station = 'WZZZ'
> and t1.username = t2.username
> )
> ;
The above doesn't seem to me to do what you claim you want to do ---
it seems to be looking for users who have taken the test *more than
once* in the given interval.
Assuming that the code is right and the comment wrong ;-), I agree
that this is the hard way to do it. The inner select will be
re-evaluated from scratch for every tuple scanned by the outer select,
so your run time is proportional to the square of the number of tuples
in testmaster. Not good. I think you want to use a HAVING clause:
SELECT username, station FROM testmaster WHERE test_date > '05-14-1999' and station = 'WZZZ' GROUP BY username,
stationHAVING count(*) > 1;
The WHERE selects only the tuples you care about, the GROUP BY collects
them into groups with the same username & station (thus serving the
purpose you were using DISTINCT for; you don't need DISTINCT with GROUP
BY); and finally the HAVING selects only the groups you care about, ie
those containing more than one tuple. (When you use GROUP BY, count(*)
and other aggregates only aggregate over a group, not the whole table.)
regards, tom lane