Quota query with decent performance? - Mailing list pgsql-sql

From Troels Arvin
Subject Quota query with decent performance?
Date
Msg-id pan.2003.11.11.22.58.08.758402@arvin.dk
Whole thread Raw
Responses Re: Quota query with decent performance?  (Josh Berkus <josh@agliodbs.com>)
Re: Quota query with decent performance?  (Chester Kustarz <chester@arbor.net>)
List pgsql-sql
Hello,

I'm researching how "quota queries" (a term used by Fabian Pascal) may be
performed in various DBMSes with acceptable performance:
http://troels.arvin.dk/db/rdbms/#select-limit-simple-note

An example of a quota query could be to get the top-3 youngest people from
a collection of people. The complicated part is that such a query might
return more than 3 rows in some tie situations.

In MSSQL and DB2 there are very efficient facilities for such queries, but
I can't find any well-performing declarative methods for PostgreSQL. I
have tried a couple of different strategies, and I currently get the best
results from a correlated subquery like

SELECT * FROM person AS px
WHERE ( SELECT COUNT(*) FROM person AS py WHERE py.age < px.age
) < 3;

When my base table has 4000 rows, my query takes 27 seconds in PostgreSQL
7.2.3 (PIII 1000MHz) which is clearly unacceptable, especially comparing
to the same query in DB2 which only takes 1.4 seconds (on the same server)
- or to this non-standard-SQL DB2-query which only takes 0.02 seconds to
calculate the same result:

SELECT *
FROM ( SELECT firstname,age,RANK() OVER (ORDER BY age ASC) AS rank FROM person
) AS foo
WHERE rank<=3;

Test-files with table definitions and randomly generated rows:
http://troels.arvin.dk/db/tests/quota.1/

Any suggestions on how to perform fast "quota queries" in PostgreSQL?

-- 
Greetings from Troels Arvin, Copenhagen, Denmark




pgsql-sql by date:

Previous
From: "Nick Fankhauser"
Date:
Subject: Re: Is there a more elegant way to write this query?...
Next
From: Josh Berkus
Date:
Subject: Re: Quota query with decent performance?