Re: Optimizing DISTINCT with LIMIT - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: Optimizing DISTINCT with LIMIT
Date
Msg-id 878wqw117l.fsf@oxford.xeocode.com
Whole thread Raw
In response to Optimizing DISTINCT with LIMIT  (tmp <skrald@amossen.dk>)
Responses Re: Optimizing DISTINCT with LIMIT  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
tmp <skrald@amossen.dk> writes:

>   If the input is "sufficiently" large and the LIMIT threshold "sufficiently"
> small, maintain the DISTINCT output by hashning while traversing the input and
> stop when the LIMIT threshold is reached. No sorting required and *at* *most*
> one read of input.

You mean like this?

postgres=# explain select distinct x  from i limit 5;                           QUERY PLAN

-------------------------------------------------------------------Limit  (cost=54.50..54.51 rows=1 width=304)  ->
HashAggregate (cost=54.50..54.51 rows=1 width=304)        ->  Seq Scan on i  (cost=0.00..52.00 rows=1000 width=304)
 
(3 rows)


This will be in the upcoming 8.4 release.


Versions since about 7.4 or so have been capable of producing this plan but
not for DISTINCT, only for the equivalent GROUP BY query:

postgres=# explain select x  from i group by x limit 5;

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Re: pgsql: Properly unregister OpenSSL callbacks when libpq is done with
Next
From: Heikki Linnakangas
Date:
Subject: Re: Optimizing DISTINCT with LIMIT