Re: count of occurences PLUS optimisation - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: count of occurences PLUS optimisation
Date
Msg-id 20010914163800.A8613@svana.org
In response to Re: count of occurences PLUS optimisation  ("Thurstan R. McDougle" <trmcdougle@my-deja.com>)
List pgsql-general
On Thu, Sep 13, 2001 at 05:38:56PM +0100, Thurstan R. McDougle wrote:
> What I am talking about is WHEN the sort is required we could make the
> sort more efficient as inserting into a SHORT ordered list should be
> better than building a BIG list and sorting it, then only keeping a
> small part of the list.

For a plain SORT, it would be possible. Anything to avoid materialising the
entire table in memory. Unfortunately it won't help if there is a GROUP
afterwards, because the group can't really know when to stop.

But yes, if you had LIMIT<SORT<...>> you could do that. I can't imagine it
would be too hard to arrange.
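
To make the LIMIT<SORT<...>> idea concrete, here is a minimal sketch in
Python, not PostgreSQL executor code. The function name top_n and its
parameters are made up for illustration. It keeps only a short ordered
structure (a heap of at most n entries) instead of building the whole list
and sorting it:

```python
import heapq

def top_n(rows, n, key=lambda r: r):
    """Return the n rows with the largest key, largest first.

    Hypothetical helper: at most n rows are held in memory at any time.
    """
    if n <= 0:
        return []
    heap = []  # min-heap of (key, seq, row); heap[0] is the smallest row kept
    for seq, row in enumerate(rows):
        entry = (key(row), seq, row)  # seq is unique, so rows are never compared
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            # The new row beats the smallest row kept so far: swap it in.
            heapq.heapreplace(heap, entry)
    # Only the n survivors get a full sort.
    return [row for _, _, row in sorted(heap, reverse=True)]

result = top_n([5, 1, 9, 3, 7, 2], 3)
```

With 400 input rows and n = 10, each row costs at most one comparison against
the smallest kept row plus an occasional heap update, and the final sort only
touches 10 rows.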

> In the example in question there would be perhaps 400 records, but only
> 10 are needed.  From the questions on these lists it seems quite common
> for only a very low proportion of the records to be required (less than
> 10%, or up to 100 records, typically); in these cases it would seem to be
> a useful optimisation.

Say you have a query:

select id, count(*) from test group by id order by count desc limit 10;

This becomes:

LIMIT < SORT < GROUP < SORT < test > > > >

The inner sort would still have to scan the whole table, unless you have an
index on id. In that case your optimisation would be cool.
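
A rough Python sketch of that plan, under the assumption that the input
already arrives ordered by id (from the inner SORT, or from an index scan).
The names top_counts and test_ids are invented for illustration, and
heapq.nlargest stands in for a combined LIMIT<SORT> step:

```python
import heapq
from itertools import groupby

def top_counts(ids_in_order, n):
    """Count rows per id, then keep only the n ids with the highest counts.

    Assumes ids_in_order is already sorted by id, so GROUP can stream.
    """
    # GROUP: consecutive equal ids form one group; count each group's rows.
    groups = ((key, sum(1 for _ in grp)) for key, grp in groupby(ids_in_order))
    # LIMIT < SORT: nlargest keeps only n groups in memory while scanning.
    return heapq.nlargest(n, groups, key=lambda g: g[1])

test_ids = [3, 1, 2, 3, 3, 1, 4, 3, 1]
# sorted() plays the part of the inner SORT here.
result = top_counts(sorted(test_ids), 2)
```

Every input row is still read once, since a count is not final until its
group ends; the saving is only in the outer sort over the grouped rows.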

Have I got it right now?

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> Magnetism, electricity and motion are like a three-for-two special offer:
> if you have two of them, the third one comes free.
