Re: count of occurences PLUS optimisation - Mailing list pgsql-general

From Thurstan R. McDougle
Subject Re: count of occurences PLUS optimisation
Date
Msg-id 3BA0E120.244F64A9@my-deja.com
Whole thread Raw
In response to Re: count of occurences PLUS optimisation  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: count of occurences PLUS optimisation
List pgsql-general
Martijn van Oosterhout wrote:
>
> On Thu, Sep 13, 2001 at 01:03:54PM +0100, Thurstan R. McDougle wrote:
> > It would depend on how efficient the ORDER BY and LIMIT work together.
> > (The ORDER BY could build a list of LIMIT n items and just replace items
> > in that list...a lot more efficient both of memory and comparisons than
> > building the full list and then keeping the top n)
>
> There is some of this already. In the output of EXPLAIN you see two numbers.
> The first is the estimated time toget the first tuple, the second is to get
> all the tuples.
>
> When LIMIT is applied, the estimated total cost is adjusted based on the
> number of rows. So with a small number of tuples the planner will favour
> plans that get tuples early even if the total cost would be larger.
>
> > HACKERS: If it does not do this it might be a usefull optimisation.
> > There would probably need to be a cutoff limit on whether to apply this
> > method or sort and keep n.  Also for LIMIT plus OFFSET it would need to
> > build a list of the the total of the LIMIT and OFFSET figures.
>
> The problem is that it sometimes doesn't help as much as you'd expect. If
> you see a Sort stage in the plan, that means that everything below that has
> to be completly calculated.
>
> The only solution is to use a sorted index to avoid the sort step, if
> possible.


What I am talking about is WHEN the sort is required we could make the
sort more efficient as inserting into a SHORT ordered list should be
better than building a BIG list and sorting it, then only keeping a
small part of the list.

In the example in question there would be perhaps 400 records, but only
10 are needed.  From the questions on these lists it seems quite common
for only a very low proportion of the records to be required (less then
10%/upto 100 typically), in these cases it would seem to be a usefull
optimisation.


>
> HTH,
> --
> Martijn van Oosterhout <kleptog@svana.org>
> http://svana.org/kleptog/
> > Magnetism, electricity and motion are like a three-for-two special offer:
> > if you have two of them, the third one comes free.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
This is the identity that I use for NewsGroups. Email to
this will just sit there. If you wish to email me replace
the domain with knightpiesold . co . uk (no spaces).

pgsql-general by date:

Previous
From: Ryan Mahoney
Date:
Subject: Re: design alpha and beta
Next
From: Shaun Thomas
Date:
Subject: concatenation in procedures.