Re: [HACKERS] Solution for LIMIT cost estimation - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Solution for LIMIT cost estimation
Date
Msg-id 22628.950556469@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Solution for LIMIT cost estimation  (Philip Warner <pjw@rhyme.com.au>)
Responses Re: [HACKERS] Solution for LIMIT cost estimation  (Don Baccus <dhogaza@pacifier.com>)
RE: [HACKERS] Solution for LIMIT cost estimation  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
List pgsql-hackers
Philip Warner <pjw@rhyme.com.au> writes:
> Just my 0.02c, but I don't like putting limits on an optimizer. 

That's my feeling too.  I'm leaning towards letting the optimizer do the
best it can with the given query (which means using OFFSET+LIMIT as the
estimated number of tuples to be fetched), and documenting the potential
gotcha as best we can.  Something like:

CAUTION: if you repeat a query several times with different OFFSET or
LIMIT values to fetch different portions of the whole result, you will
find that you get inconsistent results unless you specify an ORDER BY
condition that is strong enough to ensure that all selected tuples must
appear in a unique order.  Without ORDER BY, the system is free to
return the tuples in any order it finds convenient --- and it may well
make different implementation choices leading to different orderings
depending on the OFFSET and LIMIT values.  In general, you should be
very wary of using OFFSET or LIMIT with an unordered or partially
ordered query; you will get a difficult-to-predict, implementation-
dependent subset of the selected tuples.

Is that clear enough?  Can anyone improve on the wording?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [PATCHES] Re: [HACKERS] Almost there on column aliases
Next
From: Don Baccus
Date:
Subject: Re: [HACKERS] Solution for LIMIT cost estimation