Re: [HACKERS] Solution for LIMIT cost estimation - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Solution for LIMIT cost estimation
Date
Msg-id 15107.950509953@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Solution for LIMIT cost estimation  (Philip Warner <pjw@rhyme.com.au>)
List pgsql-hackers
Philip Warner <pjw@rhyme.com.au> writes:
> #1 seems pretty nasty as a concept, unless of course this actually reflects
> the way that PG retrieves rows. My guess is that it will have to retrieve
> rows 1 to (offset + limit), not (offset) to (offset + limit), so the whole
> appreximation should be based on #2. 

Right --- if we could start the query in the middle this would all be
a lot nicer, but we can't.  The implementation of OFFSET is just to
discard the first N tuples retrieved before beginning to hand any tuples
back to the client.  So the "right" approach for the optimizer is to
assume that OFFSET+LIMIT tuples will be retrieved.  The trouble is that
that can mean that the query plan changes depending on OFFSET, which
leads to consistency problems if you don't lock down the tuple ordering
with ORDER BY.

> a. Does the optimizer know how to do 'index-only' queries (where all fields
> are satisfied by the index)

Postgres doesn't have indexes that allow index-only queries --- you
still have to fetch the tuples, because the index doesn't carry
commit status.  I think that's irrelevant anyway, since we're not
only interested in the behavior for simple queries...

> b. Just to clarify, OFFSET does affect the tuples actually returned,
> doesn't it?

Of course.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Chris Bitmead
Date:
Subject: Re: [HACKERS] Solution for LIMIT cost estimation
Next
From: "davida"
Date:
Subject: subscribe hackers