Re: Handle LIMIT/OFFSET before select clause (was: Feature request: optimizer improvement) - Mailing list pgsql-hackers

From Joe Love
Subject Re: Handle LIMIT/OFFSET before select clause (was: Feature request: optimizer improvement)
Date
Msg-id CAK3BLoSH7Ts+d1uKXdjdWPqqvqr3RgN=JNKE8SpYUP_p=SXsFg@mail.gmail.com
Whole thread Raw
In response to Re: Handle LIMIT/OFFSET before select clause (was: Feature request: optimizer improvement)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Handle LIMIT/OFFSET before select clause (was: Feature request: optimizer improvement)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I'm wondering what type of index would work for this as it is a volatile function. Not knowing how PGs optimizer runs, I'm at a loss as to why this wouldn't be possible or worth doing. It seems to me that all functions in the "select" part of the statement could be calculated at the end of the query after the results have been gathered, and even after the sorting had been done as long as the column wasn't part of the order by (or perhaps group by). 

I have an entire set of functions that perform in this way. For example, I'm selecting a list of all my products and the function does a complex calculation based on inventory in the warehouse + expected deliveries from the factory to determine how many of each item is available, and when they first become available.   What's helpful is for the users search criteria to initially limit the search result, and then I want to paginate the results and only show them a few at a time. In the verbose syntax I mentioned originally, the query performs well, in the most straightforward syntax, it does not. I'm not sure I even need to "hint" the optimizer to perform this type of an optimization as it seems it would be beneficial (or at least not detrimental) 100% of the time.


On Sat, Nov 2, 2013 at 10:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Atri Sharma <atri.jiit@gmail.com> writes:
> I understand the reasons for executing SELECT before the sort. But,
> couldnt we get the planner to see the LIMIT part and push the sort
> node above the select node for this specific case?

[ Shrug... ]  I don't see the point.  If the OP actually cares about the
speed of this query, he's going to want to avoid the sort step too,
which is what makes the index a good idea.

More generally, this is not a transformation we could apply
unconditionally --- at the very least it'd need to be avoided when
volatile functions are involved, and I don't think it's invariably
a win from a cost standpoint even if there aren't semantic blockers.
But right now the planner has no real ability to reason about placement
of SELECT-list evaluation: it's done in a fixed spot in any particular
plan structure.  I don't think it's practical to add such considerations
to the rat's nest that is grouping_planner and friends.  I have
ambitions to replace all that with a Path-generation-and-comparison
approach, and the Paths used for this would need to carry some
indication of which expressions would be evaluated where.  So maybe
once that's done we could think about whether this is worth doing.
I remain dubious though.

                        regards, tom lane



--
Joe's Computer Service
405-227-0951
Computer Running Slow? Call Joe!
$125, Your computer will run like new!
www.JoesComputerService.net

pgsql-hackers by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Fast insertion indexes: why no developments
Next
From: Jeff Janes
Date:
Subject: Re: [PATCH] configure: add git describe output to PG_VERSION when building a git tree