Re: Ordered Partitioned Table Scans - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Ordered Partitioned Table Scans
Msg-id CAOBaU_Y+mOsCJqLs_zkae=Wf0HgWFAx0TaA-LnUon_-R6DzVEw@mail.gmail.com
In response to Re: Ordered Partitioned Table Scans  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Mar 22, 2019 at 7:19 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Mar 22, 2019 at 12:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > On Fri, Mar 22, 2019 at 11:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> In cases where, say, the first child requires no sort but also doesn't
> > >> emit very many rows, while the second child requires an expensive sort,
> > >> the planner will have a ridiculously optimistic opinion of the cost of
> > >> fetching slightly more rows than are available from the first child.
> > >> This might lead it to wrongly choose a merge join over a hash for example.
> >
> > > I think this is very much a valid point, especially in view of the
> > > fact that we already choose supposedly fast-start plans too often.  I
> > > don't know whether it's a death sentence for this patch, but it should
> > > at least make us stop and think hard.
> >
> > BTW, another thing we could possibly do to answer this objection is to
> > give the ordered-Append node an artificially pessimistic startup cost,
> > such as the sum or the max of its children's startup costs.  That's
> > pretty ugly and unprincipled, but maybe it's better than not having the
> > ability to generate the plan shape at all?
>
> Yeah, I'm not sure whether that's a good idea or not.  I think one of
> the problems with a cost-based optimizer is that once you start
> putting things in with the wrong cost because you think it will give
> the right answer, you're sorta playing with fire, because you can't
> necessarily predict how things are going to turn out in more
> complex scenarios.  On the other hand, it may sometimes be the right
> thing to do.

I've been bitten too many times by super inefficient plans of the
form "let's use the wrong index instead of the good one, because I'll
probably find the tuple I want there very quickly", caused by LIMIT
assuming an even distribution of matching rows.  Since those queries
can end up taking dozens of minutes instead of less than a
millisecond, and users have little control to work around this kind of
problem, I definitely don't want to introduce another similar source
of pain.
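To make the failure mode concrete: the planner scales a path's cost
roughly linearly by the fraction of rows a LIMIT will fetch, which is
only valid if matching rows are evenly spread through the scan.  A
rough illustrative sketch of that arithmetic (not PostgreSQL's actual
costing code; the numbers are made up):

```python
# Hypothetical sketch of LIMIT cost scaling under the
# uniform-distribution assumption (illustrative only, not
# PostgreSQL's actual costing code).

def limit_cost(startup_cost, total_cost, est_rows, limit_tuples):
    """Estimated cost of fetching limit_tuples from a path expected to
    produce est_rows, assuming matches are evenly spread."""
    if limit_tuples >= est_rows:
        return total_cost
    fraction = limit_tuples / est_rows
    # Linear interpolation between startup and total cost.
    return startup_cost + (total_cost - startup_cost) * fraction

# The "wrong" index: cheap startup, huge total cost; the planner
# assumes 10 matches will show up early among 1,000,000 scanned rows.
wrong_index = limit_cost(0.5, 100_000.0, 1_000_000, 10)

# The "good" index: higher startup cost, but modest total cost.
good_index = limit_cost(500.0, 600.0, 1_000, 10)

print(wrong_index, good_index)
```

Here the "wrong" index comes out at roughly 1.5 cost units against 501
for the "good" one, so it wins on paper; if the matching tuples are in
fact clustered at the end of the scan, the real execution pays close
to the full 100,000.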

However, what we're talking about here is still a corner case.  People
who have partitioned tables with a mix of partitions with and without
indexes suitable for ORDER BY x LIMIT y queries should already be
getting at best average performance.  And trying to handle this case
cannot hurt the cases where all partitions have suitable indexes, so
that may be an acceptable risk?

I also have mixed feelings about this artificial startup cost penalty,
but if we go this way we could at least accumulate the startup cost of
all the sorts that we think we'll have to perform (according to each
path's estimated rows and the given limit_tuples).  That probably
won't be enough though, especially with fractional paths.
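One possible reading of that suggestion, sketched under assumed data
structures (child paths as tuples of startup cost, estimated rows, and
whether a sort is needed; this is illustrative Python, not PostgreSQL
code, and the consumption-order assumption only holds for a
range-partition-ordered Append):

```python
# Hypothetical sketch of the proposed penalty: charge the ordered
# Append the accumulated startup cost of every child sort we expect
# to start before limit_tuples rows have been produced.

def append_startup_penalty(children, limit_tuples):
    """children: list of (startup_cost, est_rows, needs_sort) tuples,
    in the order the ordered Append will consume them."""
    penalty = 0.0
    rows_so_far = 0
    for startup_cost, est_rows, needs_sort in children:
        if rows_so_far >= limit_tuples:
            break  # this child's sort is never started
        if needs_sort:
            penalty += startup_cost
        rows_so_far += est_rows
    return penalty

# First child needs no sort but is small; the second child needs an
# expensive sort that a LIMIT 100 query would still have to pay for.
children = [(0.0, 50, False), (10_000.0, 1_000_000, True)]
print(append_startup_penalty(children, limit_tuples=100))
```

With LIMIT 100 the second child's sort is charged (10000.0), while
with LIMIT 50 the first child alone is expected to satisfy the query
and the penalty is zero, which is where a misestimate of that first
child's row count would make this scheme fall over.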

