Re: More stable query plans via more predictable column statistics - Mailing list pgsql-hackers

From Alex Shulgin
Subject Re: More stable query plans via more predictable column statistics
Msg-id CAM-UEKTAgra3ahRm8Nw_KPdCAx-YHavQd3xkRMpYUCp18q7+0Q@mail.gmail.com
In response to Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: More stable query plans via more predictable column statistics  (Alex Shulgin <alex.shulgin@gmail.com>)
List pgsql-hackers
On Sun, Apr 3, 2016 at 7:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alex Shulgin <alex.shulgin@gmail.com> writes:
> On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, we have to do *something* with the last (possibly only) value.
>> Neither "include always" nor "omit always" seem sane to me.  What other
>> decision rule do you want there?

> Well, what implies that the last value is somehow special?  I would think
> we should just do with it whatever we do with the rest of the candidate
> MCVs.

Sure, but both of the proposed decision rules break down when there are no
values after the one under consideration.  We need to do something sane
there.

Hm... There is indeed a case where it would be beneficial to have at least 2 values in the histogram (so that we have at least the low/high bounds for inequality comparison selectivity), instead of taking both into the MCV list, or taking one into the MCVs and having to discard the other.

Obviously, we need a fresh idea on how to handle this.
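
To make the two-value trade-off concrete, here is a minimal sketch in Python. It is not the code from analyze.c; `split_sample` and its return shape are hypothetical names made up for illustration. It only assumes that a histogram needs at least two values to provide low/high bounds.

```python
from collections import Counter


def split_sample(sample, num_mcv):
    """Split sampled values into MCVs and histogram candidates.

    Hypothetical helper for illustration only: the num_mcv most frequent
    values become MCVs, the remaining distinct values go to the histogram.
    """
    counts = Counter(sample)
    # Rank by descending frequency; break ties by value for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    mcvs = [value for value, _ in ranked[:num_mcv]]
    rest = sorted(value for value, _ in ranked[num_mcv:])
    # Assumption: fewer than two leftover values cannot give low/high bounds.
    histogram_bounds = (rest[0], rest[-1]) if len(rest) >= 2 else None
    # A single leftover value fits neither list, so it is discarded.
    discarded = rest if len(rest) == 1 else []
    return mcvs, histogram_bounds, discarded


sample = [1] * 6 + [2] * 4
for n in range(3):
    print(n, split_sample(sample, n))
```

With only two distinct values, taking no MCVs is the only choice that leaves histogram bounds; taking one MCV forces the other value to be discarded.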

> For "the only value" case: we cannot build a histogram out of a single
> value, so omitting it from MCVs is not a good strategy, ISTM.
> From my point of view that amounts to "include always".

If there is only one value, it will have 100% of the samples, so it would
get included under just about any decision rule (other than "more than
100% of this value plus following values").  I don't think making sure
this case works is sufficient to get us to a reasonable rule --- it's
a necessary case, but not a sufficient case.
 
Well, if it's the only value, it will be accepted simply because we already check for that special case and don't even bother to loop through the track list.
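
A rough sketch of that control flow, in Python rather than C. The names `choose_num_mcv`, `max_mcv` and `keep_candidate` are hypothetical, and the per-candidate rule is deliberately left as a parameter, since that rule is exactly what is under discussion here.

```python
def choose_num_mcv(track, ndistinct, max_mcv, keep_candidate):
    """Return how many entries of the track list become MCVs.

    track: list of (value, count) pairs sorted by descending count.
    keep_candidate: hypothetical rule, called as keep_candidate(track, i).
    """
    # Special case: the track list holds every distinct value seen and
    # all of them fit, so accept them all without consulting the rule.
    if len(track) == ndistinct and len(track) <= max_mcv:
        return len(track)
    # Otherwise walk the candidates in order and stop at the first
    # one the decision rule rejects.
    num = 0
    for i in range(min(len(track), max_mcv)):
        if not keep_candidate(track, i):
            break
        num += 1
    return num


def reject_all(track, i):
    return False


# A lone value is kept even under a rule that rejects everything.
print(choose_num_mcv([("a", 100)], 1, 10, reject_all))
```

Under this assumed structure, the single-value case says nothing about whether the per-candidate rule is reasonable, because the rule is never evaluated for it.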

--
Alex
