Re: More stable query plans via more predictable column statistics - Mailing list pgsql-hackers

From Alex Shulgin
Subject Re: More stable query plans via more predictable column statistics
Date
Msg-id CAM-UEKScDXTbo0LWrQbZoZnuv=wNctz1=tABPGzWoGkEFAh0bQ@mail.gmail.com
In response to Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alex Shulgin <alex.shulgin@gmail.com> writes:
> > On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin <alex.shulgin@gmail.com> wrote:
> >> I'm not sure yet about the 1% rule for the last value, but would also love
> >> to see if we can avoid the arbitrary limit here.  What happens with a last
> >> value which is less than 1% popular in the current code anyway?
>
> > Now that I think about it, I don't really believe this arbitrary heuristic
> > is any good either, sorry.
>
> Yeah, it was just a placeholder to produce a working patch.
>
> Maybe we could base this cutoff on the stats target for the column?
> That is, "1%" would be the right number if stats target is 100,
> otherwise scale appropriately.
>
> > What was your motivation to introduce some limit at the bottom anyway?
>
> Well, we have to do *something* with the last (possibly only) value.
> Neither "include always" nor "omit always" seem sane to me.  What other
> decision rule do you want there?

Well, what implies that the last value is somehow special?  I would think we should just treat it the same way we treat the rest of the candidate MCVs.

For "the only value" case: we cannot build a histogram out of a single value, so omitting it from MCVs is not a good strategy, ISTM.

From my point of view that amounts to "include always".  What problems do you see with this approach exactly?
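
To make the two options concrete, here is a minimal Python sketch. It is not taken from the patch: the function names and the is_only_value flag are made up for illustration, and combining the two rules this way is only one possible reading of the discussion. It scales the "1% at stats target 100" cutoff as 1/stats_target, and always keeps the value in the "only value" case, since no histogram can be built from it.

```python
def scaled_cutoff(stats_target: int) -> float:
    """Hypothetical cutoff: 1% at stats target 100, scaled as 1/stats_target."""
    if stats_target <= 0:
        raise ValueError("stats_target must be positive")
    return 1.0 / stats_target


def keep_last_candidate(count: int, sample_rows: int, stats_target: int,
                        is_only_value: bool) -> bool:
    """Decide whether the last MCV candidate stays in the MCV list.

    Assumption: 'count' is how often the value occurs in the sample and
    'sample_rows' is the sample size; neither name comes from the patch.
    """
    if is_only_value:
        # A single distinct value cannot form a histogram, so keep it.
        return True
    # Otherwise apply the cutoff scaled by the column's stats target.
    return count / sample_rows >= scaled_cutoff(stats_target)
```

With these assumptions, a value seen 5 times in 1000 sampled rows (0.5%) is dropped at stats target 100 but kept at stats target 1000, where the cutoff falls to 0.1%.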

--
Alex
