Re: More stable query plans via more predictable column statistics - Mailing list pgsql-hackers

From Shulgin, Oleksandr
Subject Re: More stable query plans via more predictable column statistics
Date
Msg-id CACACo5TS-v4KkU-LdgZ-uhbKpSRWb1rjGaC5o3uWn9Gwvi2MQA@mail.gmail.com
In response to Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Apr 1, 2016 23:14, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>
> "Shulgin, Oleksandr" <oleksandr.shulgin@zalando.de> writes:
> > Alright.  I'm attaching the latest version of this patch split in two
> > parts: the first one is NULLs-related bugfix and the second is the
> > "improvement" part, which applies on top of the first one.
>
> I've applied the first of these patches,

Great news, thank you!

> broken into two parts first
> because it seemed like there were two issues and second because Tomas
> deserved primary credit for one part, ie realizing we were using the
> Haas-Stokes formula wrong.
>
> As for the other part, I committed it with one non-cosmetic change:
> I do not think it is right to omit "too wide" values when considering
> the threshold for MCVs.  As submitted, the patch was inconsistent on
> that point anyway since it did it differently in compute_distinct_stats
> and compute_scalar_stats.  But the larger picture here is that we define
> the MCV population to exclude nulls, so it's reasonable to consider a
> value as an MCV even if it's greatly outnumbered by nulls.  There is
> no such exclusion for "too wide" values; those things are just an
> implementation limitation in analyze.c, not something that is part of
> the pg_statistic definition.  If there are a lot of "too wide" values
> in the sample, we don't know whether any of them are duplicates, but
> we do know that the frequencies of the normal-width values have to be
> discounted appropriately.

Okay.

> Haven't looked at 0002 yet.

[crosses fingers] hope you'll have a chance to do that before feature freeze for 9.6…

--
Alex
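
The point quoted above about "too wide" values can be illustrated with a small Python sketch. This is not the analyze.c code: the function name `nonnull_shares`, the `max_width` cutoff, and the sample data are all hypothetical, chosen only to show how leaving "too wide" values out of the denominator inflates the apparent share of the normal-width values, while nulls stay outside the MCV population in both cases.

```python
from collections import Counter

def nonnull_shares(sample, max_width, skip_too_wide):
    """Share of each normal-width value among the non-null sample rows.

    Hypothetical illustration only, not PostgreSQL's actual algorithm.
    Nulls are always excluded from the population. If skip_too_wide is
    True, values longer than max_width are also dropped from the
    denominator, which is the behaviour the quoted message argues against.
    """
    nonnull = [v for v in sample if v is not None]
    tracked = [v for v in nonnull if len(v) <= max_width]
    denom = len(tracked) if skip_too_wide else len(nonnull)
    counts = Counter(tracked)
    return {value: count / denom for value, count in counts.items()}

# Made-up sample: 50 nulls, 10 x "a", 5 x "b", 35 distinct over-wide values.
sample = (
    [None] * 50
    + ["a"] * 10
    + ["b"] * 5
    + [str(i) + "x" * 2000 for i in range(35)]
)

# Too-wide values kept in the denominator: "a" is 10 of 50 non-null rows.
discounted = nonnull_shares(sample, max_width=1024, skip_too_wide=False)

# Too-wide values omitted: "a" looks like 10 of only 15 rows.
inflated = nonnull_shares(sample, max_width=1024, skip_too_wide=True)
```

In this made-up sample the 50 nulls do not affect either result, whereas the 35 over-wide rows change the share of "a" from 10/15 to 10/50 once they are counted.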
