Re: More stable query plans via more predictable column statistics - Mailing list pgsql-hackers
From | Shulgin, Oleksandr |
---|---|
Subject | Re: More stable query plans via more predictable column statistics |
Date | |
Msg-id | CACACo5T=a+GmEuvHq9R5bYJQPVJe8Fio-qyYtpnyzs_wrTkY5Q@mail.gmail.com Whole thread Raw |
In response to | Re: More stable query plans via more predictable column statistics (Joel Jacobson <joel@trustly.com>) |
Responses |
Re: More stable query plans via more predictable column
statistics
|
List | pgsql-hackers |
On Tue, Mar 8, 2016 at 9:10 PM, Joel Jacobson <joel@trustly.com> wrote:
On Wed, Mar 9, 2016 at 1:25 AM, Shulgin, Oleksandr
<oleksandr.shulgin@zalando.de> wrote:
> Thank you for spending your time to run these :-)
n/p, it took like 30 seconds :-)
Great! I'm glad to hear it was as easy to use as I hoped for :-)
> I don't want to be asking for too much here, but is there a chance you could
> try the effects of the proposed patch on an offline copy of your database?
Yes, I think that should be possible.
> Do you envision or maybe have experienced problems with query plans
> referring to the columns that are near the top of the above hist_ratio
> report? In other words: what are the practical implications for you with
> the values being duplicated rather badly throughout the histogram like in
> the example you shown?
I don't know much about the internals of query planner,
I just read the "57.1. Row Estimation Examples" to get a basic understanding.
If I understand it correctly, if the histogram_bounds contains a lot
of duplicated values,
then the row estimation will be inaccurate, which in turn will trick
the query planner
into a sub-optimal plan?
Yes, basically it should matter the most for the equality comparison operator, such that a MCV entry would provide more accurate selectivity estimate (and the histogram is not used at all in this case anyway). For the "less/greater-than" comparison both MCV list and histogram are used, so the drawback of having repeated values in the histogram, in my understanding is the same: less accurate selectivity estimates for the values that could fall precisely into a bin which didn't make it into the histogram.
We've had some problems lately with the query planner, or actually we've always
had them but never noticed them nor cared about them, but now during peak times
we've had short periods where we haven't been able to fully cope up
with the traffic.
I tracked down the most self_time-consuming functions and quickly saw
how to optimize them.
Many of them where on the form:
SELECT .. FROM SomeBigTable WHERE Col1 = [some dynamic value] AND Col2
= [some constant value] AND Col3 = [some other constant value]
The number of rows matching the WHERE clause were very tiny, perfect
match for a partial index:
CREATE INDEX .. ON SomeBigTable USING btree (Col1) WHERE Col2 = [some
constant value] AND Col3 = [some other constant value];
Even though the new partial index matched the query perfectly, the
query planner didn't want to use it. Instead it continued to use some
other sub-optimal index.
The only way to force it to use the correct index was to use the
"+0"-trick which I recently learned from one of my colleagues:
SELECT .. FROM SomeBigTable WHERE Col1 = [some dynamic value] AND
Col2+0 = [some constant value] AND Col3+0 = [some other constant
value]
CREATE INDEX .. ON SomeBigTable USING btree (Col1) WHERE Col2+0 =
[some constant value] AND Col3+0 = [some other constant value];
By adding +0 to the columns, the query planner will as I understand it
be extremely motivated to use the correct index, as otherwise it would
have to do a seq scan on the entire big table, which would be very
costly.
I'm glad the trick worked, now the system is fast again.
We're still on 9.1, so maybe these problems will go away once we upgrade to 9.5.
Hm... sounds like a planner bug to me. I'm not exceptionally aware of the changes in partial index handling that were made after 9.1, though grepping the commit log for "partial index" produces a number of hits after the date of 9.1 release.
I don't know if these problems I described can be fixed by your patch,
but I wanted to share this story since I know our systems (Trustly's
and Zalando's) are quite similar in design,
so maybe you have experienced something similar.
I would not expect this type of problem to be affected by the patch in any way, though maybe I'm missing the complete picture here.
Also, I'm not aware of similar problems in our systems, but I can ask around. :-)
Thank you.
--
Alex
pgsql-hackers by date: