Thread: [BUGS] Strange influence of default_statistics_target

[BUGS] Strange influence of default_statistics_target

From
Вадим Акбашев
Date:
Hello!
I have encountered a problem with query plan building:
I set default_statistics_target=700 and ran ANALYZE. The Postgres optimizer chose a plan with a hash join, and the query took ~1 min to complete.
Then I set default_statistics_target=500, and the plan changed significantly: it used a merge join instead, completion time dropped by a factor of hundreds, and the cost dropped drastically.
Now I can't understand why more precise statistics lead to a less optimal plan. What is the right way to use the default_statistics_target parameter?
I attach both the good and bad query plans and the query itself.

Attachment

Re: [BUGS] Strange influence of default_statistics_target

From
Tom Lane
Date:
Вадим Акбашев <ufaowl@gmail.com> writes:
> I have encountered a problem with query plan building:
> I set default_statistics_target=700 and ran ANALYZE. The Postgres optimizer
> chose a plan with a hash join, and the query took ~1 min to complete.
> Then I set default_statistics_target=500, and the plan changed
> significantly: it used a merge join instead, completion time dropped by a
> factor of hundreds, and the cost dropped drastically.
> Now I can't understand why more precise statistics lead to a less optimal
> plan. What is the right way to use the default_statistics_target parameter?
> I attach both the good and bad query plans and the query itself.

Are those really the same query?  Plan 2 is enforcing a "number_value IS
NOT NULL" condition on "attribute_value av1" that I don't see in plan 1.
And neither plan seems to have much to do with the query, since the
query has UNIONs that aren't in the plans.

But the short answer seems to be that in both cases, the only reason that
the plan doesn't take forever to run is that one sub-join chances to yield
precisely zero rows, and the PG executor happens to be more efficient
about that corner case in the one plan shape than the other.  The planner
doesn't take the possibility of that short-circuit happening into account,
since it generally cannot be sure that a sub-join wouldn't yield any rows.
So it's just luck that one plan is noticeably faster in this case.
        regards, tom lane
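
The zero-row corner case described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not PostgreSQL's executor code and does not reproduce the attached plans. The names CountingSource, toy_hash_join and toy_merge_join are hypothetical, join keys are assumed unique and pre-sorted, and the toy hash join always builds its table before looking at the outer side.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


class CountingSource:
    """Iterable wrapper that counts how many rows were actually pulled."""

    def __init__(self, rows: Iterable[int]) -> None:
        self.rows = list(rows)
        self.pulled = 0

    def __iter__(self) -> Iterator[int]:
        for row in self.rows:
            self.pulled += 1
            yield row


def toy_hash_join(outer: Iterable[int], inner: Iterable[int]) -> List[Tuple[int, int]]:
    # Build phase: this toy version consumes the whole inner input first,
    # even if the outer input later turns out to be empty.
    table: Dict[int, List[int]] = {}
    for key in inner:
        table.setdefault(key, []).append(key)
    # Probe phase: look up each outer key in the hash table.
    return [(o, i) for o in outer for i in table.get(o, [])]


def toy_merge_join(outer: Iterable[int], inner: Iterable[int]) -> List[Tuple[int, int]]:
    # Assumes both inputs are sorted and keys are unique on each side.
    result: List[Tuple[int, int]] = []
    o_it, i_it = iter(outer), iter(inner)
    o = next(o_it, None)
    if o is None:
        return result  # empty outer side: the inner side is never read
    i = next(i_it, None)
    while o is not None and i is not None:
        if o == i:
            result.append((o, i))
            o, i = next(o_it, None), next(i_it, None)
        elif o < i:
            o = next(o_it, None)
        else:
            i = next(i_it, None)
    return result


if __name__ == "__main__":
    # One input (standing in for a sub-join) yields zero rows; the other is large.
    for join in (toy_hash_join, toy_merge_join):
        empty, big = CountingSource([]), CountingSource(range(100_000))
        rows = join(empty, big)
        print(join.__name__, "result rows:", len(rows), "inner rows read:", big.pulled)
```

In this toy model both joins return no rows, but the hash variant reads all 100,000 inner rows while the merge variant reads none. A cost estimate computed before execution, without knowing that one input is empty, would not reflect that difference.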


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)

Re: [BUGS] Strange influence of default_statistics_target

From
Вадим Акбашев
Date:
Hi, Tom! 
Thank you for your answer.
Those plans were indeed for the query I sent. But the plans were not complete, as I sent only the parts that, as I thought, contain the problem. In this letter I attach the full plans and also the statistics for the tables the query works with, for default_statistics_target=700 and default_statistics_target=500.
This query runs much faster in our test environment, which is 75% of the size of our production server.
Also, I've noticed that the plan becomes "bad" again 9-10 hours after it was fixed. And if I run ANALYZE again without changing effective_cache_size in postgresql.conf, it remains "bad".

Thank you in advance
Vadim


2017-01-18 19:18 GMT+05:00 Tom Lane <tgl@sss.pgh.pa.us>:
Вадим Акбашев <ufaowl@gmail.com> writes:
> I have encountered a problem with query plan building:
> I set default_statistics_target=700 and ran ANALYZE. The Postgres optimizer
> chose a plan with a hash join, and the query took ~1 min to complete.
> Then I set default_statistics_target=500, and the plan changed
> significantly: it used a merge join instead, completion time dropped by a
> factor of hundreds, and the cost dropped drastically.
> Now I can't understand why more precise statistics lead to a less optimal
> plan. What is the right way to use the default_statistics_target parameter?
> I attach both the good and bad query plans and the query itself.

Are those really the same query?  Plan 2 is enforcing a "number_value IS
NOT NULL" condition on "attribute_value av1" that I don't see in plan 1.
And neither plan seems to have much to do with the query, since the
query has UNIONs that aren't in the plans.

But the short answer seems to be that in both cases, the only reason that
the plan doesn't take forever to run is that one sub-join chances to yield
precisely zero rows, and the PG executor happens to be more efficient
about that corner case in the one plan shape than the other.  The planner
doesn't take the possibility of that short-circuit happening into account,
since it generally cannot be sure that a sub-join wouldn't yield any rows.
So it's just luck that one plan is noticeably faster in this case.

                        regards, tom lane

Re: [BUGS] Strange influence of default_statistics_target

From
Вадим Акбашев
Date:
Sorry, I forgot to attach the files themselves.

Attachment

Re: [BUGS] Strange influence of default_statistics_target

From
Вадим Акбашев
Date:
Hello.
Is the information I sent sufficient?
Thank you.

2017-01-19 14:11 GMT+05:00 Вадим Акбашев <ufaowl@gmail.com>:
Sorry, I forgot to attach the files themselves.
