Thread: SELECT slows down on sixth execution
I have a very complex SELECT for which I use PREPARE and then EXECUTE. The first five times I run "explain (analyze, buffers) execute ..." in psql, it takes about 1s. Starting with the sixth execution, the plan changes and execution time doubles or more. The slower plan is used from then on. If I DEALLOCATE the prepared statement and PREPARE again, the cycle is reset and I get five good executions again.

This behavior is utterly mystifying to me since I can see no reason for Postgres to change its plan after an arbitrary number of executions, especially for the worse. When I did the experiment on a development system, Postgres was doing nothing apart from the interactively executed statements. No data were inserted, no settings were changed and no other clients were active in any way. Is there some threshold for five or six executions of the same query?

Without delving into the plans themselves yet, what could possibly cause the prepared statement to be re-planned? I have seen the same behavior on Postgres 9.2.10 and 9.4.1.

--
Jonathan Rogers
Socialserve.com by Emphasys Software
jrogers@emphasys-software.com
Jonathan Rogers wrote:
> I have a very complex SELECT for which I use PREPARE and then EXECUTE.
> The first five times I run "explain (analyze, buffers) execute ..." in
> psql, it takes about 1s. Starting with the sixth execution, the plan
> changes and execution time doubles or more. The slower plan is used from
> then on. If I DEALLOCATE the prepared statement and PREPARE again, the
> cycle is reset and I get five good executions again.
>
> This behavior is utterly mystifying to me since I can see no reason for
> Postgres to change its plan after an arbitrary number of executions,
> especially for the worse. When I did the experiment on a development
> system, Postgres was doing nothing apart from the interactively executed
> statements. No data were inserted, no settings were changed and no other
> clients were active in any way. Is there some threshold for five or six
> executions of the same query?
>
> Without delving into the plans themselves yet, what could possibly cause
> the prepared statement to be re-planned? I have seen the same behavior
> on Postgres 9.2.10 and 9.4.1.

You are encountering "custom plans", introduced in 9.2.

When a statement with parameters is executed, PostgreSQL will not only generate a generic plan, but for the first 5 executions it will substitute the arguments and generate and execute a custom plan for that.

After 5 executions, the cost of the generic plan is compared to the average of the costs of the custom plans. If the generic plan is cheaper, it will be used from that point on; if it is more expensive, a custom plan will be used.

So what you encounter is probably caused by bad estimates for either the custom plan or the generic plan.

Look at the EXPLAIN ANALYZE output for both the custom plan (one of the first five executions) and the generic plan (the one used from the sixth time on) and see if you can find and fix the cause for the misestimate.

Other than that, you could stop using prepared statements, but that is probably not the optimal solution.

Yours,
Laurenz Albe
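A minimal psql sketch of the cycle described above (the table, index, and parameter value are hypothetical; any parameterized statement shows the same behavior):

    CREATE TABLE orders (id serial PRIMARY KEY, customer_id integer);
    CREATE INDEX ON orders (customer_id);

    PREPARE stmt(integer) AS
        SELECT count(*) FROM orders WHERE customer_id = $1;

    -- executions 1-5: the argument is substituted and a custom plan
    -- is generated and executed each time
    EXPLAIN (ANALYZE, BUFFERS) EXECUTE stmt(42);
    -- ... repeated four more times ...

    -- execution 6: the generic plan's estimated cost is compared with
    -- the average estimated cost of the five custom plans, and the
    -- generic plan can take over from here on
    EXPLAIN (ANALYZE, BUFFERS) EXECUTE stmt(42);

    -- deallocating and re-preparing resets the cycle
    DEALLOCATE stmt;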
Hi
2015-10-14 9:38 GMT+02:00 Jonathan Rogers <jrogers@socialserve.com>:
> I have a very complex SELECT for which I use PREPARE and then EXECUTE.
> The first five times I run "explain (analyze, buffers) execute ..." in
> psql, it takes about 1s. Starting with the sixth execution, the plan
> changes and execution time doubles or more. The slower plan is used from
> then on. If I DEALLOCATE the prepared statement and PREPARE again, the
> cycle is reset and I get five good executions again.
>
> This behavior is utterly mystifying to me since I can see no reason for
> Postgres to change its plan after an arbitrary number of executions,
> especially for the worse. When I did the experiment on a development
> system, Postgres was doing nothing apart from the interactively executed
> statements. No data were inserted, no settings were changed and no other
> clients were active in any way. Is there some threshold for five or six
> executions of the same query?
Yes, there is. PostgreSQL tries custom plans (optimized for the specific parameter values) for the first five executions and then compares their average cost with the cost of a generic plan. If the generic plan is cheaper, PostgreSQL will use the generic plan from then on (it is optimized for the most common values, not for the currently supplied ones).
see https://github.com/postgres/postgres/blob/master/src/backend/utils/cache/plancache.c , function choose_custom_plan
As far as I know, this behavior isn't possible to change from the outside. It shouldn't be hard to write an extension providing its own PREPARE function that sets the CURSOR_OPT_CUSTOM_PLAN option.
Regards
Pavel
> Without delving into the plans themselves yet, what could possibly cause
> the prepared statement to be re-planned? I have seen the same behavior
> on Postgres 9.2.10 and 9.4.1.
>
> --
> Jonathan Rogers
> Socialserve.com by Emphasys Software
> jrogers@emphasys-software.com
On 10/14/2015 05:00 AM, Albe Laurenz wrote:
> Jonathan Rogers wrote:
>> I have a very complex SELECT for which I use PREPARE and then EXECUTE.
>> The first five times I run "explain (analyze, buffers) execute ..." in
>> psql, it takes about 1s. Starting with the sixth execution, the plan
>> changes and execution time doubles or more. The slower plan is used from
>> then on. If I DEALLOCATE the prepared statement and PREPARE again, the
>> cycle is reset and I get five good executions again.
>>
>> This behavior is utterly mystifying to me since I can see no reason for
>> Postgres to change its plan after an arbitrary number of executions,
>> especially for the worse. When I did the experiment on a development
>> system, Postgres was doing nothing apart from the interactively executed
>> statements. No data were inserted, no settings were changed and no other
>> clients were active in any way. Is there some threshold for five or six
>> executions of the same query?
>>
>> Without delving into the plans themselves yet, what could possibly cause
>> the prepared statement to be re-planned? I have seen the same behavior
>> on Postgres 9.2.10 and 9.4.1.
>
> You are encountering "custom plans", introduced in 9.2.
>
> When a statement with parameters is executed, PostgreSQL will not only generate
> a generic plan, but for the first 5 executions it will substitute the arguments
> and generate and execute a custom plan for that.
>
> After 5 executions, the cost of the generic plan is compared to the average
> of the costs of the custom plans. If the generic plan is cheaper, it will
> be used from that point on; if it is more expensive, a custom plan will be used.
>
> So what you encounter is probably caused by bad estimates for either
> the custom plan or the generic plan.

Thanks. That does explain what I've seen.

> Look at the EXPLAIN ANALYZE output for both the custom plan (one of the
> first five executions) and the generic plan (the one used from the sixth
> time on) and see if you can find and fix the cause for the misestimate.

Yes, I have been looking at both plans and can see where they diverge. How could I go about figuring out why Postgres fails to see the large difference in plan execution time? I use exactly the same parameters every time I execute the prepared statement, so how would Postgres come to think that those are not the norm?

> Other than that, you could stop using prepared statements, but that is
> probably not the optimal solution.

This is probably what I'll end up doing. The statement preparation is the result of a custom layer that does so universally and I'll probably just turn that feature off.

--
Jonathan Rogers
Socialserve.com by Emphasys Software
jrogers@emphasys-software.com
Jonathan Rogers wrote:
>> Look at the EXPLAIN ANALYZE output for both the custom plan (one of the
>> first five executions) and the generic plan (the one used from the sixth
>> time on) and see if you can find and fix the cause for the misestimate.
>
> Yes, I have been looking at both plans and can see where they diverge.
> How could I go about figuring out why Postgres fails to see the large
> difference in plan execution time? I use exactly the same parameters
> every time I execute the prepared statement, so how would Postgres come
> to think that those are not the norm?

PostgreSQL does not consider the actual query execution time, it only compares its estimates for the generic and the custom plans. Also, it does not keep track of the parameter values you supply, only of the average custom plan cost estimate.

The problem is either that the planner underestimates the cost of the generic plan or overestimates the cost of the custom plans.

If you look at the EXPLAIN ANALYZE outputs (probably with http://explain.depesz.com ), are there any row count estimates that differ significantly from reality?

Yours,
Laurenz Albe
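Such a misestimate shows up in EXPLAIN ANALYZE output as a gap between the estimated and actual row counts on a plan node (a hypothetical excerpt; the node and the numbers are made up):

    Index Scan using orders_customer_id_idx on orders
        (cost=0.43..8.45 rows=1 width=12)
        (actual time=0.031..42.700 rows=15382 loops=1)
    -- "rows=1" (estimated) vs. "rows=15382" (actual) is the kind of
    -- discrepancy to look for; explain.depesz.com flags it as "rows x"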
On 10/16/2015 08:37 AM, Albe Laurenz wrote:
> Jonathan Rogers wrote:
>>> Look at the EXPLAIN ANALYZE output for both the custom plan (one of the
>>> first five executions) and the generic plan (the one used from the sixth
>>> time on) and see if you can find and fix the cause for the misestimate.
>>
>> Yes, I have been looking at both plans and can see where they diverge.
>> How could I go about figuring out why Postgres fails to see the large
>> difference in plan execution time? I use exactly the same parameters
>> every time I execute the prepared statement, so how would Postgres come
>> to think that those are not the norm?
>
> PostgreSQL does not consider the actual query execution time, it only
> compares its estimates for the generic and the custom plans.
> Also, it does not keep track of the parameter values you supply,
> only of the average custom plan cost estimate.

OK, that makes more sense then. It's somewhat tedious for the purpose of testing to execute a prepared statement six times to see the plan which needs to be optimized. Unfortunately, there doesn't seem to be any way to force use of a generic plan in SQL, based on Pavel Stehule's reply.

> The problem is either that the planner underestimates the cost of
> the generic plan or overestimates the cost of the custom plans.
>
> If you look at the EXPLAIN ANALYZE outputs (probably with
> http://explain.depesz.com ), are there any row count estimates that
> differ significantly from reality?

Now that I've read the site's help to understand what "rows x" means, I can see that while both plans underestimate returned rows, the generic one underestimates them by a much larger factor. In this case, the solution is to avoid preparing the query to ensure a custom plan is used every time.

Since the planner is significantly underestimating row counts even when making custom plans, I will continue to try to improve the planner's information. My default_statistics_target is currently 500. I suppose I should experiment with increasing it for certain columns. Thanks for the pointers.

--
Jonathan Rogers
Socialserve.com by Emphasys Software
jrogers@emphasys-software.com
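A per-column statistics target can be raised like this (a minimal sketch; the table and column names and the target value are hypothetical):

    -- collect more detailed statistics for a column whose selectivity
    -- is being misestimated; this overrides default_statistics_target
    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 2000;

    -- the new target takes effect when statistics are next gathered
    ANALYZE orders;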
On 10/14/2015 05:01 AM, Pavel Stehule wrote:
> Hi
>
> 2015-10-14 9:38 GMT+02:00 Jonathan Rogers <jrogers@socialserve.com
> <mailto:jrogers@socialserve.com>>:
>
> I have a very complex SELECT for which I use PREPARE and then EXECUTE.
> The first five times I run "explain (analyze, buffers) execute ..." in
> psql, it takes about 1s. Starting with the sixth execution, the plan
> changes and execution time doubles or more. The slower plan is used from
> then on. If I DEALLOCATE the prepared statement and PREPARE again, the
> cycle is reset and I get five good executions again.
>
> This behavior is utterly mystifying to me since I can see no reason for
> Postgres to change its plan after an arbitrary number of executions,
> especially for the worse. When I did the experiment on a development
> system, Postgres was doing nothing apart from the interactively executed
> statements. No data were inserted, no settings were changed and no other
> clients were active in any way. Is there some threshold for five or six
> executions of the same query?
>
> Yes, there is. PostgreSQL tries custom plans (optimized for the specific
> parameter values) for the first five executions and then compares their
> average cost with the cost of a generic plan. If the generic plan is
> cheaper, PostgreSQL will use the generic plan from then on (it is
> optimized for the most common values, not for the currently supplied ones).
>
> see
> https://github.com/postgres/postgres/blob/master/src/backend/utils/cache/plancache.c
> , function choose_custom_plan
>
> As far as I know, this behavior isn't possible to change from the outside.
> It shouldn't be hard to write an extension providing its own PREPARE
> function that sets the CURSOR_OPT_CUSTOM_PLAN option.

Thanks for the link. I can see the hard-coded "5" right there. I looked in the docs a bit and found the server C function "SPI_prepare_cursor" which allows explicit selection of a custom or generic plan. However, if I understand you correctly, there is currently no SQL interface to explicitly control what type of plan is used.

So, the solution for my particular query is to avoid preparing it, ensuring it gets a custom plan every time. The decision to prepare it came from a client-side layer which defaults to preparing everything rather than any specific reason, and we're now reconsidering that policy.

--
Jonathan Rogers
Socialserve.com by Emphasys Software
jrogers@emphasys-software.com
2015-10-17 4:29 GMT+02:00 Jonathan Rogers <jrogers@socialserve.com>:
> On 10/14/2015 05:01 AM, Pavel Stehule wrote:
> > Hi
> >
> > 2015-10-14 9:38 GMT+02:00 Jonathan Rogers <jrogers@socialserve.com
> > <mailto:jrogers@socialserve.com>>:
> >
> > I have a very complex SELECT for which I use PREPARE and then EXECUTE.
> > The first five times I run "explain (analyze, buffers) execute ..." in
> > psql, it takes about 1s. Starting with the sixth execution, the plan
> > changes and execution time doubles or more. The slower plan is used from
> > then on. If I DEALLOCATE the prepared statement and PREPARE again, the
> > cycle is reset and I get five good executions again.
> >
> > This behavior is utterly mystifying to me since I can see no reason for
> > Postgres to change its plan after an arbitrary number of executions,
> > especially for the worse. When I did the experiment on a development
> > system, Postgres was doing nothing apart from the interactively executed
> > statements. No data were inserted, no settings were changed and no other
> > clients were active in any way. Is there some threshold for five or six
> > executions of the same query?
> >
> > Yes, there is. PostgreSQL tries custom plans (optimized for the specific
> > parameter values) for the first five executions and then compares their
> > average cost with the cost of a generic plan. If the generic plan is
> > cheaper, PostgreSQL will use the generic plan from then on (it is
> > optimized for the most common values, not for the currently supplied ones).
> >
> > see
> > https://github.com/postgres/postgres/blob/master/src/backend/utils/cache/plancache.c
> > , function choose_custom_plan
> >
> > As far as I know, this behavior isn't possible to change from the outside.
> > It shouldn't be hard to write an extension providing its own PREPARE
> > function that sets the CURSOR_OPT_CUSTOM_PLAN option.
>
> Thanks for the link. I can see the hard-coded "5" right there. I looked
> in the docs a bit and found the server C function "SPI_prepare_cursor"
> which allows explicit selection of a custom or generic plan. However, if
> I understand you correctly, there is currently no SQL interface to
> explicitly control what type of plan is used.
>
> So, the solution for my particular query is to avoid preparing it,
> ensuring it gets a custom plan every time. The decision to prepare it
> came from a client-side layer which defaults to preparing everything
> rather than any specific reason, and we're now reconsidering that policy.
I was not 100% correct - you can use parametrized queries via
PQexecParams http://www.postgresql.org/docs/9.4/static/libpq-exec.html
If this function is accessible from your environment, then you should use it. It protects against SQL injection, and it doesn't use a generic plan. For your case, using prepared statements is counterproductive.
Another solution is client-side prepared statements - many APIs use them by default.
Regards
Pavel
On 2015-10-14 03:00, Albe Laurenz wrote:
>
> You are encountering "custom plans", introduced in 9.2.
>
> When a statement with parameters is executed, PostgreSQL will not only generate
> a generic plan, but for the first 5 executions it will substitute the arguments
> and generate and execute a custom plan for that.

Wow! Thanks. I feel this should be documented a bit better. Shouldn't this be explained in at least as much detail as in your explanation, in the sql-prepare document?

Yves.

--
http://yves.zioup.com
gpg: 4096R/32B0F416
2015-10-17 15:29 GMT+02:00 Yves Dorfsman <yves@zioup.com>:
> On 2015-10-14 03:00, Albe Laurenz wrote:
> >
> > You are encountering "custom plans", introduced in 9.2.
> >
> > When a statement with parameters is executed, PostgreSQL will not only generate
> > a generic plan, but for the first 5 executions it will substitute the arguments
> > and generate and execute a custom plan for that.
>
> Wow! Thanks. I feel this should be documented a bit better.
>
> Shouldn't this be explained in at least as much detail as in your
> explanation, in the sql-prepare document?
Probably - a section about the benefits and risks would be useful - but that is a task for somebody with better English than mine :)
Regards
Pavel
> Yves.
>
> --
> http://yves.zioup.com
> gpg: 4096R/32B0F416
On Fri, Oct 16, 2015 at 9:14 PM, Jonathan Rogers <jrogers@socialserve.com> wrote:
> On 10/16/2015 08:37 AM, Albe Laurenz wrote:
>> Jonathan Rogers wrote:
>>>> Look at the EXPLAIN ANALYZE output for both the custom plan (one of the
>>>> first five executions) and the generic plan (the one used from the sixth
>>>> time on) and see if you can find and fix the cause for the misestimate.
>>>
>>> Yes, I have been looking at both plans and can see where they diverge.
>>> How could I go about figuring out why Postgres fails to see the large
>>> difference in plan execution time? I use exactly the same parameters
>>> every time I execute the prepared statement, so how would Postgres come
>>> to think that those are not the norm?
>>
>> PostgreSQL does not consider the actual query execution time, it only
>> compares its estimates for the generic and the custom plans.
>> Also, it does not keep track of the parameter values you supply,
>> only of the average custom plan cost estimate.
>
> OK, that makes more sense then. It's somewhat tedious for the purpose of
> testing to execute a prepared statement six times to see the plan which
> needs to be optimized. Unfortunately, there doesn't seem to be any way
> to force use of a generic plan in SQL, based on Pavel Stehule's reply.

Yeah. In the worst case, a query can fail in the generic plan because it depends on the arguments for dubious things like

    SELECT CASE WHEN _arg = 'TEXT' THEN foo::text ...

I'm ok with why those things must fail, but it'd sure be nice to be able to control the switch to the generic plan.

merlin
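A contrived sketch of that kind of failure, assuming (as I understand the planner) that constant subexpressions inside CASE arms are folded while a plan is built: the first five custom plans fold the parameter test to true and discard the other arm, while the generic plan must keep both arms, so building it evaluates the constant expression and errors out.

    PREPARE p(text) AS
        SELECT CASE WHEN $1 = 'safe' THEN 1 ELSE 1/0 END;

    EXECUTE p('safe');  -- executions 1-5: each custom plan folds
    EXECUTE p('safe');  -- "$1 = 'safe'" to true and drops the 1/0
    EXECUTE p('safe');  -- arm, so nothing fails
    EXECUTE p('safe');
    EXECUTE p('safe');

    EXECUTE p('safe');  -- execution 6: building the generic plan keeps
                        -- both arms and constant-folds 1/0, raising
                        -- "division by zero" before any row is processed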
Jonathan Rogers wrote on 17.10.2015 at 04:14:
>>> Yes, I have been looking at both plans and can see where they diverge.
>>> How could I go about figuring out why Postgres fails to see the large
>>> difference in plan execution time? I use exactly the same parameters
>>> every time I execute the prepared statement, so how would Postgres come
>>> to think that those are not the norm?
>>
>> PostgreSQL does not consider the actual query execution time, it only
>> compares its estimates for the generic and the custom plans.
>> Also, it does not keep track of the parameter values you supply,
>> only of the average custom plan cost estimate.
>
> OK, that makes more sense then. It's somewhat tedious for the purpose of
> testing to execute a prepared statement six times to see the plan which
> needs to be optimized. Unfortunately, there doesn't seem to be any way
> to force use of a generic plan in SQL, based on Pavel Stehule's reply.

If you are using JDBC the threshold can be changed:

https://jdbc.postgresql.org/documentation/94/server-prepare.html
https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/PGStatement.html#setPrepareThreshold%28int%29

As I don't think JDBC is using anything "exotic", I would be surprised if this can't be changed in other programming environments also.

Thomas
2015-10-20 8:55 GMT+02:00 Thomas Kellerer <spam_eater@gmx.net>:
> Jonathan Rogers wrote on 17.10.2015 at 04:14:
> >>> Yes, I have been looking at both plans and can see where they diverge.
> >>> How could I go about figuring out why Postgres fails to see the large
> >>> difference in plan execution time? I use exactly the same parameters
> >>> every time I execute the prepared statement, so how would Postgres come
> >>> to think that those are not the norm?
> >>
> >> PostgreSQL does not consider the actual query execution time, it only
> >> compares its estimates for the generic and the custom plans.
> >> Also, it does not keep track of the parameter values you supply,
> >> only of the average custom plan cost estimate.
> >
> > OK, that makes more sense then. It's somewhat tedious for the purpose of
> > testing to execute a prepared statement six times to see the plan which
> > needs to be optimized. Unfortunately, there doesn't seem to be any way
> > to force use of a generic plan in SQL, based on Pavel Stehule's reply.
>
> If you are using JDBC the threshold can be changed:
>
> https://jdbc.postgresql.org/documentation/94/server-prepare.html
> https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/PGStatement.html#setPrepareThreshold%28int%29
>
> As I don't think JDBC is using anything "exotic", I would be surprised if
> this can't be changed in other programming environments also.
This is something different - in JDBC you can switch between server-side and client-side prepared statements. It doesn't change the behavior of server-side prepared statements in Postgres.
Pavel
> Thomas
On 10/20/2015 03:45 AM, Pavel Stehule wrote:
>
> 2015-10-20 8:55 GMT+02:00 Thomas Kellerer <spam_eater@gmx.net
> <mailto:spam_eater@gmx.net>>:
>
> Jonathan Rogers wrote on 17.10.2015 at 04:14:
> >>> Yes, I have been looking at both plans and can see where they diverge.
> >>> How could I go about figuring out why Postgres fails to see the large
> >>> difference in plan execution time? I use exactly the same parameters
> >>> every time I execute the prepared statement, so how would Postgres come
> >>> to think that those are not the norm?
> >>
> >> PostgreSQL does not consider the actual query execution time, it only
> >> compares its estimates for the generic and the custom plans.
> >> Also, it does not keep track of the parameter values you supply,
> >> only of the average custom plan cost estimate.
> >
> > OK, that makes more sense then. It's somewhat tedious for the purpose of
> > testing to execute a prepared statement six times to see the plan which
> > needs to be optimized. Unfortunately, there doesn't seem to be any way
> > to force use of a generic plan in SQL, based on Pavel Stehule's reply.
>
> If you are using JDBC the threshold can be changed:
>
> https://jdbc.postgresql.org/documentation/94/server-prepare.html
> https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/PGStatement.html#setPrepareThreshold%28int%29
>
> As I don't think JDBC is using anything "exotic", I would be surprised if
> this can't be changed in other programming environments also.
>
> This is something different - in JDBC you can switch between server-side
> and client-side prepared statements. It doesn't change the behavior of
> server-side prepared statements in Postgres.

I am using psycopg2 with a layer on top which can automatically PREPARE statements, so I guess that implements something similar to the JDBC interface. I did solve my problem by turning off the automatic preparation.

--
Jonathan Rogers
Socialserve.com by Emphasys Software
jrogers@emphasys-software.com
2015-10-20 16:48 GMT+02:00 Jonathan Rogers <jrogers@socialserve.com>:
> On 10/20/2015 03:45 AM, Pavel Stehule wrote:
> >
> > 2015-10-20 8:55 GMT+02:00 Thomas Kellerer <spam_eater@gmx.net
> > <mailto:spam_eater@gmx.net>>:
> >
> > Jonathan Rogers wrote on 17.10.2015 at 04:14:
> > >>> Yes, I have been looking at both plans and can see where they diverge.
> > >>> How could I go about figuring out why Postgres fails to see the large
> > >>> difference in plan execution time? I use exactly the same parameters
> > >>> every time I execute the prepared statement, so how would Postgres come
> > >>> to think that those are not the norm?
> > >>
> > >> PostgreSQL does not consider the actual query execution time, it only
> > >> compares its estimates for the generic and the custom plans.
> > >> Also, it does not keep track of the parameter values you supply,
> > >> only of the average custom plan cost estimate.
> > >
> > > OK, that makes more sense then. It's somewhat tedious for the purpose of
> > > testing to execute a prepared statement six times to see the plan which
> > > needs to be optimized. Unfortunately, there doesn't seem to be any way
> > > to force use of a generic plan in SQL, based on Pavel Stehule's reply.
> >
> > If you are using JDBC the threshold can be changed:
> >
> > https://jdbc.postgresql.org/documentation/94/server-prepare.html
> > https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/PGStatement.html#setPrepareThreshold%28int%29
> >
> > As I don't think JDBC is using anything "exotic", I would be surprised if
> > this can't be changed in other programming environments also.
> >
> > This is something different - in JDBC you can switch between server-side
> > and client-side prepared statements. It doesn't change the behavior of
> > server-side prepared statements in Postgres.
>
> I am using psycopg2 with a layer on top which can automatically PREPARE
> statements, so I guess that implements something similar to the JDBC
> interface. I did solve my problem by turning off the automatic preparation.
Yes, you turned off server-side prepared statements.
Pavel
> --
> Jonathan Rogers
> Socialserve.com by Emphasys Software
> jrogers@emphasys-software.com