Thread: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

[HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
Julian Markwort
Date:
Hello psql-hackers!

TL:DR;
We extended the functionality of pg_stat_statements so it can track 
worst and best case execution plans.

Based on a suggestion of my colleague Arne Scheffer, Marius Timmer and I 
extended pg_stat_statements so it can also record execution plans, 
whenever the execution time is exceeded (or deceeded) by a definable 
factor.
We were largely inspired by the pg_stat_plans extension by Peter 
Geoghegan and Simon Riggs - we don't claim any originality on this part 
- which is unfortunately not available on newer postgresql versions. 
There are a few differences which will become apparent in the following 
lines.

By default, the modified pg_stat_statements extension will now track 
good plans and bad plans for each entry in pg_stat_statements.
The plans are not normalized or hashed (as opposed to pg_stat_plans), 
they represent discreet statements.
A good plan is saved, whenever this sort of query has been used for the 
first time or the time of the previously recorded good plan has been 
deceeded by a smaller factor than 0.9 .
Analogous to this, a bad_plan is saved, when the time has been exceeded 
by a factor greater than 1.1 .
There are GUCs available so these parameters can be tuned to your 
liking. Tracking can be disabled for both plans individually.
A plan_format can be defined to enable better readability or 
processability through other tools.

You can reset your good and bad plans by using a
select on pg_stat_statements_good_plan_reset([queryid]);
resetting bad plans uses pg_stat_statements_bad_plan_reset, obviously.
In case of a reset, the execution time, timestamp and plan itself are 
just set to 0 respective NULL.

The pg_stat_statements view now provides six extra columns:
good_plan, good_plan_time, good_plan_timestamp, bad_plan, bad_plan_time 
and bad_plan_timestamp.

Plans are only displayed if the showtext argument is true and the user 
is the superuser or the user who has been associated with that entry.

Furthermore, we implemented a GUC that allows you to control the maximum 
refresh frequency to avoid performance impacts on restarts or resets.
A plan is only updated when tracking is enabled and more time than 
"plan_min_interval" has passed (default: 5 seconds) and the previously 
mentioned conditions for the execution time have been met.

The major selling point of this feature?
Beeing able to find plans that need optimization (e.g. by creating 
indexes). As pg_stat_statements tracks normalized queries, there might 
be certain values or even daytimes that result in very bad plans, while 
others result in perfectly fine plans.
Of course, the GUC log_min_duration_statement can also detect long 
runners, but the advantage of pg_stat_statements is that we count the 
total calls of normalized queries, which enables us to find plans, that 
don't count as long runners, while their aggregated time might show 
shortcomings regarding their plans.

We've found this sort of tool really useful when dealing with queries 
produced by ORM libraries, where optimization is not intuitive.

Various tests using pg_bench suggest that this extension does not worsen 
the performance of the database.

We're really looking forward to your opinions and feedback on this 
feature patch
Julian, Marius and Arne

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
Simon Riggs
Date:
On 25 January 2017 at 17:34, Julian Markwort
<julian.markwort@uni-muenster.de> wrote:

> Analogous to this, a bad_plan is saved, when the time has been exceeded by a
> factor greater than 1.1 .

...and the plan differs?

Probably best to use some stat math to calculate deviation, rather than fixed %.

Sounds good.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
David Steele
Date:
Hi Julian,

On 1/25/17 12:34 PM, Julian Markwort wrote:

> TL:DR;
> We extended the functionality of pg_stat_statements so it can track
> worst and best case execution plans.

pg_stat_statements is an important tool and perhaps one of the most used
core extensions.  Any improvements would be greatly welcome by the admin
community, I'm sure.

However, this is a rather large change which could be destabilizing to
the many users of this extension.  Even though the patch was posted well
in advance of the last CF it has received little discussion so is
essentially new.

I recommend moving this patch to the 2017-07 CF.

-- 
-David
david@pgmasters.net



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
Peter Eisentraut
Date:
On 1/25/17 12:43, Simon Riggs wrote:
> On 25 January 2017 at 17:34, Julian Markwort
> <julian.markwort@uni-muenster.de> wrote:
> 
>> Analogous to this, a bad_plan is saved, when the time has been exceeded by a
>> factor greater than 1.1 .
> ...and the plan differs?
> 
> Probably best to use some stat math to calculate deviation, rather than fixed %.

Yeah, it seems to me too that this needs a bit more deeper analysis.  I
don't see offhand why a 10% deviation in execution time would be a
reasonable threshold for "good" or "bad".  A deviation approach like you
allude to would be better.

The other problem is that this measures execution time, which can vary
for reasons other than plan.  I would have expected that the cost
numbers are tracked somehow.

There is also the issue of generic vs specific plans, which this
approach might be papering over.

Needs more thought.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
Julian Markwort
Date:
> Any improvements would be greatly welcome by the admin
> community, I'm sure.
That's good to hear - on the other hand, I really appreciate the opinion 
of admins on this idea!
> However, this is a rather large change which could be destabilizing to
> the many users of this extension.
I'm fully aware of that, which is why there are already several switches 
in place so you can keep using the existing functionality without 
compromises or added complexity.
At the same time, I'm always open to suggestions regarding the reduction 
of complexity and probably more importantly the reduction of disk IO.
> I recommend moving this patch to the 2017-07 CF.
Since I do not have very much time for this at the moment I'll be 
looking forward to discussions on this patch in the next commitfest!

kind regards
Julian



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
David Steele
Date:
Hi Julian,

On 3/4/17 8:41 AM, Julian Markwort wrote:
>> Any improvements would be greatly welcome by the admin
>> community, I'm sure.
> That's good to hear - on the other hand, I really appreciate the opinion
> of admins on this idea!
>> However, this is a rather large change which could be destabilizing to
>> the many users of this extension.
> I'm fully aware of that, which is why there are already several switches
> in place so you can keep using the existing functionality without
> compromises or added complexity.
> At the same time, I'm always open to suggestions regarding the reduction
> of complexity and probably more importantly the reduction of disk IO.
>> I recommend moving this patch to the 2017-07 CF.
> Since I do not have very much time for this at the moment I'll be
> looking forward to discussions on this patch in the next commitfest!

Since some concerns were raised about the implementation, I have instead
marked this "Returned with Feedback".  I encourage you to continue
developing the patch with the community and resubmit into the
appropriate CF when it is ready.

-- 
-David
david@pgmasters.net



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

From
Julian Markwort
Date:
Alright, for the next version of this patch I'll look into standard 
deviation (an implementation of Welfords' algorithm already exists in 
pg_stat_statements).

On 3/4/17 14:18, Peter Eisentraut wrote:

> The other problem is that this measures execution time, which can vary
> for reasons other than plan.  I would have expected that the cost
> numbers are tracked somehow.
I've already thought of tracking specific parts of the explanation, like 
the cost numbers, instead of the whole string, I'll think of something, 
but if anybody has any bright ideas in the meantime, I'd gladly listen 
to them.

> There is also the issue of generic vs specific plans, which this
> approach might be papering over.
Would you be so kind and elaborate a little bit on this? I'm not sure if 
I understand this correctly. This patch only tracks specific plans, yes. 
The inital idea was that there might be some edge-cases that are not 
apparent when looking at generalized plans or queries.

kind regards
Julian



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
Hello hackers,

I'd like to follow up to my previous proposition of tracking (some) best
and worst plans for different queries in the pg_stat_statements extension.

Based on the comments and suggestions made towards my last endeavour,
I've taken the path of computing the interquartile distance (by means of
an adapted z-test, under the assumption of normal distribution, based on
the mean_time and stddev_time already used by the extension).

A bad plan is recorded, if there is no previously recorded plan, or if
the current execution time is greater than the maximum of the previously
recorded plan's time and the query's mean+1.5*interquartile_distance.
A good plan is recorded on a similar condition; The execution time needs
to be shorter than the minimum of the previously recorded good plan's
time and the query's mean-1.5*interquartile_distance.

The boundaries are chosen to resemble the boundaries for whiskers in
boxplots.
Using these boundaries, plans will be updated very seldomly, as long as
they are more or less normally distributed.
Changes in the plans (for example the use of indices) used for each kind
of query will most likely result in execution times exceeding these
boundaries, so such changes are (very probably) recorded.

The ideal solution would be to compare the current plan with the last
plan and only update when there is a difference between them, however I
think this is unreasonably complex and a rather expensive task to
compute on the completion of every query.

The intent of this patch is to provide a quick insight into the plans
currently used by the database for the execution of certain queries. The
tracked plans only represent instances of queries with very good or very
poor performance.

I've (re)submitted this patch for the next commitfest as well.

Kind regards
Julian


On 03/04/2017 02:56 PM, Julian Markwort wrote:
> Alright, for the next version of this patch I'll look into standard
> deviation (an implementation of Welfords' algorithm already exists in
> pg_stat_statements).
>
> On 3/4/17 14:18, Peter Eisentraut wrote:
>
>> The other problem is that this measures execution time, which can vary
>> for reasons other than plan.  I would have expected that the cost
>> numbers are tracked somehow.
> I've already thought of tracking specific parts of the explanation,
> like the cost numbers, instead of the whole string, I'll think of
> something, but if anybody has any bright ideas in the meantime, I'd
> gladly listen to them.
>
>> There is also the issue of generic vs specific plans, which this
>> approach might be papering over.
> Would you be so kind and elaborate a little bit on this? I'm not sure
> if I understand this correctly. This patch only tracks specific plans,
> yes. The inital idea was that there might be some edge-cases that are
> not apparent when looking at generalized plans or queries.
>
> kind regards
> Julian


Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
David Fetter
Date:
On Wed, Jan 10, 2018 at 03:05:39PM +0100, Julian Markwort wrote:
> Hello hackers,
> 
> I'd like to follow up to my previous proposition of tracking (some) best and
> worst plans for different queries in the pg_stat_statements extension.
> 
> Based on the comments and suggestions made towards my last endeavour, I've
> taken the path of computing the interquartile distance (by means of an
> adapted z-test, under the assumption of normal distribution, based on the
> mean_time and stddev_time already used by the extension).

Is the assumption of a normal distribution reasonable for outlier
plans as you've seen them?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
On 01/11/2018 03:43 PM, David Fetter wrote:
> Is the assumption of a normal distribution reasonable for outlier
> plans as you've seen them?
This is a difficult but fair question.
First of all, I'd like to clarify that the normal distribution is
assumed for the set of all execution times matching a queryid; No
assumptions are made about the distribution of the outliers themselves.
The primary goal of this approach was the limitation of plan updates, to
avoid unnecessary IO operations.
When query performance does not vary much, no updates of the plans
should be necessary, but as soon as query performance varies too much,
the new plan should be stored.
For the purpose of distinguishing reasonable variance between execution
times and great variance due to changing conditions which ultimately
might result in a different plan, the assumption of a normal
distribution for all execution times suits well.

Based on some early testing, this results in only a few percent of
updates (1-3%) in relation to the total number of calls, when running
some short pgbench tests.
As the sample size grows, the assumption of a normal distribution
becomes increasingly accurate and the (unnecessary) sampling of plans
decreases.
In a different test, I ran several queries with identical table sizes,
the queries were fairly simple,  and the statistical evaluation led to
few updates during these tests. When I increased the table size
significantly, the database switched to a different plan. Because the
execution time differed significantly, this new bad plan was stored.
Similarly, after running a certain query a couple of times, I created an
index on my test data, which resulted in a speedup which was significant
enough to result in an update of the good plan.

Now, if a change to the data or the index situation only resulted in an
insignificant performance increase or decrease (one that falls into the
interval defined as [mean - 1.5*IQD, mean + 1-5*IQD] ), I think it might
be possible to assume that we are not interested in an updated plan for
this scenario.


Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Andres Freund
Date:
Hi,

On 2018-01-10 15:05:39 +0100, Julian Markwort wrote:
> I'd like to follow up to my previous proposition of tracking (some) best and
> worst plans for different queries in the pg_stat_statements extension.

Cool feature.

I think the patch probably doesn't apply anymore, due to other changes
to pg_stat_statements since its posting. Could you refresh?

I've not done any sort of review. Scrolling through I noticed //
comments which aren't pg coding style.

I'd like to see a small benchmark showing the overhead of the feature.
Both in runtime and storage size.

Greetings,

Andres Freund


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
Andres Freund wrote on 2018-03-01:
> I think the patch probably doesn't apply anymore, due to other changes
> to pg_stat_statements since its posting. Could you refresh?

pgss_plans_v02.patch applies cleanly to master, there were no changes to pg_stat_statements since the copyright updates
atthe beginning of January. 
(pgss_plans_v02.patch is attached to message 1bd396a9-4573-55ad-7ce8-fe7adffa1bd9@uni-muenster.de and can be found in
thecurrent commitfest as well.) 

> I've not done any sort of review. Scrolling through I noticed //
> comments which aren't pg coding style.

I'll fix that along with any other problems that might be found in a review.


> I'd like to see a small benchmark showing the overhead of the feature.
> Both in runtime and storage size.

I've tried to gather some meaningful results, however either my testing methodology was flawed (as variance between all
mypasses of pgbench was rather high) or the takeaway is that the feature only generates little overhead. 
This is what I've run on my workstation using a Ryzen 1700 and 16GB of RAM and an old Samsung 840 Evo as boot drive,
whichalso held the database: 
The database used for the tests was dropped and pgbench initialized anew for each test (pgss off, pgss on, pgss on with
plancollection) using a scaling of 16437704*0.003~=50 (roughly what the phoronix test suite uses for a buffer test). 
Also similar to the phoronix test suite, I used 8 jobs and 32 connections for a normal multithreaded load.

I then ran 10 passes, each for 60 seconds, with a 30 second pause between them, as well as another test which ran for
10minutes. 

With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 tps, while the patched version resulted in
1700tps, so a little over 7% overhead? Well, the "control run", without pg_stat_statements delivered only 1806 tps, so
varianceseems to be quite high. 

The results of the ten successive tests, each running 60 seconds and then waiting for 30 seconds, are displayed in the
attachedplot. 
I've tinkered with different settings with pgbench for quite some time now and all I can come up with are runs with
highvariance between them. 

If anybody has any recommendations for a setup that generates less variance, I'll try this again.

Finally, the more interesting metric regarding this patch is the size of the pg_stat_statements.stat file, which stores
allthe metrics while the database is shut down. I reckon that the size of pgss_query_texts.stat (which holds only the
querystrings and plan strings while the database is running) will be similar, however it might fluctuate more as new
stringsare simply appended to the file until the garbagecollector decides that it has to be cleaned up. 
After running the aforementioned tests, the file was 8566 bytes in size for pgss in it's unmodified form, while the
testsresulted in 32607 bytes for the pgss that collects plans as well. This seems reasonable as plans strings are
usuallylonger than the statements from which they result. Worst case, the pg_stat_statements.stat holds two plans for
eachtype of statement. 
I've not tested the length of the file with different encodings, such as JSON, YAML, or XML, however I do not expect
anyhugely different results. 

Greetings
Julian

Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Andres Freund
Date:
On 2018-03-02 18:07:32 +0100, Julian Markwort wrote:
> Andres Freund wrote on 2018-03-01:
> > I think the patch probably doesn't apply anymore, due to other changes
> > to pg_stat_statements since its posting. Could you refresh?
> 
> pgss_plans_v02.patch applies cleanly to master, there were no changes to pg_stat_statements since the copyright
updatesat the beginning of January.
 
> (pgss_plans_v02.patch is attached to message 1bd396a9-4573-55ad-7ce8-fe7adffa1bd9@uni-muenster.de and can be found in
thecurrent commitfest as well.)
 

Yea, I misread the diff to think you added a conflicting version. Due
to:
-DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

and I'd checked that 1.5 already exists. But you just renamed the file,
presumably because it's essentially rewriting the whole file?  I'm not
sure I'm a big fan of doing so, because that makes testing the upgrade
path more work.

> I've tried to gather some meaningful results, however either my testing methodology was flawed (as variance between
allmy passes of pgbench was rather high) or the takeaway is that the feature only generates little overhead.
 
> This is what I've run on my workstation using a Ryzen 1700 and 16GB of RAM and an old Samsung 840 Evo as boot drive,
whichalso held the database:
 
> The database used for the tests was dropped and pgbench initialized anew for each test (pgss off, pgss on, pgss on
withplan collection) using a scaling of 16437704*0.003~=50 (roughly what the phoronix test suite uses for a buffer
test).
> Also similar to the phoronix test suite, I used 8 jobs and 32 connections for a normal multithreaded load.
> 
> I then ran 10 passes, each for 60 seconds, with a 30 second pause between them, as well as another test which ran for
10minutes.
 

What workload did you run? read/write or readonly? This seems like a
feature were readonly makes a lot more sense. But ~1800 tps strongly
suggests that's not what you did?


> With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 tps, while the patched version resulted in
1700tps, so a little over 7% overhead? Well, the "control run", without pg_stat_statements delivered only 1806 tps, so
varianceseems to be quite high.
 

That's quite some overhead, I'd say.


> If anybody has any recommendations for a setup that generates less variance, I'll try this again.

I'd suggest disabling turboost, in my experience that makes tests
painful to repeat, because it'll strongly depend on the current HW
temperature.

Greetings,

Andres Freund


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Yea, I misread the diff to think you added a conflicting version. Due
> to:
> -DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
> +DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

> and I'd checked that 1.5 already exists. But you just renamed the file,
> presumably because it's essentially rewriting the whole file?  I'm not
> sure I'm a big fan of doing so, because that makes testing the upgrade
> path more work.

Yeah, that is not project style anymore.  Just add a delta file and leave
it at that.  See e.g. de1d042f5979bc1388e9a6d52a4d445342b04932 for an
example.

(We might from time to time roll up the deltas and replace the base
file with a newer version, but I think that should happen in separate
housekeeping commits.)

            regards, tom lane


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
Andres Freund wrote on 2018-03-02:
> Yea, I misread the diff to think you added a conflicting version. Due
> to:
> -DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
> +DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

> and I'd checked that 1.5 already exists. But you just renamed the file,
> presumably because it's essentially rewriting the whole file?  I'm not
> sure I'm a big fan of doing so, because that makes testing the upgrade
> path more work.

You're right about 1.5 already existing. I wasn't sure about the versioning policy for extensions and seeing as version
1.5only changed a few grants I quasi reused 1.5 for my intentions. 

> What workload did you run? read/write or readonly? This seems like a
> feature were readonly makes a lot more sense. But ~1800 tps strongly
> suggests that's not what you did?

I'm sorry I forgot to mention this; I ran all tests as read-write.

> > With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 tps, while the patched version resulted
in1700 tps, so a little over 7% overhead? Well, the "control run", without pg_stat_statements delivered only 1806 tps,
sovariance seems to be quite high. 

> That's quite some overhead, I'd say.

Yes, but I wouldn't give a warranty that it is neither more nor less overhead than 7%, seeing as for my testing, the
tpswere higher for (unmodified) pgss enabled vs no pgss at all. 

> > If anybody has any recommendations for a setup that generates less variance, I'll try this again.

> I'd suggest disabling turboost, in my experience that makes tests
> painful to repeat, because it'll strongly depend on the current HW
> temperature.
This might be a problem for average systems but I'm fairly certain this isn't the issue here.

I might try some more benchmarks and will in particular look into running read-only tests, as the aforementioned 840
EVOSSD ist -comparatively speaking- pretty slow. 
Do you have any recommendations as to what constitutes adequate testing times regarding pgbench?

Kind regards
Julian


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Tom Lane
Date:
Julian Markwort <julian.markwort@uni-muenster.de> writes:
> Andres Freund wrote on 2018-03-02:
>> and I'd checked that 1.5 already exists. But you just renamed the file,
>> presumably because it's essentially rewriting the whole file?  I'm not
>> sure I'm a big fan of doing so, because that makes testing the upgrade
>> path more work.

> You're right about 1.5 already existing. I wasn't sure about the versioning policy for extensions and seeing as
version1.5 only changed a few grants I quasi reused 1.5 for my intentions. 

Nope, that's totally wrong.  You can get away with that if we've not
already shipped a 1.5 release --- but we did ship it in v10, so that
version is frozen now.  You need to make your changes in a 1.5--1.6
upgrade file.  Otherwise there's no clean path for existing installations
to upgrade to the new version.

            regards, tom lane


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
Tom Lane wrote on 2018-03-02:
> You need to make your changes in a 1.5--1.6
> upgrade file.  Otherwise there's no clean path for existing
> installations
> to upgrade to the new version.

I've adressed all the issues that were brought up so far:
1. there is now only an added 1.5--1.6.sql file.
2. the overhead, as previously discussed (as much as a 12% decrease in
TPS during read-only tests), is now gone, the problem was that I was
collecting the plan String before checking if it needed to be stored at
all.
The patched version is now 99.95% as fast as unmodified
pg_stat_statements.
3. I've cleaned up my own code and made sure it adheres to GNU C coding
style, I was guilty of some // comments and curly brackets were
sometimes in the same line as my control structures.

I'd love to hear more feedback, here are two ideas to polish this
patch:
a) Right now, good_plan and bad_plan collection can be activated and
deactivated with separate GUCs. I think it would be sensible to collect
either both or none. (This would result in fewer convoluted
conditionals.)
b) Would you like to be able to tune the percentiles yourself, to
adjust for the point at which a new plan is stored?

Greetings
Julian
Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
I've goofed up now, sorry for failing to attach my updated patch.

Am Donnerstag, den 08.03.2018, 14:55 +0100 schrieb Julian Markwort:
> Tom Lane wrote on 2018-03-02:
> > You need to make your changes in a 1.5--1.6
> > upgrade file.  Otherwise there's no clean path for existing
> > installations
> > to upgrade to the new version.
>
> I've adressed all the issues that were brought up so far:
> 1. there is now only an added 1.5--1.6.sql file.
> 2. the overhead, as previously discussed (as much as a 12% decrease
> in
> TPS during read-only tests), is now gone, the problem was that I was
> collecting the plan String before checking if it needed to be stored
> at
> all.
> The patched version is now 99.95% as fast as unmodified
> pg_stat_statements.
> 3. I've cleaned up my own code and made sure it adheres to GNU C
> coding
> style, I was guilty of some // comments and curly brackets were
> sometimes in the same line as my control structures.
>
> I'd love to hear more feedback, here are two ideas to polish this
> patch:
> a) Right now, good_plan and bad_plan collection can be activated and
> deactivated with separate GUCs. I think it would be sensible to
> collect
> either both or none. (This would result in fewer convoluted
> conditionals.)
> b) Would you like to be able to tune the percentiles yourself, to
> adjust for the point at which a new plan is stored?
>
> Greetings
> Julian
Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Arthur Zakirov
Date:
Greetings,

On Thu, Mar 08, 2018 at 02:58:34PM +0100, Julian Markwort wrote:
> > I'd love to hear more feedback, here are two ideas to polish this
> > patch:
> > a) Right now, good_plan and bad_plan collection can be activated and
> > deactivated with separate GUCs. I think it would be sensible to
> > collect
> > either both or none. (This would result in fewer convoluted
> > conditionals.)
> > b) Would you like to be able to tune the percentiles yourself, to
> > adjust for the point at which a new plan is stored?

I'd like to review the patch and leave a feedback. I tested it with
different indexes on same table and with same queries and it works fine.

First of all, GUC variables and functions. How about union
'pg_stat_statements.good_plan_enable' and
'pg_stat_statements.bad_plan_enable' into
'pg_stat_statements.track_plan' with GUC_LIST_INPUT flag? It may accept
comma separated values 'good' and 'bad'. It lets to add another tracking
type in future without adding new variable.

In same manner pg_stat_statements_good_plan_reset() and
pg_stat_statements_bad_plan_reset() functions can be combined in
function:

pg_stat_statements_plan_reset(in queryid bigint, in good boolean, in bad
boolean)

Further comments on the code.

+-- test to see if any plans have been recorded.
+SELECT
+  CASE WHEN good_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END

I think here is a typo. Last case should be bad_plan_timestamp.

+            good_plan_str = palloc(1 * sizeof(char));
+            *good_plan_str = '\0';
...
+            bad_plan_str = palloc(1 * sizeof(char));
+            *bad_plan_str = '\0';

Here we can use empty string literals instead of pallocs. For example:

const char *good_plan_str;
const char *bad_plan_str;
...
good_plan_str = "";
...
bad_plan_str = "";

+            interquartile_dist = 2.0*(0.6745 * sqrt(e->counters.sum_var_time / e->counters.calls));

It is worth to check e->counters.calls for zero here. Because the entry
can be sticky.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
Arthur Zakirov wrote on 2018-03-09:
> I'd like to review the patch and leave a feedback. I tested it with
> different indexes on same table and with same queries and it works fine.

Thanks for taking some time to review my patch, Arthur!

> First of all, GUC variables and functions. How about union
> 'pg_stat_statements.good_plan_enable' and
> 'pg_stat_statements.bad_plan_enable' into
> 'pg_stat_statements.track_plan' with GUC_LIST_INPUT flag? It may accept
> comma separated values 'good' and 'bad'. It lets to add another tracking
> type in future without adding new variable.

This sounds like a good idea to me; Somebody already suggested that tracking an "average plan" would be interesting as
well,however I don't have any clever ideas on how to identify such a plan. 

> In same manner pg_stat_statements_good_plan_reset() and
> pg_stat_statements_bad_plan_reset() functions can be combined in
> function:

> pg_stat_statements_plan_reset(in queryid bigint, in good boolean, in bad
> boolean)

This is also sensible, however if any more kinds of plans would be added in the future, there would be a lot of flags
inthis function. I think having varying amounts of flags (between extension versions) as arguments to the function also
makesany upgrade paths difficult. Thinking more about this, we could also user function overloading to avoid this. 
An alternative would be to have the caller pass the type of plan he wants to reset. Omitting the type would result in
thedeletion of all plans, maybe? 
pg_stat_statements_plan_reset(in queryid bigint, in plan_type text)

I'm not sure if there are any better ways to do this?

> Further comments on the code.
 You're right about all of those, thanks!


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Arthur Zakirov
Date:
On Mon, Mar 12, 2018 at 02:11:42PM +0100, Julian Markwort wrote:
> > In same manner pg_stat_statements_good_plan_reset() and
> > pg_stat_statements_bad_plan_reset() functions can be combined in
> > function:
> 
> > pg_stat_statements_plan_reset(in queryid bigint, in good boolean, in bad
> > boolean)
> 
> This is also sensible, however if any more kinds of plans would be added in the future, there would be a lot of flags
inthis function. I think having varying amounts of flags (between extension versions) as arguments to the function also
makesany upgrade paths difficult. Thinking more about this, we could also user function overloading to avoid this.
 
> An alternative would be to have the caller pass the type of plan he wants to reset. Omitting the type would result in
thedeletion of all plans, maybe?
 
> pg_stat_statements_plan_reset(in queryid bigint, in plan_type text)

Yes, it is a good idea. But maybe instead of passing an empty string
into plans type we should overload the function with only queryid
argument. I think it is a common practice in PostgreSQL. Otherwise
allowance to pass empty string as plan type may confuse users. So
functions declaration may be the following:

pg_stat_statements_plan_reset(in queryid bigint, in plan_type text)
pg_stat_statements_plan_reset(in queryid bigint)

+            interquartile_dist = 2.0*(0.6745 * sqrt(e->counters.sum_var_time / e->counters.calls));

I think it would be better to have defines for 2.0 and 0.6745 values for
the sake of readability.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
To anyone who followed along with this for so long, I'd like to present my newest version of this patch.

As suggested by Arthur Zakirov, there is now only a single GUC ( pg_stat_statements.plan_track ) responsible for the
selectionof the plans that should be tracked. Possible values are: all, none, good, or bad. 
I've mostly copied functionality from pl_handler.c . This resulted in the need to include varlena.h so I could use the
SplitIdentifierString()function to parse the values, of which several (e.g. pg_stat_statements.plan_track='good, bad')
couldbe used. 

I've also added a new GUC:
pg_stat_statements.plan_fence_factor
This GUC can be used to scale the fences of the interval, outside of which a plan might be updated.
Right now, it is set to 1.5 (common factor for the definition of outliers in boxplots) and you can see through
additionalcolums in the pg_stat_statements view, how often these fences are surpassed by execution times and how often
theplans are updated. (The colums are: good_plan_outliers, good_plan_updates, bad_plan_outliers, bad_plan_updates and
areprimarily here for testing and review purposes and are not essential to this patch, they probably don't add any
valuefor the average user) 

Similarly to the first suggestion by Arthur, I've also changed the plan_reset functionality - there is now only one
function,pg_stat_statements_plan_reset(queryid bigint), overloaded with (queryid bigint, plantype cstring) args, that
canbe used to remove both plans (when omitting the cstring) or either of them. The cstring argument accepts 'good' or
'bad'.

I also added more comments to the estimations of the quartiles and the calculation of the fences.

The performance impact lies now at 139312 vs 141841 tps, so roughly 1.78% slower than default pg_stat_statements.
The fact that these results are a little worse than the previous iteration is due to some changes in the definition of
thefences which mistakenly calculated by adding the scaled interquartile distance to the mean, instead of adding it to
therespective quartiles, which means that plan updates are triggered a little more often. 
For 4259631 transactions however, only 11 updates for the bad plans where triggered.



I'm looking forward to your opinions!
Julian

Attachment

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Julian Markwort
Date:
I just realized I made a whitespace error when I modified the comments.
I hope I don't make a habit of sending erroneous mails.

Please accept my apologies, as well as a guaranteed whitespace-error-free patch (updated to version 5 for clarity).

Julian

Julian Markwort schrieb am 2018-03-20:
> To anyone who followed along with this for so long, I'd like to present my newest version of this patch.

> As suggested by Arthur Zakirov, there is now only a single GUC ( pg_stat_statements.plan_track ) responsible for the
selectionof the plans that should be tracked. Possible values are: all, none, good, or bad. 
> I've mostly copied functionality from pl_handler.c . This resulted in the need to include varlena.h so I could use
theSplitIdentifierString() function to parse the values, of which several (e.g. pg_stat_statements.plan_track='good,
bad')could be used. 

> I've also added a new GUC:
> pg_stat_statements.plan_fence_factor
> This GUC can be used to scale the fences of the interval, outside of which a plan might be updated.
> Right now, it is set to 1.5 (common factor for the definition of outliers in boxplots) and you can see through
additionalcolums in the pg_stat_statements view, how often these fences are surpassed by execution times and how often
theplans are updated. (The colums are: good_plan_outliers, good_plan_updates, bad_plan_outliers, bad_plan_updates and
areprimarily here for testing and review purposes and are not essential to this patch, they probably don't add any
valuefor the average user) 

> Similarly to the first suggestion by Arthur, I've also changed the plan_reset functionality - there is now only one
function,pg_stat_statements_plan_reset(queryid bigint), overloaded with (queryid bigint, plantype cstring) args, that
canbe used to remove both plans (when omitting the cstring) or either of them. The cstring argument accepts 'good' or
'bad'.

> I also added more comments to the estimations of the quartiles and the calculation of the fences.

> The performance impact lies now at 139312 vs 141841 tps, so roughly 1.78% slower than default pg_stat_statements.
> The fact that these results are a little worse than the previous iteration is due to some changes in the definition
ofthe fences which mistakenly calculated by adding the scaled interquartile distance to the mean, instead of adding it
tothe respective quartiles, which means that plan updates are triggered a little more often. 
> For 4259631 transactions however, only 11 updates for the bad plans where triggered.



> I'm looking forward to your opinions!
> Julian

Attachment

Re: [FEATURE PATCH] pg_stat_statements with plans (v02)

From
legrand legrand
Date:
Hello,
I'm very interested in pg_stat_statements usage, and I'm very happy to see
you adding plans to it.

Reading other pg_stat_statements threads on this forum, there are also activ
developments to add:
- planing duration,
- first date,
- last_update date,
- parameters for normalized queries,
- ...
as described in
http://www.postgresql-archive.org/pg-stat-statements-HLD-for-futur-developments-td6012381.html

I was wondering about how would your dev behave with all those new features.
It seems to me that bad and good plans will not have any of thoses
informations.

What would you think about displaying good, current, bad plans results in 3
lines inspite of only one line ?

queryid, plantype, query, ...
aaa     good      ...
aaa     current   ...
aaa     bad       ...

This would permit to get planing duration, first capture time, last capture,
executions, query parameters for all plans (good or bad inclusive).

Last question, didn't you think about a model to store all the different
plans using a planid like

queryid, planid, query, ...
aaa     plan1      ...
aaa     plan2    ...
aaa     plan3      ...
...

I can not imagine that there would be so many of them ;o)
Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


Re: [FEATURE PATCH] pg_stat_statements with plans (v02)

From
Arthur Zakirov
Date:
On Thu, Mar 22, 2018 at 11:16:30AM -0700, legrand legrand wrote:
> Reading other pg_stat_statements threads on this forum, there are also activ
> developments to add:
> - planing duration,
> - first date,
> - last_update date,
> - parameters for normalized queries,
> - ...
> as described in
> http://www.postgresql-archive.org/pg-stat-statements-HLD-for-futur-developments-td6012381.html
> 
> I was wondering about how would your dev behave with all those new features.
> It seems to me that bad and good plans will not have any of thoses
> informations.

Indeed. I think those developments require cooperation.
Also as you wrote an additional view can be added (pg_stat_statements_plans or pg_stat_plans). But I'm not sure about
thename for now and its use cases.
 

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company