Thread: idea: log_statement_sample_rate - bottom limit for sampling

idea: log_statement_sample_rate - bottom limit for sampling

From
Pavel Stehule
Date:
Hi

I like the idea of sampling slow statements via  log_statement_sample_rate. But I miss some parameter that can ensure so every query executed over this limit is logged.

Can we introduce new option

log_statement_sampling_limit

The query with execution time over this limit is logged every time.

What do you think about this?

Regards

Pavel

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Gilles Darold
Date:
Le 06/06/2019 à 10:38, Pavel Stehule a écrit :
> Hi
>
> I like the idea of sampling slow statements via 
> log_statement_sample_rate. But I miss some parameter that can ensure
> so every query executed over this limit is logged.
>
> Can we introduce new option
>
> log_statement_sampling_limit
>
> The query with execution time over this limit is logged every time.
>
> What do you think about this?
>
> Regards
>
> Pavel


+1, log_min_duration_statement is modulated by log_statement_sample_rate
that mean that there is no more way to log all statements over a certain
duration limit. log_statement_sampling_limit might probably always be
upper than log_min_duration_statement.


-- 
Gilles Darold
http://www.darold.net/




Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Pavel Stehule
Date:
Hi

čt 6. 6. 2019 v 10:48 odesílatel Gilles Darold <gilles@darold.net> napsal:
Le 06/06/2019 à 10:38, Pavel Stehule a écrit :
> Hi
>
> I like the idea of sampling slow statements via 
> log_statement_sample_rate. But I miss some parameter that can ensure
> so every query executed over this limit is logged.
>
> Can we introduce new option
>
> log_statement_sampling_limit
>
> The query with execution time over this limit is logged every time.
>
> What do you think about this?
>
> Regards
>
> Pavel


+1, log_min_duration_statement is modulated by log_statement_sample_rate
that mean that there is no more way to log all statements over a certain
duration limit. log_statement_sampling_limit might probably always be
upper than log_min_duration_statement.

Here is a patch

Regards

Pavel



--
Gilles Darold
http://www.darold.net/

Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Pavel Stehule
Date:
Hi

po 17. 6. 2019 v 22:40 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

čt 6. 6. 2019 v 10:48 odesílatel Gilles Darold <gilles@darold.net> napsal:
Le 06/06/2019 à 10:38, Pavel Stehule a écrit :
> Hi
>
> I like the idea of sampling slow statements via 
> log_statement_sample_rate. But I miss some parameter that can ensure
> so every query executed over this limit is logged.
>
> Can we introduce new option
>
> log_statement_sampling_limit
>
> The query with execution time over this limit is logged every time.
>
> What do you think about this?
>
> Regards
>
> Pavel


+1, log_min_duration_statement is modulated by log_statement_sample_rate
that mean that there is no more way to log all statements over a certain
duration limit. log_statement_sampling_limit might probably always be
upper than log_min_duration_statement.

Here is a patch

I did error in logic - fixed



Regards

Pavel



--
Gilles Darold
http://www.darold.net/

Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
Hi,

I tried the patch, here my comment:

> gettext_noop("Zero effective disables sampling. "
>                          "-1 use sampling every time (without limit)."),

I do not agree with the zero case. In fact, sampling is disabled as soon as
setting is less than log_min_duration_statements. Furthermore, I think we should
provide a more straightforward description for users.

I changed few comments and documentation:

  * As we added much more logic in this function with statement and transaction
sampling. And now with statement_sample_rate, it is not easy to understand the
logic on first look. I reword comment in check_log_duration, I hope it is more
straightforward.

  * I am not sure if "every_time" is a good naming for the variable. In fact, if
duration exceeds limit we disable sampling. Maybe sampling_disabled is more clear?

  * I propose to add some words in log_min_duration_statement and
log_statement_sample_rate documentation.

  * Rephrased log_statement_sample_limit documentation, I hope it help
understanding.

Patch attached.

Regards,

--
Adrien

Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Pavel Stehule
Date:


út 18. 6. 2019 v 14:03 odesílatel Adrien Nayrat <adrien.nayrat@anayrat.info> napsal:
Hi,

I tried the patch, here my comment:

> gettext_noop("Zero effective disables sampling. "
>                          "-1 use sampling every time (without limit)."),

I do not agree with the zero case. In fact, sampling is disabled as soon as
setting is less than log_min_duration_statements. Furthermore, I think we should
provide a more straightforward description for users.

You have true, but I have not a idea,how to describe it in one line. In this case the zero is corner case, and sampling is disabled without dependency on log_min_duration_statement.
 
I think so this design has only few useful values and ranges

a) higher than log_min_duration_statement .. sampling is active with limit
b) 0 .. for this case - other way how to effective disable sampling - no dependency on other
c) -1 or negative value - sampling is allowed every time.

Sure, there is range (0..log_min_duration_statement), but for this range this value has not sense. I think so this case cannot be mentioned in short description. But it should be mentioned in documentation.


I changed few comments and documentation:

  * As we added much more logic in this function with statement and transaction
sampling. And now with statement_sample_rate, it is not easy to understand the
logic on first look. I reword comment in check_log_duration, I hope it is more
straightforward.

  * I am not sure if "every_time" is a good naming for the variable. In fact, if
duration exceeds limit we disable sampling. Maybe sampling_disabled is more clear?

For me important is following line

(exceeded && (in_sample || every_time))

I think so "every_time" or "always" or "every" is in this context more illustrative than "sampling_disabled". But my opinion is not strong in this case, and I have not a problem accept common opinion.
 

  * I propose to add some words in log_min_duration_statement and
log_statement_sample_rate documentation.

  * Rephrased log_statement_sample_limit documentation, I hope it help
understanding.

Patch attached.

Regards,

--
Adrien

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
On 6/18/19 8:29 PM, Pavel Stehule wrote:
>
>
> út 18. 6. 2019 v 14:03 odesílatel Adrien Nayrat <adrien.nayrat@anayrat.info
> <mailto:adrien.nayrat@anayrat.info>> napsal:
>
>     Hi,
>
>     I tried the patch, here my comment:
>
>     > gettext_noop("Zero effective disables sampling. "
>     >                          "-1 use sampling every time (without limit)."),
>
>     I do not agree with the zero case. In fact, sampling is disabled as soon as
>     setting is less than log_min_duration_statements. Furthermore, I think we should
>     provide a more straightforward description for users.
>
>
> You have true, but I have not a idea,how to describe it in one line. In this
> case the zero is corner case, and sampling is disabled without dependency on
> log_min_duration_statement.
>  
> I think so this design has only few useful values and ranges
>
> a) higher than log_min_duration_statement .. sampling is active with limit
> b) 0 .. for this case - other way how to effective disable sampling - no
> dependency on other
> c) -1 or negative value - sampling is allowed every time.
>
> Sure, there is range (0..log_min_duration_statement), but for this range this
> value has not sense. I think so this case cannot be mentioned in short
> description. But it should be mentioned in documentation.

Yes, it took me a while to understand :) I am ok to keep simple in GUC
description and give more information in documentation.

>
>
>     I changed few comments and documentation:
>
>       * As we added much more logic in this function with statement and transaction
>     sampling. And now with statement_sample_rate, it is not easy to understand the
>     logic on first look. I reword comment in check_log_duration, I hope it is more
>     straightforward.
>
>       * I am not sure if "every_time" is a good naming for the variable. In fact, if
>     duration exceeds limit we disable sampling. Maybe sampling_disabled is more
>     clear?
>
>
> For me important is following line
>
> (exceeded && (in_sample || every_time))
>
> I think so "every_time" or "always" or "every" is in this context more
> illustrative than "sampling_disabled". But my opinion is not strong in this
> case, and I have not a problem accept common opinion.

Oh, yes, that's correct. I do not have a strong opinion too. Maybe someone else
will have better idea.

--
Adrien


Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Pavel Stehule
Date:


st 19. 6. 2019 v 10:49 odesílatel Adrien Nayrat <adrien.nayrat@anayrat.info> napsal:
On 6/18/19 8:29 PM, Pavel Stehule wrote:
>
>
> út 18. 6. 2019 v 14:03 odesílatel Adrien Nayrat <adrien.nayrat@anayrat.info
> <mailto:adrien.nayrat@anayrat.info>> napsal:
>
>     Hi,
>
>     I tried the patch, here my comment:
>
>     > gettext_noop("Zero effective disables sampling. "
>     >                          "-1 use sampling every time (without limit)."),
>
>     I do not agree with the zero case. In fact, sampling is disabled as soon as
>     setting is less than log_min_duration_statements. Furthermore, I think we should
>     provide a more straightforward description for users.
>
>
> You have true, but I have not a idea,how to describe it in one line. In this
> case the zero is corner case, and sampling is disabled without dependency on
> log_min_duration_statement.
>  
> I think so this design has only few useful values and ranges
>
> a) higher than log_min_duration_statement .. sampling is active with limit
> b) 0 .. for this case - other way how to effective disable sampling - no
> dependency on other
> c) -1 or negative value - sampling is allowed every time.
>
> Sure, there is range (0..log_min_duration_statement), but for this range this
> value has not sense. I think so this case cannot be mentioned in short
> description. But it should be mentioned in documentation.

Yes, it took me a while to understand :) I am ok to keep simple in GUC
description and give more information in documentation.

Maybe some like. "The zero block sampling. Negative value forces sampling without limit"


>
>
>     I changed few comments and documentation:
>
>       * As we added much more logic in this function with statement and transaction
>     sampling. And now with statement_sample_rate, it is not easy to understand the
>     logic on first look. I reword comment in check_log_duration, I hope it is more
>     straightforward.
>
>       * I am not sure if "every_time" is a good naming for the variable. In fact, if
>     duration exceeds limit we disable sampling. Maybe sampling_disabled is more
>     clear?
>
>
> For me important is following line
>
> (exceeded && (in_sample || every_time))
>
> I think so "every_time" or "always" or "every" is in this context more
> illustrative than "sampling_disabled". But my opinion is not strong in this
> case, and I have not a problem accept common opinion.

Oh, yes, that's correct. I do not have a strong opinion too. Maybe someone else
will have better idea.

the naming in this case is not hard issue, and comitter  can decide.

Regards

Pavel

--
Adrien

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       tested, failed
Spec compliant:           tested, failed
Documentation:            tested, failed

I test the latest patch attached to this thread (log_statement_sample_limit-3.patch). Everything looks good to me.

The new status of this patch is: Ready for Committer

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Sorry, I forgot to tick "passed" boxes.

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Pavel Stehule
Date:
Hi

pá 12. 7. 2019 v 13:07 odesílatel Adrien Nayrat <adrien.nayrat@gmail.com> napsal:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Sorry, I forgot to tick "passed" boxes.

Thank you

Pavel
 

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
Hi,

I've started reviewing this patch, thinking that maybe I could get it
committed as it's marked as RFC. In general I agree with having this
fuature, but I think we need to rethink the GUC because the current
approach is just confusing.

The way the current patch works is that we have three GUCs:

  log_min_duration_statement
  log_statement_sample_limit
  log_statement_sample_rate

and it essentially works like this:

- If the duration exceeds log_min_duration_statement, we start sampling
  the commands with log_statement_sample rate.

- If the duration exceeds log_statement_sample_limit, we just log the
  command every time (i.e. we disable sampling, using sample rate 1.0).

IMO that's bound to be confusing for users, because one threshold
behaves as minimum while the other behaves as maximum.


What I think we should do instead is to use two minimum thresholds.

1) log_min_duration_sample - enables sampling of commands, using the
existing GUC log_statement_sample_rate

2) log_min_duration_statement - logs all commands exceeding this


I think this is going to be much easier for users to understand.


The one difference between those approaches is in how they work with
existing current settings. That is, let's say you have

  log_min_duration_statement = 1000
  log_statement_sample_rate = 0.01

then no queries below 1000ms will be logged, and 1% of longer queries
will be sampled. And with the original config (as proposed in v3 of the
patch), this would still work the same way.

With the new approach (two min thresholds) it'd behave differently,
because we'd log *all* queries longer than 1000ms (not just 1%). And
whether we'd sample any queries (using log_statement_sample_rate) would
depend on how we'd pick the default value for the other threshold.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I've started reviewing this patch, thinking that maybe I could get it
> committed as it's marked as RFC. In general I agree with having this
> fuature, but I think we need to rethink the GUC because the current
> approach is just confusing.
> ...
> What I think we should do instead is to use two minimum thresholds.
> 1) log_min_duration_sample - enables sampling of commands, using the
> existing GUC log_statement_sample_rate
> 2) log_min_duration_statement - logs all commands exceeding this
> I think this is going to be much easier for users to understand.

I agree with Tomas' idea.

> The one difference between those approaches is in how they work with
> existing current settings. That is, let's say you have
>   log_min_duration_statement = 1000
>   log_statement_sample_rate = 0.01
> then no queries below 1000ms will be logged, and 1% of longer queries
> will be sampled. And with the original config (as proposed in v3 of the
> patch), this would still work the same way.
> With the new approach (two min thresholds) it'd behave differently,
> because we'd log *all* queries longer than 1000ms (not just 1%). And
> whether we'd sample any queries (using log_statement_sample_rate) would
> depend on how we'd pick the default value for the other threshold.

Well, we do not need to have a backwards-compatibility problem
here, because we have yet to release a version containing
log_statement_sample_rate.  I do not think it's too late to decide
that v12's semantics for that are broken, and either revert that
patch in v12, or back-patch a fix to make it match this idea.

            regards, tom lane



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Tue, Jul 30, 2019 at 03:43:58PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> I've started reviewing this patch, thinking that maybe I could get it
>> committed as it's marked as RFC. In general I agree with having this
>> fuature, but I think we need to rethink the GUC because the current
>> approach is just confusing.
>> ...
>> What I think we should do instead is to use two minimum thresholds.
>> 1) log_min_duration_sample - enables sampling of commands, using the
>> existing GUC log_statement_sample_rate
>> 2) log_min_duration_statement - logs all commands exceeding this
>> I think this is going to be much easier for users to understand.
>
>I agree with Tomas' idea.
>
>> The one difference between those approaches is in how they work with
>> existing current settings. That is, let's say you have
>>   log_min_duration_statement = 1000
>>   log_statement_sample_rate = 0.01
>> then no queries below 1000ms will be logged, and 1% of longer queries
>> will be sampled. And with the original config (as proposed in v3 of the
>> patch), this would still work the same way.
>> With the new approach (two min thresholds) it'd behave differently,
>> because we'd log *all* queries longer than 1000ms (not just 1%). And
>> whether we'd sample any queries (using log_statement_sample_rate) would
>> depend on how we'd pick the default value for the other threshold.
>
>Well, we do not need to have a backwards-compatibility problem
>here, because we have yet to release a version containing
>log_statement_sample_rate.  I do not think it's too late to decide
>that v12's semantics for that are broken, and either revert that
>patch in v12, or back-patch a fix to make it match this idea.
>

I'm willing to try fixing this to salvage the feature for v12. The
question is how would that fix look like - IMO we'd need to introduce
the new threshold GUC, essentially implementing what this thread is
about. It's not a complex patch, but it kinda flies in the face of
feature freeze. OTOH if we call it a fix ...

The patch itself is not that complicated - attached is a WIP version,
(particularly) the docs need more work.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Michael Paquier
Date:
On Tue, Jul 30, 2019 at 03:43:58PM -0400, Tom Lane wrote:
> Well, we do not need to have a backwards-compatibility problem
> here, because we have yet to release a version containing
> log_statement_sample_rate.  I do not think it's too late to decide
> that v12's semantics for that are broken, and either revert that
> patch in v12, or back-patch a fix to make it match this idea.

With my RTM hat on, if we think that the current semantics of
log_statement_sample_rate are broken and need a redesign, then I would
take the safest path and just revert the original patch in v12, and
finally make sure that it brews correctly for v13.  We are in beta2
and close to a beta3, so redesigning things at this stage on a stable
branch sounds wrong.
--
Michael

Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
On 7/28/19 12:19 AM, Tomas Vondra wrote:
> Hi,
>
> I've started reviewing this patch, thinking that maybe I could get it
> committed as it's marked as RFC. In general I agree with having this
> fuature, but I think we need to rethink the GUC because the current
> approach is just confusing.
>
> The way the current patch works is that we have three GUCs:
>
>  log_min_duration_statement
>  log_statement_sample_limit
>  log_statement_sample_rate
>
> and it essentially works like this:
>
> - If the duration exceeds log_min_duration_statement, we start sampling
>  the commands with log_statement_sample rate.
>
> - If the duration exceeds log_statement_sample_limit, we just log the
>  command every time (i.e. we disable sampling, using sample rate 1.0).
>
> IMO that's bound to be confusing for users, because one threshold
> behaves as minimum while the other behaves as maximum.

I agree, it took me a while to understand how it behave with the three GUC. That
why I tried to enrich documentation, but it may mean that the functionality is
not properly implemented.

>
>
> What I think we should do instead is to use two minimum thresholds.
>
> 1) log_min_duration_sample - enables sampling of commands, using the
> existing GUC log_statement_sample_rate
>
> 2) log_min_duration_statement - logs all commands exceeding this
>
>
> I think this is going to be much easier for users to understand.

+1, I like this idea.

I don't really have an opinion if we have to revert the whole feature or try to
fix it for v12. But it is true it is a late to fix it.

Furthermore, users who really want this feature in v12 can use an extension for
that purpose [1].

1: I made this extension with same kind of features :
https://github.com/anayrat/pg_sampletolog


Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
Hi,

As we are at the end of this CF and there is still discussions about whether we
should revert log_statement_sample_limit and log_statement_sample_rate, or try
to fix it in v12.
I moved this patch to next commit fest and change status from "ready for
commiter" to "need review". I hope I didn't make a mistake.

Best regards,


Attachment

Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Thu, Aug 01, 2019 at 11:47:46AM +0200, Adrien Nayrat wrote:
>Hi,
>
>As we are at the end of this CF and there is still discussions about whether we
>should revert log_statement_sample_limit and log_statement_sample_rate, or try
>to fix it in v12.
>I moved this patch to next commit fest and change status from "ready for
>commiter" to "need review". I hope I didn't make a mistake.
>

Thanks. The RFC status was clearly stale, so thanks for updating. I should
have done that after my review. I think the patch would be moved to the
next CF at the end, but I might be wrong. In any case, I don't think
you've done any mistake.

As for the sampling patch - I think we'll end up reverting the feature for
v12 - it's far too late to rework it at this point. Sorry about that, I
know it's not a warm feeling when you get something done, and then it gets
reverted on the last minute. :-(

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
On 8/1/19 12:04 PM, Tomas Vondra wrote:
> On Thu, Aug 01, 2019 at 11:47:46AM +0200, Adrien Nayrat wrote:
>> Hi,
>>
>> As we are at the end of this CF and there is still discussions about whether we
>> should revert log_statement_sample_limit and log_statement_sample_rate, or try
>> to fix it in v12.
>> I moved this patch to next commit fest and change status from "ready for
>> commiter" to "need review". I hope I didn't make a mistake.
>>
> 
> Thanks. The RFC status was clearly stale, so thanks for updating. I should
> have done that after my review. I think the patch would be moved to the
> next CF at the end, but I might be wrong. In any case, I don't think
> you've done any mistake.
> 
> As for the sampling patch - I think we'll end up reverting the feature for
> v12 - it's far too late to rework it at this point. Sorry about that, I
> know it's not a warm feeling when you get something done, and then it gets
> reverted on the last minute. :-(
> 

Don't worry, I understand. It is better to add straigforward GUC in v13 than
confusionning in v12 we will regret.





Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Fri, Aug 02, 2019 at 09:53:40AM +0200, Adrien Nayrat wrote:
>On 8/1/19 12:04 PM, Tomas Vondra wrote:
>> On Thu, Aug 01, 2019 at 11:47:46AM +0200, Adrien Nayrat wrote:
>>> Hi,
>>>
>>> As we are at the end of this CF and there is still discussions about whether we
>>> should revert log_statement_sample_limit and log_statement_sample_rate, or try
>>> to fix it in v12.
>>> I moved this patch to next commit fest and change status from "ready for
>>> commiter" to "need review". I hope I didn't make a mistake.
>>>
>>
>> Thanks. The RFC status was clearly stale, so thanks for updating. I should
>> have done that after my review. I think the patch would be moved to the
>> next CF at the end, but I might be wrong. In any case, I don't think
>> you've done any mistake.
>>
>> As for the sampling patch - I think we'll end up reverting the feature for
>> v12 - it's far too late to rework it at this point. Sorry about that, I
>> know it's not a warm feeling when you get something done, and then it gets
>> reverted on the last minute. :-(
>>
>
>Don't worry, I understand. It is better to add straigforward GUC in v13 than
>confusionning in v12 we will regret.
>

OK, I have the revert ready. The one thing I'm wondering about is
whether we need to revert log_transaction_sample_rate too? I think it's
pretty much independent feature, so I think we can keep that. Opinions?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> OK, I have the revert ready. The one thing I'm wondering about is
> whether we need to revert log_transaction_sample_rate too? I think it's
> pretty much independent feature, so I think we can keep that. Opinions?

Isn't the issue here the interaction between log_transaction_sample_rate
and log_min_duration_statement?  Seems like we have that question
regardless of whether log_statement_sample_limit exists.

            regards, tom lane



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> OK, I have the revert ready. The one thing I'm wondering about is
>> whether we need to revert log_transaction_sample_rate too? I think it's
>> pretty much independent feature, so I think we can keep that. Opinions?
>
>Isn't the issue here the interaction between log_transaction_sample_rate
>and log_min_duration_statement?  Seems like we have that question
>regardless of whether log_statement_sample_limit exists.
>

No, that interaction only affects statement-level sampling.

For transaction-level sampling we do the sampling independently of the
statement duration, i.e. we when starting a transaction we determine
whether the whole transaction will be sampled. It has nothing to do with
the proposed log_statement_sample_limit.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:
>> Isn't the issue here the interaction between log_transaction_sample_rate
>> and log_min_duration_statement?

> No, that interaction only affects statement-level sampling.

OK, I was confusing the features.

> For transaction-level sampling we do the sampling independently of the
> statement duration, i.e. we when starting a transaction we determine
> whether the whole transaction will be sampled. It has nothing to do with
> the proposed log_statement_sample_limit.

So, to clarify: our plan is that a given statement will be logged
if any of these various partial-logging features says to do so?

(And the knock on HEAD's behavior is exactly that it breaks that
independence for log_min_duration_statement.)

            regards, tom lane



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Sun, Aug 04, 2019 at 04:25:12PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:
>>> Isn't the issue here the interaction between log_transaction_sample_rate
>>> and log_min_duration_statement?
>
>> No, that interaction only affects statement-level sampling.
>
>OK, I was confusing the features.
>
>> For transaction-level sampling we do the sampling independently of the
>> statement duration, i.e. we when starting a transaction we determine
>> whether the whole transaction will be sampled. It has nothing to do with
>> the proposed log_statement_sample_limit.
>
>So, to clarify: our plan is that a given statement will be logged
>if any of these various partial-logging features says to do so?
>

Yes, I think that's the expected behavior.

- did it exceed log_min_duration_statement? -> log it
- is it part of sampled xact? -> log it
- maybe sample the statement (to be reverted / reimplemented)

>(And the knock on HEAD's behavior is exactly that it breaks that
>independence for log_min_duration_statement.)
>

Yeah. There's no way to use sampling, while ensure logging of all
queries longer than some limit.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Sun, Aug 04, 2019 at 10:48:48PM +0200, Tomas Vondra wrote:
>On Sun, Aug 04, 2019 at 04:25:12PM -0400, Tom Lane wrote:
>>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:
>>>>Isn't the issue here the interaction between log_transaction_sample_rate
>>>>and log_min_duration_statement?
>>
>>>No, that interaction only affects statement-level sampling.
>>
>>OK, I was confusing the features.
>>
>>>For transaction-level sampling we do the sampling independently of the
>>>statement duration, i.e. we when starting a transaction we determine
>>>whether the whole transaction will be sampled. It has nothing to do with
>>>the proposed log_statement_sample_limit.
>>
>>So, to clarify: our plan is that a given statement will be logged
>>if any of these various partial-logging features says to do so?
>>
>
>Yes, I think that's the expected behavior.
>
>- did it exceed log_min_duration_statement? -> log it
>- is it part of sampled xact? -> log it
>- maybe sample the statement (to be reverted / reimplemented)
>
>>(And the knock on HEAD's behavior is exactly that it breaks that
>>independence for log_min_duration_statement.)
>>
>
>Yeah. There's no way to use sampling, while ensure logging of all
>queries longer than some limit.
>

FWIW I've reverted the log_statement_sample_rate (both from master and
REL_12_STABLE). May the buildfarm be merciful to me.

I've left the log_transaction_sample_rate in, as that seems unaffected
by this discussion.


regards


-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Sun, Aug 04, 2019 at 11:41:54PM +0200, Tomas Vondra wrote:
>On Sun, Aug 04, 2019 at 10:48:48PM +0200, Tomas Vondra wrote:
>>On Sun, Aug 04, 2019 at 04:25:12PM -0400, Tom Lane wrote:
>>>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>>On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:
>>>>>Isn't the issue here the interaction between log_transaction_sample_rate
>>>>>and log_min_duration_statement?
>>>
>>>>No, that interaction only affects statement-level sampling.
>>>
>>>OK, I was confusing the features.
>>>
>>>>For transaction-level sampling we do the sampling independently of the
>>>>statement duration, i.e. we when starting a transaction we determine
>>>>whether the whole transaction will be sampled. It has nothing to do with
>>>>the proposed log_statement_sample_limit.
>>>
>>>So, to clarify: our plan is that a given statement will be logged
>>>if any of these various partial-logging features says to do so?
>>>
>>
>>Yes, I think that's the expected behavior.
>>
>>- did it exceed log_min_duration_statement? -> log it
>>- is it part of sampled xact? -> log it
>>- maybe sample the statement (to be reverted / reimplemented)
>>
>>>(And the knock on HEAD's behavior is exactly that it breaks that
>>>independence for log_min_duration_statement.)
>>>
>>
>>Yeah. There's no way to use sampling, while ensure logging of all
>>queries longer than some limit.
>>
>
>FWIW I've reverted the log_statement_sample_rate (both from master and
>REL_12_STABLE). May the buildfarm be merciful to me.
>
>I've left the log_transaction_sample_rate in, as that seems unaffected
>by this discussion.
>

I've pushed the reworked version of log_statement_sample_rate patch [1].
If I understand correctly, that makes this patch unnecessary, and we
should mark it as rejected. Or do we still need it?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Adrien Nayrat
Date:
On 11/6/19 7:21 PM, Tomas Vondra wrote:
> I've pushed the reworked version of log_statement_sample_rate patch [1].
> If I understand correctly, that makes this patch unnecessary, and we
> should mark it as rejected. Or do we still need it?

Yes, the goal of this patch was to disable sampling and log all queries whose
duration exceed  log_statement_sample_limit.

For now it is possible just with log_min_duration_statement which log all
queries whose duration exceed it.

-- 
Adrien




Re: idea: log_statement_sample_rate - bottom limit for sampling

From
Tomas Vondra
Date:
On Wed, Nov 06, 2019 at 08:00:57PM +0100, Adrien Nayrat wrote:
>On 11/6/19 7:21 PM, Tomas Vondra wrote:
>> I've pushed the reworked version of log_statement_sample_rate patch [1].
>> If I understand correctly, that makes this patch unnecessary, and we
>> should mark it as rejected. Or do we still need it?
>
>Yes, the goal of this patch was to disable sampling and log all queries whose
>duration exceed  log_statement_sample_limit.
>
>For now it is possible just with log_min_duration_statement which log all
>queries whose duration exceed it.
>

OK, I've marked it as rejected. If someone thinks we should still have
something like it, please submit a patch implementing it.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services