Thread: Log a sample of transactions
Hello,

Per an idea of Nikolay Samokhvalov [1], I propose this patch to add the possibility to log all statements from a fraction of transactions.

I have several questions:
* Do we want this feature?
* How can I add tests? It seems hard to handle tests when there is a duration in the output.
* Should I handle the case when someone tries to change the setting *inside* a transaction?

# SET client_min_messages to LOG;
SET
# SET log_transaction_sample_rate to 1;
SET
# BEGIN;
LOG: duration: 0.364 ms statement: BEGIN;
BEGIN
# SELECT 1;
LOG: duration: 0.530 ms statement: SELECT 1;
-[ RECORD 1 ]
?column? | 1
# SET log_transaction_sample_rate to 0;
LOG: duration: 0.333 ms statement: SET log_transaction_sample_rate to 0;
SET
# SELECT 1;
LOG: duration: 0.586 ms statement: SELECT 1;
-[ RECORD 1 ]
?column? | 1

Thanks

[1]: https://www.postgresql.org/message-id/CANNMO%2BLg65EFqHb%2BZYbMLKyE2y498HJzsdFrMnW1dQ6AFJ3Mpw%40mail.gmail.com

--
Adrien NAYRAT
Attachment
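[Editor's note: the proposal above boils down to one sampling decision per transaction, rather than per statement. Below is a minimal standalone sketch of that decision. The variable log_transaction_sample_rate stands in for the proposed GUC, the local MAX_RANDOM_VALUE definition and the simulation loop exist only so the sketch compiles and runs on its own, and the function name is made up; it is not the patch itself.]

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/* Stand-in for the proposed log_transaction_sample_rate GUC (0.0 - 1.0). */
static double log_transaction_sample_rate = 0.1;

/* Largest value random() can return (2^31 - 1); defined locally only to
 * keep the sketch self-contained. */
#define MAX_RANDOM_VALUE 0x7FFFFFFF

/*
 * Decide once, at transaction start, whether every statement of the
 * transaction will be logged.  The decision then holds for the whole
 * transaction, which is what distinguishes this from per-statement
 * sampling (log_statement_sample_rate).
 */
static bool
transaction_is_sampled(void)
{
    return random() <= log_transaction_sample_rate * MAX_RANDOM_VALUE;
}

int
main(void)
{
    int sampled = 0;
    int total = 100000;

    srandom(42);
    for (int i = 0; i < total; i++)
        if (transaction_is_sampled())
            sampled++;

    /* Expect roughly 10% of transactions to be selected for logging. */
    printf("sampled %d of %d transactions\n", sampled, total);
    return 0;
}

[With a rate of 0.1 the count comes out close to 10000, i.e. about one transaction in ten would have all of its statements logged.]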
On 12/12/2018 22:32, Adrien Nayrat wrote:
> Per an idea of Nikolay Samokhvalov [1], I propose this patch to add the possibility
> to log all statements from a fraction of transactions.
>
> I have several questions:
> * Do we want this feature?

It's not clear to me what the use is. The per-statement sampling allows you to capture slow queries without overwhelming the logs. We don't have any logging of slow transactions or any other transaction scope logging, so what will this sample?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4 January 2019 13:20:09 GMT+01:00, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>On 12/12/2018 22:32, Adrien Nayrat wrote:
>> Per an idea of Nikolay Samokhvalov [1], I propose this patch to add the possibility
>> to log all statements from a fraction of transactions.
>>
>> I have several questions:
>> * Do we want this feature?
>
>It's not clear to me what the use is. The per-statement sampling allows
>you to capture slow queries without overwhelming the logs. We don't
>have any logging of slow transactions or any other transaction scope
>logging, so what will this sample?
>
>--
>Peter Eisentraut http://www.2ndQuadrant.com/
>PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hello,

When you troubleshoot application issues with multi-statement transactions, you may have to log all queries in order to find all the statements of one transaction. With high throughput, it can be hard to log all queries without causing trouble. The goal of this patch is to be able to log a sample of transactions.

I agree the use case is specific.

Regards,

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
Dear Adrien,

I applied your patch file and ran into a possible bug. (I have not read your source yet. Sorry.)

When I tried to use tab-completion, some strange messages appeared.

The messages I saw are attached to this mail.

Best Regards,
Hayato Kuroda
Fujitsu LIMITED
Attachment
On 1/15/19 9:00 AM, Kuroda, Hayato wrote:
> Dear Adrien,
>
> I applied your patch file and ran into a possible bug.
> (I have not read your source yet. Sorry.)
>
> When I tried to use tab-completion, some strange messages appeared.
>
> The messages I saw are attached to this mail.
>
> Best Regards,
> Hayato Kuroda
> Fujitsu LIMITED

Hello,

Thanks for your review ;)

Your messages look normal to me; you will get the same messages if you set log_min_duration_statement to 0.

However, my compiler warned me about multiple definitions of xact_is_sampled. I fixed that in this second patch.

Regards,
Attachment
On Sat, Jan 5, 2019 at 12:57 AM Adrien Mobile <adrien.nayrat@anayrat.info> wrote: > > Le 4 janvier 2019 13:20:09 GMT+01:00, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> a écrit : > >On 12/12/2018 22:32, Adrien Nayrat wrote: > >> Per idea of Nikolay Samokhvalov[1] I propose this patch to add the > >possibility > >> to log all statements from a fraction of transactions. > >> > >> I have several questions: > >> * Do we want this feature? > > > >It's not clear to me what the use is. The per-statement sampling > >allows > >you to capture slow queries without overwhelming the logs. We don't > >have any logging of slow transactions or any other transaction scope > >logging, so what will this sample? > > > >-- > >Peter Eisentraut http://www.2ndQuadrant.com/ > >PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > Hello, > When you troubleshoot applicative issues with multi-statements transaction, you may have to log all queries to find allstatements of one transaction. With high throughput, it could be hard to log all queries without causing troubles. Hm, can we use log_min_duration_statement to find slow queries of a transaction instead? Could you please elaborate on the use-case? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 1/15/19 11:42 AM, Masahiko Sawada wrote:
>> When you troubleshoot application issues with multi-statement transactions, you may have to log all queries in order to find all the statements of one transaction. With high throughput, it can be hard to log all queries without causing trouble.
> Hm, can we use log_min_duration_statement to find slow queries of a
> transaction instead? Could you please elaborate on the use-case?

Hello,

The goal is not to find slow queries in a transaction, but to troubleshoot application issues when you have short queries.

Sometimes you want to understand what happens in a transaction: either you know your application perfectly, or you have to log all queries and find the ones with the same transaction ID (%x). That can be problematic if you have huge traffic with fast queries.

Thanks,
Dear Adrien,

> Your messages look normal to me; you will get the same messages if you set
> log_min_duration_statement to 0.

That was my lack of understanding. Sorry.

By the way, I think the name of this parameter should be changed. In the Cambridge dictionary, the word "rate" means the following:

    the speed at which something happens or changes, or the amount or number
    of times it happens or changes in a particular period.

This parameter represents the "probability" of logging, therefore this name is inappropriate. You should use "probability" or "ratio", I think.

Next week I will give you some comments about your code.

Best Regards,
Hayato Kuroda
Fujitsu LIMITED
On Tue, Jan 15, 2019 at 06:03:36PM +0100, Adrien NAYRAT wrote:
> The goal is not to find slow queries in a transaction, but to troubleshoot
> application issues when you have short queries.
>
> Sometimes you want to understand what happens in a transaction: either you
> know your application perfectly, or you have to log all queries and find
> the ones with the same transaction ID (%x). That can be problematic if you
> have huge traffic with fast queries.

Looks like a sensible argument to me. High log throughput can make Postgres performance go down by a couple of percent, which defeats the purpose of tracking down performance problems, as it could directly impact the application.

--
Michael
Attachment
On 1/16/19 1:40 AM, Kuroda, Hayato wrote:
> By the way, I think the name of this parameter should be changed.
> In the Cambridge dictionary, the word "rate" means the following:
>
>     the speed at which something happens or changes, or the amount or number
>     of times it happens or changes in a particular period.
>
> This parameter represents the "probability" of logging, therefore this name
> is inappropriate. You should use "probability" or "ratio", I think.

Hum, I am not a native English speaker; I chose "rate" because it is already used for auto_explain.sample_rate and for log_statement_sample_rate.

> Next week I will give you some comments about your code.

Thanks!
On Wed, Jan 16, 2019 at 2:03 AM Adrien NAYRAT <adrien.nayrat@anayrat.info> wrote:
>
> On 1/15/19 11:42 AM, Masahiko Sawada wrote:
> >> When you troubleshoot application issues with multi-statement transactions, you may have to log all queries in order to find all the statements of one transaction. With high throughput, it can be hard to log all queries without causing trouble.
> > Hm, can we use log_min_duration_statement to find slow queries of a
> > transaction instead? Could you please elaborate on the use-case?
>
> Hello,
>
> The goal is not to find slow queries in a transaction, but to troubleshoot
> application issues when you have short queries.
>
> Sometimes you want to understand what happens in a transaction: either you
> know your application perfectly, or you have to log all queries and find
> the ones with the same transaction ID (%x). That can be problematic if you
> have huge traffic with fast queries.
>

Thank you for the explanation! I understand the use case of this patch now. This feature would be helpful for troubleshooting.

Since we set xact_is_sampled only when the transaction starts and only read it during the transaction, we cannot disable logging during a transaction, and vice versa. I can imagine a use case where the user wants to disable logging during a transaction, so it might be better to also check whether log_xact_sample_rate > 0 in check_log_duration().

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
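[Editor's note: a minimal sketch of the condition being suggested here, written as a free-standing helper so it compiles on its own. The GUC and flag names follow the patch; the helper function itself is hypothetical and only illustrates the extra check.]

#include <stdbool.h>

/*
 * Sketch of the extended check: a statement's duration is logged when plain
 * duration logging asks for it, when log_min_duration_statement applies, or
 * when the current transaction was sampled AND log_xact_sample_rate is still
 * above zero -- so setting the GUC to 0 inside the transaction stops further
 * logging.
 */
static bool
duration_should_be_logged(bool log_duration,
                          int log_min_duration_statement,
                          bool xact_is_sampled,
                          double log_xact_sample_rate)
{
    return log_duration ||
           log_min_duration_statement >= 0 ||
           (xact_is_sampled && log_xact_sample_rate > 0);
}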
On 1/16/19 10:09 AM, Masahiko Sawada wrote:
> Since we set xact_is_sampled only when the transaction starts and only read
> it during the transaction, we cannot disable logging during a transaction,
> and vice versa. I can imagine a use case where the user wants to disable
> logging during a transaction, so it might be better to also check whether
> log_xact_sample_rate > 0 in check_log_duration().

I agree, here is a third patch.

Thanks!
Attachment
On 15/01/2019 18:03, Adrien NAYRAT wrote:
> The goal is not to find slow queries in a transaction, but to troubleshoot
> application issues when you have short queries.
>
> Sometimes you want to understand what happens in a transaction: either you
> know your application perfectly, or you have to log all queries and find
> the ones with the same transaction ID (%x). That can be problematic if you
> have huge traffic with fast queries.

But if you have trouble with a specific transaction, how will a setting help that randomly logs transactions, not necessarily the one you are concerned about?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/18/19 9:03 AM, Peter Eisentraut wrote:
> But if you have trouble with a specific transaction, how will a setting
> help that randomly logs transactions, not necessarily the one you are
> concerned about?

Yes, it assumes your application performs the same type of transactions. Maybe the use case is too specific to have this feature in core?
Dear Adrien,

> Hum, I am not a native English speaker; I chose "rate" because it is
> already used for auto_explain.sample_rate and for log_statement_sample_rate

If the community agrees on those names, renaming the parameter is not needed. Consistency is the most important thing in a DBMS. :-)

I read your source code and I think it is generally good. Here are some minor suggestions, but you don't have to follow them strictly:

---- config.sgml ----
You must document the behavior when users change the parameter during a transaction. (This part looks questionable to me...)

---- postgres.c ----
I give you three comments.

> /* flag for logging statements in this transaction */
I think "a" or the plural form should be used instead of "this".

* xact_is_sampled is left set at the end of a transaction.
Should the flag be set to false at the lowest layer of the transaction system? I understand it is unnecessary for the functionality, but it has more symmetry.

* check_log_duration should be used only when postgres checks the duration.
But I'm not sure a new function such as check_is_sampled is needed, because the processing in the new function would be almost the same as check_log_duration.

Best Regards,
Hayato Kuroda
Fujitsu LIMITED
On Fri, Jan 18, 2019 at 8:23 AM Adrien NAYRAT <adrien.nayrat@anayrat.info> wrote: > On 1/18/19 9:03 AM, Peter Eisentraut wrote: > > But if you have trouble with a specific transaction, how will a setting > > help that randomly logs transactions, not necessarily the one you are > > concerned about? > > Yes, it assumes your application performs same type of transaction. > Maybe the use case is too specific to have this feature in core? It doesn't sound too crazy to me. Say you log a sample of statements. But then you're like, wow, this is hard to interpret, because I don't know what happened earlier in the transaction. So then you use this feature instead. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 1/23/19 3:12 AM, Kuroda, Hayato wrote:
> Dear Adrien,

Hello Kuroda-san,

> ---- config.sgml ----
> You must document the behavior when users change the parameter during a transaction.
> (This part looks questionable to me...)

Agreed, I added some wording.

> ---- postgres.c ----
> I give you three comments.
>
>> /* flag for logging statements in this transaction */
> I think "a" or the plural form should be used instead of "this".

Fixed.

> * xact_is_sampled is left set at the end of a transaction.
> Should the flag be set to false at the lowest layer of the transaction system?
> I understand it is unnecessary for the functionality, but it has more symmetry.

Yes, it is not necessary. I wonder what is more important: keeping some kind of symmetry, or avoiding unnecessary code (which can be a source of mistakes)?

> * check_log_duration should be used only when postgres checks the duration.
> But I'm not sure a new function such as check_is_sampled is needed, because the processing in the new function would be almost the same as check_log_duration.

I agree. I asked myself the same question and I preferred to keep the code simple.

Here is the 4th patch. Thanks!
Attachment
Dear Adrien,

>> * xact_is_sampled is left set at the end of a transaction.
>> Should the flag be set to false at the lowest layer of the transaction system?
>> I understand it is unnecessary for the functionality, but it has more symmetry.
>
> Yes, it is not necessary. I wonder what is more important: keeping some
> kind of symmetry, or avoiding unnecessary code (which can be a source of mistakes)?

>> * check_log_duration should be used only when postgres checks the duration.
>> But I'm not sure a new function such as check_is_sampled is needed, because the processing in the new function would be almost the same as check_log_duration.
> I agree. I asked myself the same question and I preferred to keep the code simple.

I think your point is also valid. You should ask more senior reviewers or committers, because I cannot judge which one is better.

By the way, here is a suggestion about tests. This parameter makes statements be logged randomly, hence adding tests is very difficult. Perhaps only three cases can be tested:

* When log_transaction_sample_rate is set to 1, all statements are logged.
* When the parameter is set to 0, they are never logged.
* When the parameter changes to 0 inside a transaction, logging stops immediately.

Best Regards,
Hayato Kuroda
Fujitsu LIMITED
On 1/28/19 2:53 AM, Kuroda, Hayato wrote:
> By the way, here is a suggestion about tests.
> This parameter makes statements be logged randomly, hence adding tests is very difficult.
> Perhaps only three cases can be tested:
>
> * When log_transaction_sample_rate is set to 1, all statements are logged.
> * When the parameter is set to 0, they are never logged.
> * When the parameter changes to 0 inside a transaction, logging stops immediately.

I agree we should add tests; my main problem is that the log output contains duration information:

postgres=# select 1;
LOG: duration: 0.539 ms statement: select 1;

I did not find any test for log_min_duration_statement that could help me. LCOV indicates there are no tests on this part (see check_log_duration()):
https://coverage.postgresql.org/src/backend/tcop/postgres.c.gcov.html

I will look at how to test this properly with the test infrastructure. Maybe a simple regex to remove the duration part will be enough.
Attachment
On Sun, Feb 03, 2019 at 12:23:00PM +0100, Adrien Nayrat wrote: > I did not find any test for log_min_duration that could help me. LCOV indicate > there is no tests on this part (look check_log_duration()): > https://coverage.postgresql.org/src/backend/tcop/postgres.c.gcov.html These would take time to execute, even if you need to measure one second, and may be hard to make stable on slow machines. Moved to next CF. -- Michael
Attachment
On 2/4/19 3:26 AM, Michael Paquier wrote: > These would take time to execute, even if you need to measure one > second, and may be hard to make stable on slow machines. > Could it be acceptable if I set log_min_duration_statement = 0? > Moved to next CF. Thanks
On Mon, Feb 04, 2019 at 04:32:05PM +0100, Adrien NAYRAT wrote:
> Could it be acceptable if I set log_min_duration_statement = 0?

At least this has the merit of making the behavior not rely on randomness. I have not looked at the patch in detail, so I cannot say for sure, but I am not sure it is worth the extra cycles to have the tests integrated; they may be removed from the final commit if they are too expensive for little value. It is always helpful to have deterministic tests to give the feature a try and to ease review, of course.

--
Michael
Attachment
Hi, On 2019-01-26 11:44:58 +0100, Adrien NAYRAT wrote: > + <varlistentry id="guc-transaction-sample-rate" xreflabel="log_transaction_sample_rate"> > + <term><varname>log_transaction_sample_rate</varname> (<type>real</type>) > + <indexterm> > + <primary><varname>log_transaction_sample_rate</varname> configuration parameter</primary> > + </indexterm> > + </term> > + <listitem> > + <para> > + Set the fraction of transactions whose statements are logged. It applies > + to each new transaction regardless of their duration. The default is > + <literal>0</literal>, meaning do not log statements from this transaction. > + Setting this to <literal>1</literal> logs all statements for all transactions. > + Logging can be disabled inside a transaction by setting this to 0. > + <varname>log_transaction_sample_rate</varname> is helpful to track a > + sample of transaction. > + </para> > + </listitem> > + </varlistentry> > + > </variablelist> I wonder if this doesn't need a warning, explaining that using this when there are large transactions could lead to slowdowns. > <para> > diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c > index 7c3a9c1e89..1a52c10c39 100644 > --- a/src/backend/access/transam/xact.c > +++ b/src/backend/access/transam/xact.c > @@ -1821,6 +1821,9 @@ StartTransaction(void) > s->state = TRANS_START; > s->transactionId = InvalidTransactionId; /* until assigned */ > > + /* Determine if we log statements in this transaction */ > + xact_is_sampled = random() <= log_xact_sample_rate * MAX_RANDOM_VALUE; > + > /* > * initialize current transaction state fields > * > diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c > index e773f20d9f..a11667834e 100644 > --- a/src/backend/tcop/postgres.c > +++ b/src/backend/tcop/postgres.c > @@ -100,7 +100,8 @@ int max_stack_depth = 100; > /* wait N seconds to allow attach from a debugger */ > int PostAuthDelay = 0; > > - > +/* flag for logging statements in a transaction */ > +bool xact_is_sampled = false; It seems pretty weird to have this in postgres.c, given you declared it in xact.h? > @@ -2218,7 +2221,8 @@ check_log_statement(List *stmt_list) > int > check_log_duration(char *msec_str, bool was_logged) > { > - if (log_duration || log_min_duration_statement >= 0) > + if (log_duration || log_min_duration_statement >= 0 || > + (xact_is_sampled && log_xact_sample_rate > 0)) Why are both of these checked? ISTM once xact_is_sampled is set, we ought not to respect a different value of log_xact_sample_rate for the rest of that transaction. As far as I can tell xact_is_sampled is not properly reset in case of errors? Greetings, Andres Freund
On 2/14/19 9:14 PM, Andres Freund wrote: > Hi, > Hello Andres, > On 2019-01-26 11:44:58 +0100, Adrien NAYRAT wrote: >> + <varlistentry id="guc-transaction-sample-rate" xreflabel="log_transaction_sample_rate"> >> + <term><varname>log_transaction_sample_rate</varname> (<type>real</type>) >> + <indexterm> >> + <primary><varname>log_transaction_sample_rate</varname> configuration parameter</primary> >> + </indexterm> >> + </term> >> + <listitem> >> + <para> >> + Set the fraction of transactions whose statements are logged. It applies >> + to each new transaction regardless of their duration. The default is >> + <literal>0</literal>, meaning do not log statements from this transaction. >> + Setting this to <literal>1</literal> logs all statements for all transactions. >> + Logging can be disabled inside a transaction by setting this to 0. >> + <varname>log_transaction_sample_rate</varname> is helpful to track a >> + sample of transaction. >> + </para> >> + </listitem> >> + </varlistentry> >> + >> </variablelist> > > I wonder if this doesn't need a warning, explaining that using this when > there are large transactions could lead to slowdowns. Yes, I will add some wording > > >> <para> >> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c >> index 7c3a9c1e89..1a52c10c39 100644 >> --- a/src/backend/access/transam/xact.c >> +++ b/src/backend/access/transam/xact.c >> @@ -1821,6 +1821,9 @@ StartTransaction(void) >> s->state = TRANS_START; >> s->transactionId = InvalidTransactionId; /* until assigned */ >> >> + /* Determine if we log statements in this transaction */ >> + xact_is_sampled = random() <= log_xact_sample_rate * MAX_RANDOM_VALUE; >> + >> /* >> * initialize current transaction state fields >> * >> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c >> index e773f20d9f..a11667834e 100644 >> --- a/src/backend/tcop/postgres.c >> +++ b/src/backend/tcop/postgres.c >> @@ -100,7 +100,8 @@ int max_stack_depth = 100; >> /* wait N seconds to allow attach from a debugger */ >> int PostAuthDelay = 0; >> >> - >> +/* flag for logging statements in a transaction */ >> +bool xact_is_sampled = false; > > It seems pretty weird to have this in postgres.c, given you declared it > in xact.h? Yes, I have to revise my C. I will move it to src/backend/access/transam/xact.c > > >> @@ -2218,7 +2221,8 @@ check_log_statement(List *stmt_list) >> int >> check_log_duration(char *msec_str, bool was_logged) >> { >> - if (log_duration || log_min_duration_statement >= 0) >> + if (log_duration || log_min_duration_statement >= 0 || >> + (xact_is_sampled && log_xact_sample_rate > 0)) > > Why are both of these checked? ISTM once xact_is_sampled is set, we > ought not to respect a different value of log_xact_sample_rate for the > rest of that transaction. I added theses checks to allow to disable logging during a sampled transaction, per suggestion of Masahiko Sawada: https://www.postgresql.org/message-id/CAD21AoD4t%2BhTV6XfK5Yz%3DEocB8oMyJSYFfjAryCDYtqfib2GrA%40mail.gmail.com > > > As far as I can tell xact_is_sampled is not properly reset in case of > errors? > Yes, I wonder where we should reset it? Is it in AbortTransaction() or CleanupTransaction()? Thanks!
Hello, On 2/15/19 3:24 PM, Adrien NAYRAT wrote: > On 2/14/19 9:14 PM, Andres Freund wrote: >> I wonder if this doesn't need a warning, explaining that using this when >> there are large transactions could lead to slowdowns. > > Yes, I will add some wording Warning added. >> It seems pretty weird to have this in postgres.c, given you declared it >> in xact.h? > > Yes, I have to revise my C. I will move it to > src/backend/access/transam/xact.c Fixed >> Why are both of these checked? ISTM once xact_is_sampled is set, we >> ought not to respect a different value of log_xact_sample_rate for the >> rest of that transaction. > > I added theses checks to allow to disable logging during a sampled > transaction, per suggestion of Masahiko Sawada: > https://www.postgresql.org/message-id/CAD21AoD4t%2BhTV6XfK5Yz%3DEocB8oMyJSYFfjAryCDYtqfib2GrA%40mail.gmail.com I added a comment to explain why transaction logging is rechecked. >> >> As far as I can tell xact_is_sampled is not properly reset in case of >> errors? >> > I am not sure if I should disable logging in case of errors. Actually we have: postgres=# set log_transaction_sample_rate to 1; SET postgres=# set client_min_messages to 'LOG'; LOG: duration: 0.392 ms statement: set client_min_messages to 'LOG'; SET postgres=# begin; LOG: duration: 0.345 ms statement: begin; BEGIN postgres=# select 1; LOG: duration: 0.479 ms statement: select 1; ?column? ---------- 1 (1 row) postgres=# select * from missingtable; ERROR: relation "missingtable" does not exist LINE 1: select * from missingtable; ^ postgres=# select 1; ERROR: current transaction is aborted, commands ignored until end of transaction block postgres=# rollback; LOG: duration: 11390.295 ms statement: rollback; ROLLBACK If I reset xact_is_sampled (after the first error inside AbortTransaction if I am right), "rollback" statement will not be logged. I wonder if this "statement" should be logged? If the answer is no, I will reset xact_is_sampled in AbortTransaction. Patch attached with fix mentioned above, but without xact_is_sampled reset. Cheers,
Attachment
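[Editor's note: the open question above is where, and whether, to clear the flag after an error. If one did decide to reset it on abort, the change would be roughly the following. This is only a sketch against the names used in the patch, not something the thread has agreed on; the surrounding body of AbortTransaction() is elided.]

/*
 * Hypothetical placement, not part of the posted patch: clearing the
 * per-transaction sampling flag when the transaction aborts would stop
 * force-logging statements issued after the error, including the final
 * ROLLBACK -- which is exactly the behaviour being debated above.
 */
static void
AbortTransaction(void)
{
    /* ... existing abort processing ... */

    /* forget the sampling decision taken in StartTransaction() */
    xact_is_sampled = false;

    /* ... existing abort processing ... */
}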
Hi Hayato, On 3/1/19 10:17 PM, Adrien NAYRAT wrote: > > Patch attached with fix mentioned above, but without xact_is_sampled reset. Are you planning to review this patch? If not I remove you from the reviewers so perhaps somebody else will have a look. Regards, -- -David david@pgmasters.net
Dear David,

I am willing to review and have already read the patch, but I thought it was not my turn. Sorry.

Adrien,

> I did not find any test for log_min_duration_statement that could help me. LCOV indicates
> there are no tests on this part (see check_log_duration()):
> https://coverage.postgresql.org/src/backend/tcop/postgres.c.gcov.html

I understand why some test cases cannot easily be added here. It's OK.

Finally, what do you think about the deviation of the randomness? If this parameter is set very low, nothing may be logged at all because of the deviation. We could avoid this by incrementing an internal counter for each transaction and logging when the counter becomes greater than 1.

After considering this case and rebasing, I think this patch is good enough. Should I measure the change in throughput?

Best Regards,
Hayato Kuroda
Fujitsu LIMITED
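[Editor's note: as I read it, the counter-based alternative sketched here would accumulate the rate once per transaction and log whenever the accumulator crosses 1.0, making the selection exact but periodic instead of random. A self-contained illustration follows; it is purely a what-if, it is not what the patch does, and all names are made up.]

#include <stdio.h>
#include <stdbool.h>

/* Running accumulator for the deterministic variant described above. */
static double sample_accumulator = 0.0;

/*
 * Deterministic "sampling": add the configured rate for every transaction
 * and log exactly when the accumulator reaches 1.0.  With rate = 0.01 this
 * logs precisely one transaction out of every hundred, at the cost of the
 * selection being periodic rather than random.
 */
static bool
deterministic_is_sampled(double rate)
{
    sample_accumulator += rate;
    if (sample_accumulator >= 1.0)
    {
        sample_accumulator -= 1.0;
        return true;
    }
    return false;
}

int
main(void)
{
    int logged = 0;

    for (int xact = 0; xact < 1000; xact++)
        if (deterministic_is_sampled(0.01))
            logged++;

    printf("logged %d of 1000 transactions\n", logged); /* expect 10 */
    return 0;
}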
On 3/26/19 1:54 AM, Kuroda, Hayato wrote:
> Dear David,
>
> I am willing to review and have already read the patch, but I thought it was not my turn.
> Sorry.

Hello,

> Adrien,
>
>> I did not find any test for log_min_duration_statement that could help me. LCOV indicates
>> there are no tests on this part (see check_log_duration()):
>> https://coverage.postgresql.org/src/backend/tcop/postgres.c.gcov.html
>
> I understand why some test cases cannot easily be added here. It's OK.
>
> Finally, what do you think about the deviation of the randomness?
> If this parameter is set very low, nothing may be logged at all because of the deviation.
> We could avoid this by incrementing an internal counter for each transaction and logging
> when the counter becomes greater than 1.

I don't understand exactly what you want to do, and I am afraid of the cost it could add. There was a long discussion about random(); maybe that thread could answer some of your questions:
https://www.postgresql.org/message-id/3859.1545849900%40sss.pgh.pa.us

Furthermore, I wonder if it is really necessary: the goal of this patch is to log a sample of transactions, not to perform precise statistics. FYI, we do not perform this kind of check for log_statement_sample_rate.

> After considering this case and rebasing, I think this patch is good enough.
> Should I measure the change in throughput?

I don't think it is necessary, or maybe I am missing something. I already performed a performance test for log_statement_sample_rate:
https://www.postgresql.org/message-id/8a608ccf-f78b-dfad-4c5f-8e6fd068c59c%40anayrat.info

It should be less expensive, as we call random() only once per new transaction, whereas log_statement_sample_rate calls it for each statement. I don't know if there are cases where the random() call could be slow.

There is still the point raised by Andres:

>> As far as I can tell xact_is_sampled is not properly reset in case of
>> errors?
>
> I am not sure if I should disable logging in case of errors. Actually we have:
>
> postgres=# set log_transaction_sample_rate to 1;
> SET
> postgres=# set client_min_messages to 'LOG';
> LOG: duration: 0.392 ms statement: set client_min_messages to 'LOG';
> SET
> postgres=# begin;
> LOG: duration: 0.345 ms statement: begin;
> BEGIN
> postgres=# select 1;
> LOG: duration: 0.479 ms statement: select 1;
>  ?column?
> ----------
>         1
> (1 row)
>
> postgres=# select * from missingtable;
> ERROR: relation "missingtable" does not exist
> LINE 1: select * from missingtable;
>                       ^
> postgres=# select 1;
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> postgres=# rollback;
> LOG: duration: 11390.295 ms statement: rollback;
> ROLLBACK
>
> If I reset xact_is_sampled (after the first error, inside AbortTransaction if I am right),
> the "rollback" statement will not be logged. I wonder if this "statement" should be logged?
>
> If the answer is no, I will reset xact_is_sampled in AbortTransaction.

Thanks!
On 3/27/19 10:22 AM, Adrien NAYRAT wrote: > Hello, > > On 3/27/19 7:05 AM, Masahiko Sawada wrote: >>>>> Why are both of these checked? ISTM once xact_is_sampled is set, we >>>>> ought not to respect a different value of log_xact_sample_rate for the >>>>> rest of that transaction. >>>> I added theses checks to allow to disable logging during a sampled >>>> transaction, per suggestion of Masahiko Sawada: >>>> https://www.postgresql.org/message-id/CAD21AoD4t%2BhTV6XfK5Yz%3DEocB8oMyJSYFfjAryCDYtqfib2GrA%40mail.gmail.com >>>> >>> I added a comment to explain why transaction logging is rechecked. >> Umm, I'm inclined to think that we should not disable logging within >> the transaction. If we allow that, since users can disable the logging >> within the transaction by setting it to 0 they may think that we can >> change the rate during the transaction, which is wrong. If we want >> this behavior we should document it but we can make user not being >> able to change the rate during the transaction to avoid confusion. And >> the latter looks more understandable and straightforward. This is >> different comment what I did before, I'm sorry for confusing you. > > Don't worry, I will change that. > > >> >>>>> As far as I can tell xact_is_sampled is not properly reset in case of >>>>> errors? >>>>> >>> I am not sure if I should disable logging in case of errors. Actually we >>> have: >>> >>> postgres=# set log_transaction_sample_rate to 1; >>> SET >>> postgres=# set client_min_messages to 'LOG'; >>> LOG: duration: 0.392 ms statement: set client_min_messages to 'LOG'; >>> SET >>> postgres=# begin; >>> LOG: duration: 0.345 ms statement: begin; >>> BEGIN >>> postgres=# select 1; >>> LOG: duration: 0.479 ms statement: select 1; >>> ?column? >>> ---------- >>> 1 >>> (1 row) >>> >>> postgres=# select * from missingtable; >>> ERROR: relation "missingtable" does not exist >>> LINE 1: select * from missingtable; >>> ^ >>> postgres=# select 1; >>> ERROR: current transaction is aborted, commands ignored until end of >>> transaction block >>> postgres=# rollback; >>> LOG: duration: 11390.295 ms statement: rollback; >>> ROLLBACK >>> >>> If I reset xact_is_sampled (after the first error inside >>> AbortTransaction if I am right), "rollback" statement will not be >>> logged. I wonder if this "statement" should be logged? >>> >>> If the answer is no, I will reset xact_is_sampled in AbortTransaction. >>> >> FWIW, I'd prefer to log "rollback" as well so as user can recognize >> the end of transaction. > > Ok. > >> >> + /* Determine if statements are logged in this transaction */ >> + xact_is_sampled = random() <= log_xact_sample_rate * >> MAX_RANDOM_VALUE; >> >> If log_xact_sample_rate is 1 we always log all statement in the >> transaction regardless of value of random(). Similarly if it's 0, we >> never log. I think we can avoid unnecessary random() call in both case >> like log_statement_sample_rate does. > > I agree, I will change that. > >> >> {"log_statement_sample_rate", PGC_SUSET, LOGGING_WHEN, >> gettext_noop("Fraction of statements over >> log_min_duration_statement to log."), >> gettext_noop("If you only want a sample, use a >> value between 0 (never " >> - "log) and 1.0 (always >> log).") >> + "log) and 1 (always >> log).") >> }, >> &log_statement_sample_rate, >> 1.0, 0.0, 1.0, >> NULL, NULL, NULL >> }, >> >> This change is not relevant with log_transaction_sample_rate feature >> but why do we need this change? 
In postgresql.conf.sample the >> description of both log_statement_sample_rate and >> log_transaction_sample_rate use 1.0 instead of 1, like "1.0 logs all >> statements from all transactions, 0 never logs". If we need this >> change I think it should be a separate patch. > > > Sorry, I changed that, someone suggest using either "0" and "1", or > "0.0" and "1.0" but not mixing both. I will remove this change. > > Thanks for your review. > > > Patch attached with all changes requested.
Attachment
Thank you for updating the patch! On Wed, Mar 27, 2019 at 10:28 PM Adrien NAYRAT <adrien.nayrat@anayrat.info> wrote: > > On 3/27/19 10:22 AM, Adrien NAYRAT wrote: > > Hello, > > > > On 3/27/19 7:05 AM, Masahiko Sawada wrote: > >>>>> Why are both of these checked? ISTM once xact_is_sampled is set, we > >>>>> ought not to respect a different value of log_xact_sample_rate for the > >>>>> rest of that transaction. > >>>> I added theses checks to allow to disable logging during a sampled > >>>> transaction, per suggestion of Masahiko Sawada: > >>>> https://www.postgresql.org/message-id/CAD21AoD4t%2BhTV6XfK5Yz%3DEocB8oMyJSYFfjAryCDYtqfib2GrA%40mail.gmail.com > >>>> > >>> I added a comment to explain why transaction logging is rechecked. > >> Umm, I'm inclined to think that we should not disable logging within > >> the transaction. If we allow that, since users can disable the logging > >> within the transaction by setting it to 0 they may think that we can > >> change the rate during the transaction, which is wrong. If we want > >> this behavior we should document it but we can make user not being > >> able to change the rate during the transaction to avoid confusion. And > >> the latter looks more understandable and straightforward. This is > >> different comment what I did before, I'm sorry for confusing you. > > > > Don't worry, I will change that. > > > > > >> > >>>>> As far as I can tell xact_is_sampled is not properly reset in case of > >>>>> errors? > >>>>> > >>> I am not sure if I should disable logging in case of errors. Actually we > >>> have: > >>> > >>> postgres=# set log_transaction_sample_rate to 1; > >>> SET > >>> postgres=# set client_min_messages to 'LOG'; > >>> LOG: duration: 0.392 ms statement: set client_min_messages to 'LOG'; > >>> SET > >>> postgres=# begin; > >>> LOG: duration: 0.345 ms statement: begin; > >>> BEGIN > >>> postgres=# select 1; > >>> LOG: duration: 0.479 ms statement: select 1; > >>> ?column? > >>> ---------- > >>> 1 > >>> (1 row) > >>> > >>> postgres=# select * from missingtable; > >>> ERROR: relation "missingtable" does not exist > >>> LINE 1: select * from missingtable; > >>> ^ > >>> postgres=# select 1; > >>> ERROR: current transaction is aborted, commands ignored until end of > >>> transaction block > >>> postgres=# rollback; > >>> LOG: duration: 11390.295 ms statement: rollback; > >>> ROLLBACK > >>> > >>> If I reset xact_is_sampled (after the first error inside > >>> AbortTransaction if I am right), "rollback" statement will not be > >>> logged. I wonder if this "statement" should be logged? > >>> > >>> If the answer is no, I will reset xact_is_sampled in AbortTransaction. > >>> > >> FWIW, I'd prefer to log "rollback" as well so as user can recognize > >> the end of transaction. > > > > Ok. > > > >> > >> + /* Determine if statements are logged in this transaction */ > >> + xact_is_sampled = random() <= log_xact_sample_rate * > >> MAX_RANDOM_VALUE; > >> > >> If log_xact_sample_rate is 1 we always log all statement in the > >> transaction regardless of value of random(). Similarly if it's 0, we > >> never log. I think we can avoid unnecessary random() call in both case > >> like log_statement_sample_rate does. > > > > I agree, I will change that. 
> > > >> > >> {"log_statement_sample_rate", PGC_SUSET, LOGGING_WHEN, > >> gettext_noop("Fraction of statements over > >> log_min_duration_statement to log."), > >> gettext_noop("If you only want a sample, use a > >> value between 0 (never " > >> - "log) and 1.0 (always > >> log).") > >> + "log) and 1 (always > >> log).") > >> }, > >> &log_statement_sample_rate, > >> 1.0, 0.0, 1.0, > >> NULL, NULL, NULL > >> }, > >> > >> This change is not relevant with log_transaction_sample_rate feature > >> but why do we need this change? In postgresql.conf.sample the > >> description of both log_statement_sample_rate and > >> log_transaction_sample_rate use 1.0 instead of 1, like "1.0 logs all > >> statements from all transactions, 0 never logs". If we need this > >> change I think it should be a separate patch. > > > > > > Sorry, I changed that, someone suggest using either "0" and "1", or > > "0.0" and "1.0" but not mixing both. Agreed with using "0.0" and "1.0". > > I will remove this change. > > --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample +#log_transaction_sample_rate = 0 # Fraction of transactions whose statements + # are logged regardless of their duration. 1.0 logs all + # statements from all transactions, 0 never logs. --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c + {"log_transaction_sample_rate", PGC_SUSET, LOGGING_WHEN, + gettext_noop("Set the fraction of transactions to log for new transactions."), + gettext_noop("Logs all statements from a fraction of transactions. " + "Use a value between 0 (never log) and 1 (log all " + "statements for all transactions).") I think it would be better to use the same number in both guc.c and postgresql.conf.sample. How about unifying them to "0.0" and "1.0" for consistency? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Hello, On 3/28/19 3:18 AM, Masahiko Sawada wrote: >>> >>> Sorry, I changed that, someone suggest using either "0" and "1", or >>> "0.0" and "1.0" but not mixing both. > > Agreed with using "0.0" and "1.0". > >>> I will remove this change. >>> > > --- a/src/backend/utils/misc/postgresql.conf.sample > +++ b/src/backend/utils/misc/postgresql.conf.sample > +#log_transaction_sample_rate = 0 # Fraction of transactions > whose statements > + # are logged regardless of > their duration. 1.0 logs all > + # statements from all > transactions, 0 never logs. > > --- a/src/backend/utils/misc/guc.c > +++ b/src/backend/utils/misc/guc.c > + {"log_transaction_sample_rate", PGC_SUSET, LOGGING_WHEN, > + gettext_noop("Set the fraction of transactions > to log for new transactions."), > + gettext_noop("Logs all statements from a > fraction of transactions. " > + "Use a value between > 0 (never log) and 1 (log all " > + "statements for all > transactions).") > > I think it would be better to use the same number in both guc.c and > postgresql.conf.sample. How about unifying them to "0.0" and "1.0" for > consistency? > I changed to use both 0.0 and 1.0. Thanks!
Attachment
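[Editor's note: following the review comments above about skipping the random() call when the rate is exactly 0.0 or 1.0 (the same shortcut log_statement_sample_rate uses), the per-transaction decision would roughly take the shape below. The helper name is made up, the MAX_RANDOM_VALUE definition is reproduced only so the sketch compiles on its own, and the committed code may differ in detail.]

#include <stdbool.h>
#include <stdlib.h>

/* Largest value random() can return (2^31 - 1); PostgreSQL's sources define
 * a MAX_RANDOM_VALUE constant for this, reproduced here for self-containment. */
#define MAX_RANDOM_VALUE 0x7FFFFFFF

/*
 * Per-transaction sampling decision: never call random() when the rate is
 * exactly 0.0 (never sample) or 1.0 (always sample), and otherwise compare
 * one random draw against the configured fraction.
 */
static bool
decide_xact_sampling(double log_xact_sample_rate)
{
    return log_xact_sample_rate != 0 &&
           (log_xact_sample_rate == 1 ||
            random() <= log_xact_sample_rate * MAX_RANDOM_VALUE);
}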
On Thu, Mar 28, 2019 at 6:33 PM Adrien NAYRAT <adrien.nayrat@anayrat.info> wrote: > > Hello, > > On 3/28/19 3:18 AM, Masahiko Sawada wrote: > >>> > >>> Sorry, I changed that, someone suggest using either "0" and "1", or > >>> "0.0" and "1.0" but not mixing both. > > > > Agreed with using "0.0" and "1.0". > > > >>> I will remove this change. > >>> > > > > --- a/src/backend/utils/misc/postgresql.conf.sample > > +++ b/src/backend/utils/misc/postgresql.conf.sample > > +#log_transaction_sample_rate = 0 # Fraction of transactions > > whose statements > > + # are logged regardless of > > their duration. 1.0 logs all > > + # statements from all > > transactions, 0 never logs. > > > > --- a/src/backend/utils/misc/guc.c > > +++ b/src/backend/utils/misc/guc.c > > + {"log_transaction_sample_rate", PGC_SUSET, LOGGING_WHEN, > > + gettext_noop("Set the fraction of transactions > > to log for new transactions."), > > + gettext_noop("Logs all statements from a > > fraction of transactions. " > > + "Use a value between > > 0 (never log) and 1 (log all " > > + "statements for all > > transactions).") > > > > I think it would be better to use the same number in both guc.c and > > postgresql.conf.sample. How about unifying them to "0.0" and "1.0" for > > consistency? > > > > I changed to use both 0.0 and 1.0. > Thank you for updating the patch! The patch looks good to me. I'll mark this patch as Ready for Committer if other reviewer has no review comment on v7 patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: Adrien NAYRAT 2019-03-28 <c86d676a-0b1e-f9be-3da2-63fa15070db0@anayrat.info>
> > > > Sorry, I changed that, someone suggested using either "0" and "1", or
> > > > "0.0" and "1.0", but not mixing both.
> >
> > Agreed with using "0.0" and "1.0".
> >
> > > > I will remove this change.

> +#log_transaction_sample_rate = 0  # Fraction of transactions whose statements
> +                                   # are logged regardless of their duration. 1.0 logs all
> +                                   # statements from all transactions, 0.0 never logs.

This still looks weird to me. If the default is to never log, and that's 0.0 as per the comment, the setting should be 0.0 as well.

The same applies to log_statement_sample_rate, which is already committed, where it looks even worse:

#log_statement_sample_rate = 1  # Fraction of logged statements over
                                # log_min_duration_statement. 1.0 logs all statements,
                                # 0 never logs.

Is "over" even correct English? ("out of"? "from"?)

Christoph
On 3/28/19 10:45 AM, Christoph Berg wrote:
> Re: Adrien NAYRAT 2019-03-28 <c86d676a-0b1e-f9be-3da2-63fa15070db0@anayrat.info>
>>>>> Sorry, I changed that, someone suggested using either "0" and "1", or
>>>>> "0.0" and "1.0", but not mixing both.
>>>
>>> Agreed with using "0.0" and "1.0".
>>>
>>>>> I will remove this change.
>
>> +#log_transaction_sample_rate = 0  # Fraction of transactions whose statements
>> +                                   # are logged regardless of their duration. 1.0 logs all
>> +                                   # statements from all transactions, 0.0 never logs.
>
> This still looks weird to me. If the default is to never log, and
> that's 0.0 as per the comment, the setting should be 0.0 as well.

Right, fixed in the attached patch.

> The same applies to log_statement_sample_rate, which is already
> committed, where it looks even worse:

Yes, I fixed it here as well, but as it is not related to this patch it should probably go into a separate patch.

> #log_statement_sample_rate = 1  # Fraction of logged statements over
>                                 # log_min_duration_statement. 1.0 logs all statements,
>                                 # 0 never logs.
>
> Is "over" even correct English? ("out of"? "from"?)

I have no idea; I am not a native English speaker.

Thanks,
Attachment
Dear Adrien,

I understand that the cost of the randomization is very low, so it is probably OK. I'll change the status to "Ready for Committer".

Finally, I apologize for the delayed review. Thank you for the good opportunity.

Best Regards,
Hayato Kuroda
Fujitsu LIMITED
On 3/29/19 3:06 AM, Kuroda, Hayato wrote:
> Dear Adrien,

Hello Kuroda-san,

> I understand that the cost of the randomization is very low, so it is probably OK.
> I'll change the status to "Ready for Committer".

Thanks, I hope it will be committed in PG12, as the feature is close to log_statement_sample_rate.

> Finally, I apologize for the delayed review.
> Thank you for the good opportunity.

Don't worry, we all do what we can. Thanks for your review ;)

Best Regards,
On 2019-Mar-29, Adrien NAYRAT wrote: > On 3/29/19 3:06 AM, Kuroda, Hayato wrote: > > Dear Adrien, > > Hello Kuroda-san, > > > > > I understood the cost of randomizing is very low. Maybe it's OK.. > > I'll change the status to "Ready For Committer." > > Thanks, I hope it will be commited in PG12 as the feature is closed to > log_statement_sample_rate. Done, thanks. I tweaked the wording of the docs somewhat; please review that part. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4/3/19 11:45 PM, Alvaro Herrera wrote:
> On 2019-Mar-29, Adrien NAYRAT wrote:
>
>> On 3/29/19 3:06 AM, Kuroda, Hayato wrote:
>>> Dear Adrien,
>>
>> Hello Kuroda-san,
>>
>>> I understand that the cost of the randomization is very low, so it is probably OK.
>>> I'll change the status to "Ready for Committer".
>>
>> Thanks, I hope it will be committed in PG12, as the feature is close to
>> log_statement_sample_rate.
>
> Done, thanks. I tweaked the wording of the docs somewhat; please
> review that part.

Thanks Alvaro for committing this, and thanks to all reviewers as well. As I am not a native English speaker, I don't have any specific advice on the documentation "style".