Thread: Re: Truncate logs by max_log_size
On Thu, Sep 26, 2024, at 3:30 PM, diPhantxm wrote:
I would like to suggest a patch to truncate some long queries. I believe sometimes there is no need to log a query containing some gigabytes of blob, for example. In patch a new parameter, named max_log_size, is introduced. It defines the maximum size of logged query, in bytes. Everything beyond that size is truncated.
I don't know how useful is this proposal. IMO the whole query is usually
crucial for an analysis. Let's say you arbitrarily provide max_log_size = 100
but it means you cannot see a WHERE clause and you have a performance issue in
that query. It won't be possible to obtain the whole query for an EXPLAIN. It
would break audit systems that requires the whole query. I don't know if there
are some log-based replication systems but it would break such tools too.
There are other ways to avoid logging such long queries. The GRANT ... ON
PARAMETER and SET LOCAL commands are your friends. Hence, you can disable
specific long queries even if you are not a superuser.
If your main problem is disk space, you can adjust the rotation settings or have
an external tool to manage your log files (or even use syslog).
> On 27 Sep 2024, at 03:30, Euler Taveira <euler@eulerto.com> wrote: > > Let's say you arbitrarily provide max_log_size = 100 Consider max_log_size = 10Mb. The perspective might look very different. It’s not about WHERE anymore. It's a guard againstheavy abuse. The feature looks very important for me. Best regards, Andrey Borodin.
On 27.09.24 12:36, Andrey M. Borodin wrote: > Consider max_log_size = 10Mb. The perspective might look very different. > It’s not about WHERE anymore. It's a guard against heavy abuse. > > The feature looks very important for me. I have the same opinion. As a fail safe it sounds very useful to me. Unfortunately, the patch does not apply: $ git apply ~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch -v Checking patch src/backend/utils/error/elog.c... error: while searching for: char *Log_destination_string = NULL; bool syslog_sequence_numbers = true; bool syslog_split_messages = true; /* Processed form of backtrace_symbols GUC */ static char *backtrace_symbol_list; error: patch failed: src/backend/utils/error/elog.c:114 error: src/backend/utils/error/elog.c: patch does not apply Checking patch src/backend/utils/misc/guc_tables.c... Hunk #1 succeeded at 3714 (offset 247 lines). Checking patch src/backend/utils/misc/postgresql.conf.sample... Hunk #1 succeeded at 615 (offset 23 lines). Checking patch src/bin/pg_ctl/t/004_logrotate.pl... error: while searching for: check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node); check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node); $node->stop(); done_testing(); error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135 error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply Checking patch src/include/utils/elog.h... Hunk #1 succeeded at 502 (offset 1 line). Does it truncate only single queries or also a transaction with many statements? Thanks! Best, Jim
On Fri, Sep 27, 2024 at 6:37 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
Consider max_log_size = 10Mb. The perspective might look very different. It’s not about WHERE anymore. It's a guard against heavy abuse.
Can you elaborate on this? Do you mean someone purposefully writing large entries to your log file?
On 01.10.24 17:46, Kirill Gavrilov wrote: > My apologies, attached patch should work on master branch. Nice. I tested the feature and it does what it is intended to. postgres=# SHOW max_log_size; max_log_size -------------- 10 (1 row) postgres=# INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9,0); ERROR: relation "t" does not exist LINE 1: INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9,0); Log file: 2024-10-02 00:57:13.618 CEST [1975926] ERROR: relation "t" does not exist at character 13 2024-10-02 00:57:13.618 CEST [1975926] STATEMENT: INSERT INT A few observations: 1) You missed a const qualifier in elog.c: elog.c: In function ‘EmitErrorReport’: elog.c:1699:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 1699 | char* str = debug_query_string; | ^~~~~~~~~~~~~~~~~~ 2) The new parameter cannot be set within a session with SET. Is it supposed to be like this? IMHO it would be nice to able to temporarily set this parameter without having to reload the postgresql.conf postgres=# SET max_log_size TO 100; ERROR: parameter "max_log_size" cannot be changed now 3) I personally find -1 more intuitive than 0 to disable a parameter, but I have no strong opinion about it. 4) It still lacks documentation. -- Jim
On Tue, 1 Oct 2024 at 20:46, Kirill Gavrilov <diphantxm@gmail.com> wrote: > > On Tue, Oct 1, 2024 at 2:46 PM Jim Jones <jim.jones@uni-muenster.de> wrote: >> >> >> On 27.09.24 12:36, Andrey M. Borodin wrote: >> > Consider max_log_size = 10Mb. The perspective might look very different. >> > It’s not about WHERE anymore. It's a guard against heavy abuse. >> > >> > The feature looks very important for me. >> I have the same opinion. As a fail safe it sounds very useful to me. >> >> Unfortunately, the patch does not apply: >> >> $ git apply >> ~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch >> -v >> Checking patch src/backend/utils/error/elog.c... >> error: while searching for: >> char *Log_destination_string = NULL; >> bool syslog_sequence_numbers = true; >> bool syslog_split_messages = true; >> >> /* Processed form of backtrace_symbols GUC */ >> static char *backtrace_symbol_list; >> >> error: patch failed: src/backend/utils/error/elog.c:114 >> error: src/backend/utils/error/elog.c: patch does not apply >> Checking patch src/backend/utils/misc/guc_tables.c... >> Hunk #1 succeeded at 3714 (offset 247 lines). >> Checking patch src/backend/utils/misc/postgresql.conf.sample... >> Hunk #1 succeeded at 615 (offset 23 lines). >> Checking patch src/bin/pg_ctl/t/004_logrotate.pl... >> error: while searching for: >> check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node); >> check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node); >> >> $node->stop(); >> >> done_testing(); >> >> error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135 >> error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply >> Checking patch src/include/utils/elog.h... >> Hunk #1 succeeded at 502 (offset 1 line). >> >> Does it truncate only single queries or also a transaction with many >> statements? >> >> Thanks! >> >> Best, Jim > > > My apologies, attached patch should work on master branch. > It truncates single queries and a statement in transaction that occurred an error. > Hello! Please deliver the most recent patch version and fix the issues Jim identified [0] as the current commitfest draws to a close. Do not forget to include both parts of this patch (This was actually developed off-list, and we are now using this on our cloud PostgreSQL distribution on versions 12–17). [0] https://www.postgresql.org/message-id/35096a36-04d4-480b-a7cd-a2d8151fb737%40uni-muenster.de -- Best regards, Kirill Reshke
On 25.11.24 17:52, Kirill Reshke wrote: > Hello! Please deliver the most recent patch version and fix the issues > Jim identified [0] as the current commitfest draws to a close. > Do not forget to include both parts of this patch (This was actually > developed off-list, and we are now using this on our cloud PostgreSQL > distribution on versions 12–17). > > [0] https://www.postgresql.org/message-id/35096a36-04d4-480b-a7cd-a2d8151fb737%40uni-muenster.de In addition to these points, this feature seems to fail with queries containing special characters (more than one byte): With this "max_log_size".. postgres=# SHOW max_log_size; max_log_size -------------- 20 (1 row) ... and this query .. postgres=# SELECT "ÄÜÖ" FROM t; ERROR: relation "t" does not exist LINE 1: SELECT "ÄÜÖ" FROM t; .. this is the [truncated] log entry we get .. 2024-11-28 14:58:57.912 CET [2258876] ERROR: relation "t" does not exist at character 19 2024-11-28 14:58:57.912 CET [2258876] STATEMENT: SELECT "ÄÜÖ" FROM ... although the query originally had exactly 20 characters: postgres=# SELECT length('SELECT "ÄÜÖ" FROM t;'); length -------- 20 (1 row) postgres=# SELECT length('ÄÜÖ'::bytea), length('AUO'::bytea); length | length --------+-------- 6 | 3 (1 row) If it is supposed to be like this, it should be clearly stated so in the docs. -- Jim
On 28.11.24 20:20, Kirill Gavrilov wrote: > Here is version 3 of this patch. I found another place where this > setting can be applied. > Also added some documentation and specified that this setting > truncates queries by size in bytes. Thanks. It is now possible to change the parameter using SET postgres=# SET max_log_size TO 15; SET postgres=# SHOW max_log_size ; max_log_size -------------- 15 (1 row) In the postgresql.conf the default value is set to 0 #max_log_size = 0 But if we take a look at the parameter it shows something else postgres=# SHOW max_log_size ; max_log_size -------------- 5242880 (1 row) Perhaps it would be best to keep the default value in the postgresql.conf consistent with the documentation and the actual max_log_size value. IMHO 0 (or -1) should disable it. There are also several compilation issues: postgres.c: In function ‘exec_simple_query’: postgres.c:1040:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] 1040 | bool copied = false; | ^~~~ elog.c: In function ‘EmitErrorReport’: elog.c:1697:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] 1697 | const char* old_query_string = debug_query_string; | ^~~~~ elog.c:1744:23: warning: passing argument 1 of ‘pfree’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 1744 | pfree(debug_query_string); | ^~~~~~~~~~~~~~~~~~ In file included from ../../../../src/include/postgres.h:47, from elog.c:55: ../../../../src/include/utils/palloc.h:86:25: note: expected ‘void *’ but argument is of type ‘const char *’ 86 | extern void pfree(void *pointer); | ~~~~~~^~~~~~~ elog.c: In function ‘build_query_log’: elog.c:3798:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] 3798 | size_t query_len = strlen(query); | ^~~~~~ elog.c:3801:24: warning: return discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 3801 | return query; | ^~~~~ elog.c:3805:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] 3805 | size_t query_log_len = pg_mbcliplen(query, query_len, max_log_size); | ^~~~~~ and there is a trailing whitespace at: + * If query needs to be truncated, copied will be set to true /home/jim/patches/max_log_query/V3-0001-parameter-max_log_size-to-truncate-logs.patch:141: trailing whitespace. * If query needs to be truncated, copied will be set to true Checking patch doc/src/sgml/config.sgml... Hunk #1 succeeded at 7865 (offset -48 lines). Checking patch src/backend/tcop/postgres.c... Hunk #1 succeeded at 71 (offset 1 line). Hunk #2 succeeded at 1031 (offset 1 line). Hunk #3 succeeded at 1083 (offset 1 line). Hunk #4 succeeded at 1382 (offset 1 line). Hunk #5 succeeded at 1393 (offset 1 line). Checking patch src/backend/utils/error/elog.c... Hunk #4 succeeded at 3781 (offset -3 lines). Checking patch src/backend/utils/misc/guc_tables.c... Hunk #1 succeeded at 3714 (offset -10 lines). Checking patch src/backend/utils/misc/postgresql.conf.sample... Hunk #1 succeeded at 615 (offset -2 lines). Checking patch src/bin/pg_ctl/t/004_logrotate.pl... Checking patch src/include/utils/elog.h... Applied patch doc/src/sgml/config.sgml cleanly. Applied patch src/backend/tcop/postgres.c cleanly. Applied patch src/backend/utils/error/elog.c cleanly. Applied patch src/backend/utils/misc/guc_tables.c cleanly. Applied patch src/backend/utils/misc/postgresql.conf.sample cleanly. Applied patch src/bin/pg_ctl/t/004_logrotate.pl cleanly. Applied patch src/include/utils/elog.h cleanly. warning: 1 line adds whitespace errors. -- Jim
On 29.11.24 12:05, Kirill Gavrilov wrote: > Setting max_log_size to 0 already disables truncation. > I changed guc units to bytes, just to make it more human-friendly. > Did some refactoring and fixed all warnings. Set default value in > postgresql.conf. Nice. The patch applies cleanly and the warnings are gone. A few other things: IMHO the documentation can be a bit clearer. It says to input the values in bytes, but the default value is in megabytes. + If greater than zero, each query logged is truncated to this many bytes. + Zero disables the setting. + Default value is 5MB. Perhaps something like "If this value is specified without units, it is taken as bytes." (see wal_keep_size) And a pedantic note: This hint suggests that you can use TB as unit .. postgres=# set max_log_size = '1foo'; ERROR: invalid value for parameter "max_log_size": "1foo" HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB". .. but it can never be used, as it exceeds the integer range: postgres=# SET max_log_size = '1TB'; ERROR: invalid value for parameter "max_log_size": "1TB" HINT: Value exceeds integer range. Thanks -- Jim
On Sat, 30 Nov 2024 at 01:57, Kirill Gavrilov <diphantxm@gmail.com> wrote: > > On Fri, Nov 29, 2024 at 10:12 PM Jim Jones <jim.jones@uni-muenster.de> wrote: >> >> >> >> On 29.11.24 12:05, Kirill Gavrilov wrote: >> > Setting max_log_size to 0 already disables truncation. >> > I changed guc units to bytes, just to make it more human-friendly. >> > Did some refactoring and fixed all warnings. Set default value in >> > postgresql.conf. >> >> Nice. The patch applies cleanly and the warnings are gone. >> >> A few other things: >> >> IMHO the documentation can be a bit clearer. It says to input the values >> in bytes, but the default value is in megabytes. >> >> + If greater than zero, each query logged is truncated to this >> many bytes. >> + Zero disables the setting. >> + Default value is 5MB. >> >> Perhaps something like "If this value is specified without units, it is >> taken as bytes." (see wal_keep_size) > > > Added to documentation. I hope it's clear now. > >> >> >> And a pedantic note: This hint suggests that you can use TB as unit .. >> >> postgres=# set max_log_size = '1foo'; >> ERROR: invalid value for parameter "max_log_size": "1foo" >> HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB". >> >> .. but it can never be used, as it exceeds the integer range: >> >> postgres=# SET max_log_size = '1TB'; >> ERROR: invalid value for parameter "max_log_size": "1TB" >> HINT: Value exceeds integer range. >> >> Thanks >> >> -- >> Jim > > > Same thing applies to log_parameter_max_length, for example. > > postgres=# set log_parameter_max_length = '1foo'; > ERROR: invalid value for parameter "log_parameter_max_length": "1foo" > HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB". > postgres=# set log_parameter_max_length = '1TB'; > ERROR: invalid value for parameter "log_parameter_max_length": "1TB" > HINT: Value exceeds integer range. > > I think we can leave it as is. Hi > +for (my $attempts = 0; $attempts < $max_attempts; $attempts++) > +{ > + eval { > + $current_logfiles = slurp_file($node->data_dir . '/current_logfiles'); > + }; > + last unless $@; > + usleep(100_000); > +} `usleep` in tap tests is usually a bad pattern. Do we have a chance to test this using `wait_for_log` or similar? -- Best regards, Kirill Reshke
Hi
> +for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
> +{
> + eval {
> + $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
> + };
> + last unless $@;
> + usleep(100_000);
> +}
`usleep` in tap tests is usually a bad pattern. Do we have a chance to
test this using `wait_for_log` or similar?
I'm not sure we can use `wait_for_log` because it checks for only one logfile. But even if it's possible, I don't think it's a good idea to use different checks in the same file or to change tests for another feature. I used test case from above as an example for mine.
On 29.11.24 21:57, Kirill Gavrilov wrote: > Same thing applies to log_parameter_max_length, for example. > > postgres=# set log_parameter_max_length = '1foo'; > ERROR: invalid value for parameter "log_parameter_max_length": "1foo" > HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB". > postgres=# set log_parameter_max_length = '1TB'; > ERROR: invalid value for parameter "log_parameter_max_length": "1TB" > HINT: Value exceeds integer range. > > I think we can leave it as is. I see. So I guess it is out of scope to change this message here. Small nitpicks: 1) The indentation of the comment at postgresql.conf.sample is a little bit off #max_log_size = 0 # max size of logged statement # 0 disables the feature IMHO it looks better like this: #max_log_size = 0 # max size of logged statement # 0 disables the feature 2) You introduced a trailing whitespace at L34 (Not critical :)) + Zero disables the setting. It happens to me all the time, so I usually try to apply my patches in a clean branch just to make sure I didn't miss anything. Other than that, I have nothing more to add at this point. Thanks -- Jim