Thread: Re: Truncate logs by max_log_size

Re: Truncate logs by max_log_size

From
"Euler Taveira"
Date:
On Thu, Sep 26, 2024, at 3:30 PM, diPhantxm wrote:
I would like to suggest a patch to truncate some long queries. I believe sometimes there is no need to log a query containing some gigabytes of blob, for example. In patch a new parameter, named max_log_size, is introduced. It defines the maximum size of logged query, in bytes. Everything beyond that size is truncated.

I don't know how useful is this proposal. IMO the whole query is usually
crucial for an analysis. Let's say you arbitrarily provide max_log_size = 100
but it means you cannot see a WHERE clause and you have a performance issue in
that query. It won't be possible to obtain the whole query for an EXPLAIN. It
would break audit systems that requires the whole query. I don't know if there
are some log-based replication systems but it would break such tools too.

There are other ways to avoid logging such long queries. The GRANT ... ON
PARAMETER and SET LOCAL commands are your friends. Hence, you can disable
specific long queries even if you are not a superuser.

If your main problem is disk space, you can adjust the rotation settings or have
an external tool to manage your log files (or even use syslog).


--
Euler Taveira

Re: Truncate logs by max_log_size

From
"Andrey M. Borodin"
Date:

> On 27 Sep 2024, at 03:30, Euler Taveira <euler@eulerto.com> wrote:
>
> Let's say you arbitrarily provide max_log_size = 100

Consider max_log_size = 10Mb. The perspective might look very different. It’s not about WHERE anymore. It's a guard
againstheavy abuse. 

The feature looks very important for me.


Best regards, Andrey Borodin.


Re: Truncate logs by max_log_size

From
Jim Jones
Date:
On 27.09.24 12:36, Andrey M. Borodin wrote:
> Consider max_log_size = 10Mb. The perspective might look very different.
>  It’s not about WHERE anymore. It's a guard against heavy abuse.
>
> The feature looks very important for me.
I have the same opinion. As a fail safe it sounds very useful to me.

Unfortunately, the patch does not apply:

$ git apply
~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch
-v
Checking patch src/backend/utils/error/elog.c...
error: while searching for:
char       *Log_destination_string = NULL;
bool            syslog_sequence_numbers = true;
bool            syslog_split_messages = true;

/* Processed form of backtrace_symbols GUC */
static char *backtrace_symbol_list;

error: patch failed: src/backend/utils/error/elog.c:114
error: src/backend/utils/error/elog.c: patch does not apply
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset 247 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset 23 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
error: while searching for:
check_log_pattern('csvlog',  $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);

$node->stop();

done_testing();

error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135
error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply
Checking patch src/include/utils/elog.h...
Hunk #1 succeeded at 502 (offset 1 line).

Does it truncate only single queries or also a transaction with many
statements?

Thanks!

Best, Jim



Re: Truncate logs by max_log_size

From
Greg Sabino Mullane
Date:
On Fri, Sep 27, 2024 at 6:37 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
Consider max_log_size = 10Mb. The perspective might look very different. It’s not about WHERE anymore. It's a guard against heavy abuse.

Can you elaborate on this? Do you mean someone purposefully writing large entries to your log file?
 

Re: Truncate logs by max_log_size

From
Jim Jones
Date:

On 01.10.24 17:46, Kirill Gavrilov wrote:
>  My apologies, attached patch should work on master branch.

Nice.

I tested the feature and it does what it is intended to.

postgres=# SHOW max_log_size;
 max_log_size
--------------
 10
(1 row)

postgres=# INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9,0);
ERROR:  relation "t" does not exist
LINE 1: INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9,0);

Log file:

2024-10-02 00:57:13.618 CEST [1975926] ERROR:  relation "t" does not
exist at character 13
2024-10-02 00:57:13.618 CEST [1975926] STATEMENT:  INSERT INT

A few observations:

1) You missed a const qualifier in elog.c:

elog.c: In function ‘EmitErrorReport’:
elog.c:1699:29: warning: initialization discards ‘const’ qualifier from
pointer target type [-Wdiscarded-qualifiers]
 1699 |                 char* str = debug_query_string;
      |                             ^~~~~~~~~~~~~~~~~~

2) The new parameter cannot be set within a session with SET. Is it
supposed to be like this? IMHO it would be nice to able to temporarily
set this parameter without having to reload the postgresql.conf

postgres=# SET max_log_size TO 100;
ERROR:  parameter "max_log_size" cannot be changed now

3) I personally find -1 more intuitive than 0 to disable a parameter,
but I have no strong opinion about it.

4) It still lacks documentation.

-- 
Jim




Re: Truncate logs by max_log_size

From
Kirill Reshke
Date:
On Tue, 1 Oct 2024 at 20:46, Kirill Gavrilov <diphantxm@gmail.com> wrote:
>
> On Tue, Oct 1, 2024 at 2:46 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
>>
>>
>> On 27.09.24 12:36, Andrey M. Borodin wrote:
>> > Consider max_log_size = 10Mb. The perspective might look very different.
>> >  It’s not about WHERE anymore. It's a guard against heavy abuse.
>> >
>> > The feature looks very important for me.
>> I have the same opinion. As a fail safe it sounds very useful to me.
>>
>> Unfortunately, the patch does not apply:
>>
>> $ git apply
>> ~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch
>> -v
>> Checking patch src/backend/utils/error/elog.c...
>> error: while searching for:
>> char       *Log_destination_string = NULL;
>> bool            syslog_sequence_numbers = true;
>> bool            syslog_split_messages = true;
>>
>> /* Processed form of backtrace_symbols GUC */
>> static char *backtrace_symbol_list;
>>
>> error: patch failed: src/backend/utils/error/elog.c:114
>> error: src/backend/utils/error/elog.c: patch does not apply
>> Checking patch src/backend/utils/misc/guc_tables.c...
>> Hunk #1 succeeded at 3714 (offset 247 lines).
>> Checking patch src/backend/utils/misc/postgresql.conf.sample...
>> Hunk #1 succeeded at 615 (offset 23 lines).
>> Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
>> error: while searching for:
>> check_log_pattern('csvlog',  $new_current_logfiles, 'syntax error', $node);
>> check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
>>
>> $node->stop();
>>
>> done_testing();
>>
>> error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135
>> error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply
>> Checking patch src/include/utils/elog.h...
>> Hunk #1 succeeded at 502 (offset 1 line).
>>
>> Does it truncate only single queries or also a transaction with many
>> statements?
>>
>> Thanks!
>>
>> Best, Jim
>
>
>  My apologies, attached patch should work on master branch.
> It truncates single queries and a statement in transaction that occurred an error.
>

Hello! Please deliver the most recent patch version and fix the issues
Jim identified [0] as the current commitfest draws to a close.
Do not forget to include both parts of this patch (This was actually
developed off-list, and we are now using this on our cloud PostgreSQL
distribution on versions 12–17).

[0] https://www.postgresql.org/message-id/35096a36-04d4-480b-a7cd-a2d8151fb737%40uni-muenster.de

--
Best regards,
Kirill Reshke



Re: Truncate logs by max_log_size

From
Jim Jones
Date:

On 25.11.24 17:52, Kirill Reshke wrote:
> Hello! Please deliver the most recent patch version and fix the issues
> Jim identified [0] as the current commitfest draws to a close.
> Do not forget to include both parts of this patch (This was actually
> developed off-list, and we are now using this on our cloud PostgreSQL
> distribution on versions 12–17).
>
> [0] https://www.postgresql.org/message-id/35096a36-04d4-480b-a7cd-a2d8151fb737%40uni-muenster.de

In addition to these points, this feature seems to fail with queries
containing special characters (more than one byte):

With this "max_log_size"..

postgres=# SHOW max_log_size;
 max_log_size
--------------
 20
(1 row)

... and this query ..

postgres=# SELECT "ÄÜÖ" FROM t;
ERROR:  relation "t" does not exist
LINE 1: SELECT "ÄÜÖ" FROM t;

.. this is the [truncated] log entry we get ..

2024-11-28 14:58:57.912 CET [2258876] ERROR:  relation "t" does not
exist at character 19
2024-11-28 14:58:57.912 CET [2258876] STATEMENT:  SELECT "ÄÜÖ" FROM

... although the query originally had exactly 20 characters:

postgres=# SELECT length('SELECT "ÄÜÖ" FROM t;');
 length
--------
     20
(1 row)


postgres=# SELECT length('ÄÜÖ'::bytea), length('AUO'::bytea);
 length | length
--------+--------
      6 |      3
(1 row)

If it is supposed to be like this, it should be clearly stated so in the
docs.

-- 
Jim




Re: Truncate logs by max_log_size

From
Jim Jones
Date:

On 28.11.24 20:20, Kirill Gavrilov wrote:
>   Here is version 3 of this patch. I found another place where this
> setting can be applied. 
>   Also added some documentation and specified that this setting
> truncates queries by size in bytes.

Thanks. It is now possible to change the parameter using SET

postgres=# SET max_log_size TO 15;
SET
postgres=# SHOW max_log_size ;
 max_log_size
--------------
 15
(1 row)


In the postgresql.conf the default value is set to 0

#max_log_size = 0

But if we take a look at the parameter it shows something else

postgres=# SHOW max_log_size ;
 max_log_size
--------------
 5242880
(1 row)

Perhaps it would be best to keep the default value in the
postgresql.conf consistent with the documentation and the actual
max_log_size value. IMHO 0 (or -1) should disable it.

There are also several compilation issues:

postgres.c: In function ‘exec_simple_query’:
postgres.c:1040:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
 1040 |         bool copied = false;
      |         ^~~~
elog.c: In function ‘EmitErrorReport’:
elog.c:1697:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
 1697 |         const char* old_query_string = debug_query_string;
      |         ^~~~~
elog.c:1744:23: warning: passing argument 1 of ‘pfree’ discards ‘const’
qualifier from pointer target type [-Wdiscarded-qualifiers]
 1744 |                 pfree(debug_query_string);
      |                       ^~~~~~~~~~~~~~~~~~
In file included from ../../../../src/include/postgres.h:47,
                 from elog.c:55:
../../../../src/include/utils/palloc.h:86:25: note: expected ‘void *’
but argument is of type ‘const char *’
   86 | extern void pfree(void *pointer);
      |                   ~~~~~~^~~~~~~
elog.c: In function ‘build_query_log’:
elog.c:3798:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
 3798 |         size_t query_len = strlen(query);
      |         ^~~~~~
elog.c:3801:24: warning: return discards ‘const’ qualifier from pointer
target type [-Wdiscarded-qualifiers]
 3801 |                 return query;
      |                        ^~~~~
elog.c:3805:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
 3805 |         size_t query_log_len = pg_mbcliplen(query, query_len,
max_log_size);
      |         ^~~~~~

and there is a trailing whitespace at:

+ *    If query needs to be truncated, copied will be set to true

/home/jim/patches/max_log_query/V3-0001-parameter-max_log_size-to-truncate-logs.patch:141:
trailing whitespace.
 *      If query needs to be truncated, copied will be set to true
Checking patch doc/src/sgml/config.sgml...
Hunk #1 succeeded at 7865 (offset -48 lines).
Checking patch src/backend/tcop/postgres.c...
Hunk #1 succeeded at 71 (offset 1 line).
Hunk #2 succeeded at 1031 (offset 1 line).
Hunk #3 succeeded at 1083 (offset 1 line).
Hunk #4 succeeded at 1382 (offset 1 line).
Hunk #5 succeeded at 1393 (offset 1 line).
Checking patch src/backend/utils/error/elog.c...
Hunk #4 succeeded at 3781 (offset -3 lines).
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset -10 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset -2 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
Checking patch src/include/utils/elog.h...
Applied patch doc/src/sgml/config.sgml cleanly.
Applied patch src/backend/tcop/postgres.c cleanly.
Applied patch src/backend/utils/error/elog.c cleanly.
Applied patch src/backend/utils/misc/guc_tables.c cleanly.
Applied patch src/backend/utils/misc/postgresql.conf.sample cleanly.
Applied patch src/bin/pg_ctl/t/004_logrotate.pl cleanly.
Applied patch src/include/utils/elog.h cleanly.
warning: 1 line adds whitespace errors.

-- 
Jim




Re: Truncate logs by max_log_size

From
Jim Jones
Date:

On 29.11.24 12:05, Kirill Gavrilov wrote:
>   Setting max_log_size to 0 already disables truncation.
>   I changed guc units to bytes, just to make it more human-friendly.
> Did some refactoring and fixed all warnings. Set default value in
> postgresql.conf.

Nice. The patch applies cleanly and the warnings are gone.

A few other things:

IMHO the documentation can be a bit clearer. It says to input the values
in bytes, but the default value is in megabytes.

+        If greater than zero, each query logged is truncated to this
many bytes.
+        Zero disables the setting.
+        Default value is 5MB.

Perhaps something like "If this value is specified without units, it is
taken as bytes." (see wal_keep_size)

And a pedantic note: This hint suggests that you can use TB as unit ..

postgres=# set max_log_size = '1foo';
ERROR:  invalid value for parameter "max_log_size": "1foo"
HINT:  Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".

.. but it can never be used, as it exceeds the integer range:

postgres=# SET max_log_size = '1TB';
ERROR:  invalid value for parameter "max_log_size": "1TB"
HINT:  Value exceeds integer range.

Thanks

-- 
Jim




Re: Truncate logs by max_log_size

From
Kirill Reshke
Date:
On Sat, 30 Nov 2024 at 01:57, Kirill Gavrilov <diphantxm@gmail.com> wrote:
>
> On Fri, Nov 29, 2024 at 10:12 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
>>
>>
>>
>> On 29.11.24 12:05, Kirill Gavrilov wrote:
>> >   Setting max_log_size to 0 already disables truncation.
>> >   I changed guc units to bytes, just to make it more human-friendly.
>> > Did some refactoring and fixed all warnings. Set default value in
>> > postgresql.conf.
>>
>> Nice. The patch applies cleanly and the warnings are gone.
>>
>> A few other things:
>>
>> IMHO the documentation can be a bit clearer. It says to input the values
>> in bytes, but the default value is in megabytes.
>>
>> +        If greater than zero, each query logged is truncated to this
>> many bytes.
>> +        Zero disables the setting.
>> +        Default value is 5MB.
>>
>> Perhaps something like "If this value is specified without units, it is
>> taken as bytes." (see wal_keep_size)
>
>
> Added to documentation. I hope it's clear now.
>
>>
>>
>> And a pedantic note: This hint suggests that you can use TB as unit ..
>>
>> postgres=# set max_log_size = '1foo';
>> ERROR:  invalid value for parameter "max_log_size": "1foo"
>> HINT:  Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
>>
>> .. but it can never be used, as it exceeds the integer range:
>>
>> postgres=# SET max_log_size = '1TB';
>> ERROR:  invalid value for parameter "max_log_size": "1TB"
>> HINT:  Value exceeds integer range.
>>
>> Thanks
>>
>> --
>> Jim
>
>
> Same thing applies to log_parameter_max_length, for example.
>
> postgres=# set log_parameter_max_length = '1foo';
> ERROR:  invalid value for parameter "log_parameter_max_length": "1foo"
> HINT:  Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
> postgres=# set log_parameter_max_length = '1TB';
> ERROR:  invalid value for parameter "log_parameter_max_length": "1TB"
> HINT:  Value exceeds integer range.
>
> I think we can leave it as is.


Hi

> +for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
> +{
> + eval {
> + $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
> + };
> + last unless $@;
> + usleep(100_000);
> +}


`usleep` in tap tests is usually a bad pattern. Do we have a chance to
test this using `wait_for_log` or similar?


--
Best regards,
Kirill Reshke



Re: Truncate logs by max_log_size

From
Kirill Gavrilov
Date:

Hi

> +for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
> +{
> + eval {
> + $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
> + };
> + last unless $@;
> + usleep(100_000);
> +}


`usleep` in tap tests is usually a bad pattern. Do we have a chance to
test this using `wait_for_log` or similar?
 
I'm not sure we can use `wait_for_log` because it checks for only one logfile. But even if it's possible, I don't think it's a good idea to use different checks in the same file or to change tests for another feature. I used test case from above as an example for mine.

Re: Truncate logs by max_log_size

From
Jim Jones
Date:

On 29.11.24 21:57, Kirill Gavrilov wrote:
> Same thing applies to log_parameter_max_length, for example.
>
> postgres=# set log_parameter_max_length = '1foo';
> ERROR:  invalid value for parameter "log_parameter_max_length": "1foo"
> HINT:  Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
> postgres=# set log_parameter_max_length = '1TB';
> ERROR:  invalid value for parameter "log_parameter_max_length": "1TB"
> HINT:  Value exceeds integer range. 
>
> I think we can leave it as is.

I see. So I guess it is out of scope to change this message here.

Small nitpicks:

1) The indentation of the comment at postgresql.conf.sample is a little
bit off

#max_log_size = 0                 # max size of logged statement
                    # 0 disables the feature

IMHO it looks better like this:

#max_log_size = 0   # max size of logged statement
                    # 0 disables the feature


2) You introduced a trailing whitespace at L34 (Not critical :))

+        Zero disables the setting.

It happens to me all the time, so I usually try to apply my patches in a
clean branch just to make sure I didn't miss anything.

Other than that, I have nothing more to add at this point.

Thanks

-- 
Jim