Thread: Send duration output to separate log files

Send duration output to separate log files

From
Greg Sabino Mullane
Date:
Please find attached a patch to allow for durations to optionally be sent to separate log files. In other words, rather than cluttering up our postgres202007.log file with tons of output from log_min_duration_statement, duration lines are sent instead to the file postgres202007.duration.

Over the years, durations have been the number one cause of very large log files, in which the more "important" items get buried in the noise. Also, programs that are scanning for durations typically do not care about the normal, non-duration output. Some people have a policy of logging everything, which in effect means setting log_min_duration_statement to 0, which in turn makes your log files nearly worthless for spotting non-duration items. This feature will also be very useful for those who need to temporarily turn on log_min_duration_statement, for some quick auditing of exactly what is being run on their database. When done, you can move or remove the duration file without messing up your existing log stream.

This only covers the case when both the duration and statement are set on the same line. In other words, log_min_duration_statement output, but not log_duration (which is best avoided anyway). It also requires logging_collector to be on, obviously.

Details:

The edata structure is expanded to have a new message_type, with a matching function errmessagetype() created.
[include/utils/elog.h]
[backend/utils/elog.c]

Any errors that have both a duration and a statement are marked via errmessagetype()
[backend/tcop/postgres.c]

A new GUC named "log_duration_destination" is created, which supports any combination of stderr, csvlog, and jsonlog. It does not need to match log_destination, in order to support different use cases. For example, the user wants durations sent to a CSV file for processing by some other tool, but still wants everything else going to a normal text log file.

Code: [include/utils/guc_hooks.h] [backend/utils/misc/guc_tables.c]
Docs: [sgml/config.sgml]  [backend/utils/misc/postgresql.conf.sample]

Create a new flag called PIPE_PROTO_DEST_DURATION
[include/postmaster/syslogger.h]

Create new flags:
  LOG_DESTINATION_DURATION,
  LOG_DESTINATION_DURATION_CSV,
  LOG_DESTINATION_DURATION_JSON
[include/utils/elog.h]

Routing and mapping LOG_DESTINATION to PIPE_PROTO
[backend/utils/error/elog.c]

Minor rerouting when using alternate forms
[backend/utils/error/csvlog.c]
[backend/utils/error/jsonlog.c]

Create new filehandles, do log rotation, map PIPE_PROTO to LOG_DESTINATION. Rotation and entry into the "current_logfiles" file are the same as existing log files. The new names/suffixes are duration, duration.csv, and duration.json.
[backend/postmaster/syslogger.c]

Testing to ensure combinations of log_destination and log_duration_destination work as intended
[bin/pg_ctl/meson.build]

Questions I've asked along the way, and perhaps other might as well:

What about logging other things? Why just duration?

Duration logging is a very unique event in our logs. There is nothing quite like it - it's always client-driven, yet automatically generated. And it can be extraordinarily verbose. Removing it from the existing logging stream has no particular downsides. Almost any other class of log message would likely meet resistance as far as moving it to a separate log file, with good reason.

Why not build a more generic log filtering case?

I looked into this, but it would be a large undertaking, given the way our logging system works. And as per above, I don't think the pain would be worth it, as duration covers 99% of the use cases for separate logs. Certainly, nothing else other than a recurring ERROR from the client can cause massive bloat in the size of the files. (There is a nearby patch to exclude certain errors from the log file as a way to mitigate the error spam - I don't like that idea, but should mention it here as another effort to keep the log files a manageable size)

Why not use an extension for this?

I did start this as an extension, but it only goes so far. We can use emit_log_hook, but it requires careful coordination of open filehandles, has to do inefficient regex of every log message, and cannot do things like log rotation.

Why not bitmap PIPE_PROTO *and* LOG_DESTINATION?

I tried to do both as simple bitmaps (i.e. duration+csv = duration.csv), and not have to use e.g. LOG_DESTIATION_DURATION_CSV, but size_rotation_for ruined it for me. Since our PIPE always sends one thing at a time, a single new flag enables it to stay as a clean bits8 type.

What about Windows?

Untested. I don't have access to a Windows build, but I think in theory it should work fine.

Cheers,
Greg

Attachment

Re: Send duration output to separate log files

From
Alastair Turner
Date:
On Wed, 10 Jul 2024 at 16:58, Greg Sabino Mullane <htamfids@gmail.com> wrote:
--snip--
Why not build a more generic log filtering case?

I looked into this, but it would be a large undertaking, given the way our logging system works. And as per above, I don't think the pain would be worth it, as duration covers 99% of the use cases for separate logs.
 
 The other category of logging which would benefit from a separate file is audit. It also can create massive volumes of log content. Splitting audit information off into a separate file for use by a separate team or function is also a request I have heard from some financial institutions adopting Postgres. With audit being provided by an extension, this would become quite an intrusive change.

Why not use an extension for this?

I did start this as an extension, but it only goes so far. We can use emit_log_hook, but it requires careful coordination of open filehandles, has to do inefficient regex of every log message, and cannot do things like log rotation.

Would an extension be able to safely modify the message_type field you have added using emit_log_hook? If so, the field becomes more of a log destination label than a type marker. If an extension could hook into the log file creation/rotation process, that would be a nice basis for enabling extensions (I'm particularly thinking of pgAudit) to manage separate logging destinations.

Re: Send duration output to separate log files

From
Greg Sabino Mullane
Date:
On Thu, Jul 11, 2024 at 6:47 AM Alastair Turner <minion@decodable.me> wrote:
 The other category of logging which would benefit from a separate file is audit. It also can create massive volumes of log content. Splitting audit information off into a separate file for use by a separate team or function is also a request I have heard from some financial institutions adopting Postgres. With audit being provided by an extension, this would become quite an intrusive change.

Thanks for the feedback. I agree pgaudit is another thing that can create massive log files, and should be solved at some point.  However, I wanted to keep this patch self-contained to in-core stuff. And pgaudit is already an odd duck, in that it puts CSV into your stderr stream (and into your json!). Ideally it would put a single CSV stream into a separate csv file. Perhaps even something that did not necessarily live in log_directory.

Would an extension be able to safely modify the message_type field you have added using emit_log_hook? If so, the field becomes more of a log destination label than a type marker. If an extension could hook into the log file creation/rotation process, that would be a nice basis for enabling extensions (I'm particularly thinking of pgAudit) to manage separate logging destinations.

Yes, I had more than duration in mind when I created errmessagetype. A hook to set it would be the obvious next step, and then some sort of way of mapping that to arbitrary log files. But I see that as mostly orthagonal to this patch (and certainly a much larger endeavor).

(wades in anyways). I'm not sure about hooking into the log rotation process so much as registering something on startup, then letting Postgres handle all the log files in the queue. Although as I alluded to above, sometimes having large log files NOT live in the data directory (or more specifically, not hang out with the log_directory crowd), could be a plus for space, efficiency, and security reasons. That makes log rotation harder, however. And do we / should we put extension-driven logfiles into current_logfiles? Do we still fall back to stderr even for extension logs? Lots to ponder. :)

Cheers,
Greg





 

Re: Send duration output to separate log files

From
Alastair Turner
Date:
On Fri, 12 Jul 2024 at 15:58, Greg Sabino Mullane <htamfids@gmail.com> wrote:
On Thu, Jul 11, 2024 at 6:47 AM Alastair Turner <minion@decodable.me> wrote:
 The other category of logging which would benefit from a separate file is audit. It also can create massive volumes of log content. Splitting audit information off into a separate file for use by a separate team or function is also a request I have heard from some financial institutions adopting Postgres. With audit being provided by an extension, this would become quite an intrusive change.

Thanks for the feedback. I agree pgaudit is another thing that can create massive log files, and should be solved at some point.  However, I wanted to keep this patch self-contained to in-core stuff. And pgaudit is already an odd duck, in that it puts CSV into your stderr stream (and into your json!). Ideally it would put a single CSV stream into a separate csv file. Perhaps even something that did not necessarily live in log_directory.

Would an extension be able to safely modify the message_type field you have added using emit_log_hook? If so, the field becomes more of a log destination label than a type marker. If an extension could hook into the log file creation/rotation process, that would be a nice basis for enabling extensions (I'm particularly thinking of pgAudit) to manage separate logging destinations.

Yes, I had more than duration in mind when I created errmessagetype. A hook to set it would be the obvious next step, and then some sort of way of mapping that to arbitrary log files. But I see that as mostly orthagonal to this patch (and certainly a much larger endeavor).

Ok. This facility to separate out the logging of the duration messages is useful on its own and I can see the reasoning for using the core functionality for log rotation to manage these separate logs rather than redoing all the file handling work in an extension. A broader framework for mapping messages to arbitrary log files is a far larger set of changes which can be tackled later if desired.

I've had a look at the patch. The test cases look comprehensive. The patch applies cleanly. The newly supplied tests (13 of the 40) and the test_misc/003_check_guc (1 - no parameters missing from postgresql.conf.sample) fail.

To leave some runway for this idea to be extended on without disrupting the user experience could the GUC name be feature qualified as duration_log.log_destination? This would provide a clear naming structure for the most obvious follow-on patch to this one - allowing users to set log_directory separately for these duration logs - as well as any further separate logging efforts. I know that these dot-separated GUC names are generally associated with extensions, but I can't find a hard rule on the issue anywhere, and it feels like a reasonable way to group up the purpose (in this case logging duration messages) for which there are specific values of a number of GUCs (log_destination, log_directory, even log_filename, ...).

Cheers
Alastair

Re: Send duration output to separate log files

From
Pavel Stehule
Date:
Hi

st 10. 7. 2024 v 17:58 odesílatel Greg Sabino Mullane <htamfids@gmail.com> napsal:
Please find attached a patch to allow for durations to optionally be sent to separate log files. In other words, rather than cluttering up our postgres202007.log file with tons of output from log_min_duration_statement, duration lines are sent instead to the file postgres202007.duration.

Over the years, durations have been the number one cause of very large log files, in which the more "important" items get buried in the noise. Also, programs that are scanning for durations typically do not care about the normal, non-duration output. Some people have a policy of logging everything, which in effect means setting log_min_duration_statement to 0, which in turn makes your log files nearly worthless for spotting non-duration items. This feature will also be very useful for those who need to temporarily turn on log_min_duration_statement, for some quick auditing of exactly what is being run on their database. When done, you can move or remove the duration file without messing up your existing log stream.

This only covers the case when both the duration and statement are set on the same line. In other words, log_min_duration_statement output, but not log_duration (which is best avoided anyway). It also requires logging_collector to be on, obviously.

Details:

The edata structure is expanded to have a new message_type, with a matching function errmessagetype() created.
[include/utils/elog.h]
[backend/utils/elog.c]

Any errors that have both a duration and a statement are marked via errmessagetype()
[backend/tcop/postgres.c]

A new GUC named "log_duration_destination" is created, which supports any combination of stderr, csvlog, and jsonlog. It does not need to match log_destination, in order to support different use cases. For example, the user wants durations sent to a CSV file for processing by some other tool, but still wants everything else going to a normal text log file.

Code: [include/utils/guc_hooks.h] [backend/utils/misc/guc_tables.c]
Docs: [sgml/config.sgml]  [backend/utils/misc/postgresql.conf.sample]

Create a new flag called PIPE_PROTO_DEST_DURATION
[include/postmaster/syslogger.h]

Create new flags:
  LOG_DESTINATION_DURATION,
  LOG_DESTINATION_DURATION_CSV,
  LOG_DESTINATION_DURATION_JSON
[include/utils/elog.h]

Routing and mapping LOG_DESTINATION to PIPE_PROTO
[backend/utils/error/elog.c]

Minor rerouting when using alternate forms
[backend/utils/error/csvlog.c]
[backend/utils/error/jsonlog.c]

Create new filehandles, do log rotation, map PIPE_PROTO to LOG_DESTINATION. Rotation and entry into the "current_logfiles" file are the same as existing log files. The new names/suffixes are duration, duration.csv, and duration.json.
[backend/postmaster/syslogger.c]

Testing to ensure combinations of log_destination and log_duration_destination work as intended
[bin/pg_ctl/meson.build]

Questions I've asked along the way, and perhaps other might as well:

What about logging other things? Why just duration?

Duration logging is a very unique event in our logs. There is nothing quite like it - it's always client-driven, yet automatically generated. And it can be extraordinarily verbose. Removing it from the existing logging stream has no particular downsides. Almost any other class of log message would likely meet resistance as far as moving it to a separate log file, with good reason.

Why not build a more generic log filtering case?

I looked into this, but it would be a large undertaking, given the way our logging system works. And as per above, I don't think the pain would be worth it, as duration covers 99% of the use cases for separate logs. Certainly, nothing else other than a recurring ERROR from the client can cause massive bloat in the size of the files. (There is a nearby patch to exclude certain errors from the log file as a way to mitigate the error spam - I don't like that idea, but should mention it here as another effort to keep the log files a manageable size)

Why not use an extension for this?

I did start this as an extension, but it only goes so far. We can use emit_log_hook, but it requires careful coordination of open filehandles, has to do inefficient regex of every log message, and cannot do things like log rotation.

Why not bitmap PIPE_PROTO *and* LOG_DESTINATION?

I tried to do both as simple bitmaps (i.e. duration+csv = duration.csv), and not have to use e.g. LOG_DESTIATION_DURATION_CSV, but size_rotation_for ruined it for me. Since our PIPE always sends one thing at a time, a single new flag enables it to stay as a clean bits8 type.

What about Windows?

Untested. I don't have access to a Windows build, but I think in theory it should work fine.

I like the proposed feature, but I miss two points. 

1. possibility to support auto_explain

2. possibility to support rsyslog by setting different or some syslog related redirection by setting different facility.

Regards

Pavel


Cheers,
Greg