Re: Send duration output to separate log files - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Send duration output to separate log files
Date
Msg-id CAFj8pRBYCsbP9qmFZzm3B5viX7=FO_OdCjYU+gf2YP+r50rsbQ@mail.gmail.com
Whole thread Raw
In response to Send duration output to separate log files  (Greg Sabino Mullane <htamfids@gmail.com>)
List pgsql-hackers
Hi

st 10. 7. 2024 v 17:58 odesílatel Greg Sabino Mullane <htamfids@gmail.com> napsal:
Please find attached a patch to allow for durations to optionally be sent to separate log files. In other words, rather than cluttering up our postgres202007.log file with tons of output from log_min_duration_statement, duration lines are sent instead to the file postgres202007.duration.

Over the years, durations have been the number one cause of very large log files, in which the more "important" items get buried in the noise. Also, programs that are scanning for durations typically do not care about the normal, non-duration output. Some people have a policy of logging everything, which in effect means setting log_min_duration_statement to 0, which in turn makes your log files nearly worthless for spotting non-duration items. This feature will also be very useful for those who need to temporarily turn on log_min_duration_statement, for some quick auditing of exactly what is being run on their database. When done, you can move or remove the duration file without messing up your existing log stream.

This only covers the case when both the duration and statement are set on the same line. In other words, log_min_duration_statement output, but not log_duration (which is best avoided anyway). It also requires logging_collector to be on, obviously.

Details:

The edata structure is expanded to have a new message_type, with a matching function errmessagetype() created.
[include/utils/elog.h]
[backend/utils/elog.c]

Any errors that have both a duration and a statement are marked via errmessagetype()
[backend/tcop/postgres.c]

A new GUC named "log_duration_destination" is created, which supports any combination of stderr, csvlog, and jsonlog. It does not need to match log_destination, in order to support different use cases. For example, the user wants durations sent to a CSV file for processing by some other tool, but still wants everything else going to a normal text log file.

Code: [include/utils/guc_hooks.h] [backend/utils/misc/guc_tables.c]
Docs: [sgml/config.sgml]  [backend/utils/misc/postgresql.conf.sample]

Create a new flag called PIPE_PROTO_DEST_DURATION
[include/postmaster/syslogger.h]

Create new flags:
  LOG_DESTINATION_DURATION,
  LOG_DESTINATION_DURATION_CSV,
  LOG_DESTINATION_DURATION_JSON
[include/utils/elog.h]

Routing and mapping LOG_DESTINATION to PIPE_PROTO
[backend/utils/error/elog.c]

Minor rerouting when using alternate forms
[backend/utils/error/csvlog.c]
[backend/utils/error/jsonlog.c]

Create new filehandles, do log rotation, map PIPE_PROTO to LOG_DESTINATION. Rotation and entry into the "current_logfiles" file are the same as existing log files. The new names/suffixes are duration, duration.csv, and duration.json.
[backend/postmaster/syslogger.c]

Testing to ensure combinations of log_destination and log_duration_destination work as intended
[bin/pg_ctl/meson.build]

Questions I've asked along the way, and perhaps other might as well:

What about logging other things? Why just duration?

Duration logging is a very unique event in our logs. There is nothing quite like it - it's always client-driven, yet automatically generated. And it can be extraordinarily verbose. Removing it from the existing logging stream has no particular downsides. Almost any other class of log message would likely meet resistance as far as moving it to a separate log file, with good reason.

Why not build a more generic log filtering case?

I looked into this, but it would be a large undertaking, given the way our logging system works. And as per above, I don't think the pain would be worth it, as duration covers 99% of the use cases for separate logs. Certainly, nothing else other than a recurring ERROR from the client can cause massive bloat in the size of the files. (There is a nearby patch to exclude certain errors from the log file as a way to mitigate the error spam - I don't like that idea, but should mention it here as another effort to keep the log files a manageable size)

Why not use an extension for this?

I did start this as an extension, but it only goes so far. We can use emit_log_hook, but it requires careful coordination of open filehandles, has to do inefficient regex of every log message, and cannot do things like log rotation.

Why not bitmap PIPE_PROTO *and* LOG_DESTINATION?

I tried to do both as simple bitmaps (i.e. duration+csv = duration.csv), and not have to use e.g. LOG_DESTIATION_DURATION_CSV, but size_rotation_for ruined it for me. Since our PIPE always sends one thing at a time, a single new flag enables it to stay as a clean bits8 type.

What about Windows?

Untested. I don't have access to a Windows build, but I think in theory it should work fine.

I like the proposed feature, but I miss two points. 

1. possibility to support auto_explain

2. possibility to support rsyslog by setting different or some syslog related redirection by setting different facility.

Regards

Pavel


Cheers,
Greg

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: improve ssl error code, 2147483650
Next
From: Tom Lane
Date:
Subject: Re: warning: dereferencing type-punned pointer