Re: RFC: pg_stat_logmsg - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: RFC: pg_stat_logmsg |
Date | |
Msg-id | Zpb9gBjB2FfiD8jO@paquier.xyz Whole thread Raw |
In response to | Re: RFC: pg_stat_logmsg (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: RFC: pg_stat_logmsg
|
List | pgsql-hackers |
On Wed, Jul 17, 2024 at 12:14:36AM +0200, Tomas Vondra wrote: > I noticed this patch hasn't moved since September 2023, so I wonder > what's the main blocker / what is needed to move this? + /* Location of permanent stats file (valid when database is shut down) */ + #define PGLM_DUMP_FILE PGSTAT_STAT_PERMANENT_DIRECTORY "/pg_stat_logmsg.stat Perhaps this does not count as a valid reason, but does it really make sense to implement things this way, knowing that this could be changed to rely on a potential pluggable pgstats? I mean this one I've proposed: https://www.postgresql.org/message-id/Zmqm9j5EO0I4W8dx%40paquier.xyz One potential implementation is to stick that to be fixed-numbered, because there is a maximum cap to the number of entries proposed by the patch, while keeping the whole in memory. + logmsg_store(ErrorData *edata, Size *logmsg_offset, + int *logmsg_len, int *gc_count) The patch shares a lot of perks with pg_stat_statements that don't scale well. I'm wondering if it is a good idea to duplicate these properties in a second, different, module, like the external file can could be written out on a periodic basis depending on the workload. I am not saying that the other thread is a magic solution for everything (not looked yet at how this would plug with the cap in entries that pg_stat_statements wants), just one option on the table. > As for the code, I wonder if the instability of line numbers could be a > problem - these can change (a little bit) between minor releases, so > after an upgrade we'll read the dump file with line numbers from the old > release, and then start adding entries with new line numbers. Do we need > to handle this in some way? Indeed. Perhaps a PostgreSQL version number assigned to each entry to know from which binary an entry comes from? This would cost a couple of extra bytes for each entry still that would be the best information possible to match that with a correct code tree. If it comes to that, even getting down to a commit SHA1 could be useful to provide the lowest level of granularity. Another thing would be to give up on the line number, stick to the uniqueness in the stats entries with the errcode and the file name, but that won't help for things like tablecmds.c. > This might be partially solved by eviction of entries from the old > release - we apply decay, so after a while their usage will be 0. But > what if there's no pressure for space, we'll not actually evict them. > And it'll be confusing to have a mix of old/new line numbers. Once we know that these stats not going to be relevant anymore as of a minor upgrade flow, resetting them could be the move that makes the most sense, leaving the reset to the provider doing the upgrades, while taking a snapshot of the past data before the reset? I find the whole problem tricky to define, TBH. -- Michael
Attachment
pgsql-hackers by date: