Re: pg_walinspect - a new extension to get raw WAL data and WAL stats - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: pg_walinspect - a new extension to get raw WAL data and WAL stats |
Date | |
Msg-id | CALj2ACVcNmNq2_Div=9OgUcq40iG2Epik9JYSajVNr3J=Se9Hw@mail.gmail.com Whole thread Raw |
In response to | Re: pg_walinspect - a new extension to get raw WAL data and WAL stats (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Tue, Mar 22, 2022 at 11:30 PM Andres Freund <andres@anarazel.de> wrote: > > > > Why "CALLED ON NULL INPUT"? It doesn't make sense to call the function with a > > > NULL lsn, does it? Also, that's the default, why is it restated here? > > > > pg_get_wal_records_info needed that option (if end_lsn being the > > default, providing WAL info upto the end of WAL). Also, we can emit > > better error message ("invalid WAL start LSN") instead of generic one. > > I wanted to keep error message and code same across all the functions > > hence CALLED ON NULL INPUT option for pg_get_raw_wal_record. > > I think it should be strict if it behaves strict. I fail to see what > consistency in error messages is worth here. And I'd probably just create two > different functions for begin and begin & end LSN and mark those as strict as > well. I'm okay with changing them to be STRICT. Right now, the behaviour of pg_get_wal_records_info is this: CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn, IN end_lsn pg_lsn DEFAULT NULL, select pg_get_wal_records_info(start_lsn, end_lsn); if start_lsn is future, then errors out if end_lsn is future, then errors out otherwise, returns WAL records info between start_lsn and end_lsn select pg_get_wal_records_info(start_lsn); if start_lsn is future, then errors out sets end_lsn = current server lsn and returns WAL records info between start_lsn and end_lsn Same is true for pg_get_wal_stats. Getting WAL records info provided start_lsn until end-of-WAL is a basic ask and a good function to have. Now, if I were to make pg_get_wal_records_info STRICT, then I would need to have another function like pg_get_wal_records_info_till_end_of_wal/pg_get_wal_stats_till_end_of_wal much like ones in few of my initial patches upthread. Is it okay to have these functions pg_get_wal_records_info(start_lsn, end_lsn)/pg_get_wal_stats(start_lsn, end_lsn) and pg_get_wal_records_info_till_end_of_wal(start_lsn)/pg_get_wal_stats_till_end_of_wal(start_lsn)? This way, it will be more clear to the user actually than to stuff more than one behaviour in a single function with default values. Please let me know your thoughts. > > Yeah. It is to handle some edge cases to print the WAL upto end_lsn > > and avoid waiting in read_local_xlog_page. I will change it. > > > > Actually, there's an open point as specified in [3]. Any thoughts on it? > > Seems more user-friendly to wait - it's otherwise hard for a user to know what > LSN to put in. I agree with Kyotaro-san that the wait behavior isn't a good choice, because CTRL+C would not emit the accumulated info/stats unlike pg_waldump. Also, with wait behaviour it's easy for a user to trick the server with an unreasonably futuristic WAL LSN, say F/FFFFFFFF. Also, if we use pg_walinspect functions, say, within a WAL monitoring app, the wait behaviour isn't good there as it might look like the functions hanging the app. We might think about adding a timeout for waiting, but that doesn't seem an elegant way. Users/Apps can easily figure out the LSNs to get WAL info/stats - either they can use pg_current_wal_XXXX or by looking at the control file or server logs or pg_stat_replication, what not. LSNs are everywhere within the postgres eco-system. Instead, the functions simply can figure out what's current server LSN at-the-moment and choose to error out if any of the provided input LSN is beyond that as it's being done currently. This looks simpler and user-friendly. On Wed, Mar 23, 2022 at 8:27 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Wed, 23 Mar 2022 11:51:25 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > > The two places emit different outputs but the only difference is the > > delimiter between two blockrefs. (By the way, the current code forgets > > to insert a delimiter there). So even if the function took "bool > > is_waldump", it is used only when appending a line delimiter. It > > would be nicer if the "bool is_waldump" were "char *delimiter". > > Others might think differently, though.. > > > > So, the function looks like this. > > > > StringInfo XLogBlockRefInfos(XLogReaderState *record, char *delimiter, > > uint32 &fpi_len); > > By the way, xlog_block_info@xlogrecovery.c has the subset of the > function. So the function can be shared with the callers of > xlog_block_info but I'm not sure it is not too-much... > > StringInfo XLogBlockRefInfos(XLogReaderState *record, char *delimiter, > bool fpw_detail, uint32 &fpi_len); > Yes, putting them in a common function is a good idea. I'm thinking something like below. StringInfo XLogBlockRefInfos(XLogReaderState *record, char *delimiter, uint32 *fpi_len, bool detailed_format) I will try to put the common functions in xlogreader.h/.c, so that both pg_waldump and pg_walinspect can make use of it. Thoughts? Regards, Bharath Rupireddy.
pgsql-hackers by date: