Re: pg_walinspect - a new extension to get raw WAL data and WAL stats - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: pg_walinspect - a new extension to get raw WAL data and WAL stats |
Date | |
Msg-id | CALj2ACVX5wCtr7Q4kq0VgrnAS28jJGufZOQt074J8=EWyyDCUw@mail.gmail.com Whole thread Raw |
In response to | Re: pg_walinspect - a new extension to get raw WAL data and WAL stats (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
List | pgsql-hackers |
On Fri, Mar 11, 2022 at 8:08 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > > Attaching the v8 patch-set resolving above comments and some tests for > > checking function permissions. Please review it further. > > I played with this a bit, and would like to share some thoughts on it. Thanks a lot Kyotaro-san for reviewing. > It seems to me too rigorous that pg_get_wal_records_info/stats() > reject future LSNs as end-LSN and I think WARNING or INFO and stop at > the real end-of-WAL is more kind to users. I think the same with the > restriction that start and end LSN are required to be different. Throwing error on future LSNs is the same behaviour for all of the pg_walinspect function input LSNs. IMO it is a cleaner thing to do rather than confuse the users with different behaviours for each function. The principle is this - pg_walinspect functions can't show future WAL info. Having said that, I agree to make it a WARNING instead of ERROR, for the simple reason that ERROR aborts the txn and the applications can retry without aborting the txn. For instance, pg_terminate_backend emits a WARNING if the PID isn't a postgres process id. PS: WARNING may not be a better idea than ERROR if we turn pg_get_wal_stats a SQL function, see my response below. > The definition of end-lsn is fuzzy here. If I fed a future LSN to the > functions, they tell me the beginning of the current insertion point > in error message. On the other hand they don't accept the same > value as end-LSN. I think it is right that they tell the current > insertion point and they should take the end-LSN as the LSN to stop > reading. The future LSN is determined by this: if (!RecoveryInProgress()) available_lsn = GetFlushRecPtr(NULL); else available_lsn = GetXLogReplayRecPtr(NULL); GetFlushRecPtr returns last byte + 1 flushed meaning this is the end LSN currently known in the server, but it is not the start LSN of the last WAL record in the server. Same goes with GetXLogReplayRecPtr which gives lastReplayedEndRecPtr end+1 position. I picked GetFlushRecPtr and GetXLogReplayRecPtr to determine the future WAL LSN because this is how read_local_xlog_page determines to read WAL upto and goes for wait mode, but I wanted to avoid the wait mode completely for all the pg_walinspect functions (to keep things simple for now), hence doing the similar checks within the input validation code and emitting warning. And you are right when we emit something like below, users tend to use 0/15B6D68 (from the DETAIL message) as the end LSN. I don't want to ignore this DETAIL message altogether as it gives an idea where the server is. How about rephrasing the DETAIL message a bit, something like "Database system flushed the WAL up to WAL LSN %X/% X.'' or some other better phrasing? WARNING: WAL start LSN cannot be a future WAL LSN DETAIL: Last known WAL LSN on the database system is 0/15B6D68. If the users aren't sure about what's the end record LSN, they can just use pg_get_wal_records_info and pg_get_wal_stats without end LSN: select * from pg_get_wal_records_info('0/15B6D68'); select * from pg_get_wal_stats('0/15B6D68'); > I think pg_get_wal_stats() is worth to have but I think it should be > implemented in SQL. Currently pg_get_wal_records_info() doesn't tell > about FPI since pg_waldump doesn't but it is internally collected (of > course!) and easily revealed. If we do that, the > pg_get_wal_records_stats() would be reduced to the following SQL > statement > > SELECT resource_manager resmgr, > count(*) AS N, > (count(*) * 100 / sum(count(*)) OVER tot)::numeric(5,2) AS "%N", > sum(total_length) AS "combined size", > (sum(total_length) * 100 / sum(sum(total_length)) OVER tot)::numeric(5,2) AS "%combined size", > sum(fpi_len) AS fpilen, > (sum(fpi_len) * 100 / sum(sum(fpi_len)) OVER tot)::numeric(5,2) AS "%fpilen" > FROM pg_get_wal_records_info('0/1000000', '0/175DD7f') > GROUP by resource_manager > WINDOW tot AS () > ORDER BY "combined size" desc; > > The only difference with pg_waldump is the statement above doesn't > show lines for the resource managers that don't contained in the > result of pg_get_wal_records_info(). But I don't think that matters. Yeah, this is better. One problem with the above is when pg_get_wal_records_info emits a warning for future LSN. But this shouldn't stop us doing it via SQL. Instead I would let all the pg_walinspect functions emit errors as opposed to WARNING. Thoughts? postgres=# SELECT resource_manager, count(*) AS count, (count(*) * 100 / sum(count(*)) OVER tot)::numeric(5,2) AS count_percentage, sum(total_length) AS combined_size, (sum(total_length) * 100 / sum(sum(total_length)) OVER tot)::numeric(5,2) AS combined_size_percentage FROM pg_get_wal_records_info('0/10A3E50', '0/25B6F00') GROUP BY resource_manager WINDOW tot AS () ORDER BY combined_size desc; WARNING: WAL end LSN cannot be a future WAL LSN DETAIL: Last known WAL LSN on the database system is 0/15CAA70. resource_manager | count | count_percentage | combined_size | combined_size_percentage ------------------+-------+------------------+---------------+-------------------------- | 1 | 100.00 | | (1 row) > Sometimes the field description has very long (28kb long) content. It > makes the result output almost unreadable and I had a bit hard time > struggling with the output full of '-'s. I would like have a default > limit on the length of such fields that can be long but I'm not sure > we want that. Yeah, it's a text column, let's leave it as-is, if required users can always ignore the description columns. > And about pg_get_raw_wal_record(). I don't see any use-case of the > function alone on SQL interface. Even if we need to inspect broken > WAL files, it needs profound knowledge of WAL format and tools that > doesn't work on SQL interface. >However like pageinspect, if we separate the WAL-record fetching and > parsing it could be thought as useful. > SELECT * FROM pg_walinspect_parse(raw) > FROM (SELECT * FROM pg_walinspect_get_raw(start_lsn, end_lsn)); > > And pg_get_wal_stats woule be like: > > SELECT * FROM pg_walinpect_stat(pg_walinspect_parse(raw)) > FROM (SELECT * FROM pg_walinspect_get_raw(start_lsn, end_lsn))); Imagine pg_get_raw_wal_record function feeding raw WAL record to an external tool/extension that understands the WAL. Apart from this, I don't have a concrete reason either. I'm open to removing this function as well and adding it along with the raw WAL parsing function in future. I haven't thought about the raw WAL parsing functions for now. In fact, there are many functions we can add to pg_walinspect - functions with wait mode for future WAL, WAL parsing, function to return all the WAL record info/stats given a WAL file name, functions to return WAL info/stats from historic timelines as well, function to see if the given WAL file is valid and so on. We can park these functions for future versions of pg_walinspect once the extension itself with basic yet-useful-and-effective functions gets in. I will make a note of these functions and will work in future based on how pg_walinspect gets received by the users and community out there. Regards, Bharath Rupireddy.
pgsql-hackers by date: