Re: Combine pg_walinspect till_end_of_wal functions with others - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Combine pg_walinspect till_end_of_wal functions with others
Date
Msg-id CAEze2WhB7z8b9oPgLL+Qqyuk7+riksPKSRctujrYiaAER9mE0w@mail.gmail.com
Whole thread Raw
In response to Re: Combine pg_walinspect till_end_of_wal functions with others  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Mon, 6 Mar 2023 at 16:37, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Mar 6, 2023 at 8:52 PM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> >
> > On Mon, 6 Mar 2023 at 16:06, Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > > If we try to make these functions figure out the oldest WAl file and
> > > start from there, then it'll unnecessarily complicate the APIs and
> > > functions. If we still think we need a better function for the users
> > > to figure out the oldest WAL file, perhaps, add a SQL-only
> > > view/function to pg_walinspect that returns "select name from
> > > pg_ls_waldir() order by name limit 1;", but honestly, that's so
> > > trivial.
> >
> > That "order by name limit 1" has subtle bugs when you're working on a
> > system that has experienced timeline switches. It is entirely possible
> > that the first file (as sorted by the default collation) is not the
> > first record you can inspect, or even in your timeline's history.
>
> Hm. Note that pg_walinspect currently searches WAL on insertion
> timeline; it doesn't care about the older timelines. The idea of
> making it look at WAL on an older timeline was discussed, but for the
> sake of simplicity we kept the functions simple. If needed, I can try
> adding the timeline as input parameters to all the functions (with
> default -1 meaning current insertion timeline; if specified, look for
> WAL on that timeline).
>
> Are you saying that a pg_walinspect function that traverses the pg_wal
> directory and figures out the old valid WAL on a given timeline is
> still useful? Or make the functions look for older WAL if start_lsn is
> given as NULL or invalid?

The specific comment I made was only regarding the following issue: An
instance may still have WAL segments from before the latest timeline
switch. These segments may have a higher LSN and lower timeline number
than your current running timeline+LSN (because of e.g. pg_rewind).
This will then result in unwanted behaviour when you sort the segments
numerically/alphabetically and then assume that the first file's LSN
is valid (or available) in your current timeline.

That is why "order by name limit 1" isn't a good solution, and that's
what I was commenting on: you need to parse the timeline hierarchy to
determine which timelines you can use which WAL segments of.

To answer your question on whether I'd like us to traverse timeline
switches: Yes, I'd really like it if we were able to decode the
current timeline's hierarchical WAL of a PG instance in one go, from
the start at (iirc) 0x10000 all the way to the current LSN, assuming
the segments are available.


Kind regards,

Matthias van de Meent



pgsql-hackers by date:

Previous
From: Jelte Fennema
Date:
Subject: Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Next
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum