Re: why pg_walfile_name() cannot be executed during recovery? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: why pg_walfile_name() cannot be executed during recovery?
Date
Msg-id CA+TgmoY0pQaS5KRFqUxMW+SaU8M=r+q4H7b8acjYAUv94bPirw@mail.gmail.com
Whole thread Raw
In response to Re: why pg_walfile_name() cannot be executed during recovery?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: why pg_walfile_name() cannot be executed during recovery?  (Michael Paquier <michael@paquier.xyz>)
Re: why pg_walfile_name() cannot be executed during recovery?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Thu, Apr 7, 2022 at 9:32 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I spent some time today to allow pg_walfile_{name, name_offset} run in
> recovery. Timeline ID is computed while in recovery as follows - WAL
> receiver's last received and flushed WAL record's TLI if it's
> streaming, otherwise the last replayed WAL record's TLI. This way,
> these functions can be used on standby or PITR server or even in crash
> recovery if the server opens up for read-only connections.

I don't think this is a good definition. Suppose I ask for
pg_walfile_name() using an older LSN. With this approach, we're going
to get a filename based on the idea that the TLI that was in effect
back then is the same one as the TLI that is in effect now, which
might not be true. For example, suppose that the current TLI is 2 and
it branched off of timeline 1 at 10/0. If I ask for
pg_walfile_name('F/0'), it's going to give me the name of a WAL file
that has never existed. That seems bad.

It's also worth noting that there's a bit of a definitional problem
here. If in the same situation, I ask for pg_walfile_name('11/0'),
it's going to give me a filename based on TLI 2, but there's also a
WAL file for that LSN with TLI 1. How do we know which one the user
wants? Perhaps one idea would be to say that the relevant TLI is the
one which was in effect at the time that LSN was replayed. If we do
that, what about future LSNs? We could assume that for future LSNs,
the TLI should be the same as the current TLI, but maybe that's also
misleading, because recovery_target_timeline could be set.

I think it's really important to start by being precise about the
question that we think pg_walfile_name() ought to be answering. If we
don't know that, then we really can't say what TLI it should be using.
It's not hard to make the function return SOME answer using SOME TLI,
but then it's not clear that the answer is the right one for any
particular purpose. And in that case the function is more dangerous
than useful, because people will write code that uses it to do stuff,
and then that stuff won't actually work correctly under all
circumstances.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Guillaume Lelarge
Date:
Subject: Re: REINDEX blocks virtually any queries but some prepared queries.
Next
From: Masahiko Sawada
Date:
Subject: Re: Add index scan progress to pg_stat_progress_vacuum