Re: Fetching timeline during recovery - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Fetching timeline during recovery
Msg-id 20190724004905.GG2059@paquier.xyz
In response to Fetching timeline during recovery  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses Re: Fetching timeline during recovery
List pgsql-hackers
On Tue, Jul 23, 2019 at 06:05:18PM +0200, Jehan-Guillaume de Rorthais wrote:
> Please, find in attachment a first trivial patch to support pg_walfile_name()
> and pg_walfile_name_offset() on a standby.
> Previous restriction on this functions seems related to ThisTimeLineID not
> being safe on standby. This patch is fetching the timeline from
> WalRcv->receivedTLI using GetWalRcvWriteRecPtr(). As far as I understand,
> this is updated each time some data are flushed to the WAL.

FWIW, I don't have any objections to lifting the restrictions on
those functions a bit if we can make that reliable enough.  Now, during
recovery you cannot rely on ThisTimeLineID, as you say, mostly because
of the following bit in xlog.c (the comment block a little further up
also has explanations):
    /*
     * ThisTimeLineID is normally not set when we're still in recovery.
     * However, recycling/preallocating segments above needed ThisTimeLineID
     * to determine which timeline to install the segments on. Reset it now,
     * to restore the normal state of affairs for debugging purposes.
     */
    if (RecoveryInProgress())
        ThisTimeLineID = 0;
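
To make concrete why a zeroed ThisTimeLineID is a problem for
pg_walfile_name(): the timeline forms the first eight hex digits of a
WAL segment file name.  Below is a minimal standalone sketch, not the
backend code.  It assumes the default 16MB segment size, the typedefs
are stand-ins for the PostgreSQL ones, sketch_walfile_name() is a
hypothetical helper, and segment-boundary handling is ignored.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for PostgreSQL's TimeLineID and XLogRecPtr typedefs. */
typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Assumption: the default 16MB wal_segment_size. */
#define SKETCH_WAL_SEGMENT_SIZE   (UINT64_C(16) * 1024 * 1024)
#define SKETCH_SEGS_PER_XLOGID    (UINT64_C(0x100000000) / SKETCH_WAL_SEGMENT_SIZE)

/*
 * Hypothetical helper: write the 24-character name of the segment that
 * contains 'lsn' on timeline 'tli' into buf (needs at least 25 bytes).
 * Simplification: plain division, no special case for an LSN that sits
 * exactly on a segment boundary.
 */
static void
sketch_walfile_name(char *buf, size_t len, TimeLineID tli, XLogRecPtr lsn)
{
    uint64_t    segno = lsn / SKETCH_WAL_SEGMENT_SIZE;

    snprintf(buf, len, "%08X%08X%08X",
             (unsigned int) tli,
             (unsigned int) (segno / SKETCH_SEGS_PER_XLOGID),
             (unsigned int) (segno % SKETCH_SEGS_PER_XLOGID));
}
```

With a timeline of 0 the first eight digits come out as 00000000, which
does not name a segment of any real timeline.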

Your patch does not account for the case of archive recovery, where
there is no WAL receiver: as the shared memory state of the WAL
receiver is not set, a timeline of 0 would be used.  The replay
timeline is something we could use here instead, via
GetXLogReplayRecPtr().  CreateRestartPoint actually takes the latest
WAL receiver or replayed point for its end LSN position, whichever is
newer.
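
A rough sketch of that "whichever is newer" selection, written as
standalone C so it can be read without the backend headers.  The struct
and function names are hypothetical; in the backend the two inputs would
come from GetWalRcvWriteRecPtr() and GetXLogReplayRecPtr(), which this
sketch does not call.  It assumes an unset WAL receiver reports 0 for
both the LSN and the timeline.

```c
#include <stdint.h>

/* Stand-ins for PostgreSQL's TimeLineID and XLogRecPtr typedefs. */
typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Hypothetical snapshot of what one source reports. */
typedef struct SketchWalPosition
{
    XLogRecPtr  lsn;            /* 0 if the source has nothing to report */
    TimeLineID  tli;            /* 0 if unset, e.g. no WAL receiver */
} SketchWalPosition;

/*
 * Return the newer of the received and replayed positions.  When the
 * WAL receiver state is unset (archive recovery), this falls back to
 * the replay position, so the returned timeline is never 0 as long as
 * replay has a valid one.
 */
static SketchWalPosition
sketch_latest_position(SketchWalPosition received,
                       SketchWalPosition replayed)
{
    if (received.tli != 0 && received.lsn > replayed.lsn)
        return received;
    return replayed;
}
```

During archive recovery the received position is all zeroes, so the
replayed position and its timeline are returned.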

> Last, I plan to produce an extension to support this on older release. Is
> it something that could be integrated in official source tree during a minor
> release or should I publish it on eg. pgxn?

Unfortunately no.  This is a behavior change so it cannot find its way
into back branches.  The WAL receiver state is in shared memory and
published, so that's easy enough to get.  We don't do that for XLogCtl
unfortunately.  I think that there are arguments for being more
flexible with it, and perhaps for having a system-level view to
look at some of its fields.

There is also a downside with get_controlfile(), which is that it
fetches the data directly from the on-disk pg_control, and
post-recovery this only gets updated at the first checkpoint.
--
Michael
