Re: Fetching timeline during recovery - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: Fetching timeline during recovery
Date
Msg-id 20190726100258.2dca596c@firost
Whole thread Raw
In response to Re: Fetching timeline during recovery  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Fetching timeline during recovery
List pgsql-hackers
On Fri, 26 Jul 2019 16:49:53 +0900 (Tokyo Standard Time)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

> Hi.
> 
> At Thu, 25 Jul 2019 19:38:08 +0200, Jehan-Guillaume de Rorthais
> <jgdr@dalibo.com> wrote in <20190725193808.1648ddc8@firost>
> > On Wed, 24 Jul 2019 14:33:27 +0200
> > Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> >   
> > > On Wed, 24 Jul 2019 09:49:05 +0900
> > > Michael Paquier <michael@paquier.xyz> wrote:
> > >   
> > > > On Tue, Jul 23, 2019 at 06:05:18PM +0200, Jehan-Guillaume de Rorthais
> > > > wrote:    
> > [...]  
> > > > I think that there are arguments for being more flexible with it, and
> > > > perhaps have a system-level view to be able to look at some of its
> > > > fields.    
> > > 
> > > Great idea. I'll give it a try to keep the discussion on.  
> > 
> > After some thinking, I did not find enough data to expose to justify the
> > creation a system-level view. As I just need the current timeline I
> > wrote "pg_current_timeline()". Please, find the patch in attachment.
> > 
> > The current behavior is quite simple: 
> > * if the cluster is in production, return ThisTimeLineID
> > * else return walrcv->receivedTLI (using GetWalRcvWriteRecPtr)
> > 
> > This is really naive implementation. We should probably add some code around
> > the startup process to gather and share general recovery stats. This would
> > allow to fetch eg. the current recovery method, latest xlog file name
> > restored from archives or streaming, its timeline, etc.
> > 
> > Any thoughts?  
> 
> If replay is delayed behind timeline switch point, replay-LSN and
> receive/write/flush LSNs are on different timelines.  When
> replica have not reached the new timeline to which alredy
> received file belongs, the fucntion returns wrong file name,
> specifically a name consisting of the latest segment number and
> the older timeline where the segment doesn't belong to.

Indeed.

> We have an LSN reporting function each for several objectives.
> 
>  pg_current_wal_lsn
>  pg_current_wal_insert_lsn
>  pg_current_wal_flush_lsn
>  pg_last_wal_receive_lsn
>  pg_last_wal_replay_lsn

Yes. In fact, my current implementation might be split as:

  pg_current_wal_tl: returns TL on a production cluster
  pg_last_wal_received_tl: returns last received TL on a standby

If useful, I could add pg_last_wal_replayed_tl. I don't think *insert_tl and
*flush_tl would be useful as a cluster in production is not supposed to
change its timeline during its lifetime.

> But, I'm not sure just adding further pg_last_*_timeline() to
> this list is a good thing..

I think this is a much better idea than mixing different case (production and
standby) in the same function as I did. Moreover, it's much more coherent with
other existing functions.

> The function returns NULL for NULL input (STRICT behavior) but
> returns (NULL, NULL) for undefined timeline. I don't think the
> differene is meaningful.

Unless I'm missing something, nothing
returns "(NULL, NULL)" in 0001-v1-Add-function-pg_current_timeline.patch.

Thank you for your feedback!



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: block-level incremental backup
Next
From: Michael Paquier
Date:
Subject: Re: Add parallelism and glibc dependent only options to reindexdb