Re: Fetching timeline during recovery - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: Fetching timeline during recovery
Date
Msg-id 20191219234136.160113ac@firost
Whole thread Raw
In response to Re: Fetching timeline during recovery  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Wed, 11 Dec 2019 14:20:02 +0900
Michael Paquier <michael@paquier.xyz> wrote:

> On Thu, Sep 26, 2019 at 07:20:46PM +0200, Jehan-Guillaume de Rorthais wrote:
> > If this solution is accepted, some other function of the same family might
> > be good candidates as well, for the sake of homogeneity:
> > 
> > * pg_current_wal_insert_lsn
> > * pg_current_wal_flush_lsn
> > * pg_last_wal_replay_lsn
> > 
> > However, I'm not sure how useful this would be.
> > 
> > Thanks again for your time, suggestions and review!  
> 
> +{ oid => '3435', descr => 'current wal flush location',
> +  proname => 'pg_last_wal_receive_lsn', provolatile => 'v',
> proisstrict => 'f',
> This description is incorrect.

Indeed. And the one for pg_current_wal_lsn(bool) as well.

> And please use OIDs in the range of 8000~9999 for patches in
> development.  You could just use src/include/catalog/unused_oids which
> would point out a random range.

Thank you for this information, I wasn't aware.

> +   if (recptr == 0) {
> +       nulls[0] = 1;
> +       nulls[1] = 1;
> +   }
> The indendation of the code is incorrect, these should use actual
> booleans and recptr should be InvalidXLogRecPtr (note also the
> existence of the macro XLogRecPtrIsInvalid).  Just for the style.

Fixed on my side. Thanks.

> As said in the last emails exchanged on this thread, I don't see how
> you cannot use multiple functions which have different meaning
> depending on if the cluster is a primary or a standby knowing that we
> have two different concepts of WAL when at recovery: the received
> LSN and the replayed LSN, and three concepts for primaries (insert,
> current, flush).  

As I wrote in my previous email, existing functions could be overloaded
as well for the sake of homogeneity. So the five of them would have similar
behavior/API.

> I agree as well with the point of Fujii-san about
> not returning the TLI and the LSN across different functions as this
> opens the door for a risk of inconsistency for the data received by
> the client.

My last patch fixed that, indeed.

> + * When the first parameter (variable 'with_tli') is true, returns the
> current
> + * timeline as second field. If false, second field is null.
> I don't see much the point of having this input parameter which
> determines the NULL-ness of one of the result columns, and I think
> that you had better use a completely different function name for each
> one of them instead of enforcing the functions.  Let's remember that a
> lot of tools use the existing functions directly in the SELECT clause
> for LSN calculations, which is just a 64-bit integer *without* a
> timeline assigned to it.  However your patch mixes both concepts by
> using pg_current_wal_lsn.

Sorry, I realize I was not clear enough about implementation details.
My latest patch does **not** introduce regression for existing tools. If you do
not pass any parameter, the behavior is the same, only one column:

  # primary
  $ cat <<EOQ|psql -XAtp 5432
  select * from pg_current_wal_lsn();
  select * from pg_current_wal_lsn(NULL);
  select * from pg_current_wal_lsn(true);
  EOQ
  0/15D5BA0
  0/15D5BA0|
  0/15D5BA0|1

  # secondary
  $ cat <<EOQ|psql -XAtp 5433
  select * from pg_last_wal_receive_lsn();
  select * from pg_last_wal_receive_lsn(NULL);
  select * from pg_last_wal_receive_lsn(true);
  EOQ
  0/15D5BA0
  0/15D5BA0|
  0/15D5BA0|1

It's kind of the same approach than when parameters has been added to
eg. pg_stat_backup() to change its behavior between exclusive and
non-exclusive backups. But I admit I know no function changing its return type
based on the given parameter. I understand your concern.

> So we could do more with the introduction of five new functions which 
> allow to grab the LSN and the TLI in use for replay, received, insert,
> write and flush positions:
> - pg_current_wal_flush_info
> - pg_current_wal_insert_info
> - pg_current_wal_info
> - pg_last_wal_receive_info
> - pg_last_wal_replay_info

I could go this way if you prefer, maybe using _tli as suffix instead of _info
as this is the only new info added. I think it feels redundant with original
funcs, but it might be the simplest solution.

> I would be actually tempted to do the following: one single SRF
> function, say pg_wal_info which takes a text argument in input with
> the following values: flush, write, insert, receive, replay.  Thinking
> more about it that would be rather neat, and more extensible than the
> rest discussed until now.  See for example PostgresNode::lsn.

I'll answer in your other mail that summary other possibilities.

Thanks!



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: global / super barriers (for checksums)
Next
From: John Naylor
Date:
Subject: Re: Add support for automatically updating Unicode derived files