Re: Fetching timeline during recovery - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Fetching timeline during recovery
Date
Msg-id CAHGQGwEuEPjkkN607qZRqm=LWZ+iwqvdXqMz53KX1ST=QHvJFg@mail.gmail.com
Whole thread Raw
In response to Re: Fetching timeline during recovery  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses Re: Fetching timeline during recovery
List pgsql-hackers
On Mon, Jul 29, 2019 at 7:26 PM Jehan-Guillaume de Rorthais
<jgdr@dalibo.com> wrote:
>
> On Fri, 26 Jul 2019 18:22:25 +0200
> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
>
> > On Fri, 26 Jul 2019 10:02:58 +0200
> > Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> >
> > > On Fri, 26 Jul 2019 16:49:53 +0900 (Tokyo Standard Time)
> > > Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > [...]
> > > > We have an LSN reporting function each for several objectives.
> > > >
> > > >  pg_current_wal_lsn
> > > >  pg_current_wal_insert_lsn
> > > >  pg_current_wal_flush_lsn
> > > >  pg_last_wal_receive_lsn
> > > >  pg_last_wal_replay_lsn
> > >
> > > Yes. In fact, my current implementation might be split as:
> > >
> > >   pg_current_wal_tl: returns TL on a production cluster
> > >   pg_last_wal_received_tl: returns last received TL on a standby
> > >
> > > If useful, I could add pg_last_wal_replayed_tl. I don't think *insert_tl and
> > > *flush_tl would be useful as a cluster in production is not supposed to
> > > change its timeline during its lifetime.
> > >
> > > > But, I'm not sure just adding further pg_last_*_timeline() to
> > > > this list is a good thing..
> > >
> > > I think this is a much better idea than mixing different case (production
> > > and standby) in the same function as I did. Moreover, it's much more
> > > coherent with other existing functions.
> >
> > Please, find in attachment a new version of the patch. It now creates two new
> > fonctions:
> >
> >   pg_current_wal_tl()
> >   pg_last_wal_received_tl()
>
> I just found I forgot to use PG_RETURN_INT32 in pg_last_wal_received_tl().
> Please find the corrected patch in attachment:
> 0001-v3-Add-functions-to-get-timeline.patch

Thanks for the patch! Here are some comments from me.

You need to write the documentation explaining the functions
that you're thinking to add.

+/*
+ * Returns the current timeline on a production cluster
+ */
+Datum
+pg_current_wal_tl(PG_FUNCTION_ARGS)

The timeline ID that this function returns seems almost
the same as pg_control_checkpoint().timeline_id,
when the server is in production. So I'm not sure
if it's worth adding that new function.

+ currentTL = GetCurrentTimeLine();
+
+ PG_RETURN_INT32(currentTL);

Is GetCurrentTimeLine() really necessary? Seems ThisTimeLineID can be
returned directly since it indicates the current timeline ID in production.

+pg_last_wal_received_tl(PG_FUNCTION_ARGS)
+{
+ TimeLineID lastReceivedTL;
+ WalRcvData *walrcv = WalRcv;
+
+ SpinLockAcquire(&walrcv->mutex);
+ lastReceivedTL = walrcv->receivedTLI;
+ SpinLockRelease(&walrcv->mutex);

I think that it's smarter to use GetWalRcvWriteRecPtr() to
get the last received TLI, like pg_last_wal_receive_lsn() does.

The timeline ID that this function returns is the same as
pg_stat_wal_receiver.received_tli while walreceiver is running.
But when walreceiver is not running, pg_stat_wal_receiver returns
no record, and pg_last_wal_received_tl() would be useful to
get the timeline only in this case. Is this my understanding right?

> Also, TimeLineID is declared as a uint32. So why do we use
> PG_RETURN_INT32/Int32GetDatum to return a timeline and not PG_RETURN_UINT32?
> See eg. in pg_stat_get_wal_receiver().

pg_stat_wal_receiver.received_tli is declared as integer.

Regards,

-- 
Fujii Masao



pgsql-hackers by date:

Previous
From: Thunder
Date:
Subject: Got "FATAL: could not access status of transaction" in PG 11.2
Next
From: Merlin Moncure
Date:
Subject: Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free orcorruption (!prev)