Re: standby apply lag on inactive servers - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: standby apply lag on inactive servers
Date
Msg-id 20200130.111917.2199841860956912043.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: standby apply lag on inactive servers  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: standby apply lag on inactive servers  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
At Wed, 29 Jan 2020 19:11:31 -0300, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in 
> On 2020-Jan-28, Kyotaro Horiguchi wrote:
> 
> > The aim of the patch seem reasonable. XLOG_END_OF_RECOVERY and
> > XLOG_XACT_PREPARE also have a timestamp but it doesn't help much. (But
> > could be included for consistency.)
> 
> Hmm I think I should definitely include those.

I agree to that, given the following change of log messages.

> > The timestamp of a checkpoint record is the start time of a checkpoint
> > (and doesn't have subseconds part, but it's not a problem.). That
> > means that the timestamp is usually far behind the time at the record
> > has been inserted. As the result the stored timestamp can go back by a
> > significant internval. I don't think that causes an actual problem but
> > the movement looks wierd as the result of
> > pg_last_xact_replay_timestamp().
> 
> A problem I see with this is that setting the timestamp is done with
> XLogCtl->lock held; since now we need to make the update conditional,
> we'd have to read the current value, compare with the checkpoint time,
> then set, all with the spinlock held.  That seems way too expensive.
> 
> A compromise might be to do the compare only when it's done for
> checkpoint.  These occur seldom enough that it shouldn't be a problem
> (as opposed to commit records, which can be very frequent).

I think we don't need to that, given the following change.

> > Asides from the backward movement, a timestamp from other than
> > commit/abort records in recvoeryLastXTime affects the following code.
> > 
> > xlog.c:7329: (similar code exists at line 9332)
> > >    ereport(LOG,
> > >            (errmsg("redo done at %X/%X",
> > >                    (uint32) (ReadRecPtr >> 32), (uint32) ReadRecPtr)));
> > >    xtime = GetLatestXTime();
> > >    if (xtime)
> > >        ereport(LOG,
> > >                (errmsg("last completed transaction was at log time %s",
> > >                        timestamptz_to_str(xtime))));
> > 
> > This code assumes (and the name GetLatestXTime() suggests, I first
> > noticed that here..) that the timestamp comes from commit/abort logs,
> > so otherwise it shows a wrong timestamp.  We shouldn't update the
> > variable by other than that kind of records.
> 
> Thinking about this some more, I think we should do keep the message the
> same backbranches (avoid breaking anything that might be reading the log
> -- a remote but not inexistent possibility), and adjust the message in
> master to be something like "last timestamped WAL activity at time %s",
> and document that it means commit, abort, restore label, checkpoint.

Agreed about backbranches. I'd like to preserve the word "transaction"
as it is more familiar to users. How about something like the follows?

"transactions are completed up to log time %s"

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
Next
From: Michael Paquier
Date:
Subject: Re: Physical replication slot advance is not persistent