Re: standby apply lag on inactive servers - Mailing list pgsql-hackers
From | Alvaro Herrera |
---|---|
Subject | Re: standby apply lag on inactive servers |
Date | |
Msg-id | 20200129221131.GA16112@alvherre.pgsql Whole thread Raw |
In response to | Re: standby apply lag on inactive servers (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: standby apply lag on inactive servers
|
List | pgsql-hackers |
On 2020-Jan-28, Kyotaro Horiguchi wrote: > The aim of the patch seem reasonable. XLOG_END_OF_RECOVERY and > XLOG_XACT_PREPARE also have a timestamp but it doesn't help much. (But > could be included for consistency.) Hmm I think I should definitely include those. > The timestamp of a checkpoint record is the start time of a checkpoint > (and doesn't have subseconds part, but it's not a problem.). That > means that the timestamp is usually far behind the time at the record > has been inserted. As the result the stored timestamp can go back by a > significant internval. I don't think that causes an actual problem but > the movement looks wierd as the result of > pg_last_xact_replay_timestamp(). A problem I see with this is that setting the timestamp is done with XLogCtl->lock held; since now we need to make the update conditional, we'd have to read the current value, compare with the checkpoint time, then set, all with the spinlock held. That seems way too expensive. A compromise might be to do the compare only when it's done for checkpoint. These occur seldom enough that it shouldn't be a problem (as opposed to commit records, which can be very frequent). > Asides from the backward movement, a timestamp from other than > commit/abort records in recvoeryLastXTime affects the following code. > > xlog.c:7329: (similar code exists at line 9332) > > ereport(LOG, > > (errmsg("redo done at %X/%X", > > (uint32) (ReadRecPtr >> 32), (uint32) ReadRecPtr))); > > xtime = GetLatestXTime(); > > if (xtime) > > ereport(LOG, > > (errmsg("last completed transaction was at log time %s", > > timestamptz_to_str(xtime)))); > > This code assumes (and the name GetLatestXTime() suggests, I first > noticed that here..) that the timestamp comes from commit/abort logs, > so otherwise it shows a wrong timestamp. We shouldn't update the > variable by other than that kind of records. Thinking about this some more, I think we should do keep the message the same backbranches (avoid breaking anything that might be reading the log -- a remote but not inexistent possibility), and adjust the message in master to be something like "last timestamped WAL activity at time %s", and document that it means commit, abort, restore label, checkpoint. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: