Re: Allow reading LSN written by walreciever, but not flushed yet - Mailing list pgsql-hackers

From	Alexander Kukushkin
Subject	Re: Allow reading LSN written by walreciever, but not flushed yet
Date	May 14 09:54:05
Msg-id	CAFh8B==0DONwvHHvW_YBN1LSi=Gd3iPc=enqCk5vNDOfDwtq2g@mail.gmail.com
In response to	Re: Allow reading LSN written by walreciever, but not flushed yet (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses	Re: Allow reading LSN written by walreciever, but not flushed yet
List	pgsql-hackers

Hi Fujii,

On Tue, 13 May 2025 at 13:13, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

> In this case, doesn't the flush LSN typically catch up to the write LSN on node2
> after a few seconds? Even if the walreceiver exits while there's still written
> but unflushed WAL, it looks like WalRcvDie() ensures everything is flushed by
> calling XLogWalRcvFlush(). So, isn't it safe to rely on the flush LSN when selecting
> the most advanced node? No?

I think it is a bit more complex than that. There are also cases where we want to ensure that "healthy" standby nodes exist when a switchover is requested.

"Healthy" could mean something like "according to the write LSN, the standby is not lagging by more than 16MB", or similar.

Currently this value can be extracted using pg_stat_get_wal_receiver()/pg_stat_wal_receiver, but that works only while the walreceiver process is alive.
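
The "not lagging more than 16MB" check mentioned above can be sketched roughly as follows. This is only an illustration, not code from any HA tool: the names parse_lsn, is_healthy and MAX_LAG_BYTES are hypothetical, and how the primary's position and the standby's write LSN are obtained is left out. It relies only on the textual LSN format, two hexadecimal numbers separated by a slash that together form a 64-bit byte position.

```python
MAX_LAG_BYTES = 16 * 1024 * 1024  # hypothetical threshold: the 16MB from the example


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def is_healthy(primary_lsn: str, standby_write_lsn: str,
               max_lag: int = MAX_LAG_BYTES) -> bool:
    """Return True if the standby's write LSN is within max_lag bytes of the primary."""
    lag = parse_lsn(primary_lsn) - parse_lsn(standby_write_lsn)
    return lag <= max_lag
```

A switchover could then be refused when no standby passes is_healthy(). Whether the write, flush or replay position is the right input depends on the tool's policy.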

>>> Caveat: we already have a function pg_last_wal_receive_lsn(), which in fact returns flushed LSN, not written. I propose to add a new function which returns LSN actually written. Internals of this function are already implemented (GetWalRcvWriteRecPtr()), but unused.

> GetWalRcvWriteRecPtr() returns walrcv->writtenUpto, which can move backward
> when the walreceiver restarts. This behavior is OK for your purpose?

IMO, most HA tools are prepared for it. They can't rely only on the write/flush LSN, because a standby may be replaying WAL from the archive using restore_command, in which case only the replay LSN is progressing.

That is, they are supposed to be doing something like max(write_lsn, replay_lsn).
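
As a rough sketch of the max(write_lsn, replay_lsn) idea, assuming each standby reports its positions as textual LSNs and that a missing write LSN (for example, no running walreceiver) is represented as None. The helper names effective_lsn and most_advanced and the node names are hypothetical, not taken from any particular HA tool.

```python
from typing import Dict, Optional, Tuple


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def effective_lsn(write_lsn: Optional[str], replay_lsn: Optional[str]) -> int:
    """max(write_lsn, replay_lsn), ignoring positions that are not available."""
    known = [parse_lsn(lsn) for lsn in (write_lsn, replay_lsn) if lsn is not None]
    return max(known, default=0)


def most_advanced(nodes: Dict[str, Tuple[Optional[str], Optional[str]]]) -> str:
    """Pick the node whose effective position is the furthest ahead."""
    return max(nodes, key=lambda name: effective_lsn(*nodes[name]))


# Example: node3 replays from the archive only, so it has no write LSN,
# yet its replay LSN is ahead of node2's write LSN.
standbys = {
    "node2": ("0/5000000", "0/4000000"),
    "node3": (None, "0/6000000"),
}
candidate = most_advanced(standbys)
```

Taking the maximum also tolerates a write LSN that moves backward after a walreceiver restart, as long as the replay LSN has already advanced past it.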

Regards,

Alexander Kukushkin
