Re: pg_stat_replication.*_lag sometimes shows NULL during active replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Msg-id CAHGQGwE=kyQ+YnGPn8zpZ959+3ywg8OR_Nu__uXxxuE0E+Y_Zg@mail.gmail.com
In response to pg_stat_replication.*_lag sometimes shows NULL during active replication  (Shinya Kato <shinya11.kato@gmail.com>)
List pgsql-hackers
On Tue, Feb 24, 2026 at 3:54 PM Shinya Kato <shinya11.kato@gmail.com> wrote:
>
> Hi hackers,
>
> I have noticed that pg_stat_replication.*_lag sometimes shows NULL
> when inserting a record per second for health checking. This happens
> when the startup process replays WAL fast enough before the
> walreceiver sends its flush notification to the walsender.
>
> Here is the sequence that triggers the issue: (See normal.svg and
> error.svg for diagrams of the normal and problematic cases.)
>
> 1. The walreceiver receives, writes, and flushes WAL, then wakes the
> startup process via WakeupRecovery().
>
> 2. The startup process replays all available WAL quickly, then calls
> WalRcvForceReply() to set force_reply = true and wakes the
> walreceiver.
>
> 3. The walreceiver sends a flush notification to the walsender
> (XLogWalRcvSendReply() in XLogWalRcvFlush()). Since the startup process has
> already replayed the WAL by this point, this message reports the
> incremented applyPtr, which equals sentPtr. The walsender processes
> this message, consuming the LagTracker samples and setting
> fullyAppliedLastTime = true.
>
> 4. In the next loop iteration, the walreceiver sees force_reply = true
> and sends another reply with the same positions. The walsender sees
> applyPtr == sentPtr for the second consecutive time and sets
> clearLagTimes = true. Since the LagTracker samples were already
> consumed by step 3, all lag values are -1. With clearLagTimes = true,
> these -1 values are written to walsnd->*Lag, causing
> pg_stat_replication to show NULL.
>
> The comment in ProcessStandbyReplyMessage() says:
>
>      * If the standby reports that it has fully replayed the WAL in two
>      * consecutive reply messages, then the second such message must result
>      * from wal_receiver_status_interval expiring on the standby.
>
> But as shown above, the second message can also come from
> WalRcvForceReply(), violating this assumption.
>
> The attached patch fixes this by adding a check that all lag values
> are -1 to the clearLagTimes condition. This ensures that clearLagTimes
> only triggers when there are truly no new lag samples in two
> consecutive messages (i.e., the system is genuinely idle), and not
> when the samples were simply consumed by a preceding message in a
> burst of replies.
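
For readers following along, the four-step sequence quoted above can be
sketched as a small standalone model of the unpatched walsender logic. This
is only an illustration, not the actual code in ProcessStandbyReplyMessage():
ModelWalSnd, model_send_wal() and model_process_reply() are hypothetical
names, the LagTracker is reduced to a single pending sample, and only the
apply lag is tracked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL types; not the real definitions. */
typedef uint64_t XLogRecPtr;
typedef int64_t TimeOffset;

/* Hypothetical, heavily simplified walsender state. */
typedef struct ModelWalSnd
{
    XLogRecPtr  sentPtr;
    TimeOffset  applyLag;             /* -1 is what gets displayed as NULL */
    bool        fullyAppliedLastTime;
    bool        samplePending;        /* one-slot stand-in for LagTracker */
    XLogRecPtr  sampleLsn;
    TimeOffset  sampleLag;
} ModelWalSnd;

/* WAL is sent up to lsn; record one lag sample for it. */
static void
model_send_wal(ModelWalSnd *ws, XLogRecPtr lsn, TimeOffset lag)
{
    ws->sentPtr = lsn;
    ws->samplePending = true;
    ws->sampleLsn = lsn;
    ws->sampleLag = lag;
}

/* Handle one reply; mirrors the two-consecutive-replies rule. */
static void
model_process_reply(ModelWalSnd *ws, XLogRecPtr applyPtr)
{
    TimeOffset  applyLag = -1;
    bool        clearLagTimes = false;

    /* Consume the pending sample if this reply covers it. */
    if (ws->samplePending && applyPtr >= ws->sampleLsn)
    {
        applyLag = ws->sampleLag;
        ws->samplePending = false;
    }

    if (applyPtr == ws->sentPtr)
    {
        if (ws->fullyAppliedLastTime)
            clearLagTimes = true;     /* second "fully applied" reply in a row */
        ws->fullyAppliedLastTime = true;
    }
    else
        ws->fullyAppliedLastTime = false;

    /* A -1 value overwrites the stored lag only when clearLagTimes is set. */
    if (applyLag != -1 || clearLagTimes)
        ws->applyLag = applyLag;
}
```

In this model, one model_send_wal() followed by two model_process_reply()
calls with the same applyPtr reproduces steps 3 and 4: the first reply
stores the measured lag, and the second (the forced reply) overwrites it
with -1.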

Thanks for the patch!

With the patch applied, I set up logical replication and inserted a row every
second. Even with continuous inserts, NULL still appeared in the lag columns
of pg_stat_replication. That makes me wonder whether the patch's approach is
sufficient to address the issue.

Relying solely on replies from the standby or subscriber seems a bit fragile to
me. If the goal is to keep showing the last measured lag for some time,
perhaps we should introduce a rate limit on when NULL is displayed in the lag
columns?

For example, if there has been no activity (i.e., sentPtr == applyPtr and
applyPtr has not changed since the previous cycle) for, say, 10 seconds,
then we could allow NULL to be shown. Thoughts?
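
To make the idea concrete, here is a rough sketch of such a rate limit as a
standalone helper. It is not a patch: IdleTracker, should_clear_lag() and
LAG_CLEAR_IDLE_USEC are hypothetical names, the 10-second threshold is just
the example value from above, and timestamps are assumed to be plain
microsecond counters passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL types; not the real definitions. */
typedef uint64_t XLogRecPtr;
typedef int64_t TimestampTz;          /* microseconds, assumed */

/* Hypothetical threshold: the "say, 10 seconds" from the text. */
#define LAG_CLEAR_IDLE_USEC (10 * INT64_C(1000000))

/* Hypothetical per-walsender idle bookkeeping. */
typedef struct IdleTracker
{
    XLogRecPtr  lastApplyPtr;         /* applyPtr seen in the previous cycle */
    TimestampTz idleSince;            /* when the current idle period began */
    bool        idle;                 /* are we inside an idle period? */
} IdleTracker;

/*
 * Return true only once the connection has been quiet (sentPtr == applyPtr
 * and applyPtr unchanged since the previous call) for at least
 * LAG_CLEAR_IDLE_USEC. Any activity restarts the idle period.
 */
static bool
should_clear_lag(IdleTracker *t, XLogRecPtr sentPtr, XLogRecPtr applyPtr,
                 TimestampTz now)
{
    bool        quiet = (applyPtr == sentPtr && applyPtr == t->lastApplyPtr);

    t->lastApplyPtr = applyPtr;

    if (!quiet)
    {
        t->idle = false;
        return false;
    }

    if (!t->idle)
    {
        /* First quiet cycle: start the clock, but keep the last lag. */
        t->idle = true;
        t->idleSince = now;
        return false;
    }

    return (now - t->idleSince) >= LAG_CLEAR_IDLE_USEC;
}
```

With something along these lines, a burst of replies carrying identical
positions would not clear the lag values by itself; only a sustained quiet
period would.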

Regards,

--
Fujii Masao