Re: pg_stat_replication.*_lag sometimes shows NULL during active replication - Mailing list pgsql-hackers

From	Fujii Masao
Subject	Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date	March 10 04:54:13
Msg-id	CAHGQGwEmMBBAE0RG-R3_LacfT4fbB55qGE6n9O5mNwrqvbNBtw@mail.gmail.com
In response to	Re: pg_stat_replication.*_lag sometimes shows NULL during active replication (Shinya Kato <shinya11.kato@gmail.com>)
Responses	Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
List	pgsql-hackers

On Tue, Mar 10, 2026 at 10:02 AM Shinya Kato <shinya11.kato@gmail.com> wrote:
>
> On Mon, Mar 9, 2026 at 8:21 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> > > The attached v2 patch takes a different approach: it additionally
> > > requires that all reported positions (write/flush/apply) remain
> > > unchanged from the previous reply. This directly detects a truly idle
> > > system without relying on timeouts—if any position has advanced, new
> > > WAL activity must have occurred, so we should not clear the lag values
> > > even if the lag tracker is empty.
> >
> > This approach looks good to me.
>
> Thank you for looking into this.
>
> > One comment: currently, the lag becomes NULL basically after about one
> > wal_receiver_status_interval during periods of no activity. OTOH, with this
> > approach, it seems it would take about twice wal_receiver_status_interval.
> > Is this understanding correct?
>
> Exactly. With this patch, it takes about two
> wal_receiver_status_interval cycles to show NULL instead of one. I
> think this is an acceptable trade-off because it is better to take a
> bit longer to detect inactivity than to incorrectly show NULL during
> active replication.

Even with your latest patch, if we remove fullyAppliedLastTime and set
clearLagTimes to true when applyPtr == sentPtr && noLagSamples &&
positionsUnchanged, wouldn't the time for the lag to become NULL be
almost the same as wal_receiver_status_interval?
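
To make the condition concrete, here is a rough standalone sketch. It is
not the patch itself: ReplyPositions, positions_unchanged() and
should_clear_lag_times() are hypothetical names used only for
illustration, and XLogRecPtr is declared locally as a stand-in for the
real type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's WAL position type (assumption for this sketch). */
typedef uint64_t XLogRecPtr;

/* Hypothetical container for the positions reported in one standby reply. */
typedef struct ReplyPositions
{
	XLogRecPtr	write;
	XLogRecPtr	flush;
	XLogRecPtr	apply;
} ReplyPositions;

/* True if none of write/flush/apply moved since the previous reply. */
bool
positions_unchanged(const ReplyPositions *prev, const ReplyPositions *cur)
{
	return prev->write == cur->write &&
		prev->flush == cur->flush &&
		prev->apply == cur->apply;
}

/*
 * The condition discussed above, without fullyAppliedLastTime: clear the
 * lag values only when everything sent has been applied, the lag tracker
 * has no samples, and the reply repeats the previous positions.
 */
bool
should_clear_lag_times(XLogRecPtr sentPtr,
					   const ReplyPositions *prev,
					   const ReplyPositions *cur,
					   bool noLagSamples)
{
	bool		positionsUnchanged = positions_unchanged(prev, cur);

	return cur->apply == sentPtr && noLagSamples && positionsUnchanged;
}
```

In this sketch, the first reply after activity stops still differs from
the previous one, so the lag is kept. The next reply repeats the same
positions, so the condition can become true at that point without
waiting for one more reply.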

The documentation doesn't clearly specify how long it should take for
the lag to become NULL, so doubling that time might be acceptable.
However, if we can keep it roughly the same without much complexity,
I think that would be preferable.

Thoughts?

--
Fujii Masao
