On Tue, Mar 10, 2026 at 10:02 AM Shinya Kato <shinya11.kato@gmail.com> wrote:
>
> On Mon, Mar 9, 2026 at 8:21 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> > > The attached v2 patch takes a different approach: it additionally
> > > requires that all reported positions (write/flush/apply) remain
> > > unchanged from the previous reply. This directly detects a truly idle
> > > system without relying on timeouts—if any position has advanced, new
> > > WAL activity must have occurred, so we should not clear the lag values
> > > even if the lag tracker is empty.
> >
> > This approach looks good to me.
>
> Thank you for looking into this.
>
> > One comment: currently, the lag becomes NULL basically after about one
> > wal_receiver_status_interval during periods of no activity. OTOH, with this
> > approach, it seems it would take about twice wal_receiver_status_interval.
> > Is this understanding correct?
>
> Exactly. With this patch, it takes about two
> wal_receiver_status_interval cycles to show NULL instead of one. I
> think this is an acceptable trade-off because it is better to take a
> bit longer to detect inactivity than to incorrectly show NULL during
> active replication.
Even with your latest patch, if we remove fullyAppliedLastTime, and set
clearLagTimes to true when applyPtr == sentPtr && noLagSamples &&
positionsUnchanged,
wouldn't the time for the lag to become NULL be almost the same as
wal_receiver_status_interval?
The documentation doesn't clearly specify how long it should take for
the lag to become NULL, so doubling that time might be acceptable.
However, if we can keep it roughly the same without much complexity,
I think that would be preferable.
Thought?
--
Fujii Masao