Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs
Date
Msg-id 20190813021953.GB2551@paquier.xyz
In response to Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Tue, Aug 13, 2019 at 11:15:42AM +1200, Thomas Munro wrote:
> Hmm.  It's working as designed, but indeed it's not very newsworthy
> information in this case.  If you run pg_receivewal --synchronous then
> you get sensible looking flush_lag times.  Without that, flush_lag
> only goes up, and of course replay_lag only goes up, so although it's
> telling the truth, I think your proposal makes sense.

Thanks!

> One question I had is what would happen with your patch without
> --synchronous, once it flushes a whole file and opens a new one; I
> wondered if your new boring-information-hiding behaviour would stop
> working after one segment file because of that.

Indeed.

> I tested that and the column remains NULL when we move to a new
> file, so that's good.

Thanks for looking.

> One thing I noticed in passing is that you always get the same times
> in the write_lag and flush_lag columns, in --synchronous mode, and the
> times update infrequently.  That's not the case with regular
> replicas; I suspect there is a difference in the time and frequency of
> replies sent to the server, which I guess might make synchronous
> commit a bit "lumpier", but I didn't dig further today.

The messages are sent by pg_receivewal via sendFeedback() in
receivelog.c.  It gets triggered for the --synchronous case once a
flush is done (but you are not surprised by my reply here, right?),
and most likely the matches you are seeing come from the messages sent
at the beginning of HandleCopyStream() where the flush and write
LSNs are equal.  This code behaves as I would expect based on your
description and a read of the code I have just done to refresh my
mind, but we may of course have some issues or potential
improvements.
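
To illustrate why write_lag and flush_lag would match in that case, here
is a toy model in Python.  It is only a sketch of the behaviour described
above, not the code in receivelog.c: the class, its fields and the
run() helper are all hypothetical names, LSNs are plain integers, and
the periodic status-interval replies are left out on purpose.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReceiver:
    """Hypothetical stand-in for a WAL receiver; not PostgreSQL code."""
    synchronous: bool
    write_lsn: int = 0          # last position written to the file
    flush_lsn: int = 0          # last position flushed to disk
    replies: list = field(default_factory=list)

    def send_feedback(self):
        # Record the (write, flush) pair reported to the server.
        self.replies.append((self.write_lsn, self.flush_lsn))

    def loop_iteration(self, incoming_bytes):
        # Top of the loop: in synchronous mode, flush anything that is
        # written but not yet flushed, then report immediately.
        if self.synchronous and self.flush_lsn < self.write_lsn:
            self.flush_lsn = self.write_lsn
            self.send_feedback()
        # Then receive and write more WAL (assumed to always succeed).
        self.write_lsn += incoming_bytes


def run(synchronous, chunks):
    receiver = ToyReceiver(synchronous)
    for chunk in chunks:
        receiver.loop_iteration(chunk)
    receiver.loop_iteration(0)  # one more pass so the last chunk is flushed
    return receiver.replies
```

In this model every reply sent in synchronous mode carries identical
write and flush positions, because the flush happens right before the
reply is built.  Without synchronous mode the model sends nothing at
all, since it ignores the timer-driven replies.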
--
Michael
