Thread: pg_stat_replication lag fields return non-NULL values even with NULLLSNs
Hi all, (Thomas in CC as per 6912acc0) I got surprised by the following behavior from pg_stat_get_wal_senders when connecting for example pg_receivewal to a primary: =# select application_name, flush_lsn, replay_lsn, flush_lag, replay_lag from pg_stat_replication; application_name | flush_lsn | replay_lsn | flush_lag | replay_lag ------------------+-----------+------------+-----------------+----------------- receivewal | null | null | 00:09:13.578185 | 00:09:13.578185 (1 row) It makes little sense to me, as we are reporting a replay lag on a position which has never been reported yet, so it cannot actually be used as a comparison base for the lag. Am I missing something or should we return NULL for those fields if we have no write, flush or apply LSNs like in the attached? Thoughts? -- Michael
Attachment
Re: pg_stat_replication lag fields return non-NULL values even withNULL LSNs
From
Thomas Munro
Date:
On Wed, Jul 17, 2019 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote: > I got surprised by the following behavior from pg_stat_get_wal_senders > when connecting for example pg_receivewal to a primary: > =# select application_name, flush_lsn, replay_lsn, flush_lag, > replay_lag from pg_stat_replication; > application_name | flush_lsn | replay_lsn | flush_lag | replay_lag > ------------------+-----------+------------+-----------------+----------------- > receivewal | null | null | 00:09:13.578185 | 00:09:13.578185 > (1 row) > > It makes little sense to me, as we are reporting a replay lag on a > position which has never been reported yet, so it cannot actually be > used as a comparison base for the lag. Am I missing something or > should we return NULL for those fields if we have no write, flush or > apply LSNs like in the attached? Hmm. It's working as designed, but indeed it's not very newsworthy information in this case. If you run pg_receivewal --synchronous then you get sensible looking flush_lag times. Without that, flush_lag only goes up, and of course replay_lag only goes up, so although it's telling the truth, I think your proposal makes sense. One question I had is what would happen with your patch without --synchronous, once it flushes a whole file and opens a new one; I wondered if your new boring-information-hiding behaviour would stop working after one segment file because of that. I tested that and the column remains NULL when we move to a new file, so that's good. One thing I noticed in passing is that you always get the same times in the write_lag and flush_lag columns, in --synchronous mode, and the times updates infrequently. That's not the case with regular replicas; I suspect there is a difference in the time and frequency of replies sent to the server, which I guess might make synchronous commit a bit "lumpier", but I didn't dig further today. -- Thomas Munro https://enterprisedb.com
Re: pg_stat_replication lag fields return non-NULL values even withNULL LSNs
From
Michael Paquier
Date:
On Tue, Aug 13, 2019 at 11:15:42AM +1200, Thomas Munro wrote: > Hmm. It's working as designed, but indeed it's not very newsworthy > information in this case. If you run pg_receivewal --synchronous then > you get sensible looking flush_lag times. Without that, flush_lag > only goes up, and of course replay_lag only goes up, so although it's > telling the truth, I think your proposal makes sense. Thanks! > One question I had is what would happen with your patch without > --synchronous, once it flushes a whole file and opens a new one; I > wondered if your new boring-information-hiding behaviour would stop > working after one segment file because of that. Indeed. > I tested that and the column remains NULL when we move to a new > file, so that's good. Thanks for looking. > One thing I noticed in passing is that you always get the same times > in the write_lag and flush_lag columns, in --synchronous mode, and the > times updates infrequently. That's not the case with regular > replicas; I suspect there is a difference in the time and frequency of > replies sent to the server, which I guess might make synchronous > commit a bit "lumpier", but I didn't dig further today. The messages are sent by pg_receivewal via sendFeedback() in receivelog.c. It gets triggered for the --synchronous case once a flush is done (but you are not surprised by my reply here, right!), and most likely the matches you are seeing some from the messages sent at the beginning of HandleCopyStream() where the flush and write LSNs are equal. This code behaves as I would expect based on your description and a read of the code I have just done to refresh my mind, but we may of course have some issues or potential improvements. -- Michael
Attachment
Re: pg_stat_replication lag fields return non-NULL values even withNULL LSNs
From
Thomas Munro
Date:
On Tue, Aug 13, 2019 at 2:20 PM Michael Paquier <michael@paquier.xyz> wrote: > On Tue, Aug 13, 2019 at 11:15:42AM +1200, Thomas Munro wrote: > > One thing I noticed in passing is that you always get the same times > > in the write_lag and flush_lag columns, in --synchronous mode, and the > > times updates infrequently. That's not the case with regular > > replicas; I suspect there is a difference in the time and frequency of > > replies sent to the server, which I guess might make synchronous > > commit a bit "lumpier", but I didn't dig further today. > > The messages are sent by pg_receivewal via sendFeedback() in > receivelog.c. It gets triggered for the --synchronous case once a > flush is done (but you are not surprised by my reply here, right!), > and most likely the matches you are seeing some from the messages sent > at the beginning of HandleCopyStream() where the flush and write > LSNs are equal. This code behaves as I would expect based on your > description and a read of the code I have just done to refresh my > mind, but we may of course have some issues or potential > improvements. Right. For a replica server we call XLogWalRcvSendReply() after writing, and then again inside XLogWalRcvFlush(). So the primary gets to measure write_lag and flush_lag separately. If pg_receivewal just sends one reply after flushing, then turning on --synchronous has the effect of showing the flush lag in both write_lag and flush_lag columns. Of course those things aren't quite as independent as they should be anyway, since the flush is blocking and therefore delays the next write. <mind-reading-mode>That's why Simon probably wants to move the flush to the WAL writer process, and Andres probably wants to change the whole thing to use some kind of async IO[1].</mind-reading-mode> [1] https://lwn.net/Articles/789024/ -- Thomas Munro https://enterprisedb.com