Thread: pg_stat_replication view
I'm not sure if this is a documentation issue, or something else.
The description of the pg_stat_replication.state column gives:
* catchup: This WAL sender's connected standby is catching up with the primary.
* streaming: This WAL sender is streaming changes after its connected standby server has caught up with the primary.
What does this mean? Is the standby "caught up" when it replays the LSN which was current on the master as-of the time that the standby initiated this connection? Or is it caught up when the master receives at least one notification that a certain LSN was replayed on the replica, and verifies that no new WAL has been generated after that certain LSN was generated? Neither of those things?
If a replica has caught up and then fallen behind again, is that different from a user/dba perspective than if it never caught up in the first place?
Also, the docs say "Lag times work automatically for physical replication. Logical decoding plugins may optionally emit tracking messages; if they do not, the tracking mechanism will simply display NULL lag." Does the logical decoding plugin associated with the built-in PUBLICATION/SUBSCRIPTION mechanism introduced in v10 emit tracking messages?
Cheers,
Jeff
On Mon, Dec 10, 2018 at 02:24:43PM -0500, Jeff Janes wrote:
> What does this mean? Is the standby "caught up" when it replays the LSN
> which was current on the master as-of the time that the standby initiated
> this connection? Or is it caught up when the master receives at least one
> notification that a certain LSN was replayed on the replica, and verifies
> that no new WAL has been generated after that certain LSN was generated?
> Neither of those things?

The WAL sender would switch from catchup to streaming mode when it sees
that there is no more data to send to the standby. Please look for the
call of WalSndSetState(WALSNDSTATE_STREAMING) in walsender.c.

> If a replica has caught up and then fallen behind again, is that different
> from a user/dba perspective than if it never caught up in the first
> place?

Not really, because it means that it has been able to catch up with the
latest LSN of the primary at least once. Perhaps you have suggestions to
improve the documentation?
--
Michael
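
[Illustration, not part of the original message] The rule described in the reply above can be sketched as a toy Python model. It is not taken from walsender.c: the class ToyWalSender, its method send_up_to and the max_bytes parameter are hypothetical names made up for this sketch. It assumes only that a textual LSN is two hexadecimal halves (high and low 32 bits of a byte position) and that, per the reply, the sender moves from catchup to streaming once there is no more data to send. Note that replay on the standby plays no part in the rule. The sketch deliberately does not model any transition in the other direction.

```python
from dataclasses import dataclass


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass
class ToyWalSender:
    """Toy model only (hypothetical names); the real logic lives in walsender.c."""
    sent_lsn: int
    state: str = "catchup"

    def send_up_to(self, flush_lsn: int, max_bytes: int) -> None:
        # Send at most max_bytes of WAL in this iteration (assumed chunking).
        self.sent_lsn = min(flush_lsn, self.sent_lsn + max_bytes)
        # Nothing left to send: the sender treats the standby as caught up.
        if self.sent_lsn >= flush_lsn and self.state == "catchup":
            self.state = "streaming"


if __name__ == "__main__":
    sender = ToyWalSender(sent_lsn=parse_lsn("0/1000000"))
    primary_flush = parse_lsn("0/1000300")
    while sender.state == "catchup":
        sender.send_up_to(primary_flush, max_bytes=0x100)
        print(hex(sender.sent_lsn), sender.state)
```

In this model the state depends only on what the sender has sent, which matches the reply's point that "caught up" means the sender ran out of WAL to send at least once.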