Thread: pg_stat_replication view

pg_stat_replication view

From: Jeff Janes

I'm not sure if this is a documentation issue, or something else.

The description of the pg_stat_replication.state column gives:

* catchup: This WAL sender's connected standby is catching up with the primary.

* streaming: This WAL sender is streaming changes after its connected standby server has caught up with the primary.

What does this mean?  Is the standby "caught up" when it replays the LSN which was current on the master as-of the time that the standby initiated this connection?  Or is it caught up when the master receives at least one notification that a certain LSN was replayed on the replica, and verifies that no new WAL has been generated after that certain LSN was generated?  Neither of those things?

If a replica has caught up and then fallen behind again, is that different from a user/dba perspective than if it never caught up in the first place?

Also, the docs say "Lag times work automatically for physical replication. Logical decoding plugins may optionally emit tracking messages; if they do not, the tracking mechanism will simply display NULL lag."  Does the logical decoding plugin associated with the built-in PUBLICATION/SUBSCRIPTION mechanism introduced in v10 emit tracking messages?

Cheers,

Jeff

Re: pg_stat_replication view

From: Michael Paquier

On Mon, Dec 10, 2018 at 02:24:43PM -0500, Jeff Janes wrote:
> What does this mean?  Is the standby "caught up" when it replays the LSN
> which was current on the master as-of the time that the standby initiated
> this connection?  Or is it caught up when the master receives at least one
> notification that a certain LSN was replayed on the replica, and verifies
> that no new WAL has been generated after that certain LSN was generated?
> Neither of those things?

The WAL sender switches from catchup to streaming mode when it sees
that there is no more data to send to the standby.  See the call to
WalSndSetState(WALSNDSTATE_STREAMING) in walsender.c.
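
As a rough illustration of that transition, here is a toy Python model.
It is not the code in walsender.c: the class and attribute names
(ToyWalSender, flush_lsn, sent_lsn) are made up for this sketch, and it
assumes the only condition for leaving catchup is that everything
available on the primary has been sent.

```python
from dataclasses import dataclass

CATCHUP = "catchup"
STREAMING = "streaming"


@dataclass
class ToyWalSender:
    """Toy model of a WAL sender; all names are hypothetical."""
    flush_lsn: int        # amount of WAL the primary has available to send
    sent_lsn: int = 0     # amount of WAL already sent to the standby
    state: str = CATCHUP

    def send_some(self, max_bytes: int) -> None:
        # Send at most max_bytes of the WAL that is still outstanding.
        self.sent_lsn = min(self.flush_lsn, self.sent_lsn + max_bytes)
        # Once there is nothing left to send, leave catchup mode.
        if self.state == CATCHUP and self.sent_lsn >= self.flush_lsn:
            self.state = STREAMING


sender = ToyWalSender(flush_lsn=1000)
sender.send_some(600)
state_mid = sender.state     # 400 bytes still outstanding
sender.send_some(600)
state_done = sender.state    # nothing left to send
sender.flush_lsn += 500      # the primary generates more WAL afterwards
state_later = sender.state   # this toy model never resets the state
```

In this sketch the state is never reset once it reaches streaming.  That
is a property of the model; whether the real WAL sender behaves the same
way is best confirmed by reading walsender.c.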

> If a replica has caught up and then fallen behind again, is that different
> from a user/dba perspective than if it never caught up in the first
> place?

Not really, because it means that it has been able to catch up with the
latest LSN of the primary at least once.  Perhaps you have suggestions
to improve the documentation?
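
Since the state column alone does not say how far behind a standby is
right now, one option is to compare LSNs directly.  A minimal sketch,
assuming the usual textual LSN form of two hexadecimal numbers separated
by a slash (high and low 32 bits); the helper names parse_lsn and
lag_bytes are made up here, and the sample values are invented.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/3000060' to a 64-bit byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Number of bytes of WAL between the two positions."""
    return parse_lsn(primary_lsn) - parse_lsn(standby_lsn)


# Invented sample values: the standby is 0x60 bytes behind the primary.
behind = lag_bytes("0/3000060", "0/3000000")
# The high half counts units of 2**32 bytes.
across_boundary = lag_bytes("1/0", "0/FFFFFFFF")
```

On the server side, pg_wal_lsn_diff() performs the same subtraction, for
example between pg_current_wal_lsn() and the replay_lsn column of
pg_stat_replication.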
--
Michael
