Re: Add pg_stat_recovery system view - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Add pg_stat_recovery system view
Date
Msg-id aaj_WA1Du4micf9t@paquier.xyz
Whole thread Raw
In response to Re: Add pg_stat_recovery system view  (Xuneng Zhou <xunengzhou@gmail.com>)
Responses Re: Add pg_stat_recovery system view
List pgsql-hackers
On Wed, Mar 04, 2026 at 09:02:29AM +0800, Xuneng Zhou wrote:
> Just rebase.

I have applied 0001, that simply moves some code around.

Regarding 0002, I would recommend to not bump the catalog version in
catversion.h when sending a patch to the lists.  This task is left to
committers when the code gets merged into the tree.  And this is
annoying for submitters because it can create a lot of conflicts.

Using a set-returning function is I think wrong here, betraying the
representation of the recovery status as stored in the system.  We
know that there is only one state of recovery, fixed in shared memory.
Like the cousins of this new function, let's make thinks non-SRF,
returning one row all the time with PG_RETURN_NULL() if the conditions
for information display are not satisfied.  When we are not in
recovery or when the role querying the function is not granted
ROLE_PG_READ_ALL_STATS, that will simplify the patch as there is no
need for the else branch with the nulls, as written in your patch.

The field values are acquired the right way, spinlock acquisitions
have to be short.

Like pg_stat_wal_receiver, let's add to the view definition a qual to
return a row only if the fields are not NULL.

pg_get_wal_replay_pause_state() displays the pause_state, and it is
not hidden behind the stats read role.  I am not really convinced that
this is worth bothering to treat as an exception in this patch.  It
makes it a bit more complicated, for little gain.  I would just hide
all the fields behind the role granted, to keep the code simpler, even
if that's slightly inconsistent with pg_get_wal_replay_pause_state().

After writing this last point, as far as I can see there is a small
optimization possible in the patch.  When a role is not granted
ROLE_PG_READ_ALL_STATS, we know that we will not return any
information so we could skip the spinlock acquisition and avoid
spinlock acquisitions when one queries the function but has no access
to its data.

+       True if a promotion has been triggered for this standby server.

Standby servers are implied if data is returned, this sentence can be
simpler and cut the "standby server" part.

+       Start LSN of the last WAL record replayed during recovery.
[...]
+       End LSN of the last WAL record replayed during recovery.
[...]
+       Timeline of the last replayed WAL record.
For other system views with LSN information, we don't use "LSN", but
"write-ahead log location".  I'd suggest the same term here.  These
three fields refer to the last record *successfully* replayed.  It
seems important to mention this fact, I guess?

+       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
+      </para>
+      <para>
+       Current replay position. When replaying a record, this is the end
+       position of the record being replayed; otherwise it equals
+       <structfield>last_replayed_end_lsn</structfield>.

Slightly inexact.  This is the end LSN + 1.

+       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
[..]
+       <structfield>replay_end_tli</structfield> <type>integer</type>

Okay, I can fall behind the addition of these two, it could be helpful
in debugging something like a locking issue when replaying a specific
record, I guess.  At least I'd want to know what is happening for a
record currently being replayed.  It seems to me that this could be
more precise, mentioning that these refer to a record *currently*
being replayed.

Is recoveryLastXTime actually that relevant to have?  We use it for
logging and for recovery target times, which is something less
relevant than the other fields, especially if we think about standbys
where these have no targets to reach.

currentChunkStartTime, on the contrary, is much more relevant to me,
due to the fact that we use it in WaitForWALToBecomeAvailable() with
active WAL receivers running.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [19] CREATE SUBSCRIPTION ... SERVER
Next
From: Michael Paquier
Date:
Subject: Re: Improve checks for GUC recovery_target_xid