Thread: Re: per backend WAL statistics

Re: per backend WAL statistics

From
Rahila Syed
Date:
Hi,

Thank you for the patchset. Having per-backend WAL statistics,
in addition to cluster-wide ones, is useful.

I had a few comments while looking at v6-0003-* patch.

+ /*
+ * This could be an auxiliary process but these do not report backend
+ * statistics due to pgstat_tracks_backend_bktype(), so there is no need
+ * for an extra call to AuxiliaryPidGetProc().
+ */
+ if (!proc)
+ PG_RETURN_NULL();

Maybe an explicit call to AuxiliaryPidGetProc() followed by a check
for pgstat_tracks_backend_bktype() would be more maintainable.
Since the processes tracked by AuxiliaryPidGetProc and
pgstat_tracks_backend_bktype might diverge in future.  

On that note, it is not clear to me why the WAL writer statistics are not
included in per backend
wal statistics? I understand the same limitation currently exists in
pgstats_track_io_bktype(), but why does that need to be extended to
WAL statistics?

+         <primary>pg_stat_get_backend_wal</primary>
+        </indexterm>
+        <function>pg_stat_get_backend_wal</function> ( <type>integer</type> )
+        <returnvalue>record</returnvalue>
+       </para>
Should the naming describe what is being returned more clearly? 
Something like pg_stat_get_backend_wal_activity()? Currently it 
suggests that it returns a backend's WAL, which is not the case.

+ if (pgstat_tracks_backend_bktype(MyBackendType))
+ {
+ PendingBackendStats.pending_wal.wal_write++;
+
+ if (track_wal_io_timing)
+ INSTR_TIME_ACCUM_DIFF(PendingBackendStats.pending_wal.wal_write_time,
+  end, start);
+ }
At the risk of nitpicking, may I suggest moving the above code, which is under the 
track_wal_io_timing check, to the existing check before this added chunk?
This way, all code related to track_wal_io_timing will be grouped together, 
closer to where the "end" variable is computed.


Thank you,
Rahila Syed



On Tue, Jan 21, 2025 at 12:50 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
Hi,

On Fri, Jan 17, 2025 at 08:43:57AM +0900, Michael Paquier wrote:
> On Thu, Jan 16, 2025 at 12:44:20PM -0500, Andres Freund wrote:
> > On 2025-01-16 17:11:09 +0000, Bertrand Drouvot wrote:
> >> So, do you think that the initial proposal that has been made here (See R1. in
> >> [2]) i.e make use of a new PendingBackendWalStats variable:
> >
> > Well, I think this first needs be fixed for for the IO stats change made in
> >
> > Once we have a pattern to model after, we can apply the same scheme here.
>
> Okay, thanks for the input.  I was not sure what you intended
> originally with all this part of the backend code, and how much would
> be acceptable.  The line is clear now.
>
> >> 0003 does not rely on pgstat_prep_backend_pending() for its pending statistics
> >> but on a new PendingBackendWalStats variable. The reason is that the pending wal
> >> statistics are incremented in a critical section (see XLogWrite(), and so
> >> a call to pgstat_prep_pending_entry() could trigger a failed assertion:
> >> MemoryContextAllocZero()->"CritSectionCount == 0 || (context)->allowInCritSection"
> >> "
> >>
> >> and implemented up to v4 is a viable approach?
> >
> > Yes-ish.  I think it would be better to make it slightly more general than
> > that, handling this for all types of backend stats, not just for WAL.
>
> Agreed to use the same concept for all these parts of the backend
> stats kind rather than two of them.  Will send a reply on the original
> backend I/O thread as well.

PFA v6 that now relies on the new PendingBackendStats variable introduced in
4feba03d8b9.

Remark: I moved PendingBackendStats back to pgstat.h because I think that the
"simple" pending stats increment that we are adding in xlog.c are not worth
an extra function call overhead (while it made more sense for the more complex IO
stats handling). So PendingBackendStats is now visible to the outside world like
PendingWalStats and friends.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: per backend WAL statistics

From
Bertrand Drouvot
Date:
Hi,

On Wed, Jan 29, 2025 at 01:14:09PM +0530, Rahila Syed wrote:
> Hi,
> 
> Thank you for the patchset. Having per-backend WAL statistics,
> in addition to cluster-wide ones, is useful.

Thanks for looking at it!

> I had a few comments while looking at v6-0003-* patch.
> 
> + /*
> + * This could be an auxiliary process but these do not report backend
> + * statistics due to pgstat_tracks_backend_bktype(), so there is no need
> + * for an extra call to AuxiliaryPidGetProc().
> + */
> + if (!proc)
> + PG_RETURN_NULL();
> 
> Maybe an explicit call to AuxiliaryPidGetProc() followed by a check
> for pgstat_tracks_backend_bktype() would be more maintainable.
> Since the processes tracked by AuxiliaryPidGetProc and
> pgstat_tracks_backend_bktype might diverge in future.

I think that could make sense but that might need a separate thread as this
is not only related to this patch (already done that way in pg_stat_reset_backend_stats()
and pg_stat_get_backend_io()).

> On that note, it is not clear to me why the WAL writer statistics are not
> included in per backend
> wal statistics? I understand the same limitation currently exists in
> pgstats_track_io_bktype(), but why does that need to be extended to
> WAL statistics?

WAL writer might be fine but that would not add that much value here because
it's going to appear anyway in pg_stat_io once Nazir's patch [1] gets in.

> +         <primary>pg_stat_get_backend_wal</primary>
> +        </indexterm>
> +        <function>pg_stat_get_backend_wal</function> (
> <type>integer</type> )
> +        <returnvalue>record</returnvalue>
> +       </para>
> Should the naming describe what is being returned more clearly?
> Something like pg_stat_get_backend_wal_activity()? Currently it
> suggests that it returns a backend's WAL, which is not the case.

Not sure. It aligns with pg_stat_get_backend_io() and the "stat" in its name
suggests this is related to stats.

> + if (pgstat_tracks_backend_bktype(MyBackendType))
> + {
> + PendingBackendStats.pending_wal.wal_write++;
> +
> + if (track_wal_io_timing)
> + INSTR_TIME_ACCUM_DIFF(PendingBackendStats.pending_wal.wal_write_time,
> +  end, start);
> + }
> At the risk of nitpicking, may I suggest moving the above code, which is
> under the
> track_wal_io_timing check, to the existing check before this added chunk?
> This way, all code related to track_wal_io_timing will be grouped together,
> closer to where the "end" variable is computed.

I think we're waiting [1] to be in before moving forward with this patch. I think
that [1] also touches this part of the code. I'll keep your remark in mind and
see if it still makes sense once [1] gets in.

[1]:
https://www.postgresql.org/message-id/flat/CAN55FZ3AiQ%2BZMxUuXnBpd0Rrh1YhwJ5FudkHg%3DJU0P%2B-W8T4Vg%40mail.gmail.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com