On Mon, 31 Mar 2025 at 18:35, Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Mon, Mar 24, 2025 at 08:41:20AM +0900, Michael Paquier wrote:
> > On Wed, Mar 19, 2025 at 04:00:49PM +0800, Xuneng Zhou wrote:
> > > Hi,
> > > Moving the other two provides a more complete view of the settings. For
> > > newcomers(like me) to the codebase, seeing all three related values in one
> > > place helps avoid a narrow view of the settings.
> > >
> > > But I am not sure that I understand the cons of this well.
> >
> > While I don't disagree with the use of a hardcoded interval of time to
> > control timing the flush of the WAL sender stats, do we really need to
> > rely on the timing defined by pgstat.c?
>
> No but I thought it could make sense.
>
> > Wouldn't it be simpler to
> > assign one in walsender.c and pick up a different, perhaps higher,
> > value?
>
> I don't have a strong opinion on it so done as suggested above in the attached.
>
> I think that the 1s value is fine because: 1. it is consistent with
> PGSTAT_MIN_INTERVAL and 2. it already needs that the sender is caught up or
> has pending data to send (means it could be higher than 1s already). That said,
> I don't think that would hurt if you think of a higher value.
>
> > At the end the timestamp calculations are free because we can rely on
> > the existing call of GetCurrentTimestamp() for the physical WAL
> > senders to get an idea of the current time.
>
> Yup
>
> > For the logical WAL
> > senders, perhaps we'd better do the reports in WalSndWaitForWal(),
> > actually. There is also a call to GetCurrentTimestamp() that we can
> > rely on in this path.
>
> I think it's better to flush the stats in a shared code path. I think it's
> easier to maintain and that there is no differences between logical and
> physical walsenders that would justify to flush the stats in specific code
> paths.
Couple of suggestions:
1) I felt we can include a similar verification for one of the logical
replication tests too:
+# Wait for the walsender to update its IO statistics.
+# Has to be done before the next restart and far enough from the
+# pg_stat_reset_shared('io') to minimize the risk of polling for too long.
+$node_primary->poll_query_until(
+ 'postgres',
+ qq[SELECT sum(reads) > 0
+ FROM pg_catalog.pg_stat_io
+ WHERE backend_type = 'walsender'
+ AND object = 'wal']
+ )
+ or die
+ "Timed out while waiting for the walsender to update its IO statistics";
+
2) We can comment this in a single line itself:
+ /*
+ * Report IO statistics
+ */
Regards,
Vignesh