Re: Show WAL write and fsync stats in pg_stat_io - Mailing list pgsql-hackers

From Nazir Bilal Yavuz
Subject Re: Show WAL write and fsync stats in pg_stat_io
Date
Msg-id CAN55FZ03z1s-KzMOFFoXeQcvzXsY_p_JZA2tRz65_AHCoyTnww@mail.gmail.com
Whole thread Raw
In response to Re: Show WAL write and fsync stats in pg_stat_io  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Show WAL write and fsync stats in pg_stat_io
List pgsql-hackers
Hi,

Thank you for the feedback!

On Thu, 26 Oct 2023 at 09:28, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Sep 20, 2023 at 10:57:48AM +0300, Nazir Bilal Yavuz wrote:
> > Any kind of feedback would be appreciated.
>
> This was registered in the CF, so I have given it a look.  Note that
> 0001 has a conflict with pgstat_count_io_op_time(), so it cannot be
> applied.
>
> +pgstat_should_track_io_time(IOObject io_object, IOContext io_context)
> +{
> +       /*
> +        * io times of IOOBJECT_WAL IOObject needs to be tracked when
> +        * 'track_wal_io_timing' is set regardless of 'track_io_timing'.
> +        */
> +       if (io_object == IOOBJECT_WAL)
> +               return track_wal_io_timing;
> +
> +       return track_io_timing;
>
> I can see the temptation to do that, but I have mixed feelings about
> the approach of mixing two GUCs in a code path dedicated to pg_stat_io
> where now we only rely on track_io_timing.  The result brings
> confusion, while making pg_stat_io, which is itself only used for
> block-based operations, harder to read.
>
> The suggestion I am seeing here to have a pg_stat_io_wal (with a SRF)
> is quite tempting, actually, creating a neat separation between the
> existing pg_stat_io and pg_stat_wal (not a SRF), with a third view
> that provides more details about the contexts and backend types for
> the WAL stats with its relevant fields:
> https://www.postgresql.org/message-id/CAAKRu_bM55pj3pPRW0nd_-paWHLRkOU69r816AeztBBa-N1HLA@mail.gmail.com
>
> And perhaps just putting that everything that calls
> pgstat_count_io_op_time() under track_io_timing is just natural?
> What's the performance regression you would expect if both WAL and
> block I/O are controlled by that, still one would expect only one of
> them?

I will check these and I hope I will come back with something meaningful.

>
> +      /*  Report pending statistics to the cumulative stats system */
> +      pgstat_report_wal(false);
>
> This is hidden in 0001, still would be better if handled as a patch on
> its own and optionally backpatch it as we did for the bgwriter with
> e64c733bb1?

I thought about it again and found the use of
'pgstat_report_wal(false);' here wrong. This was mainly for flushing
WAL stats because of the WAL reads but pg_stat_wal doesn't have WAL
read stats, so there is no need to flush WAL stats here. I think this
should be replaced with 'pgstat_flush_io(false);'.

>
> Side note: I think that we should spend more efforts in documenting
> what IOContext and IOOp mean.  Not something directly related to this
> patch, still this patch or things similar make it a bit harder which
> part of it is used for what by reading pgstat.h.

I agree.

Regards,
Nazir Bilal Yavuz
Microsoft



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: A recent message added to pg_upgade
Next
From: hubert depesz lubaczewski
Date:
Subject: Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}