Re: Add WAL read stats to pg_stat_wal - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Add WAL read stats to pg_stat_wal
Date
Msg-id CALj2ACXWQ8HL59oU4F4R7wNn=p14uv=ya2rEvbQhOSFE9dLN9Q@mail.gmail.com
Whole thread Raw
In response to Re: Add WAL read stats to pg_stat_wal  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Fri, Feb 17, 2023 at 12:41 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-02-16 23:39:00 +0530, Bharath Rupireddy wrote:
> > While working on [1], I was in need to know WAL read stats (number of
> > times and amount of WAL data read from disk, time taken to read) to
> > measure the benefit. I had to write a developer patch to capture WAL
> > read stats as pg_stat_wal currently emits WAL write stats. With recent
> > works on pg_stat_io which emit data read IO stats too, I think it's
> > better to not miss WAL read stats. It might help others who keep an
> > eye on IOPS of the production servers for various reasons. The WAL
> > read stats I'm thinking useful are wal_read_bytes - total amount of
> > WAL read, wal_read - total number of times WAL is read from disk,
> > wal_read_time - total amount of time spent reading WAL (tracked only
> > when an existing GUC track_wal_io_timing is on).
>
> I doesn't really seem useful to have this in pg_stat_wal, because you can't
> really figure out where those reads are coming from. Are they crash recovery?
> Walsender? ...?

Yes, that's the limitation with what I've proposed.

> I think this'd better be handled by adding WAL support for pg_stat_io. Then
> the WAL reads would be attributed to the relevant backend type, making it
> easier to answer such questions.  Going forward I want to add support for
> seeing pg_stat_io for individual connections, which'd then automatically
> support this feature for the WAL reads as well.
>
> Eventually I think pg_stat_wal should only track wal_records, wal_fpi,
> wal_buffers_full and fill the other columns from pg_stat_io.

pg_stat_io being one place for all IO related information sounds apt
and useful. And similarly, we might want to push write/read/flush info
from pg_stat_slru to pg_stat_io.

> However, this doesn't "solve" the following issue:
>
> > Note that the patch needs a bit more work, per [2]. With the patch,
> > the WAL senders (processes exiting after checkpointer) will generate
> > stats and we need to either let all or only one WAL sender to write
> > stats to disk. Allowing one WAL sender to write might be tricky.
> > Allowing all WAL senders to write might make too many writes to the
> > stats file. And, we need a lock to let only one process write. I can't
> > think of a best way here at the moment.
> >
> > Thoughts?
> >
> > [1] https://www.postgresql.org/message-id/CALj2ACXKKK=wbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54+Na=Q@mail.gmail.com
> > [2]
> >     /*
> >      * Write out stats after shutdown. This needs to be called by exactly one
> >      * process during a normal shutdown, and since checkpointer is shut down
> >      * very late...
> >      *
> >      * Walsenders are shut down after the checkpointer, but currently don't
> >      * report stats. If that changes, we need a more complicated solution.
> >      */
> >     before_shmem_exit(pgstat_before_server_shutdown, 0);
>
> I wonder if we should keep the checkpointer around for longer. If we have
> checkpointer signal postmaster after it wrote the shutdown checkpoint,
> postmaster could signal walsenders to shut down, and checkpointer could do
> some final work, like writing out the stats.
>
> I suspect this could be useful for other things as well. It's awkward that we
> don't have a place to put "just before shutting down" type tasks. And
> checkpointer seems well suited for that.

Yes, there are some places that still assume checkpointer is the last
process to exit, for instance see [1]. If we can truly make it happen,
it'll be useful. I'll come up with more thoughts (and perhaps a patch)
on this soon.

[1]
        /*
         * Checkpointer is the last process to shut down, so we ask it to hold
         * the keys for a range of other tasks required most of which have
         * nothing to do with checkpointing at all.
         *
         * For various reasons, some config values can change dynamically so
         * the primary copy of them is held in shared memory to make sure all
         * backends see the same value.  We make Checkpointer responsible for
         * updating the shared memory copy if the parameter setting changes
         * because of SIGHUP.
         */

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: psql \watch 2nd argument: iteration count
Next
From: Justin Pryzby
Date:
Subject: Re: Add support for unit "B" to pg_size_pretty()