Add WAL read stats to pg_stat_wal - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Add WAL read stats to pg_stat_wal
Date
Msg-id CALj2ACXf8_H1WXoTNA8+6jQ_K=a-UdDyr9a4F8FJaovj4NwwiQ@mail.gmail.com
Whole thread Raw
Responses Re: Add WAL read stats to pg_stat_wal
List pgsql-hackers
Hi,

While working on [1], I was in need to know WAL read stats (number of
times and amount of WAL data read from disk, time taken to read) to
measure the benefit. I had to write a developer patch to capture WAL
read stats as pg_stat_wal currently emits WAL write stats. With recent
works on pg_stat_io which emit data read IO stats too, I think it's
better to not miss WAL read stats. It might help others who keep an
eye on IOPS of the production servers for various reasons. The WAL
read stats I'm thinking useful are wal_read_bytes - total amount of
WAL read, wal_read - total number of times WAL is read from disk,
wal_read_time - total amount of time spent reading WAL (tracked only
when an existing GUC track_wal_io_timing is on).

I came up with a patch and attached it here. The WAL readers that add
to WAL read stats are WAL senders, startup process and other backends
using xlogreader for logical replication or pg_walinspect SQL
functions. They all report stats to shared memory by calling
pgstat_report_wal() in appropriate locations. In standby mode, calling
pgstat_report_wa() for every record seems to be costly. Therefore, I
chose to report stats every 1024 WAL records (a random number,
suggestions for a better a way are welcome here).

Note that the patch needs a bit more work, per [2]. With the patch,
the WAL senders (processes exiting after checkpointer) will generate
stats and we need to either let all or only one WAL sender to write
stats to disk. Allowing one WAL sender to write might be tricky.
Allowing all WAL senders to write might make too many writes to the
stats file. And, we need a lock to let only one process write. I can't
think of a best way here at the moment.

Thoughts?

[1] https://www.postgresql.org/message-id/CALj2ACXKKK=wbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54+Na=Q@mail.gmail.com
[2]
    /*
     * Write out stats after shutdown. This needs to be called by exactly one
     * process during a normal shutdown, and since checkpointer is shut down
     * very late...
     *
     * Walsenders are shut down after the checkpointer, but currently don't
     * report stats. If that changes, we need a more complicated solution.
     */
    before_shmem_exit(pgstat_before_server_shutdown, 0);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: ATTACH PARTITION seems to ignore column generation status
Next
From: Jacob Champion
Date:
Subject: Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert