Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) |
Date | |
Msg-id | 20220706192047.sikr2vuwvpslqdrw@alap3.anarazel.de Whole thread Raw |
In response to | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
|
List | pgsql-hackers |
Hi, On 2022-07-05 13:24:55 -0400, Melanie Plageman wrote: > From 2d089e26236c55d1be5b93833baa0cf7667ba38d Mon Sep 17 00:00:00 2001 > From: Melanie Plageman <melanieplageman@gmail.com> > Date: Tue, 28 Jun 2022 11:33:04 -0400 > Subject: [PATCH v22 1/3] Add BackendType for standalone backends > > All backends should have a BackendType to enable statistics reporting > per BackendType. > > Add a new BackendType for standalone backends, B_STANDALONE_BACKEND (and > alphabetize the BackendTypes). Both the bootstrap backend and single > user mode backends will have BackendType B_STANDALONE_BACKEND. > > Author: Melanie Plageman <melanieplageman@gmail.com> > Discussion: https://www.postgresql.org/message-id/CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU%2BECH4tNwGFNERkZA%40mail.gmail.com > --- > src/backend/utils/init/miscinit.c | 17 +++++++++++------ > src/include/miscadmin.h | 5 +++-- > 2 files changed, 14 insertions(+), 8 deletions(-) > > diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c > index eb43b2c5e5..07e6db1a1c 100644 > --- a/src/backend/utils/init/miscinit.c > +++ b/src/backend/utils/init/miscinit.c > @@ -176,6 +176,8 @@ InitStandaloneProcess(const char *argv0) > { > Assert(!IsPostmasterEnvironment); > > + MyBackendType = B_STANDALONE_BACKEND; Hm. This is used for singleuser mode as well as bootstrap. Should we split those? It's not like bootstrap mode really matters for stats, so I'm inclined not to. > @@ -375,6 +376,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only) > * out the initial relation mapping files. > */ > RelationMapFinishBootstrap(); > + // TODO: should this be done for bootstrap? > + pgstat_report_io_ops(); Hm. Not particularly useful, but also not harmful. But we don't need an explicit call, because it'll be done at process exit too. At least I think, it could be that it's different for bootstrap. > diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c > index 2e146aac93..e6dbb1c4bb 100644 > --- a/src/backend/postmaster/autovacuum.c > +++ b/src/backend/postmaster/autovacuum.c > @@ -1712,6 +1712,9 @@ AutoVacWorkerMain(int argc, char *argv[]) > recentXid = ReadNextTransactionId(); > recentMulti = ReadNextMultiXactId(); > do_autovacuum(); > + > + // TODO: should this be done more often somewhere in do_autovacuum()? > + pgstat_report_io_ops(); > } Don't think you need all these calls before process exit - it'll happen via pgstat_shutdown_hook(). IMO it'd be a good idea to add pgstat_report_io_ops() to pgstat_report_vacuum()/analyze(), so that the stats for a longrunning autovac worker get updated more regularly. > diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c > index 91e6f6ea18..87e4b9e9bd 100644 > --- a/src/backend/postmaster/bgwriter.c > +++ b/src/backend/postmaster/bgwriter.c > @@ -242,6 +242,7 @@ BackgroundWriterMain(void) > > /* Report pending statistics to the cumulative stats system */ > pgstat_report_bgwriter(); > + pgstat_report_io_ops(); > > if (FirstCallSinceLastCheckpoint()) > { How about moving the pgstat_report_io_ops() into pgstat_report_bgwriter(), pgstat_report_autovacuum() etc? Seems unnecessary to have multiple pgstat_* calls in these places. > +/* > + * Flush out locally pending IO Operation statistics entries > + * > + * If nowait is true, this function returns false on lock failure. Otherwise > + * this function always returns true. Writer processes are mutually excluded > + * using LWLock, but readers are expected to use change-count protocol to avoid > + * interference with writers. > + * > + * If nowait is true, this function returns true if the lock could not be > + * acquired. Otherwise return false. > + * > + */ > +bool > +pgstat_flush_io_ops(bool nowait) > +{ > + PgStat_IOPathOps *dest_io_path_ops; > + PgStatShared_BackendIOPathOps *stats_shmem; > + > + PgBackendStatus *beentry = MyBEEntry; > + > + if (!have_ioopstats) > + return false; > + > + if (!beentry || beentry->st_backendType == B_INVALID) > + return false; > + > + stats_shmem = &pgStatLocal.shmem->io_ops; > + > + if (!nowait) > + LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE); > + else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE)) > + return true; Wonder if it's worth making the lock specific to the backend type? > + dest_io_path_ops = > + &stats_shmem->stats[backend_type_get_idx(beentry->st_backendType)]; > + This could be done before acquiring the lock, right? > +void > +pgstat_io_ops_snapshot_cb(void) > +{ > + PgStatShared_BackendIOPathOps *stats_shmem = &pgStatLocal.shmem->io_ops; > + PgStat_IOPathOps *snapshot_ops = pgStatLocal.snapshot.io_path_ops; > + PgStat_IOPathOps *reset_ops; > + > + PgStat_IOPathOps *reset_offset = stats_shmem->reset_offset; > + PgStat_IOPathOps reset[BACKEND_NUM_TYPES]; > + > + pgstat_copy_changecounted_stats(snapshot_ops, > + &stats_shmem->stats, sizeof(stats_shmem->stats), > + &stats_shmem->changecount); This doesn't make sense - with multiple writers you can't use the changecount approach (and you don't in the flush part above). > + LWLockAcquire(&stats_shmem->lock, LW_SHARED); > + memcpy(&reset, reset_offset, sizeof(stats_shmem->stats)); > + LWLockRelease(&stats_shmem->lock); Which then also means that you don't need the reset offset stuff. It's only there because with the changecount approach we can't take a lock to reset the stats (since there is no lock). With a lock you can just reset the shared state. > +void > +pgstat_count_io_op(IOOp io_op, IOPath io_path) > +{ > + PgStat_IOOpCounters *pending_counters = &pending_IOOpStats.data[io_path]; > + PgStat_IOOpCounters *cumulative_counters = > + &cumulative_IOOpStats.data[io_path]; the pending_/cumultive_ prefix before an uppercase-first camelcase name seems ugly... > + switch (io_op) > + { > + case IOOP_ALLOC: > + pending_counters->allocs++; > + cumulative_counters->allocs++; > + break; > + case IOOP_EXTEND: > + pending_counters->extends++; > + cumulative_counters->extends++; > + break; > + case IOOP_FSYNC: > + pending_counters->fsyncs++; > + cumulative_counters->fsyncs++; > + break; > + case IOOP_WRITE: > + pending_counters->writes++; > + cumulative_counters->writes++; > + break; > + } > + > + have_ioopstats = true; > +} Doing two math ops / memory accesses every time seems off. Seems better to maintain cumultive_counters whenever reporting stats, just before zeroing pending_counters? > +/* > + * Report IO operation statistics > + * > + * This works in much the same way as pgstat_flush_io_ops() but is meant for > + * BackendTypes like bgwriter for whom pgstat_report_stat() will not be called > + * frequently enough to keep shared memory stats fresh. > + * Backends not typically calling pgstat_report_stat() can invoke > + * pgstat_report_io_ops() explicitly. > + */ > +void > +pgstat_report_io_ops(void) > +{ This shouldn't be needed - the flush function above can be used. > + PgStat_IOPathOps *dest_io_path_ops; > + PgStatShared_BackendIOPathOps *stats_shmem; > + > + PgBackendStatus *beentry = MyBEEntry; > + > + Assert(!pgStatLocal.shmem->is_shutdown); > + pgstat_assert_is_up(); > + > + if (!have_ioopstats) > + return; > + > + if (!beentry || beentry->st_backendType == B_INVALID) > + return; Is there a case where this may be called where we have no beentry? Why not just use MyBackendType? > + stats_shmem = &pgStatLocal.shmem->io_ops; > + > + dest_io_path_ops = > + &stats_shmem->stats[backend_type_get_idx(beentry->st_backendType)]; > + > + pgstat_begin_changecount_write(&stats_shmem->changecount); A mentioned before, the changecount stuff doesn't apply here. You need a lock. > +PgStat_IOPathOps * > +pgstat_fetch_backend_io_path_ops(void) > +{ > + pgstat_snapshot_fixed(PGSTAT_KIND_IOOPS); > + return pgStatLocal.snapshot.io_path_ops; > +} > + > +PgStat_Counter > +pgstat_fetch_cumulative_io_ops(IOPath io_path, IOOp io_op) > +{ > + PgStat_IOOpCounters *counters = &cumulative_IOOpStats.data[io_path]; > + > + switch (io_op) > + { > + case IOOP_ALLOC: > + return counters->allocs; > + case IOOP_EXTEND: > + return counters->extends; > + case IOOP_FSYNC: > + return counters->fsyncs; > + case IOOP_WRITE: > + return counters->writes; > + default: > + elog(ERROR, "IO Operation %s for IO Path %s is undefined.", > + pgstat_io_op_desc(io_op), pgstat_io_path_desc(io_path)); > + } > +} There's currently no user for this, right? Maybe let's just defer the cumulative stuff until we need it? > +const char * > +pgstat_io_path_desc(IOPath io_path) > +{ > + const char *io_path_desc = "Unknown IO Path"; > + This should be unreachable, right? > From f2b5b75f5063702cbc3c64efdc1e7ef3cf1acdb4 Mon Sep 17 00:00:00 2001 > From: Melanie Plageman <melanieplageman@gmail.com> > Date: Mon, 4 Jul 2022 15:44:17 -0400 > Subject: [PATCH v22 3/3] Add system view tracking IO ops per backend type > Add pg_stat_buffers, a system view which tracks the number of IO > operations (allocs, writes, fsyncs, and extends) done through each IO > path (e.g. shared buffers, local buffers, unbuffered IO) by each type of > backend. I think I like pg_stat_io a bit better? Nearly everything in here seems to fit better in that. I guess we could split out buffers allocated, but that's actually interesting in the context of the kind of IO too. > <row> > <entry><structname>pg_stat_wal</structname><indexterm><primary>pg_stat_wal</primary></indexterm></entry> > <entry>One row only, showing statistics about WAL activity. See > @@ -3595,7 +3604,102 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i > <structfield>stats_reset</structfield> <type>timestamp with time zone</type> > </para> > <para> > - Time at which these statistics were last reset > + Time at which these statistics were last reset. > + </para></entry> Grammar critique time :) > +CREATE VIEW pg_stat_buffers AS > +SELECT > + b.backend_type, > + b.io_path, > + b.alloc, > + b.extend, > + b.fsync, > + b.write, > + b.stats_reset > +FROM pg_stat_get_buffers() b; Do we want to expose all data to all users? I guess pg_stat_bgwriter does? But this does split things out a lot more... > + for (int i = 0; i < BACKEND_NUM_TYPES; i++) > + { > + PgStat_IOOpCounters *counters = io_path_ops->data; > + Datum backend_type_desc = > + CStringGetTextDatum(GetBackendTypeDesc(idx_get_backend_type(i))); > + /* const char *log_name = GetBackendTypeDesc(idx_get_backend_type(i)); */ > + > + for (int j = 0; j < IOPATH_NUM_TYPES; j++) > + { > + Datum values[BUFFERS_NUM_COLUMNS]; > + bool nulls[BUFFERS_NUM_COLUMNS]; > + memset(values, 0, sizeof(values)); > + memset(nulls, 0, sizeof(nulls)); > + > + values[BUFFERS_COLUMN_BACKEND_TYPE] = backend_type_desc; > + values[BUFFERS_COLUMN_IO_PATH] = CStringGetTextDatum(pgstat_io_path_desc(j)); Random musing: I wonder if we should start to use SQL level enums for this kind of thing. > DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_test3, trunc_stats_test4; > DROP TABLE prevstats; > +SELECT pg_stat_reset_shared('buffers'); > + pg_stat_reset_shared > +---------------------- > + > +(1 row) > + > +SELECT pg_stat_force_next_flush(); > + pg_stat_force_next_flush > +-------------------------- > + > +(1 row) > + > +SELECT write = 0 FROM pg_stat_buffers WHERE io_path = 'Shared' and backend_type = 'checkpointer'; > + ?column? > +---------- > + t > +(1 row) Don't think you can rely on that. The lookup of the view, functions might have needed to load catalog data, which might have needed to evict buffers. I think you can do something more reliable by checking that there's more written buffers after a checkpoint than before, or such. Would be nice to have something testing that the ringbuffer stats stuff does something sensible - that feels not entirely trivial. Greetings, Andres Freund
pgsql-hackers by date: