Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date
Msg-id CAAKRu_ZiLuEPANqsHqqRPbgt4BTmgMqtPpyJJaTQxLs818tvKg@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Lukas Fittl <lukas@fittl.com>)
Responses Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Melanie Plageman <melanieplageman@gmail.com>)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
v31 attached
I've also addressed failing test mentioned by Andres in [1]

On Fri, Sep 30, 2022 at 7:18 PM Lukas Fittl <lukas@fittl.com> wrote:
>
> On Tue, Sep 27, 2022 at 11:20 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> First of all, I'm excited about this patch, and I think it will be a big help to understand better which part of
Postgresis producing I/O (and why). 
>

Thanks! I'm happy to hear that.

> I've paired up with Maciek (CCed) on a review of this patch and had a few comments, focused on the user experience:
>

Thanks for taking the time to review!

> The term "strategy" as an "io_context" is hard to understand, as its not a concept an end-user / DBA would be
familiarwith. Since this comes from BufferAccessStrategyType (i.e. anything not NULL/BAS_NORMAL is treated as
"strategy"),maybe we could instead split this out into the individual strategy types? i.e. making "strategy" three
differentI/O contexts instead: "shared_bulkread", "shared_bulkwrite" and "shared_vacuum", retaining "shared" to mean
NULL/ BAS_NORMAL. 

I have split strategy out into "vacuum", "bulkread", and "bulkwrite". I
thought it was less clear with shared as a prefix. If we were to have
BufferAccessStrategies in the future which acquire local buffers (for
example), we could start prefixing the columns to differentiate.

This opened up some new questions about which BufferAccessStrategies
will be employed by which BackendTypes and which IOOps will be valid in
a given BufferAccessStrategy.

I've excluded IOCONTEXT_BULKREAD and IOCONTEXT_BULKWRITE for autovacuum
worker -- though those may not be inherently invalid, they seem not to
be done now and added extra rows to the view.

I've also disallowed IOOP_EXTEND for IOCONTEXT_BULKREAD.

> Separately, could we also track buffer hits without incurring extra overhead? (not just allocs and reads) -- Whilst
wealready have shared read and hit counters in a few other places, this would help make the common "What's my cache hit
ratio"question more accurate to answer in the presence of different shared buffer access strategies. Tracking hits
couldalso help for local buffers (e.g. to tune temp_buffers based on seeing a low cache hit ratio). 

I've started tracking hits and added "hit" to the view.
I added IOOP_HIT and IOOP_ACQUIRE to those IOOps disallowed for
checkpointer and bgwriter.

I have added tests for hit, but I'm not sure I can keep them. It seems
like they might fail if the blocks are evicted between the first and
second time I try to read them.

> Additionally, some minor notes:
>
> - Since the stats are counting blocks, it would make sense to prefix the view columns with "blks_", and word them in
thepast tense (to match current style), i.e. "blks_written", "blks_read", "blks_extended", "blks_fsynced"
(realisticallyone would combine this new view with other data e.g. from pg_stat_database or pg_stat_statements, which
alluse the "blks_" prefix, and stop using pg_stat_bgwriter for this which does not use such a prefix) 

I have changed the column names to be in the past tense.

There are no columns equivalent to "dirty" or "misses" from the other
views containing information on buffer hits/block reads/writes/etc. I'm
not sure whether or not those make sense in this context.

Because we want to add non-block-oriented IO in the future (like
temporary file IO) to this view and want to use the same "read",
"written", "extended" columns, I would prefer not to prefix the columns
with "blks_". I have added a column "unit" which would contain the unit
in which read, written, and extended are in. Unfortunately, fsyncs are
not per block, so "unit" doesn't really work for this. I documented
this.

The most correct thing to do to accommodate block-oriented and
non-block-oriented IO would be to specify all the values in bytes.
However, I would like this view to be usable visually (as opposed to
just in scripts and by tools). The only current value of unit is
"block_size" which could potentially be combined with the value of the
GUC to get bytes.

I've hard-coded the string "block_size" into the view generation
function pg_stat_get_io(), so, if this idea makes sense, perhaps I
should do something better there.

> - "alloc" as a name doesn't seem intuitive (and it may be confused with memory allocations) - whilst this is already
namedthis way in pg_stat_bgwriter, it feels like this is an opportunity to eventually deprecate the column there and
makethis easier to understand - specifically, maybe we can clarify that this means buffer *acquisitions*? (either by
renamingthe field to "blks_acquired", or clarifying in the documentation) 

I have renamed it to acquired. It doesn't overlap completely with
buffers_alloc in pg_stat_bgwriter, so I didn't mention that in docs.

> - Assuming we think this view could realistically cover all I/O produced by Postgres in the future (thus warranting
thename "pg_stat_io"), it may be best to have an explicit list of things that are not currently tracked in the
documentation,to reduce user confusion (i.e. WAL writes are not tracked, temporary files are not tracked, and some
formsof direct writes are not tracked, e.g. when a table moves to a different tablespace) 

I have added this to the docs. The list is not exhaustive, so I would
love to get feedback on if there are other specific examples of IO which
is using smgr* directly that users will wonder about and I should call
out.

> - In the view documentation, it would be good to explain the different values for "io_strategy" (and what they mean)

I have added this and would love feedback on my docs additions.

> - Overall it would be helpful if we had a dedicated documentation page on I/O statistics that's linked from the
pg_stat_ioview description, and explains how the I/O statistics tie into the various concepts of shared buffers /
bufferaccess strategies / etc (and what is not tracked today) 

I haven't done this yet. How specific were you thinking -- like
interpretations of all the combinations and what to do with what you
see? Like you should run pg_prewarm if you see X? Specific checkpointer
or bgwriter GUCs to change? Or just links to other docs pages on
recommended tunings?

Were you imagining the other IO statistics views (like
pg_statio_all_tables and pg_stat_database) also being included in this
page? Like would it be a comprehensive guide to IO statistics and what
their significance/purposes are?

- Melanie

[1] https://www.postgresql.org/message-id/20221002172404.xyzhftbedh4zpio2%40awork3.anarazel.de

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Reducing the chunk header sizes on all memory context types
Next
From: Alvaro Herrera
Date:
Subject: Re: list of acknowledgments for PG15