Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date
Msg-id 20230111215850.GO9837@telsasoft.com
Whole thread Raw
In response to Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
> Subject: [PATCH v45 4/5] Add system view tracking IO ops per backend type

The patch can/will fail with:

CREATE TABLESPACE test_io_shared_stats_tblspc LOCATION '';
+WARNING:  tablespaces created by regression test cases should have names starting with "regress_"

CREATE TABLESPACE test_stats LOCATION '';
+WARNING:  tablespaces created by regression test cases should have names starting with "regress_"

(I already sent patches to address the omission in cirrus.yml)

1760             :                  errhint("Target must be \"archiver\", \"io\", \"bgwriter\", \"recovery_prefetch\",
or\"wal\".")));
 
=> Do you want to put these in order?

pgstat_get_io_op_name() isn't currently being hit by tests; actually,
it's completely unused.

FlushRelationBuffers() isn't being hit for local buffers.

> +      <entry><structname>pg_stat_io</structname><indexterm><primary>pg_stat_io</primary></indexterm></entry>
> +      <entry>
> +       One row per backend type, context, target object combination showing
> +       cluster-wide I/O statistics.

I suggest: "One row for each combination of of .."

> +   The <structname>pg_stat_io</structname> and
> +   <structname>pg_statio_</structname> set of views are especially useful for
> +   determining the effectiveness of the buffer cache.  When the number of actual
> +   disk reads is much smaller than the number of buffer hits, then the cache is
> +   satisfying most read requests without invoking a kernel call.

I would change this say "Postgres' own buffer cache is satisfying ..."

> However, these
> +   statistics do not give the entire story: due to the way in which
> +   <productname>PostgreSQL</productname> handles disk I/O, data that is not in
> +   the <productname>PostgreSQL</productname> buffer cache might still reside in
> +   the kernel's I/O cache, and might therefore still be fetched without

I suggest to refer to "the kernel's page cache"

> +   The <structname>pg_stat_io</structname> view will contain one row for each
> +   backend type, I/O context, and target I/O object combination showing
> +   cluster-wide I/O statistics. Combinations which do not make sense are
> +   omitted.

"..for each combination of .."

> +          <varname>io_context</varname> for a type of I/O operation. For

"for I/O operations"

> +          <literal>vacuum</literal>: I/O operations done outside of shared
> +          buffers incurred while vacuuming and analyzing permanent relations.

s/incurred/performed/

> +          <literal>bulkread</literal>: Qualifying large read I/O operations
> +          done outside of shared buffers, for example, a sequential scan of a
> +          large table.

I don't think it's correct to say that it's "outside of" shared-buffers.
s/Qualifying/Certain/

> +          <literal>bulkwrite</literal>: Qualifying large write I/O operations
> +          done outside of shared buffers, such as <command>COPY</command>.

Same

> +        Target object of an I/O operation. Possible values are:
> +       <itemizedlist>
> +        <listitem>
> +         <para>
> +          <literal>relation</literal>: This includes permanent relations.

It says "includes permanent" but what seems to mean is that it
"exclusive of temporary relations".

> +     <row>
> +      <entry role="catalog_table_entry">
> +       <para role="column_definition">
> +        <structfield>read</structfield> <type>bigint</type>
> +       </para>
> +       <para>
> +        Number of read operations in units of <varname>op_bytes</varname>.

This looks too much like it means "bytes".
Should say: "in number of blocks of size >op_bytes<"

But wait - is it the number of read operations "in units of op_bytes"
(which would means this already multiplied by op_bytes, and is in units
of bytes).

Or the "number of read operations" *of* op_bytes chunks ?  Which would
mean this is a "pure" number, and could be multipled by op_bytes to
obtain a size in bytes.

> +        Number of write operations in units of <varname>op_bytes</varname>.

> +        Number of relation extend operations in units of
> +        <varname>op_bytes</varname>.

same

> +        In <varname>io_context</varname> <literal>normal</literal>, this counts
> +        the number of times a block was evicted from a buffer and replaced with
> +        another block. In <varname>io_context</varname>s
> +        <literal>bulkwrite</literal>, <literal>bulkread</literal>, and
> +        <literal>vacuum</literal>, this counts the number of times a block was
> +        evicted from shared buffers in order to add the shared buffer to a
> +        separate size-limited ring buffer.

This never defines what "evicted" means.  Does it mea that a dirty
buffer was written out ?

> +        The number of times an existing buffer in a size-limited ring buffer
> +        outside of shared buffers was reused as part of an I/O operation in the
> +        <literal>bulkread</literal>, <literal>bulkwrite</literal>, or
> +        <literal>vacuum</literal> <varname>io_context</varname>s.

Maybe say "as part of a bulk I/O operation (bulkread, bulkwrite, or
vacuum)."

> +  <para>
> +   <structname>pg_stat_io</structname> can be used to inform database tuning.

> +   For example:
> +   <itemizedlist>
> +    <listitem>
> +     <para>
> +      A high <varname>evicted</varname> count can indicate that shared buffers
> +      should be increased.
> +     </para>
> +    </listitem>
> +    <listitem>
> +     <para>
> +      Client backends rely on the checkpointer to ensure data is persisted to
> +      permanent storage. Large numbers of <varname>files_synced</varname> by
> +      <literal>client backend</literal>s could indicate a misconfiguration of
> +      shared buffers or of checkpointer. More information on checkpointer

of *the* checkpointer

> +      Normally, client backends should be able to rely on auxiliary processes
> +      like the checkpointer and background writer to write out dirty data as

*the* bg writer

> +      much as possible. Large numbers of writes by client backends could
> +      indicate a misconfiguration of shared buffers or of checkpointer. More

*the* ckpointer

Should this link to various docs for checkpointer/bgwriter ?

Maybe the docs for ALTER/COPY/VACUUM/CREATE/etc should be updated to
refer to some central description of ring buffers.  Maybe something
should be included to the appendix.

-- 
Justin



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
Next
From: Thomas Munro
Date:
Subject: Re: [PATCH] psql: Add tab-complete for optional view parameters