Re: Make pg_stat_io view count IOs as bytes instead of blocks - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Make pg_stat_io view count IOs as bytes instead of blocks
Date
Msg-id CAAKRu_bFF_9H7ss8yEyi8FjEWEqN15Pjui8GxXr-Gdg3fTeasg@mail.gmail.com
Whole thread Raw
Responses Re: Make pg_stat_io view count IOs as bytes instead of blocks
List pgsql-hackers
On Wed, Sep 11, 2024 at 7:19 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Currently, in the pg_stat_io view, IOs are counted as blocks. However, there are two issues with this approach:
>
> 1- The actual number of IO requests to the kernel is lower because IO requests can be merged before sending the final
request.Additionally, it appears that all IOs are counted in block size.
 

I think this is a great idea. It will allow people to tune
io_combine_limit as you mention below.

> 2- Some IOs may not align with block size. For example, WAL read IOs are done in variable bytes and it is not
possibleto correctly show these IOs in the pg_stat_io view [1].
 

Yep, this makes a lot of sense as a solution.

> To address this, I propose showing the total number of IO requests to the kernel (as smgr function calls) and the
totalnumber of bytes in the IO. To implement this change, the op_bytes column will be removed from the pg_stat_io view.
Instead,the [reads | writes | extends] columns will track the total number of IO requests, and newly added [read |
write| extend]_bytes columns will track the total number of bytes in the IO.
 

smgr API seems like the right place for this.

> Example benefit of this change:
>
> Running query [2], the result is:
>
> ╔═══════════════════╦══════════╦══════════╦═══════════════╗
> ║    backend_type   ║  object  ║  context ║ avg_io_blocks ║
> ╠═══════════════════╬══════════╬══════════╬═══════════════╣
> ║   client backend  ║ relation ║ bulkread ║     15.99     ║
> ╠═══════════════════╬══════════╬══════════╬═══════════════╣
> ║ background worker ║ relation ║ bulkread ║     15.99     ║
> ╚═══════════════════╩══════════╩══════════╩═══════════════╝

I don't understand why background worker is listed here.

> You can rerun the same query [2] after setting io_combine_limit to 32 [3]. The result is:
>
> ╔═══════════════════╦══════════╦══════════╦═══════════════╗
> ║    backend_type   ║  object  ║  context ║ avg_io_blocks ║
> ╠═══════════════════╬══════════╬══════════╬═══════════════╣
> ║   client backend  ║ relation ║ bulkread ║     31.70     ║
> ╠═══════════════════╬══════════╬══════════╬═══════════════╣
> ║ background worker ║ relation ║ bulkread ║     31.60     ║
> ╚═══════════════════╩══════════╩══════════╩═══════════════╝
>
> I believe that having visibility into avg_io_[bytes | blocks] is valuable information that could help optimize
Postgres.

In general, for this example, I think it would be more clear if you
compared what visibility we have in pg_stat_io on master with what
visibility we have with your patch.

I like that you show how io_combine_limit can be tuned using this, but
I don't think the problem statement is clear nor is the full
narrative.

> CREATE TABLE t as select i, repeat('a', 600) as filler from generate_series(1, 10000000) as i;
> SELECT pg_stat_reset_shared('io');
> SELECT * FROM t WHERE i = 0;
> SELECT backend_type, object, context, TRUNC((read_bytes / reads / (SELECT current_setting('block_size')::numeric)),
2)as avg_io_blocks FROM pg_stat_io WHERE reads > 0;
 

I like that you calculate the avg_io_blocks, but I think it is good to
show the raw columns as well.

- Melanie

pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: per backend I/O statistics
Next
From: Robert Haas
Date:
Subject: Re: Self contradictory examining on rel's baserestrictinfo