Re: Track IO times in pg_stat_io - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: Track IO times in pg_stat_io
Date
Msg-id 7b450eef-8717-ce4a-a9c8-ccab24ecd89a@gmail.com
In response to Re: Track IO times in pg_stat_io  (Andres Freund <andres@anarazel.de>)
Responses Re: Track IO times in pg_stat_io  (Andres Freund <andres@anarazel.de>)
Re: Track IO times in pg_stat_io  ("Imseih (AWS), Sami" <simseih@amazon.com>)
List pgsql-hackers
Hi,

On 3/7/23 7:47 PM, Andres Freund wrote:
> On 2023-03-07 13:43:28 -0500, Melanie Plageman wrote:
>>> Now I've a second thought: what do you think about resetting the related number
>>> of operations and *_time fields when enabling/disabling track_io_timing? (And mention it in the doc).
>>>
>>> That way it'd prevent bad interpretation (at least as far as the time per operation metrics are concerned).
>>>
>>> Thinking that way as we'd lose some (most?) benefits of the new *_time columns
>>> if one can't "trust" their related operations and/or one is not sampling pg_stat_io frequently enough (to discard the samples
>>> where the track_io_timing changes occur).
>>>
>>> But well, resetting the operations could also lead to bad interpretation about the operations...
>>>
>>> Not sure about which approach I like the most yet, what do you think?
>>
>> Oh, this is an interesting idea. I think you are right about the
>> synchronization issues making the statistics untrustworthy and, thus,
>> unusable.
> 
> No, I don't think we can do that. It can be enabled on a per-session basis.

Oh right. So it's even less clear to me how one would make use of those new *_time fields, given that:

- pg_stat_io is "global" across all sessions. So, even if one session is doing some "testing" and needs to turn track_io_timing on, it cannot even be sure that the timings reflect only its own testing (as other sessions may have turned it on too).

- There is the risk mentioned above of bad interpretations for the "time per operation" metrics.

- Even if pg_stat_io is sampled frequently enough, one does not know which samples contain track_io_timing changes (at the cluster or session level).
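
To make the "time per operation" concern concrete, here is a minimal sketch in Python (not PostgreSQL code). It assumes a counter that always increments the operation count but only accumulates time while timing is enabled; the class and its names (IoCounter, record, naive_avg_ms) are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class IoCounter:
    """Hypothetical stand-in for one pg_stat_io row (reads only)."""
    reads: int = 0
    read_time_ms: float = 0.0

    def record(self, elapsed_ms: float, timing_enabled: bool) -> None:
        # The operation is always counted ...
        self.reads += 1
        # ... but its duration is only accumulated while timing is on.
        if timing_enabled:
            self.read_time_ms += elapsed_ms

    def naive_avg_ms(self) -> float:
        # What a reader of the view would compute: total time / total ops.
        return self.read_time_ms / self.reads if self.reads else 0.0


counter = IoCounter()
# 100 reads of 2 ms each with timing off, then 100 more with timing on.
for _ in range(100):
    counter.record(2.0, timing_enabled=False)
for _ in range(100):
    counter.record(2.0, timing_enabled=True)

print(counter.reads, counter.read_time_ms, counter.naive_avg_ms())
```

In this sketch every read took 2 ms, yet the naive average comes out as 1 ms, because half of the counted reads contributed no time.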
 

> I think we simply shouldn't do anything here. This is a pre-existing issue.

Oh, never thought about it. You mean like for pg_stat_database.blks_read and pg_stat_database.blk_read_time, for example?

> I also think that losing stats when turning track_io_timing on/off would not be
> helpful.
> 

Yeah, I'm not 100% sure either, as that would lead to other possible bad interpretations.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


