Re: relfilenode statistics - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: relfilenode statistics |
Date | |
Msg-id | Zma01IBCIiBDZk4Z@ip-10-97-1-34.eu-west-3.compute.internal |
In response to | Re: relfilenode statistics (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: relfilenode statistics |
List | pgsql-hackers |
Hi,

On Fri, Jun 07, 2024 at 09:24:41AM -0400, Robert Haas wrote:
> On Thu, Jun 6, 2024 at 11:17 PM Andres Freund <andres@anarazel.de> wrote:
> > If we just want to keep prior stats upon a relation rewrite, we can just copy
> > the stats from the old relfilenode. Or we can decide that those stats don't
> > really make sense anymore, and start from scratch.
>
> I think we need to think carefully about what we want the user
> experience to be here. "Per-relfilenode stats" could mean "sometimes I
> don't know the relation OID so I want to use the relfilenumber
> instead, without changing the user experience" or it could mean "some
> of these stats actually properly pertain to the relfilenode rather
> than the relation so I want to associate them with the right object
> and that will affect how the user sees things." We need to decide
> which it is. If it's the former, then we need to examine whether the
> goal of hiding the distinction between relfilenode stats and relation
> stats from the user is in fact feasible. If it's the latter, then we
> need to make sure the whole patch reflects that design, which would
> include e.g. NOT copying stats from the old to the new relfilenode,
> and which would also include documenting the behavior in a way that
> will be understandable to users.

Thanks for sharing your thoughts!

Let's take the current heap_blks_read as an example: it currently survives
a relation rewrite and I guess we don't want to change the existing user
experience for it.

Now say we want to add "heap_blks_written" (like in this POC patch). Then I
think it makes sense for the user 1) to query this new stat from the same
place as the existing heap_blks_read, that is from pg_statio_all_tables, and
2) to have the same experience as far as the relation rewrite is concerned
(keep the previous stats).
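
As a rough illustration of why the rewrite matters here, the toy Python model below contrasts a counter keyed by relation OID (which stays the same across a rewrite) with a counter naively keyed by relfilenode (which changes on a rewrite). It is only a sketch: `ToyTable`, `read_blocks`, `rewrite`, and the OID and relfilenode values are hypothetical names and numbers chosen for illustration, not PostgreSQL code.

```python
# Toy model: counters keyed by relation OID survive a rewrite, while
# counters keyed by relfilenode start from zero on the new relfilenode.
# All names and values are hypothetical; this is not PostgreSQL code.
from collections import defaultdict


class ToyTable:
    def __init__(self, oid, relfilenode):
        self.oid = oid                  # stable across rewrites
        self.relfilenode = relfilenode  # replaced on every rewrite


stats_by_oid = defaultdict(int)          # behaves like heap_blks_read today
stats_by_relfilenode = defaultdict(int)  # naive per-relfilenode counter


def read_blocks(table, nblocks):
    """Account the same I/O under both keying schemes."""
    stats_by_oid[table.oid] += nblocks
    stats_by_relfilenode[table.relfilenode] += nblocks


def rewrite(table, new_relfilenode):
    """A rewrite keeps the OID but assigns a new relfilenode."""
    table.relfilenode = new_relfilenode


t = ToyTable(oid=16384, relfilenode=16384)
read_blocks(t, 10)
rewrite(t, new_relfilenode=16390)
read_blocks(t, 5)

print(stats_by_oid[t.oid])                  # 15: history is kept
print(stats_by_relfilenode[t.relfilenode])  # 5: counter restarted
```

Without some extra step at rewrite time, a stat stored per relfilenode would show the second behavior, which is not what users see today for heap_blks_read.
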
To achieve the rewrite behavior we could:

1) copy the stats from the OLD relfilenode to the relation (like in the POC patch)
2) copy the stats from the OLD relfilenode to the NEW one (could be in a dedicated field)

The pros of 1) are that the behavior is consistent with the current
heap_blks_read and that the user could still see the current relfilenode
stats (through a new API) if they want to.

> In my experience, the worst thing you can do in cases like this is be
> somewhere in the middle. Then you tend to end up with stuff like: the
> difference isn't supposed to be something that the user knows or cares
> about, except that they do have to know and care because you haven't
> thoroughly covered up the deception, and often they have to reverse
> engineer the behavior because you didn't document what was really
> happening because you imagined that they wouldn't notice.

My idea was to move all that is in pg_statio_all_tables to relfilenode stats and:

1) add new stats to pg_statio_all_tables (like heap_blks_written),
2) ensure the user can still retrieve the stats from pg_statio_all_tables in
   such a way that they survive a rewrite,
3) provide dedicated APIs to retrieve relfilenode stats, but only for the
   current relfilenode,
4) document this behavior.

This is what the POC patch is doing for heap_blks_written (we would need to do
the same for heap_blks_read and friends), except for the documentation part.

What do you think?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
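
The two rewrite options discussed in this message can be sketched with a toy Python model. This is a minimal sketch under stated assumptions, not the POC patch's code: `RelfilenodeStats`, `RelationStats`, the `carried_over` field, and the `user_visible_*` helpers are hypothetical names, and the way the user-visible value is computed is an assumption made for illustration.

```python
# Toy model of the two options for keeping stats across a rewrite.
# Hypothetical names throughout; not the POC patch's actual code.
from dataclasses import dataclass


@dataclass
class RelfilenodeStats:
    blks_written: int = 0  # written through the current relfilenode
    carried_over: int = 0  # option 2 only: inherited from old relfilenodes


@dataclass
class RelationStats:
    blks_written: int = 0  # option 1 only: folded in from old relfilenodes


def rewrite_option1(rel, old):
    """Option 1: copy the old relfilenode's counter to the relation entry."""
    rel.blks_written += old.blks_written
    return RelfilenodeStats()  # the new relfilenode starts from zero


def rewrite_option2(old):
    """Option 2: copy the old totals into a dedicated field of the new entry."""
    return RelfilenodeStats(carried_over=old.carried_over + old.blks_written)


def user_visible_option1(rel, cur):
    # Assumed value a pg_statio_all_tables-like view would report.
    return rel.blks_written + cur.blks_written


def user_visible_option2(cur):
    return cur.carried_over + cur.blks_written


# Same workload under both options: 10 blocks, a rewrite, then 5 blocks.
rel = RelationStats()
cur1 = rewrite_option1(rel, RelfilenodeStats(blks_written=10))
cur1.blks_written += 5

cur2 = rewrite_option2(RelfilenodeStats(blks_written=10))
cur2.blks_written += 5

print(user_visible_option1(rel, cur1), cur1.blks_written)  # 15 5
print(user_visible_option2(cur2), cur2.blks_written)       # 15 5
```

In this model both options report the same rewrite-surviving total, and both still expose the current relfilenode's own counter separately. The difference is where the history lives: on the relation entry in option 1, on the new relfilenode entry in option 2.
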