Re: relfilenode statistics - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: relfilenode statistics
Date
Msg-id Zo9j69GhexDpeV4k@paquier.xyz
Whole thread Raw
In response to Re: relfilenode statistics  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses Re: relfilenode statistics
List pgsql-hackers
On Wed, Jul 10, 2024 at 01:38:06PM +0000, Bertrand Drouvot wrote:
> On Wed, Jul 10, 2024 at 03:02:34PM +0900, Michael Paquier wrote:
>> and I am troubled by the approach taken (mentioned down by you), but
>> that's invasive compared to how pgstats wants to be transparent with
>> its stats kinds.
>>
>> +   Oid         objoid;         /* object ID, either table or function
>> or tablespace. */
>> +   RelFileNumber relfile;      /* relfilenumber for RelFileLocator. */
>>  } PgStat_HashKey;
>>
>> This adds a relfilenode component to the central hash key used for the
>> dshash of pgstats, which is something most stats types don't care
>> about.
>
> That's right but that's an existing behavior without the patch as:
>
> PGSTAT_KIND_DATABASE does not care care about the objoid
> PGSTAT_KIND_REPLSLOT does not care care about the dboid
> PGSTAT_KIND_SUBSCRIPTION does not care care about the dboid
>
> That's 3 kinds out of the 5 non fixed stats kind.

I'd like to think that this is just going to increase across time.

>> That looks like the incorrect thing to do to me, particularly
>> seeing a couple of lines down that a stats kind is assigned so the
>> HashKey uniqueness is ensured by the KindInfo:
>> +   [PGSTAT_KIND_RELFILENODE] = {
>> +       .name = "relfilenode",
>
> You mean, just rely on kind, dboid and relfile to ensure uniqueness?

Or table OID for the objid, with a hardcoded number of past
relfilenodes stats stored, to limit bloating the dshash with too much
past stats.  See below.

> So, I think it makes sense to link the hashkey to all the RelFileLocator
> fields, means:
>
> dboid (linked to RelFileLocator's dbOid)
> objoid (linked to RelFileLocator's spcOid)
> relfile (linked to RelFileLocator's relNumber)

Hmm.  How about using the table OID as objoid, but store in the stats
of the new KindInfo an array of entries with the relfilenodes (current
and past, perhaps with more data than the relfilenode to ensure the
uniqueness tracking) and each of its stats?  The number of past
relfilenodes would be fixed, meaning that there would be a strict
control with the retention of the past stats.  When a table is
dropped, removing its relfilenode stats would be as cheap as when its
PGSTAT_KIND_RELATION is dropped.

> Yeah, I also thought about keeping a list of "previous" relfilenodes stats for a
> relation but that would lead to:
>
> 1. Keep previous relfilnode stats
> 2. A more complicated way to look at relation stats (as you said)
> 3. Extra memory usage
>
> I think the only reason "previous" relfilenode stats are needed is to provide
> accurate stats for the relation. Outside of this need, I don't think we would
> want to retrieve "individual" previous relfilenode stats in the past.
>
> That's why the POC patch "simply" copies the stats to the relation during a
> rewrite (before getting rid of the "previous" relfilenode stats).

Hmm.  Okay.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Speed up Hash Join by teaching ExprState about hashing
Next
From: shveta malik
Date:
Subject: Re: Conflict detection and logging in logical replication