Re: relfilenode statistics - Mailing list pgsql-hackers

From Andres Freund
Subject Re: relfilenode statistics
Date
Msg-id zferux2jlbhqymubzhpubfrkjzhzxzguq4eprtycojtif5vbqh@2t7cu2teyqmi
Whole thread Raw
In response to Re: relfilenode statistics  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,

On 2025-12-16 16:33:17 +0900, Michael Paquier wrote:
> On Mon, Dec 15, 2025 at 12:48:25PM -0500, Andres Freund wrote:
> > I don't think this is true as stated. Two reasons:
> > 
> > 1) This afaict guarantees that the relfilenode will not clash with oids, but
> >    it does *NOT* guarantee that it does not clash with other relfilenodes
> > 
> > 2) Note that GetNewRelFileNumber() does *NOT* check for conflicts when
> >    creating a new relfilenode for an existing relation:
> >  * If the relfilenumber will also be used as the relation's OID, pass the
> >  * opened pg_class catalog, and this routine will guarantee that the result
> >  * is also an unused OID within pg_class.  If the result is to be used only
> >  * as a relfilenumber for an existing relation, pass NULL for pg_class.
> 
> FWIW, I am also still troubled by the part of the proposed patch set
> where we are trying to hide the idea of a partitioned table has a
> relfilenode set by using its relid instead in the key for the data.
> This leads to a huge amount of complexity in the patch, mainly to
> store data for autovacuum that we do not need at the end:
> - autovacuum discards partitioned tables in do_autovacuum(), so the
> stats related to partitioned tables that we need to select the
> relations does not matter.

I feel like that's an implementation wart that we ought to fix. It's not
infrequently a problem that we don't automatically analyze partitioned
tables. Weren't there even a couple threads on that on the list in the last
weeks?


> - manual vacuums may include partitioned tables to extract its
> partitions, vacuum_rel() at the end discarding them.  Well, stats
> don't matter anyway.
> 
> We only need to attach three fields to let autovacuum know if a
> relation needs to run or not: dead_tuples, ins_since_vacuum,
> mod_since_analyze.

That may be true for autovacuum today, but I don't see any reason for
live_tuples, tuples_inserted etc to be inaccurate after a failover.


> Most the fields of PgStat_StatTabEntry make sense
> only for tables, few are required by indexes for pg_stat_all_indexes.
> Some fields actually make sense because they refer to on-disk files,
> mostly for pg_statio_all_tables (blocks_fetched, blocks_hit).
> 
> Hence, why don't we split PgStat_StatTabEntry into three things from
> the start, even if it means to duplicate some of them?  Say:
> - Table fields: includes [auto]vacuum/analyze data, block fields,
> fields of pg_stat_all_tables.

What do you mean with "block fields"? pg_statio_all_tables? If so, what's the
point of including them here, rather than in the relfilenode fields?


> - Index fields: no need for the [auto]vacuum/analyze time and counts,
> block fields, pg_stat_all_indexes fields.

I think we actually should populate the [auto]vac fields for indexes, right
now it's impossible to figure out from stats whether indexes are frequently
scanned as part of vacuum or not.


> - Relfilenode fields: dead_tuples, ins_since_vacuum and
> mod_since_analyze.  Does not apply to partitioned tables and indexes,
> only applies to tables.  Provides a clean split, embrace the fact that
> these are the only three fields we need to worry about during
> recovery.

I think we really ought to populate not just these during recovery, but also
at least n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd, n_live_tup.

I don't understand why we would want to only populate these three fields?


I'm not against splitting the index fields off, but it seems pretty orthogonal
to what we're discussing here.  If we were to split of index stats into a
separate stat, why wouldn't we keep the statio fields in the relfilenode
stats, since they're obviously intimately tied to that?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Paul A Jungwirth
Date:
Subject: Re: amcheck: support for GiST
Next
From: Dilip Kumar
Date:
Subject: Replace is_publishable_class() with relispublishable column in pg_class