Home > mailing lists

Re: relfilenode statistics - Mailing list pgsql-hackers

From	Bertrand Drouvot
Subject	Re: relfilenode statistics
Date	November 12 20:03:55
Msg-id	aRS9+yaGhjDslRWj@ip-10-97-1-34.eu-west-3.compute.internal Whole thread Raw
In response to	Re: relfilenode statistics (Michael Paquier <michael@paquier.xyz>)
Responses	Re: relfilenode statistics
List	pgsql-hackers

Tree view

Hi,

On Mon, Nov 10, 2025 at 05:53:45PM +0900, Michael Paquier wrote:
> On Sun, Nov 09, 2025 at 08:33:54AM +0900, Michael Paquier wrote:
> > Looking at this part of the patch set for now, not looked at the rest
> > yet.  This new stats_1.out is 2k lines long, introduced for the tests
> > related to rewrites as an effect of 2PC.  It seems to me that a split 
> > into a new stats_rewrite would be justified for this case, to reduce
> > the output duplication.
> 
> did a split into a new file, which should also be fine because we
> don't rely on a system-wide stats reset, then applied the result.

Thanks!

> The patch is spending a great deal of effort on three fronts:
> - making sure that the statistics are copied over after a relation
> rewrite.

Right, in 0003.

> - making sure that we assign a "correct" object ID, assigning
> the fields of RelFileLocator based on a relation ID.  Mapped and
> shared relations make the exercise a bit more difficult.  It would be
> nice to avoid this kind of duplication with other code paths that
> assign a RelFileLocator.

Are you referring to the new pgstat_reloid_to_relfilelocator() function?
If so, I'll try to avoid code duplication with other code paths as suggested.

> - Partitioned tables, where we don't have a relfilenode but we need to
> track statistics.  The patch relies on the relation oid to assign a
> key, as far as I've read.

Right. It's not doing that much in this area. It's needed so that things like
"last_analyze" on a partitioned table is populated (see "Ensure only the
partitioned table is analyzed" in vacuum.sql).

> Among the three points, the first one is the most invasive in the
> patch, it seems, and do we actually want to keep the stats across
> rewrites at all?

Not doing so would mean that all stats related to a relation will be lost after
a rewrite. I think that would be a major regression as compared to the current
behavior.

> The main reason of doing the relfilenode move 
> would be to rebuild these stats on a WAL-record basis because the
> relfile locator is the only thing we know in the startup process, and
> once rewritten the state of the data is different.

> relation_needs_vacanalyze() then cares about three fields:
> - Number of dead tuples, which would be 0 after a rewrite.
> - ins_since_vacuum, which would be 0 after a rewrite.
> - mod_since_analyze, for analyze, again 0.

> 
> I have not checked the recent autovacuum scheduling thread to see if
> this set changes there.
> 
> Are these numbers worth the effort of copying over at the end?

I think so because that would impact all the other relation's stats (not only
the ones linked to relation_needs_vacanalyze()).

> Was
> this particular point discussed?  I've seen this mentioned once here,
> but I am wondering what are the arguments in favor of copying the
> stats data versus not copying it across rewrites:
> https://www.postgresql.org/message-id/20240607031736.7izmr2yirznvidka%40awork3.anarazel.de

In favor of copying, I would say:

- no regression as compared to the current behavior. That means, for example,
not breaking DBA's activities/decisions based on the pg_stat_all_tables fields
after a rewrite.

- a rewrite is not changing the number of dead tuples, ins_since_vacuum and 
mod_since_analyze. So, if don't copy those, then we'd change the
relation_needs_vacanalyze() decision(s) as compared to the current one(s) for no
reasons (as a rewrite has no impact on those).

In favor of not copying, I would say make the code simpler.

I'm in favor of copying but open to different point of views.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

pgsql-hackers by date:

From: Arseniy Mukhin
Date: 12 November, 19:57:51
Subject: Re: Optimize LISTEN/NOTIFY

From: Álvaro Herrera
Date: 12 November, 20:09:40
Subject: Re: obsolete autovacuum comment

Re: relfilenode statistics - Mailing list pgsql-hackers

Previous

Next