On Sun, Oct 22, 2023 at 12:07:59PM -0700, Andres Freund wrote:
> Medium term, I think we need an approximate xid->"time of assignment" mapping that's continually maintained on the
primary.One of the things that'd show us to do is introduce a GUC to control the maximum effect of hs_feedback on the
primary,in a useful unit. Numbers of xids are not a useful unit (100k xids is forever on some systems, a few minutes at
beston others, the rate is not necessarily that steady when plpgsql exception handles are used, ...)
>
> It'd be useful to have such a mapping for other features too. E.g.
>
> - making it visible in pg_stat _activity how problematic a longrunning xact is - a 3 day old xact that doesn't have
anxid assigned and has a recent xmin is fine, it won't prevent vacuum from doing things. But a somewhat recent xact
thatstill has a snapshot from before an old xact was cancelled could be problematic.
>
> - turn pg_class.relfrozenxid into an understandable timeframe. It's a fair bit of mental effort to classify "370M
xidsold" into problem/fine (it's e.g. not a problem on a system with a high xid rate, on a big table that takes a bit
toa bit to vacuum).
>
> - using the mapping to compute an xid consumption rate IMO would be one building block for smarter AV scheduling.
Togetherwith historical vacuum runtimes it'd allow us to start vacuuming early enough to prevent hitting thresholds,
adaptpacing, prioritize between tables etc.
Big +1 to all of this.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com