Hi,
On 2022-12-28 15:13:23 -0500, Tom Lane wrote:
> [ redirecting to -hackers because patch attached ]
>
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> That is a really good point. How about teaching VACUUM to track
> >> the oldest original relfrozenxid and relminmxid among the table(s)
> >> it processed, and skip vac_update_datfrozenxid unless at least one
> >> of those matches the database's values? For extra credit, also
> >> skip if we didn't successfully advance the source rel's value.
>
> > Hmm. I think that that would probably work.
>
> I poked into that idea some more and concluded that getting VACUUM to
> manage it behind the user's back is not going to work very reliably.
> The key problem is explained by this existing comment in autovacuum.c:
>
> * Even if we didn't vacuum anything, it may still be important to do
> * this, because one indirect effect of vac_update_datfrozenxid() is to
> * update ShmemVariableCache->xidVacLimit. That might need to be done
> * even if we haven't vacuumed anything, because relations with older
> * relfrozenxid values or other databases with older datfrozenxid values
> * might have been dropped, allowing xidVacLimit to advance.
>
> That is, if the table that's holding back datfrozenxid gets dropped
> between VACUUM runs, VACUUM would never think that it might have
> advanced the global minimum.
I wonder if a less aggressive version of this idea might still work. Perhaps
we could use ShmemVariableCache->latestCompletedXid or
ShmemVariableCache->nextXid to skip at least some updates?
Obviously this isn't going to help if there's a lot of concurrent activity,
but the case of just running vacuumdb -a might be substantially improved.
Separately I wonder if it's worth micro-optimizing vac_update_datfrozenxid() a
bit. I e.g. see a noticable speedup bypassing systable_getnext() and using
heap_getnext(). It's really too bad that we want to check for "in the future"
xids, otherwise we could use a ScanKey to filter at a lower level.
Greetings,
Andres Freund