Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id 20220331175007.lgx3huuc5gyjgk3e@alap3.anarazel.de
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
List pgsql-hackers
Hi,

On 2022-03-31 10:12:49 -0700, Peter Geoghegan wrote:
> On Wed, Mar 30, 2022 at 9:59 PM Andres Freund <andres@anarazel.de> wrote:
> > I'm not sure there's a proper bug on HEAD here. I think at worst it can delay
> > the horizon increasing a bunch, by falsely not using an aggressive vacuum when
> > we should have - might even be limited to a single autovacuum cycle.
> 
> So, to be clear: vac_update_relstats() never actually considered the
> new relfrozenxid value from its vacuumlazy.c caller to be "in the
> future"?

No, I added separate debug messages for those, and also applied your patch,
and it didn't trigger.

I don't immediately see how we could end up computing a frozenxid value that
would be problematic? The pgcform->relfrozenxid value will always be the
"local" value, which afaics can be behind the other database's value (and thus
behind the value from the relcache init file). But it can't be ahead, we have
the proper invalidations for that (I think).


I do think we should apply a version of the warnings you have (with a WARNING
instead of PANIC obviously). I think it's bordering on insanity that we have
so many paths to just silently fix stuff up around vacuum. It's like we want
things to be undebuggable, and to give users no warnings about something being
up.
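
To illustrate the kind of check being discussed, here is a minimal standalone sketch, not the actual vac_update_relstats() code. The names xid_t, xid_precedes() and check_new_relfrozenxid() are hypothetical, and special XIDs (invalid, bootstrap, frozen) are ignored for brevity. It assumes the usual modulo-2^32 comparison for normal XIDs and prints a warning to stderr where a silent fix-up would otherwise happen.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for TransactionId; special XIDs are not handled. */
typedef uint32_t xid_t;

/* Wraparound-aware "a is older than b", using a signed 32-bit difference. */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical helper: returns true if new_frozenxid should replace
 * old_frozenxid. Instead of silently fixing things up, it warns when
 * either value appears to be "in the future" relative to next_xid.
 */
static bool
check_new_relfrozenxid(xid_t old_frozenxid, xid_t new_frozenxid,
                       xid_t next_xid, const char *relname)
{
    if (xid_precedes(next_xid, new_frozenxid))
    {
        /* The proposed value is ahead of the next XID: refuse it, loudly. */
        fprintf(stderr,
                "WARNING: new relfrozenxid %u for \"%s\" is in the future (next xid %u)\n",
                (unsigned) new_frozenxid, relname, (unsigned) next_xid);
        return false;
    }

    if (xid_precedes(next_xid, old_frozenxid))
    {
        /* The stored value is bogus: overwrite it, but say so. */
        fprintf(stderr,
                "WARNING: existing relfrozenxid %u for \"%s\" is in the future (next xid %u), overwriting\n",
                (unsigned) old_frozenxid, relname, (unsigned) next_xid);
        return true;
    }

    /* Normal case: only ever move relfrozenxid forward. */
    return xid_precedes(old_frozenxid, new_frozenxid);
}
```

With these assumptions, a call such as check_new_relfrozenxid(400, 200, 300, "t") still accepts the new value but reports that the old one was ahead of the next XID, rather than correcting it without a trace.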


> It just looked that way to the failing assertion in
> vacuumlazy.c, because its own version of the original relfrozenxid was
> stale from the beginning? And so the worst problem is probably just
> that we don't use aggressive VACUUM when we really should in rare
> cases?

Yes, I think that's right.

Can you repro the issue with my recipe? FWIW, adding log_min_messages=debug5
and fsync=off made the crash trigger more quickly.

Greetings,

Andres Freund


