Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id CAH2-Wz=zNsPN_kOHiV852tj7bqGUCXZPVbCgo9iwxgQNmL4WjQ@mail.gmail.com
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: New strategies for freezing, advancing relfrozenxid early  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Mon, Oct 3, 2022 at 5:41 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I like this general approach. The existing GUCs have evolved in a
> confusing way.

Thanks for taking a look!

> > For the most part the
> > skipping/freezing strategy stuff has a good sense of what matters
> > already, and shouldn't need to be guided very often.
>
> I'd like to know more clearly where manual VACUUM fits in here. Will it
> use a more aggressive strategy than an autovacuum, and how so?

There is no change whatsoever in the relationship between manually
issued VACUUMs and autovacuums. We interpret autovacuum_freeze_max_age
in almost the same way as HEAD. The only detail that's changed is that
we almost always interpret "freeze_table_age" as "just use
autovacuum_freeze_max_age" in the patch, rather than as
"vacuum_freeze_table_age, though never more than 95% of
autovacuum_freeze_max_age", as on HEAD.
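
To make the difference concrete, here is a minimal sketch of the two
interpretations described above. The function names are hypothetical, and
the handling of an explicit non-negative setting under the patch is an
assumption made for illustration, not something taken from the patch.

```python
def freeze_table_age_head(vacuum_freeze_table_age, autovacuum_freeze_max_age):
    """HEAD rule as described above: use vacuum_freeze_table_age, though
    never more than 95% of autovacuum_freeze_max_age."""
    cap = autovacuum_freeze_max_age * 95 // 100  # integer math avoids float rounding
    return min(vacuum_freeze_table_age, cap)


def freeze_table_age_patch(vacuum_freeze_table_age, autovacuum_freeze_max_age):
    """Patch rule as described above: -1 means
    "just use autovacuum_freeze_max_age"."""
    if vacuum_freeze_table_age < 0:
        return autovacuum_freeze_max_age
    # Assumption (not from the patch): an explicit setting is still honored,
    # but never allowed to exceed autovacuum_freeze_max_age.
    return min(vacuum_freeze_table_age, autovacuum_freeze_max_age)
```

With the stock settings (vacuum_freeze_table_age = 150 million,
autovacuum_freeze_max_age = 200 million), the HEAD rule yields 150 million,
while the patch's -1 setting yields 200 million.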

Maybe this would be less confusing if I went just a bit further, and
totally got rid of the concept that vacuumlazy.c calls aggressive
VACUUM on HEAD -- then there really would be exactly one kind of
VACUUM, just like before the visibility map was first introduced back
in 2009. This would relegate antiwraparound-ness to just another
condition that autovacuum.c uses to launch VACUUMs.

Giving VACUUM the freedom to choose where and how to freeze and
advance relfrozenxid based on both costs and benefits is key here.
Anything that needlessly imposes a rigid rule on vacuumlazy.c
undermines that -- it ties VACUUM's hands. The user can still
influence many of the details using high-level GUCs that work at the
table level, rather than GUCs that can only work at the level of
individual VACUUM operations (that leaves too much to chance). Users
shouldn't want or need to micromanage VACUUM.

> > The patch relegates vacuum_freeze_table_age to a compatibility
> > option,
> > making its default -1, meaning "just use autovacuum_freeze_max_age".
>
> The purpose of vacuum_freeze_table_age seems to be that, if you
> regularly issue VACUUM commands, it will prevent a surprise
> antiwraparound vacuum. Is that still the case?

The user really shouldn't need to do anything with
vacuum_freeze_table_age at all now. It's mostly just a way for the
user to optionally insist on advancing relfrozenxid via an
antiwraparound/aggressive VACUUM -- like in a manual VACUUM FREEZE.
Even VACUUM FREEZE shouldn't be necessary very often.

> Maybe it would make more sense to have vacuum_freeze_table_age be a
> fraction of autovacuum_freeze_max_age, and be treated as a maximum so
> that other intelligence might kick in and freeze sooner?

That's kind of how the newly improved skipping strategy stuff works.
It gives some weight to table age as one additional factor (based on
how close the table's age is to autovacuum_freeze_max_age or its Multi
equivalent).

If table age is (say) 60% of autovacuum_freeze_max_age, then VACUUM
should be "60% as aggressive" as a conventional
aggressive/antiwraparound autovacuum would be. What that actually
means is that the VACUUM will tend to prefer advancing relfrozenxid
the closer we get to the cutoff, gradually giving less and less
consideration to putting off work as we get closer and closer. When we
get to 100% then we'll definitely advance relfrozenxid (via a
conventional aggressive/antiwraparound VACUUM).

The precise details are unsettled, but I'm pretty sure that the
general idea is sound. Basically we're replacing
vacuum_freeze_table_age with a dynamic, flexible version of the same
basic idea. Now we don't just care about the need to advance
relfrozenxid (benefits), though; we also care about costs.
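
As a toy illustration only (the precise details are unsettled, so this is
not the patch's logic), the idea of weighing table age against cost could
look something like the sketch below. The names, the extra_scan_fraction
cost measure, and the linear rule are all assumptions invented for this
example.

```python
def age_fraction(table_age, autovacuum_freeze_max_age):
    """How far the table's age is toward the cutoff, clamped to 1.0."""
    return min(table_age / autovacuum_freeze_max_age, 1.0)


def should_advance_relfrozenxid(table_age, autovacuum_freeze_max_age,
                                extra_scan_fraction):
    """Decide whether to pay the extra cost of advancing relfrozenxid.

    extra_scan_fraction is a hypothetical cost measure: the share of the
    table's pages that VACUUM would have to scan in addition to the pages
    it must scan anyway.
    """
    frac = age_fraction(table_age, autovacuum_freeze_max_age)
    if frac >= 1.0:
        # At 100% of the cutoff, relfrozenxid is always advanced.
        return True
    # Toy linear rule (an assumption): the older the table, the more
    # extra work we are willing to take on right now.
    return extra_scan_fraction <= frac
```

Under this toy rule, a table at 60% of autovacuum_freeze_max_age would
advance relfrozenxid when doing so means scanning up to 60% extra pages,
and would put the work off when it costs more than that.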

> >  This makes things less confusing for users and hackers.
>
> It may take an adjustment period ;-)

Perhaps this is more of an aspiration at this point.  :-)

> Yes, it's clearing things up, but it's still a complex problem.
> There's:
>
>  a. xid age vs the actual amount of deferred work to be done
>  b. advancing relfrozenxid vs skipping all-visible pages
>  c. difficulty in controlling reasonable behavior (e.g.
>     vacuum_freeze_min_age often being ignored, freezing
>     individual tuples rather than pages)
>
> Your first email described the motivation in terms of (a), but the
> patches seem more focused on (b) and (c).

I think that all 3 areas are deeply and hopelessly intertwined.

For example, vacuum_freeze_min_age is effectively ignored in many
important cases right now precisely because we senselessly skip
all-visible pages with unfrozen tuples, no matter what -- the problem
actually comes from the visibility map, which vacuum_freeze_min_age
predates by quite a few years. So how can you possibly address the
vacuum_freeze_min_age issues without also significantly revising VM
skipping behavior? They're practically the same problem!
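
A simplified model of the interaction, with a hypothetical Page type
standing in for a heap page and its visibility map bits. It ignores the
real skipping thresholds in vacuumlazy.c, and only shows how an
all-visible page with old unfrozen tuples escapes vacuum_freeze_min_age
in a non-aggressive VACUUM.

```python
from dataclasses import dataclass


@dataclass
class Page:
    all_visible: bool             # VM all-visible bit
    all_frozen: bool              # VM all-frozen bit
    oldest_unfrozen_xid_age: int  # 0 when the page has no unfrozen tuples


def pages_scanned(pages, aggressive):
    """Return indexes of pages VACUUM scans in this simplified model."""
    scanned = []
    for i, page in enumerate(pages):
        if page.all_frozen:
            continue  # nothing left to freeze; always safe to skip
        if page.all_visible and not aggressive:
            continue  # skipped no matter how old its unfrozen tuples are
        scanned.append(i)
    return scanned


def missed_freezing(pages, vacuum_freeze_min_age, aggressive=False):
    """Pages whose tuples are old enough to freeze but are never visited."""
    scanned = set(pages_scanned(pages, aggressive))
    return [i for i, page in enumerate(pages)
            if i not in scanned
            and not page.all_frozen
            and page.oldest_unfrozen_xid_age >= vacuum_freeze_min_age]
```

Given one all-frozen page, one all-visible page whose oldest unfrozen
tuple is 80 million XIDs old, and one ordinary page, a non-aggressive
VACUUM with vacuum_freeze_min_age = 50 million never visits the second
page, so the setting has no effect there.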

And once you've fixed vacuum_freeze_min_age (and skipping), how can
you then pass up the opportunity to advance relfrozenxid early when
doing so will require only a little extra work? I'm going to regress
some cases if I simply ignore the relfrozenxid factor. Finally, the
debt issue is itself a consequence of the other problems.

Perhaps this is an example of the inventor's paradox, where the more
ambitious plan may actually be easier and more likely to succeed than
a more limited plan that just focuses on one immediate problem. All of
these problems seem to be a result of adding accretion after accretion
over the years. A high-level rethink is well overdue. We need to
return to basics.

> >  The skipping strategy decision making process isn't
> > particularly complicated, but it now looks more like an optimization
> > problem of some kind or other.
>
> There's another important point here, which is that it gives an
> opportunity to decide to freeze some all-visible pages in a given round
> just to reduce the deferred work, without worrying about advancing
> relfrozenxid.

True. Though I think that a strong bias in the direction of advancing
relfrozenxid by some amount (not necessarily by very many XIDs) still
makes sense, especially when we're already freezing aggressively.

-- 
Peter Geoghegan


