Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date	January 7, 2022 01:45:51
Msg-id	CAH2-Wzmiq1ZDLnSfePJHjLA-nYfjGyoQHRy6RARG50ek_-Ea_Q@mail.gmail.com
In response to	Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
List	pgsql-hackers

On Thu, Jan 6, 2022 at 12:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 17, 2021 at 9:30 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > Can we fully get rid of vacuum_freeze_table_age? Maybe even get rid of
> > vacuum_freeze_min_age, too? Freezing tuples is a maintenance task for
> > physical blocks, but we use logical units (XIDs).
>
> I don't see how we can get rid of these. We know that catastrophe will
> ensue if we fail to freeze old XIDs for a sufficiently long time ---
> where sufficiently long has to do with the number of XIDs that have
> been subsequently consumed.

I don't really disagree with anything you've said, I think. There are
a few subtleties here. I'll try to tease them apart.

I agree that we cannot do without something like vacrel->FreezeLimit
for the foreseeable future -- but the closely related GUC
(vacuum_freeze_min_age) is another matter. Although everything you've
said in favor of the GUC seems true, the GUC is not a particularly
effective (or natural) way of constraining the problem. It just
doesn't make sense as a tunable.

One obvious reason for this is that the opportunistic freezing stuff
is expected to be the thing that usually forces freezing -- not
vacuum_freeze_min_age, nor FreezeLimit, nor any other XID-based
cutoff. As you more or less pointed out yourself, we still need
FreezeLimit as a backstop mechanism. But the value of FreezeLimit can
just come from autovacuum_freeze_max_age/2 in all cases (no separate
GUC), or something along those lines. We don't particularly expect the
value of FreezeLimit to matter, at least most of the time. It should
only noticeably affect our behavior during anti-wraparound VACUUMs,
which become rare with the patch (e.g. my pgbench_accounts example
upthread). Most individual tables will never get even one
anti-wraparound VACUUM in practice.
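
To make the backstop idea concrete, here is a minimal Python sketch of
deriving the cutoff from autovacuum_freeze_max_age alone. It is an
illustration under stated assumptions, not code from the patch: the
function name backstop_freeze_limit and its parameters are hypothetical,
XIDs are modeled as plain integers modulo 2^32, and the clamp to the
first normal XID loosely mirrors what VACUUM does today when the
subtraction lands on a reserved XID.

```python
XID_MODULUS = 2**32
FIRST_NORMAL_XID = 3  # XIDs 0, 1 and 2 are reserved in PostgreSQL


def backstop_freeze_limit(oldest_xmin: int,
                          autovacuum_freeze_max_age: int = 200_000_000) -> int:
    """Hypothetical backstop cutoff: half of autovacuum_freeze_max_age
    behind oldest_xmin, with no separate vacuum_freeze_min_age input."""
    limit = (oldest_xmin - autovacuum_freeze_max_age // 2) % XID_MODULUS
    if limit < FIRST_NORMAL_XID:
        # The subtraction landed on a reserved XID; use the first normal one.
        limit = FIRST_NORMAL_XID
    return limit


# With the default autovacuum_freeze_max_age (200 million), the cutoff
# trails oldest_xmin by 100 million XIDs.
print(backstop_freeze_limit(300_000_000))
```

Anything older than the returned value would have to be frozen by the
backstop; everything newer would be left to the opportunistic freezing
logic.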

My big issue with vacuum_freeze_min_age is that it doesn't really work
well with the freeze map added in 9.6, which creates problems that I'm
trying to address by freezing early and so on. After all, HEAD (and
all stable branches) can easily set a page to all-visible (but not
all-frozen) in the VM, meaning that the page's tuples won't be
considered for freezing until the next aggressive VACUUM. This means
that vacuum_freeze_min_age is already frequently ignored by the
implementation -- it's conditioned on other things that are practically
impossible to predict.
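
The effect described above can be sketched with a toy model of page
skipping. This is a simplified illustration, not vacuumlazy.c: the Page
class and pages_scanned function are hypothetical names, and the model
ignores details such as the rule that only sufficiently long runs of
skippable pages are actually skipped.

```python
from dataclasses import dataclass


@dataclass
class Page:
    all_visible: bool    # VM all-visible bit
    all_frozen: bool     # VM all-frozen bit
    oldest_xid_age: int  # age of the oldest unfrozen XID on the page


def pages_scanned(pages, aggressive: bool):
    """Return the indexes of pages a VACUUM would visit in this toy model."""
    scanned = []
    for i, page in enumerate(pages):
        if page.all_frozen:
            continue  # nothing left to freeze, always skippable
        if page.all_visible and not aggressive:
            continue  # skipped, no matter how old its XIDs are
        scanned.append(i)
    return scanned


pages = [
    Page(all_visible=False, all_frozen=False, oldest_xid_age=10),
    Page(all_visible=True, all_frozen=False, oldest_xid_age=60_000_000),
    Page(all_visible=True, all_frozen=True, oldest_xid_age=0),
]
print(pages_scanned(pages, aggressive=False))  # page 1 is never considered
print(pages_scanned(pages, aggressive=True))   # page 1 is finally visited
```

In this model page 1 holds an XID older than the 50 million default of
vacuum_freeze_min_age, yet a non-aggressive VACUUM never looks at it.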

Curious about your thoughts on this existing issue with
vacuum_freeze_min_age. I am concerned about the "freezing cliff" that
it creates.

> So it's natural to decide whether or not
> we're going to wait for cleanup locks on pages on the basis of how old
> the XIDs they contain actually are.

I agree, but again, it's only a backstop. With the patch we'd have to
be rather unlucky to ever need to wait like this.

What are the chances that we keep failing to freeze an old XID from
one particular page, again and again? My testing indicates that it's a
negligible concern in practice (barring pathological cases with idle
cursors, etc).

> I think vacuum_freeze_min_age also serves a useful purpose: it
> prevents us from freezing data that's going to be modified again or
> even deleted in the near future. Since we can't know the future, we
> must base our decision on the assumption that the future will be like
> the past: if the page hasn't been modified for a while, then we should
> assume it's not likely to be modified again soon; otherwise not.

But the "freeze early" heuristics work a bit like that anyway. We only
freeze all the tuples on a heap page early when we would otherwise
have set the heap page to all-visible (but not all-frozen) in the VM.
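
As a rough sketch of that rule, assuming only what is described in this
message (the function name, its inputs, and the returned labels are all
hypothetical, and the real heuristics in the patch series may weigh
more factors):

```python
def freeze_decision(would_set_all_visible: bool,
                    has_xid_before_freeze_limit: bool) -> str:
    """Classify why (or whether) a heap page gets frozen in this toy model."""
    if has_xid_before_freeze_limit:
        # Backstop: an XID is old enough that freezing is mandatory.
        return "backstop"
    if would_set_all_visible:
        # Opportunistic: the page is about to be marked all-visible, so
        # freeze its tuples now and mark it all-frozen instead.
        return "early"
    # The page is still being modified; leave its tuples unfrozen.
    return "none"


print(freeze_decision(would_set_all_visible=True,
                      has_xid_before_freeze_limit=False))
```

A page that cannot be set all-visible is, by that very fact, one that
was modified recently, which is the same signal vacuum_freeze_min_age
tries to approximate with XID age.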

> If we
> knew the time at which the page had last been modified, it would be
> very reasonable to use that here - say, freeze the XIDs if the page
> hasn't been touched in an hour, or whatever. But since we lack such
> timestamps the XID age is the closest proxy we have.

XID age is a *terrible* proxy. The age of an XID in a tuple header may
advance quickly, even when nobody modifies the same table at all.
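
A small sketch of why that happens: XID age is measured against the
cluster-wide XID counter, so it grows whenever any transaction anywhere
consumes an XID. The helper names below are hypothetical, and the
arithmetic is a simplification (plain modulo 2^32, ignoring the
reserved XIDs that are skipped at wraparound).

```python
XID_MODULUS = 2**32


def xid_age(xid: int, next_xid: int) -> int:
    """Simplified age of an XID: distance from the next XID to be assigned."""
    return (next_xid - xid) % XID_MODULUS


def ages_after_activity(tuple_xid: int, next_xid: int,
                        xids_consumed_elsewhere: int):
    """Age of one untouched tuple before and after unrelated XID consumption."""
    before = xid_age(tuple_xid, next_xid)
    after = xid_age(tuple_xid, next_xid + xids_consumed_elsewhere)
    return before, after


# The tuple's table is never touched, yet 50 million XIDs consumed by
# other tables push its age past the vacuum_freeze_min_age default.
print(ages_after_activity(1_000, 1_100, 50_000_000))
```

The tuple's page has not changed at all between the two measurements;
only the rest of the cluster has.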

I concede that it is true that we are (in some sense) "gambling" by
freezing early -- we may end up freezing a tuple that we subsequently
update anyway. But aren't we also "gambling" by *not* freezing early?
By not freezing, we risk getting into "freezing debt" that will have
to be paid off in one ruinously large installment. I would much rather
"gamble" on something where we can tolerate consistently "losing" than
gamble on something where I cannot ever afford to lose (even if it's
much less likely that I'll lose during any given VACUUM operation).

Besides all this, I think that we have a rather decent chance of
coming out ahead in practice by freezing early. In practice the
marginal cost of freezing early is consistently pretty low.
Cost-control-driven (as opposed to need-driven) freezing is *supposed*
to be cheaper, of course. And like it or not, freezing is really just part of
the cost of storing data using Postgres (for the time being, at least).

> > The
> > risk mostly comes from how much total work we still need to do to
> > advance relfrozenxid. If the single old XID is quite old indeed (~1.5
> > billion XIDs), but there is only one, then we just have to freeze one
> > tuple to be able to safely advance relfrozenxid (maybe advance it by a
> > huge amount!). How long can it take to freeze one tuple, with the
> > freeze map, etc?
>
> I don't really see any reason for optimism here.

> IOW, the time that it takes to freeze that one tuple *in theory* might
> be small. But in practice it may be very large, because we won't
> necessarily get around to it on any meaningful time frame.

On second thought I agree that my specific example of 1.5 billion XIDs
was a little too optimistic. But 50 million XIDs (i.e. the
vacuum_freeze_min_age default) is too pessimistic. The important point
is that FreezeLimit could plausibly become nothing more than a
backstop mechanism, with the design from the patch series -- something
that typically has no effect on what tuples actually get frozen.

--
Peter Geoghegan
