Re: Eager page freeze criteria clarification - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Eager page freeze criteria clarification
Date
Msg-id CAH2-WzkWA9FDN-igtuWj7SYVC_cy_brRNbOJh8nW4-K+UUUM+w@mail.gmail.com
Whole thread Raw
In response to Re: Eager page freeze criteria clarification  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Eager page freeze criteria clarification
List pgsql-hackers
On Wed, Sep 6, 2023 at 7:46 AM Robert Haas <robertmhaas@gmail.com> wrote:
> It's interesting that when you raised the threshold the rate of
> vacuuming dropped to zero. I haven't seen that behavior. pgbench does
> rely very, very heavily on HOT-pruning, so VACUUM only cleans up a
> small percentage of the dead tuples that get created. However, it's
> still needed for index vacuuming, and I'm not sure what would cause it
> not to trigger for that purpose.

This was a case where index vacuuming was never required. It's just a
simple and easy to recreate example of what I think of as a more
general problem.

> > > And I'm not sure why we
> > > want to do that. If the table is being vacuumed a lot, it's probably
> > > also being modified a lot, which suggests that we ought to be more
> > > cautious about freezing, rather than the reverse.
> >
> > Why wouldn't it be both things at the same time, for the same table?
>
> Both of what things?

Why wouldn't we expect a table to have some pages that ought to be
frozen right away, and others where freezing should in theory be put
off indefinitely? I think that that's very common.

> > Why not also avoid setting pages all-visible? The WAL records aren't
> > too much smaller than most freeze records these days -- 64 bytes on
> > most systems. I realize that the rules for FPIs are a bit different
> > when page-level checksums aren't enabled, but fundamentally it's the
> > same situation. No?
>
> It's an interesting point. AFAIK, whether or not page-level checksums
> are enabled doesn't really matter here. But it seems fair to ask - if
> freezing is too aggressive, why is setting all-visible not also too
> aggressive? I don't have a good answer to that question right now.

As you know, I am particularly concerned about the tendency of
unfrozen all-visible pages to accumulate without bound (at least
without bound expressed in physical units such as pages). The very
fact that pages are being set all-visible by VACUUM can be seen as a
part of a high-level systemic problem -- a problem that plays out over
time, across multiple VACUUM operations. So even if the cost of
setting pages all-visible happened to be much lower than the cost of
freezing (which it isn't), setting pages all-visible without freezing
has unique downsides.

If VACUUM freezes too aggressively, then (pretty much by definition)
we can be sure that the next VACUUM will scan the same pages -- there
may be some scope for VACUUM to "learn from its mistake" when we err
in the direction of over-freezing. But when VACUUM makes the opposite
mistake (doesn't freeze when it should have), it won't scan those same
pages again for a long time, by design. It therefore has no plausible
way of "learning from its mistakes" before it becomes an extremely
expensive and painful lesson (which happens whenever the next
aggressive VACUUM takes place). This is in large part a consequence of
the way that VACUUM dutifully sets pages all-visible whenever
possible. That behavior interacts badly with many workloads, over
time.

VACUUM simply ignores such second-order effects. Perhaps it would be
practical to address some of the issues in this area by avoiding
setting pages all-visible without freezing them, in some general
sense. That at least creates a kind of symmetry between mistakes in
the direction of under-freezing and mistakes in the direction of
over-freezing. That might enable VACUUM to course-correct in either
direction.

Melanie is already planning on combining the WAL records (PRUNE,
FREEZE_PAGE, and VISIBLE). Perhaps that'll weaken the argument for
setting unfrozen pages all-visible even further.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Can a role have indirect ADMIN OPTION on another role?
Next
From: Bruce Momjian
Date:
Subject: Re: Debian 12 gcc warning