Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id CAH2-Wzm4+b7Bpc2JL_BjQ3SzgOXPWTfzuewCgakLb27ppcvTVQ@mail.gmail.com
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Feb 7, 2022 at 12:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Feb 7, 2022 at 11:43 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > > That's because, if VACUUM is only ever getting triggered by XID
> > > age advancement and not by bloat, there's no opportunity for your
> > > patch set to advance relfrozenxid any sooner than we're doing now.
> >
> > We must distinguish between:
> >
> > 1. "VACUUM is fundamentally never going to need to run unless it is
> > forced to, just to advance relfrozenxid" -- this applies to tables
> > like the stock and customers tables from the benchmark.
> >
> > and:
> >
> > 2. "VACUUM must sometimes run to mark newly appended heap pages
> > all-visible, and maybe to also remove dead tuples, but not that often
> > -- and yet we currently only get expensive and inconveniently timed
> > anti-wraparound VACUUMs, no matter what" -- this applies to all the
> > other big tables in the benchmark, in particular to the orders and
> > order lines tables, but also to simpler cases like pgbench_history.
>
> It's not really very understandable for me when you refer to the way
> table X behaves in Y benchmark, because I haven't studied that in
> enough detail to know. If you say things like insert-only table, or a
> continuous-random-updates table, or whatever the case is, it's a lot
> easier to wrap my head around it.

What I've called category 2 tables are the vast majority of big tables
in practice. They include pure append-only tables, as well as tables
that grow continually from inserts while also receiving some updates.
The point of the TPC-C orders + order lines examples was to show how
broad the category really is, and how a mixture of inserts and bloat
from updates on a single table confuses the implementation in general.

> > Does that make sense? It's pretty subtle, admittedly, and you no doubt
> > have (very reasonable) concerns about the extremes, even if you accept
> > all that. I just want to get the general idea across here, as a
> > starting point for further discussion.
>
> Not really. I think you *might* be saying tables which currently get
> only wraparound vacuums will end up getting other kinds of vacuums
> with your patch because things will improve enough for other tables in
> the system that they will be able to get more attention than they do
> currently.

Yes, I am.

> But I'm not sure I am understanding you correctly, and even
> if I am I don't understand why that would be so, and even if it is I
> think it doesn't help if essentially all the tables in the system are
> suffering from the problem.

When I say "relfrozenxid advancement has been qualitatively improved
by the patch", what I mean is that the rate of relfrozenxid
advancement is now much closer to the theoretically optimal rate for
our current design: one with freezing, with 32-bit XIDs, and with the
existing invariants for freezing.

Consider the extreme case, and generalize. The effect is most obvious
in the simple append-only table case. The final relfrozenxid is very
close to OldestXmin (only tiny noise-level differences appear),
regardless of XID consumption by the system in general, and even
within the append-only table in particular. Other cases are somewhat
trickier, but have roughly the same quality, to a surprising degree.
For the first time, lots of things that never really should have
affected relfrozenxid to begin with no longer do.
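
To make the append-only case concrete, here is a rough toy model. It
is not code from the patch: the function names are made up for
illustration, XIDs are treated as plain integers (so 32-bit wraparound
comparison is ignored), and the "legacy" rule is a simplification that
assumes relfrozenxid is just set to OldestXmin minus
vacuum_freeze_min_age. The "tracked" rule assumes VACUUM starts from
OldestXmin and only moves back as far as the oldest XID it leaves
unfrozen in the table.

```python
# Toy model only. XIDs are plain ints; real XID comparison is modulo 2^32.

def legacy_relfrozenxid(oldest_xmin, freeze_min_age):
    """Simplified older rule: relfrozenxid becomes the freeze cutoff."""
    return oldest_xmin - freeze_min_age


def tracked_relfrozenxid(remaining_unfrozen_xids, oldest_xmin):
    """Hypothetical tracked rule: start at OldestXmin, then ratchet back
    to the oldest XID that is still unfrozen after VACUUM finishes."""
    return min([oldest_xmin, *remaining_unfrozen_xids])


oldest_xmin = 1_000_000_000
freeze_min_age = 50_000_000  # default vacuum_freeze_min_age

# Append-only table: the only unfrozen XIDs left are very recent ones.
recent_xids = [999_999_990, 1_000_000_005, 1_000_000_042]

legacy = legacy_relfrozenxid(oldest_xmin, freeze_min_age)
tracked = tracked_relfrozenxid(recent_xids, oldest_xmin)

print("legacy lag behind OldestXmin: ", oldest_xmin - legacy)
print("tracked lag behind OldestXmin:", oldest_xmin - tracked)
```

In this sketch the tracked value trails OldestXmin by only a handful
of XIDs, however many XIDs the rest of the system consumed, while the
legacy value always trails by the full freeze_min_age.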

-- 
Peter Geoghegan


