Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New IndexAM API controlling index vacuum strategies
Date
Msg-id CAH2-Wz=b9bYB474sQNQZtPSYw7TDRYZPequQ0yV4uBx8s9c3Yg@mail.gmail.com
In response to Re: New IndexAM API controlling index vacuum strategies  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: New IndexAM API controlling index vacuum strategies  (Peter Geoghegan <pg@bowt.ie>)
Re: New IndexAM API controlling index vacuum strategies  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Mon, Mar 22, 2021 at 6:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> But we're not sure when the next anti-wraparound vacuum will take
> place. Since the table is already vacuumed by a non-aggressive vacuum
> with index cleanup disabled, an autovacuum will process the table
> when the table gets modified enough or the table's relfrozenxid gets
> older than autovacuum_freeze_max_age. If the new threshold, probably a
> new GUC, is much lower than autovacuum_freeze_max_age and
> vacuum_freeze_table_age, the table is continuously vacuumed without
> advancing relfrozenxid, leading to unnecessary index bloat. Given
> the new threshold is for emergency purposes (i.e., advancing
> relfrozenxid faster), I think it might be better to use
> vacuum_freeze_table_age as the lower bound of the new threshold. What
> do you think?

As you know, when the user sets vacuum_freeze_table_age to a value
that is greater than the value of autovacuum_freeze_max_age, the two
GUCs have values that are contradictory. This contradiction is
resolved inside vacuum_set_xid_limits(), which knows that it should
"interpret" the value of vacuum_freeze_table_age as
(autovacuum_freeze_max_age * 0.95) to paper over the user's error.
This 0.95 behavior is documented in the user docs, though it happens
silently.
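
A minimal sketch of the clamp described above, for illustration only. It
is not the actual body of vacuum_set_xid_limits(); the function name
effective_freeze_table_age() is hypothetical, and integer arithmetic
stands in for the 0.95 multiplication so that the results are exact.

```c
#include <stdint.h>

/*
 * Illustrative only: mirrors the 95% clamp described in the text, not the
 * real PostgreSQL source. The function name is hypothetical.
 *
 * freeze_table_age: the user's vacuum_freeze_table_age setting (in XIDs)
 * freeze_max_age:   the user's autovacuum_freeze_max_age setting (in XIDs)
 */
int64_t
effective_freeze_table_age(int64_t freeze_table_age, int64_t freeze_max_age)
{
    /* 95% of autovacuum_freeze_max_age, computed without floating point */
    int64_t limit = freeze_max_age * 95 / 100;

    /* Silently use the lower of the two values; no error, no warning */
    return (freeze_table_age < limit) ? freeze_table_age : limit;
}
```

With the default settings (150 million and 200 million) the clamp changes
nothing. A contradictory setting such as 250 million against a 200 million
autovacuum_freeze_max_age would be treated as 190 million.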

You seem to be concerned about a similar contradiction. In fact it's
a *very* similar contradiction, because this new GUC is naturally a
"sibling GUC" of both vacuum_freeze_table_age and
autovacuum_freeze_max_age (the "units" are the same, though the
behavior that each GUC triggers is different -- but
vacuum_freeze_table_age and autovacuum_freeze_max_age are both already
*similar and different* in the same way). So perhaps the solution
should be similar -- silently interpret the setting of the new GUC to
resolve the contradiction.

(Maybe I should say "these two new GUCs"? A MultiXact variant might be needed...)

This approach has the following advantages:

* It follows precedent.

* It establishes that the new GUC is a logical extension of the
existing vacuum_freeze_table_age and autovacuum_freeze_max_age GUCs.

* The default value for the new GUC will be so much higher (say 1.8
billion XIDs) than even the default of autovacuum_freeze_max_age that
it won't disrupt anybody's existing postgresql.conf setup.

* For the same reason (the big space between autovacuum_freeze_max_age
and the new GUC with default settings), you can almost set the new GUC
without needing to know about autovacuum_freeze_max_age.

* The overall behavior isn't actually restrictive/paternalistic. That
is, if you know what you're doing (say you're testing the feature) you
can reduce all 3 sibling GUCs to 0 and get the testing behavior that
you desire.
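
To make the idea concrete, here is a rough sketch of how the three
sibling settings could be resolved together. Everything in it is an
assumption for illustration: the struct, the function names, and the
choice of the effective vacuum_freeze_table_age as the lower bound of the
new threshold (taken from the suggestion quoted above). The 95% clamp is
done in integer arithmetic to keep the results exact.

```c
#include <stdint.h>

/* Hypothetical container for the three sibling settings (all in XIDs) */
typedef struct SiblingGucs
{
    int64_t freeze_table_age;   /* vacuum_freeze_table_age */
    int64_t freeze_max_age;     /* autovacuum_freeze_max_age */
    int64_t emergency_age;      /* the proposed new GUC (name assumed) */
} SiblingGucs;

/* Existing rule: treat freeze_table_age as at most 95% of freeze_max_age */
int64_t
clamped_table_age(const SiblingGucs *g)
{
    int64_t limit = g->freeze_max_age * 95 / 100;

    return (g->freeze_table_age < limit) ? g->freeze_table_age : limit;
}

/*
 * Assumed new rule: silently raise the emergency threshold so that it is
 * never lower than the effective freeze_table_age.
 */
int64_t
effective_emergency_age(const SiblingGucs *g)
{
    int64_t floor_age = clamped_table_age(g);

    return (g->emergency_age > floor_age) ? g->emergency_age : floor_age;
}
```

Under these assumptions, a default-like setup (150 million, 200 million,
1.8 billion) leaves the new threshold untouched, an emergency threshold
set below vacuum_freeze_table_age is silently raised to it, and setting
all three values to 0 still yields 0 for testing.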

What do you think?

-- 
Peter Geoghegan


