Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From: Peter Geoghegan
Subject: Re: New IndexAM API controlling index vacuum strategies
Date:
Msg-id: CAH2-Wz=YP_y+P9K9V13fL-dwA9xwkP2EmxKGNkP64bftMAc9_Q@mail.gmail.com
In response to: Re: New IndexAM API controlling index vacuum strategies (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Wed, Apr 14, 2021 at 12:33 PM Andres Freund <andres@anarazel.de> wrote:
> I'm getting a bit bothered by the speed at which you're pushing fairly
> substantial behavioural changes for vacuum. In this case without even a
> warning that you're about to do so.

To a large degree, the failsafe is written in the hope that it will
never be needed. That makes it unlike most other things, and it comes
with its own unique risks.

I think that the proper thing to do is to accept a certain amount of
risk in this area. The previous status quo was *appalling*, and so it
seems very likely that the failsafe has eliminated a lot of risk for
users. That factor is not everything, but it should count
for a lot. The only way that we're going to have total confidence in
anything like this is through the experience of it mostly working over
several releases.

> I don't think it's that blindingly obvious that skipping truncation is
> the right thing to do that it doesn't need review. Consider e.g. the
> case that you're close to wraparound because you ran out of space for
> the amount of WAL VACUUM produces, previously leading to autovacuums
> being aborted / the server restarted. The user might then stop regular
> activity and try to VACUUM. Skipping the truncation might now make it
> harder to actually vacuum all the tables without running out of space.

Note that the criterion for setting "hastup=false" for a page is
slightly different in lazy_scan_prune() -- I added a comment that
points this out directly (the fact that it works that way is not new,
and might originally have been a happy mistake). Unlike
count_nondeletable_pages(), which is used by heap truncation,
lazy_scan_prune() is concerned about whether or not it's *likely to be
possible* to truncate away the page by the time lazy_truncate_heap()
is reached (if it is reached at all). And so it is optimistic that the
LP_DEAD items it observes will be removed by lazy_vacuum_heap_page()
before we get to lazy_truncate_heap(). It's
inherently race-prone anyway, so it might as well assume that LP_DEAD
items will eventually become LP_UNUSED items later on.
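
To make the distinction concrete, here is a rough sketch of the two
tests (this is not the actual vacuumlazy.c code; the function names are
invented purely for illustration):

/*
 * Illustration only -- not the real vacuumlazy.c code.  The point is
 * just the different treatment of LP_DEAD line pointers.
 */
#include "postgres.h"
#include "storage/bufpage.h"
#include "storage/itemid.h"
#include "storage/off.h"

/* lazy_scan_prune()-style test: assume LP_DEAD will become LP_UNUSED */
static bool
page_has_tuples_optimistic(Page page)
{
	OffsetNumber offnum,
				 maxoff = PageGetMaxOffsetNumber(page);

	for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum++)
	{
		ItemId		itemid = PageGetItemId(page, offnum);

		/* LP_NORMAL or LP_REDIRECT blocks truncation; LP_DEAD does not */
		if (ItemIdIsUsed(itemid) && !ItemIdIsDead(itemid))
			return true;
	}
	return false;				/* page looks truncatable later on */
}

/* count_nondeletable_pages()-style test: LP_DEAD still counts as used */
static bool
page_has_tuples_conservative(Page page)
{
	OffsetNumber offnum,
				 maxoff = PageGetMaxOffsetNumber(page);

	for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum++)
	{
		ItemId		itemid = PageGetItemId(page, offnum);

		/* any used line pointer, including LP_DEAD, blocks truncation */
		if (ItemIdIsUsed(itemid))
			return true;
	}
	return false;
}

With the first test, a page whose only remaining items are LP_DEAD is
treated as potentially empty; with the second it is not. That is exactly
why a failsafe VACUUM that skipped index and heap vacuuming tends to find
nothing that it can actually truncate.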

It follows that the chances of lazy_truncate_heap() failing to
truncate anything when the failsafe has already triggered are
exceptionally high -- all the LP_DEAD items are still there, and
cannot be safely removed during truncation (for the usual reasons). I
just went one step further than that in this recent commit. I didn't
point these details out before now because (to me) this is beside the
point, which is that the failsafe is just that -- a failsafe. Anything
that adds unnecessary, unpredictable delay in reaching the point of
advancing relfrozenxid should be avoided. (Besides, the design of
should_attempt_truncation() and lazy_truncate_heap() is very far from
guaranteeing that truncation will take place at the best of times.)
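
For context, the heuristic is roughly like the following sketch
(paraphrased from my reading of should_attempt_truncation(); the struct,
constants, and names here are stand-ins, not the real thing):

/*
 * Illustrative sketch of the truncation heuristic -- see vacuumlazy.c
 * for the actual should_attempt_truncation() logic.
 */
#include "postgres.h"
#include "storage/block.h"

#define SKETCH_TRUNCATE_MIN_PAGES	1000	/* illustrative threshold */
#define SKETCH_TRUNCATE_FRACTION	16		/* illustrative divisor */

typedef struct VacStateSketch
{
	BlockNumber rel_pages;		/* total heap pages */
	BlockNumber nonempty_pages;	/* one past the last page with hastup=true */
} VacStateSketch;

static bool
sketch_should_attempt_truncation(const VacStateSketch *vac)
{
	BlockNumber possibly_freeable = vac->rel_pages - vac->nonempty_pages;

	/*
	 * Only bother taking the heavyweight lock and scanning backwards when
	 * enough trailing pages look empty to make it worthwhile.
	 */
	return possibly_freeable > 0 &&
		(possibly_freeable >= SKETCH_TRUNCATE_MIN_PAGES ||
		 possibly_freeable >= vac->rel_pages / SKETCH_TRUNCATE_FRACTION);
}

And even when a test along those lines passes, lazy_truncate_heap() can
still give up -- for example when it cannot acquire the lock quickly, or
when it detects other backends waiting on the relation.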

FWIW, my intention is to try to get as much feedback about the
failsafe as I possibly can -- it's hard to reason about exceptional
events. I'm also happy to further discuss the specifics with you now.

-- 
Peter Geoghegan


