Thread: Backpatching nbtree VACUUM (page deletion) hardening

Backpatching nbtree VACUUM (page deletion) hardening

From
Peter Geoghegan
Date:
Postgres 14 commit 5b861baa55 added hardening to nbtree page deletion.
This had the effect of making nbtree VACUUM robust against misbehaving
operator classes -- we just LOG the problem and move on, without
throwing an error. In practice a "misbehaving operator class" is often
a problem with collation versioning.

I think that this should be backpatched now, to protect users from
particularly nasty problems that hitting the error eventually leads
to.

An error ends the whole VACUUM operation. If VACUUM cannot delete the
page the first time, there is no reason to think that it'll be any
different on the second or the tenth attempt. The eventual result
(absent user/DBA intervention) is that no antiwraparound autovacuum
will ever complete, leading to an outage when the system hits
xidStopLimit. (Actually this scenario won't result in the system
hitting xidStopLimit where the failsafe is available, but that's
another thing that is only in 14, so that's not any help.)

This seems low risk. The commit in question is very simple. It just
downgrades an old 9.4-era ereport() from ERROR to LOG, and adds a
"return false;" immediately after that. The function in question is
fundamentally structured in a way that allows it to back out of page
deletion because of problems that are far removed from where the
caller starts from. When and why we back out of page deletion is
already opaque to the caller, so it's very hard to imagine a new
problem caused by backpatching. Besides all this, 14 has been out for
a while now.

-- 
Peter Geoghegan



Re: Backpatching nbtree VACUUM (page deletion) hardening

From
Nathan Bossart
Date:
On Fri, Sep 02, 2022 at 02:13:15PM -0700, Peter Geoghegan wrote:
> I think that this should be backpatched now, to protect users from
> particularly nasty problems that hitting the error eventually leads
> to.

+1

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Backpatching nbtree VACUUM (page deletion) hardening

From
Michael Paquier
Date:
On Fri, Sep 02, 2022 at 02:13:15PM -0700, Peter Geoghegan wrote:
> Postgres 14 commit 5b861baa55 added hardening to nbtree page deletion.
> This had the effect of making nbtree VACUUM robust against misbehaving
> operator classes -- we just LOG the problem and move on, without
> throwing an error. In practice a "misbehaving operator class" is often
> a problem with collation versioning.

This has been a problem for years, and still for years to come with
libc updates.  I am not much into this stuff, but does running VACUUM
in this case help with the state of the index that used a past,
now-invalid, collation (be it libc or ICU) to get a bit cleaned up?

> An error ends the whole VACUUM operation. If VACUUM cannot delete the
> page the first time, there is no reason to think that it'll be any
> different on the second or the tenth attempt. The eventual result
> (absent user/DBA intervention) is that no antiwraparound autovacuum
> will ever complete, leading to an outage when the system hits
> xidStopLimit. (Actually this scenario won't result in the system
> hitting xidStopLimit where the failsafe is available, but that's
> another thing that is only in 14, so that's not any help.)

When written like that, this surely sounds extremely bad and this
would need more complex chirurgy (or just running with a build that
includes this patch?).

> This seems low risk. The commit in question is very simple. It just
> downgrades an old 9.4-era ereport() from ERROR to LOG, and adds a
> "return false;" immediately after that. The function in question is
> fundamentally structured in a way that allows it to back out of page
> deletion because of problems that are far removed from where the
> caller starts from. When and why we back out of page deletion is
> already opaque to the caller, so it's very hard to imagine a new
> problem caused by backpatching. Besides all this, 14 has been out for
> a while now.

Yeah, I can take it that we would have seen reports if this was an
issue, and I don't recall seeing one on the community lists, at
least.
--
Michael

Attachment

Re: Backpatching nbtree VACUUM (page deletion) hardening

From
Peter Geoghegan
Date:
On Fri, Sep 2, 2022 at 6:14 PM Michael Paquier <michael@paquier.xyz> wrote:
> This has been a problem for years, and still for years to come with
> libc updates.  I am not much into this stuff, but does running VACUUM
> in this case help with the state of the index that used a past,
> now-invalid, collation (be it libc or ICU) to get a bit cleaned up?

Yes -- nbtree VACUUM generally can cope quite well, even when the
index is corrupt. It should mostly manage to do what is expected here,
even with a misbehaving opclass, because it relies as little as
possible on user-defined opclass code.

Even without the hardening in place, nbtree VACUUM will still do a
*surprisingly* good job of recovering when the opclass is broken in
some way: VACUUM just needs the insertion scankey operator class code
to initially determine roughly where to look for the to-be-deleted
page's downlink, one level up in the tree. Even when an operator class
is wildly broken (e.g. the comparator gives a result that it
determines at random), we still won't see problems in nbtree VACUUM
most of the time -- because even being roughly correct is good enough
in practice!

You have to be quite unlucky to hit this, even when the opclass is
wildly broken (which is probably much less common than "moderately
broken").

> When written like that, this surely sounds extremely bad and this
> would need more complex chirurgy (or just running with a build that
> includes this patch?).

The patch will fix the case in question, which I have seen internal
AWS reports about -- though the initial fix that went into 14 wasn't
driven by any complaint from any user. I just happened to notice that
we were throwing an ERROR in nbtree VACUUM for no good reason, which
is something that should be avoided on general principle.

In theory there could be other ways in which you'd run into the same
basic problem (in any index AM). The important point is that we're
better off not throwing any errors in the first place, but if we must
then they had better not be errors that will be repeated again and
again, without any chance of the problem going away naturally. (Not
that it never makes sense to just throw an error; there are meaningful
gradations of "totally unacceptable problem".)

-- 
Peter Geoghegan



Re: Backpatching nbtree VACUUM (page deletion) hardening

From
Peter Geoghegan
Date:
On Fri, Sep 2, 2022 at 6:51 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Yes -- nbtree VACUUM generally can cope quite well, even when the
> index is corrupt. It should mostly manage to do what is expected here,
> even with a misbehaving opclass, because it relies as little as
> possible on user-defined opclass code.

I just backpatched the hardening commit from 14 to every supported branch.

-- 
Peter Geoghegan