Re: pg_amcheck contrib application - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pg_amcheck contrib application
Date
Msg-id CA+TgmoZzy_Gn-3yh_iGiTz7dWWM0MmytKEFv5HoJWLko0cEVqA@mail.gmail.com
Whole thread Raw
In response to Re: pg_amcheck contrib application  (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses Re: pg_amcheck contrib application  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
On Thu, Apr 1, 2021 at 12:32 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> > - If xmax is a multi but seems to be garbled, I changed it to return
> > true rather than false. The inserter is known to have committed by
> > that point, so I think it's OK to try to deform the tuple. We just
> > shouldn't try to check TOAST.
>
> It is hard to know what to do when at least one tuple header field is corrupt.  You don't necesarily know which one
itis.  For example, if HEAP_XMAX_IS_MULTI is set, we try to interpret the xmax as a mxid, and if it is out of bounds,
wereport it as corrupt.  But was the xmax corrupt?  Or was the HEAP_XMAX_IS_MULTI bit corrupt?  It's not clear.  I took
theview that if either xmin or xmax appear to be corrupt when interpreted in light of the various tuple header bits,
allwe really know is that the set of fields/bits don't make sense as a whole, so we report corruption, don't trust any
ofthem, and abort further checking of the tuple.  You have be burden of proof the other way around.  If the xmin
appearsfine, and xmax appears corrupt, then we only know that xmax is corrupt, so the tuple is checkable because
accordingto the xmin it committed. 

I agree that it's hard to be sure what's gone once we start finding
corrupted data, but deciding that maybe xmin didn't really commit
because we see that there's something wrong with xmax seems excessive
to me. I thought about a related case: if xmax is a bad multi but is
also hinted invalid, should we try to follow TOAST pointers? I think
that's hard to say, because we don't know whether (1) the invalid
marking is in error, (2) it's wrong to consider it a multi rather than
an XID, (3) the stored multi got overwritten with a garbage value, or
(4) the stored multi got removed before the tuple was frozen. Not
knowing which of those is the case, how are we supposed to decide
whether the TOAST tuples might have been (or be about to get) pruned?

But, in the case we're talking about here, I don't think it's a
particularly close decision. All we need to say is that if xmax or the
infomask bits pertaining to it are corrupted, we're still going to
suppose that xmin and the infomask bits pertaining to it, which are
all different bytes and bits, are OK. To me, the contrary decision,
namely that a bogus xmax means xmin was probably lying about the
transaction having been committed in the first place, seems like a
serious overreaction. As you say:

> I don't think how you have it causes undue problems, since deforming the tuple when you shouldn't merely risks a
bunchof extra not-so-helpful corruption messages.  And hey, maybe they're helpful to somebody clever enough to diagnose
whythat particular bit of noise was generated. 

I agree. The biggest risk here is that we might omit >0 complaints
when only 0 are justified. That will panic users. The possibility that
we might omit >x complaints when only x are justified, for some x>0,
is also a risk, but it's not nearly as bad, because there's definitely
something wrong, and it's just a question of what it is exactly. So we
have to be really conservative about saying that X is corruption if
there's any possibility that it might be fine. But once we've
complained about one thing, we can take a more balanced approach about
whether to risk issuing more complaints. The possibility that
suppressing the additional complaints might complicate resolution of
the issue also needs to be considered.

>             * If xmin_status happens to be XID_IN_PROGRESS, then in theory
>
> Did you mean to say XID_IS_CURRENT_XID here?

Yes, I did, thanks.

> /* xmax is an MXID, not an MXID. Sanity check it. */
>
> Is it an MXID or isn't it?

Good catch.

New patch attached.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Replication slot stats misgivings
Next
From: Mark Dilger
Date:
Subject: Re: pg_amcheck contrib application