Re: amcheck (B-Tree integrity checking tool) - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: amcheck (B-Tree integrity checking tool)
Date
Msg-id CAB7nPqRdVbg0W+dbHgUWFg_SnE7L+1_BEJGom6sgJOOy52NmhQ@mail.gmail.com
In response to Re: amcheck (B-Tree integrity checking tool)  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Mon, Oct 17, 2016 at 10:46 AM, Noah Misch <noah@leadboat.com> wrote:
> - Detect impossible conditions in the hint bits.  A tuple should not have both
>   HEAP_XMAX_COMMITTED and HEAP_XMAX_INVALID.  Every tuple bearing
>   HEAP_ONLY_TUPLE should bear HEAP_UPDATED.  HEAP_HASVARWIDTH should be true
>   if and only if the tuple has a non-NULL value in a negative-typlen column,
>   possibly a dropped column.  A tuple should not have both HEAP_KEYS_UPDATED
>   and HEAP_XMAX_LOCK_ONLY.
>
> - Report evidence of upgrades from older versions.  If the tool sees
>   HEAP_MOVED_IN or HEAP_MOVED_OFF, it can report that the cluster was
>   binary-upgraded from 8.3 or 8.4.  If the user did not upgrade from such a
>   version, the user should assume corruption.
>
> - Check VARSIZE() of each variable-length datum.  Corrupt lengths might direct
>   you to seek past the end of the tuple, or they might imply excess free space
>   at the end of the tuple.
>
> - Verify agreement between CLOG, MULTIXACT, and hint bits.  If the hint bits
>   include HEAP_XMAX_LOCK_ONLY, the multixact should not contain a
>   MultiXactStatusUpdate member.  README.tuplock documents other invariants.
>   If the tool sees a tuple passing HEAP_LOCKED_UPGRADED, it can report that
>   the cluster was binary-upgraded from a version in [8.3, 9.1].
>
> - Verify that TOAST pointers (in non-dropped columns of visible tuples) point
>   to valid data in the TOAST relation.  This is much more expensive than the
>   other checks I've named, so it should be optional.

There is actually a lot of value in that. I was bitten recently by an
issue where TOAST data got corrupted because of an incorrect failover flow.

> - If VM_ALL_VISIBLE() or VM_ALL_FROZEN() passes for a particular page, verify
>   that the visibility data stored in the page is compatible with that claim.
>
> - Examine PageHeaderData values.  If pd_checksum is non-zero in a cluster with
>   checksums disabled, the cluster was binary-upgraded from [8.3, 9.2].

Agreed on basically all of those items, though the first version does not
need to do everything :)
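
To make the hint-bit item concrete, here is a rough sketch of what such
checks could look like. It is standalone C, not amcheck code: the HEAP_*
values are copied from src/include/access/htup_details.h as I remember
them (real code would include that header instead), and the CHK_* flags
and check_infomask() are hypothetical names made up for illustration. It
only covers the pure bit-combination rules from the quoted list, not the
ones that need the tuple descriptor, CLOG or multixact lookups.

```c
#include <stdint.h>

/* t_infomask bits (assumed to match htup_details.h) */
#define HEAP_XMAX_LOCK_ONLY   0x0080
#define HEAP_XMAX_COMMITTED   0x0400
#define HEAP_XMAX_INVALID     0x0800
#define HEAP_UPDATED          0x2000
#define HEAP_MOVED_OFF        0x4000
#define HEAP_MOVED_IN         0x8000

/* t_infomask2 bits (assumed to match htup_details.h) */
#define HEAP_KEYS_UPDATED     0x2000
#define HEAP_ONLY_TUPLE       0x8000

/* Hypothetical result flags, not part of PostgreSQL */
#define CHK_XMAX_COMMITTED_AND_INVALID  0x01
#define CHK_HOT_WITHOUT_UPDATED         0x02
#define CHK_KEYS_UPDATED_AND_LOCK_ONLY  0x04
#define CHK_MOVED_BITS_SEEN             0x08

/*
 * Return a bitmask of CHK_* flags describing suspicious combinations
 * found in a tuple's infomask words.  Zero means nothing was flagged.
 */
unsigned
check_infomask(uint16_t infomask, uint16_t infomask2)
{
    unsigned problems = 0;

    /* xmax cannot be both known-committed and known-invalid */
    if ((infomask & HEAP_XMAX_COMMITTED) && (infomask & HEAP_XMAX_INVALID))
        problems |= CHK_XMAX_COMMITTED_AND_INVALID;

    /* a heap-only tuple is always the product of an update */
    if ((infomask2 & HEAP_ONLY_TUPLE) && !(infomask & HEAP_UPDATED))
        problems |= CHK_HOT_WITHOUT_UPDATED;

    /* a pure locker does not update key columns */
    if ((infomask2 & HEAP_KEYS_UPDATED) && (infomask & HEAP_XMAX_LOCK_ONLY))
        problems |= CHK_KEYS_UPDATED_AND_LOCK_ONLY;

    /* evidence of an upgrade from 8.3 or 8.4, or else corruption */
    if (infomask & (HEAP_MOVED_IN | HEAP_MOVED_OFF))
        problems |= CHK_MOVED_BITS_SEEN;

    return problems;
}
```

A caller would pass t_infomask and t_infomask2 from each tuple header and
report any non-zero result. Whether CHK_MOVED_BITS_SEEN counts as an
error or just a notice depends on the cluster's upgrade history.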

Honestly, the btree checks alone have enough value to justify the
integration of this module. You can take that as "I'll look at your
patch and provide a review soon because I have a hell of a lot of cases
where this will be useful".
-- 
Michael


