Re: pg_amcheck contrib application - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pg_amcheck contrib application
Date
Msg-id CA+Tgmob=vyOn=iQRd18Y=EDXC8-KtUUXbznGKjFdhOBmuaP-rg@mail.gmail.com
In response to Re: pg_amcheck contrib application  (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses Re: pg_amcheck contrib application  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
On Fri, Apr 9, 2021 at 2:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> I think #4, above, requires some clarification.  If there are missing
> chunks, the very definition of how large we expect subsequent chunks
> to be is ill-defined.  I took a fairly conservative approach to avoid
> lots of bogus complaints about chunks that are of unexpected size.
> Not all such complaints are removed, but enough are removed that I
> needed to add a final complaint at the end about the total size seen
> not matching the total size expected.

My instinct is to suppose that the size that we expect for future
chunks is independent of anything being wrong with previous chunks. So
if each chunk is supposed to be 2004 bytes (which probably isn't the
real number) and the value is 7000 bytes long, we expect chunks 0-2 to
be 2004 bytes each, chunk 3 to be 988 bytes, and chunk 4 and higher to
not exist. If chunk 1 happens to be missing or the wrong length or
whatever, our expectations for chunks 2 and 3 are utterly unchanged.
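
To make the arithmetic concrete, here is a minimal sketch of that rule in
Python. It is not the amcheck code: the function name and parameters are
made up for illustration, and 2004 is the same placeholder chunk size as
above. The point is only that the expected size depends on the total
value size and the chunk number, never on what was seen for other chunks.

```python
def expected_chunk_size(value_size, max_chunk_size, chunk_seq):
    """Return the expected size in bytes of chunk number chunk_seq,
    or None if no chunk with that number should exist.

    Hypothetical helper for illustration; it takes no input describing
    the state of any other chunk.
    """
    if value_size <= 0 or max_chunk_size <= 0:
        raise ValueError("sizes must be positive")
    # Number of the final chunk; chunks are numbered from 0.
    last_seq = (value_size - 1) // max_chunk_size
    if chunk_seq < 0 or chunk_seq > last_seq:
        return None
    if chunk_seq < last_seq:
        return max_chunk_size
    # The final chunk holds whatever is left over.
    return value_size - last_seq * max_chunk_size


# The 7000-byte example with the placeholder chunk size of 2004 bytes.
sizes = [expected_chunk_size(7000, 2004, seq) for seq in range(5)]
```

Here sizes comes out as 2004, 2004, 2004, 988 and then None for chunk 4,
and those answers stay the same whether or not chunk 1 turns out to be
missing or the wrong length.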

> Corruption #1:
>
>         UPDATE $toastname SET chunk_seq = chunk_seq + 1000
>
> Before:
>
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 chunk 0 has sequence number 1000, but expected sequence number 0
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 chunk 1 has sequence number 1001, but expected sequence number 1
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 chunk 2 has sequence number 1002, but expected sequence number 2
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 chunk 3 has sequence number 1003, but expected sequence number 3
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 chunk 4 has sequence number 1004, but expected sequence number 4
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 chunk 5 has sequence number 1005, but expected sequence number 5
>
> After:
>
> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
> #     toast value 16445 missing chunks 0 through 999

Applying the above principle would lead to complaints that chunks 0-5
are missing, and 1000-1005 are extra.

> Corruption #2:
>
>         UPDATE $toastname SET chunk_seq = chunk_seq * 1000

Similarly here, except the extra chunk numbers are different.

> Corruption #3:
>
>         UPDATE $toastname SET chunk_id = (chunk_id::integer + 10000000)::oid WHERE chunk_seq = 3

And here we'd just get a complaint that chunk 3 is missing.
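
As a rough illustration of what that reporting would amount to for the
three corruptions, here is a small Python sketch. It is not how
verify_heapam is structured; classify_chunks is a hypothetical helper,
and the count of six chunks is an assumption taken from the "Before"
output above, which lists chunks 0 through 5.

```python
def classify_chunks(expected_count, observed_seqs):
    """Split chunk sequence numbers into (missing, extra).

    missing: numbers we expected (0 .. expected_count - 1) but did not see
    extra:   numbers we saw that fall outside the expected range
    """
    expected = set(range(expected_count))
    observed = set(observed_seqs)
    return sorted(expected - observed), sorted(observed - expected)


EXPECTED_CHUNKS = 6  # assumption: the test value has chunks 0 through 5

# Corruption #1: chunk_seq = chunk_seq + 1000
case1 = classify_chunks(EXPECTED_CHUNKS, [s + 1000 for s in range(6)])

# Corruption #2: chunk_seq = chunk_seq * 1000
case2 = classify_chunks(EXPECTED_CHUNKS, [s * 1000 for s in range(6)])

# Corruption #3: chunk 3 now carries a different chunk_id, so it is no
# longer found when scanning the chunks of this toast value.
case3 = classify_chunks(EXPECTED_CHUNKS, [0, 1, 2, 4, 5])
```

Under those assumptions, case1 reports 0-5 missing and 1000-1005 extra,
and case3 reports only chunk 3 missing. In case2 the extra chunks are
1000, 2000, 3000, 4000 and 5000; chunk 0 would not be reported at all,
since 0 * 1000 is still 0.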

--
Robert Haas
EDB: http://www.enterprisedb.com


