Re: pg_amcheck contrib application - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: pg_amcheck contrib application
Date
Msg-id 7D3B9BF6-50D0-4C30-8506-1C1851C7F96F@enterprisedb.com
Whole thread Raw
In response to Re: pg_amcheck contrib application  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_amcheck contrib application
List pgsql-hackers

> On Apr 9, 2021, at 1:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Apr 9, 2021 at 2:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>> I think #4, above, requires some clarification.  If there are missing chunks, the very definition of how large we
expectsubsequent chunks to be is ill-defined.  I took a fairly conservative approach to avoid lots of bogus complaints
aboutchunks that are of unexpected size.   Not all such complaints are removed, but enough are removed that I needed to
adda final complaint at the end about the total size seen not matching the total size expected. 
>
> My instinct is to suppose that the size that we expect for future
> chunks is independent of anything being wrong with previous chunks. So
> if each chunk is supposed to be 2004 bytes (which probably isn't the
> real number) and the value is 7000 bytes long, we expect chunks 0-2 to
> be 2004 bytes each, chunk 3 to be 988 bytes, and chunk 4 and higher to
> not exist. If chunk 1 happens to be missing or the wrong length or
> whatever, our expectations for chunks 2 and 3 are utterly unchanged.

Fair enough.

>> Corruption #1:
>>
>>        UPDATE $toastname SET chunk_seq = chunk_seq + 1000
>>
>> Before:
>>
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 chunk 0 has sequence number 1000, but expected sequence number 0
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 chunk 1 has sequence number 1001, but expected sequence number 1
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 chunk 2 has sequence number 1002, but expected sequence number 2
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 chunk 3 has sequence number 1003, but expected sequence number 3
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 chunk 4 has sequence number 1004, but expected sequence number 4
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 chunk 5 has sequence number 1005, but expected sequence number 5
>>
>> After:
>>
>> # heap table "postgres"."public"."test", block 0, offset 2, attribute 2:
>> #     toast value 16445 missing chunks 0 through 999
>
> Applying the above principle would lead to complaints that chunks 0-5
> are missing, and 1000-1005 are extra.

That sounds right.  It now reports:

# heap table "postgres"."public"."test", block 0, offset 16, attribute 2:
#     toast value 16459 missing chunks 0 through 4 with expected size 1996 and chunk 5 with expected size 20
# heap table "postgres"."public"."test", block 0, offset 16, attribute 2:
#     toast value 16459 unexpected chunks 1000 through 1004 each with size 1996 followed by chunk 1005 with size 20


>> Corruption #2:
>>
>>        UPDATE $toastname SET chunk_seq = chunk_seq * 1000
>
> Similarly here, except the extra chunk numbers are different.

It now reports:

# heap table "postgres"."public"."test", block 0, offset 17, attribute 2:
#     toast value 16460 missing chunks 1 through 4 with expected size 1996 and chunk 5 with expected size 20
# heap table "postgres"."public"."test", block 0, offset 17, attribute 2:
#     toast value 16460 unexpected chunk 1000 with size 1996
# heap table "postgres"."public"."test", block 0, offset 17, attribute 2:
#     toast value 16460 unexpected chunk 2000 with size 1996
# heap table "postgres"."public"."test", block 0, offset 17, attribute 2:
#     toast value 16460 unexpected chunk 3000 with size 1996
# heap table "postgres"."public"."test", block 0, offset 17, attribute 2:
#     toast value 16460 unexpected chunk 4000 with size 1996
# heap table "postgres"."public"."test", block 0, offset 17, attribute 2:
#     toast value 16460 unexpected chunk 5000 with size 20

I don't see any good way in this case to report the extra chunks in one row, as in the general case there could be
arbitrarilymany of them, with the message text getting arbitrarily large if it reported the chunks as "chunks 1000,
2000,3000, 4000, 5000, ...".  I don't expect this sort of corruption being particularly common, though, so I'm not too
botheredabout it. 

>
>> Corruption #3:
>>
>>        UPDATE $toastname SET chunk_id = (chunk_id::integer + 10000000)::oid WHERE chunk_seq = 3
>
> And here we'd just get a complaint that chunk 3 is missing.

It now reports:

# heap table "postgres"."public"."test", block 0, offset 18, attribute 2:
#     toast value 16461 missing chunk 3 with expected size 1996
# heap table "postgres"."public"."test", block 0, offset 18, attribute 2:
#     toast value 16461 was expected to end at chunk 6 with total size 10000, but ended at chunk 5 with total size 8004

It sounds like you weren't expecting the second of these reports.  I think it is valuable, especially when there are
multiplemissing chunks and multiple extraneous chunks, as it makes it easier for the user to reconcile the missing
chunksagainst the extraneous chunks. 



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Andres Freund
Date:
Subject: Re: [PATCH] Identify LWLocks in tracepoints