Re: pg_amcheck contrib application - Mailing list pgsql-hackers
From | Mark Dilger |
---|---|
Subject | Re: pg_amcheck contrib application |
Date | |
Msg-id | 4F10540A-9022-49EE-A6FC-7FE90797793E@enterprisedb.com Whole thread Raw |
Responses |
Re: pg_amcheck contrib application
Re: pg_amcheck contrib application |
List | pgsql-hackers |
> On Mar 16, 2021, at 12:52 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Mar 15, 2021 at 10:10 PM Mark Dilger > <mark.dilger@enterprisedb.com> wrote: >> It is unfortunate that the failing test only runs pg_amcheck after creating numerous corruptions, as we can't know ifpg_amcheck would have complained about pg_statistic before the corruptions were created in other tables, or if it onlydoes so after. The attached patch v7-0003 adds a call to pg_amcheck after all tables are created and populated, butbefore any corruptions are caused. This should help narrow down what is happening, and doesn't hurt to leave in placelong-term. >> >> I don't immediately see anything wrong with how pg_statistic uses a pseudo-type, but it leads me to want to poke a bitmore at pg_statistic on hornet and tern, though I don't have any regression tests specifically for doing so. >> >> Tests v7-0001 and v7-0002 are just repeats of the tests posted previously. > > Since we now know that shutting autovacuum off makes the problem go > away, I don't see a reason to commit 0001. We should fix pg_amcheck > instead, if, as presently seems to be the case, that's where the > problem is. If you get unlucky, autovacuum will process one of the tables that the test intentionally corrupted, with bad consequences,ultimately causing build farm intermittent test failures. We could wait to see if this ever happens beforefixing it, if you prefer. I'm not sure what that buys us, though. The right approach, I think, is to extend the contrib/amcheck tests to include regressing this particular case to see ifit fails. I've written that test and verified that it fails against the old code and passes against the new. > I just committed 0002. Thanks! > I think 0003 is probably a good idea, but I haven't committed it yet. It won't do anything for us in this particular case, but it might make debugging failures quicker in the future. > As for 0004, it seems to me that we might want to do a little more > rewording of these messages and perhaps we should try to do it all at > once. Like, for example, your other change to print out the toast > value ID seems like a good idea, and could apply to any new messages > as well as some existing ones. Maybe there are also more fields in the > TOAST pointer for which we could add checks. Of the toast pointer fields: int32 va_rawsize; /* Original data size (includes header) */ int32 va_extsize; /* External saved size (doesn't) */ Oid va_valueid; /* Unique ID of value within TOAST table */ Oid va_toastrelid; /* RelID of TOAST table containing it */ all seem worth getting as part of any toast error message, even if these fields themselves are not corrupt. It just makesit easier to understand the context of the error you're looking at. At first I tried putting these into each message,but it is very wordy to say things like "toast pointer with rawsize %u and extsize %u pointing at relation with oid%u" and such. It made more sense to just add these four fields to the verify_heapam tuple format. That saves puttingthem in the message text itself, and has the benefit that you could filter the rows coming from verify_heapam() forones where valueid is or is not null, for example. This changes the external interface of verify_heapam, but I didn'tbother with a amcheck--1.3--1.4.sql because amcheck--1.2--1.3. sql was added as part of the v14 development work andhas not yet been released. My assumption is that I can just change it, rather than making a new upgrade file. These patches fix the visibility rules and add extra toast checking. Some of the previous patch material is not included,since it is not clear to me if you wanted to commit any of it. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: