Re: new heapcheck contrib module - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: new heapcheck contrib module
Date
Msg-id 2A7DA1A8-C4AA-43DF-A985-3CA52F4DC775@enterprisedb.com
In response to Re: new heapcheck contrib module  (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

> On Oct 22, 2020, at 9:01 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>
>
>> On Oct 22, 2020, at 7:06 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Thu, Oct 22, 2020 at 8:51 AM Robert Haas <robertmhaas@gmail.com> wrote:
>>> Committed. Let's see what the buildfarm thinks.
>>
>> It is mostly happy, but thorntail is not:
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2020-10-22%2012%3A58%3A11
>>
>> I thought that the problem might be related to the fact that thorntail
>> is using force_parallel_mode, but I tried that here and it did not
>> cause a failure. So my next guess is that it is related to the fact
>> that this is a sparc64 machine, but it's hard to tell, since none of
>> the other sparc64 critters have run yet. In any case I don't know why
>> that would cause a failure. The messages in the log aren't very
>> illuminating, unfortunately. :-(
>>
>> Mark, any ideas what might cause specifically that set of tests to fail?
>
> The code is correctly handling an uncorrupted table, but then more or less randomly failing some of the time when
> processing a corrupt table.
>
> Tom identified a problem with an uninitialized variable.  I'm putting together a new patch set to address it.

The 0001 attached patch addresses the -Werror=maybe-uninitialized problem.

The 0002 attached patch addresses the test failures:

The failing test is designed to stop the server, create blunt force trauma to the heap and toast files through
overwriting garbage bytes, restart the server, and verify that corruption is detected by amcheck's verify_heapam().  The
exact trauma is intended to be the same on all platforms, in terms of the number of bytes written and the location in
the file that it gets written, but owing to differences between platforms, by design the test does not expect a
particular corruption message.
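
As a rough illustration of the kind of fixed-size, fixed-offset overwrite described above, here is a minimal Python sketch.  It is not the code from the patch; the helper names (overwrite_bytes, demo), the offset, the length, the fill byte, and the 8192-byte scratch file are all hypothetical values chosen for the example.  The point it shows is returning the byte count, so that a caller can confirm how much was actually overwritten.

```python
import os
import tempfile


def overwrite_bytes(path, offset, length, fill=b"\xff"):
    """Overwrite `length` bytes of `path` starting at `offset`.

    `fill` must be a single byte. Returns the number of bytes written
    so the caller can confirm the intended amount of damage was done.
    """
    with open(path, "r+b") as f:      # update in place, no truncation
        f.seek(offset)
        written = f.write(fill * length)
        f.flush()
        os.fsync(f.fileno())          # make sure the bytes reach disk
    return written


def demo():
    """Corrupt a scratch file standing in for a relation file."""
    page_size = 8192                  # assumed size, for illustration only
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x00" * page_size)
        written = overwrite_bytes(path, offset=100, length=2000)
        with open(path, "rb") as f:
            data = f.read()
    finally:
        os.remove(path)
    return written, data
```

In a real test the server would be stopped before the overwrite and restarted afterwards, and the check would be whether verify_heapam() reports corruption rather than an assertion on the raw bytes.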

The test was overwriting far fewer bytes than I had intended, but since it was still sufficient to create corruption on
the platforms where I tested, I failed to notice.  It should do a more thorough job now.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



