Re: Cache invalidation bug in RelationGetIndexAttrBitmap() - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Cache invalidation bug in RelationGetIndexAttrBitmap()
Date
Msg-id 5373BE49.3050406@fuzzy.cz
In response to Cache invalidation bug in RelationGetIndexAttrBitmap()  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Cache invalidation bug in RelationGetIndexAttrBitmap()
List pgsql-hackers
On 14.5.2014 17:52, Andres Freund wrote:
> On 2014-05-14 15:17:39 +0200, Andres Freund wrote:
>> On 2014-05-14 15:08:08 +0200, Tomas Vondra wrote:
>>> Apparently there's something wrong with 'test-decoding-check':
>>
>> Man. I shouldn't have asked... My code. There's some output in there
>> that's probably triggered by the extraordinarily long runtimes, but
>> there's definitely something else wrong.
>> My gut feeling says it's in RelationGetIndexList().
> 
> Nearly right. It's in RelationGetIndexAttrBitmap(). Fix attached.
> 
> Tomas, thanks for that. I've never (and probably will never) run
> CLOBBER_CACHE_RECURSIVELY during development. Having a machine do that
> regularly is really helpful. How long does a single testrun take? It
> takes hundreds of seconds here to do a single UPDATE?

Don't know yet, as it fails at the beginning. But I suppose it will be
tens or possibly hundreds of hours. For example, these are the logs from
a regular build (no clobber etc.):
   May 14 19:00 SCM-checkout.log
   May 14 19:00 githead.log
   May 14 19:00 configure.log
   May 14 19:00 config.log
   May 14 19:05 make.log
   May 14 19:05 check.log
   May 14 19:06 make-contrib.log
   May 14 19:06 make-install.log
   May 14 19:06 install-contrib.log
   May 14 19:07 check-pg_upgrade.log
   May 14 19:08 test-decoding-check.log
 

while these are the logs from recursive clobber:
   May 14 00:19 SCM-checkout.log
   May 14 00:20 configure.log
   May 14 00:20 config.log
   May 14 00:26 make.log
   May 14 03:12 check.log
   May 14 03:13 make-contrib.log
   May 14 03:13 make-install.log
   May 14 03:13 install-contrib.log
   May 14 08:25 check-pg_upgrade.log
   May 14 09:07 test-decoding-check.log
   May 14 09:07 web-txn.data
 


So with the regular build, it took <1 minute to do 'make check' and ~1
minute to test pg_upgrade; with recursive clobber they take ~3 hours and
~5 hours, respectively. That's a factor of ~300, although it's a very
rough estimate.

Without clobber the whole run (for a "C" locale) takes ~10 minutes, so
my estimate is ~50 hours for the recursive one. But I wouldn't be
surprised by 100 hours.
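
For anyone who wants to redo the arithmetic, here is a rough sketch in
Python. It assumes each step's duration is the gap between its log file's
timestamp and the previous log's, and it uses the pg_upgrade step for the
ratio because the regular 'make check' finished within a single minute of
timestamp resolution. The helper name is made up for illustration.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> float:
    """Minutes elapsed between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Recursive clobber run (timestamps copied from the listing above).
clobber_check = minutes_between("00:26", "03:12")    # make.log -> check.log
clobber_upgrade = minutes_between("03:13", "08:25")  # install-contrib.log -> check-pg_upgrade.log

# Regular run: only the pg_upgrade step spans a measurable minute.
regular_upgrade = minutes_between("19:06", "19:07")  # install-contrib.log -> check-pg_upgrade.log

# Slowdown factor from the pg_upgrade step alone (very rough).
factor = clobber_upgrade / regular_upgrade

# Extrapolate the ~10 minute regular run, using the rounded factor of 300.
regular_total_minutes = 10
estimate_hours = regular_total_minutes * 300 / 60

print(f"make check under clobber: {clobber_check:.0f} min")
print(f"pg_upgrade under clobber: {clobber_upgrade:.0f} min")
print(f"slowdown factor: ~{factor:.0f}")
print(f"estimated full run: ~{estimate_hours:.0f} hours")
```

With minute-resolution timestamps the factor can easily be off by a factor
of two in either direction, which is why 100 hours would not be surprising.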

> 
> There were some more differences but those are all harmless and caused
> by the extraordinarily long runtime (autovacuums). I think we need to
> add a feature to test_decoding to suppress displaying transactions
> without changes. Ick.
> 

I expect to hit more timing-related issues with the recursive clobber
tests - not necessarily in the code/tests themselves, but I guess the
buildfarm tooling doesn't really expect runs that long.

regards
Tomas


