Re: Cache invalidation bug in RelationGetIndexAttrBitmap() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Cache invalidation bug in RelationGetIndexAttrBitmap()
Date
Msg-id 20140514202915.GJ23943@awork2.anarazel.de
In response to Re: Cache invalidation bug in RelationGetIndexAttrBitmap()  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: Cache invalidation bug in RelationGetIndexAttrBitmap()
List pgsql-hackers
Hi,

On 2014-05-14 21:04:41 +0200, Tomas Vondra wrote:
> On 14.5.2014 17:52, Andres Freund wrote:
> > On 2014-05-14 15:17:39 +0200, Andres Freund wrote:
> >> On 2014-05-14 15:08:08 +0200, Tomas Vondra wrote:
> >>> Apparently there's something wrong with 'test-decoding-check':
> >>
> >> Man. I shouldn't have asked... My code. There's some output in there
> >> that's probably triggered by the extraordinarily long runtimes, but
> >> there's definitely something else wrong.
> >> My gut feeling says it's in RelationGetIndexList().
> > 
> > Nearly right. It's in RelationGetIndexAttrBitmap(). Fix attached.
> > 
> > Tomas, thanks for that. I've never (and probably will never) run
> > CLOBBER_CACHE_RECURSIVELY during development. Having a machine do that
> > regularly is really helpful. How long does a single testrun take? It
> > takes hundreds of seconds here to do a single UPDATE?
> 
> Don't know yet, as it fails at the beginning.

test-decoding is at the beginning? That's somewhat odd.

> But I suppose it will be
> tens or possibly hundreds of hours. For example these are the logs from
> regular build (no clobber etc.)

>     May 14 19:00 SCM-checkout.log
>     May 14 19:00 githead.log
>     May 14 19:00 configure.log
>     May 14 19:00 config.log
>     May 14 19:05 make.log
>     May 14 19:05 check.log
>     May 14 19:06 make-contrib.log
>     May 14 19:06 make-install.log
>     May 14 19:06 install-contrib.log
>     May 14 19:07 check-pg_upgrade.log
>     May 14 19:08 test-decoding-check.log
> 
> while these are the logs from recursive clobber:
> 
>     May 14 00:19 SCM-checkout.log
>     May 14 00:20 configure.log
>     May 14 00:20 config.log
>     May 14 00:26 make.log
>     May 14 03:12 check.log
>     May 14 03:13 make-contrib.log
>     May 14 03:13 make-install.log
>     May 14 03:13 install-contrib.log
>     May 14 08:25 check-pg_upgrade.log
>     May 14 09:07 test-decoding-check.log
>     May 14 09:07 web-txn.data
> 
> 
> So with the regular build, it took <1 minute to do 'make check' and ~1
> minute to test pg_upgrade, with recursive clobber it takes ~3 hours and
> ~5 hours. That's a factor of ~300, although it's a very rough
> estimate.

I seriously doubt that's recursive clobber. That should take *way*
longer. And indeed you have:

> -DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY -DMEMORY_CONTEXT_CHECKING
> -DRANDOMIZE_ALLOCATED_MEMORY -DCLOBBER_CACHE_RECURSIVELY
> 
> it does not happen with
> 
> CPPFLAGS => '-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY
> -DMEMORY_CONTEXT_CHECKING -DRANDOMIZE_ALLOCATED_MEMORY',

#if defined(CLOBBER_CACHE_ALWAYS)
    {
        static bool in_recursion = false;

        if (!in_recursion)
        {
            in_recursion = true;
            InvalidateSystemCaches();
            in_recursion = false;
        }
    }
#elif defined(CLOBBER_CACHE_RECURSIVELY)
    InvalidateSystemCaches();
#endif

i.e. you can't specify -DCLOBBER_CACHE_ALWAYS and
-DCLOBBER_CACHE_RECURSIVELY together. The former will take precedence.
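
To make the precedence concrete, here is a minimal standalone sketch (not
PostgreSQL code). It assumes both macros are defined, as they would be with
both -D flags on the command line; active_clobber_mode() is a hypothetical
helper that only reports which preprocessor branch survived.

```c
/* Stand-in for passing both -D flags to the compiler (demo assumption). */
#define CLOBBER_CACHE_ALWAYS
#define CLOBBER_CACHE_RECURSIVELY

/*
 * Hypothetical helper: returns the name of the branch the preprocessor kept.
 * An #if/#elif chain keeps only the first branch whose condition is true,
 * so the #elif branch is discarded whenever CLOBBER_CACHE_ALWAYS is defined.
 */
const char *
active_clobber_mode(void)
{
#if defined(CLOBBER_CACHE_ALWAYS)
    return "always";
#elif defined(CLOBBER_CACHE_RECURSIVELY)
    return "recursively";
#else
    return "none";
#endif
}
```

With both macros defined the function returns "always". Removing the first
#define (that is, dropping -DCLOBBER_CACHE_ALWAYS from CPPFLAGS) is what
makes the "recursively" branch compile in.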

> Without clobber the whole run (for a "C" locale) takes ~10 minutes, so
> my estimate is ~50 hours for the recursive one. But I wouldn't be
> surprised by 100 hours.

I'm afraid it's more in the year range from what I've seen. I.e. not
practical.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


