Thread: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

From: Michael Paquier

Hi all.

markhor has run for the first time in 8 days, and something in the
range e703261..72dd233 is making the brin regression test crash.
See here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2014-12-30%2020%3A58%3A49
Regards,
-- 
Michael



Re: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

From: Alvaro Herrera

Michael Paquier wrote:
> Hi all.
> 
> markhor has run for the first time in 8 days, and something in the
> range e703261..72dd233 is making the brin regression test crash.
> See here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2014-12-30%2020%3A58%3A49

This shows that the crash was in the object_address test, not brin.
Will research.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

From: Alvaro Herrera

Alvaro Herrera wrote:
> Michael Paquier wrote:
> > Hi all.
> > 
> > markhor has run for the first time in 8 days, and something in the
> > range e703261..72dd233 is making the brin regression test crash.
> > See here:
> > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2014-12-30%2020%3A58%3A49
> 
> This shows that the crash was in the object_address test, not brin.
> Will research.

I can reproduce the crash in a CLOBBER_CACHE_ALWAYS build in
the object_address test.  The backtrace is pretty strange:

#0  0x00007f08ce674107 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f08ce6754e8 in __GI_abort () at abort.c:89
#2  0x00000000007ac071 in ExceptionalCondition (
    conditionName=conditionName@entry=0x800f28 "!(keylen < 64)",
    errorType=errorType@entry=0x7e724f "FailedAssertion",
    fileName=fileName@entry=0x800ef0 "/pgsql/source/master/src/backend/access/hash/hashfunc.c",
    lineNumber=lineNumber@entry=147)
    at /pgsql/source/master/src/backend/utils/error/assert.c:54
#3  0x0000000000494a93 in hashname (fcinfo=fcinfo@entry=0x7fff244324a0)
    at /pgsql/source/master/src/backend/access/hash/hashfunc.c:147
#4  0x00000000007b450d in DirectFunctionCall1Coll (func=0x494a50 <hashname>,
    collation=collation@entry=0, arg1=<optimized out>)
    at /pgsql/source/master/src/backend/utils/fmgr/fmgr.c:1027
#5  0x0000000000797aca in CatalogCacheComputeHashValue (cache=cache@entry=0x10367d8,
    nkeys=<optimized out>, cur_skey=cur_skey@entry=0x7fff244328e0)
    at /pgsql/source/master/src/backend/utils/cache/catcache.c:212
#6  0x0000000000798ff7 in SearchCatCache (cache=0x10367d8, v1=18241016, v2=6, v3=11, v4=0)
    at /pgsql/source/master/src/backend/utils/cache/catcache.c:1149
#7  0x00000000007a67ae in GetSysCacheOid (cacheId=cacheId@entry=15, key1=<optimized out>,
    key2=key2@entry=6, key3=key3@entry=11, key4=key4@entry=0)
    at /pgsql/source/master/src/backend/utils/cache/syscache.c:988
#8  0x0000000000504699 in get_collation_oid (name=name@entry=0x11655c0,
    missing_ok=missing_ok@entry=0 '\000')
    at /pgsql/source/master/src/backend/catalog/namespace.c:3323
#9  0x000000000050d8dc in get_object_address (objtype=objtype@entry=OBJECT_COLLATION,
    objname=objname@entry=0x11655c0, objargs=objargs@entry=0x0,
    relp=relp@entry=0x7fff24432c28, lockmode=lockmode@entry=1,
    missing_ok=missing_ok@entry=0 '\000')
    at /pgsql/source/master/src/backend/catalog/objectaddress.c:704


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

From: Andres Freund

On 2014-12-31 10:02:40 -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> > Michael Paquier wrote:
> > > Hi all.
> > > 
> > > markhor has run for the first time in 8 days, and something in the
> > > range e703261..72dd233 is making the brin regression test crash.
> > > See here:
> > > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2014-12-30%2020%3A58%3A49
> > 
> > This shows that the crash was in the object_address test, not brin.
> > Will research.
> 
> I can reproduce the crash in a CLOBBER_CACHE_ALWAYS build in
> the object_address test.  The backtrace is pretty strange:

Hard to say without more detail, but my guess is that the argument to
get_collation_oid() isn't actually valid. For one, that'd explain the
error; for another, the pointer's value (name=name@entry=0x11655c0) is
suspiciously low.

> #8  0x0000000000504699 in get_collation_oid (name=name@entry=0x11655c0, 
>     missing_ok=missing_ok@entry=0 '\000')
>     at /pgsql/source/master/src/backend/catalog/namespace.c:3323
> #9  0x000000000050d8dc in get_object_address (objtype=objtype@entry=OBJECT_COLLATION, 
>     objname=objname@entry=0x11655c0, objargs=objargs@entry=0x0, 
>     relp=relp@entry=0x7fff24432c28, lockmode=lockmode@entry=1, 
>     missing_ok=missing_ok@entry=0 '\000')
>     at /pgsql/source/master/src/backend/catalog/objectaddress.c:704
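
To illustrate why an invalid name argument would produce this particular
assertion: the failed condition in hashname() is "keylen < 64", so any
buffer with no NUL terminator in its first 64 bytes trips it. The sketch
below is a standalone toy, not PostgreSQL code. The toy_* helpers are
hypothetical names, and the 0x7F filler byte is an assumption about how
an assert-enabled build overwrites released memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NAMEDATALEN 64      /* mirrors the 64 in the failed assertion */

/* Length of s, scanning at most TOY_NAMEDATALEN bytes (hypothetical helper). */
static size_t toy_name_len(const char *s)
{
    const char *nul = memchr(s, '\0', TOY_NAMEDATALEN);

    return nul ? (size_t) (nul - s) : TOY_NAMEDATALEN;
}

/* Nonzero when the condition "keylen < 64" from the backtrace would hold. */
static int toy_name_is_hashable(const char *s)
{
    return toy_name_len(s) < TOY_NAMEDATALEN;
}

/* Overwrite a name buffer with a filler byte (assumed to be 0x7F). */
static void toy_clobber(char *buf)
{
    memset(buf, 0x7F, TOY_NAMEDATALEN);
}

/* Walk through the scenario: a valid name passes, a clobbered one does not. */
static void toy_demo(void)
{
    char name[TOY_NAMEDATALEN] = "default";

    assert(toy_name_is_hashable(name));     /* terminator found at offset 7 */
    toy_clobber(name);
    assert(!toy_name_is_hashable(name));    /* no terminator within 64 bytes */
}
```

Calling toy_demo() runs both checks. The bounded memchr() scan is a
deliberate choice here: an unbounded strlen() on the clobbered buffer
would read past its end.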

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

From: Tom Lane

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-12-31 10:02:40 -0300, Alvaro Herrera wrote:
>> I can reproduce the crash in a CLOBBER_CACHE_ALWAYS build in
>> the object_address test.  The backtrace is pretty strange:

> Hard to say without more detail, but my guess is that the argument to
> get_collation_oid() isn't actually valid. For one, that'd explain the
> error; for another, the pointer's value (name=name@entry=0x11655c0) is
> suspiciously low.

Given that CLOBBER_CACHE_ALWAYS seems to make it fail reliably, the
obvious explanation is that what's being passed is a pointer into
catcache or relcache storage that isn't guaranteed to be valid for
long enough.  The given backtrace doesn't go down far enough to show
where the bogus input came from, but I'm betting that something is
returning to SQL a string it got from cache without pstrdup'ing it.
        regards, tom lane



Re: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin

From: Alvaro Herrera

Tom Lane wrote:

> Given that CLOBBER_CACHE_ALWAYS seems to make it fail reliably, the
> obvious explanation is that what's being passed is a pointer into
> catcache or relcache storage that isn't guaranteed to be valid for
> long enough.  The given backtrace doesn't go down far enough to show
> where the bogus input came from, but I'm betting that something is
> returning to SQL a string it got from cache without pstrdup'ing it.

Yep, that was it: the bug was in getObjectIdentityParts.  I noticed
three other cases of a missing pstrdup() and fixed those as well.
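
As a rough illustration of this class of bug (not the actual code from
getObjectIdentityParts), the sketch below contrasts handing out a pointer
into cache storage with handing out a private copy. Everything here is a
toy: the single-entry cache and the toy_* functions are hypothetical, a
malloc-based copy stands in for pstrdup(), and the 0x7F filler byte is an
assumption about how a clobbering build overwrites invalidated storage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NAMEDATALEN 64

/* Toy stand-in for one cached catalog name; not PostgreSQL's real catcache. */
static char toy_cached_name[TOY_NAMEDATALEN];

/* Load a name into the toy cache, always leaving it NUL-terminated. */
static void toy_cache_load(const char *name)
{
    memset(toy_cached_name, 0, sizeof(toy_cached_name));
    strncpy(toy_cached_name, name, TOY_NAMEDATALEN - 1);
}

/* Simulate an invalidation that clobbers the old storage (assumed 0x7F). */
static void toy_cache_invalidate(void)
{
    memset(toy_cached_name, 0x7F, sizeof(toy_cached_name));
}

/* Buggy pattern: return a pointer straight into cache storage. */
static const char *toy_name_borrowed(void)
{
    return toy_cached_name;
}

/* Fixed pattern: return a private copy, in the role pstrdup() plays. */
static char *toy_name_copied(void)
{
    size_t len = strlen(toy_cached_name);
    char *copy = malloc(len + 1);

    if (copy != NULL)
        memcpy(copy, toy_cached_name, len + 1);
    return copy;
}

/* Show that only the copy survives an invalidation. */
static void toy_demo(void)
{
    toy_cache_load("default");
    const char *borrowed = toy_name_borrowed();
    char *copied = toy_name_copied();

    assert(copied != NULL);
    toy_cache_invalidate();     /* a cache-clobbering build forces this often */

    /* The borrowed pointer now sees filler bytes and no terminator. */
    assert(memchr(borrowed, '\0', TOY_NAMEDATALEN) == NULL);
    /* The private copy is unaffected. */
    assert(strcmp(copied, "default") == 0);
    free(copied);
}
```

In an ordinary build the invalidation rarely happens between fetching
the string and using it, so the borrowed pointer usually appears to
work. That is consistent with the failure showing up reliably only
under CLOBBER_CACHE_ALWAYS.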

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services