Re: gs_group_1 crashing on 13beta2/s390x - Mailing list pgsql-hackers

From Tom Lane
Subject Re: gs_group_1 crashing on 13beta2/s390x
Date
Msg-id 3176347.1594849535@sss.pgh.pa.us
In response to Re: gs_group_1 crashing on 13beta2/s390x  (Christoph Berg <myon@debian.org>)
Responses Re: gs_group_1 crashing on 13beta2/s390x  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Re: gs_group_1 crashing on 13beta2/s390x  (Christoph Berg <myon@debian.org>)
List pgsql-hackers
Christoph Berg <myon@debian.org> writes:
>> On the Debian s390x buildd, the 13beta2 build is crashing:

> I wired gdb into the build process and got this backtrace:

> #0  datumCopy (typByVal=false, typLen=-1, value=0) at ./build/../src/backend/utils/adt/datum.c:142
>         vl = 0x0
>         res = <optimized out>
>         res = <optimized out>
>         vl = <optimized out>
>         eoh = <optimized out>
>         resultsize = <optimized out>
>         resultptr = <optimized out>
>         realSize = <optimized out>
>         resultptr = <optimized out>
>         realSize = <optimized out>
>         resultptr = <optimized out>
> #1  datumCopy (value=0, typByVal=false, typLen=-1) at ./build/../src/backend/utils/adt/datum.c:131
>         res = <optimized out>
>         vl = <optimized out>
>         eoh = <optimized out>
>         resultsize = <optimized out>
>         resultptr = <optimized out>
>         realSize = <optimized out>
>         resultptr = <optimized out>
> #2  0x000002aa04423af8 in finalize_aggregate (aggstate=aggstate@entry=0x2aa05775920,
>     peragg=peragg@entry=0x2aa056e02f0, resultVal=0x2aa056e0208, resultIsNull=0x2aa056e022a, pergroupstate=<optimized out>,
>     pergroupstate=<optimized out>) at ./build/../src/backend/executor/nodeAgg.c:1128

Hmm.  If gdb isn't lying to us, that has to be coming from here:

    /*
     * If result is pass-by-ref, make sure it is in the right context.
     */
    if (!peragg->resulttypeByVal && !*resultIsNull &&
        !MemoryContextContains(CurrentMemoryContext,
                               DatumGetPointer(*resultVal)))
        *resultVal = datumCopy(*resultVal,
                               peragg->resulttypeByVal,
                               peragg->resulttypeLen);

The line numbers in HEAD are a bit different, but that's the only
call of datumCopy() in finalize_aggregate().

It's hardly surprising that datumCopy would segfault when given
a null "value" and told it is pass-by-reference.  However, to get to
the datumCopy call, we must have passed the MemoryContextContains
check on that very same pointer value, and that would surely have
segfaulted as well, one would think.
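
To make the three conditions in that guard concrete, here is a rough
standalone sketch of its control flow. It is not PostgreSQL code:
toy_context_contains(), guard_outcome() and the GuardOutcome enum are
names made up for illustration, and the toy containment test is a plain
address-range comparison that never dereferences the pointer, so it says
nothing about whether the real MemoryContextContains() would fault on a
null pointer. It only shows which combination of inputs reaches the
datumCopy() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;        /* stand-in for PostgreSQL's Datum */

/* Hypothetical outcome labels, used only by this sketch. */
typedef enum { KEEP_AS_IS, WOULD_COPY } GuardOutcome;

/*
 * Toy stand-in for MemoryContextContains(): reports whether ptr lies in
 * [base, base + size).  ASSUMPTION: the real function works differently
 * (it looks at the chunk header); this model never dereferences ptr.
 */
static bool
toy_context_contains(const char *base, size_t size, const void *ptr)
{
    uintptr_t   p = (uintptr_t) ptr;
    uintptr_t   b = (uintptr_t) base;

    return p >= b && p < b + size;
}

/*
 * Mirrors the shape of the guard in finalize_aggregate(): the copy is
 * attempted only for a pass-by-reference, non-null result that is not
 * already inside the current context.
 */
static GuardOutcome
guard_outcome(bool resulttypeByVal, bool resultIsNull, Datum resultVal,
              const char *ctx_base, size_t ctx_size)
{
    assert(ctx_base != NULL);

    if (!resulttypeByVal && !resultIsNull &&
        !toy_context_contains(ctx_base, ctx_size, (const void *) resultVal))
        return WOULD_COPY;      /* the real code calls datumCopy() here */

    return KEEP_AS_IS;
}
```

In this model, the arguments from the backtrace (typByVal=false, value=0)
reach the copy only if the null flag is false at the same time; with the
flag set, or with a by-value type, the guard is skipped.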

Given the apparently-can't-happen situation at the call site,
and the fact that we're not seeing similar failures reported
elsewhere (and note that every line shown above is at least
five years old), I'm kind of forced to the conclusion that this
is a compiler bug.  Does adjusting the -O level make it go away?

            regards, tom lane
