On 2017-03-02 04:47:13 +0100, Tomas Vondra wrote:
> On 03/01/2017 11:55 PM, Andres Freund wrote:
> > On 2017-02-28 20:29:36 -0800, Andres Freund wrote:
> > > On 2017-02-28 20:18:35 -0800, Andres Freund wrote:
> > > > - Andres, hoping the buildfarm turns greener
> > >
> > > Oh well, that didn't work. Investigating.
> >
> > The fix for that was fairly trivial, and the buildfarm has cooled down.
> >
> > The issue was that on 32bit platforms the Datum returned by some
> > functions (int2int4_sum in this case) isn't actually a separately
> > allocated Datum, but rather just something embedded in a larger
> > struct. That, combined with the following code:
> > if (!peraggstate->resulttypeByVal && !*isnull &&
> > !MemoryContextContains(CurrentMemoryContext,
> > DatumGetPointer(*result)))
> > seems somewhat problematic to me. MemoryContextContains() can give
> > false positives when used on memory that's not a distinctly allocated
> > chunk, and if so, we violate memory lifetime rules. It's quite
> > unlikely, given the required bit patterns, but nonetheless it's making
> > me somewhat uncomfortable.
> >
>
> I assume you're only using that code snippet as an example of code that
> might be broken by MemoryContextContains() false positives, right?
I'm mentioning that piece of code because it's what temporarily caused
all 32bit animals to fail, when I had made MemoryContextContains() less
forgiving.
> (I don't see how the slab allocator could interfere with aggregates, as it's
> only used for reorderbuffer.c).
Indeed, this is independent of slab.c. I just came across it because I
triggered crashes when shrinking the StandardChunkHeader to be just the
chunk's MemoryContext.
> > Do others think this isn't an issue and we can just live with it?
> >
>
> My understanding is all the places calling MemoryContextContains() assume
> they can't receive memory not allocated as a simple chunk by palloc(). If
> that's not the case, it's likely broken.
Yea, that's my conclusion too. Which means nodeAgg.c and nodeWindowAgg.c
are broken afaics, because of e.g. int2int4_sum's() use of
Int64GetDatumFast() on sub-parts of larger allocations.
- Andres