Hi,

On 2020-01-14 17:54:16 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-01-14 17:01:01 -0500, Tom Lane wrote:
> >> But I agree that not checking null-ness
> >> explicitly is kind of unsafe. We've never before had any expectation
> >> that the Datum value of a null is anything in particular.
>
> > I'm still not sure I actually fully understand the bug. It's obvious how
> > returning the input value again could lead to memory not being freed (so
> > that leak seems to go all the way back). And similarly, since the
> > introduction of expanded objects, it can also lead to the expanded
> > object not being deleted.
> > But that's not the problem causing the crash here. What I think must
> > instead be the problem is that pergroupstate->transValueIsNull, but
> > pergroupstate->transValue is set to something looking like a
> > pointer. Which caused us not to datumCopy() a new transition value into
> > a long-lived context, and then a later transition causes us to free the
> > short-lived value?
>
> Yeah, I was kind of wondering that too. While formally the Datum value
> for a null is undefined, I'm not aware offhand of any functions that
> wouldn't return zero --- and this would have to be an aggregate transition
> function doing so, which reduces the universe of candidates quite a lot.
> Plus there's the question of how often a transition function would return
> null for non-null input at all.
>
> Could we see a test case that provokes this crash, even if it doesn't
> do so reliably?

There's a larger reproducer referenced in the first message. I had hoped
that Teodor could narrow it down - I guess I'll try to do that tomorrow...
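
To make the hypothesis quoted above concrete, here is a minimal sketch of
the copy decision only. Everything in it (ToyDatum, ToyTransState, the
toy_must_copy_* helpers) is a hypothetical stand-in, not the real nodeAgg.c
code, and it does not model memory contexts or expanded objects. It just
shows how a pointer-only comparison can skip the copy when the old value is
flagged null but still carries stale pointer-like bits:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for PostgreSQL's Datum; hypothetical, for illustration only. */
typedef uintptr_t ToyDatum;

/* Toy stand-in for the per-group transition state (hypothetical). */
typedef struct ToyTransState
{
    ToyDatum value;   /* bits carry no meaning when isnull is true */
    bool     isnull;
} ToyTransState;

/*
 * Pointer-only test: "same bits" is taken to mean "same object", even if
 * the old value is flagged null. Returns true when a copy would be made.
 */
static bool
toy_must_copy_pointer_only(const ToyTransState *old, ToyDatum newval)
{
    return newval != old->value;
}

/*
 * Null-aware test: the old value's bits are only compared when the old
 * value is not null. Returns true when a copy would be made.
 */
static bool
toy_must_copy_null_aware(const ToyTransState *old, ToyDatum newval,
                         bool newisnull)
{
    if (newisnull)
        return false;   /* a null result has nothing to copy */
    if (old->isnull)
        return true;    /* stale bits in old->value must be ignored */
    return newval != old->value;
}
```

With an old state of { 0x1000, isnull = true } and a non-null new value whose
bits are also 0x1000, the pointer-only test skips the copy, while the
null-aware test requests it.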

Greetings,
Andres Freund