Re: BUG #17619: AllocSizeIsValid violation in parallel hash join - Mailing list pgsql-bugs

From Peter Geoghegan
Subject Re: BUG #17619: AllocSizeIsValid violation in parallel hash join
Date
Msg-id CAH2-WzknY8v7CWP-qZB071_i-QP7MpTYFxAkrmR_r3LZRc9CLQ@mail.gmail.com
In response to Re: BUG #17619: AllocSizeIsValid violation in parallel hash join  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #17619: AllocSizeIsValid violation in parallel hash join
List pgsql-bugs
On Tue, Sep 27, 2022 at 9:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That scares me too, but how come things aren't falling over
> routinely?  Can we even make a test case where it breaks?

I'm not completely sure, but I think that the explanation might just
be that the memory is likely to be "zero initialized" in practice.
Even when it isn't, we're still only talking about instrumentation
counters that start out with garbage values -- so nothing truly
critical.

My main concern is the big picture, and not so much these specific
lapses. Less benign crash bugs like the one fixed by commit 662ba729
are much less likely with proper testing.

> I think I'd personally prefer to treat such memory more like we
> treat palloc'd memory, ie there's *not* a guarantee of zero
> initialization and indeed testing builds intentionally clobber it.

Isn't that already how it works? The problem is that it's not
particularly clear that that's how it works right now, and that the
dynamic shared memory code isn't tested via the same techniques that
we use for palloc.

We could more or less apply the same techniques and expect the same
good results, but we don't do that right now.
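
To illustrate the idea, here is a minimal sketch of what clobbering
newly allocated memory buys you. It is not PostgreSQL code: the
WorkerCounters struct, debug_shm_alloc(), the two counters_create
functions, and the 0x7F fill byte are all hypothetical names and
choices made up for this example, and malloc() stands in for a real
shared memory allocator. The point is only that a testing build which
fills fresh memory with a non-zero pattern turns "happens to be zero"
into an immediately visible garbage value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary non-zero fill pattern chosen for this sketch. */
#define CLOBBER_BYTE 0x7F

/* Hypothetical per-worker instrumentation counters. */
typedef struct WorkerCounters
{
    uint64_t tuples_seen;
    uint64_t batches;
} WorkerCounters;

/*
 * Stand-in for a shared memory allocator in a testing build: the
 * returned memory is deliberately filled with a non-zero pattern so
 * that callers cannot get away with assuming zero initialization.
 */
void *
debug_shm_alloc(size_t size)
{
    void *p = malloc(size);

    assert(p != NULL);
    memset(p, CLOBBER_BYTE, size);
    return p;
}

/* Buggy caller: silently relies on the memory being zeroed. */
WorkerCounters *
counters_create_buggy(void)
{
    return debug_shm_alloc(sizeof(WorkerCounters));
}

/* Correct caller: initializes every field explicitly. */
WorkerCounters *
counters_create(void)
{
    WorkerCounters *c = debug_shm_alloc(sizeof(WorkerCounters));

    memset(c, 0, sizeof(*c));
    return c;
}
```

With an allocator like this, the buggy caller's counters start out as
0x7F7F7F7F7F7F7F7F rather than zero, so any test that looks at them
fails right away instead of passing by luck.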

-- 
Peter Geoghegan


