On Thu, 2025-10-16 at 11:51 +1300, David Rowley wrote:
> I forgot to mention, this isn't the same thing as the
> tts_minimal_store_tuple() issue you first reported, so if there is a
> problem there, this one has nothing to do with it.
I investigated, but have come up empty so far. Any additional info on
the hashagg crash would be appreciated.
I appended my raw notes below in case someone notices a mistake.
Regards,
Jeff Davis
Raw notes:
* Somehow entry->firstTuple == 0x1b, which is obviously not a valid
pointer.
* The entry structure lives in the bucket array, allocated in the
metacxt using MCXT_ALLOC_ZERO, so there's no uninitialized memory
floating around in the bucket array.
* The metacxt (aggstate->hash_metacxt) is an AllocSet, and it's never
reset. It contains the bucket array as well as some ExprStates and an
ExprContext for evaluating hash functions.
* Hash entries are never deleted, but between batches the entire hash
table is reset (which memsets the entire bucket array to zero).
* entry->firstTuple is assigned in only one place, from the return
value of ExecCopySlotMinimalTupleExtra(). The 'extra' argument is a
multiple of 16.
* ExecCopySlotMinimalTupleExtra() does some interesting pointer math,
but I didn't find any path that could plausibly return something like
0x1b. The memory is allocated with palloc/palloc0, which cannot return
NULL, and 0x1b (decimal 27) is not a multiple of 16, so it seems
unrelated to the extra argument.
* JIT does not seem to be involved, because execution is going through
ExecInterpExpr().
* When the hash table grows, it invalidates previously-returned entry
pointers. But, given the site of the crash, I don't see that as a
problem in this case.
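
To make the alignment argument in the notes concrete, here is a minimal
standalone sketch (not PostgreSQL code). It assumes an 8-byte maximum
alignment, as on common 64-bit platforms, and models the returned pointer
as "allocation address plus extra", which is my reading of the pointer
math and may not match the real function exactly. All names prefixed
with sketch_ are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8-byte maximum alignment (typical for 64-bit builds). */
#define SKETCH_MAXALIGN 8

/* Hypothetical helper: true if addr is a multiple of SKETCH_MAXALIGN. */
static bool
sketch_is_maxaligned(uintptr_t addr)
{
    return (addr % SKETCH_MAXALIGN) == 0;
}

/*
 * Hypothetical model of the pointer math: the tuple starts 'extra'
 * bytes past the start of an allocated chunk.  This is a simplification
 * for illustration, not the real ExecCopySlotMinimalTupleExtra().
 */
static uintptr_t
sketch_tuple_addr(uintptr_t chunk_addr, size_t extra)
{
    return chunk_addr + (uintptr_t) extra;
}

/* Runs the checks described in the notes; asserts on any violation. */
static void
sketch_check_alignment(void)
{
    uintptr_t chunk = 0x1000;   /* made-up, max-aligned chunk address */
    size_t    extra;

    /* Any multiple of 16 added to an aligned address stays aligned. */
    for (extra = 0; extra <= 256; extra += 16)
        assert(sketch_is_maxaligned(sketch_tuple_addr(chunk, extra)));

    /* The observed value is not aligned, so it cannot come from here. */
    assert(!sketch_is_maxaligned((uintptr_t) 0x1b));
}
```

Under those assumptions, a max-aligned allocation plus any multiple of 16
can never equal 0x1b, which suggests the value was written to the entry
by something other than the normal assignment.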