Re: BUG #19078: Segfaults in tts_minimal_store_tuple() following pg_upgrade - Mailing list pgsql-bugs

From Jeff Davis
Subject Re: BUG #19078: Segfaults in tts_minimal_store_tuple() following pg_upgrade
Msg-id 514ea5ef5120c380856226ec98951a55ef6f40c3.camel@j-davis.com
In response to Re: BUG #19078: Segfaults in tts_minimal_store_tuple() following pg_upgrade  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: BUG #19078: Segfaults in tts_minimal_store_tuple() following pg_upgrade
List pgsql-bugs
On Thu, 2025-10-16 at 11:51 +1300, David Rowley wrote:
> I forgot to mention, this isn't the same thing as the
> tts_minimal_store_tuple() issue you first reported, so if there is a
> problem there, this one has nothing to do with it.

I investigated, but have come up empty so far. Any additional info on
the hashagg crash would be appreciated.

I appended my raw notes below in case someone notices a mistake.

Regards,
    Jeff Davis


Raw notes:

* Somehow entry->firstTuple==0x1b, which is obviously wrong.

* The entry structure lives in the bucket array, allocated in the
metacxt using MCXT_ALLOC_ZERO, so there's no uninitialized memory
floating around in the bucket array.

* The metacxt (aggstate->hash_metacxt) is an AllocSet, and it's never
reset. It contains the bucket array as well as some ExprStates and an
ExprContext for evaluating hash functions.

* Hash entries are never deleted, but between batches the entire hash
table is reset (which memsets the entire bucket array to zero).

* The entry->firstTuple is assigned only in one place, from
ExecCopySlotMinimalTupleExtra(). The 'extra' argument is a multiple of
16.

* ExecCopySlotMinimalTupleExtra() does some interesting pointer math,
but I didn't find any path that could plausibly return something like
0x1b. The memory is allocated with palloc/palloc0, which cannot return
NULL, and 0x1b is not a multiple of 16, so it seems unrelated to the
extra argument.

* JIT does not seem to be involved, because it's going through
ExecInterpExpr().

* When the hash table grows, it invalidates previously-returned entry
pointers. But, given the site of the crash, I don't see that as a
problem in this case.
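
The alignment argument in the notes above can be illustrated with a
small standalone C sketch. This is not PostgreSQL code:
alloc_with_extra() and free_with_extra() are hypothetical stand-ins
that use malloc() instead of palloc(), and SKETCH_MAXALIGN assumes an
8-byte maximum alignment, as on common 64-bit builds. Under those
assumptions, a pointer formed as "aligned base plus a multiple of 16"
stays aligned, while 0x1b is aligned to neither 8 nor 16 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 8-byte maximum alignment (typical for 64-bit builds). */
#define SKETCH_MAXALIGN 8

/* True if addr is a multiple of align; align must be a power of two. */
static bool
is_aligned(uintptr_t addr, uintptr_t align)
{
    return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical stand-in: allocate extra + len bytes and return a
 * pointer just past the "extra" area. Returns NULL on failure.
 */
static void *
alloc_with_extra(size_t extra, size_t len)
{
    char *base = malloc(extra + len);

    return (base == NULL) ? NULL : base + extra;
}

/* Release memory obtained from alloc_with_extra() with the same extra. */
static void
free_with_extra(void *ptr, size_t extra)
{
    if (ptr != NULL)
        free((char *) ptr - extra);
}

/* Check the claims made in the lead-in paragraph. */
static void
run_checks(void)
{
    size_t extra = 32;          /* a multiple of 16, as in the notes */
    void  *tuple = alloc_with_extra(extra, 64);

    assert(tuple != NULL);

    /* Aligned base plus a multiple of 16 is still aligned. */
    assert(is_aligned((uintptr_t) tuple, SKETCH_MAXALIGN));

    /* The observed value 0x1b fails both alignment checks. */
    assert(!is_aligned((uintptr_t) 0x1b, SKETCH_MAXALIGN));
    assert(!is_aligned((uintptr_t) 0x1b, 16));

    free_with_extra(tuple, extra);
}
```

If the real allocation path behaves like this sketch, a value such as
0x1b would have to come from something overwriting the entry after
the assignment rather than from the pointer arithmetic itself.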




