Re: Incorrect CHUNKHDRSZ in nodeAgg.c - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Incorrect CHUNKHDRSZ in nodeAgg.c
Date
Msg-id 160825.1735778003@sss.pgh.pa.us
Whole thread Raw
In response to Re: Incorrect CHUNKHDRSZ in nodeAgg.c  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Incorrect CHUNKHDRSZ in nodeAgg.c
List pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 2 Jan 2025 at 12:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I thought for a bit about whether we shouldn't try to account for
>> palloc power-of-2-block-size overhead here.  That omission would
>> typically be a far larger error than the one you are fixing.  However,
>> given that the inputs to hash_agg_entry_size are only estimates,
>> I'm not sure that we can hope to do better than the current behavior.

> Likely the most correct way would be to use GetMemoryChunkSpace(), but
> there might be some additional overhead to consider there.

Nah, you've got the wrong mental model.  hash_agg_entry_size is
trying to predict the average hash entry size in advance of seeing
any actual data, so that we can estimate how many entries will fit
in work_mem.  By the time we can use GetMemoryChunkSpace on an
actual entry, it's too late for that.

>> Could we use a generation or even bump context?

> Bump wouldn't work due to the SH_FREE() in SH_GROW() when resizing the
> table.

Meh.  I guess we'd have to keep that structure in a context separate
from the tuples.  Might not be worth the trouble.

> I think what would be more interesting is seeing if we can store the
> TupleHashEntryData.firstTuple in a bump context.

Are you saying the same as above, or something different?

            regards, tom lane



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Incorrect CHUNKHDRSZ in nodeAgg.c
Next
From: David Rowley
Date:
Subject: Re: Incorrect CHUNKHDRSZ in nodeAgg.c