Home > mailing lists

Re: Incorrect CHUNKHDRSZ in nodeAgg.c - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Incorrect CHUNKHDRSZ in nodeAgg.c
Date	January 2 03:23:54
Msg-id	CAApHDvozg3jDN4oVYvtpA91P1DA50gdRZ5071SnPg_24oL5PbA@mail.gmail.com Whole thread Raw
In response to	Re: Incorrect CHUNKHDRSZ in nodeAgg.c (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Incorrect CHUNKHDRSZ in nodeAgg.c Re: Incorrect CHUNKHDRSZ in nodeAgg.c
List	pgsql-hackers

Tree view

On Thu, 2 Jan 2025 at 12:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I thought for a bit about whether we shouldn't try to account for
> palloc power-of-2-block-size overhead here.  That omission would
> typically be a far larger error than the one you are fixing.  However,
> given that the inputs to hash_agg_entry_size are only estimates,
> I'm not sure that we can hope to do better than the current behavior.

Likely the most correct way would be to use GetMemoryChunkSpace(), but
there might be some additional overhead to consider there. I looked at
[1] and didn't see any mention of using that function for this
purpose. There likely is a small overhead to doing so, which is
something to consider.

> Should tuple hash tables be using a different memory context type
> that doesn't impose that power-of-2 overhead?  It's only useful
> when we expect a fair amount of pfree-and-recycle behavior, but
> I think we don't have much retail entry removal in tuple hash
> tables.  Could we use a generation or even bump context?

Bump wouldn't work due to the SH_FREE() in SH_GROW() when resizing the
table. If sizeof(TupleHashEntryData) were a power-of-two, then there'd
be no wastage as the hash table always has a power-of-two bucket count
and two powers-of-two multiplied are always a power-of-two value.
Unfortunately, TupleHashEntryData is 24 bytes and I don't see any easy
way to shrink it to 16 bytes.

I think what would be more interesting is seeing if we can store the
TupleHashEntryData.firstTuple in a bump context. I'd need to check if
we ever pfree() those minimal tuples or if we just throw the entire
batch of tuples away with a context reset. Since these minimal tuples
only contain the GROUP BY columns, for most cases they should be very
narrow tuples indeed, so not having a MemoryChunk header would save
quite a bit of memory in many cases. I think that's basically
1083f94da plus 6ed83d5fa for nodeAgg.c.

David

[1]
https://www.postgresql.org/message-id/flat/20200325220936.il3ni2fj2j2b45y5%40alap3.anarazel.de#78592829b675371eadf592a99897bcb3

pgsql-hackers by date:

From: Peter Smith
Date: 02 January, 03:13:38
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Tom Lane
Date: 02 January, 03:33:23
Subject: Re: Incorrect CHUNKHDRSZ in nodeAgg.c

Re: Incorrect CHUNKHDRSZ in nodeAgg.c - Mailing list pgsql-hackers

Previous

Next