Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date
Msg-id 7235b808-ae18-466b-adaf-6db41314ae8c@vondra.me
In response to Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
List pgsql-performance
On 4/4/26 02:21, David Rowley wrote:
> On Sat, 4 Apr 2026 at 08:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Jeff Davis <pgsql@j-davis.com> writes:
>>> One idea would be to update parent contexts' memory totals recursively
>>> each time a subcontext allocates a new block. Block allocations are
>>> infrequent enough that this may be acceptable.
>>
>>> If we are worried about affecting unrelated cases, we could set an
>>> "accounting_enabled" flag for the contexts we care about, which would
>>> be automatically inherited by subcontexts, and then stop recursing up
>>> when that flag is false.
>>
>> Yeah, I was speculating about similar ideas.  Since mem_allocated
>> is only changed after a malloc() or free() call, it probably
>> wouldn't add too much overhead to propagate that up to parent
>> contexts.  I agree with having a flag to prevent the propagation
>> from going up further than we actually care about, though.
>>
>> Would it make sense to accumulate those values in a separate field
>> child_mem_allocated, rather than redefining what mem_allocated
>> means?
> 
> A slight variation on this that I was thinking of would be to
> introduce a MemoryPool struct that could be tagged onto a
> MemoryContext which contains a pool_limit. A child MemoryContext
> would, by default, inherit its parent's MemoryPool. On malloc/free, if
> the owning context has a non-null MemoryPool, the MemoryPool's
> memory_allocated is updated. At a safe point in nodeAgg.c, we'd check
> if the pool limit has been reached. I assume there's some simple
> inline function that just checks if memory_allocated is greater than
> pool_limit. Doing it this way would mean there's no need to
> recursively propagate the mentioned child_mem_allocated field up the
> hierarchy, as there is only a single field to update if the MemoryPool
> field is set.
> 

This reminds me of the discussions in 2022 about having a global memory
limit, and in particular this PoC patch [1] with a MemoryPool doing
roughly what you're describing here (at least I think).

[1]
https://www.postgresql.org/message-id/4fb99fb7-8a6a-2828-dd77-e2f1d75c7dd0%40enterprisedb.com
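
To make the pool idea quoted above a bit more concrete, here is a minimal
standalone sketch. It is not code from the PoC patch or from PostgreSQL:
SketchContext, context_create, context_block_alloc, context_block_free and
pool_limit_reached are hypothetical names made up for illustration, and only
MemoryPool, memory_allocated and pool_limit come from the description in the
thread. The sketch assumes accounting happens per block, right after the
underlying malloc() or free().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Shared accounting object; field names follow the quoted description. */
typedef struct MemoryPool
{
    size_t      memory_allocated;   /* bytes held by all member contexts */
    size_t      pool_limit;         /* threshold checked at safe points */
} MemoryPool;

/* Hypothetical stand-in for a memory context (not the real MemoryContext). */
typedef struct SketchContext
{
    struct SketchContext *parent;
    MemoryPool *pool;               /* NULL means no pool accounting */
    size_t      mem_allocated;      /* this context's own block total */
} SketchContext;

/* Create a context; by default it inherits the parent's pool. */
static SketchContext *
context_create(SketchContext *parent)
{
    SketchContext *cxt = calloc(1, sizeof(SketchContext));

    if (cxt == NULL)
        abort();
    cxt->parent = parent;
    cxt->pool = (parent != NULL) ? parent->pool : NULL;
    return cxt;
}

/* Accounting step, assumed to run after a block was successfully malloc'd. */
static void
context_block_alloc(SketchContext *cxt, size_t size)
{
    cxt->mem_allocated += size;
    if (cxt->pool != NULL)
        cxt->pool->memory_allocated += size;    /* one update, no parent walk */
}

/* Accounting step, assumed to run when a block is handed back to free(). */
static void
context_block_free(SketchContext *cxt, size_t size)
{
    cxt->mem_allocated -= size;
    if (cxt->pool != NULL)
        cxt->pool->memory_allocated -= size;
}

/* Cheap check a caller such as nodeAgg.c could make at a safe point. */
static inline bool
pool_limit_reached(const MemoryPool *pool)
{
    return pool != NULL && pool->memory_allocated > pool->pool_limit;
}
```

In this sketch a caller would attach a MemoryPool to one context, create its
children as usual, and call pool_limit_reached() at a safe point. Every
allocation touches exactly one shared counter regardless of how deep the
allocating context sits, which is the difference from propagating
child_mem_allocated up the parent chain.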

-- 
Tomas Vondra



