Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 - Mailing list pgsql-performance

From David Rowley
Subject Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Msg-id CAApHDvo9=fmTwHkw63CU8FJooH8AWFP0RXzHe3X0S2Hr3OL8KA@mail.gmail.com
In response to Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
List pgsql-performance
On Sat, 4 Apr 2026 at 08:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Jeff Davis <pgsql@j-davis.com> writes:
> > One idea would be to update parent contexts' memory totals recursively
> > each time a subcontext allocates a new block. Block allocations are
> > infrequent enough that may be acceptable.
>
> > If we are worried about affecting unrelated cases, we could set an
> > "accounting_enabled" flag for the contexts we care about, which would
> > be automatically inherited by subcontexts, and then stop recursing up
> > when that flag is false.
>
> Yeah, I was speculating about similar ideas.  Since mem_allocated
> is only changed after a malloc() or free() call, it probably
> wouldn't add too much overhead to propagate that up to parent
> contexts.  I agree with having a flag to prevent the propagation
> from going up further than we actually care about, though.
>
> Would it make sense to accumulate those values in a separate field
> child_mem_allocated, rather than redefining what mem_allocated
> means?

A slight variation on this that I was thinking of would be to
introduce a MemoryPool struct, containing a pool_limit, that could be
attached to a MemoryContext. A child MemoryContext would, by default,
inherit its parent's MemoryPool. On malloc/free, if the owning context
has a non-null MemoryPool, the MemoryPool's memory_allocated is
updated. At a safe point in nodeAgg.c, we'd check if the pool limit
has been reached. I imagine that check would be some simple inline
function that just tests whether memory_allocated is greater than
pool_limit. Doing it this way would mean there's no need to
recursively propagate the mentioned child_mem_allocated field up the
hierarchy, as there is only a single field to update if the MemoryPool
field is set.
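
To make the shape of that concrete, here is a rough standalone sketch.
It is not PostgreSQL code and not a patch. MemoryPool, pool_limit and
memory_allocated are the names from the description above; ToyContext
and the toy_* helpers are made up for illustration, and where exactly
the real hooks would live in the allocators is left open.

```c
#include <stdbool.h>
#include <stddef.h>

/* Shared accounting for every context that points at this pool. */
typedef struct MemoryPool
{
    size_t      memory_allocated;   /* bytes malloc'd by member contexts */
    size_t      pool_limit;         /* limit the caller checks against */
} MemoryPool;

/* Hypothetical stand-in for a memory context; only the relevant fields. */
typedef struct ToyContext
{
    struct ToyContext *parent;
    MemoryPool *pool;               /* NULL means no pool accounting */
    size_t      mem_allocated;      /* this context's own blocks only */
} ToyContext;

/* A new context inherits its parent's pool by default. */
static void
toy_context_init(ToyContext *cxt, ToyContext *parent)
{
    cxt->parent = parent;
    cxt->pool = (parent != NULL) ? parent->pool : NULL;
    cxt->mem_allocated = 0;
}

/* Would be called right after a block is malloc'd for this context. */
static void
toy_block_allocated(ToyContext *cxt, size_t size)
{
    cxt->mem_allocated += size;
    if (cxt->pool != NULL)
        cxt->pool->memory_allocated += size;    /* single update, no recursion */
}

/* Would be called right before a block is free'd. */
static void
toy_block_freed(ToyContext *cxt, size_t size)
{
    cxt->mem_allocated -= size;
    if (cxt->pool != NULL)
        cxt->pool->memory_allocated -= size;
}

/* The cheap check the executor node would make at a safe point. */
static inline bool
toy_pool_limit_reached(const MemoryPool *pool)
{
    return pool->memory_allocated > pool->pool_limit;
}
```

The cost per block allocation is one pointer test and one add, however
deep the context tree is, which is the difference from walking up the
parents on every malloc/free.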

David


