Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 - Mailing list pgsql-performance

From David Rowley
Subject Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date
Msg-id CAApHDvrga1FNonW+MfvtuuSASaVwjwJa0_xhB4o6hg59tVFO6A@mail.gmail.com
In response to Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-performance
On Wed, 15 Apr 2026 at 08:42, Jeff Davis <pgsql@j-davis.com> wrote:
> Attached v2-0001 simply tracks the totals for all contexts, and works
> across MemoryContextSetParent(). I was (and still am) slightly worried
> that it will cause a regression, but some basic pgbench runs didn't
> show any slowdown. Can you suggest some settings that might focus more
> on allocation performance?

The patch in [1] might be worth testing. I tried the following on an
AMD Zen 2 machine.

Master:

postgres=# select r,
pg_allocate_memory_test(530000,530000,1024::bigint*1024*1024*1024,'generation')
from generate_series(1,3)r;
 r  | pg_allocate_memory_test
----+-------------------------
  1 |                0.058918
  2 |                0.057592
  3 |                0.055907

With your patch:

postgres=# select r,
pg_allocate_memory_test(530000,530000,1024::bigint*1024*1024*1024,'generation')
from generate_series(1,3)r;
 r  | pg_allocate_memory_test
----+-------------------------
  1 |                0.071015
  2 |                0.071987
  3 |                0.071948

(+24% slower)

I went for 530000 as the allocChunkLimit was 512KB. I believe it's
smaller for aset.c, which might mean a bigger overhead for that
context type.

> Assuming we only want to track for a subtree, then the two approaches
> suggested are:
>
> 1. Have some "invalid" marker that only tracks the memory upward as far
> as it's needed/enabled, then stops. This adds a bit of complexity
> because:
>   a. to enable it after the context is created, we need a new memory
> context method to give the current allocated total so we can properly
> initialize the size
>   b. when doing MemoryContextSetParent(), we need to subtract from the
> old subtree and add to the new subtree, and if the current subtree
> isn't already tracked, then we need to recalculate
>
> 2. The memory pools as you suggest. The complexity here is:
>   a. in MemoryContextSetParent() for the same reason as above
>   b. mixing subtrees gets messy, and I'm not sure we can just disallow
> that. Does that mean MemoryContextSetParent() would just fail?

I was maybe wrong about just not bothering to handle
MemoryContextSetParent(), but I'm not all that sure where the
complexity is. Shouldn't it just be a matter of:

If the context has a MemoryPool set:
    If the parent doesn't have one too, just swap parents, as the pool
belongs to the context that's changing parent.
    Else, gather the memory totals for the swapping context and
subtract them from the MemoryPool, set the reparented context's pool
to NULL, and change parent.
Else (no pool is set), just swap parents... I think.

I think there might also need to be a check to see if the new parent
has a pool and ERROR if it does. Maybe that's the messy part?

>   c. joining in to an existing pool -- not terribly complex, but adds
> to the API
>
> I think if we are taking on the complexity with memory pools, we should
> do so with the idea that they'll be useful beyond just optimizing the
> size calculation. For instance, carrying around information that it's
> part of a "work mem" context, which can do things like limit the max
> block size.

You might be right.  It would be interesting to know if those
overheads are lower with the MemoryPool idea. I suspect the overhead
will be quite a bit lower, as there's no pointer chasing when the
MemoryPool isn't set, and that check should be very fast: we have to
access the MemoryContext's mem_allocated field anyway, so we can
expect the cacheline for it to be loaded, or about to be loaded,
depending on which order the accounting is done in.

David

[1] https://postgr.es/m/CAApHDvox3Ro8mZJxignuyB-dGXJ9=wQNEkOFni9025GP=rOKkg@mail.gmail.com


