Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 - Mailing list pgsql-performance

From Scott Carey
Subject Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date
Msg-id CA+vubOE6feEWh7XxoWqdPh634SUVjHKW7s9BgMD755wJTqQFPw@mail.gmail.com
In response to Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
List pgsql-performance

On Tue, Mar 31, 2026 at 5:03 AM David Rowley <dgrowleyml@gmail.com> wrote:

I tried and failed to recreate this locally on 17.9. For me the
json_agg query is slower than array_agg(). I tried making the table
10x bigger and still don't see the same issue. The one with more
work_mem and fewer batches is always faster for me.

Is the machine under a lot of memory pressure and swapping pages to
disk? Maybe you need to consider running a lower work_mem setting. How
much RAM is installed in this machine?

The machine has 768GB of RAM and was at about 35% CPU at the time of these tests. It runs AlmaLinux 9 with 50GB shared_buffers. OS available memory is > 600GB (filled with pagecache).
The system is under heavy load, with many large sequential scans on 1GB to 80GB tables from OLAP queries and large batch updates at any given time. System RAM buffers disk access relatively well: there is a constant stream of 100MB/sec to 200MB/sec from disk (NVMe RAID), with bursts to 2000MB/sec from time to time, but iowait is generally low (0.2%).
The problem reproduces on my Ubuntu 25.10 laptop at idle with a near-empty db and 32MB shared_buffers.
It also reproduces on the read-only streaming standby server, which is extremely idle and swimming in just as much RAM.
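
For anyone else trying to reproduce: the table definition was not posted in this thread, so the setup below is only my guess at a minimal schema and data distribution consistent with the quoted plans (roughly 200,000 rows, ~50,000 product_id groups, two 8-byte columns giving width=16):

```sql
-- Hypothetical repro setup; the table name matches the thread, but the
-- column types and data distribution are assumptions inferred from the
-- EXPLAIN output below (rows=200000, ~50000 groups, width=16).
CREATE TABLE array_agg_test (
    product_id bigint,
    region_id  bigint
);

INSERT INTO array_agg_test (product_id, region_id)
SELECT i % 50000, i % 7
FROM generate_series(1, 200000) AS s(i);

ANALYZE array_agg_test;
```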
 

> set hash_mem_multiplier = 2;
> set work_mem = '100MB';
>
> explain (analyze, buffers) select product_id, array_agg(region_id) from array_agg_test group by product_id;
>                                                         QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
> HashAggregate  (cost=4274.00..4771.49 rows=49749 width=40) (actual time=4628.278..4643.765 rows=50000 loops=1)
>   Group Key: product_id
>   Batches: 1  Memory Usage: 55649kB
>   Buffers: shared hit=1274
>   ->  Seq Scan on array_agg_test  (cost=0.00..3274.00 rows=200000 width=16) (actual time=0.030..16.694 rows=200000 loops=1)
>         Buffers: shared hit=1274
> Planning Time: 0.067 ms
> Execution Time: 4648.698 ms

> Below, note json_agg does not have this problem:
>
> set hash_mem_multiplier = 2;
> set work_mem = '500MB';
>
> explain (analyze, buffers) select product_id, json_agg(region_id) from array_agg_test group by product_id;
>                                                         QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
> HashAggregate  (cost=4274.00..4895.86 rows=49749 width=40) (actual time=110.975..122.781 rows=50000 loops=1)
>   Group Key: product_id
>   Batches: 1  Memory Usage: 67089kB
>   Buffers: shared read=1274
>   ->  Seq Scan on array_agg_test  (cost=0.00..3274.00 rows=200000 width=16) (actual time=0.034..17.659 rows=200000 loops=1)
>         Buffers: shared read=1274
> Planning:
>   Buffers: shared hit=10
> Planning Time: 0.054 ms
> Execution Time: 124.929 ms

What changed here apart from the aggregate function?  Why are the
buffers being read on this run and not the previous? Same machine? Was
there a restart?

I waited a few minutes; it is a busy server. I can run the example back to back with no significant change other than the buffer hits going up. The pages are in the OS page cache if I wait a bit, and even if they are on disk it's an NVMe SSD RAID 10 and the table is 'tiny' for this server.
 

json_agg allocates slightly more memory per agg state than array_agg.
You can see that in the reported Hash Aggregate memory usage and I
expect the actual transition function call between array_agg() and
json_agg() not to differ very much in cost, so it very much feels like
something else is going on here.

There are some other differences from a default config. 
The database was created with
`initdb -E UTF-8`

Other non-default values in postgresql.conf that might be related:
max_files_per_process = 4000  (we have some partitioned tables with a lot of partitions)
effective_io_concurrency = 16
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'

\l+ shows encoding UTF8, Locale Provider libc, Collate en_US.UTF-8 and Ctype en_US.UTF-8
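
If the libc UTF-8 locale turns out to matter, it may be worth confirming the exact settings on each machine that does or does not reproduce. Something like the following (datlocprovider assumes PostgreSQL 15 or later; drop that column for older servers):

```sql
-- Database-level locale configuration (datlocprovider exists in PG 15+):
SELECT datname, datcollate, datctype, datlocprovider
FROM pg_database
WHERE datname = current_database();

-- And the GUC-level locale settings listed above:
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'lc%' OR name = 'default_text_search_config';
```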


I don't know what other differences there could be, other than the OS. This reproduces for me on Linux with the above config on a RHEL 9 clone (pg 17) and on Ubuntu 25.10 (pg 16), so I suspect it is not too picky about the distro used.

-Scott
 

