Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 - Mailing list pgsql-performance

From Tom Lane
Subject Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date
Msg-id 3579239.1775151498@sss.pgh.pa.us
Whole thread Raw
In response to Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17  (Scott Carey <scott.carey@algonomy.com>)
Responses Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
List pgsql-performance
Scott Carey <scott.carey@algonomy.com> writes:
> I have discovered the root cause.
> This database is old.  It pre-dates Postgres 8.4 which introduced
> array_agg.   Apparently, in some version prior to 8.4 array_agg was added
> as a user function, defined as below for bigint:

>  create AGGREGATE array_agg(
>       BASETYPE = bigint,
>       SFUNC = array_append,
>       STYPE = bigint[],
>       INITCOND = '{}'
>  );

> So if you create a test database and run the previous test, performance
> will be fine and the query will be fast.  Then run:
> create AGGREGATE array_agg(BASETYPE = bigint, SFUNC = array_append,STYPE =
> bigint[], INITCOND = '{}');

> It will be slow and reproduce this behavior.

Thank you for running that to ground!  I confirm your results that v13
and up are far slower for this example than v12 was.

> Why would this run so much more slowly after updating from postgres 12 to
> 17?   It is a user defined aggregate, although maybe not as optimized as
> the intrinsic one it shouldn't behave this way.

I did some bisecting using the attached simplified test case, and found
that the query execution time jumps from circa 60ms to circa 7500ms here:

1f39bce021540fde00990af55b4432c55ef4b3c7 is the first bad commit
commit 1f39bce021540fde00990af55b4432c55ef4b3c7
Author: Jeff Davis <jdavis@postgresql.org>
Date:   Wed Mar 18 15:42:02 2020 -0700

    Disk-based Hash Aggregation.

    While performing hash aggregation, track memory usage when adding new
    groups to a hash table. If the memory usage exceeds work_mem, enter
    "spill mode".

(Times quoted are on a Mac M4 Pro, but in assert-enabled builds so
maybe not directly comparable to production.)

I'm bemused as to why: the test case has work_mem set high enough that
we shouldn't be triggering spill mode, so why did this change affect
it at all?

            regards, tom lane

CREATE AGGREGATE array_agg(
      BASETYPE = bigint,
      SFUNC = array_append,
      STYPE = bigint[],
      INITCOND = '{}'
);

drop table if exists array_agg_test;

create table array_agg_test(product_id bigint not null, region_id bigint
not null, available boolean not null);

insert into array_agg_test (product_id, region_id, available) SELECT
generate_series(1, 50000) as product_id,
(ARRAY[1,2,3,4])[floor(random()*4)+1] as region_id, true as available;

insert into array_agg_test (product_id, region_id, available) SELECT
generate_series(1, 50000) as product_id,
(ARRAY[11,12,13,14])[floor(random()*4)+1] as region_id, true as available;

insert into array_agg_test (product_id, region_id, available) SELECT
generate_series(1, 50000) as product_id,
(ARRAY[111,112,113,114])[floor(random()*4)+1] as region_id, true as
available;

insert into array_agg_test (product_id, region_id, available) SELECT
generate_series(1, 50000) as product_id,
(ARRAY[1111,1112,1113,1114])[floor(random()*4)+1] as region_id, true as
available;

vacuum analyze array_agg_test;

\set ECHO all

-- set hash_mem_multiplier = 2;
set work_mem = "200MB";

explain (analyze, buffers) select product_id, array_agg(region_id) from
array_agg_test group by product_id;

pgsql-performance by date:

Previous
From: Scott Carey
Date:
Subject: Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Next
From: Scott Carey
Date:
Subject: Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17