Re: PATCH: decreasing memory needlessly consumed by array_agg - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: PATCH: decreasing memory needlessly consumed by array_agg
Date
Msg-id 1424569116.12308.235.camel@jeff-desktop
Whole thread Raw
In response to Re: PATCH: decreasing memory needlessly consumed by array_agg  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: PATCH: decreasing memory needlessly consumed by array_agg  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Wed, 2015-01-28 at 23:25 +0100, Tomas Vondra wrote:
> 3) moves the assert into the 'if (release)' branch

You missed one, but I got it.

> 4) includes the comments proposed by Ali Akbar in his reviews
>
>    Warnings at makeArrayResult/makeMdArrayResult about freeing memory
>    with private subcontexts.

I also edited the comments substantially.

> Regarding the performance impact of decreasing the size of the
> preallocated array from 64 to just 8 elements, I tried this.
>
>   CREATE TABLE test AS
>   SELECT mod(i,100000) a, i FROM generate_series(1,64*100000) s(i);
>
>   SELECT a, array_agg(i) AS x FRM test GROUP BY 1;
>
> or actually (to minimize transfer overhead):
>
>   SELECT COUNT(x) FROM (
>      SELECT a, array_agg(i) AS x FRM test GROUP BY 1
>   ) foo;

That's actually a bogus test -- array_agg is never executed. See the
test script I used (attached). Many small groups has a significant
improvement with your patch (median 167ms unpatched, 125ms patched), and
the one large group is not measurably different (median 58ms for both).

The test script uses a small dataset of 100K tuples, because 1M tuples
will run out of memory for small groups without the patch.

Committed.

Regards,
    Jeff Davis

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Parallel Seq Scan
Next
From: Peter Geoghegan
Date:
Subject: Abbreviated keys for text cost model fix