Re: Parallel Aggregates for string_agg and array_agg - Mailing list pgsql-hackers

From David Rowley
Subject Re: Parallel Aggregates for string_agg and array_agg
Date
Msg-id CAApHDvq7ChY83NaCw9e8rP1jUd_hgVj=jespHFY56gA4TynnfA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Aggregates for string_agg and array_agg  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Thu, 19 Jan 2023 at 20:44, David Rowley <dgrowleyml@gmail.com> wrote:
> The attached agg_comparison_scalability.png shows the time it took to
> execute the query with 0 workers divided by the time for N workers,
> thus showing how many times the query was sped up with N workers. The
> scalability of the string_agg() appears slightly better than the sum()
> in this test due to the string_agg() query being slightly slower with
> 0 workers than the sum() version of the query. The sum query got 7.63x
> faster with 32 workers whereas the string_agg query beat that slightly
> with 7.67x faster than when the query was executed with parallelism
> disabled.

I was just looking at this again today and I noticed I messed up the
script and used the sum(b) query twice instead of the
string_agg(b::text, ',') query. oops.

I've attached the correct graphs this time.  The scalability of
string_agg() is not quite as good as I reported before, but it's still
good.  sum(b) with 32 workers with a new benchmark run I found to be
7.15x faster than with 0 parallel workers and string_agg(b::text, ',')
I found to be 6.49x faster with 32 workers than with 0 parallel
workers. This test showed that the string_agg query scaled better with
up to 6 workers before the scalability dropped below that of the
sum(b) query.

David

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: run pgindent on a regular basis / scripted manner
Next
From: Andrew Dunstan
Date:
Subject: Re: run pgindent on a regular basis / scripted manner