Re: Parallel Aggregates for string_agg and array_agg - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Parallel Aggregates for string_agg and array_agg
Date	May 2, 2018 00:21:27
Msg-id	20180501212127.6rqw4wj6osxtjyvx@alap3.anarazel.de
In response to	Re: Parallel Aggregates for string_agg and array_agg (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On 2018-05-01 17:16:16 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-05-01 14:09:39 -0700, Mark Dilger wrote:
> >> I don't care which order the data is in, as long as x[i] and y[i] are
> >> matched correctly.  It sounds like this patch would force me to write
> >> that as, for example:
> >> 
> >> select array_agg(a order by a, b) AS x, array_agg(b order by a, b) AS y
> >> from generate_a_b_func(foo);
> >> 
> >> which I did not need to do before.
> 
> > Why would it require that? Rows are still processed row-by-row even if
> > there's parallelism, no?
> 
> Yeah, as long as we distribute all the aggregates in the same way,
> it seems like they'd all see the same random-ish input ordering.
> I can vaguely conceive of future optimizations that might break
> that, but not what we have today.

Yea, a column store with a and b being in different column sets,
or a and b originating from different tables and processing the two
aggregates in independent parts of the query tree, or other similar
stuff, could result in trouble for the above assumption. But that seems
pretty unrelated to the matter at hand...
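
To make the assumption concrete, here is a toy Python model of it. It is
only a sketch, not PostgreSQL code: the round-robin row distribution, the
worker count, and every function name are hypothetical. It shows that when
both aggregates build their partial states from the same rows and are
combined in the same worker order, x[i] and y[i] stay matched even though
the overall order changes, and that combining the two aggregates in
different orders breaks the pairing.

```python
from typing import List, Sequence, Tuple

Row = Tuple[int, str]  # (a, b)


def partial_states(rows: Sequence[Row], n_workers: int) -> Tuple[List[list], List[list]]:
    """Split rows round-robin (an assumed distribution) and let each worker
    build partial array_agg states for a and b from the same rows."""
    chunks = [rows[w::n_workers] for w in range(n_workers)]
    partial_a = [[a for a, _ in chunk] for chunk in chunks]
    partial_b = [[b for _, b in chunk] for chunk in chunks]
    return partial_a, partial_b


def combine(partials: List[list], worker_order: Sequence[int]) -> list:
    """Concatenate per-worker partial states in the given worker order."""
    out: list = []
    for w in worker_order:
        out.extend(partials[w])
    return out


rows: List[Row] = [(i, f"b{i}") for i in range(10)]
partial_a, partial_b = partial_states(rows, n_workers=3)

# Same combine order for both aggregates: the pairing is preserved.
x = combine(partial_a, [0, 1, 2])
y = combine(partial_b, [0, 1, 2])
matched = all(b == f"b{a}" for a, b in zip(x, y))

# Different combine orders (e.g. aggregates evaluated independently):
# the pairing is lost.
y_independent = combine(partial_b, [2, 1, 0])
matched_independent = all(b == f"b{a}" for a, b in zip(x, y_independent))

print(x)
print(matched, matched_independent)
```

In this model the result order differs from the input order, but that is
harmless for the query above as long as both arrays are permuted identically.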

Greetings,

Andres Freund
