Re: Parallel Aggregates for string_agg and array_agg - Mailing list pgsql-hackers

From David Rowley
Subject Re: Parallel Aggregates for string_agg and array_agg
Date
Msg-id CAKJS1f_yDsOrSffHBqAN3Gx27R1vW6OHaKy_EwnhnLZJmo8+Mw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Aggregates for string_agg and array_agg  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Parallel Aggregates for string_agg and array_agg  ("Tels" <nospam-pg-abuse@bloodgate.com>)
Re: Parallel Aggregates for string_agg and array_agg  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Hi Tomas,

Thanks for taking another look.

On 5 April 2018 at 07:12, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Seems fine to me, although we should handle the anyarray case too, I
> guess. That is, get_agg_clause_costs_walker() should do this too:
>
>     /* Same thing for array_agg_array_(de)serialize. */
>     if ((aggserialfn == F_ARRAY_AGG_ARRAY_SERIALIZE ||
>          aggdeserialfn == F_ARRAY_AGG_ARRAY_DESERIALIZE) &&
>          !agg_args_support_sendreceive(aggref))
>         costs->hasNonSerial = true;

hmm, array_agg_array_serialize and array_agg_array_deserialize don't
use the send/receive functions though, so not sure why that's
required?

> Other than that, the patch seems fine to me, and it's already marked as
> RFC so I'll leave it at that.

Thanks.

> The last obstacle seems to be the argument about the risks of the patch
> breaking queries of people relying on the ordering. Not sure what's are
> the right next steps in this regard ...

yeah, seems like a bit of a stalemate.

Personally, think if the group of people Tom mentions do exist, then
they've already been through some troubled times since Parallel Query
was born. I'd hope they've already put up their defenses due to the
advent of that feature. I stand by my thoughts that it's pretty
bizarre to draw the line here when we've probably been causing these
people issues for many years already. I said my piece on this already
so likely not much point in going on about it. These people are also
perfectly capable of sidestepping the whole issue with setting
max_parallel_workers_per_gather TO 0.

Perhaps one solution is to drop the string_agg and keep the array_agg.
Unsure if that would be good enough for Tom? More people seem to have
voiced that array_agg ordering is generally less of a concern, which I
imagine is probably true, but my opinion might not matter here as I'm
the sort of person who if I needed a specific ordering I'd have
written an ORDER BY clause.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] Runtime Partition Pruning
Next
From: Thomas Munro
Date:
Subject: Unstable number of workers in select_parallel test on spurfowl