On Tue, Mar 27, 2018 at 12:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 27 March 2018 at 09:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I do not think it is accidental that these aggregates are exactly the ones
>> that do not have parallelism support today. Rather, that's because you
>> just about always have an interest in the order in which the inputs get
>> aggregated, which is something that parallel aggregation cannot support.

> This very much reminds me of something that exists in the 8.4 release notes:
>> SELECT DISTINCT and UNION/INTERSECT/EXCEPT no longer always produce sorted output (Tom)
That's a completely false analogy: we got a significant performance benefit for a significant fraction of users by supporting hashed aggregation. My argument here is that only a very tiny fraction of string_agg/array_agg users will not care about aggregation order, and thus I don't believe that this patch can help very many people. Against that, it's likely to hurt other people, by breaking their queries and forcing them to insert expensive explicit sorts to fix it. Even discounting the backwards-compatibility question, we don't normally adopt performance features for which it's unclear that the net gain over all users is positive.
I think you are quite wrong in claiming that only a tiny fraction of users do not care about aggregation order.
This may, and quite probably does, hold true for string_agg(), but not for array_agg(). I see a lot of cases where people use it to load the result into an unordered array/hashmap/set/whatever on the client side, which loses ordering *anyway*, and they would definitely benefit from this patch.
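
To illustrate the client-side pattern I mean, here is a rough sketch in Python. The table, column names and row values are made up for the example, and the rows are hard-coded rather than fetched through a real driver. It only shows that once the array lands in a set, the order in which the server aggregated the elements no longer matters:

```python
# Hypothetical rows, shaped like the result of a query such as:
#   SELECT dept, array_agg(emp_id) FROM emp GROUP BY dept;
# (table and column names are assumptions for illustration only).
# Two element orders the server might plausibly return for the same data:
rows_one_order = [("eng", [1, 2, 3]), ("ops", [7, 8])]
rows_other_order = [("eng", [3, 1, 2]), ("ops", [8, 7])]


def load_members(rows):
    """Load each aggregated array into a set, discarding element order."""
    return {dept: set(ids) for dept, ids in rows}


# Both orderings produce the same client-side structure.
same = load_members(rows_one_order) == load_members(rows_other_order)
print(same)
```

A client written this way gets identical results whichever order the aggregate happens to emit.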