Re: Parallel Aggregates for string_agg and array_agg - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Parallel Aggregates for string_agg and array_agg
Date
Msg-id CAFj8pRD12EyhXPZGs_q3D0HfzjxvXD4i9Wo5kGCj-3z41tmp7g@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Aggregates for string_agg and array_agg  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
Hi

út 27. 12. 2022 v 6:26 odesílatel David Rowley <dgrowleyml@gmail.com> napsal:
On Wed, 20 Jun 2018 at 19:08, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I'll submit it again when there more consensus that we want this.

Waking up this old thread again. If you don't have a copy, the entire
thread is in [1].

The remaining item that seemed to cause this patch to be rejected was
raised in [2]. The summary of that was that it might not be a good
idea to allow parallel aggregation of string_agg() and array_agg() as
there might be some people who rely on the current ordering they get
without having an ORDER BY clause in the aggregate function call.  Tom
mentioned in [3] that users might not want to add an ORDER BY to their
aggregate function because the performance of it is terrible.  That
was true up until 1349d2790 [4], where I changed how ORDER BY /
DISTINCT aggregation worked to allow the planner to provide pre-sorted
input rather than always having nodeAgg.c do the sorting.  I think
this removes quite a lot of the argument against the patch, but not
all of it.  So here goes testing the water on seeing if any opinions
have changed over the past few years?

A rebased patch is attached.

The technical part of this patch was discussed before, and there were no objections against it.

I did some performance benchmarks and I agree with David. Sure - it can change the ordering, but without the ORDER BY clause, the order depended before too. And the performance with ORDER clause is not changed with and without this patch.

I can confirm that all regress tests passed, and there are not any problems with compilation.

I'll mark this patch as ready for committer

Regards

Pavel


 

David

[1] https://www.postgresql.org/message-id/flat/CAKJS1f98yPkRMsE0JnDh72%3DAQEUuE3atiCJtPVCtjhFwzCRJHQ%40mail.gmail.com#8bbce15b9279d2da2da99071f732a99d
[2] https://www.postgresql.org/message-id/6538.1522096067@sss.pgh.pa.us
[3] https://www.postgresql.org/message-id/18594.1522099194@sss.pgh.pa.us
[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1349d2790bf48a4de072931c722f39337e72055e

pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: New strategies for freezing, advancing relfrozenxid early
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Exit walsender before confirming remote flush in logical replication