Re: Using distinct in an aggregate prevents parallel execution? - Mailing list pgsql-general

From Tom Lane
Subject Re: Using distinct in an aggregate prevents parallel execution?
Msg-id 6457.1528295547@sss.pgh.pa.us
In response to Using distinct in an aggregate prevents parallel execution?  (Thomas Kellerer <spam_eater@gmx.net>)
Responses Re: Using distinct in an aggregate prevents parallel execution?
List pgsql-general
Thomas Kellerer <spam_eater@gmx.net> writes:
> Is this a known limitation? 

Yes, unless somebody has done radical restructuring of the aggregation
code while I wasn't looking.

agg(DISTINCT ...) is currently implemented inside the Agg plan node,
so it's an indivisible black box to everything else.  That was a
simple, minimum-code-footprint method for implementing the feature
back when; but it's got lots of drawbacks, and one is that there's
no reasonable way to parallelize.

I'd anticipate that before we could even start to think of parallelizing,
we'd have to split out the distinct-ification processing into a separate
plan node.
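
As a rough illustration of the point above (a toy model in Python, not PostgreSQL executor code), the sketch below treats three lists as the row chunks scanned by hypothetical parallel workers. The function names are made up for the example. It suggests why a plain count can be combined from partial results, why a black-box count(DISTINCT ...) cannot, and how a separate de-duplication step would leave only a plain count to combine.

```python
# Toy model only: Python lists stand in for the row chunks that
# parallel workers would scan. All names here are hypothetical.

def partial_count(chunk):
    """Per-worker partial state for count(x): just a row count."""
    return len(chunk)

def partial_count_distinct(chunk):
    """Per-worker answer if count(DISTINCT x) runs as an indivisible box."""
    return len(set(chunk))

def distinct_values(chunk):
    """Assumed separate de-duplication step: emit the distinct values."""
    return set(chunk)

chunks = [[1, 2, 2, 3], [3, 3, 4], [1, 4, 5]]
all_rows = [x for chunk in chunks for x in chunk]

# count(x): partial counts combine correctly by addition.
combined_count = sum(partial_count(c) for c in chunks)

# count(DISTINCT x): adding per-worker answers over-counts values
# that appear in more than one chunk.
naive_distinct = sum(partial_count_distinct(c) for c in chunks)

# De-duplicate as its own step, merge the sets, then apply a plain count.
merged_distinct = len(set().union(*(distinct_values(c) for c in chunks)))
true_distinct = len(set(all_rows))

print(combined_count, naive_distinct, merged_distinct, true_distinct)
```

Here the summed per-worker distinct counts disagree with the true answer, while the merged-set version agrees with it. The cost of merging the sets is ignored in this sketch.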

agg(... ORDER BY ...) has got the same problem, and it'd likely be
advisable to fix that at the same time.
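
The ordered case can be sketched the same way (again a toy Python model with hypothetical names, assuming something like string_agg(x, ',' ORDER BY x)): each worker can sort and concatenate its own rows, but gluing those per-worker strings together does not give the globally ordered result.

```python
# Toy model only: an ordered string aggregate over rows split across
# two hypothetical workers.

def ordered_concat(values):
    """Sort the input, then concatenate: the whole aggregate as one step."""
    return ",".join(sorted(values))

chunks = [["d", "a"], ["c", "b"]]
all_rows = [v for chunk in chunks for v in chunk]

# Serial execution sees every row before sorting.
serial_result = ordered_concat(all_rows)

# Concatenating per-worker results keeps only each worker's local order.
naive_parallel = ",".join(ordered_concat(c) for c in chunks)

print(serial_result)
print(naive_parallel)
```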

            regards, tom lane

