I also had a quick look at the patch and the comments made so far. Summary:
1. The performance results are promising.
2. The code needs comments.
Regarding the design:
Thomas Munro mentioned the idea of a "Parallel Repartition" node that
would redistribute tuples like this. As I understand it, the difference
is that this BatchSort implementation collects all tuples in a tuplesort
or a tuplestore, while a Parallel Repartition node would just
redistribute the tuples to the workers, without buffering. The receiving
worker could then put the tuples into a tuplestore or a tuplesort if needed.
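
To make the intended data flow concrete, here is a toy, self-contained C
sketch (not PostgreSQL executor code; NUM_WORKERS, route_tuple() and the
per-worker counters are made up for illustration only). The point is that
each tuple is hashed and forwarded to its destination worker as soon as it
is produced, with nothing buffered in a tuplesort or tuplestore in between:

/*
 * Toy sketch of non-buffering repartitioning.  In the real executor the
 * "queues" would be shared-memory tuple queues, one per worker.
 */
#include <stdio.h>

#define NUM_WORKERS 4

/* Number of tuples forwarded to each (hypothetical) worker. */
static int queue_counts[NUM_WORKERS];

/* Non-buffering: hash the key and forward the tuple immediately. */
static void
route_tuple(int key)
{
    int dest = (unsigned int) key % NUM_WORKERS;

    queue_counts[dest]++;   /* in the real thing: send the tuple to worker 'dest' */
}

int
main(void)
{
    /* Pretend these are key values streaming out of a Parallel Seq Scan. */
    int keys[] = {42, 7, 42, 13, 7, 99, 42};
    int nkeys = sizeof(keys) / sizeof(keys[0]);

    /*
     * A buffering BatchSort/batchstore would first append every tuple to a
     * tuplesort/tuplestore here.  A non-buffering repartition node instead
     * routes each tuple as soon as it arrives.
     */
    for (int i = 0; i < nkeys; i++)
        route_tuple(keys[i]);

    for (int w = 0; w < NUM_WORKERS; w++)
        printf("worker %d received %d tuples\n", w, queue_counts[w]);

    return 0;
}

All tuples with the same key land in the same worker's queue, which is what
lets each worker finish its part of the sort or aggregation independently.
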
I think a non-buffering Repartition node would be simpler, and thus
better. In these patches, you have a BatchSort node and a batchstore node,
but a single Parallel Repartition node could do both. For example, to
implement DISTINCT:
Gather
  -> Unique
    -> Sort
      -> Parallel Redistribute
        -> Parallel Seq Scan
And a Hash Agg would look like this:
Gather
  -> Hash Agg
    -> Parallel Redistribute
      -> Parallel Seq Scan

In both plans, the Parallel Redistribute node routes tuples by a hash of the
distinct/grouping key, so each worker sees every tuple for the keys assigned
to it and no extra combining step is needed above the Gather.
I'm marking this as Waiting on Author in the commitfest.
- Heikki