I also had a quick look at the patch and the comments made so far. Summary:
1. The performance results are promising.
2. The code needs comments.
Regarding the design:
Thomas Munro mentioned the idea of a "Parallel Repartition" node that
would redistribute tuples like this. As I understand it, the difference
is that this BatchSort implementation collects all tuples in a tuplesort
or a tuplestore, while a Parallel Repartition node would just
redistribute the tuples to the workers, without buffering. The receiving
worker could then put the tuples into a tuplestore or a tuplesort if needed.
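
To make the intended data flow concrete, here is a toy, self-contained C
sketch (not PostgreSQL executor code; NUM_WORKERS, route_tuple() and the
per-worker counters are made up for illustration only). The point is that
each tuple is hashed and forwarded to its destination worker as soon as it
is produced, with nothing buffered in a tuplesort or tuplestore in between:

/*
 * Toy sketch of non-buffering repartitioning.  In the real executor the
 * "queues" would be shared-memory tuple queues, one per worker.
 */
#include <stdio.h>

#define NUM_WORKERS 4

/* Number of tuples forwarded to each (hypothetical) worker. */
static int queue_counts[NUM_WORKERS];

/* Non-buffering: hash the key and forward the tuple immediately. */
static void
route_tuple(int key)
{
    int dest = (unsigned int) key % NUM_WORKERS;

    queue_counts[dest]++;   /* in the real thing: send the tuple to worker 'dest' */
}

int
main(void)
{
    /* Pretend these are key values streaming out of a Parallel Seq Scan. */
    int keys[] = {42, 7, 42, 13, 7, 99, 42};
    int nkeys = sizeof(keys) / sizeof(keys[0]);

    /*
     * A buffering BatchSort/batchstore would first append every tuple to a
     * tuplesort/tuplestore here.  A non-buffering repartition node instead
     * routes each tuple as soon as it arrives.
     */
    for (int i = 0; i < nkeys; i++)
        route_tuple(keys[i]);

    for (int w = 0; w < NUM_WORKERS; w++)
        printf("worker %d received %d tuples\n", w, queue_counts[w]);

    return 0;
}

All tuples with the same key land in the same worker's queue, which is what
lets each worker finish its part of the sort or aggregation independently.
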
I think a non-buffering Repartition node would be simpler, and thus
better. In these patches, you have a BatchSort node and a batchstore node,
but a single Parallel Repartition node could do both. For example, to
implement DISTINCT:
Gather
  -> Unique
    -> Sort
      -> Parallel Redistribute
        -> Parallel Seq Scan
And a Hash Agg would look like this:
Gather
  -> Hash Agg
    -> Parallel Redistribute
      -> Parallel Seq Scan

In both plans, the Parallel Redistribute node routes tuples by a hash of the
distinct/grouping key, so each worker sees every tuple for the keys assigned
to it and no extra combining step is needed above the Gather.
I'm marking this as Waiting on Author in the commitfest.
- Heikki