On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Robert, thanks for asking. We'll be stuck with these words for some time, and
> they are user-visible via EXPLAIN, so this is important.
I agree; thanks for taking an interest.
> The main operations are the 3 mentioned by Nicolas:
> 1. Send data from many to one - which has subtypes for Unsorted, Sorted and
> Evenly balanced (but unsorted)
> 2. Send data from one process to many
> 3. Send data from many to many
>
> My preferences for this would be
> 1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
> 2. Scatter (since Broadcast only makes sense in the context of a distributed
> query, it sounds weird for intra-node query)
> 3. Redistribution - which implies the description of how we spread data
> across nodes is "Distribution" (or DISTRIBUTED BY)
"Scatter" isn't one of the things that I mentioned in my original
email. I'm not sure where we'd use it, although there may be a place
for it somewhere.
> For 3 we should definitely use Redistribute, since this is what Teradata has
> been calling it for 30 years, which is where Greenplum got it from.
That's a reasonable option. We can bikeshed it some more when we get that far.
> For 1, Gather makes most sense.
Yeah, I'm leaning that way myself. Amit argued for "Parallel Gather"
but I think that's overkill. There can't be a non-parallel gather,
and long names are a pain.
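To make the three operations in the list above concrete, here is a rough
Python sketch. It is only an illustration, not a description of how the
executor works: the function names, the round-robin policy in scatter(),
and the hash-modulo placement in redistribute() are assumptions made for
the example, and each worker is modeled as a plain list of rows.

```python
import heapq


def gather(parts):
    """Many to one, unsorted: concatenate worker outputs in worker order."""
    return [row for part in parts for row in part]


def gather_sorted(parts, key):
    """Many to one, sorted: merge inputs that are each already sorted by key."""
    return list(heapq.merge(*parts, key=key))


def scatter(rows, n):
    """One to many: deal rows out round-robin (assumed policy)."""
    out = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        out[i % n].append(row)
    return out


def redistribute(parts, n, key):
    """Many to many: route each row to a target chosen by its key.

    hash(key) % n is an assumed distribution rule for illustration.
    """
    out = [[] for _ in range(n)]
    for part in parts:
        for row in part:
            out[hash(key(row)) % n].append(row)
    return out
```

With two workers holding [1, 4] and [2, 3], gather() simply concatenates
them, while gather_sorted() merges them so the result stays ordered.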
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company