On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Robert, thanks for asking. We'll be stuck with these words for some time,
> user visible via EXPLAIN so this is important.
I agree, thanks for taking an interest.

> The main operations are the 3 mentioned by Nicolas:
> 1. Send data from many to one - which has subtypes for Unsorted, Sorted and
> Evenly balanced (but unsorted)
> 2. Send data from one process to many
> 3. Send data from many to many
>
> My preferences for this would be
> 1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
> 2. Scatter (since Broadcast only makes sense in the context of a distributed
> query, it sounds weird for intra-node query)
> 3. Redistribution - which implies the description of how we spread data
> across nodes is "Distribution" (or DISTRIBUTED BY)
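
For readers less familiar with the three data movements being named, here is a rough sketch in Python. It is only an illustration of the concepts, not how PostgreSQL or Greenplum implement them: the function names, the round-robin choice for Scatter, and the modulo-hash routing for Redistribute are all assumptions made for the example, and "workers" are modelled as plain lists of rows.

```python
import heapq


def gather(worker_outputs):
    """Many to one, unsorted: concatenate the rows produced by each worker."""
    return [row for rows in worker_outputs for row in rows]


def gather_sorted(worker_outputs, key=None):
    """Many to one, sorted: merge worker outputs that are each already sorted."""
    return list(heapq.merge(*worker_outputs, key=key))


def scatter(rows, n_workers):
    """One to many: deal rows out round-robin (an assumed policy)."""
    return [rows[i::n_workers] for i in range(n_workers)]


def redistribute(worker_outputs, n_workers, key):
    """Many to many: route each row to a worker chosen by its distribution key."""
    targets = [[] for _ in range(n_workers)]
    for rows in worker_outputs:
        for row in rows:
            # hash() of a small int is the int itself, so this is
            # deterministic for the integer keys used in the example.
            targets[hash(key(row)) % n_workers].append(row)
    return targets


workers = [[1, 4, 7], [2, 5], [3, 6]]
print(gather(workers))
print(gather_sorted(workers))
print(scatter([1, 2, 3, 4, 5], 2))
print(redistribute(workers, 2, key=lambda r: r))
```

The key passed to redistribute plays the role that "Distribution" (or DISTRIBUTED BY) plays in the naming above: it decides which worker ends up owning each row.
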
"Scatter" isn't one of the things that I mentioned in my original email. Not sure where we'd use that, although there might be somewhere.
Understood. Thought it best to cover all the phrases we'll use in the future now in one discussion.
> For 3 we should definitely use Redistribute, since this is what Teradata has
> been calling it for 30 years, which is where Greenplum got it from.
That's a reasonable option. We can bikeshed it some more when we get that far.
Sure
> For 1, Gather makes most sense.
Yeah, I'm leaning that way myself. Amit argued for "Parallel Gather" but I think that's overkill. There can't be a non-parallel gather, and long names are a pain.
Agreed
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services