Thread: a funnel by any other name

a funnel by any other name

From
Robert Haas
Date:
I have discovered that I have reinvented the wheel.

In http://www.postgresql.org/message-id/CA+TgmobM7X6jgre442638b+33h1EWa=vcZqnsvzEdX057ZHVuw@mail.gmail.com
I invented an operator called Funnel, whose job it is to fire up a
bunch of workers and run a plan in all of them, so that we can
eventually generate plans like this:

Funnel
Number of Workers: 4
-> Hash Join -> Partial Seq Scan on a -> Hash   -> Seq Scan on b

The idea is that each worker will read all of b and build a hash
table, and about 1/4 of a, and do the join of its portion of a to all
of b, and then the funnel will combine the separate flows of tuples
into one.   Funny thing though, it turns out that I'm not the first
person to invent this operator.  Greenplum has it, and there it's
called "Gather Motion".[1]  However, it seems like the most systems
use the name "Exchange" to refer to a more powerful operator which can
do this thing among others.[2,3]

The Volcano paper[3] is particularly interesting for a couple of
reasons.  First, it's about 20 years old.  Second, it may be the
earliest reference to the exchange operator; the authors speak as if
they invented it.  Third, it describes several variants of exchange:

1. One parent, one child (see p. 11, under A. Vertical Parallelism).
The child can execute in parallel with the parent.  This is basically
the same as a Funnel if you had only one worker and didn't ever let
the parent execute the stuff under the funnel.

2. One parent, multiple children (see p. 12, last paragraph).  Classic funnel.

3. Multiple parents, multiple children, with a copy to each parent
(see p. 13, under "Variants of the Exchange Operator").

4. Exchange-merge (see p. 13, column two).  This is like what I
described as a funnel, but instead of returning all of the results in
the order they're returned, it merges a set of streams into a single
stream.

5. Interchange (see end of p. 13, continuing on p. 14).  This is an
operator for "repartitioning" or "distributed shuffle" or whatever you
want to call it.  The portion of the plan tree below the operator is
run to generate tuples, which are then consumed by the portion of the
plan tree above the operator.  But between the two the tuples are
shuffled back and forth between cooperating workers so that each
worker ends up with the tuples that map to some partition.  For
example, you can imagine that below the interchange you might have a
parallel sequential scan of a table with a column x.  Let h be a hash
function.  The interchange operator will move the tuples around so
that, above the interchange, each worker will see all and only those
tuples where h(x) % number_of_workers = my_worker_number.

This paper basically calls all of these things an Exchange, but
they're clearly all somewhat different from each other, so I'm not
sure it's a good idea to use the name Exchange for our node.  At the
same time, I'm not sure it's a good idea to use terminology that I
invented completely out of whole cloth in preference to terminology
that's seems to be somewhat standard.  One idea is to call them all
Exchange nodes but with a subtype.  For example, using the Volcano's
paper's terminology, we could call these:

1. Exchange Bushy
2. Exchange Inter-Operator (this is what's currently implemented)
3. Exchange Replicate
4. Exchange Merge
5. Interchange

Or taking inspiration from Greenplum, we could go with:

1. ?
2. Gather
3. Broadcast (sorta)
4. Gather Merge
5. Redistribute

Or maybe something like this:

1. Parallel Child
2. Parallel Gather
3. Parallel Replicate
4. Parallel Merge
5. Parallel Redistribute

Or, yet another option, we could combine the similar operators under
one umbrella while keeping the things that are more different as
separate nodes:

1, 2. Exchange (or Gather or Funnel)
3, 5. Distribute (or Redistribute or Interchange or Exchange)
4. Exchange Merge (or Gather Merge or Funnel Merge)

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://www.ndm.net/datawarehouse/Greenplum/parallel-query-optimizer
[2]
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00743.1570/html/queryprocessing/CHDHHIIF.htm
[3] http://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf



Re: a funnel by any other name

From
Petr Jelinek
Date:
On 2015-09-17 04:39, Robert Haas wrote:
>
> 1. Exchange Bushy
> 2. Exchange Inter-Operator (this is what's currently implemented)
> 3. Exchange Replicate
> 4. Exchange Merge
> 5. Interchange
>
> Or taking inspiration from Greenplum, we could go with:
>
> 1. ?
> 2. Gather
> 3. Broadcast (sorta)
> 4. Gather Merge
> 5. Redistribute
>
> Or maybe something like this:
>
> 1. Parallel Child
> 2. Parallel Gather
> 3. Parallel Replicate
> 4. Parallel Merge
> 5. Parallel Redistribute
>
> Or, yet another option, we could combine the similar operators under
> one umbrella while keeping the things that are more different as
> separate nodes:
>
> 1, 2. Exchange (or Gather or Funnel)
> 3, 5. Distribute (or Redistribute or Interchange or Exchange)
> 4. Exchange Merge (or Gather Merge or Funnel Merge)
>
> Thoughts?
>

Interesting read.

I think 1 and 2 are similar enough to be same node (Exchange sounds good 
to me).

Exchange Merge for 4 also sounds good.

About 3 and 5, if I understand correctly those are similar with the main 
difference being that in 3 all parents get copy of every tuple while in 
5 the tuples are partitioned between the parents. Sounds reasonable to 
have Redistribute/Interchange or something like that for both with some 
additional info saying if tuples are being partitioned or duplicated.

In any case, let's not name any of the nodes as "Replicate".

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: a funnel by any other name

From
Amit Kapila
Date:
On Thu, Sep 17, 2015 at 8:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Or, yet another option, we could combine the similar operators under
> one umbrella while keeping the things that are more different as
> separate nodes:
>
> 1, 2. Exchange (or Gather or Funnel)
> 3, 5. Distribute (or Redistribute or Interchange or Exchange)
> 4. Exchange Merge (or Gather Merge or Funnel Merge)
>

+1 for combining, but it seems better to call 1,2 as Parallel Gather
and similarly for others.  Adding Parallel to Gather makes it
self-explanatory.


With Regards,
Amit Kapila.

Re: a funnel by any other name

From
Nicolas Barbier
Date:
2015-09-17 Robert Haas <robertmhaas@gmail.com>:

> 1. Exchange Bushy
> 2. Exchange Inter-Operator (this is what's currently implemented)
> 3. Exchange Replicate
> 4. Exchange Merge
> 5. Interchange

> 1. ?
> 2. Gather
> 3. Broadcast (sorta)
> 4. Gather Merge
> 5. Redistribute

> 1. Parallel Child
> 2. Parallel Gather
> 3. Parallel Replicate
> 4. Parallel Merge
> 5. Parallel Redistribute

FYI, SQL Server has these in its execution plans:

* Distribute Streams: read from one thread, write to multiple threads
* Repartition Streams: both read and write from/to multiple threads
* Gather Streams: read from multiple threads, write to one thread

Nicolas

-- 
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?



Re: a funnel by any other name

From
Simon Riggs
Date:
On 17 September 2015 at 05:07, Nicolas Barbier <nicolas.barbier@gmail.com> wrote:
2015-09-17 Robert Haas <robertmhaas@gmail.com>:

> 1. Exchange Bushy
> 2. Exchange Inter-Operator (this is what's currently implemented)
> 3. Exchange Replicate
> 4. Exchange Merge
> 5. Interchange

> 1. ?
> 2. Gather
> 3. Broadcast (sorta)
> 4. Gather Merge
> 5. Redistribute

> 1. Parallel Child
> 2. Parallel Gather
> 3. Parallel Replicate
> 4. Parallel Merge
> 5. Parallel Redistribute

FYI, SQL Server has these in its execution plans:

* Distribute Streams: read from one thread, write to multiple threads
* Repartition Streams: both read and write from/to multiple threads
* Gather Streams: read from multiple threads, write to one thread

Robert, thanks for asking. We'll be stuck with these words for some time, user visible via EXPLAIN so this is important.

In general we should stick to words already used in other similar situations, which could include DBMS and parallel ETL tools, of which there are many more than mentioned here.

I would be against using any of these words: Funnel, Motion, Bushy because I don't find them very descriptive (I think of spiders, bowels and shrubs respectively, sorry).

These words are liable to confusion with other concepts: Replicate, Duplicate, Distribute, Partition, Repartition, MERGE.

I've seen this concept called Fan-In/Fan-Out and Scatter/Gather

The main operations are the 3 mentioned by Nicolas:
1. Send data from many to one - which has subtypes for Unsorted, Sorted and Evenly balanced (but unsorted)
2. Send data from one process to many
3. Send data from many to many

My preferences for this would be 
1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
2. Scatter (since Broadcast only makes sense in the context of a distributed query, it sounds weird for intra-node query)
3. Redistribution - which implies the description of how we spread data across nodes is "Distribution" (or DISTRIBUTED BY)

For 3 we should definitely use Redistribute, since this is what Teradata has been calling it for 30 years, which is where Greenplum got it from.
For 1, Gather makes most sense.

For 2, it could be either Scatter or Distribute. The former works well with Gather, the latter works well with Redistribute.

Sorry for my absence for further review on parallel ops.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: a funnel by any other name

From
Robert Haas
Date:
On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Robert, thanks for asking. We'll be stuck with these words for some time,
> user visible via EXPLAIN so this is important.

I agree, thanks for taking an interest.

> The main operations are the 3 mentioned by Nicolas:
> 1. Send data from many to one - which has subtypes for Unsorted, Sorted and
> Evenly balanced (but unsorted)
> 2. Send data from one process to many
> 3. Send data from many to many
>
> My preferences for this would be
> 1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
> 2. Scatter (since Broadcast only makes sense in the context of a distributed
> query, it sounds weird for intra-node query)
> 3. Redistribution - which implies the description of how we spread data
> across nodes is "Distribution" (or DISTRIBUTED BY)

"Scatter" isn't one of the things that I mentioned in my original
email.  Not sure where we'd use that, although there might be
somewhere.

> For 3 we should definitely use Redistribute, since this is what Teradata has
> been calling it for 30 years, which is where Greenplum got it from.

That's a reasonable option.  We can bikeshed it some more when we get that far.

> For 1, Gather makes most sense.

Yeah, I'm leaning that way myself.  Amit argued for "Parallel Gather"
but I think that's overkill.  There can't be a non-parallel gather,
and long names are a pain.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: a funnel by any other name

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> > For 1, Gather makes most sense.
> 
> Yeah, I'm leaning that way myself.  Amit argued for "Parallel Gather"
> but I think that's overkill.  There can't be a non-parallel gather,
> and long names are a pain.

"Gather" seems a pretty decent choice to me too, even if we only have a
single worker (your "1").  I don't think there's much need to
distinguish 1 from 2, is there?

We can bikeshed the other names when the time comes; the insight in the
thread is good to have.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: a funnel by any other name

From
Simon Riggs
Date:
On 22 September 2015 at 20:34, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Robert, thanks for asking. We'll be stuck with these words for some time,
> user visible via EXPLAIN so this is important.

I agree, thanks for taking an interest.

> The main operations are the 3 mentioned by Nicolas:
> 1. Send data from many to one - which has subtypes for Unsorted, Sorted and
> Evenly balanced (but unsorted)
> 2. Send data from one process to many
> 3. Send data from many to many
>
> My preferences for this would be
> 1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
> 2. Scatter (since Broadcast only makes sense in the context of a distributed
> query, it sounds weird for intra-node query)
> 3. Redistribution - which implies the description of how we spread data
> across nodes is "Distribution" (or DISTRIBUTED BY)

"Scatter" isn't one of the things that I mentioned in my original
email.  Not sure where we'd use that, although there might be
somewhere.

Understood. Thought it best to cover all the phrases we'll use in the future now in one discussion.
 
> For 3 we should definitely use Redistribute, since this is what Teradata has
> been calling it for 30 years, which is where Greenplum got it from.

That's a reasonable option.  We can bikeshed it some more when we get that far.

Sure
 
> For 1, Gather makes most sense.

Yeah, I'm leaning that way myself.  Amit argued for "Parallel Gather"
but I think that's overkill.  There can't be a non-parallel gather,
and long names are a pain.

Agreed 

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: a funnel by any other name

From
Simon Riggs
Date:
On 22 September 2015 at 21:14, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Robert Haas wrote:
> On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> > For 1, Gather makes most sense.
>
> Yeah, I'm leaning that way myself.  Amit argued for "Parallel Gather"
> but I think that's overkill.  There can't be a non-parallel gather,
> and long names are a pain.

"Gather" seems a pretty decent choice to me too, even if we only have a
single worker (your "1").  I don't think there's much need to
distinguish 1 from 2, is there?

I think so. 1 is Many->1 and the other is 1->Many.

You may wish to do an operation like a parallel merge join.

Parallel Sort -> Scatter -> Parallel Merge

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services