Home > mailing lists

Re: Parallel Seq Scan - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel Seq Scan
Date	September 17, 2015 06:54:45
Msg-id	CAA4eK1JpmQJzLLFOP0eUFo5Kv9hvjQBBA59jYJMrsOFkzLaTmA@mail.gmail.com Whole thread Raw
In response to	Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Parallel Seq Scan
List	pgsql-hackers

Tree view

On Thu, Sep 17, 2015 at 1:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Sep 10, 2015 at 12:12 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> 2. I think it's probably a good idea - at least for now, and maybe
> >> forever - to avoid nesting parallel plans inside of other parallel
> >> plans. It's hard to imagine that being a win in a case like this, and
> >> it certainly adds a lot more cases to think about.
> >
> > I also think that avoiding nested parallel plans is a good step forward.
>
> Doing that as a part of the assess parallel safety patch was trivial, so I did.
>

As per my understanding, what you have done there will not prohibit such

cases.

+ * For now, we don't try to use parallel mode if we're running inside
+ * a parallel worker. We might eventually be able to relax this
+ * restriction, but for now it seems best not to have parallel workers
+ * trying to create their own parallel workers.
+ */
+ glob->parallelModeOK = (cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
+ IsUnderPostmaster && dynamic_shared_memory_type != DSM_IMPL_NONE &&
+ parse->commandType == CMD_SELECT && !parse->hasModifyingCTE &&
+ parse->utilityStmt == NULL && !IsParallelWorker() &&
+ !contain_parallel_unsafe((Node *) parse);

IIUC, your are referring to !IsParallelWorker() check in above code. If yes,

then I think it won't work because we generate the plan in master backend,

parallel worker will never exercise this code. I have tested it as well with

below example and it still generates SubPlan as Funnel.

CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM

generate_series(1, 10000000) g;

CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM

generate_series(1, 1000000) g;

set parallel_seqscan_degree=2;

set cpu_tuple_comm_cost=0.01;

explain select * from t1 where c1 not in (select c1 from t2 where c2 = 'xxxx');

QUERY PLAN

--------------------------------------------------------------------------------

----

Funnel on t1 (cost=11536.88..126809.17 rows=3432492 width=36)

Filter: (NOT (hashed SubPlan 1))

Number of Workers: 2

-> Partial Seq Scan on t1 (cost=11536.88..58159.32 rows=3432492 width=36)

Filter: (NOT (hashed SubPlan 1))

SubPlan 1

-> Funnel on t2 (cost=0.00..11528.30 rows=3433 width=4)

Filter: (c2 = 'xxxx'::text)

Number of Workers: 2

-> Partial Seq Scan on t2 (cost=0.00..4662.68 rows=3433 width

=4)

Filter: (c2 = 'xxxx'::text)

SubPlan 1

-> Funnel on t2 (cost=0.00..11528.30 rows=3433 width=4)

Filter: (c2 = 'xxxx'::text)

Number of Workers: 2

-> Partial Seq Scan on t2 (cost=0.00..4662.68 rows=3433 width=4)

Filter: (c2 = 'xxxx'::text)

(17 rows)

Here the subplan is generated before the top level plan and while generation

of subplan we can't predict whether it is okay to generate it as Funnel or not,

because it might be that top level plan is non-Funnel. Also if such a subplan

is actually an InitPlan, then we are safe (as we execute the InitPlans in

master backend and then pass the result to parallel worker) even if top level

plan is Funnel. I think the place where we can catch this is during the

generation of Funnel path, basically we can evaluate if any nodes beneath

Funnel node has 'filter' or 'targetlist' as another Funnel node, then we have

two options to proceed:

a. Mark such a filter or target list as non-pushable which will indicate that

they need to be executed only in master backend. If we go with this

option, then we have to make Funnel node capable of evaluating Filter

and Targetlist which is not a big thing.

b. Don't choose the current path as Funnel path.

I prefer second one as that seems to be simpler as compare to first and

there doesn't seem to be much benefit in going by first.

Any better ideas?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Thomas Munro
Date: 17 September 2015, 05:23:47
Subject: Re: Support for N synchronous standby servers - take 2

From: Amit Kapila
Date: 17 September 2015, 09:23:40
Subject: Re: a funnel by any other name

Re: Parallel Seq Scan - Mailing list pgsql-hackers

Previous

Next