Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parallel Seq Scan
Date
Msg-id CA+TgmoYwdH_b0-fqeaHd_zfHxQLS2e9j+VuB-Qg-fPz=Xy=mhQ@mail.gmail.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Oct 23, 2015 at 3:35 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Considering parallelism at the RelOptInfo level, the way the patch does it,
> won't cover the RelOptInfos of child relations under an Append node.
> Refer to build_simple_rel().

Hmm, true, but what can go wrong there?  The same quals apply to both,
and either both are temp or neither is.

> Also, when parallelism is not enabled, e.g. max_parallel_degree = 0,
> the current approach adds the overhead of traversing the
> baserestrictinfo needlessly. I think one way to avoid that would be to check
> for it while setting the parallelModeOK flag.

Good idea.
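
A minimal sketch of that short-circuit, using stand-in types rather than
the real planner structs.  SketchRel, SketchQual, and
sketch_set_consider_parallel() are hypothetical names, and folding the
temp-relation test into the same function is an assumption made for
illustration, not what the patch necessarily does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real PostgreSQL structs. */
typedef struct SketchQual
{
    bool        parallel_restricted;    /* qual uses a restricted function */
    struct SketchQual *next;
} SketchQual;

typedef struct SketchRel
{
    bool        is_temp;                /* relation is a temporary table */
    SketchQual *baserestrictinfo;       /* linked list of restriction quals */
    bool        consider_parallel;      /* result of the check */
} SketchRel;

/* Counts visited quals so the saved work is observable. */
int         sketch_quals_visited = 0;

void
sketch_set_consider_parallel(SketchRel *rel, bool parallelModeOK)
{
    const SketchQual *q;

    assert(rel != NULL);
    rel->consider_parallel = false;

    /* Cheap tests first: never walk the quals if parallelism is off. */
    if (!parallelModeOK || rel->is_temp)
        return;

    for (q = rel->baserestrictinfo; q != NULL; q = q->next)
    {
        sketch_quals_visited++;
        if (q->parallel_restricted)
            return;
    }
    rel->consider_parallel = true;
}
```

With parallelModeOK false, the function returns before touching the qual
list, so the max_parallel_degree = 0 case pays nothing for the check.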

> Another point is that it will consider parallelism for cases where we really
> can't parallelize, for example a foreign table or a sample scan.

As soon as we add the ability to push joins below Gather nodes, we
will be able to parallelize that stuff if it is joined to something we
can parallelize.  That's why this flag is so handy.

> One thing to note here is that we already have a precedent for verifying qual
> push-down safety during path generation (during subquery path generation),
> so it doesn't seem wrong to do the same for parallel paths, and it would
> minimize the cases where we need to evaluate parallelism.

Mmm, yeah.

>> The advantage of this is that the logic is centralized.  If we have
>> parallel seq scan and also, say, parallel bitmap heap scan, your
>> approach would require that we duplicate the logic to check for
>> parallel-restricted functions for each path generation function.
>
> Don't we need that anyway, irrespective of caching it in RelOptInfo?
> During bitmap path creation, the bitmapqual could contain something
> which needs to be evaluated for parallel-safety, as it is built from
> index paths, which in turn can be based on some join clause.  As per the
> patch, join clause parallel-safety is checked much later than
> bitmap path generation.

Yes, it's possible there could be some additional checks needed here
for parameterized paths.  But we're not quite there yet, so I think we
can solve that problem when we get there.  I have it in mind that in
the future we may want a parallel_safe flag on each path, which would
normally match the consider_parallel flag on the RelOptInfo but could
instead be false if the path internally uses parallelism (since,
currently, Gather nodes cannot be nested) or if it's got
parallel-restricted parameterized quals.  However, that seems like
future work.
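
To make the idea concrete, here is a rough sketch of how such a per-path
flag might be derived.  Nothing like this exists in the patch; SketchPath
and its fields are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; NOT the real PostgreSQL Path struct. */
typedef struct SketchPath
{
    bool        rel_consider_parallel;      /* flag on the parent RelOptInfo */
    bool        uses_gather;                /* path already contains a Gather */
    bool        has_restricted_param_quals; /* parallel-restricted
                                             * parameterized quals */
} SketchPath;

bool
sketch_path_parallel_safe(const SketchPath *path)
{
    assert(path != NULL);

    /* Normally the path simply inherits the rel-level answer ... */
    if (!path->rel_consider_parallel)
        return false;

    /* ... but Gather nodes cannot currently be nested ... */
    if (path->uses_gather)
        return false;

    /* ... and restricted parameterized quals must stay in the leader. */
    if (path->has_restricted_param_quals)
        return false;

    return true;
}
```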

> +    else if (IsA(node, SubPlan) || IsA(node, SubLink) ||
> +             IsA(node, AlternativeSubPlan) || IsA(node, Param))
> +    {
> +        /*
> +         * Since we don't have the ability to push subplans down to workers
> +         * at present, we treat subplan references as parallel-restricted.
> +         */
> +        if (!context->allow_restricted)
> +            return true;
> +    }
>
> I think it is better to do this for Params of paramkind PARAM_EXEC, as
> those are the cases where it would be a subplan or initplan.

Right, OK.
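
Roughly, the narrowed test might look like the sketch below.  It uses
stand-in node types (all Sketch*/SK_* names are hypothetical, not the real
node definitions) so that it compiles on its own, and like the hunk above
it returns true when the node rules out parallelism:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real PostgreSQL node definitions. */
typedef enum { SK_PARAM_EXTERN, SK_PARAM_EXEC } SketchParamKind;
typedef enum { SK_NODE_SUBPLAN, SK_NODE_PARAM, SK_NODE_OTHER } SketchNodeTag;

typedef struct SketchNode
{
    SketchNodeTag tag;
    SketchParamKind paramkind;      /* meaningful only for SK_NODE_PARAM */
} SketchNode;

typedef struct SketchContext
{
    bool        allow_restricted;   /* are restricted constructs tolerated? */
} SketchContext;

bool
sketch_node_is_unsafe(const SketchNode *node, const SketchContext *context)
{
    assert(context != NULL);

    if (node == NULL)
        return false;

    /*
     * Only executor-internal params are treated as subplan references;
     * externally supplied params are left alone.
     */
    if (node->tag == SK_NODE_SUBPLAN ||
        (node->tag == SK_NODE_PARAM && node->paramkind == SK_PARAM_EXEC))
    {
        if (!context->allow_restricted)
            return true;
    }
    return false;
}
```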

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


