Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parallel Seq Scan
Date
Msg-id CA+TgmobP+UnXhKpR2iLqvPQeJMS=LEZOkZkqZjGmZsWTZ7O5ZA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan
List pgsql-hackers
On Thu, Sep 24, 2015 at 2:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I have fixed most of the review comments raised in this mail
> as well as other e-mails and rebased the patch on commit-
> 020235a5.  Even though I have fixed many of the things, but
> still quite a few comments are yet to be handled.  This patch
> is mainly a rebased version to ease the review.  We can continue
> to have discussion on the left over things and I will address
> those in consecutive patches.

Thanks for the update.  Here are some more review comments:

1. parallel_seqscan_degree is still not what we should have here.  As
previously mentioned, I think we should have a GUC for the maximum
degree of parallelism in a query generally, not the maximum degree of
parallel sequential scan.

2. fix_node_funcids() can go away because of commit
9f1255ac859364a86264a67729dbd1a36dd63ff2.

3. cost_patialseqscan is still misspelled.  I pointed this out before, too.

4. Much more seriously than any of the above,
create_parallelscan_paths() generates plans that are badly broken:

rhaas=# explain select * from pgbench_accounts where filler < random()::text;
QUERYPLAN
 
-----------------------------------------------------------------------------------------Funnel on pgbench_accounts
(cost=0.00..35357.73rows=3333333 width=97)  Filter: ((filler)::text < (random())::text)  Number of Workers: 10  ->
PartialSeq Scan on pgbench_accounts  (cost=0.00..35357.73
 
rows=3333333 width=97)        Filter: ((filler)::text < (random())::text)
(5 rows)

This is wrong both because random() is parallel-restricted and thus
can't be executed in a parallel worker, and also because enforcing a
volatile qual twice is no good.

rhaas=# explain select * from pgbench_accounts where aid % 10000 = 0;                                     QUERY PLAN
---------------------------------------------------------------------------------------Funnel on pgbench_accounts
(cost=0.00..28539.55rows=50000 width=97)  Filter: ((aid % 10000) = 0)  Number of Workers: 10  ->  Partial Seq Scan on
pgbench_accounts (cost=0.00..28539.55
 
rows=50000 width=97)        Filter: ((aid % 10000) = 0)
(5 rows)

This will work, but it's a bad plan because we shouldn't need to
re-apply the filter condition in the parallel leader after we've
already checked it in the worker.

rhaas=# explain select * from pgbench_accounts a where a.bid not in
(select bid from pgbench_branches);                                       QUERY PLAN
-------------------------------------------------------------------------------------------Funnel on pgbench_accounts a
(cost=2.25..26269.07 rows=5000000 width=97)  Filter: (NOT (hashed SubPlan 1))  Number of Workers: 10  ->  Partial Seq
Scanon pgbench_accounts a  (cost=2.25..26269.07
 
rows=5000000 width=97)        Filter: (NOT (hashed SubPlan 1))        SubPlan 1          ->  Seq Scan on
pgbench_branches (cost=0.00..2.00 rows=100 width=4)  SubPlan 1    ->  Seq Scan on pgbench_branches  (cost=0.00..2.00
rows=100width=4)
 
(9 rows)

This will not work, because the subplan isn't available inside the
worker.  Trying it without the EXPLAIN crashes the server.  This is
more or less the same issue as one of the known issues you already
mentioned, but I mention it again here because I think this case is
closely related to the previous two.

Basically, you need to have some kind of logic for deciding which
things need to go below the funnel and which on the funnel itself.
The stuff that's safe should get pushed down, and the stuff that's not
safe should get attached to the funnel.  The unsafe stuff is whatever
contains references to initplans or subplans, or anything that
contains parallel-restricted functions ... and there might be some
other stuff, too, but at least those things.

Instead of preventing initplans or subplans from getting pushed down
into the funnel, we could instead try to teach the system to push them
down.  However, that's very complicated; e.g. a subplan that
references a CTE isn't safe to push down, and a subplan that
references another subplan must be pushed down if that other subplan
is pushed down, and an initplan that contains volatile functions can't
be pushed down because each worker would execute it separately and
they might not all get the same answer, and an initplan that
references a temporary table can't be pushed down because it can't be
referenced from a worker.  All in all, it seems better not to go there
right now.

Now, when you fix this, you're quickly going to run into another
problem, which is that as you have this today, the funnel node does
not actually enforce its filter condition, so the EXPLAIN plan is a
dastardly lie.  And when you try to fix that, you're going to run into
a third problem, which is that the stuff the funnel node needs in
order to evaluate its filter condition might not be in the partial seq
scan's target list.  So you need to fix both of those problems, too.
Even if you cheat and just don't generate a parallel path at all
except when all quals can be pushed down, you're still going to have
to fix these problems: it's not OK to print out a filter condition on
the funnel as if you were going to enforce it, and then not actually
enforce it there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Manual bitswizzling -> LOCKBIT_ON
Next
From: Amit Kapila
Date:
Subject: Re: Parallel Seq Scan