[HACKERS] Parallel seq. plan is not coming against inheritance or partition table - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject [HACKERS] Parallel seq. plan is not coming against inheritance or partition table
Date
Msg-id CAE9k0PmgSoOHRd60SHu09aRVTHRSs8s6pmyhJKWHxWw9C_x+XA@mail.gmail.com
Whole thread Raw
Responses Re: [HACKERS] Parallel seq. plan is not coming against inheritance orpartition table  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi All,

From following git commit onwards, parallel seq scan is never getting
selected for inheritance or partitioned tables.

<git-commit>
commit 51ee6f3160d2e1515ed6197594bda67eb99dc2cc
Author: Robert Haas <rhaas@postgresql.org>
Date:   Wed Feb 15 13:37:24 2017 -0500

    Replace min_parallel_relation_size with two new GUCs.
</git-commit>


Steps to reproduce:
==============
create table t1 (a integer);

create table t1_1 (check (a >=1  and a < 1000000)) inherits (t1);
create table t1_2 (check (a >= 1000000 and a < 2000000)) inherits (t1);

insert into t1_1 select generate_series(1, 900000);
insert into t1_2 select generate_series(1000000, 1900000);

analyze t1;
analyze t1_1;
analyze t1_2;

explain analyze select * from t1 where a < 50000 OR a > 1950000;

EXPLAIN ANALYZE output:
====================
1) Prior to "Replace min_parallel_relation_size with two new GUCs" commit,

postgres=# explain analyze select * from t1 where a < 50000 OR a > 1950000;
                                                          QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..25094.71 rows=48787 width=4) (actual
time=0.431..184.264 rows=49999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Append  (cost=0.00..19216.01 rows=20328 width=4) (actual
time=0.025..167.465 rows=16666 loops=3)
         ->  Parallel Seq Scan on t1  (cost=0.00..0.00 rows=1 width=4)
(actual time=0.001..0.001 rows=0 loops=3)
               Filter: ((a < 50000) OR (a > 1950000))
         ->  Parallel Seq Scan on t1_1  (cost=0.00..9608.00 rows=20252
width=4) (actual time=0.023..76.644 rows=16666 loops=3)
               Filter: ((a < 50000) OR (a > 1950000))
               Rows Removed by Filter: 283334
         ->  Parallel Seq Scan on t1_2  (cost=0.00..9608.01 rows=75
width=4) (actual time=89.505..89.505 rows=0 loops=3)
               Filter: ((a < 50000) OR (a > 1950000))
               Rows Removed by Filter: 300000
 Planning time: 0.343 ms
 Execution time: 188.624 ms
(14 rows)

2) From "Replace min_parallel_relation_size with two new GUCs" commit onwards,

postgres=# explain analyze select * from t1 where a < 50000 OR a > 1950000;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..34966.01 rows=50546 width=4) (actual
time=0.021..375.747 rows=49999 loops=1)
   ->  Seq Scan on t1  (cost=0.00..0.00 rows=1 width=4) (actual
time=0.004..0.004 rows=0 loops=1)
         Filter: ((a < 50000) OR (a > 1950000))
   ->  Seq Scan on t1_1  (cost=0.00..17483.00 rows=50365 width=4)
(actual time=0.016..198.393 rows=49999 loops=1)
         Filter: ((a < 50000) OR (a > 1950000))
         Rows Removed by Filter: 850001
   ->  Seq Scan on t1_2  (cost=0.00..17483.01 rows=180 width=4)
(actual time=173.310..173.310 rows=0 loops=1)
         Filter: ((a < 50000) OR (a > 1950000))
         Rows Removed by Filter: 900001
 Planning time: 0.812 ms
 Execution time: 377.831 ms
(11 rows)

RCA:
====
From "Replace min_parallel_relation_size with two new GUCs" commit
onwards, we are not assigning parallel workers for the child rel with
zero heap pages. This means we won't be able to create a partial
append path as this requires all the child rels within an Append Node
to have a partial path. Please check the following code snippet from
set_append_rel_pathlist().

        /* Same idea, but for a partial plan. */
        if (childrel->partial_pathlist != NIL)
            partial_subpaths = accumulate_append_subpath(partial_subpaths,
                                       linitial(childrel->partial_pathlist));
        else
            partial_subpaths_valid = false;

    .........
    .........

    /*
     * Consider an append of partial unordered, unparameterized partial paths.
     */
    if (partial_subpaths_valid)
    {
    ...........
    ...........

        /* Generate a partial append path. */
        appendpath = create_append_path(rel, partial_subpaths, NULL,
                                        parallel_workers);
        add_partial_path(rel, (Path *) appendpath);
    }


In short, no Gather path would be generated if baserel having an
Append Node contains any child rel without partial path. This means
just because one childRel having zero heap pages didn't get parallel
workers other childRels that was good enough to go for Parallel Seq
Scan had to go for normal seq scan which could be costlier.

Fix:
====
Attached is the patch that fixes this issue.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] GSoC 2017
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] REINDEX CONCURRENTLY 2.0