Thread: Can dml realize the partition table's rule and make good execution plan?

Can dml realize the partition table's rule and make good execution plan?

From
高健
Date:
Hi all:
I made partition tables:

postgres=# create table ptest(id integer, name varchar(20));
CREATE TABLE
postgres=# create table ctest01(CHECK(id<5000000)) inherits (ptest);
CREATE TABLE
postgres=# create table ctest02(CHECK(id>=5000000)) inherits (ptest);
CREATE TABLE
postgres=# 
postgres=# create index on ctest01(id);
CREATE INDEX
postgres=# create index on ctest02(id);
CREATE INDEX
postgres=# 
postgres=# 

postgres=# CREATE OR REPLACE FUNCTION ptest_insert_trigger() RETURNS TRIGGER AS $$ 
postgres$# 
postgres$# BEGIN 
postgres$# 
postgres$#    IF ( NEW.id <5000000 ) THEN 
postgres$#        INSERT INTO ctest01 VALUES (NEW.*);
postgres$#    ELSIF ( NEW.id >= 5000000 ) THEN 
postgres$#        INSERT INTO ctest02 VALUES (NEW.*); 
postgres$#    ELSE 
postgres$#        RAISE EXCEPTION 'Error while inserting data';
postgres$#    END IF; 
postgres$#   
postgres$#   RETURN NULL;
postgres$# END; $$ LANGUAGE plpgsql;
CREATE FUNCTION
postgres=# 
postgres=# CREATE TRIGGER insert_ptest_trigger BEFORE INSERT ON ptest FOR EACH ROW 
postgres-#   EXECUTE PROCEDURE ptest_insert_trigger();
CREATE TRIGGER
postgres=#

And  when executing sql statement , I got the following plan:

postgres=# explain select * from ptest where id=5000 or id=6000000;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Result  (cost=0.00..54.93 rows=5 width=20)
   ->  Append  (cost=0.00..54.93 rows=5 width=20)
         ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1 width=62)
               Filter: ((id = 5000) OR (id = 6000000))
         ->  Bitmap Heap Scan on ctest01 ptest  (cost=19.49..27.46 rows=2 width=9)
               Recheck Cond: ((id = 5000) OR (id = 6000000))
               ->  BitmapOr  (cost=19.49..19.49 rows=2 width=0)
                     ->  Bitmap Index Scan on ctest01_id_idx  (cost=0.00..9.74 rows=1 width=0)
                           Index Cond: (id = 5000)
                     ->  Bitmap Index Scan on ctest01_id_idx  (cost=0.00..9.74 rows=1 width=0)
                           Index Cond: (id = 6000000)
         ->  Bitmap Heap Scan on ctest02 ptest  (cost=19.49..27.46 rows=2 width=9)
               Recheck Cond: ((id = 5000) OR (id = 6000000))
               ->  BitmapOr  (cost=19.49..19.49 rows=2 width=0)
                     ->  Bitmap Index Scan on ctest02_id_idx  (cost=0.00..9.74 rows=1 width=0)
                           Index Cond: (id = 5000)
                     ->  Bitmap Index Scan on ctest02_id_idx  (cost=0.00..9.74 rows=1 width=0)
                           Index Cond: (id = 6000000)
(18 rows)

postgres=# 

The selection used where condition for every  partition table, which is not what I want. my rule is just for id column value.
And my select sql statement's where condition is also for id column value.
Is there any method to let  the database to realize my rule of parent table when creating execution plan?

Thanks in advance

Re: Can dml realize the partition table's rule and make good execution plan?

From
Craig Ringer
Date:
On 11/12/2012 10:39 AM, 高健 wrote:

The selection used where condition for every  partition table, which is not what I want. my rule is just for id column value.
Is `constraint_exclusion` turned on?

http://www.postgresql.org/docs/current/static/ddl-partitioning.html
http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-CONSTRAINT-EXCLUSION

--
Craig Ringer

Re: Can dml realize the partition table's rule and make good execution plan?

From
Craig Ringer
Date:
On 11/12/2012 10:39 AM, 高健 wrote:
> The selection used where condition for every  partition table, which
> is not what I want. my rule is just for id column value.
> And my select sql statement's where condition is also for id column value.
After re-reading your question I see what you're getting at. You want
the query planner to rewrite it as if it were:

explain select * from ptest where id=5000
UNION ALL
select * from ptest WHERE id=6000000

and produce a plan like this:


regress=> explain select * from ptest where id=5000 UNION ALL select *
from ptest WHERE id=6000000;
                                             QUERY
PLAN
-----------------------------------------------------------------------------------------------------
 Result  (cost=0.00..25.58 rows=10 width=62)
   ->  Append  (cost=0.00..25.58 rows=10 width=62)
         ->  Result  (cost=0.00..12.74 rows=5 width=62)
               ->  Append  (cost=0.00..12.74 rows=5 width=62)
                     ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1
width=62)
                           Filter: (id = 5000)
                     ->  Bitmap Heap Scan on ctest01 ptest
(cost=4.28..12.74 rows=4 width=62)
                           Recheck Cond: (id = 5000)
                           ->  Bitmap Index Scan on ctest01_id_idx
(cost=0.00..4.28 rows=4 width=0)
                                 Index Cond: (id = 5000)
         ->  Result  (cost=0.00..12.74 rows=5 width=62)
               ->  Append  (cost=0.00..12.74 rows=5 width=62)
                     ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1
width=62)
                           Filter: (id = 6000000)
                     ->  Bitmap Heap Scan on ctest02 ptest
(cost=4.28..12.74 rows=4 width=62)
                           Recheck Cond: (id = 6000000)
                           ->  Bitmap Index Scan on ctest02_id_idx
(cost=0.00..4.28 rows=4 width=0)
                                 Index Cond: (id = 6000000)
(18 rows)


ie to scan ctest01 using ctest01_id_idx for 500, and ctest02 using
ctest02_id_idx for
6000000, then combine the results.

If so: I'm not aware of any way to make the planner aware that that's
possible. It'd be an interesting enhancement, to apply constraint
exclusion to values pushed down into partitions, rather than simply to
include or exclude partitions based on constraint exclusion.

--
Craig Ringer


Hi Craig:
Thank you for your reply.
I checked for constratint_exclusion , in my sample, on and partition is same(I have no data on parent table). it really works for me.
I tried and found that constraint_exclusion can work in simple ways , but not for complicated conditions such as "id=a or id=b".
And the union all of two simple query really produced a lower cost. 

postgres=# show constraint_exclusion;
 constraint_exclusion 
----------------------
 partition
(1 row)

postgres=# explain select * from ptest where id=5000;
                                           QUERY PLAN                                            
-------------------------------------------------------------------------------------------------
 Result  (cost=0.00..13.75 rows=2 width=36)
   ->  Append  (cost=0.00..13.75 rows=2 width=36)
         ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1 width=62)
               Filter: (id = 5000)
         ->  Index Scan using ctest01_id_idx on ctest01 ptest  (cost=0.00..13.75 rows=1 width=9)
               Index Cond: (id = 5000)
(6 rows)

postgres=# 

postgres=# explain select * from ptest where id=600000;
                                           QUERY PLAN                                            
-------------------------------------------------------------------------------------------------
 Result  (cost=0.00..13.75 rows=2 width=36)
   ->  Append  (cost=0.00..13.75 rows=2 width=36)
         ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1 width=62)
               Filter: (id = 600000)
         ->  Index Scan using ctest01_id_idx on ctest01 ptest  (cost=0.00..13.75 rows=1 width=9)
               Index Cond: (id = 600000)
(6 rows)

postgres=# 


postgres=# explain select * from ptest where id=5000
UNION ALL
select * from ptest where id=6000000;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..27.55 rows=4 width=36)
   ->  Append  (cost=0.00..27.55 rows=4 width=36)
         ->  Result  (cost=0.00..13.75 rows=2 width=36)
               ->  Append  (cost=0.00..13.75 rows=2 width=36)
                     ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1 width=62)
                           Filter: (id = 5000)
                     ->  Index Scan using ctest01_id_idx on ctest01 ptest  (cost=0.00..13.75 rows=1 width=9)
                           Index Cond: (id = 5000)
         ->  Result  (cost=0.00..13.75 rows=2 width=36)
               ->  Append  (cost=0.00..13.75 rows=2 width=36)
                     ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1 width=62)
                           Filter: (id = 6000000)
                     ->  Index Scan using ctest02_id_idx on ctest02 ptest  (cost=0.00..13.75 rows=1 width=9)
                           Index Cond: (id = 6000000)
(14 rows)

postgres=# 


2012/11/12 Craig Ringer <craig@2ndquadrant.com>
On 11/12/2012 10:39 AM, 高健 wrote:
> The selection used where condition for every  partition table, which
> is not what I want. my rule is just for id column value.
> And my select sql statement's where condition is also for id column value.
After re-reading your question I see what you're getting at. You want
the query planner to rewrite it as if it were:

explain select * from ptest where id=5000
UNION ALL
select * from ptest WHERE id=6000000

and produce a plan like this:


regress=> explain select * from ptest where id=5000 UNION ALL select *
from ptest WHERE id=6000000;
                                             QUERY
PLAN
-----------------------------------------------------------------------------------------------------
 Result  (cost=0.00..25.58 rows=10 width=62)
   ->  Append  (cost=0.00..25.58 rows=10 width=62)
         ->  Result  (cost=0.00..12.74 rows=5 width=62)
               ->  Append  (cost=0.00..12.74 rows=5 width=62)
                     ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1
width=62)
                           Filter: (id = 5000)
                     ->  Bitmap Heap Scan on ctest01 ptest
(cost=4.28..12.74 rows=4 width=62)
                           Recheck Cond: (id = 5000)
                           ->  Bitmap Index Scan on ctest01_id_idx
(cost=0.00..4.28 rows=4 width=0)
                                 Index Cond: (id = 5000)
         ->  Result  (cost=0.00..12.74 rows=5 width=62)
               ->  Append  (cost=0.00..12.74 rows=5 width=62)
                     ->  Seq Scan on ptest  (cost=0.00..0.00 rows=1
width=62)
                           Filter: (id = 6000000)
                     ->  Bitmap Heap Scan on ctest02 ptest
(cost=4.28..12.74 rows=4 width=62)
                           Recheck Cond: (id = 6000000)
                           ->  Bitmap Index Scan on ctest02_id_idx
(cost=0.00..4.28 rows=4 width=0)
                                 Index Cond: (id = 6000000)
(18 rows)


ie to scan ctest01 using ctest01_id_idx for 500, and ctest02 using
ctest02_id_idx for
6000000, then combine the results.

If so: I'm not aware of any way to make the planner aware that that's
possible. It'd be an interesting enhancement, to apply constraint
exclusion to values pushed down into partitions, rather than simply to
include or exclude partitions based on constraint exclusion.

--
Craig Ringer

Re: Can dml realize the partition table's rule and make good execution plan?

From
Craig Ringer
Date:
On 11/12/2012 02:24 PM, 高健 wrote:
> Hi Craig:
> Thank you for your reply.
> I checked for constratint_exclusion , in my sample, on and partition
> is same(I have no data on parent table). it really works for me.
> I tried and found that constraint_exclusion can work in simple ways ,
> but not for complicated conditions such as "id=a or id=b".
> And the union all of two simple query really produced a lower cost.
For now you'll probably need to use that approach - UNION ALL of simpler
queries. It's cumbersome and annoying, though.

It'd be really interesting to enhance the query planner to be smarter
about this particular case, but the planner is way past my
scary-code-voodoo level so I can't really help there; I'm more
interested in usability issues in the tools for any development time I get.

--
Craig Ringer


Craig Ringer <craig@2ndQuadrant.com> writes:
> It'd be really interesting to enhance the query planner to be smarter
> about this particular case,

At least for this particular example, it's not at all clear to me that
it'd be worth the cost.  Getting rid of accesses to tables altogether is
worthwhile, but avoiding one very-low-cost arm of a BitmapOr is less so.

            regards, tom lane