Re: Is it worth pushing conditions to sublink/subplan? - Mailing list pgsql-hackers

From shawn wang
Subject Re: Is it worth pushing conditions to sublink/subplan?
Date
Msg-id CA+T=_GW5JX-4EFb9iujv_ULpW1HjxmKYkjjMA1iq3fGa8FL4bQ@mail.gmail.com
Whole thread Raw
In response to Re: Is it worth pushing conditions to sublink/subplan?  (Wenjing <wenjing.zwj@alibaba-inc.com>)
Responses 回复:Re: Is it worth pushing conditions to sublink/subplan?
List pgsql-hackers
I tested it the way you said and increased the number of sub-tables.
I created a hash partition table of 1000 sub-tables.
Test according to your first SQL, the optimizer cuts the unnecessary sub-tables well.
You can see the plan:

postgres=# explain analyze

postgres-# select temp.p1  from 

postgres-# (

postgres(# select p1,p2 from test1.test1hashtable x where x.p1 = '1'

postgres(# union all 

postgres(# (values('1','1')) 

postgres(# ) temp,

postgres-# test1.test1hashtable y

postgres-# where  y.p2 = temp.p2 and y.p1 = '1' and y.p1='1';

                                                     QUERY PLAN                                                      

---------------------------------------------------------------------------------------------------------------------

 Nested Loop  (cost=0.00..25.55 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=1)

   Join Filter: (x.p2 = y.p2)

   ->  Seq Scan on test1hashtable826 y  (cost=0.00..12.75 rows=1 width=32) (actual time=0.002..0.002 rows=0 loops=1)

         Filter: (p1 = '1'::text)

   ->  Append  (cost=0.00..12.78 rows=2 width=64) (never executed)

         ->  Seq Scan on test1hashtable826 x  (cost=0.00..12.75 rows=1 width=64) (never executed)

               Filter: (p1 = '1'::text)

         ->  Result  (cost=0.00..0.01 rows=1 width=64) (never executed)

 Planning Time: 0.158 ms

 Execution Time: 0.022 ms

(10 rows)


But when the second one runs, the planning time reaches 13.942ms.
The plan:


postgres=# explain analyze

postgres-# select

postgres-# y.p1,

postgres-# (Select x.p2 from test1.test1hashtable x where y.p1 =x.p1 and y.p2=x.p2) as b

postgres-# from test1.test1hashtable y where p1 = '1' and p2 = '1';

                                                    QUERY PLAN                                                    

------------------------------------------------------------------------------------------------------------------

 Seq Scan on test1hashtable826 y  (cost=0.00..13318.30 rows=1 width=64) (actual time=0.004..0.047 rows=0 loops=1)

   Filter: ((p1 = '1'::text) AND (p2 = '1'::text))

   SubPlan 1

     ->  Append  (cost=0.00..13305.00 rows=1000 width=32) (never executed)

           ->  Seq Scan on test1hashtable1 x_1  (cost=0.00..13.30 rows=1 width=32) (never executed)

                 Filter: ((y.p1 = p1) AND (y.p2 = p2))

           ->  Seq Scan on test1hashtable1000 x_1000  (cost=0.00..13.30 rows=1 width=32) (never executed)

                 Filter: ((y.p1 = p1) AND (y.p2 = p2))

 Planning Time: 13.942 ms

 Execution Time: 4.899 ms

(2006 rows)


This is a very worthwhile thing to do. In a relatively large business system, a large number of partition tables and high concurrency are often used. If the planning time is too long, this will greatly affect the business.


regards,

Shawn.


Wenjing <wenjing.zwj@alibaba-inc.com> 于2021年8月17日周二 上午10:31写道:


2021年8月16日 17:15,Wenjing <wenjing.zwj@alibaba-inc.com> 写道:

Hi Hackers,

Recently, a issue has been bothering me, This is about conditional push-down in SQL.
I use cases from regression testing as an example.
I found that the conditions  (B =1)  can be pushed down into the subquery, However, it cannot be pushed down to sublink/subplan.
If a sublink/subplan clause contains a partition table, it can be useful to get the conditions for pruning.
So, is it worth pushing conditions to sublink/subplan?
Anybody have any ideas?


regards,
Wenjing


example:
create table p (a int, b int, c int) partition by list (a);
create table p1 partition of p for values in (1);
create table p2 partition of p for values in (2);
create table q (a int, b int, c int) partition by list (a);
create table q1 partition of q for values in (1) partition by list (b);
create table q11 partition of q1 for values in (1) partition by list (c);
create table q111 partition of q11 for values in (1);
create table q2 partition of q for values in (2) partition by list (b);
create table q21 partition of q2 for values in (1);
create table q22 partition of q2 for values in (2);
insert into q22 values (2, 2, 3);
Sorry, I messed up the structure of the table.
It is should be:
create table ab (a int not null, b int not null) partition by list (a);
create table ab_a2 partition of ab for values in(2) partition by list (b);
create table ab_a2_b1 partition of ab_a2 for values in (1);
create table ab_a2_b2 partition of ab_a2 for values in (2);
create table ab_a2_b3 partition of ab_a2 for values in (3);
create table ab_a1 partition of ab for values in(1) partition by list (b);
create table ab_a1_b1 partition of ab_a1 for values in (1);
create table ab_a1_b2 partition of ab_a1 for values in (2);
create table ab_a1_b3 partition of ab_a1 for values in (3);
create table ab_a3 partition of ab for values in(3) partition by list (b);
create table ab_a3_b1 partition of ab_a3 for values in (1);
create table ab_a3_b2 partition of ab_a3 for values in (2);
create table ab_a3_b3 partition of ab_a3 for values in (3);




postgres-# explain (costs off)
postgres-# select temp.b  from 
postgres-# (
postgres(# select a,b from ab x where x.a = 1
postgres(# union all 
postgres(# (values(1,1)) 
postgres(# ) temp,
postgres-# ab y
postgres-# where  y.b = temp.b and y.a = 1 and y.b=1;
                    QUERY PLAN                     
---------------------------------------------------
 Nested Loop
   ->  Seq Scan on ab_a1_b1 y
         Filter: ((b = 1) AND (a = 1))
   ->  Append
         ->  Subquery Scan on "*SELECT* 1"
               ->  Seq Scan on ab_a1_b1 x
                     Filter: ((a = 1) AND (b = 1))
         ->  Result
(8 rows)

The conditions  (B =1)  can be pushed down into the subquery.

postgres=# explain (costs off)
postgres-# select
postgres-# y.a,
postgres-# (Select x.b from ab x where y.a =x.a and y.b=x.b) as b
postgres-# from ab y where a = 1 and b = 1;
                    QUERY PLAN                     
---------------------------------------------------
 Seq Scan on ab_a1_b1 y
   Filter: ((a = 1) AND (b = 1))
   SubPlan 1
     ->  Append
           ->  Seq Scan on ab_a1_b1 x_1
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a1_b2 x_2
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a1_b3 x_3
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a2_b1 x_4
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a2_b2 x_5
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a2_b3 x_6
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a3_b1 x_7
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a3_b2 x_8
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a3_b3 x_9
                 Filter: ((y.a = a) AND (y.b = b))
(22 rows)

The conditions (B = 1 and A = 1) cannot be pushed down to sublink/subplan in targetlist.

postgres=# explain (costs off)
postgres-# select y.a
postgres-# from ab y 
postgres-# where
postgres-# (select x.a > x.b from ab x where y.a =x.a and y.b=x.b) and
postgres-# y.a = 1 and y.b = 1;
                    QUERY PLAN                     
---------------------------------------------------
 Seq Scan on ab_a1_b1 y
   Filter: ((a = 1) AND (b = 1) AND (SubPlan 1))
   SubPlan 1
     ->  Append
           ->  Seq Scan on ab_a1_b1 x_1
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a1_b2 x_2
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a1_b3 x_3
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a2_b1 x_4
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a2_b2 x_5
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a2_b3 x_6
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a3_b1 x_7
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a3_b2 x_8
                 Filter: ((y.a = a) AND (y.b = b))
           ->  Seq Scan on ab_a3_b3 x_9
                 Filter: ((y.a = a) AND (y.b = b))
(22 rows)

The conditions  (B=1 and A=1)  cannot be pushed down to sublink/subplan in where clause.




pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: prevent immature WAL streaming
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Possible missing segments in archiving on standby