Home > mailing lists

Re: Is it worth pushing conditions to sublink/subplan? - Mailing list pgsql-hackers

From	shawn wang
Subject	Re: Is it worth pushing conditions to sublink/subplan?
Date	September 1, 2021 02:54:35
Msg-id	CA+T=_GW5JX-4EFb9iujv_ULpW1HjxmKYkjjMA1iq3fGa8FL4bQ@mail.gmail.com Whole thread
In response to	Re: Is it worth pushing conditions to sublink/subplan? (Wenjing <wenjing.zwj@alibaba-inc.com>)
Responses	回复：Re: Is it worth pushing conditions to sublink/subplan?
List	pgsql-hackers

Tree view

I tested it the way you said and increased the number of sub-tables.

I created a hash partition table of 1000 sub-tables.
Test according to your first SQL, the optimizer cuts the unnecessary sub-tables well.
You can see the plan:

postgres=# explain analyze

postgres-# select temp.p1 from

postgres-# (

postgres(# select p1,p2 from test1.test1hashtable x where x.p1 = '1'

postgres(# union all

postgres(# (values('1','1'))

postgres(# ) temp,

postgres-# test1.test1hashtable y

postgres-# where y.p2 = temp.p2 and y.p1 = '1' and y.p1='1';

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------

Nested Loop (cost=0.00..25.55 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=1)

Join Filter: (x.p2 = y.p2)

-> Seq Scan on test1hashtable826 y (cost=0.00..12.75 rows=1 width=32) (actual time=0.002..0.002 rows=0 loops=1)

Filter: (p1 = '1'::text)

-> Append (cost=0.00..12.78 rows=2 width=64) (never executed)

-> Seq Scan on test1hashtable826 x (cost=0.00..12.75 rows=1 width=64) (never executed)

Filter: (p1 = '1'::text)

-> Result (cost=0.00..0.01 rows=1 width=64) (never executed)

Planning Time: 0.158 ms

Execution Time: 0.022 ms

(10 rows)

But when the second one runs, the planning time reaches 13.942ms.
The plan:

postgres=# explain analyze

postgres-# select

postgres-# y.p1,

postgres-# (Select x.p2 from test1.test1hashtable x where y.p1 =x.p1 and y.p2=x.p2) as b

postgres-# from test1.test1hashtable y where p1 = '1' and p2 = '1';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------

Seq Scan on test1hashtable826 y (cost=0.00..13318.30 rows=1 width=64) (actual time=0.004..0.047 rows=0 loops=1)

Filter: ((p1 = '1'::text) AND (p2 = '1'::text))

SubPlan 1

-> Append (cost=0.00..13305.00 rows=1000 width=32) (never executed)

-> Seq Scan on test1hashtable1 x_1 (cost=0.00..13.30 rows=1 width=32) (never executed)

Filter: ((y.p1 = p1) AND (y.p2 = p2))

-> Seq Scan on test1hashtable1000 x_1000 (cost=0.00..13.30 rows=1 width=32) (never executed)

Filter: ((y.p1 = p1) AND (y.p2 = p2))

Planning Time: 13.942 ms

Execution Time: 4.899 ms

(2006 rows)

This is a very worthwhile thing to do. In a relatively large business system, a large number of partition tables and high concurrency are often used. If the planning time is too long, this will greatly affect the business.

regards,

Shawn.

Wenjing <wenjing.zwj@alibaba-inc.com> 于2021年8月17日周二上午10:31写道：

2021年8月16日 17:15，Wenjing <wenjing.zwj@alibaba-inc.com> 写道：

Hi Hackers,

Recently, a issue has been bothering me, This is about conditional push-down in SQL.
I use cases from regression testing as an example.
I found that the conditions (B =1) can be pushed down into the subquery, However, it cannot be pushed down to sublink/subplan.
If a sublink/subplan clause contains a partition table, it can be useful to get the conditions for pruning.
So, is it worth pushing conditions to sublink/subplan?
Anybody have any ideas?

regards,
Wenjing

example:
create table p (a int, b int, c int) partition by list (a);
create table p1 partition of p for values in (1);
create table p2 partition of p for values in (2);
create table q (a int, b int, c int) partition by list (a);
create table q1 partition of q for values in (1) partition by list (b);
create table q11 partition of q1 for values in (1) partition by list (c);
create table q111 partition of q11 for values in (1);
create table q2 partition of q for values in (2) partition by list (b);
create table q21 partition of q2 for values in (1);
create table q22 partition of q2 for values in (2);
insert into q22 values (2, 2, 3);
Sorry, I messed up the structure of the table.
It is should be:
create table ab (a int not null, b int not null) partition by list (a);
create table ab_a2 partition of ab for values in(2) partition by list (b);
create table ab_a2_b1 partition of ab_a2 for values in (1);
create table ab_a2_b2 partition of ab_a2 for values in (2);
create table ab_a2_b3 partition of ab_a2 for values in (3);
create table ab_a1 partition of ab for values in(1) partition by list (b);
create table ab_a1_b1 partition of ab_a1 for values in (1);
create table ab_a1_b2 partition of ab_a1 for values in (2);
create table ab_a1_b3 partition of ab_a1 for values in (3);
create table ab_a3 partition of ab for values in(3) partition by list (b);
create table ab_a3_b1 partition of ab_a3 for values in (1);
create table ab_a3_b2 partition of ab_a3 for values in (2);
create table ab_a3_b3 partition of ab_a3 for values in (3);

postgres-# explain (costs off)
postgres-# select temp.b from
postgres-# (
postgres(# select a,b from ab x where x.a = 1
postgres(# union all
postgres(# (values(1,1))
postgres(# ) temp,
postgres-# ab y
postgres-# where y.b = temp.b and y.a = 1 and y.b=1;
QUERY PLAN
---------------------------------------------------
Nested Loop
   -> Seq Scan on ab_a1_b1 y
   Filter: ((b = 1) AND (a = 1))
   -> Append
   -> Subquery Scan on "*SELECT* 1"
   -> Seq Scan on ab_a1_b1 x
   Filter: ((a = 1) AND (b = 1))
   -> Result
(8 rows)

The conditions (B =1) can be pushed down into the subquery.

postgres=# explain (costs off)
postgres-# select
postgres-# y.a,
postgres-# (Select x.b from ab x where y.a =x.a and y.b=x.b) as b
postgres-# from ab y where a = 1 and b = 1;
QUERY PLAN
---------------------------------------------------
Seq Scan on ab_a1_b1 y
   Filter: ((a = 1) AND (b = 1))
   SubPlan 1
   -> Append
   -> Seq Scan on ab_a1_b1 x_1
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a1_b2 x_2
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a1_b3 x_3
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a2_b1 x_4
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a2_b2 x_5
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a2_b3 x_6
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a3_b1 x_7
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a3_b2 x_8
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a3_b3 x_9
   Filter: ((y.a = a) AND (y.b = b))
(22 rows)

The conditions (B = 1 and A = 1) cannot be pushed down to sublink/subplan in targetlist.

postgres=# explain (costs off)
postgres-# select y.a
postgres-# from ab y
postgres-# where
postgres-# (select x.a > x.b from ab x where y.a =x.a and y.b=x.b) and
postgres-# y.a = 1 and y.b = 1;
QUERY PLAN
---------------------------------------------------
Seq Scan on ab_a1_b1 y
   Filter: ((a = 1) AND (b = 1) AND (SubPlan 1))
   SubPlan 1
   -> Append
   -> Seq Scan on ab_a1_b1 x_1
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a1_b2 x_2
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a1_b3 x_3
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a2_b1 x_4
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a2_b2 x_5
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a2_b3 x_6
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a3_b1 x_7
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a3_b2 x_8
   Filter: ((y.a = a) AND (y.b = b))
   -> Seq Scan on ab_a3_b3 x_9
   Filter: ((y.a = a) AND (y.b = b))
(22 rows)

The conditions (B=1 and A=1) cannot be pushed down to sublink/subplan in where clause.

pgsql-hackers by date:

From: Fujii Masao
Date: 01 September 2021, 02:34:34
Subject: Re: prevent immature WAL streaming

From: Kyotaro Horiguchi
Date: 01 September 2021, 03:12:25
Subject: Re: Possible missing segments in archiving on standby

Re: Is it worth pushing conditions to sublink/subplan? - Mailing list pgsql-hackers

Previous

Next