Thread: force partition pruning
Hi all
I need a litte advice on how to
Postgres 13.2
A metadata query pulls partition keys:
select m.period_version from register.register_metadata m where m.current_version and m.period between '201712' and '201912' ;
A query using these in an in-list easily makes the planner do partition pruning.
select * from register.register d where d.period_version in ('201712_1', '201812_1', '201912_1');
However combining the metadataquery into the dataquery makes the planner decide to scan all partitions.
select * from register.register d where d.period_version in (select m.period_version from register.register_metadata m where m.current_version and m.period between '201712' and '201912');
I am quite aware that the latter query requires partition pruning to take place during execution not during planning.
My question here is how do I package the two-step proces into an interface that analysts can actually use?
One possibility is to have a prepare step that creates a temporary view with the hard-coded values built-in. And then query data via the temp view. This works ok, but there is an issue with possible naming conflicts on the temp view (not that this could not be worked around).
Ideally I would like a function to figure out the query and then return the data from that dynamically executed query. Complicating matters is the fact that there are more than one set of data/metatable tables and each datatable has a different set of columns. This excludes a table returning function since that must list the columns present.
Best regards
Niels Jespersen
Chief Adviser
IT Center
Mobile phone:+45 42 42 93 73
Email: njn@dst.dk
Statistics Denmark, Sejrøgade 11, DK-2100 Copenhagen
www.dst.dk/en | Twitter | LinkedIn | Facebook
Attachment
Hi all
I need a litte advice on how to
Postgres 13.2
A metadata query pulls partition keys:
select m.period_version from register.register_metadata m where m.current_version and m.period between '201712' and '201912' ;
A query using these in an in-list easily makes the planner do partition pruning.
select * from register.register d where d.period_version in ('201712_1', '201812_1', '201912_1');
However combining the metadataquery into the dataquery makes the planner decide to scan all partitions.
select * from register.register d where d.period_version in (select m.period_version from register.register_metadata m where m.current_version and m.period between '201712' and '201912');
I am quite aware that the latter query requires partition pruning to take place during execution not during planning.
My question here is how do I package the two-step proces into an interface that analysts can actually use?
One possibility is to have a prepare step that creates a temporary view with the hard-coded values built-in. And then query data via the temp view. This works ok, but there is an issue with possible naming conflicts on the temp view (not that this could not be worked around).
Ideally I would like a function to figure out the query and then return the data from that dynamically executed query. Complicating matters is the fact that there are more than one set of data/metatable tables and each datatable has a different set of columns. This excludes a table returning function since that must list the columns present.
Best regards
Niels Jespersen
Chief Adviser
IT Center
Mobile phone:+45 42 42 93 73
Email: njn@dst.dk
Statistics Denmark, Sejrøgade 11, DK-2100 Copenhagenwww.dst.dk/en | Twitter | LinkedIn | Facebook
Attachment
During initialization of the query plan. Partition pruning can be performed here for parameter values which are known during the initialization phase of execution. Partitions which are pruned during this stage will not show up in the query's
EXPLAIN
orEXPLAIN ANALYZE
. It is possible to determine the number of partitions which were removed during this phase by observing the “Subplans Removed” property in theEXPLAIN
output.During actual execution of the query plan. Partition pruning may also be performed here to remove partitions using values which are only known during actual query execution. This includes values from subqueries and values from execution-time parameters such as those from parameterized nested loop joins. Since the value of these parameters may change many times during the execution of the query, partition pruning is performed whenever one of the execution parameters being used by partition pruning changes. Determining if partitions were pruned during this phase requires careful inspection of the
loops
property in theEXPLAIN ANALYZE
output. Subplans corresponding to different partitions may have different values for it depending on how many times each of them was pruned during execution. Some may be shown as(never executed)
if they were pruned every time.
I do not know how to put this in words,but see below when the predicate is explicitly applied to the main table with partition.postgres=# \d+ prt1Partitioned table "public.prt1"Column | Type | Collation | Nullable | Default | Storage | Stats target | Description--------+-------------------+-----------+----------+---------+----------+--------------+-------------a | integer | | not null | | plain | |b | integer | | | | plain | |c | character varying | | | | extended | |Partition key: RANGE (a)Partitions: prt1_p1 FOR VALUES FROM (0) TO (250),prt1_p2 FOR VALUES FROM (250) TO (500),prt1_p3 FOR VALUES FROM (500) TO (600)(failed reverse-i-search)`': ^Cpostgres=# \d+ bTable "public.b"Column | Type | Collation | Nullable | Default | Storage | Stats target | Description--------+---------+-----------+----------+---------+---------+--------------+-------------id | integer | | not null | | plain | |Indexes:"b_id_idx" btree (id)Access method: heappostgres=# table b;id-----200400(2 rows)-- basically if the table is joined and predicate can be applied to the outer table which has constraints matching,partition pruning takes place.I do not know the theory, or even what i did is correct, but just FYI.postgres=# explain analyze select prt1.* from prt1 where a in ( select id from b where id in (1, 100, 200) );QUERY PLAN-------------------------------------------------------------------------------------------------------------------------Hash Semi Join (cost=1.05..9.36 rows=2 width=13) (actual time=0.034..0.074 rows=1 loops=1)Hash Cond: (prt1.a = b.id)-> Append (cost=0.00..7.50 rows=300 width=13) (actual time=0.006..0.043 rows=300 loops=1)-> Seq Scan on prt1_p1 prt1_1 (cost=0.00..2.25 rows=125 width=13) (actual time=0.005..0.013 rows=125 loops=1)-> Seq Scan on prt1_p2 prt1_2 (cost=0.00..2.25 rows=125 width=13) (actual time=0.003..0.009 rows=125 loops=1)-> Seq Scan on prt1_p3 prt1_3 (cost=0.00..1.50 rows=50 width=13) (actual time=0.002..0.004 rows=50 loops=1)-> Hash (cost=1.03..1.03 rows=2 width=4) (actual time=0.005..0.005 rows=1 loops=1)Buckets: 1024 Batches: 1 Memory Usage: 9kB-> Seq Scan on b (cost=0.00..1.03 rows=2 width=4) (actual time=0.003..0.003 rows=1 loops=1)Filter: (id = ANY ('{1,100,200}'::integer[]))Rows Removed by Filter: 1Planning Time: 0.181 msExecution Time: 0.089 ms(13 rows)postgres=# explain analyze select prt1.* from prt1 where a in ( select id from b where b.id = prt1.a) and a in (1, 100, 200);QUERY PLAN---------------------------------------------------------------------------------------------------------------------------Index Scan using iprt1_p1_a on prt1_p1 prt1 (cost=0.14..14.03 rows=2 width=13) (actual time=0.024..0.025 rows=1 loops=1)Index Cond: (a = ANY ('{1,100,200}'::integer[]))Filter: (SubPlan 1)Rows Removed by Filter: 1SubPlan 1-> Seq Scan on b (cost=0.00..1.02 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=2)Filter: (id = prt1.a)Rows Removed by Filter: 1Planning Time: 0.120 msExecution Time: 0.041 ms(10 rows)postgres=# explain analyze select prt1.* from prt1 where exists ( select 1 from b where b.id = prt1.a) and a in (1, 100, 200);QUERY PLAN-------------------------------------------------------------------------------------------------------------Hash Semi Join (cost=1.04..3.79 rows=2 width=13) (actual time=0.024..0.028 rows=1 loops=1)Hash Cond: (prt1.a = b.id)-> Seq Scan on prt1_p1 prt1 (cost=0.00..2.72 rows=3 width=13) (actual time=0.011..0.017 rows=2 loops=1)Filter: (a = ANY ('{1,100,200}'::integer[]))Rows Removed by Filter: 123-> Hash (cost=1.02..1.02 rows=2 width=4) (actual time=0.004..0.004 rows=2 loops=1)Buckets: 1024 Batches: 1 Memory Usage: 9kB-> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)Planning Time: 0.192 msExecution Time: 0.043 ms(10 rows)postgres=# explain analyze select prt1.* from prt1 inner join b on prt1.a = b.id where a in (1, 100, 200);QUERY PLAN-------------------------------------------------------------------------------------------------------------Hash Join (cost=1.04..3.79 rows=2 width=13) (actual time=0.024..0.028 rows=1 loops=1)Hash Cond: (prt1.a = b.id)-> Seq Scan on prt1_p1 prt1 (cost=0.00..2.72 rows=3 width=13) (actual time=0.012..0.018 rows=2 loops=1)Filter: (a = ANY ('{1,100,200}'::integer[]))Rows Removed by Filter: 123-> Hash (cost=1.02..1.02 rows=2 width=4) (actual time=0.003..0.004 rows=2 loops=1)Buckets: 1024 Batches: 1 Memory Usage: 9kB-> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)Planning Time: 0.181 msExecution Time: 0.043 ms(10 rows)On Mon, 10 May 2021 at 17:09, Niels Jespersen <NJN@dst.dk> wrote:Hi all
I need a litte advice on how to
Postgres 13.2
A metadata query pulls partition keys:
select m.period_version from register.register_metadata m where m.current_version and m.period between '201712' and '201912' ;
A query using these in an in-list easily makes the planner do partition pruning.
select * from register.register d where d.period_version in ('201712_1', '201812_1', '201912_1');
However combining the metadataquery into the dataquery makes the planner decide to scan all partitions.
select * from register.register d where d.period_version in (select m.period_version from register.register_metadata m where m.current_version and m.period between '201712' and '201912');
I am quite aware that the latter query requires partition pruning to take place during execution not during planning.
My question here is how do I package the two-step proces into an interface that analysts can actually use?
One possibility is to have a prepare step that creates a temporary view with the hard-coded values built-in. And then query data via the temp view. This works ok, but there is an issue with possible naming conflicts on the temp view (not that this could not be worked around).
Ideally I would like a function to figure out the query and then return the data from that dynamically executed query. Complicating matters is the fact that there are more than one set of data/metatable tables and each datatable has a different set of columns. This excludes a table returning function since that must list the columns present.
Best regards
Niels Jespersen
Chief Adviser
IT Center
Mobile phone:+45 42 42 93 73
Email: njn@dst.dk
Statistics Denmark, Sejrøgade 11, DK-2100 Copenhagenwww.dst.dk/en | Twitter | LinkedIn | Facebook
--Thanks,VijayMumbai, India
Attachment
>
>Sorry,
>
>I made a major mistake. I somehow saw the period and period_version as the same.
>so, yes partitions are not pruned here. So my suggestion makes no sense.
Thats quite ok. I think my plan now is to have a table returning function that executes a query dynamically. The query has a where caluse that is first constructed.
Like this:
return query execute format('select d.x, d.y from %1$I.%1$I d where d.period_version = any(' || quote_literal(_periode_version_array) ||'::text[])', register_in);
Regards Niels
>
>Sorry,
>
>I made a major mistake. I somehow saw the period and period_version as the same.
>so, yes partitions are not pruned here. So my suggestion makes no sense.
Thats quite ok. I think my plan now is to have a table returning function that executes a query dynamically. The query has a where caluse that is first constructed.
Like this:
return query execute format('select d.x, d.y from %1$I.%1$I d where d.period_version = any(' || quote_literal(_periode_version_array) ||'::text[])', register_in);
Regards Niels
Ok. maybe you are in a rush.But I would keep the thread open, to understand what I am not understanding or else, it'll become a habit of converting sql to plpgsql :)Big Guys,It seems, when the table is partitioned by range, it makes use of a nested loop which helps in partition pruning.if the table is list partitioned, it scans all the partitions.Is this expected ?LIST BASED PARTITION**********************postgres@go:~$ more p.sqldrop table tbl1;drop table tprt;create table tbl1(col1 int);insert into tbl1 values (501), (505);-- Basic tablecreate table tprt (col1 int) partition by list (col1);create table tprt_1 partition of tprt for values in (501);create table tprt_2 partition of tprt for values in (1001);create table tprt_3 partition of tprt for values in (2001);create table tprt_4 partition of tprt for values in (3001);create table tprt_5 partition of tprt for values in (4001);create table tprt_6 partition of tprt for values in (5001);create index tprt1_idx on tprt_1 (col1);create index tprt2_idx on tprt_2 (col1);create index tprt3_idx on tprt_3 (col1);create index tprt4_idx on tprt_4 (col1);create index tprt5_idx on tprt_5 (col1);create index tprt6_idx on tprt_6 (col1);insert into tprt values (501), (1001), (2001), (3001), (4001), (5001), (501);alter table tbl1 add column col2 int default 0;update tbl1 set col2 =1 where col1 = 501;vacuum analyze tprt;vacuum analyze tbl1;explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );QUERY PLAN------------------------------------------------------------------------------------------------------------Hash Semi Join (cost=1.05..7.20 rows=2 width=4) (actual time=0.028..0.034 rows=2 loops=1)Hash Cond: (tprt.col1 = tbl1.col1)-> Append (cost=0.00..6.10 rows=7 width=4) (actual time=0.003..0.008 rows=7 loops=1)-> Seq Scan on tprt_1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.002..0.003 rows=2 loops=1)-> Seq Scan on tprt_2 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_3 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_4 (cost=0.00..1.01 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)-> Seq Scan on tprt_5 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_6 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Hash (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006 rows=1 loops=1)Buckets: 1024 Batches: 1 Memory Usage: 9kB-> Seq Scan on tbl1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=1 loops=1)Filter: (col2 = ANY ('{1,2}'::integer[]))Rows Removed by Filter: 1Planning Time: 0.237 msExecution Time: 0.060 mseven if i set hashjoin offpostgres=# set enable_hashjoin TO 0;SETpostgres=# explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );QUERY PLAN------------------------------------------------------------------------------------------------------------Nested Loop Semi Join (cost=0.00..7.34 rows=2 width=4) (actual time=0.013..0.023 rows=2 loops=1)Join Filter: (tprt.col1 = tbl1.col1)Rows Removed by Join Filter: 5-> Append (cost=0.00..6.10 rows=7 width=4) (actual time=0.004..0.010 rows=7 loops=1)-> Seq Scan on tprt_1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=2 loops=1)-> Seq Scan on tprt_2 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_3 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_4 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_5 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_6 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Materialize (cost=0.00..1.03 rows=2 width=4) (actual time=0.001..0.001 rows=1 loops=7)-> Seq Scan on tbl1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.007..0.007 rows=1 loops=1)Filter: (col2 = ANY ('{1,2}'::integer[]))Rows Removed by Filter: 1Planning Time: 0.578 msExecution Time: 0.038 ms(16 rows)**********************RANGE BASED PARTITION**********************postgres@go:~$ more q.sqldrop table tbl1;drop table tprt;create table tbl1(col1 int);insert into tbl1 values (501), (505);-- Basic tablecreate table tprt (col1 int) partition by range(col1);create table tprt_1 partition of tprt for values from (0) to (500);create table tprt_2 partition of tprt for values from (500) to (1000);create table tprt_3 partition of tprt for values from (1000) to (1500);create table tprt_4 partition of tprt for values from (1500) to (2000);create table tprt_5 partition of tprt for values from (2000) to (22500);create index tprt1_idx on tprt_1 (col1);create index tprt2_idx on tprt_2 (col1);create index tprt3_idx on tprt_3 (col1);create index tprt4_idx on tprt_4 (col1);create index tprt5_idx on tprt_5 (col1);insert into tprt values (501), (1001), (2001), (3001), (4001), (5001), (501);vacuum analyze tbl1;vacuum analyze tprt;alter table tbl1 add column col2 int default 0;update tbl1 set col2 =1 where col1 = 501;explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );QUERY PLAN----------------------------------------------------------------------------------------------------------------Nested Loop (cost=3.29..55.37 rows=2 width=4) (actual time=0.016..0.018 rows=2 loops=1)-> Unique (cost=1.03..1.04 rows=2 width=4) (actual time=0.012..0.012 rows=1 loops=1)-> Sort (cost=1.03..1.04 rows=2 width=4) (actual time=0.011..0.011 rows=1 loops=1)Sort Key: tbl1.col1Sort Method: quicksort Memory: 25kB-> Seq Scan on tbl1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.004..0.005 rows=1 loops=1)Filter: (col2 = ANY ('{1,2}'::integer[]))Rows Removed by Filter: 1-> Append (cost=2.26..26.86 rows=30 width=4) (actual time=0.003..0.004 rows=2 loops=1)-> Bitmap Heap Scan on tprt_1 (cost=2.26..11.81 rows=13 width=4) (never executed)Recheck Cond: (col1 = tbl1.col1)-> Bitmap Index Scan on tprt1_idx (cost=0.00..2.25 rows=13 width=0) (never executed)Index Cond: (col1 = tbl1.col1)-> Seq Scan on tprt_2 (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)Filter: (tbl1.col1 = col1)-> Seq Scan on tprt_3 (cost=0.00..1.01 rows=1 width=4) (never executed)Filter: (tbl1.col1 = col1)-> Bitmap Heap Scan on tprt_4 (cost=2.26..11.81 rows=13 width=4) (never executed)Recheck Cond: (col1 = tbl1.col1)-> Bitmap Index Scan on tprt4_idx (cost=0.00..2.25 rows=13 width=0) (never executed)Index Cond: (col1 = tbl1.col1)-> Seq Scan on tprt_5 (cost=0.00..1.05 rows=1 width=4) (never executed)Filter: (tbl1.col1 = col1)Planning Time: 0.214 msExecution Time: 0.069 ms(25 rows)**********************On Tue, 11 May 2021 at 17:44, Niels Jespersen <NJN@dst.dk> wrote:>
>Sorry,
>
>I made a major mistake. I somehow saw the period and period_version as the same.
>so, yes partitions are not pruned here. So my suggestion makes no sense.
Thats quite ok. I think my plan now is to have a table returning function that executes a query dynamically. The query has a where caluse that is first constructed.
Like this:
return query execute format('select d.x, d.y from %1$I.%1$I d where d.period_version = any(' || quote_literal(_periode_version_array) ||'::text[])', register_in);
Regards Niels
--Thanks,VijayMumbai, India
ok,ok forcing hash join off, did not work as the outer table was the partitioned table selected.On Tue, 11 May 2021 at 22:42, Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:Ok. maybe you are in a rush.But I would keep the thread open, to understand what I am not understanding or else, it'll become a habit of converting sql to plpgsql :)Big Guys,It seems, when the table is partitioned by range, it makes use of a nested loop which helps in partition pruning.if the table is list partitioned, it scans all the partitions.Is this expected ?LIST BASED PARTITION**********************postgres@go:~$ more p.sqldrop table tbl1;drop table tprt;create table tbl1(col1 int);insert into tbl1 values (501), (505);-- Basic tablecreate table tprt (col1 int) partition by list (col1);create table tprt_1 partition of tprt for values in (501);create table tprt_2 partition of tprt for values in (1001);create table tprt_3 partition of tprt for values in (2001);create table tprt_4 partition of tprt for values in (3001);create table tprt_5 partition of tprt for values in (4001);create table tprt_6 partition of tprt for values in (5001);create index tprt1_idx on tprt_1 (col1);create index tprt2_idx on tprt_2 (col1);create index tprt3_idx on tprt_3 (col1);create index tprt4_idx on tprt_4 (col1);create index tprt5_idx on tprt_5 (col1);create index tprt6_idx on tprt_6 (col1);insert into tprt values (501), (1001), (2001), (3001), (4001), (5001), (501);alter table tbl1 add column col2 int default 0;update tbl1 set col2 =1 where col1 = 501;vacuum analyze tprt;vacuum analyze tbl1;explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );QUERY PLAN------------------------------------------------------------------------------------------------------------Hash Semi Join (cost=1.05..7.20 rows=2 width=4) (actual time=0.028..0.034 rows=2 loops=1)Hash Cond: (tprt.col1 = tbl1.col1)-> Append (cost=0.00..6.10 rows=7 width=4) (actual time=0.003..0.008 rows=7 loops=1)-> Seq Scan on tprt_1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.002..0.003 rows=2 loops=1)-> Seq Scan on tprt_2 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_3 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_4 (cost=0.00..1.01 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)-> Seq Scan on tprt_5 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_6 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Hash (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006 rows=1 loops=1)Buckets: 1024 Batches: 1 Memory Usage: 9kB-> Seq Scan on tbl1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=1 loops=1)Filter: (col2 = ANY ('{1,2}'::integer[]))Rows Removed by Filter: 1Planning Time: 0.237 msExecution Time: 0.060 mseven if i set hashjoin offpostgres=# set enable_hashjoin TO 0;SETpostgres=# explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );QUERY PLAN------------------------------------------------------------------------------------------------------------Nested Loop Semi Join (cost=0.00..7.34 rows=2 width=4) (actual time=0.013..0.023 rows=2 loops=1)Join Filter: (tprt.col1 = tbl1.col1)Rows Removed by Join Filter: 5-> Append (cost=0.00..6.10 rows=7 width=4) (actual time=0.004..0.010 rows=7 loops=1)-> Seq Scan on tprt_1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=2 loops=1)-> Seq Scan on tprt_2 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_3 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_4 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_5 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Seq Scan on tprt_6 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)-> Materialize (cost=0.00..1.03 rows=2 width=4) (actual time=0.001..0.001 rows=1 loops=7)-> Seq Scan on tbl1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.007..0.007 rows=1 loops=1)Filter: (col2 = ANY ('{1,2}'::integer[]))Rows Removed by Filter: 1Planning Time: 0.578 msExecution Time: 0.038 ms(16 rows)**********************RANGE BASED PARTITION**********************postgres@go:~$ more q.sqldrop table tbl1;drop table tprt;create table tbl1(col1 int);insert into tbl1 values (501), (505);-- Basic tablecreate table tprt (col1 int) partition by range(col1);create table tprt_1 partition of tprt for values from (0) to (500);create table tprt_2 partition of tprt for values from (500) to (1000);create table tprt_3 partition of tprt for values from (1000) to (1500);create table tprt_4 partition of tprt for values from (1500) to (2000);create table tprt_5 partition of tprt for values from (2000) to (22500);create index tprt1_idx on tprt_1 (col1);create index tprt2_idx on tprt_2 (col1);create index tprt3_idx on tprt_3 (col1);create index tprt4_idx on tprt_4 (col1);create index tprt5_idx on tprt_5 (col1);insert into tprt values (501), (1001), (2001), (3001), (4001), (5001), (501);vacuum analyze tbl1;vacuum analyze tprt;alter table tbl1 add column col2 int default 0;update tbl1 set col2 =1 where col1 = 501;explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );QUERY PLAN----------------------------------------------------------------------------------------------------------------Nested Loop (cost=3.29..55.37 rows=2 width=4) (actual time=0.016..0.018 rows=2 loops=1)-> Unique (cost=1.03..1.04 rows=2 width=4) (actual time=0.012..0.012 rows=1 loops=1)-> Sort (cost=1.03..1.04 rows=2 width=4) (actual time=0.011..0.011 rows=1 loops=1)Sort Key: tbl1.col1Sort Method: quicksort Memory: 25kB-> Seq Scan on tbl1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.004..0.005 rows=1 loops=1)Filter: (col2 = ANY ('{1,2}'::integer[]))Rows Removed by Filter: 1-> Append (cost=2.26..26.86 rows=30 width=4) (actual time=0.003..0.004 rows=2 loops=1)-> Bitmap Heap Scan on tprt_1 (cost=2.26..11.81 rows=13 width=4) (never executed)Recheck Cond: (col1 = tbl1.col1)-> Bitmap Index Scan on tprt1_idx (cost=0.00..2.25 rows=13 width=0) (never executed)Index Cond: (col1 = tbl1.col1)-> Seq Scan on tprt_2 (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)Filter: (tbl1.col1 = col1)-> Seq Scan on tprt_3 (cost=0.00..1.01 rows=1 width=4) (never executed)Filter: (tbl1.col1 = col1)-> Bitmap Heap Scan on tprt_4 (cost=2.26..11.81 rows=13 width=4) (never executed)Recheck Cond: (col1 = tbl1.col1)-> Bitmap Index Scan on tprt4_idx (cost=0.00..2.25 rows=13 width=0) (never executed)Index Cond: (col1 = tbl1.col1)-> Seq Scan on tprt_5 (cost=0.00..1.05 rows=1 width=4) (never executed)Filter: (tbl1.col1 = col1)Planning Time: 0.214 msExecution Time: 0.069 ms(25 rows)**********************On Tue, 11 May 2021 at 17:44, Niels Jespersen <NJN@dst.dk> wrote:>
>Sorry,
>
>I made a major mistake. I somehow saw the period and period_version as the same.
>so, yes partitions are not pruned here. So my suggestion makes no sense.
Thats quite ok. I think my plan now is to have a table returning function that executes a query dynamically. The query has a where caluse that is first constructed.
Like this:
return query execute format('select d.x, d.y from %1$I.%1$I d where d.period_version = any(' || quote_literal(_periode_version_array) ||'::text[])', register_in);
Regards Niels
--Thanks,VijayMumbai, India--Thanks,VijayMumbai, India
On Wed, 12 May 2021 at 06:33, Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote: > > ok i think i just may be there is very less data , hence no index scan, no pruning. > > when i try to force seq_scan off, > > postgres=# set enable_seqscan TO off; > SET > postgres=# explain analyze select * from tprt where tprt.col1 in (select tbl1.col1 from tbl1 where tbl1.col2 in (1, 2)); > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------- > Merge Semi Join (cost=0.88..20.98 rows=2 width=4) (actual time=0.031..0.042 rows=2 loops=1) > Merge Cond: (tprt.col1 = tbl1.col1) > -> Append (cost=0.75..28.89 rows=7 width=4) (actual time=0.024..0.032 rows=3 loops=1) > -> Index Only Scan using tprt1_idx on tprt_1 (cost=0.13..8.16 rows=2 width=4) (actual time=0.023..0.024 rows=2loops=1) > Heap Fetches: 0 > -> Index Only Scan using tprt2_idx on tprt_2 (cost=0.12..4.14 rows=1 width=4) (actual time=0.006..0.006 rows=1loops=1) > Heap Fetches: 0 > -> Index Only Scan using tprt3_idx on tprt_3 (cost=0.12..4.14 rows=1 width=4) (never executed) > Heap Fetches: 0 > -> Index Only Scan using tprt4_idx on tprt_4 (cost=0.12..4.14 rows=1 width=4) (never executed) > Heap Fetches: 0 > -> Index Only Scan using tprt5_idx on tprt_5 (cost=0.12..4.14 rows=1 width=4) (never executed) > Heap Fetches: 0 > -> Index Only Scan using tprt6_idx on tprt_6 (cost=0.12..4.14 rows=1 width=4) (never executed) > Heap Fetches: 0 > -> Index Scan using tbl1_col1_idx on tbl1 (cost=0.13..12.16 rows=2 width=4) (actual time=0.006..0.007 rows=1 loops=1) > Filter: (col2 = ANY ('{1,2}'::integer[])) > Rows Removed by Filter: 1 > Planning Time: 0.244 ms > Execution Time: 0.067 ms > (20 rows) Unfortunately, no run-time pruning occurred in the above plan. The reason you're seeing (never executed) is that the index scan on tbl1_col1_idx ran out of rows before the Append read tuples from all of its children. If there were rows in tbl1 that had join partners in tprt_6, then Append would have had to churn through all partitions to get to the matching row in tprt_6 The fact that the above plan uses Append made that possible. MergeAppend would have had to read a tuple from each child node to find the next highest one. Append can be used in this case as the schema design means tptr_1 will always have lower values col1 rows than tptr_2. Using Append and ensuring the children are in the correct order is a sort of "ordered partition scan". As for trying to do what you're trying to make work, right now run-time pruning only works when there are parameters that change during execution for Append and MergeAppend. So the Append/MergeAppend would have to be below either a subquery scan or a nested loop join. At the moment the planner only creates parameterized paths for indexes. Likely to make what you want to work actually work, we'd need to add some sort of concept of parameterized Seq Scans. We'd also need to attempt to cost those in a way that takes into account run-time pruning, otherwise, they'd just appear as expensive as normal seq scans. It's a bit difficult to how to decide many rows to estimate in those cases as if your partitions are unevenly sized then which size do you choose. sum tuples divided by $something, but I'm not sure what $something would be. The current situation is that with your example query, you won't get any run-time pruning with it if the join method uses is Hash or Merge Join. These join types are not parameterized therefore can't be run-time pruned. It might be possible to have a feature that adds run-time pruning to hash join. We'd need to run the pruning code each time we put a value in the hash table, then we'd need to somehow communicate the matching partitions to the outer side of the join. I'm not really sure we have executor infrastructure to allow us to pass that sort of information between nodes. However, it does not seem impossible. It does sound like a problem that would be hard to cost during planning. Running the partition pruning code once for each row going into the hash table might be a bit costly and if it prunes nothing then it was a pretty poor investment. The planner would need to only enable that when the statistics indicate it appears worthwhile. I don't see how it would be possible to run-time prune during a merge join. We've no way to know which partitions to prune unless we read the entire other side of the join first. If you do that, you've just blown all the advantages of merge join. I think, for now, the only sure way to get run-time pruning working for this case is to run two separate queries so that the 2nd one can perform plan-time pruning. You might have some luck by disabling hash and merge joins and having an index on your partition key. You might further increase the chances of the planner picking a nested loop plan with the partition table on the inner side of you add DISTINCT to the subquery. That way the planner does not have to consider the aggregate step that it needs to do in order to swap the join order to put the partitioned table on the inner side. Otherwise, the semi-joined table cannot go on the outer side as duplicate records would cause wrong results. Even if you take all those steps then it's still not great as we have no parameterized seq scans, so you'd have a less efficient plan as you'd have to scan all data in all partitions using an index scan, which is not very cache efficient. Even with all of that, you're still at the mercy of the planner not putting the partition on the inside of the nested loop, or it using a normal nested loop rather than a parameterized one. I think if you try to make this work by trying to force the planner's hand, you'll just feel pain when the planner one day has a change of heart and decides to swap the join order on you. David
Fra: David Rowley <dgrowleyml@gmail.com> Sendt: 12. maj 2021 02:34 >> >> ok i think i just may be there is very less data , hence no index scan, no pruning. >> >> when i try to force seq_scan off, >> > >Unfortunately, no run-time pruning occurred in the above plan. > >The fact that the above plan uses Append made that possible. > >I think, for now, the only sure way to get run-time pruning working for this case is to run two separate queries so thatthe 2nd one can >perform plan-time pruning. This is the conclusion I'm taking from this thread and will base my further work on. I was the one asking the original question.A table returning function is my work-hypothesis for now. > > >I think if you try to make this work by trying to force the planner's hand, you'll just feel pain when the planner one dayhas a change of heart and decides to swap the join order on you. > >David > Thank you for the insights into the planner capabilities. Regards Niels
Fra: David Rowley <dgrowleyml@gmail.com> Sendt: 12. maj 2021 02:34
>>
>> ok i think i just may be there is very less data , hence no index scan, no pruning.
>>
>> when i try to force seq_scan off,
>>
>
>Unfortunately, no run-time pruning occurred in the above plan.
>
>The fact that the above plan uses Append made that possible.
>
>I think, for now, the only sure way to get run-time pruning working for this case is to run two separate queries so that the 2nd one can
>perform plan-time pruning.
This is the conclusion I'm taking from this thread and will base my further work on. I was the one asking the original question. A table returning function is my work-hypothesis for now.
>
>
>I think if you try to make this work by trying to force the planner's hand, you'll just feel pain when the planner one day has a change of heart and decides to swap the join order on you.
>
>David
>
Thank you for the insights into the planner capabilities.
Regards Niels