Re: The rewritting of join conditions caused a very slow query plan. - Mailing list pgsql-hackers

From chang chao
Subject Re: The rewritting of join conditions caused a very slow query plan.
Date
Msg-id SG2PR06MB114954351FA5DAE7566854A984490@SG2PR06MB1149.apcprd06.prod.outlook.com
Whole thread Raw
In response to The rewritting of join conditions caused a very slow query plan.  (chang chao <chang-chao@hotmail.com>)
List pgsql-hackers
Hi,all

This is my ddl and test data.

CREATE TABLE level1_table
(     level1_no serial NOT NULL ,     level1_node_name varchar(255),      PRIMARY KEY (level1_no)
) WITHOUT OIDS;

CREATE TABLE level2_table
(     level2_no serial NOT NULL ,     parent_no int NOT NULL,     level1_node_name varchar(255),      PRIMARY KEY
(level2_no)
) WITHOUT OIDS;

ALTER TABLE level2_table      ADD FOREIGN KEY (parent_no)      REFERENCES level1_table (level1_no)      ON UPDATE
RESTRICT     ON DELETE RESTRICT 
;

CREATE TABLE all_level_status
(     level int NOT NULL,     node_no int NOT NULL,     status int,      PRIMARY KEY (level, node_no)
) WITHOUT OIDS;


--create test data
INSERT INTO level1_table(           level1_no)
select generate_series(1,200);

INSERT INTO level2_table(           level2_no, parent_no)
select
level2_no,level2_no/50+1 as parent_no
from
generate_series(1,9999) level2_no ;

INSERT INTO all_level_status(           level, node_no, status)
select 1,level1_no,0
from level1_table;

INSERT INTO all_level_status(           level, node_no, status)
select 2,level2_no,0
from level2_table;

--analyze table
analyze level1_table;
analyze level2_table;
analyze all_level_status;

--execute sql
explain analyze
select * from level2_table l2
join (select l1.* from level1_table l1join all_level_status als on (als.level=1 and als.node_no=l1.level1_no)where
als.status=0
) normal_l1 on l2.parent_no=normal_l1.level1_no;

result :
"Hash Join  (cost=16.39..198.92 rows=200 width=1044) (actual time=0.246..3.772 rows=9999 loops=1)"
"  Hash Cond: (l2.parent_no = als.node_no)"
"  ->  Seq Scan on level2_table l2  (cost=0.00..144.99 rows=9999 width=524) (actual time=0.011..0.840 rows=9999
loops=1)"
"  ->  Hash  (cost=16.34..16.34 rows=4 width=524) (actual time=0.226..0.226 rows=200 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 8kB"
"        ->  Merge Join  (cost=10.93..16.34 rows=4 width=524) (actual time=0.077..0.193 rows=200 loops=1)"
"              Merge Cond: (als.node_no = l1.level1_no)"
"              ->  Index Scan using all_level_status_pkey on all_level_status als  (cost=0.29..109.10 rows=200 width=4)
(actualtime=0.021..0.069 rows=200 loops=1)" 
"                    Index Cond: (level = 1)"
"                    Filter: (status = 0)"
"              ->  Sort  (cost=10.64..11.14 rows=200 width=520) (actual time=0.055..0.062 rows=200 loops=1)"
"                    Sort Key: l1.level1_no"
"                    Sort Method: quicksort  Memory: 34kB"
"                    ->  Seq Scan on level1_table l1  (cost=0.00..3.00 rows=200 width=520) (actual time=0.007..0.020
rows=200loops=1)" 
"Planning time: 1.158 ms"
"Execution time: 4.054 ms"

The main reason for the the big gap between estimated(200) and actual(9999) rows in line 1 lies in that, in actuality
rowsfor the join conditon column(node_no) in all_level_status table are not evenly distributed for the filter condition
column(level),butin row estimation,the optimizer takes the assumption that they are evenly distributed.So as a lesson,I
thinkwe should try to make the distribution of rows for the join condition column even when possible. 

Thus, can we draw the conclusion that the current structure of all_level_status table is a bad design and should be
avoided? 

Chao.

From: chang chao <chang-chao@hotmail.com>
Sent: Monday, May 16, 2016 17:23
To: pgsql-hackers@postgresql.org
Subject: The rewritting of join conditions caused a very slow query plan.
 

Hi,all

I have a query that is very slow,and the reason may be in the rewritting of join conditions.

this is the simplied version table and the key part of the sql.

level1_table and level2_table hold the tree data nodes,
and all_level_status table holds the current status all all nodes of all levels.
(I know that there would be much less trouble in performance if all_level_status was divided into two
tables,namely,level1_statusand level2_status tables.) 

table1: level1_table
  level1_no   PK:serial
  level1_node_name :varchar

table2:level2_table
  level2_no   PK:serial
  parent_no   FK to level1_table.level1_no
  level2_node_name :varchar

table3: all_level_status
  level:1 OR 2 PK1
  node_no:level1_table.level1_no or level2_table.level2_no PK2
  status:0 OR 1(normal or abnormal)


The sql to find all level2 nodes whose parent level nodes are in normal status.

explain analyze
select * from level2_table l2
join (
 select l1.* from level1_table l1
 join all_level_status als on (als.level=1 and als.node_no=l1.level1_no)
 where  als.status=0
) normal_l1 on l2.parent_no=normal_l1.level1_no;


this is the query plan .

"Merge Join  (cost=3.38..5.13 rows=3 width=158) (actual time=0.087..0.179 rows=21 loops=1)"
"  Merge Cond: (als.node_no = l2.parent_no)"
"  ->  Merge Join  (cost=1.63..7.66 rows=19 width=80) (actual time=0.067..0.126 rows=18 loops=1)"
"        Merge Cond: (als.node_no = l1.level1_no)"
"        ->  Index Scan using all_level_status_pkey on all_level_status als  (cost=0.00..21.74 rows=19 width=4) (actual
time=0.037..0.079rows=18 loops=1)" 
"              Index Cond: (level = 1)"
"              Filter: (status = 0)"
"        ->  Sort  (cost=1.63..1.68 rows=20 width=76) (actual time=0.026..0.026 rows=20 loops=1)"
"              Sort Key: l1.level1_no"
"              Sort Method:  quicksort  Memory: 27kB"
"              ->  Seq Scan on level1_table l1  (cost=0.00..1.20 rows=20 width=76) (actual time=0.005..0.009 rows=20
loops=1)"
"  ->  Sort  (cost=1.75..1.81 rows=23 width=82) (actual time=0.016..0.024 rows=23 loops=1)"
"        Sort Key: l2.parent_no"
"        Sort Method:  quicksort  Memory: 28kB"
"        ->  Seq Scan on level2_table l2  (cost=0.00..1.23 rows=23 width=82) (actual time=0.003..0.005 rows=23
loops=1)"
"Total runtime: 0.307 ms"


Please note that,join condition of query plan line 2 is rewritten to "als.node_no = l2.parent_no"
level1 and level2 nodes are of the 1:n relationship,and because all_level_status.node_no represents different
things(level1_table.level1_noand level2_table.level2_no uses separate serials),so when this rewriting is applied,the
statisticsof mcvs of all_level_status.node_no  and level2.parent_no will be used to do the row selectivity,as can be
anticipated,alarge gap occurred between actual rows and estimated rows. 

the above sql is one simplified part of a long sql,because of this gap,the estimated row count becomes 1 in the outer
sub-query,whichin actual has large number of values, 
the very slow nested-loop join is selected.

Had the rewriting of the join condition not be done,maybe a much fast query plan would be selected.

So I'm wondering what is the reason behind the join condition rewriting,
Is it just because that join conditions that both left and right side have mcvs are preferable to those in which there
areno mcvs on both sides? 



Chao.




pgsql-hackers by date:

Previous
From: Ildar Musin
Date:
Subject: Re: Declarative partitioning
Next
From: Jeff Janes
Date:
Subject: Re: Reviewing freeze map code