Thread: Estimate of the inner_rows
Hi,

I've run into a query-plan problem like the following. The plan contains
two scan nodes whose row estimates are 1 and 7, yet the join estimate is
7418 rows, while the actual result is just 1 row. An estimate that is too
large like this can affect the choice of join method further up the plan.
I've analyzed the reason, and I think we can do better.
postgres=# explain select * from test t1 left join test t2 on t1.b = t2.b and t2.c = 10 where t1.a = 1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.85..48.16 rows=7418 width=24)
   ->  Index Scan using a_idx on test t1  (cost=0.43..8.45 rows=1 width=12)
         Index Cond: (a = 1)
   ->  Index Scan using b_idx on test t2  (cost=0.43..39.64 rows=7 width=12)
         Index Cond: (b = t1.b)
         Filter: (c = 10)
(6 rows)
postgres=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |
 c      | integer |           |          |
Indexes:
    "a_idx" btree (a)
    "b_idx" btree (b)
Reading through the source code, I can see how this result is obtained.

First, the estimate of 7 rows for the inner scan comes from the
conditions t2.b = t1.b and t2.c = 10. clauselist_selectivity_ext
computes the combined selectivity, and the result is:

    selectivity(t2.b = ?) * selectivity(t2.c = 10) * ntuples
    = (1/134830) * (1002795/1201000) * 1201000 ≈ 7
Second, when computing the join size, the join selectivity comes from
eqjoinsel, and the row count is computed in calc_joinrel_size_estimate:

    case JOIN_LEFT:
        nrows = outer_rows * inner_rows * fkselec * jselec;
        if (nrows < outer_rows)
            nrows = outer_rows;
        nrows *= pselec;
        break;
Here outer_rows is 1, but inner_rows is 1002795, which is the estimate
for the t2.c = 10 condition alone, not the 7 shown in the plan.

So, from this analysis, I think the cause is the estimate used for
inner_rows: we currently consider only the condition t2.c = 10, not the
combination t1.b = t2.b and t2.c = 10.
陈雁飞 <postgresql_2016@163.com> writes:
> postgres=# explain select * from test t1 left join test t2 on t1.b = t2.b and t2.c = 10 where t1.a = 1;
>                                  QUERY PLAN
> -----------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=0.85..48.16 rows=7418 width=24)
>    ->  Index Scan using a_idx on test t1  (cost=0.43..8.45 rows=1 width=12)
>          Index Cond: (a = 1)
>    ->  Index Scan using b_idx on test t2  (cost=0.43..39.64 rows=7 width=12)
>          Index Cond: (b = t1.b)
>          Filter: (c = 10)
> (6 rows)

I tried to reproduce this and could not.  What PG version are you
running exactly?  Can you provide a self-contained example?

			regards, tom lane
陈雁飞 <postgresql_2016@163.com> writes:
> 1. The first problem was found in PG 9.2, our production version. But
> I tested it on the latest version.

You're still running 9.2 in production?  That's ... inadvisable.

> 2. The data is simple. Column b has many distinct values, and very
> many rows have c = 10. I've dumped it in test.log in the email
> attachment.

Thanks for the test data.  I don't think there is actually anything
wrong here, or at least nothing readily improvable.  I get this for
your original query:

postgres=# explain analyze select * from test t1 left join test t2 on t1.b = t2.b and t2.c = 10 where t1.a = 1;
                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.85..16.90 rows=7567 width=24) (actual time=0.014..0.015 rows=1 loops=1)
   ->  Index Scan using a_idx on test t1  (cost=0.43..8.45 rows=1 width=12) (actual time=0.008..0.008 rows=1 loops=1)
         Index Cond: (a = 1)
   ->  Index Scan using b_idx on test t2  (cost=0.43..8.45 rows=1 width=12) (actual time=0.003..0.004 rows=1 loops=1)
         Index Cond: (b = t1.b)
         Filter: (c = 10)
 Planning Time: 0.321 ms
 Execution Time: 0.032 ms
(8 rows)

Slightly different from your estimate, but random sampling or a
different statistics-target setting would be enough to explain that.
It is *not* the t2.c = 10 condition that's resulting in the incorrect
estimate, because taking it out doesn't change things much:

postgres=# explain analyze select * from test t1 left join test t2 on t1.b = t2.b where t1.a = 1;
                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.85..16.90 rows=9088 width=24) (actual time=0.014..0.015 rows=1 loops=1)
   ->  Index Scan using a_idx on test t1  (cost=0.43..8.45 rows=1 width=12) (actual time=0.008..0.009 rows=1 loops=1)
         Index Cond: (a = 1)
   ->  Index Scan using b_idx on test t2  (cost=0.43..8.45 rows=1 width=12) (actual time=0.002..0.003 rows=1 loops=1)
         Index Cond: (b = t1.b)
 Planning Time: 0.312 ms
 Execution Time: 0.031 ms
(7 rows)

Next, let's take out the t1.a = 1 condition:

postgres=# explain analyze select * from test t1 left join test t2 on t1.b = t2.b;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.43..602605.00 rows=10914402957 width=24) (actual time=0.023..1872651.044 rows=10914402000 loops=1)
   ->  Seq Scan on test t1  (cost=0.00..18506.00 rows=1201000 width=12) (actual time=0.013..62.837 rows=1201000 loops=1)
   ->  Index Scan using b_idx on test t2  (cost=0.43..0.48 rows=1 width=12) (actual time=0.001..0.945 rows=9088 loops=1201000)
         Index Cond: (b = t1.b)
 Planning Time: 0.297 ms
 Execution Time: 2062621.438 ms
(6 rows)

The fundamental join-size estimate, that is the selectivity of
t1.b = t2.b, is pretty much dead on here.  And note the actual
rows=9088 in the inner indexscan.  What we see here is that the actual
average number of t2 rows joining to a t1 row is 9088.
Now the reason for the estimate for the second query becomes clear:
the planner doesn't know exactly how many rows join to the specific
row with t1.a = 1, but it knows that the average number of joined rows
should be 9088, so that's its estimate.  In your original query, that
is reduced a little bit by the not-very-selective "c = 10" condition,
but the big picture is the same.  Basically, the row with "a = 1" has
an atypical number of join partners, and that's why the join size
estimate is wrong for this particular query.

You might nonetheless complain that the join size estimate should be
the product of the two input-scan estimates, but that's not how it
works.  (Typically those do match up, but with highly skewed data like
this the fact that they're derived differently can become obvious.)
The size of the b_idx result is based on considering the selectivity
of "t2.b = ?" where the comparison value is not known.  Because there
are so many b values that are unique, we end up estimating the average
number of matching rows as 1, even though it will be far higher for a
few values.

			regards, tom lane