Thread: Join Filter vs. Index Cond (performance regression 9.1->9.2+/HEAD)
This is distilled down from a performance regression problem that came past
on IRC earlier today:

create table t1 (a integer, b integer, c integer, primary key (a,b,c));
create table t2 (k2 integer, a integer, primary key (k2,a));
create table t3 (k3 integer, b integer, primary key (k3,b));
create table t4 (k4 integer, c integer, primary key (k4,c));
insert into t1 select i,i,i from generate_series(1,1000,20) i;
insert into t1 select 2,2,i from generate_series(1,500) i;
insert into t2 select i,i from generate_series(1,1000) i;
insert into t3 select i,i from generate_series(1,1000) i;
insert into t4 select i,i from generate_series(1,1000) i;
analyze;
explain analyze
  select * from t4
    left join t3 on (t4.c=t3.k3)
    left join t2 on (t3.b=t2.k2)
    left join t1 on (t1.a=t2.a and t1.b=t3.b and t1.c=t4.c)
   where t4.k4=2;

The plan for this on 9.4.2 comes out like this:

 Nested Loop Left Join  (cost=1.10..17.28 rows=1 width=36) (actual time=0.089..0.448 rows=1 loops=1)
   Join Filter: (t1.c = t4.c)
   Rows Removed by Join Filter: 499
   ->  Nested Loop Left Join  (cost=0.83..16.94 rows=1 width=24) (actual time=0.056..0.059 rows=1 loops=1)
         ->  Nested Loop Left Join  (cost=0.55..16.60 rows=1 width=16) (actual time=0.044..0.046 rows=1 loops=1)
               ->  Index Only Scan using t4_pkey on t4  (cost=0.28..8.29 rows=1 width=8) (actual time=0.024..0.025 rows=1 loops=1)
                     Index Cond: (k4 = 2)
                     Heap Fetches: 1
               ->  Index Only Scan using t3_pkey on t3  (cost=0.28..8.29 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1)
                     Index Cond: (k3 = t4.c)
                     Heap Fetches: 1
         ->  Index Only Scan using t2_pkey on t2  (cost=0.28..0.33 rows=1 width=8) (actual time=0.010..0.011 rows=1 loops=1)
               Index Cond: (k2 = t3.b)
               Heap Fetches: 1
   ->  Index Only Scan using t1_pkey on t1  (cost=0.28..0.33 rows=1 width=12) (actual time=0.025..0.281 rows=500 loops=1)
         Index Cond: ((a = t2.a) AND (b = t3.b))
         Heap Fetches: 500

Whereas 9.1 gives this:

 Nested Loop Left Join  (cost=0.00..33.12 rows=1 width=36) (actual time=0.074..0.096 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..24.83 rows=1 width=24) (actual time=0.054..0.069 rows=1 loops=1)
         ->  Nested Loop Left Join  (cost=0.00..16.55 rows=1 width=16) (actual time=0.039..0.048 rows=1 loops=1)
               ->  Index Scan using t4_pkey on t4  (cost=0.00..8.27 rows=1 width=8) (actual time=0.020..0.022 rows=1 loops=1)
                     Index Cond: (k4 = 2)
               ->  Index Scan using t3_pkey on t3  (cost=0.00..8.27 rows=1 width=8) (actual time=0.009..0.011 rows=1 loops=1)
                     Index Cond: (t4.c = k3)
         ->  Index Scan using t2_pkey on t2  (cost=0.00..8.27 rows=1 width=8) (actual time=0.008..0.010 rows=1 loops=1)
               Index Cond: (t3.b = k2)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=12) (actual time=0.013..0.016 rows=1 loops=1)
         Index Cond: ((a = t2.a) AND (b = t3.b) AND (c = t4.c))

In the real example, the join filter in the 9.4.2 plan was discarding 40
million rows, not just 500, so the performance impact was quite serious.

Obviously it makes little sense to use an (a,b,c) index to look up just
(a,b) and then filter on c; the question is, what is the planner doing
that leads it to get this so wrong?  Finding a workaround for it was not
easy, either - the only thing that I found that worked was replacing the
t1 join with a lateral join with an OFFSET 0 clause to nobble the
planner entirely.

-- 
Andrew (irc:RhodiumToad)
On Fri, May 29, 2015 at 6:45 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
> Obviously it makes little sense to use an (a,b,c) index to look up just
> (a,b) and then filter on c; the question is, what is the planner doing
> that leads it to get this so wrong?  Finding a workaround for it was not
> easy, either - the only thing that I found that worked was replacing the
> t1 join with a lateral join with an OFFSET 0 clause to nobble the
> planner entirely.

I agree.  That looks like a bug.

The fact that it chooses index-only scans is also a little strange,
considering that they seem to be entirely ineffective.  The tables have
never been vacuumed, so presumably the all-visible bits are all 0, and
the planner knows it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Obviously it makes little sense to use an (a,b,c) index to look up just
> (a,b) and then filter on c; the question is, what is the planner doing
> that leads it to get this so wrong?

It's not so astonishing as all that; compare

regression=# explain select * from t1 where a=3 and b=4;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t1_pkey on t1  (cost=0.28..8.29 rows=1 width=12)
   Index Cond: ((a = 3) AND (b = 4))
(2 rows)

regression=# explain select * from t1 where a=3 and b=4 and c=5;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t1_pkey on t1  (cost=0.28..8.30 rows=1 width=12)
   Index Cond: ((a = 3) AND (b = 4) AND (c = 5))
(2 rows)

Once you're down to an estimate of one row retrieved, adding additional
index conditions simply increases the cost (not by much, but it
increases) without delivering any visible benefit.

I believe what probably happened in this case is that the planner
considered both forms of the indexscan path and concluded that they were
fuzzily the same cost and rowcount, yet the path using only t2.a and
t3.b clearly dominated by requiring strictly fewer outer relations for
parameters.  So it threw away the path that also had the c = t4.c
comparison before it ever got to the join stage.  Even had it kept that
path, the join cost estimate wouldn't have looked any better than the
one for the join it did pick, so there would have been no certainty of
picking the "correct" plan.

The real problem in your example is thus the incorrect rowcount
estimate; with better rowcount estimates the two cases wouldn't have
appeared to have the same output rowcount.
For the toy data in your example, this can probably be blamed on the
fact that eqjoinsel_inner doesn't have any smarts for the case of having
an MCV list for only one side (though as noted in the comments, it's not
obvious what it should do instead).  However, it's not very clear what
was happening in the real-world case.

			regards, tom lane
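Tom's description of why the three-column path gets discarded can be
sketched as code.  This is a toy model, not PostgreSQL's actual add_path
logic; the 1% fuzz factor, the cost numbers, and the dict representation
of a path are all illustrative assumptions.  The point it demonstrates
is the asymmetry Tom describes: when costs and rowcounts are fuzzily the
same, the path needing fewer outer relations for its parameters
dominates, so the other path is pruned before join planning ever sees it.

```python
# Toy sketch (NOT the real add_path code) of the pruning rule:
# path a makes path b redundant if a is fuzzily as cheap, returns no
# more rows, and requires a subset of b's outer relations.

FUZZ = 1.01  # costs within ~1% are treated as "the same" (assumed value)

def dominates(a, b):
    """True if keeping path b alongside path a is pointless."""
    return (a["cost"] <= b["cost"] * FUZZ          # fuzzily as cheap
            and a["rows"] <= b["rows"]             # no worse rowcount
            and a["outer_rels"] <= b["outer_rels"])  # subset of parameters

# Scan of t1_pkey using only (a = t2.a AND b = t3.b):
short_path = {"cost": 8.29, "rows": 1, "outer_rels": {"t2", "t3"}}
# Same index, also using (c = t4.c): slightly costlier, same 1-row estimate:
full_path = {"cost": 8.30, "rows": 1, "outer_rels": {"t2", "t3", "t4"}}

print(dominates(short_path, full_path))  # True: full_path is thrown away
print(dominates(full_path, short_path))  # False: it needs more outer rels
```

With better rowcount estimates, full_path would show fewer rows than
short_path and the first test would come out False, so both paths would
survive to the join stage.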
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> Once you're down to an estimate of one row retrieved, adding
 Tom> additional index conditions simply increases the cost (not by
 Tom> much, but it increases) without delivering any visible benefit.

OK, but this is a serious problem because "estimate of one row" is a
very common estimation failure mode, and isn't always solvable in the
sense of arranging for better estimates (in the absence of hints, ugh).

 Tom> I believe what probably happened in this case is that the planner
 Tom> considered both forms of the indexscan path and concluded that
 Tom> they were fuzzily the same cost and rowcount, yet the path using
 Tom> only t2.a and t3.b clearly dominated by requiring strictly fewer
 Tom> outer relations for parameters.  So it threw away the path that
 Tom> also had the c = t4.c comparison before it ever got to the join
 Tom> stage.  Even had it kept that path, the join cost estimate
 Tom> wouldn't have looked any better than the one for the join it did
 Tom> pick, so there would have been no certainty of picking the
 Tom> "correct" plan.

 Tom> The real problem in your example is thus the incorrect rowcount
 Tom> estimate; with better rowcount estimates the two cases wouldn't
 Tom> have appeared to have the same output rowcount.

 Tom> For the toy data in your example, this can probably be blamed on
 Tom> the fact that eqjoinsel_inner doesn't have any smarts for the case
 Tom> of having an MCV list for only one side (though as noted in the
 Tom> comments, it's not obvious what it should do instead).  However,
 Tom> it's not very clear what was happening in the real-world case.

In the real-world case, t1 was something like an "overrides" table for
data otherwise obtained from the other tables, i.e. special-case
exceptions for general rules.
As such it is highly skewed, with many possible (a,b) values having no
row at all, but others having hundreds of matches on (a,b) (though at
most one on (a,b,c), since this was the pkey in the real data as well as
in the testcase).  Accordingly, there was no way that we could find to
get any kind of better estimate of the rowcount.

-- 
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> Tom> Once you're down to an estimate of one row retrieved, adding
> Tom> additional index conditions simply increases the cost (not by
> Tom> much, but it increases) without delivering any visible benefit.

> OK, but this is a serious problem because "estimate of one row" is a
> very common estimation failure mode, and isn't always solvable in the
> sense of arranging for better estimates (in the absence of hints, ugh).

Yeah.  I've occasionally wondered about removing the clamp-to-one-row
behavior, so that additional conditions would still look like they
contributed something (ie, 0.1 row is better than 1 row).  However, that
seems likely to break about as many cases as it fixes :-(.

A variant of that would be to only allow the minimum to be 1 row if we
are absolutely certain that's what we'll get (eg, we're searching on a
unique-key equality condition), and otherwise clamp to at least 2 rows.
Again though, this would be destabilizing lots of cases that work well
today.

I doubt there are any simple solutions here.

			regards, tom lane
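The clamping behavior under discussion can be illustrated with a small
sketch.  This is not the planner's actual estimation code; the table
size and selectivity numbers are made-up values chosen only to show the
effect.  With a lower clamp of 1 row, the (a,b) lookup and the strictly
more selective (a,b,c) lookup produce identical estimates; without the
clamp (or with Tom's clamp-to-2-unless-provably-unique variant), the
extra condition remains visible to cost comparison.

```python
# Toy illustration (NOT the planner's real code) of clamp-to-one-row.

def estimate_rows(table_rows, selectivity, clamp_min=1.0):
    """Crude row estimate with a lower clamp."""
    return max(clamp_min, table_rows * selectivity)

# Assumed, illustrative numbers: ~550 rows in t1, and two candidate
# qual sets where the (a,b,c) quals are 5x more selective than (a,b).
TABLE_ROWS = 550
SEL_AB = 0.001
SEL_ABC = 0.0002

# Current behavior: both estimates bottom out at the clamp, so the
# two index paths look identical in rowcount.
print(estimate_rows(TABLE_ROWS, SEL_AB))    # prints 1.0
print(estimate_rows(TABLE_ROWS, SEL_ABC))   # prints 1.0 -- indistinguishable

# Without the clamp, 0.1-ish rows still beats 0.5-ish rows, so the
# extra index condition keeps a visible payoff.
print(estimate_rows(TABLE_ROWS, SEL_AB, clamp_min=0.0))
print(estimate_rows(TABLE_ROWS, SEL_ABC, clamp_min=0.0))

def clamped_variant(table_rows, selectivity, provably_unique):
    """Tom's variant: allow 1 only when the lookup is provably unique."""
    if provably_unique:
        return 1.0
    return max(2.0, table_rows * selectivity)
```

As the thread notes, either change would perturb many plans that are
fine today, which is why neither is presented as a straightforward fix.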
On Sun, May 31, 2015 at 12:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> Obviously it makes little sense to use an (a,b,c) index to look up just
>> (a,b) and then filter on c; the question is, what is the planner doing
>> that leads it to get this so wrong?
>
> It's not so astonishing as all that; compare
>
> regression=# explain select * from t1 where a=3 and b=4;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Index Only Scan using t1_pkey on t1  (cost=0.28..8.29 rows=1 width=12)
>    Index Cond: ((a = 3) AND (b = 4))
> (2 rows)
>
> regression=# explain select * from t1 where a=3 and b=4 and c=5;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Index Only Scan using t1_pkey on t1  (cost=0.28..8.30 rows=1 width=12)
>    Index Cond: ((a = 3) AND (b = 4) AND (c = 5))
> (2 rows)
>
> Once you're down to an estimate of one row retrieved, adding additional
> index conditions simply increases the cost (not by much, but it increases)
> without delivering any visible benefit.

But Andrew's example is equivalent to planning the second query by
putting the quals on a and b into the index qual and treating c=5 as a
post-filter condition even though the index we're using is on (a, b, c),
which I don't believe we'd ever do.  If we did, I'd find that
astonishing, too.

> I believe what probably happened in this case is that the planner
> considered both forms of the indexscan path and concluded that they were
> fuzzily the same cost and rowcount, yet the path using only t2.a and t3.b
> clearly dominated by requiring strictly fewer outer relations for
> parameters.  So it threw away the path that also had the c = t4.c
> comparison before it ever got to the join stage.  Even had it kept that
> path, the join cost estimate wouldn't have looked any better than the one
> for the join it did pick, so there would have been no certainty of picking
> the "correct" plan.

It's just hard to believe that it's ever better to treat something as a
join filter than as an index condition.  Yes, checking the index
condition isn't free, either.  But it doesn't seem like it should be
particularly more expensive than checking the same thing as a join
filter.  And if there are a lot of rows involved, it's going to be a
whole lot LESS expensive.  I guess it's hard for me to credit the idea
that a parameterized index path constraining a superset of the columns
present in some other parameterized path on the same index has
insufficient additional selectivity to justify its existence.  Even in a
world where planner estimates are never wrong, I doubt treating the qual
as a join filter is ever meaningfully better.  The absolute best case -
or so it seems to me - is a tie.  If we underestimate the row counts,
it's a loss.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
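The asymmetry Robert is pointing at can be put in rough numbers.  This
is back-of-the-envelope arithmetic with an assumed uniform per-row cost,
not a cost model from the planner: a join filter must fetch and then
discard every row matching the weaker (a,b) condition, while the index
condition on (a,b,c) never produces the doomed rows in the first place.

```python
# Illustrative arithmetic only; PER_ROW_CHECK is an assumed constant,
# not a real planner cost parameter.

PER_ROW_CHECK = 1.0  # pretend cost units per row fetched and checked

def join_filter_cost(rows_matching_ab):
    # Every (a,b) match is fetched from the index, then filtered on c.
    return rows_matching_ab * PER_ROW_CHECK

def index_cond_cost(rows_matching_abc):
    # The scan only ever returns rows already matching (a,b,c).
    return rows_matching_abc * PER_ROW_CHECK

print(join_filter_cost(500))         # prints 500.0 -- the toy test case
print(index_cond_cost(1))            # prints 1.0
print(join_filter_cost(40_000_000))  # the scale of the real-world case
```

When the two row counts are equal the plans tie, which matches Robert's
"absolute best case"; in every other case the join-filter form loses by
exactly the number of rows it fetches only to throw away.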