Thread: A query become very slow after upgrade from 8.1.10 to 8.4.5
Hi Sorry, my previous post haven't shown in this list, so I repost this one. I have a sql become very slow after upgrade to 8.4.5. The table creation sql like this. begin; CREATE TABLE t_a ( id INT NOT NULL PRIMARY KEY ); CREATE TABLE t_b ( id INT NOT NULL PRIMARY KEY ); CREATE TABLE t_c ( id INT NOT NULL PRIMARY KEY, flag boolean ); INSERT INTO t_a SELECT s FROM generate_series(1, 600) s; INSERT INTO t_b SELECT s FROM generate_series(1, 3000) s; SELECT SETSEED(0.1); INSERT INTO t_c SELECT s, RANDOM()> 0.5 FROM generate_series(1, 12000) s; -- insert some id not in t_b into t_a INSERT INTO t_a values( 20000); ANALYZE t_a; ANALYZE t_b; ANALYZE t_c; end; The query sql is like this. SELECT t_a.id FROM t_a WHERE EXISTS ( SELECT t_b.id FROM t_b, t_c WHERE t_b.id = t_a.id AND t_c.flag = 'f') I extract this part form a big query.I known this query is not very good.The query plan is different between 8.1.10 and 8.4.5, 8.1.10 use a index scan, 8.4.5 use two table scan. PostgreSQL 8.1.10 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.4 (mingw special) Seq Scan on t_a (cost=0.00..34.67 rows=300 width=4) (actual time=0.025..5.350 rows=600 loops=1) Filter: (subplan) SubPlan -> Nested Loop (cost=0.00..248.44 rows=6042 width=4) (actual time=0.007..0.007 rows=1 loops=601) -> Index Scan using t_b_pkey on t_b (cost=0.00..3.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=601) Index Cond: (id = $0) -> Seq Scan on t_c (cost=0.00..185.00 rows=6042 width=0) (actual time=0.001..0.001 rows=1 loops=600) Filter: (NOT flag) Total runtime: 5.574 ms PostgreSQL 8.4.5, compiled by Visual C++ build 1400, 32-bit Nested Loop Semi Join (cost=0.00..134044.44 rows=601 width=4) (actual time=0.033..17375.045 rows=600 loops=1) Join Filter: (t_a.id = t_b.id) -> Seq Scan on t_a (cost=0.00..9.01 rows=601 width=4) (actual time=0.008..0.172 rows=601 loops=1) -> Nested Loop (cost=0.00..447282.00 rows=18126000 width=4) (actual time=0.011..20.922 rows=30460 loops=601) -> Seq Scan on t_c (cost=0.00..174.00 rows=6042 width=0) (actual time=0.004..0.011 rows=11 loops=601) Filter: (NOT flag) -> Seq Scan on t_b (cost=0.00..44.00 rows=3000 width=4) (actual time=0.004..0.652 rows=2756 loops=6642) Total runtime: 17375.247 ms If some t_a.id not in t_b.id 8.4.5 will become very slow. I confirmed this behavior on default configuration. Regards, Yao
Yaocl <chunlinyao@gmail.com> writes: > SELECT t_a.id FROM t_a > WHERE EXISTS ( SELECT t_b.id FROM t_b, t_c > WHERE t_b.id = t_a.id AND t_c.flag = 'f') I have some hopes for fixing this in 9.1, but nothing is going to happen in 8.4 or 9.0. In the meantime, is it intentional that there is no join clause between t_b and t_c? That'd be a lot more efficient as two separate EXISTS tests, ie WHERE EXISTS ( SELECT 1 FROM t_b WHERE t_b.id = t_a.id ) AND EXISTS ( SELECT 1 FROM t_c WHERE t_c.flag = 'f') but I wonder whether this query doesn't simply reflect a logic error on the client side. regards, tom lane
On Wed, Nov 3, 2010 at 6:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yaocl <chunlinyao@gmail.com> writes: >> SELECT t_a.id FROM t_a >> WHERE EXISTS ( SELECT t_b.id FROM t_b, t_c >> WHERE t_b.id = t_a.id AND t_c.flag = 'f') > > I have some hopes for fixing this in 9.1, but nothing is going to happen > in 8.4 or 9.0. In the meantime, is it intentional that there is no join > clause between t_b and t_c? That'd be a lot more efficient as two > separate EXISTS tests, ie > > WHERE EXISTS ( SELECT 1 FROM t_b WHERE t_b.id = t_a.id ) AND > EXISTS ( SELECT 1 FROM t_c WHERE t_c.flag = 'f') > > but I wonder whether this query doesn't simply reflect a logic error on > the client side. > > regards, tom lane > Yes ,If I moved t_c to another clause, It can resolve this problem. The original sql is generate by a orm.Has some connection between t_b and t_c.Like this: AND exists ( SELECT t_b.id from t_b, t_c WHERE t_b.id = t_a.id AND t_c.some_field <= t_b.some_field ) How ever this is still a poor query. select t_a.id from t_a where exists ( select t_b.id from t_b, t_c where t_b.id = t_a.id and t_c.flag = 'f' AND t_b.id < t_c.id) 8.1.10 Seq Scan on t_a (cost=0.00..50.87 rows=300 width=4) (actual time=0.021..5.367 rows=600 loops=1) Filter: (subplan) SubPlan -> Nested Loop (cost=0.00..137.19 rows=2014 width=4) (actual time=0.007..0.007 rows=1 loops=601) -> Index Scan using t_b_pkey on t_b (cost=0.00..3.02 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=601) Index Cond: (id = $0) -> Index Scan using t_c_pkey on t_c (cost=0.00..109.00 rows=2014 width=4) (actual time=0.003..0.003 rows=1 loops=600) Index Cond: (outer.id <= t_c.id) Filter: (NOT flag) Total runtime: 5.564 ms 8.4.5 Nested Loop Semi Join (cost=0.00..154223.42 rows=601 width=4) (actual time=0.037..38727.982 rows=600 loops=1) Join Filter: (t_a.id = t_b.id) -> Seq Scan on t_a (cost=0.00..9.01 rows=601 width=4) (actual time=0.011..0.237 rows=601 loops=1) -> Nested Loop (cost=0.00..182995.83 rows=6042000 width=4) (actual time=0.009..49.298 rows=57594 loops=601) -> Seq Scan on t_c (cost=0.00..174.00 rows=6042 width=4) (actual time=0.005..0.085 rows=169 loops=601) Filter: (NOT flag) -> Index Scan using t_b_pkey on t_b (cost=0.00..17.76 rows=1000 width=4) (actual time=0.007..0.132 rows=342 loops=101296) Index Cond: (t_b.id <= t_c.id) Total runtime: 38728.263 ms finally I rewritten the orm query to generate a different sql. Regards, Yao