Thread: A query become very slow after upgrade from 8.1.10 to 8.4.5

A query become very slow after upgrade from 8.1.10 to 8.4.5

From

Yaocl

Date:

01 November 2010, 23:50:50

Hi

Sorry, my previous post haven't shown in this list, so I repost this
one. I have a sql become very slow after upgrade to 8.4.5.
The table creation sql like this.

begin;
CREATE TABLE t_a (
        id INT NOT NULL PRIMARY KEY
);
CREATE TABLE t_b (
        id INT NOT NULL PRIMARY KEY
);
CREATE TABLE t_c (
        id INT NOT NULL PRIMARY KEY,
        flag boolean
);

INSERT
INTO    t_a
SELECT  s
FROM    generate_series(1, 600) s;

INSERT
INTO    t_b
SELECT  s
FROM    generate_series(1, 3000) s;

SELECT SETSEED(0.1);
INSERT
INTO    t_c
SELECT  s, RANDOM()> 0.5
FROM    generate_series(1, 12000) s;

-- insert some id not in t_b into t_a
INSERT
INTO t_a values( 20000);

ANALYZE t_a;
ANALYZE t_b;
ANALYZE t_c;
end;

The query sql is like this.

SELECT t_a.id FROM t_a
WHERE EXISTS ( SELECT t_b.id FROM t_b, t_c
               WHERE t_b.id = t_a.id AND t_c.flag = 'f')

I extract this part form a big query.I known this query is not very
good.The query plan is different between 8.1.10 and 8.4.5, 8.1.10 use
a index scan, 8.4.5 use two table scan.

PostgreSQL 8.1.10 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
3.4.4 (mingw special)
Seq Scan on t_a  (cost=0.00..34.67 rows=300 width=4) (actual
time=0.025..5.350 rows=600 loops=1)
  Filter: (subplan)
  SubPlan
    ->  Nested Loop  (cost=0.00..248.44 rows=6042 width=4) (actual
time=0.007..0.007 rows=1 loops=601)
          ->  Index Scan using t_b_pkey on t_b  (cost=0.00..3.02
rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=601)
                Index Cond: (id = $0)
          ->  Seq Scan on t_c  (cost=0.00..185.00 rows=6042 width=0)
(actual time=0.001..0.001 rows=1 loops=600)
                Filter: (NOT flag)
Total runtime: 5.574 ms


PostgreSQL 8.4.5, compiled by Visual C++ build 1400, 32-bit
Nested Loop Semi Join  (cost=0.00..134044.44 rows=601 width=4) (actual
time=0.033..17375.045 rows=600 loops=1)
  Join Filter: (t_a.id = t_b.id)
  ->  Seq Scan on t_a  (cost=0.00..9.01 rows=601 width=4) (actual
time=0.008..0.172 rows=601 loops=1)
  ->  Nested Loop  (cost=0.00..447282.00 rows=18126000 width=4)
(actual time=0.011..20.922 rows=30460 loops=601)
        ->  Seq Scan on t_c  (cost=0.00..174.00 rows=6042 width=0)
(actual time=0.004..0.011 rows=11 loops=601)
              Filter: (NOT flag)
        ->  Seq Scan on t_b  (cost=0.00..44.00 rows=3000 width=4)
(actual time=0.004..0.652 rows=2756 loops=6642)
Total runtime: 17375.247 ms

If some t_a.id not in t_b.id 8.4.5 will become very slow. I confirmed
this behavior on default configuration.

Regards,
Yao

Re: A query become very slow after upgrade from 8.1.10 to 8.4.5

From

Tom Lane

Date:

02 November 2010, 19:30:17

Yaocl <chunlinyao@gmail.com> writes:
> SELECT t_a.id FROM t_a
> WHERE EXISTS ( SELECT t_b.id FROM t_b, t_c
>                WHERE t_b.id = t_a.id AND t_c.flag = 'f')

I have some hopes for fixing this in 9.1, but nothing is going to happen
in 8.4 or 9.0.  In the meantime, is it intentional that there is no join
clause between t_b and t_c?  That'd be a lot more efficient as two
separate EXISTS tests, ie

WHERE EXISTS ( SELECT 1 FROM t_b WHERE t_b.id = t_a.id ) AND
      EXISTS ( SELECT 1 FROM t_c WHERE t_c.flag = 'f')

but I wonder whether this query doesn't simply reflect a logic error on
the client side.

            regards, tom lane

Re: A query become very slow after upgrade from 8.1.10 to 8.4.5

From

Yaocl

Date:

02 November 2010, 22:47:37

On Wed, Nov 3, 2010 at 6:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yaocl <chunlinyao@gmail.com> writes:
>> SELECT t_a.id FROM t_a
>> WHERE EXISTS ( SELECT t_b.id FROM t_b, t_c
>>                WHERE t_b.id = t_a.id AND t_c.flag = 'f')
>
> I have some hopes for fixing this in 9.1, but nothing is going to happen
> in 8.4 or 9.0.  In the meantime, is it intentional that there is no join
> clause between t_b and t_c?  That'd be a lot more efficient as two
> separate EXISTS tests, ie
>
> WHERE EXISTS ( SELECT 1 FROM t_b WHERE t_b.id = t_a.id ) AND
>      EXISTS ( SELECT 1 FROM t_c WHERE t_c.flag = 'f')
>
> but I wonder whether this query doesn't simply reflect a logic error on
> the client side.
>
>                        regards, tom lane
>
Yes ,If I moved t_c to another clause, It can resolve this problem.
The original sql is generate by a orm.Has some connection between t_b
and t_c.Like this:
AND exists ( SELECT t_b.id from t_b, t_c
                 WHERE t_b.id = t_a.id
                 AND t_c.some_field <= t_b.some_field )
How ever this is still a poor query.

select t_a.id from t_a
where exists ( select t_b.id from t_b, t_c
              where t_b.id = t_a.id and t_c.flag = 'f'
               AND t_b.id < t_c.id)

8.1.10
Seq Scan on t_a  (cost=0.00..50.87 rows=300 width=4) (actual
time=0.021..5.367 rows=600 loops=1)
 Filter: (subplan)
 SubPlan
   ->  Nested Loop  (cost=0.00..137.19 rows=2014 width=4) (actual
time=0.007..0.007 rows=1 loops=601)
         ->  Index Scan using t_b_pkey on t_b  (cost=0.00..3.02
rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=601)
               Index Cond: (id = $0)
         ->  Index Scan using t_c_pkey on t_c  (cost=0.00..109.00
rows=2014 width=4) (actual time=0.003..0.003 rows=1 loops=600)
               Index Cond: (outer.id <= t_c.id)
               Filter: (NOT flag)
Total runtime: 5.564 ms

8.4.5
Nested Loop Semi Join  (cost=0.00..154223.42 rows=601 width=4) (actual
time=0.037..38727.982 rows=600 loops=1)
 Join Filter: (t_a.id = t_b.id)
 ->  Seq Scan on t_a  (cost=0.00..9.01 rows=601 width=4) (actual
time=0.011..0.237 rows=601 loops=1)
 ->  Nested Loop  (cost=0.00..182995.83 rows=6042000 width=4) (actual
time=0.009..49.298 rows=57594 loops=601)
       ->  Seq Scan on t_c  (cost=0.00..174.00 rows=6042 width=4)
(actual time=0.005..0.085 rows=169 loops=601)
             Filter: (NOT flag)
       ->  Index Scan using t_b_pkey on t_b  (cost=0.00..17.76
rows=1000 width=4) (actual time=0.007..0.132 rows=342 loops=101296)
             Index Cond: (t_b.id <= t_c.id)
Total runtime: 38728.263 ms

finally I rewritten the orm query to generate a different sql.

Regards,
Yao