Re: mysterious difference in speed when combining two queries with OR - Mailing list pgsql-performance

From hans
Subject Re: mysterious difference in speed when combining two queries with OR
Date
Msg-id 20080423090605.GE6957@Spelmaskinen
Whole thread Raw
In response to Re: mysterious difference in speed when combining two queries with OR  ("A. Kretschmer" <andreas.kretschmer@schollglas.com>)
List pgsql-performance
On Wed, Apr 23, 2008 at 09:58:10AM +0200, A. Kretschmer wrote:
> am  Wed, dem 23.04.2008, um  9:23:07 +0200 mailte Hans Ekbrand folgendes:
> > I cannot understand why the following two queries differ so much in execution time (almost ten times)
>
> wild guess: different execution plans.
>
>
> Can you show us the plans? (EXPLAIN ANALYSE SELECT ...)

Query A (first part)

fektest=> explain analyse select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and
timecard.mid= moment.mid) order by moment.mid; 
                                                         QUERY PLAN
     

-----------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=287.66..288.67 rows=203 width=4) (actual time=11.412..11.622 rows=41 loops=1)
   ->  Sort  (cost=287.66..288.16 rows=203 width=4) (actual time=11.409..11.484 rows=57 loops=1)
         Sort Key: moment.mid
         ->  Hash Join  (cost=60.98..279.88 rows=203 width=4) (actual time=2.346..11.182 rows=57 loops=1)
               Hash Cond: ("outer".mid = "inner".mid)
               ->  Seq Scan on timecard  (cost=0.00..211.78 rows=1017 width=4) (actual time=0.031..7.427 rows=995
loops=1)
                     Filter: (pid = 17)
               ->  Hash  (cost=59.88..59.88 rows=444 width=4) (actual time=2.127..2.127 rows=0 loops=1)
                     ->  Seq Scan on moment  (cost=0.00..59.88 rows=444 width=4) (actual time=0.027..1.825 rows=199
loops=1)
                           Filter: (parent = 45)
 Total runtime: 11.852 ms
(11 rows)

Query A (second part)

fektest=> explain analyse select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by
moment.mid;
                                                              QUERY PLAN
              

--------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=192.62..3800.67 rows=20 width=4) (actual time=0.578..109.274 rows=2 loops=1)
   ->  Nested Loop  (cost=192.62..3417.57 rows=153240 width=4) (actual time=0.575..89.546 rows=15324 loops=1)
         ->  Index Scan using moment_mid_idx on moment  (cost=0.00..160.15 rows=20 width=4) (actual time=0.544..3.490
rows=2loops=1) 
               Filter: ((parent = 45) AND (pbar = 0))
         ->  Materialize  (cost=192.62..269.24 rows=7662 width=0) (actual time=0.009..21.998 rows=7662 loops=2)
               ->  Seq Scan on timecard  (cost=0.00..192.62 rows=7662 width=0) (actual time=0.007..14.554 rows=7662
loops=1)
 Total runtime: 109.870 ms
(7 rows)

Query B

fektest=> EXPLAIN ANALYSE SELECT distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and
timecard.mid= moment.mid) or (pbar = 0)) order by moment.mid; 
                                                               QUERY PLAN
                 

-----------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=192.62..102469.31 rows=444 width=4) (actual time=143.444..4838.067 rows=42 loops=1)
   ->  Nested Loop  (cost=192.62..102405.04 rows=25710 width=4) (actual time=143.439..4818.215 rows=15379 loops=1)
         Join Filter: ((("inner".pid = 17) OR ("outer".pbar = 0)) AND (("inner".mid = "outer".mid) OR ("outer".pbar =
0)))
         ->  Index Scan using moment_mid_idx on moment  (cost=0.00..154.58 rows=444 width=8) (actual time=0.390..5.954
rows=199loops=1) 
               Filter: (parent = 45)
         ->  Materialize  (cost=192.62..269.24 rows=7662 width=8) (actual time=0.001..9.728 rows=7662 loops=199)
               ->  Seq Scan on timecard  (cost=0.00..192.62 rows=7662 width=8) (actual time=0.007..17.007 rows=7662
loops=1)
 Total runtime: 4838.786 ms
(8 rows)

> > I should say that this is on postgresql 7.4.16 (debian stable).
>
> Uhh. Why not a recent version? We have 8.3.0...

No particularly good reason, just that I have taken over a production
system and I didn't want to mess up with before I am confident with
it. But I on a test-site I have migrated to 8.1 without problems, so
migration will happen, we just haven't a reason for doing it yet,
since 7.4 has served us well.

> > Can query B be rewritten so that it would execute faster?
>
> Quick and dirty: use both selects (query A) combined with UNION.

I will look into that.

> I guess, with a recent version the planner can use a bitmap index scan
> to perform Query B faster.

That might be a good reason to upgrade :-)

Thanks for your answer.

--
Every non-free program has a lord, a master --
and if you use the program, he is your master.
Learn to master free software: www.ubuntulinux.com

Attachment

pgsql-performance by date:

Previous
From: "A. Kretschmer"
Date:
Subject: Re: mysterious difference in speed when combining two queries with OR
Next
From: Hans Ekbrand
Date:
Subject: Re: mysterious difference in speed when combining two queries with OR