Thread: mysterious difference in speed when combining two queries with OR

mysterious difference in speed when combining two queries with OR

From
Hans Ekbrand
Date:
I cannot understand why the following two queries differ so much in execution time (almost ten times)

Query A (two queries)

select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and timecard.mid = moment.mid) order by
moment.mid;
select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by moment.mid;

Query B (combining the two with OR)

select distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and timecard.mid = moment.mid) or (pbar
=0)) order by moment.mid; 

$ time psql -o /dev/null -f query-a.sql fektest

real    0m2.016s
user    0m1.532s
sys     0m0.140s

$ time psql -o /dev/null -f query-b.sql fektest

real    0m28.534s
user    0m1.516s
sys     0m0.156s

I have tested this in two different computers with different amount of
RAM, fast or slow CPU, and the difference is persistent, almost ten
times.

I should say that this is on postgresql 7.4.16 (debian stable).

Can query B be rewritten so that it would execute faster?

TIA

--
Hans Ekbrand (http://sociologi.cjb.net) <hans@sociologi.cjb.net>
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E

Attachment

Re: mysterious difference in speed when combining two queries with OR

From
"A. Kretschmer"
Date:
am  Wed, dem 23.04.2008, um  9:23:07 +0200 mailte Hans Ekbrand folgendes:
> I cannot understand why the following two queries differ so much in execution time (almost ten times)

wild guess: different execution plans.


Can you show us the plans? (EXPLAIN ANALYSE SELECT ...)

>
> Query A (two queries)
>
> select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and timecard.mid = moment.mid) order by
moment.mid;
> select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by moment.mid;
>
> Query B (combining the two with OR)
>
> select distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and timecard.mid = moment.mid) or
(pbar= 0)) order by moment.mid; 
>
> [ snip ]
>
> I should say that this is on postgresql 7.4.16 (debian stable).

Uhh. Why not a recent version? We have 8.3.0...


>
> Can query B be rewritten so that it would execute faster?

Quick and dirty: use both selects (query A) combined with UNION.

I guess, with a recent version the planner can use a bitmap index scan
to perform Query B faster.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: mysterious difference in speed when combining two queries with OR

From
"A. Kretschmer"
Date:
am  Wed, dem 23.04.2008, um  9:58:10 +0200 mailte A. Kretschmer folgendes:
> > Query A (two queries)
> >
> > select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and timecard.mid = moment.mid) order
bymoment.mid; 
> > select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by moment.mid;
> >
> > Query B (combining the two with OR)
> >
> > select distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and timecard.mid = moment.mid) or
(pbar= 0)) order by moment.mid; 

Thanks to depesz on #postgresql (irc-channel):

Query A, the second query: there are no join between the 2 tables.
Mistake?


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

On Wed, Apr 23, 2008 at 09:58:10AM +0200, A. Kretschmer wrote:
> am  Wed, dem 23.04.2008, um  9:23:07 +0200 mailte Hans Ekbrand folgendes:
> > I cannot understand why the following two queries differ so much in execution time (almost ten times)
>
> wild guess: different execution plans.
>
>
> Can you show us the plans? (EXPLAIN ANALYSE SELECT ...)

Query A (first part)

fektest=> explain analyse select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and
timecard.mid= moment.mid) order by moment.mid; 
                                                         QUERY PLAN
     

-----------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=287.66..288.67 rows=203 width=4) (actual time=11.412..11.622 rows=41 loops=1)
   ->  Sort  (cost=287.66..288.16 rows=203 width=4) (actual time=11.409..11.484 rows=57 loops=1)
         Sort Key: moment.mid
         ->  Hash Join  (cost=60.98..279.88 rows=203 width=4) (actual time=2.346..11.182 rows=57 loops=1)
               Hash Cond: ("outer".mid = "inner".mid)
               ->  Seq Scan on timecard  (cost=0.00..211.78 rows=1017 width=4) (actual time=0.031..7.427 rows=995
loops=1)
                     Filter: (pid = 17)
               ->  Hash  (cost=59.88..59.88 rows=444 width=4) (actual time=2.127..2.127 rows=0 loops=1)
                     ->  Seq Scan on moment  (cost=0.00..59.88 rows=444 width=4) (actual time=0.027..1.825 rows=199
loops=1)
                           Filter: (parent = 45)
 Total runtime: 11.852 ms
(11 rows)

Query A (second part)

fektest=> explain analyse select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by
moment.mid;
                                                              QUERY PLAN
              

--------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=192.62..3800.67 rows=20 width=4) (actual time=0.578..109.274 rows=2 loops=1)
   ->  Nested Loop  (cost=192.62..3417.57 rows=153240 width=4) (actual time=0.575..89.546 rows=15324 loops=1)
         ->  Index Scan using moment_mid_idx on moment  (cost=0.00..160.15 rows=20 width=4) (actual time=0.544..3.490
rows=2loops=1) 
               Filter: ((parent = 45) AND (pbar = 0))
         ->  Materialize  (cost=192.62..269.24 rows=7662 width=0) (actual time=0.009..21.998 rows=7662 loops=2)
               ->  Seq Scan on timecard  (cost=0.00..192.62 rows=7662 width=0) (actual time=0.007..14.554 rows=7662
loops=1)
 Total runtime: 109.870 ms
(7 rows)

Query B

fektest=> EXPLAIN ANALYSE SELECT distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and
timecard.mid= moment.mid) or (pbar = 0)) order by moment.mid; 
                                                               QUERY PLAN
                 

-----------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=192.62..102469.31 rows=444 width=4) (actual time=143.444..4838.067 rows=42 loops=1)
   ->  Nested Loop  (cost=192.62..102405.04 rows=25710 width=4) (actual time=143.439..4818.215 rows=15379 loops=1)
         Join Filter: ((("inner".pid = 17) OR ("outer".pbar = 0)) AND (("inner".mid = "outer".mid) OR ("outer".pbar =
0)))
         ->  Index Scan using moment_mid_idx on moment  (cost=0.00..154.58 rows=444 width=8) (actual time=0.390..5.954
rows=199loops=1) 
               Filter: (parent = 45)
         ->  Materialize  (cost=192.62..269.24 rows=7662 width=8) (actual time=0.001..9.728 rows=7662 loops=199)
               ->  Seq Scan on timecard  (cost=0.00..192.62 rows=7662 width=8) (actual time=0.007..17.007 rows=7662
loops=1)
 Total runtime: 4838.786 ms
(8 rows)

> > I should say that this is on postgresql 7.4.16 (debian stable).
>
> Uhh. Why not a recent version? We have 8.3.0...

No particularly good reason, just that I have taken over a production
system and I didn't want to mess up with before I am confident with
it. But I on a test-site I have migrated to 8.1 without problems, so
migration will happen, we just haven't a reason for doing it yet,
since 7.4 has served us well.

> > Can query B be rewritten so that it would execute faster?
>
> Quick and dirty: use both selects (query A) combined with UNION.

I will look into that.

> I guess, with a recent version the planner can use a bitmap index scan
> to perform Query B faster.

That might be a good reason to upgrade :-)

Thanks for your answer.

--
Every non-free program has a lord, a master --
and if you use the program, he is your master.
Learn to master free software: www.ubuntulinux.com

Attachment

Re: mysterious difference in speed when combining two queries with OR

From
Hans Ekbrand
Date:
On Wed, Apr 23, 2008 at 10:57:04AM +0200, A. Kretschmer wrote:
> am  Wed, dem 23.04.2008, um  9:58:10 +0200 mailte A. Kretschmer folgendes:
> > > Query A (two queries)
> > >
> > > select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and timecard.mid = moment.mid)
orderby moment.mid; 
> > > select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by moment.mid;
> > >
> > > Query B (combining the two with OR)
> > >
> > > select distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and timecard.mid = moment.mid) or
(pbar= 0)) order by moment.mid; 
>
> Thanks to depesz on #postgresql (irc-channel):
>
> Query A, the second query: there are no join between the 2 tables.
> Mistake?

No, I just wanted to show the time differences, I haven't used join
before. Now that you have adviced me to, I have tried your suggestion
to rewrite B as a union and it works good! Just as fast as the A Query!

                                                                          QUERY PLAN
                                      

--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=4100.27..4101.39 rows=223 width=4) (actual time=120.963..121.124 rows=42 loops=1)
   ->  Sort  (cost=4100.27..4100.83 rows=223 width=4) (actual time=120.959..121.008 rows=43 loops=1)
         Sort Key: mid
         ->  Append  (cost=287.66..4091.57 rows=223 width=4) (actual time=11.274..120.795 rows=43 loops=1)
               ->  Subquery Scan "*SELECT* 1"  (cost=287.66..290.70 rows=203 width=4) (actual time=11.270..11.604
rows=41loops=1) 
                     ->  Unique  (cost=287.66..288.67 rows=203 width=4) (actual time=11.264..11.469 rows=41 loops=1)
                           ->  Sort  (cost=287.66..288.16 rows=203 width=4) (actual time=11.260..11.331 rows=57
loops=1)
                                 Sort Key: moment.mid
                                 ->  Hash Join  (cost=60.98..279.88 rows=203 width=4) (actual time=2.563..11.136
rows=57loops=1) 
                                       Hash Cond: ("outer".mid = "inner".mid)
                                       ->  Seq Scan on timecard  (cost=0.00..211.78 rows=1017 width=4) (actual
time=0.032..7.156rows=995 loops=1) 
                                             Filter: (pid = 17)
                                       ->  Hash  (cost=59.88..59.88 rows=444 width=4) (actual time=2.329..2.329 rows=0
loops=1)
                                             ->  Seq Scan on moment  (cost=0.00..59.88 rows=444 width=4) (actual
time=0.035..1.980rows=199 loops=1) 
                                                   Filter: (parent = 45)
               ->  Subquery Scan "*SELECT* 2"  (cost=192.62..3800.87 rows=20 width=4) (actual time=0.583..109.073
rows=2loops=1) 
                     ->  Unique  (cost=192.62..3800.67 rows=20 width=4) (actual time=0.578..109.061 rows=2 loops=1)
                           ->  Nested Loop  (cost=192.62..3417.57 rows=153240 width=4) (actual time=0.576..89.437
rows=15324loops=1) 
                                 ->  Index Scan using moment_mid_idx on moment  (cost=0.00..160.15 rows=20 width=4)
(actualtime=0.544..3.527 rows=2 loops=1) 
                                       Filter: ((parent = 45) AND (pbar = 0))
                                 ->  Materialize  (cost=192.62..269.24 rows=7662 width=0) (actual time=0.014..21.930
rows=7662loops=2) 
                                       ->  Seq Scan on timecard  (cost=0.00..192.62 rows=7662 width=0) (actual
time=0.005..14.560rows=7662 loops=1) 
 Total runtime: 122.076 ms
(23 rows)

--
Hans Ekbrand (http://sociologi.cjb.net) <hans@sociologi.cjb.net>
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?

Attachment

Re: mysterious difference in speed when combining two queries with OR

From
Theo Kramer
Date:
On 23 Apr 2008, at 9:23AM, Hans Ekbrand wrote:

> I cannot understand why the following two queries differ so much in
> execution time (almost ten times)
>
> Query A (two queries)
>
> select distinct moment.mid from moment,timecard where parent = 45
> and (pid=17 and timecard.mid = moment.mid) order by moment.mid;
> select distinct moment.mid from moment,timecard where parent = 45
> and (pbar = 0) order by moment.mid;
>
> Query B (combining the two with OR)
>
> select distinct moment.mid from moment,timecard where parent = 45
> and ((pid=17 and timecard.mid = moment.mid) or (pbar = 0)) order by
> moment.mid;
>
> $ time psql -o /dev/null -f query-a.sql fektest
>
> real    0m2.016s
> user    0m1.532s
> sys     0m0.140s
>
> $ time psql -o /dev/null -f query-b.sql fektest
>
> real    0m28.534s
> user    0m1.516s
> sys     0m0.156s
>
> I have tested this in two different computers with different amount of
> RAM, fast or slow CPU, and the difference is persistent, almost ten
> times.
>
> I should say that this is on postgresql 7.4.16 (debian stable).
>
> Can query B be rewritten so that it would execute faster?

Try
select distinct moment.mid from moment,timecard where parent = 45 and
(pid=17 and timecard.mid = moment.mid) order by moment.mid
union all
select distinct moment.mid from moment,timecard where parent = 45 and
(pbar = 0) order by moment.mid;
--
Regards
Theo


> I should say that this is on postgresql 7.4.16 (debian stable).

    Whoa.

> I cannot understand why the following two queries differ so much in
> execution time (almost ten times)

    Post EXPLAIN ANALYZE for both, and also post table definitions (with
indexes), use \d table. This will allow people to help you.

> $ time psql -o /dev/null -f query-a.sql fektest
>
> real    0m2.016s
> user    0m1.532s
> sys     0m0.140s

    You are measuring the time it takes the server to perform the query, plus
this :
    - time for the client (psql) to launch itself,
    - to read the configuration file,
    - to connect to the server, send the query
    - to transfer the results back to the client (is this on network or local
? what is the amount of data transferred ?)
    - to process the results, format them as text, display them,
    - to close the connection,
    - to exit cleanly

    As you can see from the above numbers,
    - 2.016 seconds elapsed on your wall clock, of which :
    - 76% was used as CPU time in the client (therefore of absolutely no
relevance to postgresql server performance)
    - and the rest (24%) distributed in unknown proportion between server CPU
spent to process your query, network roundtrips, data transfer, server
iowait, etcetera.

    In order to properly benchmark your query, you should :

    1- Ensure the server is not loaded and processing any other query (unless
you explicitly intend to test behaviour under load)
    If you don't do that, your timings will be random, depending on how much
load you have, if someone holds a lock you have to wait on, etc.

    2- ssh to your server and use a psql session local to the server, to
avoid network roundtrips.

    3- enable statement timing with \t

    2- EXPLAIN your query.

    Check the plan.
    Check the time it took to EXPLAIN, this will tell you how much time it
takes to parse and plan your query.

    2- EXPLAIN ANALYZE your query.

    Do it several times, note the different timings and understand the query
plans.
    If the data was not cached, the first timing will be much longer than the
subsequent other timings. This will give you useful information about the
behaviour of this query : if lasts for 1 second (cached) and 5 minutes
(not cached), you might not want to execute it at the same time as that
huge scheduled backup job. Those timings will also provide hints on wether
you should CLUSTER the table, etc.

    3- EXPLAIN SELECT count(*) FROM (your query) AS foo
    Check that the plan is the same.

    4- SELECT count(*) FROM (your query) AS foo
    The count(*) means very little data is exchanged between client and
server, so this doesn't mess with the timing.

    Now, compare :

    The timings displayed by psql (\t) include query planning, roundtrip to
server, and result processing (hence the count(*) to reduce this overhead).
    The timings displayed by EXPLAIN ANALYZE include only query execution
time, but EXPLAIN ANALYZE is slower than just executing the query, because
it takes time to instrument the query and measure its performance. For
instance, on a very simple query that computes an aggregate on lots of
rows, more time will be spent measuring than actually executing the query.
Hence steps 3 and 4 above.

    Knowing this, you deduce the time it takes to parse & plan your query
(should you then use PREPAREd statements ? up to you) and the time it
takes to execute it.

    5- EXPLAIN ANALYZE, while changing the parameters (trying some very
selective or less selective ones) to check for plan change, mess with
enable_**** parameters to check for different plans, rewrite the query
differently (DISTINCT/GROUP BY, OR/UNION, JOIN or IN(subquery), etc).



























Re: mysterious difference in speed when combining two queries with OR

From
Gregory Stark
Date:
"Hans Ekbrand" <hans.ekbrand@sociology.gu.se> writes:

> No, I just wanted to show the time differences, I haven't used join
> before. Now that you have adviced me to, I have tried your suggestion
> to rewrite B as a union and it works good! Just as fast as the A Query!

You can even do better. If you know the two sets of mid are disjoint you can
use UNION ALL. If not you could remove the two DISTINCTs as the UNION will
take care of removing duplicates.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning