Re: Query optimization problem - Mailing list pgsql-performance

From Yeb Havinga
Subject Re: Query optimization problem
Date
Msg-id 4C3EBFBA.1040408@gmail.com
Whole thread Raw
In response to Query optimization problem  (Zotov <zotov@oe-it.ru>)
List pgsql-performance
Hello Zotov,

Somehow the equivalence d2.basedon=d1.id is not used in the slow query,
probably because the equivalence constant value would be used inside a
not-base expression (the OR). You can see that the equivalence values
*are* used by changing the or to an and and compare both queries. The
only thing you can do to guarantee the planner has all information to in
cases like this it explicitly name the equivalence inside OR
expressions, e.g.

SELECT d1.ID, d2.ID
 FROM DocPrimary d1
   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 WHERE (d1.ID=234409763 and d2.basedon=234409763) or (d2.ID=234409763) ;

regards,
Yeb Havinga

PS: the analyze time of the slow query showed 0.007ms?

Zotov wrote:
> I have a query:
>
>  SELECT d1.ID, d2.ID
>  FROM DocPrimary d1
>    JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>  WHERE (d1.ID=234409763) or (d2.ID=234409763)
>
> i think what QO(Query Optimizer) can make it faster (now it seq scan
> and on
> million records works 7 sec)
>
>  SELECT d1.ID, d2.ID
>  FROM DocPrimary d1
>    JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>  WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)
>
>
>  ----------------------
>  Slow Query
>  ----------------------
>  test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT
> d1.ID,
>  d2.ID
>  test-# FROM DocPrimary d1
>  test-#   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>  test-# WHERE (d1.ID=234409763) or (d2.ID=234409763);
>                                                          QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------

>
>    Hash Join  (cost=58.15..132.35 rows=2 width=8) (actual
> time=0.007..0.007
>  rows=0 loops=1)
>      Output: d1.id, d2.id
>      Hash Cond: (d2.basedon = d1.id)
>      Join Filter: ((d1.id = 234409763) OR (d2.id = 234409763))
>      ->   Seq Scan on public.docprimary d2  (cost=0.00..31.40 rows=2140
>  width=8) (actual time=0.002..0.002 rows=0 loops=1)
>            Output: d2.id, d2.basedon
>      ->   Hash  (cost=31.40..31.40 rows=2140 width=4) (never executed)
>            Output: d1.id
>            ->   Seq Scan on public.docprimary d1  (cost=0.00..31.40
> rows=2140
>  width=4) (never executed)
>                  Output: d1.id
>
>  ------------------
>  Fast Query
>  ------------------
>  test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT
> d1.ID,
>  d2.ID
>  test-# FROM DocPrimary d1
>  test-#   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>  test-# WHERE (d2.BasedOn=234409763) or (d2.ID=234409763);
>                                                              QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------

>
>    Nested Loop  (cost=8.60..58.67 rows=12 width=8) (actual
> time=0.026..0.026
>  rows=0 loops=1)
>      Output: d1.id, d2.id
>      ->   Bitmap Heap Scan on public.docprimary d2  (cost=8.60..19.31
> rows=12
>  width=8) (actual time=0.023..0.023 rows=0 loops=1)
>            Output: d2.id, d2.basedon
>            Recheck Cond: ((d2.basedon = 234409763) OR (d2.id =
> 234409763))
>            ->   BitmapOr  (cost=8.60..8.60 rows=12 width=0) (actual
>  time=0.018..0.018 rows=0 loops=1)
>                  ->   Bitmap Index Scan on basedon_idx  (cost=0.00..4.33
>  rows=11 width=0) (actual time=0.008..0.008 rows=0 loops=1)
>                        Index Cond: (d2.basedon = 234409763)
>                  ->   Bitmap Index Scan on id_pk  (cost=0.00..4.26 rows=1
>  width=0) (actual time=0.003..0.003 rows=0 loops=1)
>                        Index Cond: (d2.id = 234409763)
>      ->   Index Scan using id_pk on public.docprimary d1
> (cost=0.00..3.27
>  rows=1 width=4) (never executed)
>            Output: d1.id, d1.basedon
>            Index Cond: (d1.id = d2.basedon)
>
>
> --------------------------------------------
> PGver: PostgreSQL 9.0b x86
> OS: Win7 x64
>
> ---------------------
> Create table query:
> ---------------------
>
> CREATE TABLE docprimary
> (
>   id integer NOT NULL,
>   basedon integer,
>   CONSTRAINT id_pk PRIMARY KEY (id)
> );
> CREATE INDEX basedon_idx
>   ON docprimary
>   USING btree
>   (basedon);
>
>


pgsql-performance by date:

Previous
From: Ryan Wexler
Date:
Subject: Re: performance on new linux box
Next
From: Patrick Donlin
Date:
Subject: Identical query slower on 8.4 vs 8.3