Re: Join optimisation Quandry - Mailing list pgsql-performance

From Ceri Storey
Subject Re: Join optimisation Quandry
Date
Msg-id 20040117115826.GD43530@mrtall.compsoc.man.ac.uk
Whole thread Raw
In response to Join optimisation Quandry  (Ceri Storey <cez@necrofish.org.uk>)
Responses Re: Join optimisation Quandry
List pgsql-performance
On Sat, Jan 17, 2004 at 01:03:34AM +0000, Ceri Storey wrote:
> Okay, from top to bottom:
>
>   SELECT p1.chan_name, p1.prog_start AS now_start, p1.prog_id, p1.title_text,
>     p2.prog_start AS next_start, p2.prog_id, p2.title_text,
>     p1.title_wanted, p2.title_wanted, p1.chan_id
>   FROM (programme natural join channel NATURAL JOIN title) AS p1
>     LEFT OUTER JOIN (programme NATURAL JOIN title) AS p2
>     ON p1.prog_next = p2.prog_id
>   WHERE p1.prog_start <= timestamp 'now' AND p1.prog_stop > timestamp 'now'
>   ORDER BY p1.chan_id ASC


Although, as I've just found, another bottleneck is the title table.
PostgreSQL seems to inst on doing a Seq Scan on the entire table.

Compare:
tv=> explain analyse SELECT * FROM tid LEFT OUTER  JOIN title ON  t1 = title_id OR t2 = title_id;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=190.83..267285.83 rows=2000 width=35) (actual time=222.776..2430.073 rows=33 loops=1)
   Join Filter: (("outer".t1 = "inner".title_id) OR ("outer".t2 = "inner".title_id))
   ->  Seq Scan on tid  (cost=0.00..20.00 rows=1000 width=8) (actual time=0.028..10.457 rows=17 loops=1)
   ->  Materialize  (cost=190.83..297.66 rows=10683 width=27) (actual time=0.197..57.918 rows=10767 loops=17)
         ->  Seq Scan on title  (cost=0.00..190.83 rows=10683 width=27) (actual time=0.045..64.988 rows=10767 loops=1)
 Total runtime: 2435.059 ms
(6 rows)

With:
tv=> explain analyse select * from title where title_id IN (SELECT t1 FROM tid UNION SELECT t2 FROM tid);
                                                                      QUERY PLAN
                               

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=205.16..451.40 rows=200 width=27) (actual time=3.065..82.689 rows=33 loops=1)
   Hash Cond: ("outer".title_id = "inner".t1)
   ->  Seq Scan on title  (cost=0.00..190.83 rows=10683 width=27) (actual time=0.010..36.325 rows=10767 loops=1)
   ->  Hash  (cost=204.66..204.66 rows=200 width=4) (actual time=1.464..1.464 rows=0 loops=1)
         ->  HashAggregate  (cost=204.66..204.66 rows=200 width=4) (actual time=1.234..1.355 rows=33 loops=1)
               ->  Subquery Scan "IN_subquery"  (cost=169.66..199.66 rows=2000 width=4) (actual time=0.735..1.104
rows=33loops=1) 
                     ->  Unique  (cost=169.66..179.66 rows=2000 width=4) (actual time=0.728..0.934 rows=33 loops=1)
                           ->  Sort  (cost=169.66..174.66 rows=2000 width=4) (actual time=0.722..0.779 rows=34 loops=1)
                                 Sort Key: t1
                                 ->  Append  (cost=0.00..60.00 rows=2000 width=4) (actual time=0.054..0.534 rows=34
loops=1)
                                       ->  Subquery Scan "*SELECT* 1"  (cost=0.00..30.00 rows=1000 width=4) (actual
time=0.050..0.228rows=17 loops=1) 
                                             ->  Seq Scan on tid  (cost=0.00..20.00 rows=1000 width=4) (actual
time=0.041..0.126rows=17 loops=1) 
                                       ->  Subquery Scan "*SELECT* 2"  (cost=0.00..30.00 rows=1000 width=4) (actual
time=0.014..0.183rows=17 loops=1) 
                                             ->  Seq Scan on tid  (cost=0.00..20.00 rows=1000 width=4) (actual
time=0.008..0.087rows=17 loops=1) 
 Total runtime: 83.214 ms
(15 rows)

--
Ceri Storey <cez@necrofish.org.uk>

pgsql-performance by date:

Previous
From: John Siracusa
Date:
Subject: Re: Idle postmaster taking up a lot of CPU
Next
From: Ceri Storey
Date:
Subject: Re: Join optimisation Quandry