Slow left outer join - Mailing list pgsql-general

From Alexander Staubo
Subject Slow left outer join
Date
Msg-id C424647A-6C94-4DA9-B214-AAA0F1E4DF8A@purefiction.net
Whole thread Raw
Responses Re: Slow left outer join  (Richard Broersma Jr <rabroersma@yahoo.com>)
Re: Slow left outer join  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I have thee primary tables: comments, events and users. There's a
join table events_comments tying comments to events. There are B-tree
indexes on every column involved, and the tables are freshly vacuumed
and analyzed;

A simple select to retrieve all comments for a given event, with an
outer join to retrieve creators:

select comments.*, users.*
from comments
left outer join users on
   users.id = comments.creator_id
inner join events_comments on
   comments.id = events_comments.comment_id
   and events_comments.event_id = 9244

...uses the following execution plan:


   QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Hash Join  (cost=1138.11..29763.30 rows=9 width=805) (actual
time=1002.348..1493.016 rows=3 loops=1)
    Hash Cond: ("outer".id = "inner".comment_id)
    ->  Hash Left Join  (cost=1119.99..28858.50 rows=177318
width=805) (actual time=28.919..1440.155 rows=177448 loops=1)
          Hash Cond: ("outer".creator_id = "inner".id)
          ->  Seq Scan on comments  (cost=0.00..9230.18 rows=177318
width=325) (actual time=0.005..117.746 rows=177448 loops=1)
          ->  Hash  (cost=531.19..531.19 rows=9119 width=480) (actual
time=28.883..28.883 rows=9119 loops=1)
                ->  Seq Scan on users  (cost=0.00..531.19 rows=9119
width=480) (actual time=0.003..6.555 rows=9119 loops=1)
    ->  Hash  (cost=18.10..18.10 rows=9 width=4) (actual
time=0.050..0.050 rows=3 loops=1)
          ->  Index Scan using events_comments_event_id_index on
events_comments  (cost=0.00..18.10 rows=9 width=4) (actual
time=0.028..0.038 rows=3 loops=1)
                Index Cond: (event_id = 9244)
Total runtime: 1493.565 ms

In this case there are three comments. Every comment has a creator
user here, so an inner join will produce the same results, but the
execution plan is considerably more efficient:


   QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Nested Loop  (cost=0.00..104.52 rows=8 width=805) (actual
time=0.059..0.117 rows=3 loops=1)
    ->  Nested Loop  (cost=0.00..50.75 rows=9 width=325) (actual
time=0.045..0.078 rows=3 loops=1)
          ->  Index Scan using events_comments_event_id_index on
events_comments  (cost=0.00..18.10 rows=9 width=4) (actual
time=0.026..0.033 rows=3 loops=1)
                Index Cond: (event_id = 9244)
          ->  Index Scan using comments_pkey on comments
(cost=0.00..3.62 rows=1 width=325) (actual time=0.011..0.012 rows=1
loops=3)
                Index Cond: (comments.id = "outer".comment_id)
    ->  Index Scan using users_pkey on users  (cost=0.00..5.96 rows=1
width=480) (actual time=0.009..0.010 rows=1 loops=3)
          Index Cond: (users.id = "outer".creator_id)
Total runtime: 0.354 ms

The outer join seems unnecessarily slow. Is there anything I can do
to speed it up?

PostgreSQL 8.1.3 on OS X (MacPorts).

Alexander.

pgsql-general by date:

Previous
From: Bob Pawley
Date:
Subject: Re: Upgrade
Next
From: Richard Broersma Jr
Date:
Subject: Re: Slow left outer join