Thread: Slow left outer join

Slow left outer join

From
Alexander Staubo
Date:
I have thee primary tables: comments, events and users. There's a
join table events_comments tying comments to events. There are B-tree
indexes on every column involved, and the tables are freshly vacuumed
and analyzed;

A simple select to retrieve all comments for a given event, with an
outer join to retrieve creators:

select comments.*, users.*
from comments
left outer join users on
   users.id = comments.creator_id
inner join events_comments on
   comments.id = events_comments.comment_id
   and events_comments.event_id = 9244

...uses the following execution plan:


   QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Hash Join  (cost=1138.11..29763.30 rows=9 width=805) (actual
time=1002.348..1493.016 rows=3 loops=1)
    Hash Cond: ("outer".id = "inner".comment_id)
    ->  Hash Left Join  (cost=1119.99..28858.50 rows=177318
width=805) (actual time=28.919..1440.155 rows=177448 loops=1)
          Hash Cond: ("outer".creator_id = "inner".id)
          ->  Seq Scan on comments  (cost=0.00..9230.18 rows=177318
width=325) (actual time=0.005..117.746 rows=177448 loops=1)
          ->  Hash  (cost=531.19..531.19 rows=9119 width=480) (actual
time=28.883..28.883 rows=9119 loops=1)
                ->  Seq Scan on users  (cost=0.00..531.19 rows=9119
width=480) (actual time=0.003..6.555 rows=9119 loops=1)
    ->  Hash  (cost=18.10..18.10 rows=9 width=4) (actual
time=0.050..0.050 rows=3 loops=1)
          ->  Index Scan using events_comments_event_id_index on
events_comments  (cost=0.00..18.10 rows=9 width=4) (actual
time=0.028..0.038 rows=3 loops=1)
                Index Cond: (event_id = 9244)
Total runtime: 1493.565 ms

In this case there are three comments. Every comment has a creator
user here, so an inner join will produce the same results, but the
execution plan is considerably more efficient:


   QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Nested Loop  (cost=0.00..104.52 rows=8 width=805) (actual
time=0.059..0.117 rows=3 loops=1)
    ->  Nested Loop  (cost=0.00..50.75 rows=9 width=325) (actual
time=0.045..0.078 rows=3 loops=1)
          ->  Index Scan using events_comments_event_id_index on
events_comments  (cost=0.00..18.10 rows=9 width=4) (actual
time=0.026..0.033 rows=3 loops=1)
                Index Cond: (event_id = 9244)
          ->  Index Scan using comments_pkey on comments
(cost=0.00..3.62 rows=1 width=325) (actual time=0.011..0.012 rows=1
loops=3)
                Index Cond: (comments.id = "outer".comment_id)
    ->  Index Scan using users_pkey on users  (cost=0.00..5.96 rows=1
width=480) (actual time=0.009..0.010 rows=1 loops=3)
          Index Cond: (users.id = "outer".creator_id)
Total runtime: 0.354 ms

The outer join seems unnecessarily slow. Is there anything I can do
to speed it up?

PostgreSQL 8.1.3 on OS X (MacPorts).

Alexander.

Re: Slow left outer join

From
Richard Broersma Jr
Date:
> The outer join seems unnecessarily slow. Is there anything I can do
> to speed it up?
>
> PostgreSQL 8.1.3 on OS X (MacPorts).

This is just a guess on my part, but would it help if you preformed the inner join first to filter
out the most of the unwated rows that then preform the outer join with what is left over?

But just double check that results are as you expect.

Regards,

Richard Broersma Jr.

Re: Slow left outer join

From
Tom Lane
Date:
Alexander Staubo <alex@purefiction.net> writes:
> select comments.*, users.*
> from comments
> left outer join users on
>    users.id = comments.creator_id
> inner join events_comments on
>    comments.id = events_comments.comment_id
>    and events_comments.event_id = 9244

> The outer join seems unnecessarily slow.

Try flipping the order of the two joins.  PG 8.2 will be smart enough to
do that for itself, but no existing release understands when it's safe
to rearrange outer-join order.

            regards, tom lane

Re: Slow left outer join

From
Alexander Staubo
Date:
On Nov 27, 2006, at 02:45 , Tom Lane wrote:

> Alexander Staubo <alex@purefiction.net> writes:
>> select comments.*, users.*
>> from comments
>> left outer join users on
>>    users.id = comments.creator_id
>> inner join events_comments on
>>    comments.id = events_comments.comment_id
>>    and events_comments.event_id = 9244
>
>> The outer join seems unnecessarily slow.
>
> Try flipping the order of the two joins.  PG 8.2 will be smart
> enough to
> do that for itself, but no existing release understands when it's safe
> to rearrange outer-join order.

That fixes it. Thanks.

Alexander.