Horribly slow hash join - Mailing list pgsql-performance

From Jim C. Nasby
Subject Horribly slow hash join
Date
Msg-id 20040416154502.GF87362@nasby.net
Whole thread Raw
Responses Re: Horribly slow hash join  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Horribly slow hash join  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-performance
Note the time for the hash join step:



------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=357.62..26677.99 rows=93668 width=62) (actual time=741.159..443381.011 rows=49091 loops=1)
   Hash Cond: ("outer".work_today = "inner".work_units)
   ->  Hash Join  (cost=337.11..24784.11 rows=93668 width=54) (actual time=731.374..417188.519 rows=49091 loops=1)
         Hash Cond: ("outer".work_total = "inner".work_units)
         ->  Seq Scan on email_rank  (cost=0.00..22240.04 rows=254056 width=46) (actual time=582.145..1627.759
rows=49091loops=1) 
               Filter: (project_id = 8)
         ->  Hash  (cost=292.49..292.49 rows=17849 width=16) (actual time=148.944..148.944 rows=0 loops=1)
               ->  Seq Scan on rank_tie_overall o  (cost=0.00..292.49 rows=17849 width=16) (actual time=0.059..75.984
rows=17849loops=1) 
   ->  Hash  (cost=17.81..17.81 rows=1081 width=16) (actual time=8.996..8.996 rows=0 loops=1)
         ->  Seq Scan on rank_tie_today d  (cost=0.00..17.81 rows=1081 width=16) (actual time=0.080..4.635 rows=1081
loops=1)
 Total runtime: 619047.032 ms

By comparison:
stats=# set enable_hashjoin=false;
SET
stats=# explain analyze select * from email_rank, rank_tie_overall o, rank_tie_today d  WHERE email_rank.work_today =
d.work_unitsAND email_rank.work_total = o.work_units AND email_rank.project_id = :ProjectID; 
                                                                             QUERY PLAN
                                             

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=55391.69..56823.23 rows=93668 width=80) (actual time=2705.344..3349.318 rows=49091 loops=1)
   Merge Cond: ("outer".work_units = "inner".work_today)
   ->  Index Scan using work_units_today on rank_tie_today d  (cost=0.00..23.89 rows=1081 width=16) (actual
time=0.150..4.874rows=1081 loops=1) 
   ->  Sort  (cost=55391.69..55625.86 rows=93668 width=64) (actual time=2705.153..2888.039 rows=49091 loops=1)
         Sort Key: email_rank.work_today
         ->  Merge Join  (cost=45047.64..47656.93 rows=93668 width=64) (actual time=1685.414..2494.342 rows=49091
loops=1)
               Merge Cond: ("outer".work_units = "inner".work_total)
               ->  Index Scan using work_units_overall on rank_tie_overall o  (cost=0.00..361.34 rows=17849 width=16)
(actualtime=0.122..79.383 rows=17849 loops=1) 
               ->  Sort  (cost=45047.64..45682.78 rows=254056 width=48) (actual time=1685.228..1866.215 rows=49091
loops=1)
                     Sort Key: email_rank.work_total
                     ->  Seq Scan on email_rank  (cost=0.00..22240.04 rows=254056 width=48) (actual
time=786.515..1289.101rows=49091 loops=1) 
                           Filter: (project_id = 8)
 Total runtime: 3548.087 ms

Even though the second case is only a select, it seems clear that
something's wrong...
--
Jim C. Nasby, Database Consultant                  jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

pgsql-performance by date:

Previous
From: Robert Treat
Date:
Subject: Re: query slows down with more accurate stats
Next
From: Mike Nolan
Date:
Subject: Re: Long running queries degrade performance