Hash Join is very slooow in some cases - Mailing list pgsql-hackers

From Hiroshi Inoue
Subject Hash Join is very slooow in some cases
Date
Msg-id 000201bf322b$7df2a620$2801007e@cadzone.tpf.co.jp
Whole thread Raw
Responses Re: [HACKERS] Hash Join is very slooow in some cases  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi all,

We are converting Oracle system to PostgreSQL.
But some queries takes very looong time.

I have tried variously and made a typical example
in PostgreSQL 6.5.2 .
It's already known ?

select count(*) from a,b where a.id1=b.id1;

returns immeidaitely and EXPLAIN shows

Aggregate  (cost=3380.24 rows=24929 width=8) ->  Hash Join  (cost=3380.24 rows=24929 width=8)       ->  Seq Scan on a
(cost=1551.41rows=27558 width=4)       ->  Hash  (cost=604.21 rows=8885 width=4)             ->  Seq Scan on b
(cost=604.21rows=8885 width=4)
 

But

select count(*) from a,b where a.id1=b.id1 and a.id2=b.id2;

takes very looong time.
EXPLAIN shows

Aggregate  (cost=3382.24 rows=8195 width=12) ->  Hash Join  (cost=3382.24 rows=8195 width=12)       ->  Seq Scan on a
(cost=1551.41rows=27558 width=6)       ->  Hash  (cost=604.21 rows=8885 width=6)             ->  Seq Scan on b
(cost=604.21rows=8885 width=6)
 

Query plans are almost same.
Why is there a such difference ?

I examined an output by EXPLAIN VERBOSE and found that
the 1st query uses id1 as its hashkey and 2nd query uses id2
as its hashkey.

So I tried the following query.

select count(*) from a,b where a.id2=b.id2 and a.id1=b.id1;

This returns immediately as I expected.

id1-s are type int4 and their disbursions are 0.00010181/ 
0.000203409.
id2-s are type int2 and their disbursions are 0.523526/
0.328712 .
Is id2 suitable for a hashkey ?

I can't try it in current now,sorry.
But seems the following code in createplan.c is unchanged.
/* Now the righthand op of the sole hashclause is the inner hash key. */       innerhashkey =
get_rightop(lfirst(hashclauses)); 
 

Comments ?

Hiroshi Inoue
Inoue@tpf.co.jp


pgsql-hackers by date:

Previous
From: wieck@debis.com (Jan Wieck)
Date:
Subject: Re: [HACKERS] psql & regress tests
Next
From: Bruce Momjian
Date:
Subject: Getting OID in psql of recent insert