Re: [HACKERS] Hash Join is very slooow in some cases - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] Hash Join is very slooow in some cases
Date	November 18, 1999 22:52:20
Msg-id	789.942983480@sss.pgh.pa.us
In response to	Hash Join is very slooow in some cases ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
List	pgsql-hackers

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> select count(*) from a,b where a.id1=b.id1;
> returns immediately ...
> But
> select count(*) from a,b where a.id1=b.id1 and a.id2=b.id2;
> takes very looong time.
> I examined an output by EXPLAIN VERBOSE and found that
> the 1st query uses id1 as its hashkey and 2nd query uses id2
> as its hashkey.

Yes, and since id2 has terrible disbursion, most of the hashtable
entries end up in a small number of hash buckets, resulting in
an unexpectedly large number of comparisons done for each outer
tuple.  I've seen this effect before.
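To illustrate the effect, here is a minimal sketch that counts the key comparisons a chained hash table performs during the probe phase. It is not the executor's code: the function name probe_comparisons, the bucket count, and the sample data (1000 rows, with id2 taking only two distinct values) are all assumptions made up for the example.

```python
from collections import defaultdict

def probe_comparisons(inner_keys, outer_keys, nbuckets=64):
    """Count key comparisons made while probing a chained hash table.

    Hypothetical model: every outer tuple is compared against every
    inner entry stored in the bucket its key hashes to.
    """
    buckets = defaultdict(list)
    for key in inner_keys:                      # build phase
        buckets[hash(key) % nbuckets].append(key)

    comparisons = 0
    for key in outer_keys:                      # probe phase
        comparisons += len(buckets[hash(key) % nbuckets])
    return comparisons

n = 1000
id1 = list(range(n))               # assumed: every value distinct
id2 = [i % 2 for i in range(n)]    # assumed: only two distinct values

good = probe_comparisons(id1, id1)  # entries spread over all buckets
bad = probe_comparisons(id2, id2)   # entries pile up in two buckets

print(good, bad)
```

With the skewed key, each outer tuple walks a chain holding half of the inner relation, so the probe cost grows roughly quadratically with the table size instead of linearly.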

I have a TODO item to make the optimizer pay attention to disbursion
when estimating the cost of a hashjoin.  That would cause it to make
the right choice of key in this example.  Not done yet though :-(.
Feel free to jump in if you need it today...
        regards, tom lane
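
One way such a cost adjustment might look, as a rough sketch only: estimate the expected bucket-chain length for each candidate key from how skewed its values are, and hash on the key with the smallest estimate. The skew measure used here (sum of squared value frequencies) and the names skew and pick_hash_key are assumptions for illustration; they are not the optimizer's actual statistic or API, and the sample columns are invented.

```python
from collections import Counter

def skew(values):
    """Hypothetical skew measure: probability that two rows picked at
    random (with replacement) carry the same value.
    Close to 0 for nearly unique columns, 1.0 if all values are equal."""
    n = len(values)
    return sum((count / n) ** 2 for count in Counter(values).values())

def pick_hash_key(inner_columns):
    """Return the column name with the lowest estimated number of
    comparisons per outer tuple.

    Assumes the outer relation has the same value distribution as the
    inner one, so the expected chain length is rows * skew.
    """
    def estimated_comparisons(name):
        vals = inner_columns[name]
        return len(vals) * skew(vals)
    return min(inner_columns, key=estimated_comparisons)

# Assumed sample data: id1 is unique, id2 has only two distinct values.
columns = {
    "id1": list(range(1000)),
    "id2": [i % 2 for i in range(1000)],
}
choice = pick_hash_key(columns)
print(choice)
```

A real implementation would have to work from stored column statistics rather than scanning the data, and would fold this estimate into the rest of the join cost.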