Re: New style of hash join proposal - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: New style of hash join proposal
Date
Msg-id 87myowaecy.fsf@oxford.xeocode.com
Whole thread Raw
In response to Re: New style of hash join proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> Nested Loop  (cost=5.39..198.81 rows=51 width=244)
>>> ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
>>>     ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
>>> ->  Bitmap Heap Scan on tenk1 a  (cost=4.33..39.41 rows=10 width=244)
>>> Recheck Cond: (a.thousand = b.f1)
>>>     ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..4.33 rows=10 width=0)
>>> Index Cond: (a.thousand = b.f1)
>>> (7 rows)
>
>> Sure, but that's still re-executing the bitmap index scan 51 times -- possibly
>> having to fetch the same records off disk repeatedly.

sorry 5 times

> It's not fetching any record repeatedly, because the HashAggregate
> step eliminated duplicate keys on the other side.

Only because it's an IN query. If it was a normal join which hash joins are
perfectly capable of handling then it would be. As it is it could be fetching
the same page repeatedly but not precisely the same tuples. 

It happens that transforming this query to 

explain analyze select * from tenk1 a where thousand = any (array(select f1 from int4_tbl b));

makes it run about 40% faster. That could just be that the hash is small
enough that a linear search is more efficient than calling hashint4 though.
(Perhaps we should be protecting against that in dynahash, actually)

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!
 


pgsql-hackers by date:

Previous
From: KaiGai Kohei
Date:
Subject: Re: [0/4] Proposal of SE-PostgreSQL patches
Next
From: Tom Lane
Date:
Subject: Re: count(*) performance improvement ideas