small table left outer join big table - Mailing list pgsql-hackers

From Jie Li
Subject small table left outer join big table
Date
Msg-id AANLkTikNKpOfM=OPzfMgi9_q2tyUVSSPA0vBYaJ2_mk4@mail.gmail.com
Whole thread Raw
Responses Re: small table left outer join big table
Re: small table left outer join big table
List pgsql-hackers
Hi,<br /><br />Please see the following plan:<br /><br />postgres=# explain select * from small_table left outer join
big_tableusing (id);<br />                                 QUERY PLAN                                 <br
/>----------------------------------------------------------------------------<br/>  Hash Left Join 
(cost=126408.00..142436.98rows=371 width=12)<br />   Hash Cond: (<a href="http://small_table.id">small_table.id</a> =
<ahref="http://big_table.id">big_table.id</a>)<br />   ->  Seq Scan on small_table  (cost=0.00..1.09 rows=9
width=8)<br/>    ->  Hash  (cost=59142.00..59142.00 rows=4100000 width=8)<br />         ->  Seq Scan on
big_table (cost=0.00..59142.00 rows=4100000 width=8)<br />(5 rows)<br /><br />Here I have a puzzle, why not choose the
smalltable to build hash table? It can avoid multiple batches thus save significant I/O cost, isn't it? <br /><br />We
canperform this query in two phases: <br />1) inner join, using the small table to build hash table.<br />2) check
whethereach tuple in the hash table has matches before, which can be done with another flag bit<br /><br /> The only
compromiseis the output order, due to the two separate phases. Not sure whether the SQL standard requires it.<br /><br
/>Thanks,<br/>Li Jie<br /><br /><br /> 

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: "writable CTEs"
Next
From: Peter Eisentraut
Date:
Subject: Re: "writable CTEs"