Home > mailing lists

Re: small table left outer join big table - Mailing list pgsql-hackers

From	Gurjeet Singh
Subject	Re: small table left outer join big table
Date	December 28, 2010 11:10:19
Msg-id	AANLkTi=_==09etYitq0Aiysa2BPScWZcq4WSq=mwm-3t@mail.gmail.com Whole thread
In response to	small table left outer join big table (Jie Li <jay23jack@gmail.com>)
List	pgsql-hackers

Tree view

On Tue, Dec 28, 2010 at 5:13 AM, Jie Li <jay23jack@gmail.com> wrote:

Hi,

Please see the following plan:

postgres=# explain select * from small_table left outer join big_table using (id);
                                 QUERY PLAN
----------------------------------------------------------------------------
Hash Left Join (cost=126408.00..142436.98 rows=371 width=12)
   Hash Cond: (small_table.id = big_table.id)
   -> Seq Scan on small_table (cost=0.00..1.09 rows=9 width=8)
   -> Hash (cost=59142.00..59142.00 rows=4100000 width=8)
         -> Seq Scan on big_table (cost=0.00..59142.00 rows=4100000 width=8)
(5 rows)

Here I have a puzzle, why not choose the small table to build hash table? It can avoid multiple batches thus save significant I/O cost, isn't it?

We can perform this query in two phases:
1) inner join, using the small table to build hash table.
2) check whether each tuple in the hash table has matches before, which can be done with another flag bit

The only compromise is the output order, due to the two separate phases. Not sure whether the SQL standard requires it.

SQL standard does not require the result to be in any particular order unless an ORDER BY is used.

Regards,

--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

pgsql-hackers by date:

From: Andy Colson
Date: 28 December 2010, 11:10:16
Subject: page compression

From: Tom Lane
Date: 28 December 2010, 11:16:03
Subject: Re: Libpq PGRES_COPY_BOTH - version compatibility

Re: small table left outer join big table - Mailing list pgsql-hackers

Previous

Next