Re: EXISTS vs IN vs OUTER JOINS - Mailing list pgsql-performance

From Joe Conway
Subject Re: EXISTS vs IN vs OUTER JOINS
Date
Msg-id 3E02053C.7070305@joeconway.com
Whole thread Raw
In response to Re: EXISTS vs IN vs OUTER JOINS  ("Josh Berkus" <josh@agliodbs.com>)
Responses Re: EXISTS vs IN vs OUTER JOINS  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Josh Berkus wrote:
> where I have rarely seen a difference of more than 3:1.  As I
> understand it, this is because NOT EXISTS can use optimized join
> algorithms to locate matching rows, whereas NOT IN must compare each
> row against every possible matching value in the subselect.
>
> It also makes a difference whether or not the referenced field(s) in
> the subselect is indexed.   EXISTS will often use an index to compare
> the values in the master query to the sub-query.  As far as I know, IN
> can use an index to retrieve the subquery values, but not to sort or
> compare them after they have been retreived into memory.

I wonder if "[NOT] IN (subselect)" could be improved with a hash table in
similar fashion to the hash aggregate solution Tom recently implemented?

Joe



pgsql-performance by date:

Previous
From: jasiek@klaster.net
Date:
Subject: Re: EXISTS vs IN vs OUTER JOINS
Next
From: george young
Date:
Subject: 4G row table?