Re: "SELECT .. WHERE NOT IN" query running for hours - Mailing list pgsql-performance

From Mladen Gogala
Subject Re: "SELECT .. WHERE NOT IN" query running for hours
Date
Msg-id 4D2695C9.10204@vmsinfo.com
Whole thread Raw
In response to "SELECT .. WHERE NOT IN" query running for hours  (Γιωργος Βαλκανας <lebiathan@gmail.com>)
Responses Re: "SELECT .. WHERE NOT IN" query running for hours
List pgsql-performance
On 1/6/2011 9:36 PM, Γιωργος Βαλκανας wrote:
>
> 1) Why is it taking *so* long for the first query (with the "NOT IN" )
> to do even the simple select?
Because NOT IN has to execute the correlated subquery for every row and
then check whether the requested value is in the result set, usually by
doing sequential comparison. The NOT EXIST plan is also bad because
there is no index but at least it can use very fast and efficient hash
algorithm. Indexing the "hwdocid" column on the "Document" table or,
ideally, making it a primary key, should provide an additional boost to
your query. If you already do have an index, you may consider using
enable_seqscan=false for this session, so that the "hwdocid" index will
be used. It's a common wisdom that in the most cases NOT EXISTS will
beat NOT IN. That is so all over the database world. I've seen that in
Oracle applications, MS SQL applications and, of course MySQL
applications. Optimizing queries is far from trivial.

Μλαδεν Γογαλα

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


pgsql-performance by date:

Previous
From: Γιωργος Βαλκανας
Date:
Subject: "SELECT .. WHERE NOT IN" query running for hours
Next
From: Greg Smith
Date:
Subject: Re: Wrong docs on wal_buffers?