Re: NOT IN doesn't use index? (fwd) - Mailing list pgsql-performance

From Rod Taylor
Subject Re: NOT IN doesn't use index? (fwd)
Date
Msg-id 1051996600.1656.36.camel@jester
Whole thread Raw
In response to Re: NOT IN doesn't use index? (fwd)  (Becky Neville <rebecca.neville@yale.edu>)
List pgsql-performance
On Sat, 2003-05-03 at 15:56, Becky Neville wrote:
> I think that list is actually (gulp) hard coded.  It's not my query.  I am
> trying to speed it up for someone else - to hopefully learn something in
> the process that isn't dependent on what version of postgres i'm
> running :)

An interesting test might be to see if the overhead of doing a character
based comparison (as opposed to integer based) is significant.  If it
is, previous tests show it can be significant for CPU bound queries,
convert all of those codes into integers and use a lookup table table to
do the conversion.

Another interesting thought, since you have a long running query would
be to attempt an inversion.  Create a temporary table with the *valid*
codes if count(valid codes) < 2 * count(invalid codes).  Run the query
replacing NOT IN with a join to the temporary table.  This will reduce
the number of comparisons required, as a match can move onto the next
datum, but a NOT IN must check all values. If this helps, try indexing
(and analyzing) the temporary table.

By far the fastest results can be achieved by not allowing invalid
billing codes to be inserted into the table via a constraint of somekind
(check or fkey to summary table).

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Attachment

pgsql-performance by date:

Previous
From: Ron Johnson
Date:
Subject: Re: Looking for a cheap upgrade (RAID)
Next
From: Josh Berkus
Date:
Subject: Suggestions wanted for 7.2.4 query