Thread: Postgres inconsistent use of Index vs. Seq Scan

Postgres inconsistent use of Index vs. Seq Scan

From
"Jim Carroll"
Date:

I'm having difficulty understanding what I perceive as an inconsistency in how the postgres parser chooses to use indices. We have a query based on NOT IN against an indexed column that the parser  executes sequentially, but when we perform the same query as IN, it uses the index.

 

I've created a simplistic example that I believe demonstrates the issue, notice this first query is sequential

 

CREATE TABLE node

(

  id SERIAL PRIMARY KEY,

  vid INTEGER

);

CREATE INDEX x ON node(vid);

 

INSERT INTO node(vid) VALUES (1),(2);

 

EXPLAIN ANALYZE

SELECT *

FROM node

WHERE NOT vid IN (1);

 

Seq Scan on node  (cost=0.00..36.75 rows=2129 width=8) (actual time=0.009..0.010 rows=1 loops=1)

  Filter: (vid <> 1)

  Rows Removed by Filter: 1

Total runtime: 0.025 ms

 

But if we invert the query to IN, you'll notice that it now decided to use the index

 

EXPLAIN ANALYZE

SELECT *

FROM node

WHERE vid IN (2);

 

Bitmap Heap Scan on node  (cost=4.34..15.01 rows=11 width=8) (actual time=0.017..0.017 rows=1 loops=1)

  Recheck Cond: (vid = 1)

  ->  Bitmap Index Scan on x  (cost=0.00..4.33 rows=11 width=0) (actual time=0.012..0.012 rows=1 loops=1)

        Index Cond: (vid = 1)

Total runtime: 0.039 ms

 

Can anyone shed any light on this? Specifically, is there a way to re-write out NOT IN to work with the index (when obviously the result set is not as simplistic as just 1 or 2).

 

We are using Postgres 9.2 on CentOS 6.6

 

Re: Postgres inconsistent use of Index vs. Seq Scan

From
Tom Lane
Date:
"Jim Carroll" <jim@carroll.com> writes:
> I'm having difficulty understanding what I perceive as an inconsistency in
> how the postgres parser chooses to use indices. We have a query based on NOT
> IN against an indexed column that the parser  executes sequentially, but
> when we perform the same query as IN, it uses the index.

What you've got here is a query that asks for all rows with vid <> 1.
Not-equals is not an indexable operator according to Postgres, and there
would not be much point in making it one, since it generally implies
having to scan the majority of the table.

If, indeed, 99% of your table has vid = 1, then there would be some point
in trying to use an index to find the other 1%; but you'll have to
formulate the query differently (perhaps "vid > 1" would do?) or else
use a properly-designed partial index.

            regards, tom lane