Thread: Postgres not using indices defined on my table with certain queries using "in"

The two queries below take radically different amounts of time, 1600ms vs 10ms:

select * from test where ock in (1800,1810);
select * from test where ock = 1800 or ock = 1810;

The first query does not use any of the indices I have defined, and I can't
figure out why. How can I make Postgres use the indices?

In comparison, the following two queries take 42ms vs 92ms, and both
use the indices defined:

select * from test where ock in (1800,1801);
select * from test where ock = 1800 or ock = 1801;

Below is my table definition and the indices that I've defined on it
(a sketch of the equivalent CREATE INDEX statements follows the listing).
I've gone overboard with the indices while trying to understand the
behavior that Postgres is exhibiting. My table contains 161,000 records.

      Table "public.test"
 Column |   Type   | Modifiers
--------+----------+-----------
 ock    | integer  |
 ick    | smallint |
 det    | smallint |
Indexes:
    "test_ock_1800" btree (ock) WHERE ock >= 1800 AND ock <= 1800
    "test_ock_1800_1809" btree (ock) WHERE ock >= 1800 AND ock <= 1809
    "test_ock_1801" btree (ock) WHERE ock >= 1801 AND ock <= 1801
    "test_ock_1802" btree (ock) WHERE ock >= 1802 AND ock <= 1802
    "test_ock_1803" btree (ock) WHERE ock >= 1803 AND ock <= 1803
    "test_ock_1804" btree (ock) WHERE ock >= 1804 AND ock <= 1804
    "test_ock_1805" btree (ock) WHERE ock >= 1805 AND ock <= 1805
    "test_ock_1810_1819" btree (ock) WHERE ock >= 1810 AND ock <= 1819
    "test_ock_1820_1829" btree (ock) WHERE ock >= 1820 AND ock <= 1829
    "test_ock_1830_1839" btree (ock) WHERE ock >= 1830 AND ock <= 1839
    "test_ock_1880" btree (ock) WHERE ock >= 1880 AND ock <= 1880


EXPLAIN output using "select * from test where ock in (1800,1810);"
                              QUERY PLAN
----------------------------------------------------------------------
 Seq Scan on test  (cost=100000000.00..100002802.50 rows=103 width=8)
   Filter: (ock = ANY ('{1800,1810}'::integer[]))
(2 rows)


EXPLAIN output using "select * from test where ock = 1800 or ock = 1810;"
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=8.95..12.96 rows=103 width=8)
   Recheck Cond: ((ock = 1800) OR (ock = 1810))
   ->  BitmapOr  (cost=8.95..8.95 rows=1 width=0)
         ->  Bitmap Index Scan on test_ock_1800  (cost=0.00..4.26 rows=1 width=0)
               Index Cond: (ock = 1800)
         ->  Bitmap Index Scan on test_ock_1810_1819  (cost=0.00..4.64 rows=1 width=0)
               Index Cond: (ock = 1810)
(7 rows)



EXPLAIN select * from test where ock in (1800, 1801);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=9.31..13.32 rows=103 width=8)
   Recheck Cond: (ock = ANY ('{1800,1801}'::integer[]))
   ->  Bitmap Index Scan on test_ock_1800_1809  (cost=0.00..9.28 rows=1 width=0)
         Index Cond: (ock = ANY ('{1800,1801}'::integer[]))
(4 rows)


EXPLAIN select * from test where ock = 1800 or ock = 1801;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=8.57..12.58 rows=103 width=8)
   Recheck Cond: ((ock = 1800) OR (ock = 1801))
   ->  BitmapOr  (cost=8.57..8.57 rows=1 width=0)
         ->  Bitmap Index Scan on test_ock_1800  (cost=0.00..4.26 rows=1 width=0)
               Index Cond: (ock = 1800)
         ->  Bitmap Index Scan on test_ock_1801  (cost=0.00..4.26 rows=1 width=0)
               Index Cond: (ock = 1801)
(7 rows)

Thanks.
Saadat.

"s anwar" <sanwar@gmail.com> writes:
> The two queries below take radically different amounts of time, 1600ms vs 10ms:

Try not to be so fancy with a bunch of somewhat-overlapping partial indexes.
The planner is not so smart as you, and will not always be able to prove
to itself that it can use these indexes.  A single, non-partial index on
ock would perform at least as well as this hodgepodge.
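
For example, something along these lines (just a sketch; test_ock_idx is
an arbitrary name, and the remaining partial indexes would be dropped the
same way):

drop index test_ock_1800;    -- repeat for the other partial indexes
create index test_ock_idx on test (ock);
analyze test;
explain select * from test where ock in (1800, 1810);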

            regards, tom lane

Tom:

Thank you for your response. The actual table will have 400 million
rows. The last time I created an index on an integer column of a table
that size, the index was apparently too expensive for Postgres to use
(the planner assigned too high a cost to it), so Postgres fell back to a
sequential scan. I would like to figure out a better way of partitioning
my index so that it remains useful to Postgres, and I would appreciate
any tips you can provide in this regard. Thanks.

Saadat.


"s anwar" <sanwar@gmail.com> writes:
> Thank you for your response. The actual table will have 400 million
> rows. The last time I created an index on an integer column of a table
> that size, the index was apparently too expensive for Postgres to use
> (the planner assigned too high a cost to it), so Postgres fell back to
> a sequential scan.

I rather doubt that you analyzed that situation correctly.  An index
being large is not a reason not to use it (since the table will also be
large, hence the seqscan cost is even higher).  However, with no details
it's impossible to guess the real reason the index wasn't being used.
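
One standard way to gather that kind of detail (a sketch, reusing the
test-table query from upthread as a stand-in) is to compare the plan and
timing with the sequential scan temporarily discouraged, and to check the
planner's statistics for the table:

explain analyze select * from test where ock in (1800, 1810);

set enable_seqscan = off;    -- make the seqscan look prohibitively expensive
explain analyze select * from test where ock in (1800, 1810);
reset enable_seqscan;

-- has the table been analyzed, and what does the planner think it contains?
select relname, relpages, reltuples from pg_class where relname = 'test';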

            regards, tom lane