Thread: BUG #3639: queryplanner degrades to a sequential scan even if there's an index

BUG #3639: queryplanner degrades to a sequential scan even if there's an index

From
"Hannu Valtonen"
Date:
The following bug has been logged online:

Bug reference:      3639
Logged by:          Hannu Valtonen
Email address:      hannu.valtonen@hut.fi
PostgreSQL version: 8.2.4
Operating system:   Debian Linux (4.0), kernel 2.6.20
Description:        queryplanner degrades to a sequential scan even if
there's an index
Details:

Hi,

I'm wondering my query plan degrades to a sequential scan. If I have a :

select * from table1, table2 where table1.table2_id = table2.id and
table2.id = <some constant number>

it uses the index on table.table2_id and everything's fast but if I change
the query to be:

select * from table1, table2 where table1.table2_id = table2.id and
lower(table2.name) = lower('nicestring')

the table1.table2_id index scan degrades to a sequential scan.. making it
unbelievably slow.
The tables are vacuumed every 6 hours with analyze, there's nothing in the
server logs that looks out of the ordinary. Table 1 has about 3.5 million
rows and table 2 about 3 million.

I tried with hashjoin and seqscans set to false, but the seqscan still
happens.

Any help would be really appreciated.

- Hannu Valtonen

ps.
Here are the relevant queries with explain analyze

explain analyze select * from table1, table2 where table1.table2_id =
table2.id and table2.id = 2841962;
                                                               QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------------
 Nested Loop  (cost=30.28..2788.35 rows=1124 width=335) (actual
time=0.155..0.164 rows=2 loops=1)
   ->  Index Scan using table2_pkey on table2  (cost=0.00..8.36 rows=1
width=35) (actual time=0.030..0.032 rows=1 loops=1)
         Index Cond: (id = 2841962)
   ->  Bitmap Heap Scan on table1 (cost=30.28..2777.74 rows=1124 width=300)
(actual time=0.033..0.036 rows=2 loops=1)
         Recheck Cond: (2841962 = table2_id)
         ->  Bitmap Index Scan on table1.table2_id_index  (cost=0.00..30.23
rows=723 width=0) (actual time=0.025..0.025 rows=2 loops=1)
               Index Cond: (2841962 = table2_id)
 Total runtime: 0.257 ms
(8 rows)

explain analyze select * from table1, table2 where table1.table2_id =
table2.id and lower(table2.name) = lower('nicestring');

----------------------------------------------------------------------------
----------------------------------------------------------------------------
--
 Hash Join  (cost=8.87..176869.89 rows=1 width=335) (actual
time=59725.942..86744.682 rows=2 loops=1)
   Hash Cond: (table1.table2_id = table2.id)
   ->  Seq Scan on table1  (cost=0.00..174217.47 rows=3524735 width=300)
(actual time=0.002..81600.987 rows=3525023 loops=1)
   ->  Hash  (cost=8.87..8.87 rows=1 width=35) (actual time=2.659..2.659
rows=1 loops=1)
         ->  Index Scan using table2_name_lower_index on table2
(cost=0.00..8.87 rows=1 width=35) (actual time=2.636..2.643 rows=1 loops=1)
               Index Cond: (lower((name)::text) = 'nicestring'::text)
 Total runtime: 86744.726 ms
(7 rows)

Time: 86719,735 ms

Table                                        Table "public.table1"
      Column      |            Type             |
Modifiers
------------------+-----------------------------+---------------------------
-------------------------------
 id               | integer                     | not null default
nextval('table1_id_seq'::regclass)
 table2_id        | integer                     |
"table2_id_index" btree (table2_id) WHERE (NOT (table2_id = 1))
"table2_id_exists" FOREIGN KEY (table2_id) REFERENCES table2(id)

And 25 other foreign keys in table1 that point to table 2 which have the
same types of partial indexes and foreign keys.

                                     Table "public.table2"
  Column   |            Type             |                       Modifiers
-----------+-----------------------------+----------------------------------
----------------------
 id        | integer                     | not null default
nextval('table2_id_seq'::regclass)
 name      | character varying(128)      | not null
 timestamp | timestamp without time zone |
Indexes:
    "table2_pkey" primary key, btree (id)
    "table2_name_key" unique, btree (name)
    "table2_name_lower_index" btree (lower((name)::text))

Re: BUG #3639: queryplanner degrades to a sequential scan even if there's an index

From
Heikki Linnakangas
Date:
Hannu Valtonen wrote:
> explain analyze select * from table1, table2 where table1.table2_id =
> table2.id and table2.id = 2841962;

Here the planner is using the partial index table2_id_index on table1,
knowing that table1.table2_id equals 2841962, satisfying the "NOT
(table2_id = 1)" condition that index has.

> explain analyze select * from table1, table2 where table1.table2_id =
> table2.id and lower(table2.name) = lower('nicestring');

But here, table1.table2_id can have any value, including 1, so it can't
use that index.

You can rewrite the query like this:

select * from table1, table2 where table1.table2_id = table2.id and
lower(table2.name) = lower('nicestring') AND NOT (table1.table2_id = 1)
UNION ALL
select * from table1, table2 where table1.table2_id = table2.id and
lower(table2.name) = lower('nicestring') AND (table1.table2_id = 1)

In which case the planner can use the index for the first part, though
not for the second part which might still be slow. I don't know the
schema, but perhaps you're not really interested in rows with table2_id
= 1, so you could just leave out the second part of the union.

Or you can make the index a normal, non-partial index.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: BUG #3639: queryplanner degrades to a sequential scan even if there's an index

From
Hannu Valtonen
Date:
Heikki Linnakangas kirjoitti:
> Hannu Valtonen wrote:
>
>> explain analyze select * from table1, table2 where table1.table2_id =
>> table2.id and table2.id = 2841962;
>>
>
> Here the planner is using the partial index table2_id_index on table1,
> knowing that table1.table2_id equals 2841962, satisfying the "NOT
> (table2_id = 1)" condition that index has.
>
>
>> explain analyze select * from table1, table2 where table1.table2_id =
>> table2.id and lower(table2.name) = lower('nicestring');
>>
>
> But here, table1.table2_id can have any value, including 1, so it can't
> use that index.
>
> You can rewrite the query like this:
>
> select * from table1, table2 where table1.table2_id = table2.id and
> lower(table2.name) = lower('nicestring') AND NOT (table1.table2_id = 1)
> UNION ALL
> select * from table1, table2 where table1.table2_id = table2.id and
> lower(table2.name) = lower('nicestring') AND (table1.table2_id = 1)
>
> In which case the planner can use the index for the first part, though
> not for the second part which might still be slow. I don't know the
> schema, but perhaps you're not really interested in rows with table2_id
> = 1, so you could just leave out the second part of the union.
>
> Or you can make the index a normal, non-partial index.
>
Ah, thank you very much.  I was specifically interested in the != 1
chunk of table1. And that now works with the

AND NOT (table1.table2_id = 1)

- Hannu