Thread: BUG #3639: queryplanner degrades to a sequential scan even if there's an index
BUG #3639: queryplanner degrades to a sequential scan even if there's an index
From
"Hannu Valtonen"
Date:
The following bug has been logged online: Bug reference: 3639 Logged by: Hannu Valtonen Email address: hannu.valtonen@hut.fi PostgreSQL version: 8.2.4 Operating system: Debian Linux (4.0), kernel 2.6.20 Description: queryplanner degrades to a sequential scan even if there's an index Details: Hi, I'm wondering my query plan degrades to a sequential scan. If I have a : select * from table1, table2 where table1.table2_id = table2.id and table2.id = <some constant number> it uses the index on table.table2_id and everything's fast but if I change the query to be: select * from table1, table2 where table1.table2_id = table2.id and lower(table2.name) = lower('nicestring') the table1.table2_id index scan degrades to a sequential scan.. making it unbelievably slow. The tables are vacuumed every 6 hours with analyze, there's nothing in the server logs that looks out of the ordinary. Table 1 has about 3.5 million rows and table 2 about 3 million. I tried with hashjoin and seqscans set to false, but the seqscan still happens. Any help would be really appreciated. - Hannu Valtonen ps. Here are the relevant queries with explain analyze explain analyze select * from table1, table2 where table1.table2_id = table2.id and table2.id = 2841962; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------------ Nested Loop (cost=30.28..2788.35 rows=1124 width=335) (actual time=0.155..0.164 rows=2 loops=1) -> Index Scan using table2_pkey on table2 (cost=0.00..8.36 rows=1 width=35) (actual time=0.030..0.032 rows=1 loops=1) Index Cond: (id = 2841962) -> Bitmap Heap Scan on table1 (cost=30.28..2777.74 rows=1124 width=300) (actual time=0.033..0.036 rows=2 loops=1) Recheck Cond: (2841962 = table2_id) -> Bitmap Index Scan on table1.table2_id_index (cost=0.00..30.23 rows=723 width=0) (actual time=0.025..0.025 rows=2 loops=1) Index Cond: (2841962 = table2_id) Total runtime: 0.257 ms (8 rows) explain analyze select * from table1, table2 where table1.table2_id = table2.id and lower(table2.name) = lower('nicestring'); ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- -- Hash Join (cost=8.87..176869.89 rows=1 width=335) (actual time=59725.942..86744.682 rows=2 loops=1) Hash Cond: (table1.table2_id = table2.id) -> Seq Scan on table1 (cost=0.00..174217.47 rows=3524735 width=300) (actual time=0.002..81600.987 rows=3525023 loops=1) -> Hash (cost=8.87..8.87 rows=1 width=35) (actual time=2.659..2.659 rows=1 loops=1) -> Index Scan using table2_name_lower_index on table2 (cost=0.00..8.87 rows=1 width=35) (actual time=2.636..2.643 rows=1 loops=1) Index Cond: (lower((name)::text) = 'nicestring'::text) Total runtime: 86744.726 ms (7 rows) Time: 86719,735 ms Table Table "public.table1" Column | Type | Modifiers ------------------+-----------------------------+--------------------------- ------------------------------- id | integer | not null default nextval('table1_id_seq'::regclass) table2_id | integer | "table2_id_index" btree (table2_id) WHERE (NOT (table2_id = 1)) "table2_id_exists" FOREIGN KEY (table2_id) REFERENCES table2(id) And 25 other foreign keys in table1 that point to table 2 which have the same types of partial indexes and foreign keys. Table "public.table2" Column | Type | Modifiers -----------+-----------------------------+---------------------------------- ---------------------- id | integer | not null default nextval('table2_id_seq'::regclass) name | character varying(128) | not null timestamp | timestamp without time zone | Indexes: "table2_pkey" primary key, btree (id) "table2_name_key" unique, btree (name) "table2_name_lower_index" btree (lower((name)::text))
Re: BUG #3639: queryplanner degrades to a sequential scan even if there's an index
From
Heikki Linnakangas
Date:
Hannu Valtonen wrote: > explain analyze select * from table1, table2 where table1.table2_id = > table2.id and table2.id = 2841962; Here the planner is using the partial index table2_id_index on table1, knowing that table1.table2_id equals 2841962, satisfying the "NOT (table2_id = 1)" condition that index has. > explain analyze select * from table1, table2 where table1.table2_id = > table2.id and lower(table2.name) = lower('nicestring'); But here, table1.table2_id can have any value, including 1, so it can't use that index. You can rewrite the query like this: select * from table1, table2 where table1.table2_id = table2.id and lower(table2.name) = lower('nicestring') AND NOT (table1.table2_id = 1) UNION ALL select * from table1, table2 where table1.table2_id = table2.id and lower(table2.name) = lower('nicestring') AND (table1.table2_id = 1) In which case the planner can use the index for the first part, though not for the second part which might still be slow. I don't know the schema, but perhaps you're not really interested in rows with table2_id = 1, so you could just leave out the second part of the union. Or you can make the index a normal, non-partial index. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: BUG #3639: queryplanner degrades to a sequential scan even if there's an index
From
Hannu Valtonen
Date:
Heikki Linnakangas kirjoitti: > Hannu Valtonen wrote: > >> explain analyze select * from table1, table2 where table1.table2_id = >> table2.id and table2.id = 2841962; >> > > Here the planner is using the partial index table2_id_index on table1, > knowing that table1.table2_id equals 2841962, satisfying the "NOT > (table2_id = 1)" condition that index has. > > >> explain analyze select * from table1, table2 where table1.table2_id = >> table2.id and lower(table2.name) = lower('nicestring'); >> > > But here, table1.table2_id can have any value, including 1, so it can't > use that index. > > You can rewrite the query like this: > > select * from table1, table2 where table1.table2_id = table2.id and > lower(table2.name) = lower('nicestring') AND NOT (table1.table2_id = 1) > UNION ALL > select * from table1, table2 where table1.table2_id = table2.id and > lower(table2.name) = lower('nicestring') AND (table1.table2_id = 1) > > In which case the planner can use the index for the first part, though > not for the second part which might still be slow. I don't know the > schema, but perhaps you're not really interested in rows with table2_id > = 1, so you could just leave out the second part of the union. > > Or you can make the index a normal, non-partial index. > Ah, thank you very much. I was specifically interested in the != 1 chunk of table1. And that now works with the AND NOT (table1.table2_id = 1) - Hannu