Thread: 8.1 (win32): partial index not used?
hi list i've noticed a problem with an index, where the index wouldn't be used anymore as soon as a partial condition is added. here's the setup: table: "oldtables.movies" containing around 50'000 entries. a small percentage (roughly 4000) have a value for the columns "mov_year" (int, default: 0) and "mov_folder" (varchar, default: '') indices: movies_mov_year_idx: USING btree ("mov_year"); movies_folder_idx: USING btree (lower((mov_folder)::text)); query: SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND lower(mov_folder) = 'harrypotter5' LIMIT 1 queryplan1: Bitmap Heap Scan on movies (cost=5.67..9.68 rows=1 width=232) (actual time=0.197..0.198 rows=1 loops=1) Recheck Cond: ((lower((mov_folder)::text) = 'harrypotter5'::text) AND (mov_year = 2007)) -> BitmapAnd (cost=5.67..5.67 rows=1 width=0) (actual time=0.191..0.191 rows=0 loops=1) -> Bitmap Index Scan on movies_folder_idx (cost=0.00..2.64 rows=183 width=0) (actual time=0.089..0.089 rows=1 loops=1) Index Cond: (lower((mov_folder)::text) = 'harrypotter5'::text) -> Bitmap Index Scan on movies_mov_year_idx (cost=0.00..2.78 rows=222 width=0) (actual time=0.097..0.097 rows=62 loops=1) Index Cond: (mov_year = 2007) Total runtime: 0.274 ms i thought of changing the movies_folder_idx to a partial index for speeding up the query: movies_folder_idx: USING btree ((lower((mov_folder)::text))) WHERE ((mov_folder)::text <> ''::text); after adding the index, i've issued a reindex and ran the query again: query: SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND lower(mov_folder) = 'harrypotter5' LIMIT 1 queryplan2: Bitmap Heap Scan on movies (cost=2.78..581.78 rows=1 width=232) (actual time=0.440..0.630 rows=1 loops=1) Recheck Cond: (mov_year = 2007) Filter: (lower((mov_folder)::text) = 'harrypotter5'::text) -> Bitmap Index Scan on movies_mov_year_idx (cost=0.00..2.78 rows=222 width=0) (actual time=0.182..0.182 rows=62 loops=1) Index Cond: (mov_year = 2007) Total runtime: 0.768 ms why isn't the index in queryplan2 used anymore? on a sidenote, the query runtimes seem in both plans way to high for a relatively small table. what settings would i have to tweak to get better results? thanks, thomas
I'm still new at this as well, but I'm guessing that the query planner can't extrapolate that (lower(mov_folder) = 'harrypotter5') also implies ((mov_folder)::text <> ''::text). Try the query:
SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND
lower(mov_folder) = 'harrypotter5' AND (mov_folder)::text <> ''::text
I'm thinking that that will allow the planner to match the WHERE and use the index.
-Mike
SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND
lower(mov_folder) = 'harrypotter5' AND (mov_folder)::text <> ''::text
I'm thinking that that will allow the planner to match the WHERE and use the index.
-Mike
On 4/27/06, me@alternize.com <me@alternize.com> wrote:
hi list
i've noticed a problem with an index, where the index wouldn't be used
anymore as soon as a partial condition is added.
here's the setup:
table:
"oldtables.movies" containing around 50'000 entries. a small percentage
(roughly 4000) have a value for the columns "mov_year" (int, default: 0) and
"mov_folder" (varchar, default: '')
indices:
movies_mov_year_idx: USING btree ("mov_year");
movies_folder_idx: USING btree (lower((mov_folder)::text));
query:
SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND
lower(mov_folder) = 'harrypotter5' LIMIT 1
queryplan1:
Bitmap Heap Scan on movies (cost= 5.67..9.68 rows=1 width=232) (actual
time=0.197..0.198 rows=1 loops=1)
Recheck Cond: ((lower((mov_folder)::text) = 'harrypotter5'::text) AND
(mov_year = 2007))
-> BitmapAnd (cost=5.67..5.67 rows=1 width=0) (actual time= 0.191..0.191
rows=0 loops=1)
-> Bitmap Index Scan on movies_folder_idx (cost=0.00..2.64
rows=183 width=0) (actual time=0.089..0.089 rows=1 loops=1)
Index Cond: (lower((mov_folder)::text) = 'harrypotter5'::text)
-> Bitmap Index Scan on movies_mov_year_idx (cost=0.00..2.78
rows=222 width=0) (actual time=0.097..0.097 rows=62 loops=1)
Index Cond: (mov_year = 2007)
Total runtime: 0.274 ms
i thought of changing the movies_folder_idx to a partial index for speeding
up the query:
movies_folder_idx: USING btree ((lower((mov_folder)::text))) WHERE
((mov_folder)::text <> ''::text);
after adding the index, i've issued a reindex and ran the query again:
query:
SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND
lower(mov_folder) = 'harrypotter5' LIMIT 1
queryplan2:
Bitmap Heap Scan on movies (cost=2.78..581.78 rows=1 width=232) (actual
time=0.440..0.630 rows=1 loops=1)
Recheck Cond: (mov_year = 2007)
Filter: (lower((mov_folder)::text) = 'harrypotter5'::text)
-> Bitmap Index Scan on movies_mov_year_idx (cost=0.00..2.78 rows=222
width=0) (actual time= 0.182..0.182 rows=62 loops=1)
Index Cond: (mov_year = 2007)
Total runtime: 0.768 ms
why isn't the index in queryplan2 used anymore?
on a sidenote, the query runtimes seem in both plans way to high for a
relatively small table. what settings would i have to tweak to get better
results?
thanks,
thomas
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
> i've noticed a problem with an index, where the index wouldn't be used > anymore as soon as a partial condition is added. please ignore my partial index problem. of course i would have to define the partial index part correctly: it works just fine this way: movies_folder_idx: USING btree ((lower((mov_folder)::text))) WHERE (lower((mov_folder)::text) <> ''::text); still, 0.2ms for querying an index containing ~3000 entries seems quite long: SELECT * FROM oldtables.movies WHERE lower(mov_folder) = 'harrypotter5' AND mov_year = 2007 LIMIT 1 Limit (cost=5.67..9.68 rows=1 width=232) (actual time=0.084..0.084 rows=1 loops=1) -> Bitmap Heap Scan on movies (cost=5.67..9.68 rows=1 width=232) (actual time=0.081..0.081 rows=1 loops=1) Recheck Cond: ((lower((mov_folder)::text) = 'harrypotter5'::text) AND (mov_year = 2007)) -> BitmapAnd (cost=5.67..5.67 rows=1 width=0) (actual time=0.075..0.075 rows=0 loops=1) -> Bitmap Index Scan on movies_folder_idx (cost=0.00..2.64 rows=182 width=0) (actual time=0.034..0.034 rows=1 loops=1) Index Cond: (lower((mov_folder)::text) = 'harrypotter5'::text) -> Bitmap Index Scan on movies_mov_year_idx (cost=0.00..2.78 rows=222 width=0) (actual time=0.036..0.036 rows=62 loops=1) Index Cond: (mov_year = 2007) Total runtime: 0.217 ms any tipps? - thomas
> any tipps? oh well. i need more sleep. please also ignore my latest mail. of course i was looking at the EXPLAIN ANALYZE output runtime - the acutal query time is of course fast. i'll go hide in the corner now :-/ - thomas
<me@alternize.com> writes: > i thought of changing the movies_folder_idx to a partial index for speeding > up the query: > movies_folder_idx: USING btree ((lower((mov_folder)::text))) WHERE > ((mov_folder)::text <> ''::text); > after adding the index, i've issued a reindex and ran the query again: > query: > SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND > lower(mov_folder) = 'harrypotter5' LIMIT 1 > why isn't the index in queryplan2 used anymore? Because the planner can't prove that the query only needs to access rows satisfying the index's condition. If you wrote SELECT * FROM oldtables.movies WHERE mov_year = 2007 AND lower(mov_folder) = 'harrypotter5' AND mov_folder <> '' LIMIT 1 then it'd work. If you're expecting the planner to deduce mov_folder <> '' from lower(mov_folder) = 'harrypotter5', sorry, that ain't happening. That requires at least one piece of knowledge the planner does not possess (that lower('') can't yield 'harrypotter5'), and even with that knowledge available there'd be way more searching to make this proof than we can afford to indulge in during the planning process. I think it would work with that query if you'd made the partial index predicate be WHERE lower(mov_folder) <> ''. That matches the query closely enough that the planner will figure out that what it needs to check is whether 'harrypotter5' <> ''. regards, tom lane