Re: Strange (?) Index behavior? - Mailing list pgsql-performance
From | Allen Landsidel |
---|---|
Subject | Re: Strange (?) Index behavior? |
Date | |
Msg-id | 88f1825a04110514407ef8a11b@mail.gmail.com Whole thread Raw |
In response to | Strange (?) Index behavior? (Allen Landsidel <alandsidel@gmail.com>) |
List | pgsql-performance |
On Fri, 05 Nov 2004 16:08:56 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Allen Landsidel <alandsidel@gmail.com> writes: > > With seqscan enabled however, "AB%" will use the index, but "A%" will not. > > > The estimated cost for the query is much higher without the partial > > indexes than it is with them, and the actual runtime of the query is > > definitely longer without the partial indexes. > > OK. This suggests that the planner is drastically misestimating > the selectivity of the 'A%' clause, which seems odd to me since in > principle it could get that fairly well from the ANALYZE histogram. > But it could well be that you need to increase the resolution of the > histogram --- see ALTER TABLE SET STATISTICS. I will look into this. > > Did you ever show us EXPLAIN ANALYZE results for this query? No, I didn't. I am running it now without the partial index on to give you the results but it's (the 'A%' problem query) been running pretty much since I got this message (an hour ago) and is still not finished. The EXPLAIN results without the ANALYZE will have to suffice until it's done, I can readd the index, and run it again, so you have both to compare to. First two queries run where both the main index, and the 'A%' index exist: -- QUERY 1 search=# explain search-# SELECT test_name FROM test WHERE test_name LIKE 'A%'; QUERY PLAN ------------------------------------------------------------------------------------------- Index Scan using test_name_idx_a on "test" (cost=0.00..8605.88 rows=391208 width=20) Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text)) Filter: (test_name ~~ 'A%'::text) (3 rows) Time: 16.507 ms -- QUERY 2 search=# explain search-# SELECT test_name FROM test WHERE test_name LIKE 'A%' AND test_name LIKE 'AB%'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Index Scan using test_name_idx_a on "test" (cost=0.00..113.79 rows=28 width=20) Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text) AND (test_name >= 'AB'::text) AND (test_name < 'AC'::text)) Filter: ((test_name ~~ 'A%'::text) AND (test_name ~~ 'AB%'::text)) (3 rows) Time: 3.197 ms Ok, now the same two queries after a DROP INDEX test_name_idx_a; search=# explain search-# SELECT test_name FROM test WHERE test_name LIKE 'A%'; QUERY PLAN ----------------------------------------------------------------------------------------------- Index Scan using test_name_unique on "test" (cost=0.00..1568918.66 rows=391208 width=20) Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text)) Filter: (test_name ~~ 'A%'::text) (3 rows) Time: 2.470 ms search=# explain search-# SELECT test_name FROM test WHERE test_name LIKE 'AB%'; QUERY PLAN ------------------------------------------------------------------------------------------- Index Scan using test_name_unique on "test" (cost=0.00..20379.49 rows=5081 width=20) Index Cond: ((test_name >= 'AB'::text) AND (test_name < 'AC'::text)) Filter: (test_name ~~ 'AB%'::text) (3 rows) Time: 2.489 ms ------------------ Copying just the costs you can see the vast difference... Index Scan using test_name_unique on "test" (cost=0.00..1568918.66 rows=391208 width=20) Index Scan using test_name_unique on "test" (cost=0.00..20379.49 rows=5081 width=20) vs Index Scan using test_name_idx_a on "test" (cost=0.00..8605.88 rows=391208 width=20) Index Scan using test_name_idx_a on "test" (cost=0.00..113.79 rows=28 width=20) Lastly no, neither of these row guesstimates is correct.. I'll get back and tell you how much they're off by if it's important, once this query is done. The odd thing is it used the index scan here each time -- that has not always been the case with the main unique index, it's trying to make a liar out of me heh. I'm used to the estimates and plan changing from one vacuum analyze to the next, even without any inserts or updates between.. the index scan is always used however when I have the partial indexes in place, and something like.. CREATE TEMP TABLE t1 AS SELECT field FROM table WHERE field LIKE 'A%' AND field LIKE 'AA%'; runs in 6-8 seconds as well, with a bit under 100k records. -Allen
pgsql-performance by date: