Thread: PostgreSQL selecting strange index for simple query
First some details about the server:

hh=# select version();
                                           version
----------------------------------------------------------------------------------------------
 PostgreSQL 8.3.5 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1) 4.3.2
(1 row)

hh=# SHOW effective_cache_size;
 effective_cache_size
----------------------
 32GB
(1 row)

hh=# SHOW random_page_cost;
 random_page_cost
------------------
 1
(1 row)

(random_page_cost is set to 1 because the whole DB fits in RAM.)

Now the troublesome query:

select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00'

Table resume has about 70 columns (I can post the structure if required) and has 2 indexes which can be used for this query:

"resume_last_change_time_idx" btree (last_change_time) WITH (fillfactor=90)
"resume_user_id_disabled_is_finished_last_change_time_idx" btree (user_id, disabled, is_finished, last_change_time) WITH (fillfactor=90)

And the row count:

hh=# SELECT count(*) from resume;
  count
---------
 5843784
(1 row)

And postgres does a very strange thing, selecting the second index for this query:

hh=# EXPLAIN ANALYZE select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
                                                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=612329.41..612329.42 rows=1 width=0) (actual time=1806.860..1806.861 rows=1 loops=1)
   ->  Index Scan using resume_user_id_disabled_is_finished_last_change_time_idx on resume  (cost=0.00..608254.00 rows=1630162 width=0) (actual time=0.020..1648.265 rows=1627437 loops=1)
         Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
 Total runtime: 1806.915 ms
(4 rows)

Of course, after I drop index resume_user_id_disabled_is_finished_last_change_time_idx, postgres starts using the right index and the query becomes faster:

hh=# drop INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
DROP INDEX
hh=# EXPLAIN ANALYZE select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=695294.37..695294.38 rows=1 width=0) (actual time=1659.655..1659.655 rows=1 loops=1)
   ->  Index Scan using resume_last_change_time_idx on resume  (cost=0.00..691218.96 rows=1630162 width=0) (actual time=0.021..1500.817 rows=1627437 loops=1)
         Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
 Total runtime: 1659.715 ms
(4 rows)

(All queries were run after ANALYZE and count(*) on resume, and repeated 2-3 times until the runtime became stable.)

Timing the real query shows the same picture (i.e. the query with the wrongly selected index really is slower):

hh=# \timing
Timing is on.
hh=# select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
  count
---------
 1627437
(1 row)

Time: 1677.731 ms
hh=# drop INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
DROP INDEX
Time: 0.452 ms
hh=# select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
  count
---------
 1627437
(1 row)

Time: 1530.906 ms

Somehow postgres thinks an index scan on the single-column index is slower compared to a scan on the 4th field of the 4-column index.
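Side note: the comparison can also be made repeatable without really losing the index, since DROP INDEX is transactional in PostgreSQL; a minimal sketch, using the same table, index, and query as above:

BEGIN;
-- hide the multicolumn index from the planner, but only inside this transaction
DROP INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
EXPLAIN ANALYZE
    SELECT count(*) FROM resume
    WHERE resume.last_change_time > '2008-09-01 00:00:00';
ROLLBACK;  -- the index (and its statistics) come back untouched

(Note that DROP INDEX holds an ACCESS EXCLUSIVE lock on the table until the rollback, so this is only suitable on a test server.)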
PS: Very strange, but a clean-room test with only the 4-column subset of table resume shows postgres selecting the right plan (while a fully recreated table with all 70 columns gives the same wrong plan):

hh=# SELECT user_id, disabled, is_finished, last_change_time into test_table from resume;
SELECT
hh=# ANALYZE test_table;
ANALYZE
hh=# CREATE INDEX test_table_last_change_time on test_table(last_change_time) with (fillfactor=90);
CREATE INDEX
hh=# CREATE INDEX test_table_complex_idx on test_table(user_id, disabled, is_finished, last_change_time) with (fillfactor=90);
CREATE INDEX
hh=# SELECT count(*) from test_table;
  count
---------
 5843784
(1 row)

hh=# EXPLAIN ANALYZE SELECT count(*) from test_table where last_change_time > '2008-09-01 00:00:00';
                                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=72239.92..72239.93 rows=1 width=0) (actual time=1386.038..1386.038 rows=1 loops=1)
   ->  Index Scan using test_table_last_change_time on test_table  (cost=0.00..68193.49 rows=1618568 width=0) (actual time=0.057..1228.817 rows=1627437 loops=1)
         Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
 Total runtime: 1386.074 ms
(4 rows)

Thanks for help.

-- 
SY, Maxim Boguk
Tom Lane wrote:
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> Somehow postgres thinks an index scan on the single-column index is slower compared to a scan on the 4th field of the 4-column index.
>
> It does know better than that.  I'm wondering if the single-column index
> has become very bloated or something.  Have you compared the physical
> index sizes?

The table was freshly loaded from a dump on the test server, so there is no index bloat for sure.

As for comparing physical sizes, the right single-column index is indeed smaller than the wrong one:

Right index:

hh=# SELECT pg_size_pretty(pg_relation_size('resume_last_change_time_idx'));
 pg_size_pretty
----------------
 125 MB
(1 row)

Wrong index:

hh=# SELECT pg_size_pretty(pg_relation_size('resume_user_id_disabled_is_finished_last_change_time_idx'));
 pg_size_pretty
----------------
 226 MB
(1 row)

Regards,
Maxim Boguk
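PS: For completeness, the physical size of every index on resume can be listed with a single catalog query; a minimal sketch (only pg_index, pg_class, and pg_relation_size are assumed):

SELECT c.relname AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
  FROM pg_index i
  JOIN pg_class c ON c.oid = i.indexrelid
 WHERE i.indrelid = 'resume'::regclass
 ORDER BY pg_relation_size(c.oid) DESC;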
Maxim Boguk <mboguk@masterhost.ru> writes:
> Tom Lane wrote:
>> It does know better than that.  I'm wondering if the single-column index
>> has become very bloated or something.  Have you compared the physical
>> index sizes?

> The table was freshly loaded from a dump on the test server, so there is no index bloat for sure.
> As for comparing physical sizes, the right single-column index is indeed smaller than the wrong one:

Huh.  I get sane-looking choices when I try a similar case here.  Can
you put together a self-contained test case?

			regards, tom lane
Maxim Boguk <mboguk@masterhost.ru> writes:
> Somehow postgres thinks an index scan on the single-column index is slower compared to a scan on the 4th field of the 4-column index.

It does know better than that.  I'm wondering if the single-column index
has become very bloated or something.  Have you compared the physical
index sizes?

			regards, tom lane
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> Tom Lane wrote:
>>> It does know better than that.  I'm wondering if the single-column index
>>> has become very bloated or something.  Have you compared the physical
>>> index sizes?
>
>> The table was freshly loaded from a dump on the test server, so there is no index bloat for sure.
>> As for comparing physical sizes, the right single-column index is indeed smaller than the wrong one:
>
> Huh.  I get sane-looking choices when I try a similar case here.  Can
> you put together a self-contained test case?

Not a full self-contained test case yet, but here is some minimal setup data which may give you an idea of what's going wrong. The tests confirm my theory that the wrong index selection is linked to long rows in the table.

My tests contain queries like these:

TEST1 (OK):

set random_page_cost=1;
drop TABLE if exists test_table;
SELECT user_id, last_change_time, rpad('a',1,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

Result:

 Index Scan using right_idx on test_table  (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 368.699 ms

TEST2 (OK, but note: the cost increased 4x while the real run time increased only by 30%):

Same, but with rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table.

Result:

 Index Scan using right_idx on test_table  (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 448.717 ms

TEST3 (NOT OK!):

Same, but with rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table.

Oops, the wrong index is used:

 Index Scan using wrong_idx on test_table  (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 757.930 ms
(3 rows)

(Btw, if I drop wrong_idx the query becomes almost 2x faster:

drop INDEX wrong_idx;

 Index Scan using right_idx on test_table  (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 442.790 ms
)

TEST4 (OK again; here the TOAST engine removed all the long values from test_table):

Same, but with rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table.

 Index Scan using right_idx on test_table  (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 397.929 ms

So I have two theories (just hand-waving, of course):
1) integer overflow somewhere in the cost calculation
2) floating-point rounding errors (because the costs are very close in the wrong situations: 254918.19 vs 259709.09)

PS: A second issue is that the cost grows with increasing row length completely independently of the real run time. (Probably I need to tune the other *_cost parameters to get more exact cost values.)

PPS: Sorry for my poor English.

Regards,
Maxim Boguk
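For reference, the raw table and index sizes the planner works from in these tests can be checked after each ANALYZE; a minimal sketch, assuming the test_table / right_idx / wrong_idx names used above:

SELECT relname, relkind, relpages, reltuples
  FROM pg_class
 WHERE relname IN ('test_table', 'right_idx', 'wrong_idx');

The relpages value for test_table grows with the rpad() length (until TOAST takes over), and that is what feeds the cost estimates compared above.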
Maxim Boguk <mboguk@masterhost.ru> writes:
> So I have two theories (just hand-waving, of course):
> 1) integer overflow somewhere in the cost calculation

Costs are floats, and in any case you're not showing costs anywhere near
the integer overflow limit...

> 2) floating-point rounding errors (because the costs are very close in the wrong situations: 254918.19 vs 259709.09)

The planner is intentionally set up to consider costs within a percent
or so of each other as being effectively equal.  If the estimated costs
are that close then it doesn't surprise me if it sometimes picks the
"wrong" plan.  The real question is why are the estimates so close?
They should not be, since AFAICS you are talking about a situation
where we'd have to scan all of the multicol index versus only about
a fifth of the single-col one.

			regards, tom lane
Tom Lane wrote:
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> So I have two theories (just hand-waving, of course):
>> 1) integer overflow somewhere in the cost calculation
>
> Costs are floats, and in any case you're not showing costs anywhere near
> the integer overflow limit...
>
>> 2) floating-point rounding errors (because the costs are very close in the wrong situations: 254918.19 vs 259709.09)
>
> The planner is intentionally set up to consider costs within a percent
> or so of each other as being effectively equal.  If the estimated costs
> are that close then it doesn't surprise me if it sometimes picks the
> "wrong" plan.  The real question is why are the estimates so close?
> They should not be, since AFAICS you are talking about a situation
> where we'd have to scan all of the multicol index versus only about
> a fifth of the single-col one.

Ok, I explored some more. Just one thing first:

hh=# SHOW default_statistics_target;
 default_statistics_target
---------------------------
 10
(1 row)

(Btw, increasing the statistics target to 1000 does not fix the situation.)

I tried to simplify the test case: now I use a sequential user_id, and last_change_time truncated to date:

SELECT nextval('test_seq') as user_id, last_change_time::date, rpad('a',500,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

 Index Scan using wrong_idx on test_table  (cost=0.00..182623.51 rows=316522 width=0) (actual time=0.056..534.620 rows=382671 loops=1)
   Index Cond: (last_change_time > '2009-01-10'::date)

DROP INDEX wrong_idx;
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

 Index Scan using right_idx on test_table  (cost=0.00..221765.19 rows=316522 width=0) (actual time=0.023..346.213 rows=382671 loops=1)
   Index Cond: (last_change_time > '2009-01-10'::date)

A full index scan over the wrong index is costed noticeably lower than a scan over roughly 1/17 of the single-column index (182623 vs 221765)!

So last_change_time is the only column that still cannot be generated synthetically... but:

hh=# SELECT count(distinct last_change_time) from test_table;
 count
-------
  2133
(1 row)

And the statistics values for last_change_time are:

hh=# SELECT * from pg_stats where tablename='test_table' and attname='last_change_time';
-[ RECORD 1 ]-----+---------------------------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | test_table
attname           | last_change_time
null_frac         | 0
avg_width         | 4
n_distinct        | 1211
most_common_vals  | {2004-02-27,2009-01-26,2008-03-11,2009-01-27,2004-01-15,2008-10-13,2009-01-19,2009-01-22,2009-01-21,2008-09-29}
most_common_freqs | {0.00833333,0.00766667,0.00733333,0.007,0.00666667,0.00533333,0.00533333,0.00533333,0.005,0.00466667}
histogram_bounds  | {2003-03-31,2005-08-26,2006-10-04,2007-04-29,2007-09-27,2008-01-24,2008-05-01,2008-07-29,2008-10-01,2008-11-27,2009-01-25}
correlation       | 0.261512

I think that is all the planner can use when choosing the plan... because user_id is just unique sequential values.

Regards,
Maxim Boguk
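For reference, the statistics target can also be raised for just this one column instead of through default_statistics_target; a minimal sketch, reusing test_table from above:

ALTER TABLE test_table ALTER COLUMN last_change_time SET STATISTICS 1000;
ANALYZE test_table;
-- re-run the EXPLAIN ANALYZE above afterwards to see whether the plan changes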
Maxim Boguk <mboguk@masterhost.ru> writes:
> I tried to simplify the test case: now I use a sequential user_id, and last_change_time truncated to date:
> SELECT nextval('test_seq') as user_id, last_change_time::date, rpad('a',500,'b') as f1 into test_table from resume;

Could you send me a dump of this test_table off-list?  It seems like
there must be something strange about the stats of last_change_time,
but I don't feel like guessing about what it is ...

			regards, tom lane
Maxim Boguk <mboguk@masterhost.ru> writes:
> Tom Lane wrote:
>> Could you send me a dump of this test_table off-list?  It seems like
>> there must be something strange about the stats of last_change_time,
>> but I don't feel like guessing about what it is ...

> Here attached is a small part of the table (1160 rows) which shows the same wrong index selection.

Thanks.  It turns out the funny behavior is not because of
last_change_time, but because wrong_idx's *first* column is well
correlated with the table ordering.  In the test case it actually gets a
1.0 correlation score because it's a constant, but I suppose that in your
real example the leading column is well correlated with the row insertion
order.

The planner is favoring the multicolumn index more than it should because
of the correlation, which is an estimation bug that I've started a thread
about on pgsql-hackers.

In the meantime, though, it seems that the issue only occurs for a
limited range of random_page_cost settings.  At high random_page_cost the
index page fetch cost increases enough to make the "wrong" index not be
favored, and at low random_page_cost the index per-tuple costs do the
same.  So as a workaround you might look into whether your fully-cached
situation is modeled better by having both random_page_cost and
seq_page_cost less than 1.0.  In the past we've seen some evidence that
setting them both to 0.1 or so produces results that are more accurate
for databases that are fully cached in RAM.  (Alternatively you can jack
up the various CPU costs, but there are more of them to mess with.)

			regards, tom lane
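A minimal sketch of trying that workaround at the session level, using the same query as at the top of the thread (the 0.1 values are only the ballpark mentioned above, not something tuned for this server):

-- experiment in the current session only
SET seq_page_cost = 0.1;
SET random_page_cost = 0.1;
EXPLAIN ANALYZE
    SELECT count(*) FROM resume
    WHERE resume.last_change_time > '2008-09-01 00:00:00';
-- RESET seq_page_cost; RESET random_page_cost;  -- to go back to the server defaults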