Thread: Problem with planner
Hi, we have 8.3.11 installation on client site, with table, which looks like this: $ \d objects Table "public.objects" Column | Type | Modifiers -------------------------+--------------------------+--------------------------------------------------------------- ... state | text | ... ending_tsz | timestamp with time zone | default (now() + '4 mons'::interval) ... Indexes: "objects_stat_user_id_creation_tsz" btree (state, user_id, creation_tsz) "objects_ending_tsz_active" btree (ending_tsz) WHERE state = 'active'::text "objects_ending_tsz_idx" btree (ending_tsz) ... and we have a query: select count(*) from objects where state='active'::text and ending_tsz <= ( select now() - '1 day'::interval ); Normally this query has been getting plan, using objects_ending_tsz_active, which is sane and fast. But today, without any sensible reason, it switched to: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=6719810.62..6719810.63 rows=1 width=0) InitPlan -> Result (cost=0.00..0.01 rows=1 width=0) -> Bitmap Heap Scan on objects (cost=1289719.58..6711422.56 rows=3355219 width=0) Recheck Cond: ((state = 'active'::text) AND (ending_tsz <= $0)) -> BitmapAnd (cost=1289719.58..1289719.58 rows=3355219 width=0) -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..334318.95 rows=10065657 width=0) Index Cond: (state = 'active'::text) -> Bitmap Index Scan on objects_ending_tsz_idx (cost=0.00..953722.77 rows=24986738 width=0) Index Cond: (ending_tsz <= $0) (10 rows) running analyze objects 2 times in a row fixed the issue, but hour later - the problem came back. what can be wrong? Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
hubert depesz lubaczewski <depesz@depesz.com> writes: > and we have a query: > select count(*) from objects where state='active'::text and ending_tsz <= ( select now() - '1 day'::interval ); Try getting rid of the sub-select. There might have been a reason to do it like that ten years ago, but these days it's a guaranteed pessimization. regards, tom lane
On Mon, Aug 08, 2011 at 12:51:40PM -0400, Tom Lane wrote: > hubert depesz lubaczewski <depesz@depesz.com> writes: > > and we have a query: > > select count(*) from objects where state='active'::text and ending_tsz <= ( select now() - '1 day'::interval ); > > Try getting rid of the sub-select. There might have been a reason to do > it like that ten years ago, but these days it's a guaranteed > pessimization. looks like even worse plan: QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Aggregate (cost=9168023.89..9168023.90 rows=1 width=0) -> Bitmap Heap Scan on objects (cost=336429.41..9147670.19 rows=8141478 width=0) Recheck Cond: (state = 'active'::text) Filter: (ending_tsz <= (now() - '1 day'::interval)) -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..334394.04 rows=10064868 width=0) Index Cond: (state = 'active'::text) (6 rows) Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
Try to change index: objects_endings_tsz_active(state,endings_tsz) where state='active'. 2011/8/8, hubert depesz lubaczewski <depesz@depesz.com>: > Hi, > we have 8.3.11 installation on client site, with table, which looks like > this: > > $ \d objects > Table "public.objects" > Column | Type | > Modifiers > -------------------------+--------------------------+--------------------------------------------------------------- > ... > state | text | > ... > ending_tsz | timestamp with time zone | default (now() + '4 > mons'::interval) > ... > Indexes: > "objects_stat_user_id_creation_tsz" btree (state, user_id, creation_tsz) > "objects_ending_tsz_active" btree (ending_tsz) WHERE state = > 'active'::text > "objects_ending_tsz_idx" btree (ending_tsz) > ... > > > and we have a query: > select count(*) from objects where state='active'::text and ending_tsz <= ( > select now() - '1 day'::interval ); > > Normally this query has been getting plan, using > objects_ending_tsz_active, which is sane and fast. > > But today, without any sensible reason, it switched to: > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=6719810.62..6719810.63 rows=1 width=0) > InitPlan > -> Result (cost=0.00..0.01 rows=1 width=0) > -> Bitmap Heap Scan on objects (cost=1289719.58..6711422.56 > rows=3355219 width=0) > Recheck Cond: ((state = 'active'::text) AND (ending_tsz <= $0)) > -> BitmapAnd (cost=1289719.58..1289719.58 rows=3355219 width=0) > -> Bitmap Index Scan on objects_stat_user_id_creation_tsz > (cost=0.00..334318.95 rows=10065657 width=0) > Index Cond: (state = 'active'::text) > -> Bitmap Index Scan on objects_ending_tsz_idx > (cost=0.00..953722.77 rows=24986738 width=0) > Index Cond: (ending_tsz <= $0) > (10 rows) > > running analyze objects 2 times in a row fixed the issue, but hour later > - the problem came back. > > what can be wrong? > > Best regards, > > depesz > > -- > The best thing about modern society is how easy it is to avoid contact with > it. > > http://depesz.com/ > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- ------------ pasman
On Mon, Aug 08, 2011 at 10:01:58PM +0200, pasman pasmański wrote: > Try to change index: objects_endings_tsz_active(state,endings_tsz) > where state='active'. adding new indexes on this machine is quite complicated, as it's heavily used production system. And the current index should be chosen by planner anyway - it is after analyze, but then it breaks again. Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
2011/8/8 hubert depesz lubaczewski <depesz@depesz.com>: > Hi, > we have 8.3.11 installation on client site, with table, which looks like > this: > > $ \d objects > Table "public.objects" > Column | Type | Modifiers > -------------------------+--------------------------+--------------------------------------------------------------- > ... > state | text | > ... > ending_tsz | timestamp with time zone | default (now() + '4 mons'::interval) > ... > Indexes: > "objects_stat_user_id_creation_tsz" btree (state, user_id, creation_tsz) > "objects_ending_tsz_active" btree (ending_tsz) WHERE state = 'active'::text > "objects_ending_tsz_idx" btree (ending_tsz) > ... > > > and we have a query: > select count(*) from objects where state='active'::text and ending_tsz <= ( select now() - '1 day'::interval ); > > Normally this query has been getting plan, using > objects_ending_tsz_active, which is sane and fast. > > But today, without any sensible reason, it switched to: > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=6719810.62..6719810.63 rows=1 width=0) > InitPlan > -> Result (cost=0.00..0.01 rows=1 width=0) > -> Bitmap Heap Scan on objects (cost=1289719.58..6711422.56 rows=3355219 width=0) > Recheck Cond: ((state = 'active'::text) AND (ending_tsz <= $0)) > -> BitmapAnd (cost=1289719.58..1289719.58 rows=3355219 width=0) > -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..334318.95 rows=10065657 width=0) > Index Cond: (state = 'active'::text) > -> Bitmap Index Scan on objects_ending_tsz_idx (cost=0.00..953722.77 rows=24986738 width=0) > Index Cond: (ending_tsz <= $0) > (10 rows) > > running analyze objects 2 times in a row fixed the issue, but hour later > - the problem came back. > > what can be wrong? The plan turn bad without any new ANALYZE, right ? does the table increase more quickly now than before ? is it now way larger than before ? Also, do you have an explain with the 'good' plan ? -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Tue, Aug 09, 2011 at 05:11:09PM +0200, Cédric Villemain wrote: > The plan turn bad without any new ANALYZE, right ? Right. > does the table increase more quickly now than before ? is it now way > larger than before ? > Also, do you have an explain with the 'good' plan ? changes in the objects table are more or less the same (in volume) day to day. as for good plan. sure. If i'll disable bitmap scans, I get: $ explain analyze select count(*) from objects where state='active' and ending_tsz <= (select now() - '1 day'::interval ); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=7651119.35..7651119.36 rows=1 width=0) (actual time=63.150..63.151 rows=1 loops=1) InitPlan -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.010..0.010 rows=1 loops=1) -> Index Scan using objects_ending_tsz_active on objects (cost=0.00..7642758.99 rows=3344138 width=0) (actual time=63.131..63.131rows=0 loops=1) Index Cond: (ending_tsz <= $0) Total runtime: 63.279 ms (6 rows) for comparison, normal plan, with enable_bitmapscan = true: $ explain select count(*) from objects where state='active' and ending_tsz <= (select now() - '1 day'::interval ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=6726333.89..6726333.90 rows=1 width=0) InitPlan -> Result (cost=0.00..0.01 rows=1 width=0) -> Bitmap Heap Scan on objects (cost=1295618.40..6717973.52 rows=3344138 width=0) Recheck Cond: ((state = 'active'::text) AND (ending_tsz <= $0)) -> BitmapAnd (cost=1295618.40..1295618.40 rows=3344138 width=0) -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..333925.70 rows=10032414 width=0) Index Cond: (state = 'active'::text) -> Bitmap Index Scan on objects_ending_tsz_idx (cost=0.00..960020.38 rows=25015994 width=0) Index Cond: (ending_tsz <= $0) (10 rows) Now. Interesting is, that if i'll change the query like Tom suggested, things go really bad: $ set enable_bitmapscan = true; SET $ explain select count(*) from objects where state='active' and ending_tsz <= now() - '1 day'::interval; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Aggregate (cost=9192258.28..9192258.29 rows=1 width=0) -> Bitmap Heap Scan on objects (cost=335966.69..9171848.45 rows=8163932 width=0) Recheck Cond: (state = 'active'::text) Filter: (ending_tsz <= (now() - '1 day'::interval)) -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..333925.70 rows=10032414 width=0) Index Cond: (state = 'active'::text) (6 rows) $ set enable_bitmapscan = false; SET $ explain select count(*) from objects where state='active' and ending_tsz <= now() - '1 day'::interval; QUERY PLAN -------------------------------------------------------------------------------------------- Aggregate (cost=10814335.47..10814335.48 rows=1 width=0) -> Seq Scan on objects (cost=0.00..10793925.64 rows=8163932 width=0) Filter: ((state = 'active'::text) AND (ending_tsz <= (now() - '1 day'::interval))) (3 rows) Current stats of the table: $ select * from pg_stat_user_tables where relname = 'objects'; -[ RECORD 1 ]----+------------------------------ relid | 71635994 schemaname | public relname | objects seq_scan | 181 seq_tup_read | 3164627085 idx_scan | 164923232565 idx_tup_fetch | 1359016133552 n_tup_ins | 31372199 n_tup_upd | 698411462 n_tup_del | 1 n_tup_hot_upd | 20426973 n_live_tup | 75016862 n_dead_tup | 494489 last_vacuum | 2011-03-31 06:15:39.866869+00 last_autovacuum | 2011-08-09 05:51:35.050683+00 last_analyze | 2011-08-09 03:30:14.986266+00 last_autoanalyze | 2010-09-27 05:10:10.793584+00 Best regards, depesz
hubert depesz lubaczewski <depesz@depesz.com> writes: > as for good plan. sure. If i'll disable bitmap scans, I get: > $ explain analyze select count(*) from objects where state='active' and ending_tsz <= (select now() - '1 day'::interval); > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=7651119.35..7651119.36 rows=1 width=0) (actual time=63.150..63.151 rows=1 loops=1) > InitPlan > -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.010..0.010 rows=1 loops=1) > -> Index Scan using objects_ending_tsz_active on objects (cost=0.00..7642758.99 rows=3344138 width=0) (actual time=63.131..63.131rows=0 loops=1) > Index Cond: (ending_tsz <= $0) > Total runtime: 63.279 ms > (6 rows) There's absolutely no hope of getting the planner to make intelligent plan choices when its estimates are that far off (3344138 rows estimated versus none actual). It's just luck if you get the "right" plan despite that. Now part of the reason why the estimate is so bad is the sub-select, which completely defeats selectivity estimation for the ending_tsz condition. Could we see EXPLAIN ANALYZE for the query without a sub-select, both with and without enable_bitmapscan? regards, tom lane
On Tue, Aug 09, 2011 at 12:11:48PM -0400, Tom Lane wrote: > Now part of the reason why the estimate is so bad is the sub-select, > which completely defeats selectivity estimation for the ending_tsz > condition. Could we see EXPLAIN ANALYZE for the query without a > sub-select, both with and without enable_bitmapscan? working on it. Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
On Tue, Aug 09, 2011 at 12:11:48PM -0400, Tom Lane wrote: > There's absolutely no hope of getting the planner to make intelligent > plan choices when its estimates are that far off (3344138 rows estimated > versus none actual). It's just luck if you get the "right" plan despite > that. > Now part of the reason why the estimate is so bad is the sub-select, > which completely defeats selectivity estimation for the ending_tsz > condition. Could we see EXPLAIN ANALYZE for the query without a > sub-select, both with and without enable_bitmapscan? version with disabled bitmapscans: $ explain analyze select count(*) from objects where state='active' and ending_tsz <= now() - '1 day'::interval; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=10815555.72..10815555.73 rows=1 width=0) (actual time=1416200.548..1416200.548 rows=1 loops=1) -> Seq Scan on objects (cost=0.00..10795673.36 rows=7952943 width=0) (actual time=1210074.356..1416200.498 rows=13 loops=1) Filter: ((state = 'active'::text) AND (ending_tsz <= (now() - '1 day'::interval))) Total runtime: 1416200.678 ms (4 rows) version with enabled bitmap scans is still working on (3 hours now), will post explain analyze when it finishes. explain (with enabled bitmap scans) looks like this: $ explain select count(*) from objects where state='active' and ending_tsz <= now() - '1 day'::interval; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Aggregate (cost=9117480.50..9117480.51 rows=1 width=0) -> Bitmap Heap Scan on objects (cost=326375.21..9097597.40 rows=7953240 width=0) Recheck Cond: (state = 'active'::text) Filter: (ending_tsz <= (now() - '1 day'::interval)) -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..324386.90 rows=9754574 width=0) Index Cond: (state = 'active'::text) (6 rows) value of state and ending_tsz are pretty correlated - it's very unlikely to have rows matching both criteria (state = 'active' and ending_tsz <= now() - '1 day'). Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
hubert depesz lubaczewski <depesz@depesz.com> writes: > version with disabled bitmapscans: > $ explain analyze select count(*) from objects where state='active' and ending_tsz <= now() - '1 day'::interval; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=10815555.72..10815555.73 rows=1 width=0) (actual time=1416200.548..1416200.548 rows=1 loops=1) > -> Seq Scan on objects (cost=0.00..10795673.36 rows=7952943 width=0) (actual time=1210074.356..1416200.498 rows=13loops=1) > Filter: ((state = 'active'::text) AND (ending_tsz <= (now() - '1 day'::interval))) > Total runtime: 1416200.678 ms > (4 rows) Hmm, so it's not using the index at all here. The problem clearly is that the rowcount estimate is still completely bogus :-(, even though it's presumably getting a reasonable estimate now for the ending_tsz test in isolation. I suppose what's going on here is that the "state" and "ending_tsz" columns are highly correlated, such that there are lots of 'active' items but hardly any of them ended more than a day ago? If so, you're going to have to rethink the representation somehow to get good results, because there's no way the planner will see this until we have cross-column stats in some form. The least invasive fix that I can think of offhand is to set up an index (non-partial) on the expression case when state = 'active' then ending_tsz else null end and phrase the query as WHERE (case when state = 'active' then ending_tsz else null end) <= (now() - '1 day'::interval) This should result in condensing the stats about active items' ending_tsz into a format the planner can deal with, assuming you're running a PG version that will keep and use stats on expression indexes. regards, tom lane
On Tue, Aug 09, 2011 at 04:08:39PM -0400, Tom Lane wrote: > I suppose what's going on here is that the "state" and "ending_tsz" > columns are highly correlated, such that there are lots of 'active' > items but hardly any of them ended more than a day ago? If so, yes, that's correct. > you're going to have to rethink the representation somehow to get > good results, because there's no way the planner will see this until > we have cross-column stats in some form. > > The least invasive fix that I can think of offhand is to set up an > index (non-partial) on the expression > > case when state = 'active' then ending_tsz else null end > > and phrase the query as > > WHERE (case when state = 'active' then ending_tsz else null end) <= (now() - '1 day'::interval) > > This should result in condensing the stats about active items' > ending_tsz into a format the planner can deal with, assuming > you're running a PG version that will keep and use stats on > expression indexes. it's 8.3.11. I solved the problem by adding "enable_bitmapscan = false" (and keeping the query in original format, with subselect) which caused the plan to be ok. but I'm much more interested to understand why pg chooses *not* to use index which is tailored specifically for the query - it has exactly matching where clause, and it indexes the column that we use for comparison. the thing is - i solved the problem for now. I might add new index the way you suggest, and it might help. but it's is very unnerving that postgresql will just choose to ignore specially made index, perfectly matching the criteria in query. since I can't test it - is there any chance (Cédric suggested something like this) that some newer version has more logic to try harder to use best index? Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
hubert depesz lubaczewski <depesz@depesz.com> writes: > I solved the problem by adding "enable_bitmapscan = false" (and keeping > the query in original format, with subselect) which caused the plan to > be ok. I doubt that solution is any more robust than what you had before --- in particular, it's likely to fall back to seqscans. > but I'm much more interested to understand why pg chooses *not* to use > index which is tailored specifically for the query - it has exactly > matching where clause, and it indexes the column that we use for > comparison. Because the planner thinks it will have to pull a huge number of rows from the index. Whether the index is "tailored" for the query is irrelevant if it looks more expensive to use than a seqscan. regards, tom lane
On Tue, Aug 09, 2011 at 05:18:38PM -0400, Tom Lane wrote: > hubert depesz lubaczewski <depesz@depesz.com> writes: > > I solved the problem by adding "enable_bitmapscan = false" (and keeping > > the query in original format, with subselect) which caused the plan to > > be ok. > > I doubt that solution is any more robust than what you had before --- > in particular, it's likely to fall back to seqscans. > > > but I'm much more interested to understand why pg chooses *not* to use > > index which is tailored specifically for the query - it has exactly > > matching where clause, and it indexes the column that we use for > > comparison. > > Because the planner thinks it will have to pull a huge number of rows > from the index. Whether the index is "tailored" for the query > is irrelevant if it looks more expensive to use than a seqscan. The query with enabled bitmap scans finally finished: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=9117479.68..9117479.69 rows=1 width=0) (actual time=24964197.316..24964197.317 rows=1 loops=1) -> Bitmap Heap Scan on objects (cost=326375.14..9097597.32 rows=7952942 width=0) (actual time=24804152.598..24964197.207rows=48 loops=1) Recheck Cond: (state = 'active'::text) Filter: (ending_tsz <= (now() - '1 day'::interval)) -> Bitmap Index Scan on objects_stat_user_id_creation_tsz (cost=0.00..324386.90 rows=9754574 width=0) (actualtime=24788705.254..24788705.254 rows=10582798 loops=1) Index Cond: (state = 'active'::text) Total runtime: 24964211.224 ms (7 rows) In the mean time, based on your other mail, I created this index: create index concurrently depesz_test_idx on objects (state, ending_tsz) WHERE state = 'active'::text; and now it's being used by plans: QUERY PLAN ------------------------------------------------------------------------------------------------------ Aggregate (cost=8510568.24..8510568.25 rows=1 width=0) -> Bitmap Heap Scan on objects (cost=183260.11..8490232.49 rows=8134302 width=0) Recheck Cond: ((state = 'active'::text) AND (ending_tsz <= (now() - '1 day'::interval))) -> Bitmap Index Scan on depesz_test_idx (cost=0.00..181226.54 rows=8134302 width=0) Index Cond: ((state = 'active'::text) AND (ending_tsz <= (now() - '1 day'::interval))) (5 rows) but when I'll disable bitmap scans it reverts back to seq scan. Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
2011/8/9 hubert depesz lubaczewski <depesz@depesz.com>: > On Tue, Aug 09, 2011 at 04:08:39PM -0400, Tom Lane wrote: >> I suppose what's going on here is that the "state" and "ending_tsz" >> columns are highly correlated, such that there are lots of 'active' >> items but hardly any of them ended more than a day ago? If so, > > yes, that's correct. > >> you're going to have to rethink the representation somehow to get >> good results, because there's no way the planner will see this until >> we have cross-column stats in some form. >> >> The least invasive fix that I can think of offhand is to set up an >> index (non-partial) on the expression >> >> case when state = 'active' then ending_tsz else null end >> >> and phrase the query as >> >> WHERE (case when state = 'active' then ending_tsz else null end) <= (now() - '1 day'::interval) >> >> This should result in condensing the stats about active items' >> ending_tsz into a format the planner can deal with, assuming >> you're running a PG version that will keep and use stats on >> expression indexes. > > it's 8.3.11. > I solved the problem by adding "enable_bitmapscan = false" (and keeping > the query in original format, with subselect) which caused the plan to > be ok. > > but I'm much more interested to understand why pg chooses *not* to use > index which is tailored specifically for the query - it has exactly > matching where clause, and it indexes the column that we use for > comparison. > > the thing is - i solved the problem for now. I might add new index the > way you suggest, and it might help. but it's is very unnerving that > postgresql will just choose to ignore specially made index, perfectly > matching the criteria in query. > > since I can't test it - is there any chance (Cédric suggested something > like this) that some newer version has more logic to try harder to use > best index? I wondered if it is the same logic to choose between bitmap and indexscan in both 8.3 and HEAD. It looks like it is (except that now you can put the not-wanted index on another tablepsace and increase the cost of accessing it, which is another no-so-pretty way to workaround the issue). > > Best regards, > > depesz > > -- > The best thing about modern society is how easy it is to avoid contact with it. > http://depesz.com/ > -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation