Thread: Bad cost estimate with FALSE filter condition
All: This got posted to pgsql-bugs, but got no attention there[1], so I'm sending it to this list. Test case: createdb bench pgbench -i -s bench \c bench bench=# explain select * from pgbench_accounts where aid = 2; QUERY PLAN --------------------------------------------------------------- Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.42..8.44 rows=1 width=97) Index Cond: (aid = 2) (2 rows) bench=# explain select * from pgbench_accounts where aid = 2 and false; QUERY PLAN ------------------------------------------------- Result (cost=0.00..26394.00 rows=1 width=97) One-Time Filter: false -> Seq Scan on pgbench_accounts (cost=0.00..26394.00 rows=1 width=97) (3 rows) This seems like a special case of the "aborted plan cost", that is, when the planner expects to abort a plan early, it nevertheless returns the full cost for the non-aborted version of the query, rather than the working cost, which is based on the abort. For example: bench=# create index on pgbench_accounts(bid); CREATE INDEX bench=# explain select * from pgbench_accounts where bid = 2; QUERY PLAN ---------------------------------------------------------------------------------------------------------- Index Scan using pgbench_accounts_bid_idx on pgbench_accounts (cost=0.42..4612.10 rows=102667 width=97) Index Cond: (bid = 2) (2 rows) bench=# explain select * from pgbench_accounts where bid = 2 limit 1; QUERY PLAN -------------------------------------------------------------------------------- Limit (cost=0.00..0.28 rows=1 width=97) -> Seq Scan on pgbench_accounts (cost=0.00..28894.00 rows=102667 width=97) Filter: (bid = 2) (3 rows) So in this case, the top-level node returns a lower cost because the planner knows that it will find a row with bid=2 fairly quickly in the seq scan. But in the WHERE FALSE example, that scan *is* the top-level node, so the planner returns a fictitious cost for the whole query. Or is there something else at work here? [1] http://www.postgresql.org/message-id/20150225194953.2546.86836@wrigleys.postgresql.org -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
So ... should I assume my diagnosis is correct? Haven't heard any other suggestions. On 02/27/2015 05:28 PM, Josh Berkus wrote: > All: > > This got posted to pgsql-bugs, but got no attention there[1], so I'm > sending it to this list. > > Test case: > > createdb bench > pgbench -i -s bench > \c bench > > bench=# explain select * from pgbench_accounts where aid = 2; > QUERY PLAN > --------------------------------------------------------------- > Index Scan using pgbench_accounts_pkey on pgbench_accounts > (cost=0.42..8.44 rows=1 width=97) > Index Cond: (aid = 2) > (2 rows) > > bench=# explain select * from pgbench_accounts where aid = 2 and false; > > QUERY PLAN > ------------------------------------------------- > Result (cost=0.00..26394.00 rows=1 width=97) > One-Time Filter: false > -> Seq Scan on pgbench_accounts (cost=0.00..26394.00 rows=1 width=97) > (3 rows) > > This seems like a special case of the "aborted plan cost", that is, when > the planner expects to abort a plan early, it nevertheless returns the > full cost for the non-aborted version of the query, rather than the > working cost, which is based on the abort. > > For example: > > bench=# create index on pgbench_accounts(bid); > CREATE INDEX > bench=# explain select * from pgbench_accounts where bid = 2; > QUERY PLAN > > ---------------------------------------------------------------------------------------------------------- > Index Scan using pgbench_accounts_bid_idx on pgbench_accounts > (cost=0.42..4612.10 rows=102667 width=97) > Index Cond: (bid = 2) > (2 rows) > > bench=# explain select * from pgbench_accounts where bid = 2 limit 1; > QUERY PLAN > > -------------------------------------------------------------------------------- > Limit (cost=0.00..0.28 rows=1 width=97) > -> Seq Scan on pgbench_accounts (cost=0.00..28894.00 rows=102667 > width=97) > Filter: (bid = 2) > (3 rows) > > So in this case, the top-level node returns a lower cost because the > planner knows that it will find a row with bid=2 fairly quickly in the > seq scan. But in the WHERE FALSE example, that scan *is* the top-level > node, so the planner returns a fictitious cost for the whole query. > > Or is there something else at work here? > > [1] > http://www.postgresql.org/message-id/20150225194953.2546.86836@wrigleys.postgresql.org > -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > So ... should I assume my diagnosis is correct? Haven't heard any other > suggestions. I don't see any reason to think this is worth worrying about, or worth spending planner cycles on to produce a cosmetically nicer cost estimate. One-time filters always apply at the top plan level so they're unlikely to change any planner choices. Moreover, for any case other than the not-terribly-interesting constant FALSE case, we're better off assuming that the filter condition will be true (and so there's nothing to adjust). regards, tom lane
On 03/16/2015 11:26 AM, Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: >> So ... should I assume my diagnosis is correct? Haven't heard any other >> suggestions. > > I don't see any reason to think this is worth worrying about, or worth > spending planner cycles on to produce a cosmetically nicer cost estimate. I wouldn't say it's critical, but there's two issues: 1) users are confused when they see the plan, especially if it's chosen in preference to a lower-cost plan. It's counter-intuitive for EXPLAIN to not display the "real" estimated cost. 2) Tools which attempt to do some kind of useful aggregation or event handling around estimated plan cost have to write special workarounds for these cases. Is there anything *useful* about the existing behavior such that we'd like to preserve it? Or is it just a matter of Nobody's Submitted A Patch Yet? I ask because I'm thinking about a patch, so if changing this will break a lot of stuff, that's a good thing to know. > One-time filters always apply at the top plan level so they're unlikely > to change any planner choices. Moreover, for any case other than the > not-terribly-interesting constant FALSE case, we're better off assuming > that the filter condition will be true (and so there's nothing to adjust). Except that we *don't* get back the same estimate for a TRUE filter condition. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com