Re: Redundant filter in index scan with a bool column - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Redundant filter in index scan with a bool column
Date
Msg-id 27810.1547651110@sss.pgh.pa.us
Whole thread Raw
In response to Redundant filter in index scan with a bool column  (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
List pgsql-hackers
Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
> The filter is not needed, why is it there? Turns out we can't recognize 
> that the restriction clause 'b' and the index clause 'b = true' are 
> equivalent.

Yeah, it's intentional that we don't get rid of the extra clause;
it doesn't really seem worth the expense and complexity to do so.
Indexes on bool columns are a pretty niche case in the first place.
Usually, if you are interested in just the rows where b = true,
you're better off using "where b" as an index predicate.  In your
example, we can do this instead:

regression=# create index on t(i) where b;
CREATE INDEX
regression=# explain select * from t where i = 300 and b;
                            QUERY PLAN                            
------------------------------------------------------------------
 Index Scan using t_i_idx on t  (cost=0.12..24.19 rows=6 width=5)
   Index Cond: (i = 300)
(2 rows)

resulting in a much smaller index, if the b=true condition is selective
enough to be worth indexing.  Even in the case you showed, how much is
the redundant filter clause really costing?

> My first reaction was to patch operator_predicate_proof to 
> handle this case, but there is a more straightforward way: mark the 
> expanded index clause as potentially redundant when it is generated in 
> expand_indexqual_conditions. There is already RestrictInfo.parent_ec 
> that is used to mark redundant EC-derived join clauses. The patch 
> renames it to rinfo_parent and uses it to mark the expanded index 
> clauses.

That's an unbelievable hack that almost certainly breaks existing uses.

The approach of teaching predtest.c that "b = true" implies "b" would
work, but it seems a bit brute-force because ordinarily such cases
would never be seen there, thanks to simplify_boolean_equality having
canonicalized the former into the latter.  The problem we have is that
indxpath.c re-generates "b = true" in indexscan conditions.  Thinking
about it now, I wonder if we could postpone that conversion till later,
say do it in create_indexscan_plan after having checked for redundant
clauses.  Not sure how messy that'd be.

            regards, tom lane


pgsql-hackers by date:

Previous
From: James Coleman
Date:
Subject: Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's
Next
From: Shay Rojansky
Date:
Subject: Re: [PATCH] Allow UNLISTEN during recovery