Richard Guo <riguo@pivotal.io> writes:
> In this patch, we are trying to do the similar deduction, from
> non-equivalence clauses, that is, A=B AND f(A) implies A=B AND f(A)
> and f(B), under some restrictions on f.
Uh, *what* restrictions on f()? In general the above equivalence does not hold, at least not for any data type more complicated than integers; and we do not have any semantic model for deciding which functions it would be correct for.
Exactly! The operator in f() should be at least in the same opfamily as the equivalence class containing A,B.
Besides, as far as I can tell, the clause f() should not contain volatile functions or subplans. I am not sure whether these restrictions are enough to make it safe.
One simple example to show what I'm talking about is that float8 zero and minus zero are equal according to float8eq (assuming IEEE float arithmetic); but they aren't equivalent for any function f() that is sensitive to the sign or the text representation of the value. The numeric data type likewise has values that are "equal" without being identical for all purposes, eg 0.0 vs 0.000. Or consider citext.
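A quick way to see this point in psql is sketched below. It assumes a stock PostgreSQL build with IEEE floats; the column aliases are only illustrative.

```sql
-- float8: the two zeros compare equal, but their text forms differ.
select '0'::float8 = '-0'::float8        as float_equal,
       '0'::float8::text                 as plus_zero_text,
       '-0'::float8::text                as minus_zero_text;

-- numeric: equal values can carry different display scales.
select 0.0::numeric = 0.000::numeric     as numeric_equal,
       0.0::numeric::text                as short_text,
       0.000::numeric::text              as long_text;
```

In both queries the equality column should be true while the two text columns differ, so any f() that looks at the text form can tell the "equal" values apart.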
Thanks for the example. Heikki turned it into a concrete test case:
create table a (f float8);
create table b (f float8);
insert into a values ('0'), ('-0');
insert into b values ('0'), ('-0');
select * from a, b where a.f = b.f and a.f::text <> '-0';
Running that query with this patch gives a wrong result. I will address this in v2.
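To make the failure concrete, here is a sketch of what the deduction amounts to for this query. It assumes the patch derives the same clause on b.f from the one on a.f; the second query is written by hand to mimic that and is not actual planner output.

```sql
-- Original query: only a.f = 0 passes the text filter, and it joins to
-- both rows of b because 0 = -0 under float8eq. Expect 2 rows.
select * from a, b
where a.f = b.f and a.f::text <> '-0';

-- Hand-written equivalent of the (assumed) derived clause on b.f.
-- The row pairing a.f = 0 with b.f = -0 is now filtered out,
-- so only 1 row comes back.
select * from a, b
where a.f = b.f and a.f::text <> '-0' and b.f::text <> '-0';
```

The two queries differ in their results, which shows that the deduction is not valid for this f().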
The existing planner deduction rules for equivalence classes are carefully designed to ensure that we only generate derived clauses using operators from the same operator class or family, so that it's on the opclass author to ensure that the operators have consistent semantics. I don't think we can extrapolate from that to any random function that accepts the datatype.
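For reference, the operators that an opfamily vouches for can be listed from the catalogs. The sketch below assumes the btree family named float_ops, which is what a stock installation uses for float4/float8; adjust the names for other types.

```sql
-- List the operators that belong to the btree float_ops opfamily.
-- Only operators like these are covered by the opclass author's
-- consistency guarantee; arbitrary functions on float8 are not.
select o.amopstrategy,
       o.amopopr::regoperator as operator
from pg_amop o
join pg_opfamily f on f.oid = o.amopfamily
join pg_am m on m.oid = f.opfmethod
where m.amname = 'btree'
  and f.opfname = 'float_ops'
order by o.amoplefttype, o.amoprighttype, o.amopstrategy;
```

A cast to text followed by a comparison, as in the test case above, does not appear in this list, so nothing guarantees that it treats equal values alike.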