Fix neqjoinsel's behavior for semi/anti join cases.
Previously, this function estimated the selectivity as 1 minus eqjoinsel()
for the negator equality operator, regardless of join type (I think there
was an expectation that eqjoinsel would handle the join type). But
actually this is completely wrong for semijoin cases: the fraction of the
LHS that has a non-matching row is not one minus the fraction of the LHS
that has a matching row. In reality a semijoin with <> will nearly always
succeed: it can only fail when the RHS is empty, or it contains a single
distinct value that is equal to the particular LHS value, or the LHS value
is null. The only one of those things we should have much confidence in
estimating is the fraction of LHS values that are null, so let's just take
the selectivity as 1 minus outer nullfrac.
Per coding convention, antijoin should be estimated the same as semijoin.
Arguably this is a bug fix, but in view of the lack of field complaints
and the risk of destabilizing plans, no back-patch.
Thomas Munro, reviewed by Ashutosh Bapat
Discussion: https://postgr.es/m/CAEepm=270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg@mail.gmail.com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/7ca25b7de6aefa5537e0dbe56541bc41c0464f97
Modified Files
--------------
src/backend/utils/adt/selfuncs.c | 70 +++++++++++++++++++++++++++++---------
src/test/regress/expected/join.out | 22 ++++++++++++
src/test/regress/sql/join.sql | 9 +++++
3 files changed, 85 insertions(+), 16 deletions(-)