pgsql: Fix planning of non-strict equivalence clauses above outer joins - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Fix planning of non-strict equivalence clauses above outer joins
Date
Msg-id E1TOszc-0003UA-9X@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Fix planning of non-strict equivalence clauses above outer joins.

If a potential equivalence clause references a variable from the nullable
side of an outer join, the planner needs to take care that derived clauses
are not pushed to below the outer join; else they may use the wrong value
for the variable.  (The problem arises only with non-strict clauses, since
if an upper clause can be proven strict then the outer join will get
simplified to a plain join.)  The planner attempted to prevent this type
of error by checking that potential equivalence clauses aren't
outerjoin-delayed as a whole, but actually we have to check each side
separately, since the two sides of the clause will get moved around
separately if it's treated as an equivalence.  Bugs of this type can be
demonstrated as far back as 7.4, even though releases before 8.3 had only
a very ad-hoc notion of equivalence clauses.

In addition, we neglected to account for the possibility that such clauses
might have nonempty nullable_relids even when not outerjoin-delayed; so the
equivalence-class machinery lacked logic to compute correct nullable_relids
values for clauses it constructs.  This oversight was harmless before 9.2
because we were only using RestrictInfo.nullable_relids for OR clauses;
but as of 9.2 it could result in pushing constructed equivalence clauses
to incorrect places.  (This accounts for bug #7604 from Bill MacArthur.)

Fix the first problem by adding a new test check_equivalence_delay() in
distribute_qual_to_rels, and fix the second one by adding code in
equivclass.c and called functions to set correct nullable_relids for
generated clauses.  Although I believe the second part of this is not
currently necessary before 9.2, I chose to back-patch it anyway, partly to
keep the logic similar across branches and partly because it seems possible
we might find other reasons why we need valid values of nullable_relids in
the older branches.

Add regression tests illustrating these problems.  In 9.0 and up, also
add test cases checking that we can push constants through outer joins,
since we've broken that optimization before and I nearly broke it again
with an overly simplistic patch for this problem.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/72a4231f0c80f213a4fa75356dc3c6b7c7419059

Modified Files
--------------
src/backend/nodes/outfuncs.c            |    1 +
src/backend/optimizer/path/equivclass.c |   80 ++++++++++++++++++++------
src/backend/optimizer/plan/initsplan.c  |   95 +++++++++++++++++++++++++------
src/include/nodes/relation.h            |    1 +
src/include/optimizer/planmain.h        |    4 +-
src/test/regress/expected/join.out      |   71 +++++++++++++++++++++++
src/test/regress/sql/join.sql           |   32 ++++++++++
7 files changed, 248 insertions(+), 36 deletions(-)


pgsql-committers by date:

Previous
From: Alvaro Herrera
Date:
Subject: pgsql: pg_dump: Output functions deterministically sorted
Next
From: Tom Lane
Date:
Subject: pgsql: Fix planning of non-strict equivalence clauses above outer joins