pgsql: Revisit handling of UNION ALL subqueries with non-Var output col - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Revisit handling of UNION ALL subqueries with non-Var output col
Msg-id E1S8ahj-0004JJ-Rh@gemulon.postgresql.org
List pgsql-committers
Revisit handling of UNION ALL subqueries with non-Var output columns.

In commit 57664ed25e5dea117158a2e663c29e60b3546e1c I tried to fix a bug
reported by Teodor Sigaev by making non-simple-Var output columns distinct
(by wrapping their expressions with dummy PlaceHolderVar nodes).  This did
not work too well.  Commit b28ffd0fcc583c1811e5295279e7d4366c3cae6c fixed
some ensuing problems with matching to child indexes, but per a recent
report from Claus Stadler, constraint exclusion of UNION ALL subqueries was
still broken, because constant-simplification didn't handle the injected
PlaceHolderVars well either.  On reflection, the original patch was quite
misguided: there is no reason to expect that EquivalenceClass child members
will be distinct.  So instead of trying to make them so, we should ensure
that we can cope with the situation when they're not.

Accordingly, this patch reverts the code changes in the above-mentioned
commits (though the regression test cases they added stay).  Instead, I've
added assorted defenses to make sure that duplicate EC child members don't
cause any problems.  Teodor's original problem ("MergeAppend child's
targetlist doesn't match MergeAppend") is addressed more directly by
revising prepare_sort_from_pathkeys to let the parent MergeAppend's sort
list guide creation of each child's sort list.

In passing, get rid of add_sort_column; as far as I can tell, testing for
duplicate sort keys at this stage is dead code.  Certainly it doesn't
trigger often enough to be worth expending cycles on in ordinary queries.
And keeping the test would've greatly complicated the new logic in
prepare_sort_from_pathkeys, because comparing pathkey list entries against
a previous output array requires that we not skip any entries in the list.
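The alignment requirement can be illustrated with a toy model.  This is
only a sketch under stated assumptions, not the code in createplan.c:
ToyTargetList, first_matching_col and toy_sort_cols are hypothetical names,
equivalence classes are reduced to integer ids, and a target list is reduced
to an array giving the class of each column.  When req_cols is supplied
(standing in for the parent MergeAppend's sort columns), entry i of the
child's list is taken from entry i of the parent's list, which only works
if neither list skips a pathkey.

```c
#include <stddef.h>

#define MAX_COLS 8

/* Toy target list: ec_of_col[i] is the equivalence-class id of column i+1. */
typedef struct
{
    int ncols;
    int ec_of_col[MAX_COLS];
} ToyTargetList;

/* Return the first 1-based column whose class matches ec_id, or 0 if none. */
static int
first_matching_col(const ToyTargetList *tl, int ec_id)
{
    for (int i = 0; i < tl->ncols; i++)
    {
        if (tl->ec_of_col[i] == ec_id)
            return i + 1;
    }
    return 0;
}

/*
 * Fill out_cols with exactly one sort column per pathkey, never skipping
 * an entry, so that position i always corresponds to pathkey i.
 *
 * If req_cols is non-NULL (hypothetical stand-in for the parent's sort
 * list), pathkey i must use column req_cols[i], and we only verify that
 * the column belongs to the right class.  Otherwise the first matching
 * column is chosen.  Returns 0 on success, -1 if a key cannot be satisfied.
 */
static int
toy_sort_cols(const int *pathkey_ecs, int nkeys,
              const ToyTargetList *tl,
              const int *req_cols, int *out_cols)
{
    for (int i = 0; i < nkeys; i++)
    {
        int col;

        if (req_cols != NULL)
        {
            col = req_cols[i];
            if (col < 1 || col > tl->ncols ||
                tl->ec_of_col[col - 1] != pathkey_ecs[i])
                return -1;
        }
        else
        {
            col = first_matching_col(tl, pathkey_ecs[i]);
            if (col == 0)
                return -1;
        }
        out_cols[i] = col;
    }
    return 0;
}
```

In this model, take a parent whose columns are in classes {10, 20}, a child
whose columns are both in class 20 (duplicate members), and the pathkeys
{20, 20}.  The parent chooses column 2 for both keys.  Left to itself the
child chooses column 1, which disagrees with the parent; guided by the
parent's list it chooses column 2.  Had duplicate keys been dropped from
the parent's list, entry i of that list would no longer line up with
pathkey i.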

Back-patch to 9.1, like the previous patches.  The only known issue in
this area that wasn't caused by the ill-advised previous patches was the
MergeAppend planning failure, which of course is not relevant before 9.1.
It's possible that we need some of the new defenses against duplicate child
EC entries in older branches, but until there's some clear evidence of that
I'm going to refrain from back-patching further.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/dd4134ea56cb8855aad3988febc45eca28851cd8

Modified Files
--------------
src/backend/optimizer/README              |    8 +
src/backend/optimizer/path/equivclass.c   |   44 +++-
src/backend/optimizer/path/indxpath.c     |   18 +-
src/backend/optimizer/path/pathkeys.c     |   17 ++-
src/backend/optimizer/plan/createplan.c   |  381 +++++++++++++++--------------
src/backend/optimizer/plan/planagg.c      |   13 +-
src/backend/optimizer/plan/planmain.c     |    2 +-
src/backend/optimizer/prep/prepjointree.c |   36 +--
src/backend/optimizer/util/placeholder.c  |   27 +--
src/include/nodes/relation.h              |   18 +-
src/include/optimizer/paths.h             |    1 +
src/include/optimizer/placeholder.h       |    2 +-
src/test/regress/expected/inherit.out     |   57 +++++-
src/test/regress/expected/union.out       |   14 +
src/test/regress/sql/inherit.sql          |   23 ++-
src/test/regress/sql/union.sql            |    8 +
16 files changed, 411 insertions(+), 258 deletions(-)

