Thread: UNION ALL has higher cost than inheritance

UNION ALL has higher cost than inheritance

From

Itagaki Takahiro

Date:

21 October 2010, 02:02:48

I found an explicit UNION ALL has higher cost than an automatic expansion
by inheritance (49 vs. 83 in the example below). Where does the difference
come from?  Since they have almost same plan trees, should it be the same cost?

=# CREATE TABLE parent (i integer);
=# CREATE TABLE child () INHERITS (parent);
=# INSERT INTO child SELECT generate_series(1, 1000);
=# CREATE INDEX ON child (i);
=# ANALYZE;

=# EXPLAIN SELECT * FROM parent;                                QUERY PLAN
----------------------------------------------------------------------------Result  (cost=0.00..49.00 rows=3400
width=4) ->  Append  (cost=0.00..49.00 rows=3400 width=4)        ->  Seq Scan on parent  (cost=0.00..34.00 rows=2400
width=4)       ->  Seq Scan on child parent  (cost=0.00..15.00 rows=1000 width=4)
 
(4 rows)

=# EXPLAIN SELECT * FROM ONLY parent UNION ALL SELECT * FROM child;                          QUERY PLAN
----------------------------------------------------------------Append  (cost=0.00..83.00 rows=3400 width=4)  ->  Seq
Scanon parent  (cost=0.00..34.00 rows=2400 width=4)  ->  Seq Scan on child  (cost=0.00..15.00 rows=1000 width=4)
 
(3 rows)

-- 
Itagaki Takahiro

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

21 October 2010, 02:18:15

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> I found an explicit UNION ALL has higher cost than an automatic expansion
> by inheritance (49 vs. 83 in the example below). Where does the difference
> come from?

The plan for UNION initially involves a couple of SubqueryScan nodes,
which impose an extra cost of cpu_tuple_cost per tuple.  Those later
get optimized away, but we don't try to readjust the cost estimates
for that.
        regards, tom lane

Re: UNION ALL has higher cost than inheritance

From

Itagaki Takahiro

Date:

21 October 2010, 03:16:24

On Thu, Oct 21, 2010 at 2:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The plan for UNION initially involves a couple of SubqueryScan nodes,
> which impose an extra cost of cpu_tuple_cost per tuple.  Those later
> get optimized away, but we don't try to readjust the cost estimates
> for that.

Thanks. It also explains my another question why Merge Append cannot
be used for UNION ALL plans. Inheritance is better than UNION ALL
in much more cases thanks to Merge Append.

=# EXPLAIN SELECT * FROM parent ORDER BY i LIMIT 10;                                             QUERY PLAN
------------------------------------------------------------------------------------------------------Limit
(cost=1.02..1.58rows=10 width=4)  ->  Result  (cost=1.02..56.79 rows=1001 width=4)        ->  Merge Append
(cost=1.02..56.79rows=1001 width=4)              Sort Key: public.parent.i              ->  Sort  (cost=1.01..1.01
rows=1width=4)                    Sort Key: public.parent.i                    ->  Seq Scan on parent  (cost=0.00..1.00
rows=1width=4)              ->  Index Scan using child_i_idx on child parent 
(cost=0.00..43.25 rows=1000 width=4)
(8 rows)

=# EXPLAIN (SELECT * FROM ONLY parent ORDER BY i) UNION ALL (SELECT *
FROM child ORDER BY i) ORDER BY i LIMIT 10;                                         QUERY PLAN
-----------------------------------------------------------------------------------------------Limit
(cost=75.91..75.93rows=10 width=4)  ->  Sort  (cost=75.91..78.41 rows=1001 width=4)        Sort Key: parent.i        ->
Append  (cost=1.01..54.28 rows=1001 width=4)              ->  Sort  (cost=1.01..1.01 rows=1 width=4)
SortKey: parent.i                    ->  Seq Scan on parent  (cost=0.00..1.00 rows=1 width=4)              ->  Index
Scanusing child_i_idx on child 
(cost=0.00..43.25 rows=1000 width=4)
(8 rows)

--
Itagaki Takahiro

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

21 October 2010, 10:57:52

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> On Thu, Oct 21, 2010 at 2:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The plan for UNION initially involves a couple of SubqueryScan nodes,
>> which impose an extra cost of cpu_tuple_cost per tuple.  Those later
>> get optimized away, but we don't try to readjust the cost estimates
>> for that.

> Thanks. It also explains my another question why Merge Append cannot
> be used for UNION ALL plans.

Hmm, seems like the example you show ought to work.  I wonder if there
was an oversight in that patch...
        regards, tom lane

Re: UNION ALL has higher cost than inheritance

From

Greg Stark

Date:

21 October 2010, 15:11:36

On Thu, Oct 21, 2010 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thanks. It also explains my another question why Merge Append cannot
>> be used for UNION ALL plans.
>
> Hmm, seems like the example you show ought to work.  I wonder if there
> was an oversight in that patch...
>

Huh, that definitely worked in the earlier versions of the patch (as
much as it "worked" at all)

--
greg

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

21 October 2010, 18:19:31

Greg Stark <gsstark@mit.edu> writes:
> On Thu, Oct 21, 2010 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Thanks. It also explains my another question why Merge Append cannot
>>> be used for UNION ALL plans.

>> Hmm, seems like the example you show ought to work. �I wonder if there
>> was an oversight in that patch...

> Huh, that definitely worked in the earlier versions of the patch (as
> much as it "worked" at all)

Actually, it works as long as the UNION is in a subquery:

regression=# EXPLAIN select * from (
(SELECT * FROM ONLY parent ORDER BY i) UNION ALL
(SELECT * FROM child ORDER BY i)) ss ORDER BY i LIMIT 10;                                         QUERY PLAN
               
 
-----------------------------------------------------------------------------------------------Limit
(cost=168.76..169.13rows=10 width=4)  ->  Result  (cost=168.76..294.51 rows=3400 width=4)        ->  Merge Append
(cost=168.76..294.51rows=3400 width=4)              Sort Key: parent.i              ->  Sort  (cost=168.75..174.75
rows=2400width=4)                    Sort Key: parent.i                    ->  Seq Scan on parent  (cost=0.00..34.00
rows=2400width=4)              ->  Index Scan using child_i_idx on child  (cost=0.00..43.25 rows=1000 width=4)
 
(8 rows)

The oversight here is that we don't use appendrel planning for
a top-level UNION ALL construct.  That didn't use to matter,
because you always got the same stupid Append plan either way.
Now it seems like we ought to have some more intelligence for the
top-level SetOp case.  I smell some code refactoring coming up.
        regards, tom lane

Re: UNION ALL has higher cost than inheritance

From

"David E. Wheeler"

Date:

21 October 2010, 18:47:53

On Oct 21, 2010, at 2:17 PM, Tom Lane wrote:

> The oversight here is that we don't use appendrel planning for
> a top-level UNION ALL construct.  That didn't use to matter,
> because you always got the same stupid Append plan either way.
> Now it seems like we ought to have some more intelligence for the
> top-level SetOp case.  I smell some code refactoring coming up.

Does it smell like chicken?

Best,

David

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

08 November 2010, 00:00:21

I wrote:
> The oversight here is that we don't use appendrel planning for
> a top-level UNION ALL construct.  That didn't use to matter,
> because you always got the same stupid Append plan either way.
> Now it seems like we ought to have some more intelligence for the
> top-level SetOp case.  I smell some code refactoring coming up.

I did some hacking on this and came up with the attached patch, which
could use a bit more work on the comments but passes regression tests.
However, this just solves the issue of being smart about top-level
UNION ALL cases.  It might be worth looking into using MergeAppend
for the sorting required for other types of set operations.  That would
involve quite a different patch, and I'm not sure if it'd remove the
need for this one or not.

            regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 07301c77fbf4f2550367ea8ff58c685462835223..a1055978869c5072db6e3a172169d934062ae2aa 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** subquery_planner(PlannerGlobal *glob, Qu
*** 341,353 ****
      inline_set_returning_functions(root);

      /*
!      * Check to see if any subqueries in the rangetable can be merged into
       * this query.
       */
      parse->jointree = (FromExpr *)
          pull_up_subqueries(root, (Node *) parse->jointree, NULL, NULL);

      /*
       * Detect whether any rangetable entries are RTE_JOIN kind; if not, we can
       * avoid the expense of doing flatten_join_alias_vars().  Also check for
       * outer joins --- if none, we can skip reduce_outer_joins(). This must be
--- 341,362 ----
      inline_set_returning_functions(root);

      /*
!      * Check to see if any subqueries in the jointree can be merged into
       * this query.
       */
      parse->jointree = (FromExpr *)
          pull_up_subqueries(root, (Node *) parse->jointree, NULL, NULL);

      /*
+      * If this is a simple UNION ALL query, flatten it into an appendrel.
+      * We do this now because it requires applying pull_up_subqueries to the
+      * leaf queries of the UNION ALL, which weren't touched above because they
+      * aren't referenced by the jointree until we do this.
+      */
+     if (parse->setOperations)
+         flatten_union_all(root);
+
+     /*
       * Detect whether any rangetable entries are RTE_JOIN kind; if not, we can
       * avoid the expense of doing flatten_join_alias_vars().  Also check for
       * outer joins --- if none, we can skip reduce_outer_joins(). This must be
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index e337751328bf0b05e3025a2ecc302f754b766855..847e67959a8b81a5cf8ea55ae48d57ec98f32285 100644
*** a/src/backend/optimizer/prep/prepjointree.c
--- b/src/backend/optimizer/prep/prepjointree.c
***************
*** 7,12 ****
--- 7,13 ----
   *        pull_up_sublinks
   *        inline_set_returning_functions
   *        pull_up_subqueries
+  *        flatten_union_all
   *        do expression preprocessing (including flattening JOIN alias vars)
   *        reduce_outer_joins
   *
*************** pull_up_simple_union_all(PlannerInfo *ro
*** 869,879 ****
      List       *rtable;

      /*
-      * Append the subquery rtable entries to upper query.
-      */
-     rtoffset = list_length(root->parse->rtable);
-
-     /*
       * Append child RTEs to parent rtable.
       *
       * Upper-level vars in subquery are now one level closer to their parent
--- 870,875 ----
*************** pull_up_simple_union_all(PlannerInfo *ro
*** 881,886 ****
--- 877,883 ----
       * because any such vars must refer to stuff above the level of the query
       * we are pulling into.
       */
+     rtoffset = list_length(root->parse->rtable);
      rtable = copyObject(subquery->rtable);
      IncrementVarSublevelsUp_rtable(rtable, -1, 1);
      root->parse->rtable = list_concat(root->parse->rtable, rtable);
*************** pullup_replace_vars_callback(Var *var,
*** 1418,1423 ****
--- 1415,1518 ----
      return newnode;
  }

+
+ /*
+  * flatten_union_all
+  *        Try to optimize a top-level UNION ALL structure
+  *
+  * If a query's setOperations tree consists entirely of simple UNION ALL
+  * operations, flatten it into an append relation, which we can process more
+  * intelligently than the general set-ops case.
+  *
+  * In most cases, this can succeed only for a top-level query, because for a
+  * subquery in FROM, the parent query's invocation of pull_up_subqueries would
+  * already have flattened the UNION via pull_up_simple_union_all.  But there
+  * are a few cases we can support here but not in that code path, for example
+  * when the subquery also contains ORDER BY.
+  */
+ void
+ flatten_union_all(PlannerInfo *root)
+ {
+     Query       *parse = root->parse;
+     SetOperationStmt *topop;
+     Node       *leftmostjtnode;
+     int            leftmostRTI;
+     RangeTblEntry *leftmostRTE;
+     int            childRTI;
+     RangeTblEntry *childRTE;
+     RangeTblRef *rtr;
+
+     /* Shouldn't be called unless query has setops */
+     topop = (SetOperationStmt *) parse->setOperations;
+     Assert(topop && IsA(topop, SetOperationStmt));
+
+     /* Can't optimize away a recursive UNION */
+     if (root->hasRecursion)
+         return;
+
+     /*
+      * Recursively check the tree of set operations.  If not all UNION ALL
+      * with identical column types, punt.
+      */
+     if (!is_simple_union_all_recurse((Node *) topop, parse,
+                                      topop->colTypes))
+         return;
+
+     /*
+      * Locate the leftmost leaf query in the setops tree.  The upper query's
+      * Vars all refer to this RTE (see transformSetOperationStmt).
+      */
+     leftmostjtnode = topop->larg;
+     while (leftmostjtnode && IsA(leftmostjtnode, SetOperationStmt))
+         leftmostjtnode = ((SetOperationStmt *) leftmostjtnode)->larg;
+     Assert(leftmostjtnode && IsA(leftmostjtnode, RangeTblRef));
+     leftmostRTI = ((RangeTblRef *) leftmostjtnode)->rtindex;
+     leftmostRTE = rt_fetch(leftmostRTI, parse->rtable);
+     Assert(leftmostRTE->rtekind == RTE_SUBQUERY);
+
+     /*
+      * Make a copy of the leftmost RTE and add it to the rtable.  This copy
+      * will represent the leftmost leaf query in its capacity as a member
+      * of the appendrel.  The original will represent the appendrel as a
+      * whole.  (We must do things this way because the upper query's Vars
+      * have to be seen as referring to the whole appendrel.)
+      */
+     childRTE = copyObject(leftmostRTE);
+     parse->rtable = lappend(parse->rtable, childRTE);
+     childRTI = list_length(parse->rtable);
+
+     /* Modify the setops tree to reference the child copy */
+     ((RangeTblRef *) leftmostjtnode)->rtindex = childRTI;
+
+     /*
+      * Modify the formerly-leftmost RTE to mark it as an appendrel parent
+      * (compare pull_up_simple_union_all).
+      */
+     leftmostRTE->inh = true;
+
+     /*
+      * Form a RangeTblRef for the appendrel, and insert it into FROM.  The top
+      * Query of a setops tree should have had an empty FromClause initially.
+      */
+     rtr = makeNode(RangeTblRef);
+     rtr->rtindex = leftmostRTI;
+     Assert(parse->jointree->fromlist == NIL);
+     parse->jointree->fromlist = list_make1(rtr);
+
+     /*
+      * Now pretend the query has no setops.  We must do this before trying
+      * to do subquery pullup, because of Assert in pull_up_simple_subquery.
+      */
+     parse->setOperations = NULL;
+
+     /*
+      * Build AppendRelInfo information and try to pull up the leaf queries.
+      * XXX need to adjust comments for pull_up_union_leaf_queries
+      */
+     pull_up_union_leaf_queries((Node *) topop, root, leftmostRTI, parse, 0);
+ }
+
+
  /*
   * reduce_outer_joins
   *        Attempt to reduce outer joins to plain inner joins.
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index f8dd5428ee47a74637d37f5a293d2acff117eae4..e4654d9eeab0ef3908f679652866adb2589bd1ce 100644
*** a/src/include/optimizer/prep.h
--- b/src/include/optimizer/prep.h
*************** extern void inline_set_returning_functio
*** 26,31 ****
--- 26,32 ----
  extern Node *pull_up_subqueries(PlannerInfo *root, Node *jtnode,
                     JoinExpr *lowest_outer_join,
                     AppendRelInfo *containing_appendrel);
+ extern void flatten_union_all(PlannerInfo *root);
  extern void reduce_outer_joins(PlannerInfo *root);
  extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
  extern Relids get_relids_for_join(PlannerInfo *root, int joinrelid);

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

08 November 2010, 14:20:04

I wrote:
> I did some hacking on this and came up with the attached patch, which
> could use a bit more work on the comments but passes regression tests.
> However, this just solves the issue of being smart about top-level
> UNION ALL cases.  It might be worth looking into using MergeAppend
> for the sorting required for other types of set operations.  That would
> involve quite a different patch, and I'm not sure if it'd remove the
> need for this one or not.

I spent some more time thinking about this.  It seems difficult to make
any real further progress without essentially throwing out the current
approach to planning set-operation trees and starting over.  What we'd
want is to generate Paths representing the different ways a set-op could
be implemented and then pick the best one.  I can see the general shape
of how to do that: a set-op should be thought of as generating a
constrained join relation in a fashion similar to an OUTER JOIN
construct, and then the different Paths are possible implementations for
the join rel.  We'd need new Path types for hashed or sorted UNION,
INTERSECT, EXCEPT, as well as some analog of the SpecialJoinInfo
struct (or extend that struct to cover these cases).

It strikes me that maybe top-level DISTINCT could be married into this
too, since it's basically the same as a one-input UNION operator.  But
not sure what to do with DISTINCT ON.

Another thought here is that the current definition of the SetOp
execution node type could be improved.  Instead of taking a single
pre-merged input tuple stream, it'd be better if it had two inputs
like a JOIN node.  Then we could eliminate the "flag" column, which
would simplify matters for planning and reduce the amount of data
that has to be passed through sorting.  SetOp itself would need to
borrow some of the capability of MergeAppend so that it could read
two sorted streams in parallel and keep them in sync.

But this all looks like a pretty substantial amount of work, and
given the low level of user demand for improving the performance of
set operations, it seems to belong fairly far down the to-do list.
So I'm not going to tackle it now.  Barring objection, I'll clean up
yesterday's patch a bit more and commit it.
        regards, tom lane

Re: UNION ALL has higher cost than inheritance

From

Robert Haas

Date:

08 November 2010, 15:56:40

On Mon, Nov 8, 2010 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> I did some hacking on this and came up with the attached patch, which
>> could use a bit more work on the comments but passes regression tests.
>> However, this just solves the issue of being smart about top-level
>> UNION ALL cases.  It might be worth looking into using MergeAppend
>> for the sorting required for other types of set operations.  That would
>> involve quite a different patch, and I'm not sure if it'd remove the
>> need for this one or not.
>
> I spent some more time thinking about this.  It seems difficult to make
> any real further progress without essentially throwing out the current
> approach to planning set-operation trees and starting over.  What we'd
> want is to generate Paths representing the different ways a set-op could
> be implemented and then pick the best one.  I can see the general shape
> of how to do that: a set-op should be thought of as generating a
> constrained join relation in a fashion similar to an OUTER JOIN
> construct, and then the different Paths are possible implementations for
> the join rel.  We'd need new Path types for hashed or sorted UNION,
> INTERSECT, EXCEPT, as well as some analog of the SpecialJoinInfo
> struct (or extend that struct to cover these cases).

It strikes me that INTERSECT ALL and EXCEPT ALL are pretty much JUST a
semi- or anti-join (on the entire set of output columns).   But
without ALL it's completely different.

> It strikes me that maybe top-level DISTINCT could be married into this
> too, since it's basically the same as a one-input UNION operator.  But
> not sure what to do with DISTINCT ON.
>
> Another thought here is that the current definition of the SetOp
> execution node type could be improved.  Instead of taking a single
> pre-merged input tuple stream, it'd be better if it had two inputs
> like a JOIN node.  Then we could eliminate the "flag" column, which
> would simplify matters for planning and reduce the amount of data
> that has to be passed through sorting.  SetOp itself would need to
> borrow some of the capability of MergeAppend so that it could read
> two sorted streams in parallel and keep them in sync.
>
> But this all looks like a pretty substantial amount of work, and
> given the low level of user demand for improving the performance of
> set operations, it seems to belong fairly far down the to-do list.
> So I'm not going to tackle it now.  Barring objection, I'll clean up
> yesterday's patch a bit more and commit it.

I agree.  If we had infinite resources it would be nice to tackle
this, but I think we have bigger fish to fry.  In particular, I wonder
if you've thought any more about the generalized inner-indexscan
machinery, or taken a look at any of the issues around KNNGIST.  I'd
like to see our limited supply of planner-fu invested in those areas,
or perhaps in making inner join removal work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

08 November 2010, 16:05:25

Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Nov 8, 2010 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> But this all looks like a pretty substantial amount of work, and
>> given the low level of user demand for improving the performance of
>> set operations, it seems to belong fairly far down the to-do list.
>> So I'm not going to tackle it now. �Barring objection, I'll clean up
>> yesterday's patch a bit more and commit it.

> I agree.  If we had infinite resources it would be nice to tackle
> this, but I think we have bigger fish to fry.  In particular, I wonder
> if you've thought any more about the generalized inner-indexscan
> machinery, or taken a look at any of the issues around KNNGIST.  I'd
> like to see our limited supply of planner-fu invested in those areas,
> or perhaps in making inner join removal work.

The two top things on my to-do list for 9.1 are the generalized
inner-indexscan stuff and automatic replans for parameterized queries.
I had been hoping to finish one or the other before the next commitfest,
though time is draining away rapidly.

I'll try to look at KNNGIST during the fest.
        regards, tom lane

Re: UNION ALL has higher cost than inheritance

From

Robert Haas

Date:

08 November 2010, 16:07:52

On Mon, Nov 8, 2010 at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Nov 8, 2010 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> But this all looks like a pretty substantial amount of work, and
>>> given the low level of user demand for improving the performance of
>>> set operations, it seems to belong fairly far down the to-do list.
>>> So I'm not going to tackle it now.  Barring objection, I'll clean up
>>> yesterday's patch a bit more and commit it.
>
>> I agree.  If we had infinite resources it would be nice to tackle
>> this, but I think we have bigger fish to fry.  In particular, I wonder
>> if you've thought any more about the generalized inner-indexscan
>> machinery, or taken a look at any of the issues around KNNGIST.  I'd
>> like to see our limited supply of planner-fu invested in those areas,
>> or perhaps in making inner join removal work.
>
> The two top things on my to-do list for 9.1 are the generalized
> inner-indexscan stuff and automatic replans for parameterized queries.
> I had been hoping to finish one or the other before the next commitfest,
> though time is draining away rapidly.

Cool beans.

> I'll try to look at KNNGIST during the fest.

Thanks.  I have posted on it a few times; you may or may not find it
helpful to review those before diving in yourself...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: UNION ALL has higher cost than inheritance

From

Dimitri Fontaine

Date:

08 November 2010, 16:14:40

Tom Lane <tgl@sss.pgh.pa.us> writes:
> But this all looks like a pretty substantial amount of work, and
> given the low level of user demand for improving the performance of
> set operations, it seems to belong fairly far down the to-do list.

Whatever you say, that's your own todo list after all. I just wanted to
chime in and say that it could well be a chicken-and-eggs problem.

Other than that, the only general way I know of to optimise a WHERE
clause containing OR branches is producing the UNION ALL equivalent.
Now maybe I'm all wet, but your explaining sounded like a very nice
first step towards such an automatic optimisation.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: UNION ALL has higher cost than inheritance

From

Tom Lane

Date:

08 November 2010, 16:27:24

BTW, one other point for the archives, before I forget it: it'd probably
also be wise to change the parser's output representation of set
operations.  I think it should create a separate RTE entry corresponding
to each SetOperation node, similar to what we do for explicit JOIN
nodes.  Then there is a clean representation for Vars referencing the
setop outputs: they can use the RT index of the SetOp RTE entry.
The current implementation is that all Vars in the upper query use the
RTI of the leftmost leaf query of the setop structure, which is really
bogus.  First, that query doesn't necessarily output the same datatypes
as the upper setops do, so the datatypes shown by the Vars don't always
match the RTE they claim to refer to.  (This is one reason why all of
the setop optimizations we do have tend to punt as soon as any datatype
changes occur.)  Second, this design provides no good way to refer to
outputs of intermediate setops.  I think we will *have* to fix that if
we want to make the planner's handling of these cases much smarter than
it is now.
        regards, tom lane