Thread: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

[HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
It seems like a good idea for the planner not to generate Append or
MergeAppend paths when the node contains just a single subpath. If we
were able to remove these then the planner would have more flexibility
to build a better plan.

As of today, because we include this needless [Merge]Append node, we
cannot parallelise scans below the Append. We must also include a
Materialize node in Merge Joins, since neither MergeAppend nor Append
supports mark and restore. Also, as shown in [1], there's an
overhead to pulling tuples through nodes.
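
For reference, the mark/restore decision comes from
ExecSupportsMarkRestore() in execAmi.c. A condensed sketch (the real
function lists a few more node types) shows why Append and MergeAppend
fall through to the unsupported case:

bool
ExecSupportsMarkRestore(Path *pathnode)
{
    switch (pathnode->pathtype)
    {
        case T_IndexScan:
        case T_IndexOnlyScan:
        case T_Material:
        case T_Sort:
            return true;    /* these can be cheaply re-positioned */

        default:            /* Append, MergeAppend, etc. land here */
            break;
    }

    return false;           /* caller must insert a Materialize node */
}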

I've been looking into resolving this issue, but the ways I've explored
so far seem to be bending the current planner a bit out of shape.

Method 1:

In set_append_rel_size(), detect when just a single subpath would be
added to the Append path. Set a promotion_child field (a RelOptInfo
pointer) in the base rel's RelOptInfo, and during make_one_rel(), after
set_base_rel_sizes(), modify the joinlist to get rid of the parent and
use the child instead.
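
As a rough sketch of the detection step (the promotion_child field and
its name are hypothetical; this only illustrates the idea):

    /* at the end of set_append_rel_size(): count the live children */
    int         live_children = 0;
    RelOptInfo *only_child = NULL;
    ListCell   *l;

    foreach(l, root->append_rel_list)
    {
        AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
        RelOptInfo    *childrel;

        if (appinfo->parent_relid != parentRTindex)
            continue;       /* child of some other append parent */

        childrel = find_base_rel(root, appinfo->child_relid);
        if (!IS_DUMMY_REL(childrel))
        {
            live_children++;
            only_child = childrel;
        }
    }

    if (live_children == 1)
        rel->promotion_child = only_child;  /* hypothetical new field */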

This pretty much breaks the assumption we have that the final rel will
have all the relids in root->all_baserels. We have an Assert in
make_one_rel() which checks this is true.

There are also complications around set_rel_size() generating paths
for subqueries, which may be parameterized paths that require the
Append parent for the parameterization to work.

Method 2:

Invent a ProxyPath concept that allows us to add Paths belonging to
one relation to another relation. The ProxyPaths can have
translation_vars to translate targetlists to reference the correct
Vars. These ProxyPaths could exist up as far as createplan, where we
could perform the translation and build the ProxyPath's subnode
instead.
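
In sketch form, such a node might look like this (entirely
hypothetical; no such struct exists today):

typedef struct ProxyPath
{
    Path        path;               /* belongs to the proxied-for relation */
    Path       *subpath;            /* the real path, owned by another rel */
    List       *translation_vars;   /* Var pairs used to translate
                                     * targetlists at createplan time */
} ProxyPath;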

This method is not exactly very clean either, as there are various
places around the planner where we'd need to give special treatment to
these ProxyPaths. For example, is_projection_capable_path() would need
to return the projection capability of the subpath within the ProxyPath,
since we never actually create a "Proxy" node.

Probably either of these two methods could be made to work. I prefer
method 2, since that infrastructure could one day be put towards
scanning a Materialized View instead of the relation in order to
satisfy some query. However, method 2 appears to also require some Var
translation in Path targetlists which contain this ProxyPath; either
that, or some global Var replacement would need to be done during
setrefs.

I'm posting here in the hope that it will steer my development of this
in a direction that's acceptable to the community.

Perhaps there is also a "Method 3" which I've not thought about which
would be much cleaner.

[1] https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Thu, Oct 26, 2017 at 3:29 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> However, method 2 appears to also require some Var
> translation in Path targetlists which contain this Proxy path, either
> that or some global Var replacement would need to be done during
> setrefs.
>

For inheritance and partitioning, we translate parent expressions to
child expressions by applying Var translations. The translated
expressions differ only in the Var nodes (at least for partitioning
this is true, and I believe it to be true for inheritance too). I have
been toying with an idea of having some kind of a (polymorphic?) Var
node which can represent both the parent and the children. That would
greatly reduce the need to translate the expression trees and would
also help in this implementation. I am wondering if ProxyPath could be
helpful in avoiding the cost of reparameterize_path_by_child()
introduced for partition-wise join.
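
Purely to illustrate that idea, such a node might look like the
following; nothing of the sort exists today and the field names are
made up:

typedef struct PolyVar
{
    Expr        xpr;
    Index       parent_varno;       /* parent rel the Var logically refers to */
    AttrNumber  parent_varattno;    /* attribute number in the parent */

    /*
     * Resolution to a concrete child column would happen late (say, when
     * building the plan for a particular child), instead of mutating whole
     * expression trees per child as adjust_appendrel_attrs() does now.
     */
} PolyVar;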

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Robert Haas
Date:
On Wed, Oct 25, 2017 at 11:59 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> As of today, because we include this needless [Merge]Append node, we
> cannot parallelise scans below the Append.

Without disputing your general notion that single-child Append or
MergeAppend nodes are a pointless waste, how does such a needless
node prevent parallelizing scans beneath it?

> Invent a ProxyPath concept that allows us to add Paths belonging to
> one relation to another relation. The ProxyPaths can have
> translation_vars to translate targetlists to reference the correct
> Vars. These ProxyPaths could exist up as far as createplan, where we
> could perform the translation and build the ProxyPath's subnode
> instead.

This idea sounds like it might have some legs, although I'm not sure
"proxy" is really the right terminology.

Like both you and Ashutosh, I suspect that there might be some other
applications for this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Antonin Houska
Date:
David Rowley <david.rowley@2ndquadrant.com> wrote:

> Method 1:
>
> In set_append_rel_size() detect when just a single subpath would be
> added to the Append path.

I spent some time reviewing the partition-wise join patch during the last CF
and have an impression that set_append_rel_size() is called too early to find
out how many child paths will eventually exist if Append represents a join of
a partitioned table. I think the partition matching takes place at a later stage,
and that determines the actual number of partitions, and therefore the number
of Append subpaths.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 26 October 2017 at 23:30, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Oct 25, 2017 at 11:59 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> As of today, because we include this needless [Merge]Append node, we
>> cannot parallelise scans below the Append.
>
> Without disputing your general notion that singe-child Append or
> MergeAppend nodes are a pointless waste, how does a such a needless
> node prevent parallelizing scans beneath it?

hmm, I now think I was wrong about that. I had been looking over the
regression test diffs after having made tenk1 a partitioned table with
a single partition containing all the rows, but it looks like I read
the diff backwards. The planner actually parallelized the Append
version, not the non-Append version, as I had previously thought.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 26 October 2017 at 23:42, Antonin Houska <ah@cybertec.at> wrote:
> David Rowley <david.rowley@2ndquadrant.com> wrote:
>
>> Method 1:
>>
>> In set_append_rel_size() detect when just a single subpath would be
>> added to the Append path.
>
> I spent some time reviewing the partition-wise join patch during the last CF
> and have an impression that set_append_rel_size() is called too early to find
> out how many child paths will eventually exist if Append represents a join of
> a partitioned table. I think the partition matching takes place at later stage
> and that determines the actual number of partitions, and therefore the number
> of Append subpaths.

I understand that we must wait until set_append_rel_size() is being
called on the RELOPT_BASEREL since partitions may themselves be
partitioned tables and we'll only know what's left after we've
recursively checked all partitions of the baserel.

I've not studied the partition-wise join code yet, but if we've
eliminated all but one partition in set_append_rel_size() then I
imagine we'd just need to ensure partition-wise join is not attempted
since we'd be joining directly to the only partition anyway.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Thu, Oct 26, 2017 at 5:07 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 26 October 2017 at 23:42, Antonin Houska <ah@cybertec.at> wrote:
>> David Rowley <david.rowley@2ndquadrant.com> wrote:
>>
>>> Method 1:
>>>
>>> In set_append_rel_size() detect when just a single subpath would be
>>> added to the Append path.
>>
>> I spent some time reviewing the partition-wise join patch during the last CF
>> and have an impression that set_append_rel_size() is called too early to find
>> out how many child paths will eventually exist if Append represents a join of
>> a partitioned table. I think the partition matching takes place at later stage
>> and that determines the actual number of partitions, and therefore the number
>> of Append subpaths.
>
> I understand that we must wait until set_append_rel_size() is being
> called on the RELOPT_BASEREL since partitions may themselves be
> partitioned tables and we'll only know what's left after we've
> recursively checked all partitions of the baserel.
>
> I've not studied the partition-wise join code yet, but if we've
> eliminated all but one partition in set_append_rel_size() then I
> imagine we'd just need to ensure partition-wise join is not attempted
> since we'd be joining directly to the only partition anyway.
>

I think Antonin has a point. The join processing may deem some base
relations dummy (see populate_joinrel_with_paths()). So an Append
relation which had multiple children alive at the end of
set_append_rel_size() might ultimately have only one child after
partition-wise join has worked on it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> It seems like a good idea for the planner not to generate Append or
> MergeAppend paths when the node contains just a single subpath.
> ...
> Perhaps there is also a "Method 3" which I've not thought about which
> would be much cleaner.

Have you considered turning AppendPath with a single child into more
of a special case?  I think this is morally equivalent to your ProxyPath
proposal, but it would take less new boilerplate code to support.
If you taught create_append_path to inherit the child's pathkeys
when there's only one child, and taught createplan.c to skip making the
useless Append node, I think you might be nearly done.
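
A minimal sketch of the create_append_path() half of that, assuming
nothing else depends on the pathkeys being cleared:

    if (list_length(subpaths) == 1)
    {
        Path   *child = (Path *) linitial(subpaths);

        /* a single-child "Append" can claim its child's sort order */
        pathnode->path.pathkeys = child->pathkeys;
    }
    else
        pathnode->path.pathkeys = NIL;  /* Append output is unordered */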

This might seem a bit ugly, but considering that zero-child AppendPaths
are already a special case (and a much bigger one), I don't have much
of a problem with decreeing that one-child is also a special case.

As an example of why this might be better than a separate ProxyPath
node type, consider this call in add_paths_to_append_rel:
   /*
    * If we found unparameterized paths for all children, build an unordered,
    * unparameterized Append path for the rel.  (Note: this is correct even
    * if we have zero or one live subpath due to constraint exclusion.)
    */
   if (subpaths_valid)
      add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
                                                partitioned_rels));

That comment could stand a bit of adjustment with this approach, but
the code itself requires no extra work.

Now, you would have to do something about this in create_append_plan:
   /*
    * XXX ideally, if there's just one child, we'd not bother to generate an
    * Append node but just return the single child.  At the moment this does
    * not work because the varno of the child scan plan won't match the
    * parent-rel Vars it'll be asked to emit.
    */

but that's going to come out in the wash someplace, no matter what you do.
Leaving it for the last minute is likely to reduce the number of places
you have to teach about this.
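
For instance, it might end up as little more than an early exit in
create_append_plan(), assuming the varno problem above has been dealt
with by then (a sketch only):

    if (list_length(best_path->subpaths) == 1)
    {
        Path   *subpath = (Path *) linitial(best_path->subpaths);

        /* skip the useless Append and build the lone child directly */
        return create_plan_recurse(root, subpath, CP_EXACT_TLIST);
    }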
        regards, tom lane



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 27 October 2017 at 01:44, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I think Antonin has a point. The join processing may deem some base
> relations dummy (See populate_joinrel_with_paths()). So an Append
> relation which had multiple child alive at the end of
> set_append_rel_size() might ultimately have only one child after
> partition-wise join has worked on it.

hmm, I see. partition-wise join is able to eliminate partitions
on the other side of the join that can't be matched to anything:

# set enable_partition_wise_join=on;
SET
# explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a
where ab1.a between 1 and 2 and ab2.a between 100000000 and 100000001;
                   QUERY PLAN
------------------------------------------------
 Aggregate  (cost=0.00..0.01 rows=1 width=8)
   ->  Result  (cost=0.00..0.00 rows=0 width=0)
         One-Time Filter: false
(3 rows)

# \d+ ab
                                    Table "public.ab"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              |
 b      | integer |           |          |         | plain   |              |
Partition key: RANGE (a)
Partitions: ab_p1 FOR VALUES FROM (1) TO (100000000),
            ab_p2 FOR VALUES FROM (100000000) TO (200000000)



-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 27 October 2017 at 01:44, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> I think Antonin has a point. The join processing may deem some base
>> relations dummy (See populate_joinrel_with_paths()). So an Append
>> relation which had multiple child alive at the end of
>> set_append_rel_size() might ultimately have only one child after
>> partition-wise join has worked on it.

TBH, that means partition-wise join planning is broken and needs to be
redesigned.  It's impossible that we're going to make sane planning
choices if the sizes of relations change after we've already done some
planning work.
        regards, tom lane



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Fri, Oct 27, 2017 at 12:49 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 27 October 2017 at 01:44, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> I think Antonin has a point. The join processing may deem some base
>> relations dummy (See populate_joinrel_with_paths()). So an Append
>> relation which had multiple child alive at the end of
>> set_append_rel_size() might ultimately have only one child after
>> partition-wise join has worked on it.
>
> hmm, I see. partition-wise join is able to remove eliminate partitions
> on the other side of the join that can't be matched to:
>
> # set enable_partition_wise_join=on;
> SET
> # explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a
> where ab1.a between 1 and 2 and ab2.a between 100000000 and 100000001;
>                    QUERY PLAN
> ------------------------------------------------
>  Aggregate  (cost=0.00..0.01 rows=1 width=8)
>    ->  Result  (cost=0.00..0.00 rows=0 width=0)
>          One-Time Filter: false
> (3 rows)
>
> # \d+ ab
>                                     Table "public.ab"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           |          |         | plain   |              |
>  b      | integer |           |          |         | plain   |              |
> Partition key: RANGE (a)
> Partitions: ab_p1 FOR VALUES FROM (1) TO (100000000),
>             ab_p2 FOR VALUES FROM (100000000) TO (200000000)
>

IIUC, ab1_p2 and ab2_p1 are marked dummy by constraint exclusion.
Because of partition-wise join, when the corresponding partitions are
joined, their inner joins are marked dummy, resulting in an empty join.
This isn't the case I was referring to, of join processing marking one
of the joining relations dummy. But I think it's relevant here since
partition-wise join might create single-child joins this way.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Fri, Oct 27, 2017 at 1:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> On 27 October 2017 at 01:44, Ashutosh Bapat
>> <ashutosh.bapat@enterprisedb.com> wrote:
>>> I think Antonin has a point. The join processing may deem some base
>>> relations dummy (See populate_joinrel_with_paths()). So an Append
>>> relation which had multiple child alive at the end of
>>> set_append_rel_size() might ultimately have only one child after
>>> partition-wise join has worked on it.
>
> TBH, that means partition-wise join planning is broken and needs to be
> redesigned.  It's impossible that we're going to make sane planning
> choices if the sizes of relations change after we've already done some
> planning work.

The mark_dummy_rel() call that marks a joining relation dummy was
added by commit 719012e013f10f72938520c46889c52df40fa329. The joining
relations are planned before the joins they participate in are
planned. Furthermore, some of the joins in which a relation participates
might have been planned before the joining pair which causes it to be
marked dummy is processed by populate_joinrel_with_paths(). So we
already have places where the sizes of relations change during planning.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 27 October 2017 at 04:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> It seems like a good idea for the planner not to generate Append or
>> MergeAppend paths when the node contains just a single subpath.
>> ...
>> Perhaps there is also a "Method 3" which I've not thought about which
>> would be much cleaner.
>
> Have you considered turning AppendPath with a single child into more
> of a special case?  I think this is morally equivalent to your ProxyPath
> proposal, but it would take less new boilerplate code to support.
> If you taught create_append_path to inherit the child's pathkeys
> when there's only one child, and taught createplan.c to skip making the
> useless Append node, I think you might be nearly done.

That seems reasonable, although after going and implementing this, it
does not seem quite as convenient as dummy paths are. I had to make
the AppendPath carry a bit more weight than I originally thought.
It now stores translate to/from expressions and also a single bool
(isproxy) to mark whether it's a proxy-type Append. I tried to do
without this bool and just check if there's a single subpath and some
translation expressions, but that was getting fooled by Paths which
had empty targetlists, which meant there were no translations, and
going on the single subpath alone is no good since we create Append
paths in more places than just add_paths_to_append_rel().
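
Roughly, the struct ends up looking like this (the new field names
follow the discussion here and may not match the patch exactly):

typedef struct AppendPath
{
    Path        path;
    List       *partitioned_rels;   /* RT indexes of partitioned tables */
    List       *subpaths;           /* list of component Paths */

    /* used only when this Append acts as a proxy for its only subpath */
    bool        isproxy;            /* single-subpath proxy Append? */
    List       *translate_from;     /* parent exprs to be replaced ... */
    List       *translate_to;       /* ... by these child exprs */
} AppendPath;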

I've attached a patch which goes and implements all this. I think it
still needs a bit of work. I'm not 100% confident I'm translating
expressions in all the required places in createplan.c. I did modify
tenk1 to become the parent of a single partition and ran the
regression tests. There appear to be 4 subtle differences between
querying a parent with an only child and querying the child directly.

1. EXPLAIN VERBOSE shows the columns prefixed by the table name. This
is caused by "useprefix = list_length(es->rtable) > 1;" in explain.c.
2. There's a small change in an error message in portals.out. "ERROR:
cursor "c" is not a simply updatable scan of table "tenk1a"" now
mentions the child table instead of the parent. Nothing to worry about
as it appears to already do this with native partitioning.
3. Some plans are parallelized that were not before (see join.out).
This is caused by compute_parallel_worker() having a special case for
non-baserels.
4. Some gating plans don't work the same as they used to, this is
highlighted in aggregates.out and is caused by child rel Const TRUE
baserestrictinfo clauses being dropped in set_append_rel_size.

For differences 2-4, I've attached another file which modifies tenk1
to become a partitioned table with a single partition. I've modified
all the expected results for #1 to reduce the noise, so 3 tests are
still failing for reasons 2-4. I'm not at all concerned with #2. #3
might not be worth fixing (I've just no idea how yet), and #4 seems
like it's on purpose.

Other things I'm not 100% happy with: in prepunion.c I had to
invent a new function named find_all_appinfos_for_child, which is a
bit similar to adjust_appendrel_attrs_multilevel, only it modifies the
attributes; but the use I have for find_all_appinfos_for_child
includes also calling add_child_rel_equivalences for the children at
each level. Probably a bit more thinking will come up with some way to
make the existing function sets more reusable; I've just not
done that yet.

In createplan.c, I decided to have a generic expression translator
rather than something that only understands AppendRelInfos. The reason
for this is two-fold:

1. There may be multiple AppendRelInfos required for a single
translation between a parent and child if there are multiple other
children in between. Seems wasteful to have intermediate translations.
2. I have other uses in mind for this feature that are not Append
child relations.

A sanity check on what I have would be most welcome if anyone has time.

I'm going to add this to the November commitfest.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 1 November 2017 at 02:47, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 27 October 2017 at 04:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Have you considered turning AppendPath with a single child into more
>> of a special case?  I think this is morally equivalent to your ProxyPath
>> proposal, but it would take less new boilerplate code to support.
>> If you taught create_append_path to inherit the child's pathkeys
>> when there's only one child, and taught createplan.c to skip making the
>> useless Append node, I think you might be nearly done.

I've made another pass over this patch today to try to ensure it's in
a better state. I'm away on leave until the end of the month starting
from this Friday, so likely I won't get any time to resolve any issues
found until maybe the last day of the month.

I've attached an updated version of the patch which fixes a bug that I
discovered today, where I had neglected to translate the quals of a
grouping set. I noticed this when I made all the main regression
tables partitioned tables containing a single child partition with a
MINVALUE to MAXVALUE range. The fact that all plans now appear to work
as planned after having removed the Append has given me a bit more
confidence that I've not missed any places that need to translate Vars,
but I'm still not 100% certain. I just can't think of any other way of
testing that at the moment. Any place that I have missed should result
in an error like "variable not found in subplan target list".

I've also modified the patch to make use of some existing functions in
prepunion.c which gather up AppendRelInfos for a childrel. I'd
previously invented my own function because the existing ones were not
quite exactly what I needed. I'd planned to come back to it, but only
just got around to fixing that.

Apart from that, the changes are just in the code comments.

The remove_singleton_appends_examples_of_differences_2017-11-15.patch
which I've attached applies changes to the regression tests to make
many of the major tables partitioned tables with a single partition.
It also changes the expected results of the small insignificant plan
changes and leaves only the actual true differences. This file is not
intended for commit, but more just a demo of it working for a larger
variety of plan shapes. There are actually no additional tests with the
patch. It relies on the places in the existing tests which have
changed.  I didn't add any extra as I can't think of any particular
area to target given that it's such a generic thing that can apply to
so many different cases.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Michael Paquier
Date:
On Wed, Nov 15, 2017 at 3:17 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> The remove_singleton_appends_examples_of_differences_2017-11-15.patch
> which I've attached applies changes to the regression tests to make
> many of the major tables partitioned tables with a single partition.
> It also changes the expected results of the small insignificant plan
> changes and leaves only the actual true differences. This file is not
> intended for commit, but more just a demo of it working for a larger
> variety of plan shapes. There is actually no additional tests with the
> patch. It relies on the places in the existing tests which have
> changed.  I didn't add any extra as I can't think of any particular
> area to target given that it's such a generic thing that can apply to
> so many different cases.

The last patch set does not apply and did not get any reviews. So I am
moving this item to the next commitfest with "waiting on author" as its status.
-- 
Michael


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 30 November 2017 at 15:34, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Nov 15, 2017 at 3:17 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> The remove_singleton_appends_examples_of_differences_2017-11-15.patch
> which I've attached applies changes to the regression tests to make
> many of the major tables partitioned tables with a single partition.
> It also changes the expected results of the small insignificant plan
> changes and leaves only the actual true differences. This file is not
> intended for commit, but more just a demo of it working for a larger
> variety of plan shapes. There is actually no additional tests with the
> patch. It relies on the places in the existing tests which have
> changed.  I didn't add any extra as I can't think of any particular
> area to target given that it's such a generic thing that can apply to
> so many different cases.

The last patch set does not apply and did not get any reviews. So I am
moving this item to the next with waiting on author as status.

(returning from 2 weeks leave)

Thanks for managing the commitfest.

I've attached a patch which fixes the conflict with the regression test expected output in inherits.out. I've also made changes to the new partition_prune test's expected output.

I'll set this back to waiting on review now.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 30 November 2017 at 16:04, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I've attached a patch which fixes the conflict with the regression test
> expected output in inherits.out. I've also made changes to the expected
> output in the new partition_prune test's expected output.

I've attached an updated patch which fixes the conflicts caused by ab727167.

Also, I've attached a patch, not intended for commit, which
shows the subtle differences between directly querying a partitioned
table with a single remaining leaf partition to scan and directly
querying the leaf partition itself. The regression test failures seen
with both patches applied highlight the differences.

While rebasing this today I also noticed that we won't properly detect
unique joins in add_paths_to_joinrel() as we're still testing for
uniqueness against the partitioned parent rather than the only child.
This is likely not a huge problem since we'll always get a false
negative and never a false positive, but it is a missing optimisation.
I've not thought of how to solve it yet, it's perhaps not worth going
to too much trouble over.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Thu, Dec 7, 2017 at 5:11 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
>
> While rebasing this today I also noticed that we won't properly detect
> unique joins in add_paths_to_joinrel() as we're still testing for
> uniqueness against the partitioned parent rather than the only child.
> This is likely not a huge problem since we'll always get a false
> negative and never a false positive, but it is a missing optimisation.
> I've not thought of how to solve it yet, it's perhaps not worth going
> to too much trouble over.
>

It's only the joinrel's relids that are the parent's relids, but all the
following tests for unique joins use RelOptInfo::relids, which are child
relids.

        case JOIN_UNIQUE_INNER:
            extra.inner_unique = bms_is_subset(sjinfo->min_lefthand,
                                               outerrel->relids);
            break;
        case JOIN_UNIQUE_OUTER:
            extra.inner_unique = innerrel_is_unique(root,
                                                    outerrel->relids,
                                                    innerrel,
                                                    JOIN_INNER,
                                                    restrictlist,
                                                    false);
            break;
        default:
            extra.inner_unique = innerrel_is_unique(root,
                                                    outerrel->relids,
                                                    innerrel,
                                                    jointype,
                                                    restrictlist,
                                                    false);
            break;

Do you have a testcase, which shows the problem?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 11 December 2017 at 21:18, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, Dec 7, 2017 at 5:11 AM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> While rebasing this today I also noticed that we won't properly detect
>> unique joins in add_paths_to_joinrel() as we're still testing for
>> uniqueness against the partitioned parent rather than the only child.
>> This is likely not a huge problem since we'll always get a false
>> negative and never a false positive, but it is a missing optimisation.
>> I've not thought of how to solve it yet, it's perhaps not worth going
>> to too much trouble over.
>>
>
[...]

> Do you have a testcase, which shows the problem?

I do indeed:

create table p (a int not null) partition by range (a);
create table p1 partition of p for values from (minvalue) to (maxvalue);
create unique index on p1 (a);
create table t (a int not null);

insert into p values(1),(2);
insert into t values(1);

analyze p;
analyze t;

explain (verbose, costs off) select * from p inner join t on p.a = t.a;
         QUERY PLAN
-----------------------------
 Nested Loop
   Output: p1.a, t.a
   Join Filter: (p1.a = t.a)
   ->  Seq Scan on public.t
         Output: t.a
   ->  Seq Scan on public.p1
         Output: p1.a
(7 rows)

explain (verbose, costs off) select * from p1 inner join t on p1.a = t.a;
         QUERY PLAN
-----------------------------
 Nested Loop
   Output: p1.a, t.a
   Inner Unique: true
   Join Filter: (p1.a = t.a)
   ->  Seq Scan on public.t
         Output: t.a
   ->  Seq Scan on public.p1
         Output: p1.a
(8 rows)

Notice that when we join to the partitioned table directly and the
Append is removed, we don't get the "Inner Unique: true"

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Mon, Dec 11, 2017 at 2:42 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 11 December 2017 at 21:18, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> On Thu, Dec 7, 2017 at 5:11 AM, David Rowley
>> <david.rowley@2ndquadrant.com> wrote:
>>> While rebasing this today I also noticed that we won't properly detect
>>> unique joins in add_paths_to_joinrel() as we're still testing for
>>> uniqueness against the partitioned parent rather than the only child.
>>> This is likely not a huge problem since we'll always get a false
>>> negative and never a false positive, but it is a missing optimisation.
>>> I've not thought of how to solve it yet, it's perhaps not worth going
>>> to too much trouble over.
>>>
>>
> [...]
>
>> Do you have a testcase, which shows the problem?
>
> I do indeed:
>
> create table p (a int not null) partition by range (a);
> create table p1 partition of p for values from (minvalue) to (maxvalue);
> create unique index on p1 (a);
> create table t (a int not null);
>
> insert into p values(1),(2);
> insert into t values(1);
>
> analyze p;
> analyze t;
>
> explain (verbose, costs off) select * from p inner join t on p.a = t.a;
>          QUERY PLAN
> -----------------------------
>  Nested Loop
>    Output: p1.a, t.a
>    Join Filter: (p1.a = t.a)
>    ->  Seq Scan on public.t
>          Output: t.a
>    ->  Seq Scan on public.p1
>          Output: p1.a
> (7 rows)
>
> explain (verbose, costs off) select * from p1 inner join t on p1.a = t.a;
>          QUERY PLAN
> -----------------------------
>  Nested Loop
>    Output: p1.a, t.a
>    Inner Unique: true
>    Join Filter: (p1.a = t.a)
>    ->  Seq Scan on public.t
>          Output: t.a
>    ->  Seq Scan on public.p1
>          Output: p1.a
> (8 rows)
>
> Notice that when we join to the partitioned table directly and the
> Append is removed, we don't get the "Inner Unique: true"

Ok. I thought we don't use unique joins in partition-wise joins. Sorry
for the misunderstanding.

I think this didn't work with inheritance before partition-wise join
either, for the same reason.

Probably, we could do it if unique paths (sorted paths which can be
unique-ified) bubble up the Append hierarchy. We already have code in
add_paths_to_append_rel() to create sorted paths when even a single
child has an index on it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 11 December 2017 at 22:23, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I think this didn't work with inheritance before partition-wise join
> as well for the same reason.

Yeah, I was just demonstrating a reason that the plans are not
identical from querying a partitioned table when partition pruning
eliminates all but one partition vs directly querying that single
partition. I've been attaching some regression test changes which
demonstrate some other subtle differences along with each version of
the patch. I just wanted to be open with these subtle differences that
I've encountered while working on this.

> Probably, we could do it if unique paths (sorted paths which can be
> unique-ified) bubble up the Append hierarchy. We already have code in
> add_paths_to_append_rel() to create sorted paths when even a single
> child has an index on it.

Sometime in the future, I'd like to see some sort of UniqueKeys List
in RelOptInfo which is initially populated by looking at the
unique indexes during build_simple_rel(). The idea is that these
UniqueKeys would get propagated into RELOPT_JOINRELs when the join
condition matches a set of unique keys on the other side of the join.
e.g. If the outer side of the join had a UniqueKey matching the join
condition then we could take all of the UniqueKeys from the RelOptInfo
for the inner side of the join (and vice versa). This infrastructure
would allow us to no-op a DISTINCT when the RelOptInfo has targetlist
items which completely cover at least one UniqueKey set, or not bother
performing a sort for a GROUP BY when the GROUP BY clause is covered
by a UniqueKey.

To fix the case we're discussing, we could also propagate those
UniqueKeys to an AppendPath with a single subpath, similar to how I'm
propagating the PathKeys for the same case in this patch. That way the
unique join code would find the UniqueKeys, instead of what it does
today, which is fail to find any unique indexes on the partitioned table.
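
Purely as a sketch of that idea (none of this exists today, and the
names are made up):

typedef struct UniqueKey
{
    List       *exprs;      /* expressions proven unique for this rel */
} UniqueKey;

/*
 * Hypothetically: build_simple_rel() would seed a rel's UniqueKeys from its
 * unique indexes; build_join_rel() would copy the inner rel's UniqueKeys to
 * the joinrel when the join clauses match a UniqueKey of the outer rel (and
 * vice versa); and create_append_path() would copy the only child's
 * UniqueKeys when there's exactly one subpath, just as this patch already
 * does for pathkeys.
 */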

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Ashutosh Bapat
Date:
On Mon, Dec 11, 2017 at 3:16 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
>
> Sometime in the future, I'd like to see some sort of UniqueKeys List
> in RelOptInfo which is initially populated by using looking at the
> unique indexes during build_simple_rel(). The idea is that these
> UniqueKeys would get propagated into RELOPT_JOINRELs when the join
> condition matches a set of unique keys on the other side of the join.
> e.g. If the outer side of the join had a UniqueKey matching the join
> condition then we could take all of the UniqueKeys from the RelOptInfo
> for the inner side of the join (and vice versa). This infrastructure
> would allow us to no-op a DISTINCT when the RelOptInfo has targetlist
> items which completely cover at least one UniqueKey set, or not bother
> performing a sort for a GROUP BY when the GROUP BY clause is covered
> by a UniqueKey.
>

That looks neat.

> To fix the case we're discussing, we can also propagate those
> UniqueKeys to an AppendPath with a single subpath similar to how I'm
> propagating the PathKeys for the same case in this patch, that way the
> unique join code would find the UniqueKeys instead of what it does
> today and not find any unique indexes on the partitioned table.
>

Another possibility is to support partition-wise join between an
unpartitioned table and a partitioned table by joining the
unpartitioned table with each partition separately and appending the
results. That's a lot of work and quite some new infrastructure. Each
of those joins would pick up the unique key if an appropriate index
exists on the partition.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
I've attached a rebased patch.

The previous patch was conflicting with parallel Hash Join (180428404)

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 28 December 2017 at 18:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I've attached a rebased patch.
>
> The previous patch was conflicting with parallel Hash Join (180428404)

And another one. Thanks to the patch tester [1], I realised that I
hadn't run make check-world, and there was a postgres_fdw test that was
failing.

The attached patch fixes that.

[1] http://commitfest.cputube.org/

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
I've attached an updated patch which takes care of the build failure
caused by the new call to create_append_path added in bb94ce4.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Tomas Vondra
Date:
Hi,

On 01/25/2018 01:55 PM, David Rowley wrote:
> I've attached an updated patch which takes care of the build failure 
> caused by the new call to create_append_path added in bb94ce4.
> 

I've been looking at the patch over the past few days. I haven't found
any obvious issues with it, but I do have a couple of questions.

1) I can confirm that it indeed eliminates the Append overhead, using
the example from [1], modified to use table with a single partition. Of
course, this is a pretty formal check, because the patch simply removes
the Append node and so it'd be utterly broken if the overhead did not
disappear too.

[1]
https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com


2) If I had to guess where the bugs will be, my guess would be a missing
finalize_plan_tlist call - not necessarily in existing code, but perhaps
in future improvements.

I wonder if we could automate this call somehow, say by detecting when
create_plan_recurse is called after build_path_tlist (for a given path),
and if needed calling finalize_plan_tlist from create_plan_recurse.

Although, now that I look at it, promote_child_relation may be doing
something like that by adding stuff to the translate_* variables. I'm
not quite sure why we need to append to the lists, though?


3) I'm not quite sure what replace_translatable_exprs does, or more
precisely why we actually need it. At this point the comment pretty
much says "We need to do translation. This function does translation."
Perhaps it should mention why removing/skipping the AppendPath implies
the need for translation or something?


4) The thread talks about "[Merge]Append" but the patch apparently only
tweaks create_append_path and does absolutely nothing for the "merge" case.
While looking into why, I noticed that generate_mergeappend_paths always
gets all_child_pathkeys=NULL in this case (with a single subpath), and so
we never create the MergeAppendPath at all. I suppose there's some condition
somewhere responsible for doing that, but I haven't found it ...


5) The comment before AppendPath typedef says this:

 * An AppendPath with a single subpath can be set up to become a "proxy"
 * path. This allows a Path which belongs to one relation to be added to
 * the pathlist of some other relation.  This is intended as generic
 * infrastructure, but its primary use case is to allow Appends with
 * only a single subpath to be removed from the final plan.

I'm not quite sure how adding a 'isproxy' flag into AppendPath node
makes this a generic infrastructure? Wouldn't an explicit ProxyPath be a
better way to achieve that?

What other proxy paths do you envision, and why should they be
represented as AppendPath? Although, there seems to be a precedent in
using AppendPath to represent dummy paths ...

FWIW one advantage of a separate ProxyPath would be that it would be a
much clearer breakage for third-party code (say, extensions tweaking
paths using hooks), either at compile or execution time. With just a new
flag in existing node, that may easily slip through. On the other hand
no one promises the structures to be stable API ...


6) I suggest we add this assert to create_append_path:

    Assert(!isproxy || (isproxy && (list_length(subpaths)==1)));

and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
naming for boolean variables.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 19 February 2018 at 15:11, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> 1) I can confirm that it indeed eliminates the Append overhead, using
> the example from [1], modified to use table with a single partition. Of
> course, this is a pretty formal check, because the patch simply removes
> the Append node and so it'd be utterly broken if the overhead did not
> disappear too.
>
> [1]
> https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com

Thanks for testing that.

> 2) If I had to guess where the bugs will be, my guess would be a missing
> finalize_plan_tlist call - not necessarily in existing code, but perhaps
> in future improvements.

I agree with that. I'd say, though, that in the future we might
additionally end up with some missing translation calls to
replace_translatable_exprs().

To try to ensure I didn't miss anything, I did previously modify the
regression test tables "tenk" and "tenk1" to become partitioned tables
with a single partition which allowed all possible values, to try to
ensure I'd not missed anywhere. I just failed to find a reasonable way
to make this a more permanent test without enforcing that change on
the tables permanently.

> I wonder if we could automate this call somehow, say by detecting when
> create_plan_recurse is called after build_path_tlist (for a given path),
> and if needed calling finalize_plan_tlist from create_plan_recurse.

Sounds nice, but I'm unsure how we could do that.

> Although, now that I look at it, promote_child_relation may be doing
> something like that by adding stuff to the translate_* variables.

Do you mean this?

/*
* Record this child as having been promoted.  Some places treat child
* relations in a special way, and this will give them a VIP ticket to
* adulthood, where required.
*/
root->translated_childrelids =
bms_add_members(root->translated_childrelids, child->relids);

That's to get around a problem in use_physical_tlist() where there's
different behaviour for RELOPT_OTHER_MEMBER_REL than there is for
RELOPT_BASEREL. Another option might be to instead change the
RELOPT_OTHER_MEMBER_REL  into  RELOPT_BASEREL, although I was a little
too scared that might result in many other areas requiring being
changed. I may be wrong about that though. We do, for example,
occasionally change a RELOPT_BASEREL into a RELOPT_DEADREL, so
RelOptInfos changing their RelOptKind is not a new concept, so perhaps
it's fine.

> I'm
> not quite sure why we need to append to the lists, though?

Multiple Append rels may be replaced by their only-child relation, so
we'd need to be able to translate the targetlists for both. For
example, joins above the Appends may contain Vars which belong to
either of the Appends and those need to be translated into the
promoted child relation's Vars.

> 3) I'm not quite sure what replace_translatable_exprs does, or more
> precisely why do we actually need it. At this point the comment pretty
> much says "We need to do translation. This function does translation."
> Perhaps it should mention why removing/skipping the AppendPath implies
> the need for translation or something?

It does mention "Expr translation is required to translate the
targetlists of nodes above the Append", but perhaps I can make that a
bit more clear.

Let's say we have a query such as:

SELECT * FROM fact f INNER JOIN dim1 d ON f.dim1_id = d.id WHERE
f.date BETWEEN '2018-01-01' AND '2018-01-31';

Which results in a hash join and a single Jan 2018 partition being
scanned for "fact". A plan without the Append node having been removed
would have the Append targetlist containing the Vars from the "fact"
table, as would the Hash Join node... The problem this function is
trying to solve is translating the Vars in the Hash Join node (or any
node above the Append) to change the "fact" Vars into "fact_jan2018"
Vars. In create_plan.c we skip over any isproxy Append paths and
instead return the plan built by recursing into the proxy's single
subpath. Without the translation we'd run into problems when trying to
find Vars in the targetlists later.
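
The translation itself is just an expression tree walk; in sketch form,
something like the following (names are illustrative, and the patch's
actual function is replace_translatable_exprs()):

typedef struct
{
    List       *translate_from;     /* exprs to look for, e.g. parent Vars */
    List       *translate_to;       /* matching replacements, e.g. child Vars */
} translate_exprs_context;

static Node *
translate_exprs_mutator(Node *node, translate_exprs_context *context)
{
    ListCell   *lcf,
               *lct;

    if (node == NULL)
        return NULL;

    /* replace any node which matches a translate_from entry */
    forboth(lcf, context->translate_from, lct, context->translate_to)
    {
        if (equal(node, lfirst(lcf)))
            return (Node *) copyObject(lfirst(lct));
    }

    return expression_tree_mutator(node, translate_exprs_mutator,
                                   (void *) context);
}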

> 4) The thread talks about "[Merge]Append" but the patch apparently only
> tweaks create_append_path and does absolutely nothing to "merge" case.
> While looking into why, I notices that generate_mergeappend_paths always
> gets all_child_pathkeys=NULL in this case (with single subpath), and so
> we never create the MergePath at all. I suppose there's some condition
> somewhere responsible for doing that, but I haven't found it ...

Yeah, only Append paths are used as proxy paths. The confusion here is
that the path creation logic has also been changed in allpaths.c (see
generate_proxy_paths()), so that we no longer build MergeAppend paths
when there's a single subpath. It's probably just the fact that Append
is being hi-jacked to act as a ProxyPath that's causing the confusion
there. If you look over generate_proxy_paths and where it gets called
from, you'll see that, instead of creating Append/MergeAppend
paths, we just add each path from the pathlist and partial_pathlist of
the only-child subpath to the Append rel. This is what gives the planner
the same flexibility to generate a plan as if the only child partition
had been named in the query, instead of the parent.
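
In sketch form (this is a paraphrase of the idea; the actual patch code
will differ in details such as the create_append_path() arguments and
how the isproxy field is set):

static void
generate_proxy_paths(PlannerInfo *root, RelOptInfo *rel, RelOptInfo *onlychild)
{
    ListCell   *lc;

    foreach(lc, onlychild->pathlist)
    {
        Path       *childpath = (Path *) lfirst(lc);
        AppendPath *proxy;

        /* wrap the child's path in a single-subpath proxy Append */
        proxy = create_append_path(rel, list_make1(childpath),
                                   PATH_REQ_OUTER(childpath), 0,
                                   NIL /* partitioned_rels */);
        proxy->isproxy = true;      /* patch-specific field */
        add_path(rel, (Path *) proxy);
    }

    /* the patch does the same for onlychild->partial_pathlist */
}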

> 5) The comment before AppendPath typedef says this:
>
>  * An AppendPath with a single subpath can be set up to become a "proxy"
>  * path. This allows a Path which belongs to one relation to be added to
>  * the pathlist of some other relation.  This is intended as generic
>  * infrastructure, but its primary use case is to allow Appends with
>  * only a single subpath to be removed from the final plan.
>
> I'm not quite sure how adding a 'isproxy' flag into AppendPath node
> makes this a generic infrastructure? Wouldn't an explicit ProxyPath be a
> better way to achieve that?

It's supposed to be generic in the sense that I didn't aim to code it
just to allow Appends with a single subpath to build a plan with the
single subpath instead of the Append. I'll give an example a bit
below.

It evolved that way due to an email early on in this thread. I
proposed two ways to do this, one of which was ProxyPath. Tom
proposed hi-jacking Append to save on the additional boilerplate code.
When implementing that, I imagined that I could have gotten
away with just adding the two translation Lists to the AppendPath
struct, and having code determine it's a proxy rather than an Append by
the fact that the Lists are non-empty, but that fails due to the lack of
anything to translate for SELECT COUNT(*) FROM ... type queries, which
have empty target lists.

I don't have a huge preference as to which gets used here, but I don't
want to change it just to have to change it back again later.
Although, I'm not ruling out the fact that Tom might change his mind
when he sees that I had to add an "isproxy" flag to AppendPath to get
the whole thing to work. It's certainly not quite as convenient as how
dummy paths work.

> What other proxy paths do you envision, and why should they be
> represented as AppendPath?

For example:

CREATE MATERIALIZED VIEW mytable_count AS SELECT COUNT(*) FROM mytable;

which would allow queries such as:

SELECT COUNT(*) FROM mytable;

to have an isproxy AppendPath added to the pathlist for the RelOptInfo
for mytable which just performs a seqscan on mytable_count instead. It
would likely come out a much cheaper Path. The translation exprs, in
this case would translate the COUNT(*) Aggref into the Var belonging
to mytable_count.

Of course, there are other unrelated reasons why that particular
example can't work yet, but that was just an example of why I tried to
keep it generic.

> FWIW one advantage of a separate ProxyPath would be that it would be a
> much clearer breakage for third-party code (say, extensions tweaking
> paths using hooks), either at compile or execution time. With just a new
> flag in existing node, that may easily slip through. On the other hand
> no one promises the structures to be stable API ...

hmm true, but you could probably have said that for overloading an Agg
node to become Partial Aggregate and Finalize Aggregate nodes too.

> 6) I suggest we add this assert to create_append_path:
>
>     Assert(!isproxy || (isproxy && (list_length(subpaths)==1)));

That kind of indirectly exists already as two separate Asserts(),
although what I have is equivalent to:

Assert((!isproxy && translate_from == NIL) || (isproxy &&
(list_length(subpaths)==1)));

... as two separate Asserts(), which, if I'm not mistaken, is slightly
more strict than the one you mentioned.

> and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
> naming for boolean variables.

You're right. I'll change that and post an updated patch. I'll also
reprocess your email again and try to improve some comments to make
the intent of the code more clear.

Thanks for the review!

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 19 February 2018 at 18:01, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 19 February 2018 at 15:11, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
>> naming for boolean variables.
>
> You're right. I'll change that and post an updated patch. I'll also
> reprocess your email again and try to improve some comments to make
> the intent of the code more clear.

I've made a few changes to the patch. "isproxy" is now "is_proxy".
I've made the comments a bit more verbose at the top of the definition
of struct AppendPath. Also shuffled some comments around in allpaths.c

I've attached both a delta patch with just these changes, and also a
completely new patch based on a recent master.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Robert Haas
Date:
On Mon, Feb 19, 2018 at 4:02 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 19 February 2018 at 18:01, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> On 19 February 2018 at 15:11, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>> and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
>>> naming for boolean variables.
>>
>> You're right. I'll change that and post an updated patch. I'll also
>> reprocess your email again and try to improve some comments to make
>> the intent of the code more clear.
>
> I've made a few changes to the patch. "isproxy" is now "is_proxy".
> I've made the comments a bit more verbose at the top of the definition
> of struct AppendPath. Also shuffled some comments around in allpaths.c
>
> I've attached both a delta patch with just these changes, and also a
> completely new patch based on a recent master.

While I was working on
http://postgr.es/m/CA+TgmoakT5gmahbPWGqrR2nAdFOMAOnOXYoWHRdVfGWs34t6_A@mail.gmail.com
I noticed that a projection path is very close to being usable as a
proxy path, because we've already got code to strip out unnecessary
projection paths at plan generation time.  I noticed that there were a few
problems with that which I wrote patches to fix (see 0001, 0002
attached to that email) but they seem to be only minor issues.

What do you think about the idea of using a projection path as a proxy
path instead of inventing a new method?  It seems simple enough to do:

new_path = (Path *) create_projection_path(root, new_rel, old_path,
old_path->pathtarget);

...when we need a proxy path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 14 March 2018 at 09:25, Robert Haas <robertmhaas@gmail.com> wrote:
> What do you think about the idea of using a projection path as a proxy
> path instead of inventing a new method?  It seems simple enough to do:
>
> new_path = (Path *) create_projection_path(root, new_rel, old_path,
> old_path->pathtarget);
>
> ...when we need a proxy path.

I'm very open to finding a better way to do this, but does that not
just handle the targetlist issue? The proxy path also carries
information which allows the translation of Vars in other places in
the plan from the old rel into the vars of the new rel. Join
conditions, sort clauses etc.

Wouldn't a ProjectionPath just need the same additional translation
fields that I've bolted onto AppendPath to make it work properly?

I've looked at the projection path code but got a bit confused with
the dummypp field. I see where it gets set, but not where it gets
checked. A comment in create_projection_plan() seems to explain that
dummypp is pretty useless since targetlists are modified sometimes, so
I'm a bit unsure what the point of it is. Maybe that just goes to show
that my understanding of projection paths is not that great.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Robert Haas
Date:
On Tue, Mar 13, 2018 at 8:20 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 14 March 2018 at 09:25, Robert Haas <robertmhaas@gmail.com> wrote:
>> What do you think about the idea of using a projection path as a proxy
>> path instead of inventing a new method?  It seems simple enough to do:
>>
>> new_path = (Path *) create_projection_path(root, new_rel, old_path,
>> old_path->pathtarget);
>>
>> ...when we need a proxy path.
>
> I'm very open to finding a better way to do this, but does that not
> just handle the targetlist issue? The proxy path also carries
> information which allows the translation of Vars in other places in
> the plan from the old rel into the vars of the new rel. Join
> conditions, sort clauses etc.
>
> Wouldn't a ProjectionPath just need the same additional translation
> fields that I've bolted onto AppendPath to make it work properly?

Well, I guess I'm not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Robert Haas
Date:
On Thu, Mar 15, 2018 at 9:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Wouldn't a ProjectionPath just need the same additional translation
>> fields that I've bolted onto AppendPath to make it work properly?
>
> Well, I guess I'm not sure.

Sorry, hit send too soon there.  I'm not sure I entirely understand
the purpose of those changes; I think the comments could use some
work.   For example:

+       /*
+        * First we must record the translation expressions in the PlannerInfo.
+        * These need to be found when the expression translation is being done
+        * when the final plan is being assembled.
+        */

+       /*
+        * Record this child as having been promoted.  Some places treat child
+        * relations in a special way, and this will give them a VIP ticket to
+        * adulthood, where required.
+        */

These comments just recapitulate what the code does, without really
explaining what problem we're trying to solve ("need" for unspecified
reasons, unspecified "special" handling).

That said, I gather that one problem is the path might contain
references to child varnos where we need to reference parent varnos.
That does seem like something we need to handle, but I'm not sure
whether this is really the right method.  I wonder if we couldn't
deduce the necessary translations from the parent pointers of the
paths.  That is, if we've got a proxy path associated with the parent
rel and the path it's proxying is associated with the child rel, then
we need to translate from realpath->parent->relids to
proxypath->parent_relids.
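
For what it's worth, one rough sketch of how that might look is below,
leaning on the existing AppendRelInfo machinery instead of carrying
explicit translation lists around.  This is purely illustrative (the
helper name is made up; find_appinfos_by_relids() and
adjust_appendrel_attrs() are the existing planner functions):

#include "postgres.h"
#include "nodes/relation.h"
#include "optimizer/prepunion.h"

/*
 * Rewrite an expression that references the appendrel parent's varno so
 * that it references the child rel which the proxied path actually scans.
 */
static Node *
translate_proxy_expr(PlannerInfo *root, Node *expr, Path *realpath)
{
    int         nappinfos;
    AppendRelInfo **appinfos;

    /* The proxied path's parent rel is the appendrel child ... */
    appinfos = find_appinfos_by_relids(root, realpath->parent->relids,
                                       &nappinfos);

    /* ... so rewrite the parent rel's Vars into that child's Vars. */
    expr = adjust_appendrel_attrs(root, expr, nappinfos, appinfos);

    pfree(appinfos);
    return expr;
}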

Not that this is an argument of overwhelming import, but note that
ExecSupportsMarkRestore wouldn't need to change if we used
ProjectionPath; that's already got the required handling.

If we stick with your idea of using AppendPath, do we actually need
generate_proxy_paths()?  What paths get lost if we don't have a
special case for them here?

I find promote_child_rel_equivalences() kind of scary.  It seems like
a change that might have unintended and hard-to-predict consequences.
Perhaps that's only my own lack of knowledge.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> That said, I gather that one problem is the path might contain
> references to child varnos where we need to reference parent varnos.
> That does seem like something we need to handle, but I'm not sure
> whether this is really the right method.  I wonder if we couldn't
> deduce the necessary translations from the parent pointers of the
> paths.  That is, if we've got a proxy path associated with the parent
> rel and the path it's proxying is associated with the child rel, then
> we need to translate from realpath->parent->relids to
> proxypath->parent_relids.

I hadn't been paying much attention to this thread, but I've now taken
a quick look at the 2018-02-19 patch, and I've got to say I do not like
it much.  The changes in createplan.c in particular seem like hack-and-
slash rather than anything principled or maintainable.

The core issue here is that Paths involving the appendrel and higher
will contain Vars referencing the appendrel's varno, whereas the child
is set up to emit Vars containing its own varno, and somewhere we've got
to match those up.  I don't think though that this problem is exactly
specific to single-member Appends, and so I would rather we not invent a
solution that's specific to that.  A nearly identical issue is getting
rid of no-op SubqueryScan nodes.  I've long wished we could simply not
emit those in the first place, but it's really hard to do because of
the fact that Vars inside the subquery have different varnos from those
outside.  (I've toyed with the idea of globally flattening the rangetable
before we start planning, not at the end, but haven't made it happen yet;
and anyway that would be only one step towards such a goal.)

It might be worth looking at whether we couldn't fix the single-member-
Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
get rid of them.  That's not the most beautiful solution perhaps, but
it'd be very localized and low-risk.

In general setrefs.c is the right place to deal with variable-matching
issues.  So even if you don't like that specific idea, it'd probably be
worth thinking about handling this by recording instructions telling
setrefs what to do, instead of actually doing it at earlier stages.

            regards, tom lane


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Robert Haas
Date:
On Thu, Mar 15, 2018 at 11:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It might be worth looking at whether we couldn't fix the single-member-
> Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
> get rid of them.  That's not the most beautiful solution perhaps, but
> it'd be very localized and low-risk.

That's definitely a thought; it's probably the simplest way of
saving the run-time cost of the Append node.  However, I don't think
it's a great solution overall because it doesn't get us the other
advantages that David mentions in his original post.  I think that to
gain those advantages we'll need to know at path-creation time that
there won't ultimately be an Append node in the finished plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 15, 2018 at 11:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It might be worth looking at whether we couldn't fix the single-member-
>> Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
>> get rid of them.  That's not the most beautiful solution perhaps, but
>> it'd be very localized and low-risk.

> That's definitely a thought; it's a probably the simplest way of
> saving the run-time cost of the Append node.  However, I don't think
> it's a great solution overall because it doesn't get us the other
> advantages that David mentions in his original post.  I think that to
> gain those advantages we'll need to know at path-creation time that
> there won't ultimately be an Append node in the finished plan.

Meh.  We could certainly know that by inspection ("only one child?
it'll be history").  I remain of the opinion that this is a big patch
with a small patch struggling to get out.

            regards, tom lane


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 16 March 2018 at 02:46, Robert Haas <robertmhaas@gmail.com> wrote:
> If we stick with your idea of using AppendPath, do we actually need
> generate_proxy_paths()?  What paths get lost if we don't have a
> special case for them here?

Actually, we do a surprisingly good job of allowing plan shapes to
stay the same when planning for an only-child scan for an Append or
MergeAppend.  The main difference I did find was that Append and
MergeAppend don't support Mark and Restore, so if you just generated
the same paths and simply skipped over Appends and MergeAppends you'd
still be left with Materialize nodes which might not actually be
required at all.

This might not be huge, but seeing this made me worried that there
might be some valid reason, if not today then sometime in the future,
why it might not be safe to simply pluck the singleton
Append/MergeAppend nodes out of the plan tree.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On 16 March 2018 at 04:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I hadn't been paying much attention to this thread, but I've now taken
> a quick look at the 2018-02-19 patch, and I've got to say I do not like
> it much.  The changes in createplan.c in particular seem like hack-and-
> slash rather than anything principled or maintainable.

Thanks for looking at this.  I didn't manage to discover any other
working solutions to when the Vars can be replaced. If we don't do
this in createplan.c then it's going to cause problems in places such
as apply_pathtarget_labeling_to_tlist, which is well before setrefs.c
gets hold of the plan.

> The core issue here is that Paths involving the appendrel and higher
> will contain Vars referencing the appendrel's varno, whereas the child
> is set up to emit Vars containing its own varno, and somewhere we've got
> to match those up.  I don't think though that this problem is exactly
> specific to single-member Appends, and so I would rather we not invent a
> solution that's specific to that.  A nearly identical issue is getting
> rid of no-op SubqueryScan nodes.  I've long wished we could simply not
> emit those in the first place, but it's really hard to do because of
> the fact that Vars inside the subquery have different varnos from those
> outside.  (I've toyed with the idea of globally flattening the rangetable
> before we start planning, not at the end, but haven't made it happen yet;
> and anyway that would be only one step towards such a goal.)

I'm not quite sure why you think the solution I came up with is
specific to single-member Appends. The solution merely uses
single-member Append paths as a proxy path for the solution which I've
tried to make generic to any node type. For example, the patch also
resolves the issue for MergeAppend, so certainly nothing in there is
specific to single-member Appends.  I could have made the proxy any
other path type, it's just that you had suggested Append would be
better than inventing ProxyPath, which is what I originally proposed.

> It might be worth looking at whether we couldn't fix the single-member-
> Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
> get rid of them.  That's not the most beautiful solution perhaps, but
> it'd be very localized and low-risk.

It might be possible, but wouldn't that only solve one of the two
problems? The problem of the planner not generating the optimal
plan is ignored with this. For example, it does not make much sense to
bolt a Materialize node on top of an IndexScan node in order to
provide the IndexScan with mark/restore capabilities... IndexScans
already support that.

> In general setrefs.c is the right place to deal with variable-matching
> issues.  So even if you don't like that specific idea, it'd probably be
> worth thinking about handling this by recording instructions telling
> setrefs what to do, instead of actually doing it at earlier stages.

From what I can see, setrefs.c is too late. ERRORs are generated
before setrefs.c gets hold of the plan if we don't replace Vars.

I'm not opposed to finding a better way to do this.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
Now that the September 'fest is open for new patches I'm going to move
this patch over there.  This patch has become slightly less important
than some other stuff, but I'd still like to come back to it.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Michael Paquier
Date:
On Mon, Jul 02, 2018 at 09:37:31AM +1200, David Rowley wrote:
> Now that the September 'fest is open for new patches I'm going to move
> this patch over there.  This patch has become slightly less important
> than some other stuff, but I'd still like to come back to it.

Please note that the latest patch does not apply anymore, so I moved it
to CF 2018-11 (already this time of the year!), waiting for your input.
--
Michael

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Mon, 2 Jul 2018 at 09:37, David Rowley <david.rowley@2ndquadrant.com> wrote:
> Now that the September 'fest is open for new patches I'm going to move
> this patch over there.  This patch has become slightly less important
> than some other stuff, but I'd still like to come back to it.

This had bit-rotted quite a bit, so I've rebased it on top of today's master.

There are not really any other changes to speak of. I've not re-evaluated
the patch to see if there is any simpler way to take care of this
problem yet.  However, I do recall getting off to quite a few false
starts with this patch, and the way I have it now was the only way I
saw to make it possible. Although, perhaps some more experience will
show me something different.

Anyway, I've attached the rebased version.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Tomas Vondra
Date:
On 4/1/18 9:26 AM, David Rowley wrote:
> On 16 March 2018 at 04:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I hadn't been paying much attention to this thread, but I've now taken
>> a quick look at the 2018-02-19 patch, and I've got to say I do not like
>> it much.  The changes in createplan.c in particular seem like hack-and-
>> slash rather than anything principled or maintainable.
> 
> Thanks for looking at this.  I didn't manage to discover any other
> working solutions to when the Vars can be replaced. If we don't do
> this in createplan.c then it's going to cause problems in places such
> as apply_pathtarget_labeling_to_tlist, which is well before setrefs.c
> gets hold of the plan.
> 
>> The core issue here is that Paths involving the appendrel and higher
>> will contain Vars referencing the appendrel's varno, whereas the child
>> is set up to emit Vars containing its own varno, and somewhere we've got
>> to match those up.  I don't think though that this problem is exactly
>> specific to single-member Appends, and so I would rather we not invent a
>> solution that's specific to that.  A nearly identical issue is getting
>> rid of no-op SubqueryScan nodes.  I've long wished we could simply not
>> emit those in the first place, but it's really hard to do because of
>> the fact that Vars inside the subquery have different varnos from those
>> outside.  (I've toyed with the idea of globally flattening the rangetable
>> before we start planning, not at the end, but haven't made it happen yet;
>> and anyway that would be only one step towards such a goal.)
> 
> I'm not quite sure why you think the solution I came up with is
> specific to single-member Appends. The solution merely uses
> single-member Append paths as a proxy path for the solution which I've
> tried to make generic to any node type. For example, the patch also
> resolves the issue for MergeAppend, so certainly nothing in there is
> specific to single-member Appends.  I could have made the proxy any
> other path type, it's just that you had suggested Append would be
> better than inventing ProxyPath, which is what I originally proposed.
> 

I think a natural question is how the approach devised in this thread
(based on single-member Append paths) could be used to deal with no-op
Subquery nodes. I don't see an obvious way to do that, and I guess
that's what Tom meant by "specific to single-member Appends".

>> It might be worth looking at whether we couldn't fix the single-member-
>> Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
>> get rid of them.  That's not the most beautiful solution perhaps, but
>> it'd be very localized and low-risk.
> 
> It might be possible, but wouldn't that only solve 1 out of 2
> problems. The problem of the planner not generating the most optimal
> plan is ignored with this. For example, it does not make much sense to
> bolt a Materialize node on top of an IndexScan node in order to
> provide the IndexScan with mark/restore capabilities... They already
> allow that.
> 

Yeah, I think we can't do all the magic in setrefs.c if we want the
decisions to affect plan shape in any significant way - the decision
whether an Append node has only a single child needs to happen while
constructing the path or shortly after it (before we build other paths
on top of it). And before costing the path.

So setrefs.c seems too late for that :-(

I suppose this limitation was acceptable for no-op Subqueries, but I'm
not sure the same thing applies to single-member Appends.

>> In general setrefs.c is the right place to deal with variable-matching
>> issues.  So even if you don't like that specific idea, it'd probably be
>> worth thinking about handling this by recording instructions telling
>> setrefs what to do, instead of actually doing it at earlier stages.
> 
> From what I can see, setrefs.c is too late. ERRORs are generated
> before setrefs.c gets hold of the plan if we don't replace Vars.
> 
> I'm not opposed to finding a better way to do this.
> 

AFAICS the patch essentially does two things: (a) identifies Append
paths with a single member and (b) ensures the Vars are properly mapped
when the Append node is skipped when creating the plan.

I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
perhaps we could do at least (b) to setrefs.c?

I'd say mapping the Vars is the most fragile and error-prone piece of
this patch. It's a bit annoying that it's inventing a new infrastructure
to do that, translates expressions in various places, establishes new
rules for tlists (having to call finalize_plan_tlist when calling
build_path_tlist before create_plan_recurse), etc.

But we already do such mappings (not exactly the same, but similar) for
other purposes - from setrefs.c. So why can't we do it the same way here?

FWIW I haven't tried to do any of that, and I haven't looked on this
patch for quite a long time, so perhaps I'm missing an obvious obstacle
preventing us from doing that ...


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Thu, 3 Jan 2019 at 08:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> AFAICS the patch essentially does two things: (a) identifies Append
> paths with a single member and (b) ensures the Vars are properly mapped
> when the Append node is skipped when creating the plan.

Yes, but traditional Append and MergeAppend paths with a single member
are never actually generated.  I say "traditional" as we do happen to
use a single-subpath Append as the "proxy" path to represent a path
that needs to not have a plan node generated during createplan.
Originally I had a new Path type named "ProxyPath" to do this, but per
recommendation from Tom, I overloaded Append to mean that. (Append
already has its meaning overloaded in regards to dummy paths.)

> I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
> perhaps we could do at least (b) to setrefs.c?

The problem I found with doing var translations in setrefs.c was that
the plan tree is traversed there breadth-first and in createplan.c we
build the plan depth-first.  The problem I didn't manage to overcome
with that is that when we replace a "ProxyPath" (actually an Append
path marked as is_proxy=true) it only alters targetlists, etc. of nodes
above that in the plan tree. The nodes below it and their targetlists
don't care, as these don't fetch Vars from nodes above them.  If we do
the Var translation in setrefs then we've not yet looked at the nodes
that don't care and we've already done the nodes that do care. So the
tree traversal is backwards.  I sort of borrowed the traversal in
createplan.c since it was in the correct depth-first order.   Equally,
I could have invented an entirely new stage that traverses the path
tree depth-first, but I imagined that would have added more overhead
and raised even more questions. Piggy-backing on createplan seemed
like the best option at the time.

> I'd say mapping the Vars is the most fragile and error-prone piece of
> this patch. It's a bit annoying that it's inventing a new infrastructure
> to do that, translates expressions in various places, establishes new
> rules for tlists (having to call finalize_plan_tlist when calling
> build_path_tlist before create_plan_recurse), etc.

I entirely agree. It's by far the worst part of the patch.  Getting
the code to a working state to do this was hard. I kept finding places
where I'd missed the translation.  I'd rather it worked some other way.
I just don't yet know what that way is.  I changed the core regression
table "tenk" to become a partitioned table with a single partition in
the hope of catching all of the places that need the translations to be
performed.  I'm not entirely confident that I've caught them all by
doing this.

I've attached an updated patch that's really just a rebased version of
v8.  The old version conflicted with changes made in 1db5667b. The
only real change was to the commit message. I'd previously managed to
miss out the part about how not generating needless Append/MergeAppend
paths can allow plan shapes that were previously not possible.  I'd
only mentioned the executor not having to pull tuples through these
nodes, which increases performance. Not having this perhaps caused
some of the confusion earlier in this thread.

One drawback to this patch, which I'm unsure if I previously mentioned,
is that run-time pruning only works for Append and MergeAppend.  If we
remove the Append/MergeAppend node then the single Append/MergeAppend
sub-plan has no hope of being pruned.  Going from 1 to 0 sub-plans may
not be that common, but no longer supporting that is something that
needs to be considered.   I had imagined that gained flexibility of
plan shapes plus less executor overhead was better than that, but
there likely is a small use-case for keeping that ability.  It seems
impossible to have both advantages of this patch without that
disadvantage.
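
As a hypothetical illustration of that trade-off (table and statement
names made up for this example):

create table parted (a int) partition by list (a);
create table parted_p1 partition of parted for values in (1);
prepare q (int) as select * from parted where a = $1;
-- Today, once a generic plan is in use, run-time pruning can skip
-- parted_p1 entirely when q is executed with a value other than 1.
-- With the single-subplan Append removed, the scan of parted_p1 always
-- runs, even though it cannot return any rows.
execute q (2);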

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
I've attached an updated patch. The last one no longer applied due to
the changes made in d723f5687

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> On Thu, 3 Jan 2019 at 08:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> AFAICS the patch essentially does two things: (a) identifies Append
>> paths with a single member and (b) ensures the Vars are properly mapped
>> when the Append node is skipped when creating the plan.
>> I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
>> perhaps we could do at least (b) to setrefs.c?

> The problem I found with doing var translations in setrefs.c was that
> the plan tree is traversed there breadth-first and in createplan.c we
> build the plan depth-first.  The problem I didn't manage to overcome
> with that is that when we replace a "ProxyPath" (actually an Append
> path marked as is_proxy=true) it only alter targetlists, etc of nodes
> above that in the plan tree. The nodes below it and their targetlists
> don't care, as these don't fetch Vars from nodes above them.  If we do
> the Var translation in setrefs then we've not yet looked at the nodes
> that don't care and we've already done the nodes that do care. So the
> tree traversal is backwards.

I don't quite buy this argument, because you haven't given a reason
why it doesn't apply with equal force to setrefs.c's elimination of
no-op SubqueryScan nodes.  Yet that does work.

Stepping back for a minute: even if we get this to work, how often
is it going to do anything useful?  It seems like most of the other
development that's going on is trying to postpone pruning till later,
so I wonder how often we'll really know at the beginning of planning
that only one child will survive.

            regards, tom lane


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Sat, 26 Jan 2019 at 13:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > On Thu, 3 Jan 2019 at 08:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> >> AFAICS the patch essentially does two things: (a) identifies Append
> >> paths with a single member and (b) ensures the Vars are properly mapped
> >> when the Append node is skipped when creating the plan.
> >> I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
> >> perhaps we could do at least (b) to setrefs.c?
>
> > The problem I found with doing var translations in setrefs.c was that
> > the plan tree is traversed there breadth-first and in createplan.c we
> > build the plan depth-first.  The problem I didn't manage to overcome
> > with that is that when we replace a "ProxyPath" (actually an Append
> > path marked as is_proxy=true) it only alter targetlists, etc of nodes
> > above that in the plan tree. The nodes below it and their targetlists
> > don't care, as these don't fetch Vars from nodes above them.  If we do
> > the Var translation in setrefs then we've not yet looked at the nodes
> > that don't care and we've already done the nodes that do care. So the
> > tree traversal is backwards.
>
> I don't quite buy this argument, because you haven't given a reason
> why it doesn't apply with equal force to setrefs.c's elimination of
> no-op SubqueryScan nodes.  Yet that does work.

I assume you're talking about the code that's in
set_subqueryscan_references() inside the trivial_subqueryscan()
condition?

If so, then that seems pretty different from what I'm doing here
because trivial_subqueryscan() only ever returns true when the
Vars/Exprs from the subquery's target list match the scan's targetlist
exactly, in the same order.  It's not quite that simple with the
single-subpath Append/MergeAppend case, since the Vars/Exprs won't be
the same, as they're from different relations and possibly in a
different order.
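
To illustrate what I mean by "exactly", a condensed sketch of the kind
of test trivial_subqueryscan() applies is below (this is simplified and
not the actual source; the helper name is made up, and the real
function also copes with Consts and resjunk columns):

#include "postgres.h"
#include "nodes/plannodes.h"

static bool
subqueryscan_is_trivial(SubqueryScan *plan)
{
    ListCell   *lc;
    int         attrno = 1;

    /* must not filter anything out ... */
    if (plan->scan.plan.qual != NIL)
        return false;

    /* ... and must emit exactly the subplan's columns, in order */
    if (list_length(plan->scan.plan.targetlist) !=
        list_length(plan->subplan->targetlist))
        return false;

    foreach(lc, plan->scan.plan.targetlist)
    {
        TargetEntry *tle = (TargetEntry *) lfirst(lc);

        if (tle->expr == NULL || !IsA(tle->expr, Var) ||
            ((Var *) tle->expr)->varattno != attrno)
            return false;
        attrno++;
    }
    return true;
}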

> Stepping back for a minute: even if we get this to work, how often
> is it going to do anything useful?  It seems like most of the other
> development that's going on is trying to postpone pruning till later,
> so I wonder how often we'll really know at the beginning of planning
> that only one child will survive.

It'll do something useful every time the planner would otherwise
produce an Append or MergeAppend with a single subpath. Pruning down
to just 1 relation is pretty common, so it seems worth optimising for.
There's no other development that postpones pruning until later. If
you're talking about run-time pruning then nothing is "postponed". We
still attempt to prune during planning, it's just that we might not be
able to prune in some cases until execution, so we may do both.
You make it sound like we're trying to do away with plan-time
pruning, but that's not happening; in fact, we're working pretty hard
to speed up the planner when plan-time pruning *can* be done. So far
there are no patches to speed up the planner for partitioned tables when
the planner can't prune any partitions, although it is something that
I've been giving some thought to.

I've also attached a rebased patch.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Thu, 31 Jan 2019 at 17:22, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I've also attached a rebased patch.

Rebased again.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Tue, 5 Feb 2019 at 12:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
> Rebased again.

and again, due to new create_append_path() call added in ab5fcf2b04f9.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment
David Rowley <david.rowley@2ndquadrant.com> writes:
> [ v13-0001-Forgo-generating-single-subpath-Append-and-Merge.patch ]

I continue to think that this is the wrong way to go about it,
and as proof of concept present the attached, which reproduces
all of the regression-test plan changes appearing in v13 ---
with a whole lot less mechanism and next to no added planning
cycles (which certainly cannot be said of v13).

I was a bit surprised to find that I didn't need to fool around
with lying about whether [Merge]Append can project.  I've not dug
into the exact reason why, but I suspect it's that previous changes
made in support of parallelization have resulted in ensuring that
we push the upper tlist down to the children anyway, at some earlier
stage.

I haven't looked into whether this does the right things for parallel
planning --- possibly create_[merge]append_path need to propagate up
parallel-related path fields from the single child?

Also, I wonder why you didn't teach ExecSupportsMarkRestore that a
single-child MergeAppend can support mark/restore.  I didn't add such
code here either, but I suspect that's an oversight.
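
(Presumably the MergeAppend case would simply mirror the T_Append case
added to execAmi.c below; an untested sketch:)

        case T_MergeAppend:
            {
                MergeAppendPath *mpath = castNode(MergeAppendPath, pathnode);

                /*
                 * As with Append, a single-child MergeAppend will be removed
                 * later, so mark/restore support depends only on whether the
                 * surviving child can handle it.
                 */
                if (list_length(mpath->subpaths) == 1)
                    return ExecSupportsMarkRestore((Path *) linitial(mpath->subpaths));
                /* Otherwise, MergeAppend can't handle it */
                return false;
            }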

One other remark is that the division of labor between
create_[merge]append_path and their costsize.c subroutines
seems pretty unprincipled.  I'd be inclined to push all the
relevant logic into costsize.c, but have not done so here.
Moving the existing cost calculations in create_mergeappend_path
into costsize.c would better be done as a separate refactoring patch,
perhaps.

            regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 42108bd..2be67bc 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8427,17 +8427,16 @@ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER J
  400 | 400 | 0008
 (4 rows)

--- left outer join + nullable clasue
-EXPLAIN (COSTS OFF)
+-- left outer join + nullable clause
+EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
-                                    QUERY PLAN
------------------------------------------------------------------------------------
- Sort
-   Sort Key: t1.a, ftprt2_p1.b, ftprt2_p1.c
-   ->  Append
-         ->  Foreign Scan
-               Relations: (public.ftprt1_p1 t1) LEFT JOIN (public.ftprt2_p1 fprt2)
-(5 rows)
+                                                                                                                   QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
+   Output: t1.a, ftprt2_p1.b, ftprt2_p1.c
+   Relations: (public.ftprt1_p1 t1) LEFT JOIN (public.ftprt2_p1 fprt2)
+   Remote SQL: SELECT r6.a, r9.b, r9.c FROM (public.fprt1_p1 r6 LEFT JOIN public.fprt2_p1 r9 ON (((r6.a = r9.b)) AND ((r6.b = r9.a)) AND ((r9.a < 10)))) WHERE ((r6.a < 10)) ORDER BY r6.a ASC NULLS LAST, r9.b ASC NULLS LAST, r9.c ASC NULLS LAST
+(4 rows)

 SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
  a | b |  c
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index eb9d1ad..4728511 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -2309,8 +2309,8 @@ EXPLAIN (COSTS OFF)
 SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 = 0 ORDER BY 1,2,3;
 SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 = 0 ORDER BY 1,2,3;

--- left outer join + nullable clasue
-EXPLAIN (COSTS OFF)
+-- left outer join + nullable clause
+EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
 SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;

diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index 187f892..14ad738 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -447,6 +447,21 @@ ExecSupportsMarkRestore(Path *pathnode)
                 return false;    /* childless Result */
             }

+        case T_Append:
+            {
+                AppendPath *appendPath = castNode(AppendPath, pathnode);
+
+                /*
+                 * If there's exactly one child, then there will be no Append
+                 * in the final plan, so we can handle mark/restore if the
+                 * child plan node can.
+                 */
+                if (list_length(appendPath->subpaths) == 1)
+                    return ExecSupportsMarkRestore((Path *) linitial(appendPath->subpaths));
+                /* Otherwise, Append can't handle it */
+                return false;
+            }
+
         default:
             break;
     }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 236f506..4c823e9 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1134,10 +1134,10 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path)
     }

     /*
-     * XXX ideally, if there's just one child, we'd not bother to generate an
-     * Append node but just return the single child.  At the moment this does
-     * not work because the varno of the child scan plan won't match the
-     * parent-rel Vars it'll be asked to emit.
+     * And build the Append plan.  Note that if there's just one child, the
+     * Append is pretty useless; but we wait till setrefs.c to get rid of it.
+     * Doing so here doesn't work because the varno of the child scan plan
+     * won't match the parent-rel Vars it'll be asked to emit.
      */

     plan = make_append(subplans, best_path->first_partial_path,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 0213a37..4204ca4 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -94,12 +94,19 @@ static Plan *set_subqueryscan_references(PlannerInfo *root,
                             SubqueryScan *plan,
                             int rtoffset);
 static bool trivial_subqueryscan(SubqueryScan *plan);
+static Plan *clean_up_removed_plan_level(Plan *parent, Plan *child);
 static void set_foreignscan_references(PlannerInfo *root,
                            ForeignScan *fscan,
                            int rtoffset);
 static void set_customscan_references(PlannerInfo *root,
                           CustomScan *cscan,
                           int rtoffset);
+static Plan *set_append_references(PlannerInfo *root,
+                      Append *aplan,
+                      int rtoffset);
+static Plan *set_mergeappend_references(PlannerInfo *root,
+                           MergeAppend *mplan,
+                           int rtoffset);
 static Node *fix_scan_expr(PlannerInfo *root, Node *node, int rtoffset);
 static Node *fix_scan_expr_mutator(Node *node, fix_scan_expr_context *context);
 static bool fix_scan_expr_walker(Node *node, fix_scan_expr_context *context);
@@ -181,19 +188,22 @@ static List *set_returning_clause_references(PlannerInfo *root,
  * 8. We assign every plan node in the tree a unique ID.
  *
  * We also perform one final optimization step, which is to delete
- * SubqueryScan plan nodes that aren't doing anything useful (ie, have
- * no qual and a no-op targetlist).  The reason for doing this last is that
+ * SubqueryScan, Append, and MergeAppend plan nodes that aren't doing
+ * anything useful.  The reason for doing this last is that
  * it can't readily be done before set_plan_references, because it would
- * break set_upper_references: the Vars in the subquery's top tlist
- * wouldn't match up with the Vars in the outer plan tree.  The SubqueryScan
+ * break set_upper_references: the Vars in the child plan's top tlist
+ * wouldn't match up with the Vars in the outer plan tree.  A SubqueryScan
  * serves a necessary function as a buffer between outer query and subquery
  * variable numbering ... but after we've flattened the rangetable this is
  * no longer a problem, since then there's only one rtindex namespace.
+ * Likewise, Append and MergeAppend buffer between the parent and child vars
+ * of an appendrel, but we don't need to worry about that once we've done
+ * set_plan_references.
  *
  * set_plan_references recursively traverses the whole plan tree.
  *
  * The return value is normally the same Plan node passed in, but can be
- * different when the passed-in Plan is a SubqueryScan we decide isn't needed.
+ * different when the passed-in Plan is a node we decide isn't needed.
  *
  * The flattened rangetable entries are appended to root->glob->finalrtable.
  * Also, rowmarks entries are appended to root->glob->finalrowmarks, and the
@@ -897,71 +907,15 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
             }
             break;
         case T_Append:
-            {
-                Append       *splan = (Append *) plan;
-
-                /*
-                 * Append, like Sort et al, doesn't actually evaluate its
-                 * targetlist or check quals.
-                 */
-                set_dummy_tlist_references(plan, rtoffset);
-                Assert(splan->plan.qual == NIL);
-                foreach(l, splan->appendplans)
-                {
-                    lfirst(l) = set_plan_refs(root,
-                                              (Plan *) lfirst(l),
-                                              rtoffset);
-                }
-                if (splan->part_prune_info)
-                {
-                    foreach(l, splan->part_prune_info->prune_infos)
-                    {
-                        List       *prune_infos = lfirst(l);
-                        ListCell   *l2;
-
-                        foreach(l2, prune_infos)
-                        {
-                            PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-                            pinfo->rtindex += rtoffset;
-                        }
-                    }
-                }
-            }
-            break;
+            /* Needs special treatment, see comments below */
+            return set_append_references(root,
+                                         (Append *) plan,
+                                         rtoffset);
         case T_MergeAppend:
-            {
-                MergeAppend *splan = (MergeAppend *) plan;
-
-                /*
-                 * MergeAppend, like Sort et al, doesn't actually evaluate its
-                 * targetlist or check quals.
-                 */
-                set_dummy_tlist_references(plan, rtoffset);
-                Assert(splan->plan.qual == NIL);
-                foreach(l, splan->mergeplans)
-                {
-                    lfirst(l) = set_plan_refs(root,
-                                              (Plan *) lfirst(l),
+            /* Needs special treatment, see comments below */
+            return set_mergeappend_references(root,
+                                              (MergeAppend *) plan,
                                               rtoffset);
-                }
-                if (splan->part_prune_info)
-                {
-                    foreach(l, splan->part_prune_info->prune_infos)
-                    {
-                        List       *prune_infos = lfirst(l);
-                        ListCell   *l2;
-
-                        foreach(l2, prune_infos)
-                        {
-                            PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-                            pinfo->rtindex += rtoffset;
-                        }
-                    }
-                }
-            }
-            break;
         case T_RecursiveUnion:
             /* This doesn't evaluate targetlist or check quals either */
             set_dummy_tlist_references(plan, rtoffset);
@@ -1086,30 +1040,7 @@ set_subqueryscan_references(PlannerInfo *root,
         /*
          * We can omit the SubqueryScan node and just pull up the subplan.
          */
-        ListCell   *lp,
-                   *lc;
-
-        result = plan->subplan;
-
-        /* We have to be sure we don't lose any initplans */
-        result->initPlan = list_concat(plan->scan.plan.initPlan,
-                                       result->initPlan);
-
-        /*
-         * We also have to transfer the SubqueryScan's result-column names
-         * into the subplan, else columns sent to client will be improperly
-         * labeled if this is the topmost plan level.  Copy the "source
-         * column" information too.
-         */
-        forboth(lp, plan->scan.plan.targetlist, lc, result->targetlist)
-        {
-            TargetEntry *ptle = (TargetEntry *) lfirst(lp);
-            TargetEntry *ctle = (TargetEntry *) lfirst(lc);
-
-            ctle->resname = ptle->resname;
-            ctle->resorigtbl = ptle->resorigtbl;
-            ctle->resorigcol = ptle->resorigcol;
-        }
+        result = clean_up_removed_plan_level((Plan *) plan, plan->subplan);
     }
     else
     {
@@ -1191,6 +1122,30 @@ trivial_subqueryscan(SubqueryScan *plan)
 }

 /*
+ * clean_up_removed_plan_level
+ *        Do necessary cleanup when we strip out a SubqueryScan, Append, etc
+ *
+ * We are dropping the "parent" plan in favor of returning just its "child".
+ * A few small tweaks are needed.
+ */
+static Plan *
+clean_up_removed_plan_level(Plan *parent, Plan *child)
+{
+    /* We have to be sure we don't lose any initplans */
+    child->initPlan = list_concat(parent->initPlan,
+                                  child->initPlan);
+
+    /*
+     * We also have to transfer the parent's column labeling info into the
+     * child, else columns sent to client will be improperly labeled if this
+     * is the topmost plan level.  resjunk and so on may be important too.
+     */
+    apply_tlist_labeling(child->targetlist, parent->targetlist);
+
+    return child;
+}
+
+/*
  * set_foreignscan_references
  *       Do set_plan_references processing on a ForeignScan
  */
@@ -1341,6 +1296,131 @@ set_customscan_references(PlannerInfo *root,
 }

 /*
+ * set_append_references
+ *        Do set_plan_references processing on an Append
+ *
+ * We try to strip out the Append entirely; if we can't, we have
+ * to do the normal processing on it.
+ */
+static Plan *
+set_append_references(PlannerInfo *root,
+                      Append *aplan,
+                      int rtoffset)
+{
+    ListCell   *l;
+
+    /*
+     * Append, like Sort et al, doesn't actually evaluate its targetlist or
+     * check quals.  If it's got exactly one child plan, then it's not doing
+     * anything useful at all, and we can strip it out.
+     */
+    Assert(aplan->plan.qual == NIL);
+
+    /* First, we gotta recurse on the children */
+    foreach(l, aplan->appendplans)
+    {
+        lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
+    }
+
+    /* Now, if there's just one, forget the Append and return that child */
+    if (list_length(aplan->appendplans) == 1)
+        return clean_up_removed_plan_level((Plan *) aplan,
+                                           (Plan *) linitial(aplan->appendplans));
+
+    /*
+     * Otherwise, clean up the Append as needed.  It's okay to do this after
+     * recursing to the children, because set_dummy_tlist_references doesn't
+     * look at those.
+     */
+    set_dummy_tlist_references((Plan *) aplan, rtoffset);
+
+    if (aplan->part_prune_info)
+    {
+        foreach(l, aplan->part_prune_info->prune_infos)
+        {
+            List       *prune_infos = lfirst(l);
+            ListCell   *l2;
+
+            foreach(l2, prune_infos)
+            {
+                PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+                pinfo->rtindex += rtoffset;
+            }
+        }
+    }
+
+    /* We don't need to recurse to lefttree or righttree ... */
+    Assert(aplan->plan.lefttree == NULL);
+    Assert(aplan->plan.righttree == NULL);
+
+    return (Plan *) aplan;
+}
+
+/*
+ * set_mergeappend_references
+ *        Do set_plan_references processing on a MergeAppend
+ *
+ * We try to strip out the MergeAppend entirely; if we can't, we have
+ * to do the normal processing on it.
+ */
+static Plan *
+set_mergeappend_references(PlannerInfo *root,
+                           MergeAppend *mplan,
+                           int rtoffset)
+{
+    ListCell   *l;
+
+    /*
+     * MergeAppend, like Sort et al, doesn't actually evaluate its targetlist
+     * or check quals.  If it's got exactly one child plan, then it's not
+     * doing anything useful at all, and we can strip it out.
+     */
+    Assert(mplan->plan.qual == NIL);
+
+    /* First, we gotta recurse on the children */
+    foreach(l, mplan->mergeplans)
+    {
+        lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
+    }
+
+    /* Now, if there's just one, forget the MergeAppend and return that child */
+    if (list_length(mplan->mergeplans) == 1)
+        return clean_up_removed_plan_level((Plan *) mplan,
+                                           (Plan *) linitial(mplan->mergeplans));
+
+    /*
+     * Otherwise, clean up the MergeAppend as needed.  It's okay to do this
+     * after recursing to the children, because set_dummy_tlist_references
+     * doesn't look at those.
+     */
+    set_dummy_tlist_references((Plan *) mplan, rtoffset);
+
+    if (mplan->part_prune_info)
+    {
+        foreach(l, mplan->part_prune_info->prune_infos)
+        {
+            List       *prune_infos = lfirst(l);
+            ListCell   *l2;
+
+            foreach(l2, prune_infos)
+            {
+                PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+                pinfo->rtindex += rtoffset;
+            }
+        }
+    }
+
+    /* We don't need to recurse to lefttree or righttree ... */
+    Assert(mplan->plan.lefttree == NULL);
+    Assert(mplan->plan.righttree == NULL);
+
+    return (Plan *) mplan;
+}
+
+
+/*
  * copyVar
  *        Copy a Var node.
  *
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 169e51e..dde9e30 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1276,7 +1276,21 @@ create_append_path(PlannerInfo *root,

     Assert(!parallel_aware || pathnode->path.parallel_safe);

-    cost_append(pathnode);
+    /*
+     * If there's exactly one child path, the Append is a no-op and will be
+     * discarded later (in setrefs.c); therefore, we can inherit the child's
+     * size and cost.  Otherwise, do the normal costsize calculation.
+     */
+    if (list_length(pathnode->subpaths) == 1)
+    {
+        Path       *child = (Path *) linitial(pathnode->subpaths);
+
+        pathnode->path.rows = child->rows;
+        pathnode->path.startup_cost = child->startup_cost;
+        pathnode->path.total_cost = child->total_cost;
+    }
+    else
+        cost_append(pathnode);

     /* If the caller provided a row estimate, override the computed value. */
     if (rows >= 0)
@@ -1408,11 +1422,21 @@ create_merge_append_path(PlannerInfo *root,
         Assert(bms_equal(PATH_REQ_OUTER(subpath), required_outer));
     }

-    /* Now we can compute total costs of the MergeAppend */
-    cost_merge_append(&pathnode->path, root,
-                      pathkeys, list_length(subpaths),
-                      input_startup_cost, input_total_cost,
-                      pathnode->path.rows);
+    /*
+     * Now we can compute total costs of the MergeAppend.  If there's exactly
+     * one child path, the MergeAppend is a no-op and will be discarded later
+     * (in setrefs.c); otherwise we do the normal cost calculation.
+     */
+    if (list_length(subpaths) == 1)
+    {
+        pathnode->path.startup_cost = input_startup_cost;
+        pathnode->path.total_cost = input_total_cost;
+    }
+    else
+        cost_merge_append(&pathnode->path, root,
+                          pathkeys, list_length(subpaths),
+                          input_startup_cost, input_total_cost,
+                          pathnode->path.rows);

     return pathnode;
 }
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 565d947..7518148 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1770,12 +1770,11 @@ explain (costs off) select * from list_parted;
 (4 rows)

 explain (costs off) select * from list_parted where a is null;
-           QUERY PLAN
---------------------------------
- Append
-   ->  Seq Scan on part_null_xy
-         Filter: (a IS NULL)
-(3 rows)
+        QUERY PLAN
+--------------------------
+ Seq Scan on part_null_xy
+   Filter: (a IS NULL)
+(2 rows)

 explain (costs off) select * from list_parted where a is not null;
            QUERY PLAN
@@ -1800,20 +1799,18 @@ explain (costs off) select * from list_parted where a in ('ab', 'cd', 'ef');
 (5 rows)

 explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');
-                                      QUERY PLAN
----------------------------------------------------------------------------------------
- Append
-   ->  Seq Scan on part_ab_cd
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-(3 rows)
+                                   QUERY PLAN
+---------------------------------------------------------------------------------
+ Seq Scan on part_ab_cd
+   Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
+(2 rows)

 explain (costs off) select * from list_parted where a = 'ab';
-                QUERY PLAN
-------------------------------------------
- Append
-   ->  Seq Scan on part_ab_cd
-         Filter: ((a)::text = 'ab'::text)
-(3 rows)
+             QUERY PLAN
+------------------------------------
+ Seq Scan on part_ab_cd
+   Filter: ((a)::text = 'ab'::text)
+(2 rows)

 create table range_list_parted (
     a    int,
@@ -1893,12 +1890,11 @@ explain (costs off) select * from range_list_parted where a is null;

 /* Should only select rows from the null-accepting partition */
 explain (costs off) select * from range_list_parted where b is null;
-             QUERY PLAN
-------------------------------------
- Append
-   ->  Seq Scan on part_40_inf_null
-         Filter: (b IS NULL)
-(3 rows)
+          QUERY PLAN
+------------------------------
+ Seq Scan on part_40_inf_null
+   Filter: (b IS NULL)
+(2 rows)

 explain (costs off) select * from range_list_parted where a is not null and a < 67;
                    QUERY PLAN
@@ -2021,12 +2017,11 @@ explain (costs off) select * from mcrparted where a > -1;    -- scans all partition
 (15 rows)

 explain (costs off) select * from mcrparted where a = 20 and abs(b) = 10 and c > 10;    -- scans mcrparted4
-                        QUERY PLAN
------------------------------------------------------------
- Append
-   ->  Seq Scan on mcrparted4
-         Filter: ((c > 10) AND (a = 20) AND (abs(b) = 10))
-(3 rows)
+                     QUERY PLAN
+-----------------------------------------------------
+ Seq Scan on mcrparted4
+   Filter: ((c > 10) AND (a = 20) AND (abs(b) = 10))
+(2 rows)

 explain (costs off) select * from mcrparted where a = 20 and c > 20; -- scans mcrparted3, mcrparte4, mcrparte5, mcrparted_def
                QUERY PLAN
@@ -2050,22 +2045,18 @@ create table parted_minmax1 partition of parted_minmax for values from (1) to (1
 create index parted_minmax1i on parted_minmax1 (a, b);
 insert into parted_minmax values (1,'12345');
 explain (costs off) select min(a), max(a) from parted_minmax where b = '12345';
-                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------
+                                           QUERY PLAN
+-------------------------------------------------------------------------------------------------
  Result
    InitPlan 1 (returns $0)
      ->  Limit
-           ->  Merge Append
-                 Sort Key: parted_minmax1.a
-                 ->  Index Only Scan using parted_minmax1i on parted_minmax1
-                       Index Cond: ((a IS NOT NULL) AND (b = '12345'::text))
+           ->  Index Only Scan using parted_minmax1i on parted_minmax1
+                 Index Cond: ((a IS NOT NULL) AND (b = '12345'::text))
    InitPlan 2 (returns $1)
      ->  Limit
-           ->  Merge Append
-                 Sort Key: parted_minmax1_1.a DESC
-                 ->  Index Only Scan Backward using parted_minmax1i on parted_minmax1 parted_minmax1_1
-                       Index Cond: ((a IS NOT NULL) AND (b = '12345'::text))
-(13 rows)
+           ->  Index Only Scan Backward using parted_minmax1i on parted_minmax1 parted_minmax1_1
+                 Index Cond: ((a IS NOT NULL) AND (b = '12345'::text))
+(9 rows)

 select min(a), max(a) from parted_minmax where b = '12345';
  min | max
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index bbdc373..e19535d 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -186,19 +186,18 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT 50 phv, * FROM prt1 WHERE prt1.b = 0)
 -- Join with pruned partitions from joining relations
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.a < 450 AND t2.b > 250 AND t1.b = 0 ORDER BY t1.a, t2.b;
-                        QUERY PLAN
------------------------------------------------------------
+                     QUERY PLAN
+-----------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Append
-         ->  Hash Join
-               Hash Cond: (t2.b = t1.a)
-               ->  Seq Scan on prt2_p2 t2
-                     Filter: (b > 250)
-               ->  Hash
-                     ->  Seq Scan on prt1_p2 t1
-                           Filter: ((a < 450) AND (b = 0))
-(10 rows)
+   ->  Hash Join
+         Hash Cond: (t2.b = t1.a)
+         ->  Seq Scan on prt2_p2 t2
+               Filter: (b > 250)
+         ->  Hash
+               ->  Seq Scan on prt1_p2 t1
+                     Filter: ((a < 450) AND (b = 0))
+(9 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.a < 450 AND t2.b > 250 AND t1.b = 0 ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -1480,10 +1479,9 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.b AND t1
                      ->  Seq Scan on prt2_l_p3_p1 t2_3
                      ->  Seq Scan on prt2_l_p3_p2 t2_4
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on prt1_l_p3_p1 t1_3
-                                 Filter: (b = 0)
-(29 rows)
+                     ->  Seq Scan on prt1_l_p3_p1 t1_3
+                           Filter: (b = 0)
+(28 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.b AND t1.b = 0 ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -1526,10 +1524,9 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 LEFT JOIN prt2_l t2 ON t1.a = t2.b
                      ->  Seq Scan on prt2_l_p3_p1 t2_3
                      ->  Seq Scan on prt2_l_p3_p2 t2_4
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on prt1_l_p3_p1 t1_3
-                                 Filter: (b = 0)
-(30 rows)
+                     ->  Seq Scan on prt1_l_p3_p1 t1_3
+                           Filter: (b = 0)
+(29 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 LEFT JOIN prt2_l t2 ON t1.a = t2.b AND t1.c = t2.c WHERE t1.b = 0 ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -1580,10 +1577,9 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
                      ->  Seq Scan on prt1_l_p3_p1 t1_3
                      ->  Seq Scan on prt1_l_p3_p2 t1_4
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on prt2_l_p3_p1 t2_3
-                                 Filter: (a = 0)
-(30 rows)
+                     ->  Seq Scan on prt2_l_p3_p1 t2_3
+                           Filter: (a = 0)
+(29 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b AND t1.c = t2.c WHERE t2.a = 0 ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -1629,14 +1625,12 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE prt1_l.b = 0) t1
                            Filter: (a = 0)
          ->  Hash Full Join
                Hash Cond: ((prt1_l_p3_p1.a = prt2_l_p3_p1.b) AND ((prt1_l_p3_p1.c)::text = (prt2_l_p3_p1.c)::text))
-               ->  Append
-                     ->  Seq Scan on prt1_l_p3_p1
-                           Filter: (b = 0)
+               ->  Seq Scan on prt1_l_p3_p1
+                     Filter: (b = 0)
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on prt2_l_p3_p1
-                                 Filter: (a = 0)
-(33 rows)
+                     ->  Seq Scan on prt2_l_p3_p1
+                           Filter: (a = 0)
+(31 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE prt1_l.b = 0) t1 FULL JOIN (SELECT * FROM prt2_l WHERE prt2_l.a = 0) t2 ON (t1.a = t2.b AND t1.c = t2.c) ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -1697,9 +1691,8 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
                            ->  Seq Scan on prt1_l_p2_p2 t2_2
                                  Filter: ((t1_2.a = a) AND ((t1_2.c)::text = (c)::text))
          ->  Nested Loop Left Join
-               ->  Append
-                     ->  Seq Scan on prt1_l_p3_p1 t1_3
-                           Filter: (b = 0)
+               ->  Seq Scan on prt1_l_p3_p1 t1_3
+                     Filter: (b = 0)
                ->  Hash Join
                      Hash Cond: ((t3_3.b = t2_3.a) AND ((t3_3.c)::text = (t2_3.c)::text))
                      ->  Append
@@ -1711,7 +1704,7 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
                                        Filter: ((t1_3.a = a) AND ((t1_3.c)::text = (c)::text))
                                  ->  Seq Scan on prt1_l_p3_p2 t2_4
                                        Filter: ((t1_3.a = a) AND ((t1_3.c)::text = (c)::text))
-(45 rows)
+(44 rows)

 SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t2.c AS t2c, t2.b AS t2b, t3.b AS t3b, least(t1.a,t2.a,t3.b) FROM prt1_l t2 JOIN prt2_l t3 ON (t2.a = t3.b AND t2.c = t3.c)) ss
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 30946f7..7659ef7 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -43,20 +43,18 @@ explain (costs off) select * from lp where a > 'a' and a <= 'd';
 (7 rows)

 explain (costs off) select * from lp where a = 'a';
-            QUERY PLAN
------------------------------------
- Append
-   ->  Seq Scan on lp_ad
-         Filter: (a = 'a'::bpchar)
-(3 rows)
+         QUERY PLAN
+-----------------------------
+ Seq Scan on lp_ad
+   Filter: (a = 'a'::bpchar)
+(2 rows)

 explain (costs off) select * from lp where 'a' = a;    /* commuted */
-            QUERY PLAN
------------------------------------
- Append
-   ->  Seq Scan on lp_ad
-         Filter: ('a'::bpchar = a)
-(3 rows)
+         QUERY PLAN
+-----------------------------
+ Seq Scan on lp_ad
+   Filter: ('a'::bpchar = a)
+(2 rows)

 explain (costs off) select * from lp where a is not null;
            QUERY PLAN
@@ -75,12 +73,11 @@ explain (costs off) select * from lp where a is not null;
 (11 rows)

 explain (costs off) select * from lp where a is null;
-         QUERY PLAN
------------------------------
- Append
-   ->  Seq Scan on lp_null
-         Filter: (a IS NULL)
-(3 rows)
+      QUERY PLAN
+-----------------------
+ Seq Scan on lp_null
+   Filter: (a IS NULL)
+(2 rows)

 explain (costs off) select * from lp where a = 'a' or a = 'c';
                         QUERY PLAN
@@ -150,12 +147,11 @@ create table coll_pruning_a partition of coll_pruning for values in ('a');
 create table coll_pruning_b partition of coll_pruning for values in ('b');
 create table coll_pruning_def partition of coll_pruning default;
 explain (costs off) select * from coll_pruning where a collate "C" = 'a' collate "C";
-                 QUERY PLAN
----------------------------------------------
- Append
-   ->  Seq Scan on coll_pruning_a
-         Filter: (a = 'a'::text COLLATE "C")
-(3 rows)
+              QUERY PLAN
+---------------------------------------
+ Seq Scan on coll_pruning_a
+   Filter: (a = 'a'::text COLLATE "C")
+(2 rows)

 -- collation doesn't match the partitioning collation, no pruning occurs
 explain (costs off) select * from coll_pruning where a collate "POSIX" = 'a' collate "POSIX";
@@ -192,20 +188,18 @@ create table rlp5 partition of rlp for values from (31) to (maxvalue) partition
 create table rlp5_default partition of rlp5 default;
 create table rlp5_1 partition of rlp5 for values from (31) to (40);
 explain (costs off) select * from rlp where a < 1;
-       QUERY PLAN
--------------------------
- Append
-   ->  Seq Scan on rlp1
-         Filter: (a < 1)
-(3 rows)
+    QUERY PLAN
+-------------------
+ Seq Scan on rlp1
+   Filter: (a < 1)
+(2 rows)

 explain (costs off) select * from rlp where 1 > a;    /* commuted */
-       QUERY PLAN
--------------------------
- Append
-   ->  Seq Scan on rlp1
-         Filter: (1 > a)
-(3 rows)
+    QUERY PLAN
+-------------------
+ Seq Scan on rlp1
+   Filter: (1 > a)
+(2 rows)

 explain (costs off) select * from rlp where a <= 1;
         QUERY PLAN
@@ -218,20 +212,18 @@ explain (costs off) select * from rlp where a <= 1;
 (5 rows)

 explain (costs off) select * from rlp where a = 1;
-       QUERY PLAN
--------------------------
- Append
-   ->  Seq Scan on rlp2
-         Filter: (a = 1)
-(3 rows)
+    QUERY PLAN
+-------------------
+ Seq Scan on rlp2
+   Filter: (a = 1)
+(2 rows)

 explain (costs off) select * from rlp where a = 1::bigint;        /* same as above */
-            QUERY PLAN
------------------------------------
- Append
-   ->  Seq Scan on rlp2
-         Filter: (a = '1'::bigint)
-(3 rows)
+         QUERY PLAN
+-----------------------------
+ Seq Scan on rlp2
+   Filter: (a = '1'::bigint)
+(2 rows)

 explain (costs off) select * from rlp where a = 1::numeric;        /* no pruning */
                   QUERY PLAN
@@ -384,20 +376,18 @@ explain (costs off) select * from rlp where a = 16;
 (9 rows)

 explain (costs off) select * from rlp where a = 16 and b in ('not', 'in', 'here');
-                                 QUERY PLAN
-----------------------------------------------------------------------------
- Append
-   ->  Seq Scan on rlp3_default
-         Filter: ((a = 16) AND ((b)::text = ANY ('{not,in,here}'::text[])))
-(3 rows)
+                              QUERY PLAN
+----------------------------------------------------------------------
+ Seq Scan on rlp3_default
+   Filter: ((a = 16) AND ((b)::text = ANY ('{not,in,here}'::text[])))
+(2 rows)

 explain (costs off) select * from rlp where a = 16 and b < 'ab';
-                       QUERY PLAN
----------------------------------------------------------
- Append
-   ->  Seq Scan on rlp3_default
-         Filter: (((b)::text < 'ab'::text) AND (a = 16))
-(3 rows)
+                    QUERY PLAN
+---------------------------------------------------
+ Seq Scan on rlp3_default
+   Filter: (((b)::text < 'ab'::text) AND (a = 16))
+(2 rows)

 explain (costs off) select * from rlp where a = 16 and b <= 'ab';
                         QUERY PLAN
@@ -410,12 +400,11 @@ explain (costs off) select * from rlp where a = 16 and b <= 'ab';
 (5 rows)

 explain (costs off) select * from rlp where a = 16 and b is null;
-                 QUERY PLAN
---------------------------------------------
- Append
-   ->  Seq Scan on rlp3nullxy
-         Filter: ((b IS NULL) AND (a = 16))
-(3 rows)
+              QUERY PLAN
+--------------------------------------
+ Seq Scan on rlp3nullxy
+   Filter: ((b IS NULL) AND (a = 16))
+(2 rows)

 explain (costs off) select * from rlp where a = 16 and b is not null;
                    QUERY PLAN
@@ -432,12 +421,11 @@ explain (costs off) select * from rlp where a = 16 and b is not null;
 (9 rows)

 explain (costs off) select * from rlp where a is null;
-             QUERY PLAN
-------------------------------------
- Append
-   ->  Seq Scan on rlp_default_null
-         Filter: (a IS NULL)
-(3 rows)
+          QUERY PLAN
+------------------------------
+ Seq Scan on rlp_default_null
+   Filter: (a IS NULL)
+(2 rows)

 explain (costs off) select * from rlp where a is not null;
               QUERY PLAN
@@ -486,12 +474,11 @@ explain (costs off) select * from rlp where a > 30;
 (7 rows)

 explain (costs off) select * from rlp where a = 30;    /* only default is scanned */
-            QUERY PLAN
-----------------------------------
- Append
-   ->  Seq Scan on rlp_default_30
-         Filter: (a = 30)
-(3 rows)
+         QUERY PLAN
+----------------------------
+ Seq Scan on rlp_default_30
+   Filter: (a = 30)
+(2 rows)

 explain (costs off) select * from rlp where a <= 31;
               QUERY PLAN
@@ -528,12 +515,11 @@ explain (costs off) select * from rlp where a <= 31;
 (29 rows)

 explain (costs off) select * from rlp where a = 1 or a = 7;
-              QUERY PLAN
---------------------------------------
- Append
-   ->  Seq Scan on rlp2
-         Filter: ((a = 1) OR (a = 7))
-(3 rows)
+           QUERY PLAN
+--------------------------------
+ Seq Scan on rlp2
+   Filter: ((a = 1) OR (a = 7))
+(2 rows)

 explain (costs off) select * from rlp where a = 1 or b = 'ab';
                       QUERY PLAN
@@ -580,12 +566,11 @@ explain (costs off) select * from rlp where a > 20 and a < 27;
 (9 rows)

 explain (costs off) select * from rlp where a = 29;
-           QUERY PLAN
---------------------------------
- Append
-   ->  Seq Scan on rlp4_default
-         Filter: (a = 29)
-(3 rows)
+        QUERY PLAN
+--------------------------
+ Seq Scan on rlp4_default
+   Filter: (a = 29)
+(2 rows)

 explain (costs off) select * from rlp where a >= 29;
               QUERY PLAN
@@ -605,12 +590,11 @@ explain (costs off) select * from rlp where a >= 29;

 -- redundant clauses are eliminated
 explain (costs off) select * from rlp where a > 1 and a = 10;    /* only default */
-               QUERY PLAN
-----------------------------------------
- Append
-   ->  Seq Scan on rlp_default_10
-         Filter: ((a > 1) AND (a = 10))
-(3 rows)
+            QUERY PLAN
+----------------------------------
+ Seq Scan on rlp_default_10
+   Filter: ((a > 1) AND (a = 10))
+(2 rows)

 explain (costs off) select * from rlp where a > 1 and a >=15;    /* rlp3 onwards, including default */
                QUERY PLAN
@@ -797,20 +781,18 @@ explain (costs off) select * from mc3p where a <= 10 and abs(b) < 10;
 (9 rows)

 explain (costs off) select * from mc3p where a = 11 and abs(b) = 0;
-                 QUERY PLAN
----------------------------------------------
- Append
-   ->  Seq Scan on mc3p_default
-         Filter: ((a = 11) AND (abs(b) = 0))
-(3 rows)
+              QUERY PLAN
+---------------------------------------
+ Seq Scan on mc3p_default
+   Filter: ((a = 11) AND (abs(b) = 0))
+(2 rows)

 explain (costs off) select * from mc3p where a = 20 and abs(b) = 10 and c = 100;
-                         QUERY PLAN
-------------------------------------------------------------
- Append
-   ->  Seq Scan on mc3p6
-         Filter: ((a = 20) AND (c = 100) AND (abs(b) = 10))
-(3 rows)
+                      QUERY PLAN
+------------------------------------------------------
+ Seq Scan on mc3p6
+   Filter: ((a = 20) AND (c = 100) AND (abs(b) = 10))
+(2 rows)

 explain (costs off) select * from mc3p where a > 20;
            QUERY PLAN
@@ -962,12 +944,11 @@ explain (costs off) select * from mc2p where a < 2;
 (9 rows)

 explain (costs off) select * from mc2p where a = 2 and b < 1;
-              QUERY PLAN
----------------------------------------
- Append
-   ->  Seq Scan on mc2p3
-         Filter: ((b < 1) AND (a = 2))
-(3 rows)
+           QUERY PLAN
+---------------------------------
+ Seq Scan on mc2p3
+   Filter: ((b < 1) AND (a = 2))
+(2 rows)

 explain (costs off) select * from mc2p where a > 1;
            QUERY PLAN
@@ -986,53 +967,47 @@ explain (costs off) select * from mc2p where a > 1;
 (11 rows)

 explain (costs off) select * from mc2p where a = 1 and b > 1;
-              QUERY PLAN
----------------------------------------
- Append
-   ->  Seq Scan on mc2p2
-         Filter: ((b > 1) AND (a = 1))
-(3 rows)
+           QUERY PLAN
+---------------------------------
+ Seq Scan on mc2p2
+   Filter: ((b > 1) AND (a = 1))
+(2 rows)

 -- all partitions but the default one should be pruned
 explain (costs off) select * from mc2p where a = 1 and b is null;
-                QUERY PLAN
--------------------------------------------
- Append
-   ->  Seq Scan on mc2p_default
-         Filter: ((b IS NULL) AND (a = 1))
-(3 rows)
+             QUERY PLAN
+-------------------------------------
+ Seq Scan on mc2p_default
+   Filter: ((b IS NULL) AND (a = 1))
+(2 rows)

 explain (costs off) select * from mc2p where a is null and b is null;
-                  QUERY PLAN
------------------------------------------------
- Append
-   ->  Seq Scan on mc2p_default
-         Filter: ((a IS NULL) AND (b IS NULL))
-(3 rows)
+               QUERY PLAN
+-----------------------------------------
+ Seq Scan on mc2p_default
+   Filter: ((a IS NULL) AND (b IS NULL))
+(2 rows)

 explain (costs off) select * from mc2p where a is null and b = 1;
-                QUERY PLAN
--------------------------------------------
- Append
-   ->  Seq Scan on mc2p_default
-         Filter: ((a IS NULL) AND (b = 1))
-(3 rows)
+             QUERY PLAN
+-------------------------------------
+ Seq Scan on mc2p_default
+   Filter: ((a IS NULL) AND (b = 1))
+(2 rows)

 explain (costs off) select * from mc2p where a is null;
-           QUERY PLAN
---------------------------------
- Append
-   ->  Seq Scan on mc2p_default
-         Filter: (a IS NULL)
-(3 rows)
+        QUERY PLAN
+--------------------------
+ Seq Scan on mc2p_default
+   Filter: (a IS NULL)
+(2 rows)

 explain (costs off) select * from mc2p where b is null;
-           QUERY PLAN
---------------------------------
- Append
-   ->  Seq Scan on mc2p_default
-         Filter: (b IS NULL)
-(3 rows)
+        QUERY PLAN
+--------------------------
+ Seq Scan on mc2p_default
+   Filter: (b IS NULL)
+(2 rows)

 -- boolean partitioning
 create table boolpart (a bool) partition by list (a);
@@ -1050,20 +1025,18 @@ explain (costs off) select * from boolpart where a in (true, false);
 (5 rows)

 explain (costs off) select * from boolpart where a = false;
-          QUERY PLAN
-------------------------------
- Append
-   ->  Seq Scan on boolpart_f
-         Filter: (NOT a)
-(3 rows)
+       QUERY PLAN
+------------------------
+ Seq Scan on boolpart_f
+   Filter: (NOT a)
+(2 rows)

 explain (costs off) select * from boolpart where not a = false;
-          QUERY PLAN
-------------------------------
- Append
-   ->  Seq Scan on boolpart_t
-         Filter: a
-(3 rows)
+       QUERY PLAN
+------------------------
+ Seq Scan on boolpart_t
+   Filter: a
+(2 rows)

 explain (costs off) select * from boolpart where a is true or a is not true;
                     QUERY PLAN
@@ -1076,12 +1049,11 @@ explain (costs off) select * from boolpart where a is true or a is not true;
 (5 rows)

 explain (costs off) select * from boolpart where a is not true;
-           QUERY PLAN
----------------------------------
- Append
-   ->  Seq Scan on boolpart_f
-         Filter: (a IS NOT TRUE)
-(3 rows)
+        QUERY PLAN
+---------------------------
+ Seq Scan on boolpart_f
+   Filter: (a IS NOT TRUE)
+(2 rows)

 explain (costs off) select * from boolpart where a is not true and a is not false;
         QUERY PLAN
@@ -1190,10 +1162,9 @@ EXPLAIN (COSTS OFF) SELECT tableoid::regclass as part, a, b FROM part WHERE a IS
 ---------------------------------------------------------------------------
  Sort
    Sort Key: ((part_p2_p1.tableoid)::regclass), part_p2_p1.a, part_p2_p1.b
-   ->  Append
-         ->  Seq Scan on part_p2_p1
-               Filter: (a IS NULL)
-(5 rows)
+   ->  Seq Scan on part_p2_p1
+         Filter: (a IS NULL)
+(4 rows)

 --
 -- some more cases
@@ -1260,13 +1231,12 @@ explain (costs off) select * from mc2p t1, lateral (select count(*) from mc3p t2

 -- also here, because values for all keys are provided
 explain (costs off) select * from mc2p t1, lateral (select count(*) from mc3p t2 where t2.a = 1 and abs(t2.b) = 1 and t2.c = 1) s where t1.a = 1;
-                             QUERY PLAN
---------------------------------------------------------------------
+                          QUERY PLAN
+--------------------------------------------------------------
  Nested Loop
    ->  Aggregate
-         ->  Append
-               ->  Seq Scan on mc3p1 t2
-                     Filter: ((a = 1) AND (c = 1) AND (abs(b) = 1))
+         ->  Seq Scan on mc3p1 t2
+               Filter: ((a = 1) AND (c = 1) AND (abs(b) = 1))
    ->  Append
          ->  Seq Scan on mc2p1 t1
                Filter: (a = 1)
@@ -1274,7 +1244,7 @@ explain (costs off) select * from mc2p t1, lateral (select count(*) from mc3p t2
                Filter: (a = 1)
          ->  Seq Scan on mc2p_default t1_2
                Filter: (a = 1)
-(12 rows)
+(11 rows)

 --
 -- pruning with clauses containing <> operator
@@ -1395,12 +1365,11 @@ explain (costs off) select * from coll_pruning_multi where substr(a, 1) = 'a' co

 -- pruning, with values provided for both keys
 explain (costs off) select * from coll_pruning_multi where substr(a, 1) = 'e' collate "C" and substr(a, 1) = 'a' collate "POSIX";
-                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------
- Append
-   ->  Seq Scan on coll_pruning_multi2
-         Filter: ((substr(a, 1) = 'e'::text COLLATE "C") AND (substr(a, 1) = 'a'::text COLLATE "POSIX"))
-(3 rows)
+                                            QUERY PLAN
+---------------------------------------------------------------------------------------------------
+ Seq Scan on coll_pruning_multi2
+   Filter: ((substr(a, 1) = 'e'::text COLLATE "C") AND (substr(a, 1) = 'a'::text COLLATE "POSIX"))
+(2 rows)

 --
 -- LIKE operators don't prune
@@ -1445,12 +1414,11 @@ explain (costs off) select * from rparted_by_int2 where a > 100000000000000;
 create table rparted_by_int2_maxvalue partition of rparted_by_int2 for values from (16384) to (maxvalue);
 -- all partitions but rparted_by_int2_maxvalue pruned
 explain (costs off) select * from rparted_by_int2 where a > 100000000000000;
-                   QUERY PLAN
--------------------------------------------------
- Append
-   ->  Seq Scan on rparted_by_int2_maxvalue
-         Filter: (a > '100000000000000'::bigint)
-(3 rows)
+                QUERY PLAN
+-------------------------------------------
+ Seq Scan on rparted_by_int2_maxvalue
+   Filter: (a > '100000000000000'::bigint)
+(2 rows)

 drop table lp, coll_pruning, rlp, mc3p, mc2p, boolpart, rp, coll_pruning_multi, like_op_noprune, lparted_by_int2,
rparted_by_int2;
 --
@@ -1584,52 +1552,46 @@ explain (costs off) select * from hp where a <> 1 and b <> 'xxx';
 -- pruning should work if either a value or a IS NULL clause is provided for
 -- each of the keys
 explain (costs off) select * from hp where a is null and b is null;
-                  QUERY PLAN
------------------------------------------------
- Append
-   ->  Seq Scan on hp0
-         Filter: ((a IS NULL) AND (b IS NULL))
-(3 rows)
+               QUERY PLAN
+-----------------------------------------
+ Seq Scan on hp0
+   Filter: ((a IS NULL) AND (b IS NULL))
+(2 rows)

 explain (costs off) select * from hp where a = 1 and b is null;
-                QUERY PLAN
--------------------------------------------
- Append
-   ->  Seq Scan on hp1
-         Filter: ((b IS NULL) AND (a = 1))
-(3 rows)
+             QUERY PLAN
+-------------------------------------
+ Seq Scan on hp1
+   Filter: ((b IS NULL) AND (a = 1))
+(2 rows)

 explain (costs off) select * from hp where a = 1 and b = 'xxx';
-                   QUERY PLAN
--------------------------------------------------
- Append
-   ->  Seq Scan on hp0
-         Filter: ((a = 1) AND (b = 'xxx'::text))
-(3 rows)
+                QUERY PLAN
+-------------------------------------------
+ Seq Scan on hp0
+   Filter: ((a = 1) AND (b = 'xxx'::text))
+(2 rows)

 explain (costs off) select * from hp where a is null and b = 'xxx';
-                     QUERY PLAN
------------------------------------------------------
- Append
-   ->  Seq Scan on hp2
-         Filter: ((a IS NULL) AND (b = 'xxx'::text))
-(3 rows)
+                  QUERY PLAN
+-----------------------------------------------
+ Seq Scan on hp2
+   Filter: ((a IS NULL) AND (b = 'xxx'::text))
+(2 rows)

 explain (costs off) select * from hp where a = 2 and b = 'xxx';
-                   QUERY PLAN
--------------------------------------------------
- Append
-   ->  Seq Scan on hp3
-         Filter: ((a = 2) AND (b = 'xxx'::text))
-(3 rows)
+                QUERY PLAN
+-------------------------------------------
+ Seq Scan on hp3
+   Filter: ((a = 2) AND (b = 'xxx'::text))
+(2 rows)

 explain (costs off) select * from hp where a = 1 and b = 'abcde';
-                    QUERY PLAN
----------------------------------------------------
- Append
-   ->  Seq Scan on hp2
-         Filter: ((a = 1) AND (b = 'abcde'::text))
-(3 rows)
+                 QUERY PLAN
+---------------------------------------------
+ Seq Scan on hp2
+   Filter: ((a = 1) AND (b = 'abcde'::text))
+(2 rows)

 explain (costs off) select * from hp where (a = 1 and b = 'abcde') or (a = 2 and b = 'xxx') or (a is null and b is
null);
                                                        QUERY PLAN
  
@@ -2878,12 +2840,11 @@ execute part_abc_q1 (1, 2, 3);

 -- Single partition should be scanned.
 explain (analyze, costs off, summary off, timing off) execute part_abc_q1 (1, 2, 3);
-                      QUERY PLAN
--------------------------------------------------------
- Append (actual rows=0 loops=1)
-   ->  Seq Scan on part_abc_p1 (actual rows=0 loops=1)
-         Filter: ((a = $1) AND (b = $2) AND (c = $3))
-(3 rows)
+                   QUERY PLAN
+-------------------------------------------------
+ Seq Scan on part_abc_p1 (actual rows=0 loops=1)
+   Filter: ((a = $1) AND (b = $2) AND (c = $3))
+(2 rows)

 deallocate part_abc_q1;
 drop table part_abc;
@@ -3205,12 +3166,11 @@ create table pp_arrpart (a int[]) partition by list (a);
 create table pp_arrpart1 partition of pp_arrpart for values in ('{1}');
 create table pp_arrpart2 partition of pp_arrpart for values in ('{2, 3}', '{4, 5}');
 explain (costs off) select * from pp_arrpart where a = '{1}';
-               QUERY PLAN
-----------------------------------------
- Append
-   ->  Seq Scan on pp_arrpart1
-         Filter: (a = '{1}'::integer[])
-(3 rows)
+            QUERY PLAN
+----------------------------------
+ Seq Scan on pp_arrpart1
+   Filter: (a = '{1}'::integer[])
+(2 rows)

 explain (costs off) select * from pp_arrpart where a = '{1, 2}';
         QUERY PLAN
@@ -3262,20 +3222,18 @@ select tableoid::regclass, * from pph_arrpart order by 1;
 (3 rows)

 explain (costs off) select * from pph_arrpart where a = '{1}';
-               QUERY PLAN
-----------------------------------------
- Append
-   ->  Seq Scan on pph_arrpart2
-         Filter: (a = '{1}'::integer[])
-(3 rows)
+            QUERY PLAN
+----------------------------------
+ Seq Scan on pph_arrpart2
+   Filter: (a = '{1}'::integer[])
+(2 rows)

 explain (costs off) select * from pph_arrpart where a = '{1, 2}';
-                QUERY PLAN
-------------------------------------------
- Append
-   ->  Seq Scan on pph_arrpart1
-         Filter: (a = '{1,2}'::integer[])
-(3 rows)
+             QUERY PLAN
+------------------------------------
+ Seq Scan on pph_arrpart1
+   Filter: (a = '{1,2}'::integer[])
+(2 rows)

 explain (costs off) select * from pph_arrpart where a in ('{4, 5}', '{1}');
                               QUERY PLAN
@@ -3294,12 +3252,11 @@ create table pp_enumpart (a pp_colors) partition by list (a);
 create table pp_enumpart_green partition of pp_enumpart for values in ('green');
 create table pp_enumpart_blue partition of pp_enumpart for values in ('blue');
 explain (costs off) select * from pp_enumpart where a = 'blue';
-               QUERY PLAN
------------------------------------------
- Append
-   ->  Seq Scan on pp_enumpart_blue
-         Filter: (a = 'blue'::pp_colors)
-(3 rows)
+            QUERY PLAN
+-----------------------------------
+ Seq Scan on pp_enumpart_blue
+   Filter: (a = 'blue'::pp_colors)
+(2 rows)

 explain (costs off) select * from pp_enumpart where a = 'black';
         QUERY PLAN
@@ -3316,12 +3273,11 @@ create table pp_recpart (a pp_rectype) partition by list (a);
 create table pp_recpart_11 partition of pp_recpart for values in ('(1,1)');
 create table pp_recpart_23 partition of pp_recpart for values in ('(2,3)');
 explain (costs off) select * from pp_recpart where a = '(1,1)'::pp_rectype;
-                QUERY PLAN
--------------------------------------------
- Append
-   ->  Seq Scan on pp_recpart_11
-         Filter: (a = '(1,1)'::pp_rectype)
-(3 rows)
+             QUERY PLAN
+-------------------------------------
+ Seq Scan on pp_recpart_11
+   Filter: (a = '(1,1)'::pp_rectype)
+(2 rows)

 explain (costs off) select * from pp_recpart where a = '(1,2)'::pp_rectype;
         QUERY PLAN
@@ -3337,12 +3293,11 @@ create table pp_intrangepart (a int4range) partition by list (a);
 create table pp_intrangepart12 partition of pp_intrangepart for values in ('[1,2]');
 create table pp_intrangepart2inf partition of pp_intrangepart for values in ('[2,)');
 explain (costs off) select * from pp_intrangepart where a = '[1,2]'::int4range;
-                QUERY PLAN
-------------------------------------------
- Append
-   ->  Seq Scan on pp_intrangepart12
-         Filter: (a = '[1,3)'::int4range)
-(3 rows)
+             QUERY PLAN
+------------------------------------
+ Seq Scan on pp_intrangepart12
+   Filter: (a = '[1,3)'::int4range)
+(2 rows)

 explain (costs off) select * from pp_intrangepart where a = '(1,2)'::int4range;
         QUERY PLAN
@@ -3359,12 +3314,11 @@ create table pp_lp (a int, value int) partition by list (a);
 create table pp_lp1 partition of pp_lp for values in(1);
 create table pp_lp2 partition of pp_lp for values in(2);
 explain (costs off) select * from pp_lp where a = 1;
-        QUERY PLAN
---------------------------
- Append
-   ->  Seq Scan on pp_lp1
-         Filter: (a = 1)
-(3 rows)
+     QUERY PLAN
+--------------------
+ Seq Scan on pp_lp1
+   Filter: (a = 1)
+(2 rows)

 explain (costs off) update pp_lp set value = 10 where a = 1;
         QUERY PLAN
@@ -3529,12 +3483,11 @@ explain (costs off) select * from pp_temp_parent where true;
 (3 rows)

 explain (costs off) select * from pp_temp_parent where a = 2;
-             QUERY PLAN
-------------------------------------
- Append
-   ->  Seq Scan on pp_temp_part_def
-         Filter: (a = 2)
-(3 rows)
+          QUERY PLAN
+------------------------------
+ Seq Scan on pp_temp_part_def
+   Filter: (a = 2)
+(2 rows)

 drop table pp_temp_parent;
 -- Stress run-time partition pruning a bit more, per bug reports
@@ -3628,13 +3581,12 @@ create table listp2 partition of listp for values in(2) partition by list(b);
 create table listp2_10 partition of listp2 for values in (10);
 explain (analyze, costs off, summary off, timing off)
 select * from listp where a = (select 2) and b <> 10;
-                QUERY PLAN
--------------------------------------------
- Append (actual rows=0 loops=1)
+                 QUERY PLAN
+--------------------------------------------
+ Seq Scan on listp1 (actual rows=0 loops=1)
+   Filter: ((b <> 10) AND (a = $0))
    InitPlan 1 (returns $0)
-     ->  Result (actual rows=1 loops=1)
-   ->  Seq Scan on listp1 (never executed)
-         Filter: ((b <> 10) AND (a = $0))
-(5 rows)
+     ->  Result (never executed)
+(4 rows)

 drop table listp;
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 2e17049..0cc5851 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -1057,15 +1057,14 @@ NOTICE:  f_leak => awesome science fiction
 (4 rows)

 EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle);
-                             QUERY PLAN
---------------------------------------------------------------------
- Append
+                          QUERY PLAN
+--------------------------------------------------------------
+ Seq Scan on part_document_fiction
+   Filter: ((cid < 55) AND (dlevel <= $0) AND f_leak(dtitle))
    InitPlan 1 (returns $0)
      ->  Index Scan using uaccount_pkey on uaccount
            Index Cond: (pguser = CURRENT_USER)
-   ->  Seq Scan on part_document_fiction
-         Filter: ((cid < 55) AND (dlevel <= $0) AND f_leak(dtitle))
-(6 rows)
+(5 rows)

 -- pp1 ERROR
 INSERT INTO part_document VALUES (100, 11, 5, 'regress_rls_dave', 'testing pp1'); -- fail
@@ -1136,15 +1135,14 @@ NOTICE:  f_leak => awesome science fiction
 (4 rows)

 EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle);
-                             QUERY PLAN
---------------------------------------------------------------------
- Append
+                          QUERY PLAN
+--------------------------------------------------------------
+ Seq Scan on part_document_fiction
+   Filter: ((cid < 55) AND (dlevel <= $0) AND f_leak(dtitle))
    InitPlan 1 (returns $0)
      ->  Index Scan using uaccount_pkey on uaccount
            Index Cond: (pguser = CURRENT_USER)
-   ->  Seq Scan on part_document_fiction
-         Filter: ((cid < 55) AND (dlevel <= $0) AND f_leak(dtitle))
-(6 rows)
+(5 rows)

 -- viewpoint from regress_rls_carol
 SET SESSION AUTHORIZATION regress_rls_carol;
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 92d427a..da70438 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -812,11 +812,10 @@ explain (costs off)
    UNION ALL
    SELECT 2 AS t, * FROM tenk1 b) c
  WHERE t = 2;
-        QUERY PLAN
----------------------------
- Append
-   ->  Seq Scan on tenk1 b
-(2 rows)
+     QUERY PLAN
+---------------------
+ Seq Scan on tenk1 b
+(1 row)

 -- Test that we push quals into UNION sub-selects only when it's safe
 explain (costs off)

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Mon, 4 Mar 2019 at 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > [ v13-0001-Forgo-generating-single-subpath-Append-and-Merge.patch ]
>
> I continue to think that this is the wrong way to go about it,
> and as proof of concept present the attached, which reproduces
> all of the regression-test plan changes appearing in v13 ---
> with a whole lot less mechanism and next to no added planning
> cycles (which certainly cannot be said of v13).

Nice.  I see you solved the problem I had been complaining about by
pulling the surplus Append out of the final plan after setrefs, at
which point the varnos have all been set to the special varnos, so
nodes higher in the plan no longer end up with the parent's varnos
instead of the append child's.  This method is certainly much better
than what I had.  Thanks for taking the time to show me how this
should be done.

> Also, I wonder why you didn't teach ExecSupportsMarkRestore that a
> single-child MergeAppend can support mark/restore.  I didn't add such
> code here either, but I suspect that's an oversight.

The reason for that was that I never created any single-subpath
MergeAppends; instead I created Append paths with isproxy set to true.
This is slightly confusing, as I was just borrowing Append to act as
the proxy path, and it was only these proxy Appends that I was dealing
with in ExecSupportsMarkRestore.

I'll need to modify your version of the patch, since it keeps
single-subpath MergeAppend paths, so ExecSupportsMarkRestore needs to
know about those.
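
Something along these lines in ExecSupportsMarkRestore()'s switch
ought to be enough (a rough sketch only, meant to slot into the
existing switch on pathnode->pathtype; not the exact code):

	case T_Append:
		{
			AppendPath *apath = castNode(AppendPath, pathnode);

			/*
			 * A single-subpath Append will be removed in setrefs.c, so
			 * mark/restore support depends only on the lone child.
			 */
			if (list_length(apath->subpaths) == 1)
				return ExecSupportsMarkRestore((Path *) linitial(apath->subpaths));
			return false;
		}
	case T_MergeAppend:
		{
			MergeAppendPath *mpath = castNode(MergeAppendPath, pathnode);

			/* likewise for a single-subpath MergeAppend */
			if (list_length(mpath->subpaths) == 1)
				return ExecSupportsMarkRestore((Path *) linitial(mpath->subpaths));
			return false;
		}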

> One other remark is that the division of labor between
> create_[merge]append_path and their costsize.c subroutines
> seems pretty unprincipled.  I'd be inclined to push all the
> relevant logic into costsize.c, but have not done so here.
> Moving the existing cost calculations in create_mergeappend_path
> into costsize.c would better be done as a separate refactoring patch,
> perhaps.

The part I don't like about that is that we don't generate Sort paths
for the MergeAppend subpaths; instead, we rely on checking
pathkeys_contained_in() again in create_merge_append_plan() and just
creating a Sort node there if required.  There is some repeated
pathkeys checking, but I wonder if we'd see much of a performance
regression if we went to the trouble of making Sort paths.  Is this
what you meant?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


David Rowley <david.rowley@2ndquadrant.com> writes:
> On Mon, 4 Mar 2019 at 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One other remark is that the division of labor between
>> create_[merge]append_path and their costsize.c subroutines
>> seems pretty unprincipled.  I'd be inclined to push all the
>> relevant logic into costsize.c, but have not done so here.
>> Moving the existing cost calculations in create_mergeappend_path
>> into costsize.c would better be done as a separate refactoring patch,
>> perhaps.

> The part I don't like about that is that we don't generate Sort paths
> for the MergeAppend subpaths; instead, we rely on checking
> pathkeys_contained_in() again in create_merge_append_plan() and just
> creating a Sort node there if required.  There is some repeated
> pathkeys checking, but I wonder if we'd see much of a performance
> regression if we went to the trouble of making Sort paths.  Is this
> what you meant?

No, I'm just suggesting taking the stanza "Add up the sizes and costs of
the input paths" out of create_merge_append_path and putting it into
cost_merge_append.  It seems weird to me that in the plain Append case,
cost_append does the adding-up of the child costs, but in the
MergeAppend case it's not done similarly.
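
For reference, that stanza does roughly the following today (a
paraphrased sketch of create_merge_append_path(), not the source
verbatim); cost_merge_append() then only adds the heap-maintenance
overhead on top of these input totals:

    ListCell   *l;
    Cost        input_startup_cost = 0;
    Cost        input_total_cost = 0;

    pathnode->path.rows = 0;
    foreach(l, subpaths)
    {
        Path       *subpath = (Path *) lfirst(l);

        pathnode->path.rows += subpath->rows;

        if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
        {
            /* subpath is adequately ordered; no explicit sort needed */
            input_startup_cost += subpath->startup_cost;
            input_total_cost += subpath->total_cost;
        }
        else
        {
            /* a Sort node will be needed, so charge for it here */
            Path        sort_path;  /* dummy for result of cost_sort */

            cost_sort(&sort_path, root, pathkeys,
                      subpath->total_cost,
                      subpath->parent->tuples,
                      subpath->pathtarget->width,
                      0.0, work_mem, pathnode->limit_tuples);
            input_startup_cost += sort_path.startup_cost;
            input_total_cost += sort_path.total_cost;
        }
    }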

            regards, tom lane


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Mon, 4 Mar 2019 at 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I was a bit surprised to find that I didn't need to fool around
> with lying about whether [Merge]Append can project.  I've not dug
> into the exact reason why, but I suspect it's that previous changes
> made in support of parallelization have resulted in ensuring that
> we push the upper tlist down to the children anyway, at some earlier
> stage.

I can get a plan that does end up with Result nodes above a Scan node.

create table rangep (a int, b int) partition by range (a);
create table rangep1 partition of rangep for values from(0) to (1000000);
explain select r1::text from rangep r1 inner join rangep r2 on
r1::text = r2::Text order by r1::text;

However, if we modify is_projection_capable_plan() similarly to how
ExecSupportsMarkRestore() has been modified, then we'd need to ensure
that any changes made to the Append/MergeAppend's tlist are also made
to the underlying node's tlist.  I'm unsure if that's worth the
complexity.  What do you think?
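
For what it's worth, the change I have in mind would be to teach the
existing Append/MergeAppend cases in is_projection_capable_plan()
about the single-child special case, something like this (purely a
hypothetical sketch, not code from the patch):

	case T_Append:
		{
			Append	   *aplan = (Append *) plan;

			/*
			 * A single-child Append disappears in setrefs.c, so projection
			 * capability is really that of the child, but only if we also
			 * keep the child's tlist in sync with the Append's.
			 */
			if (list_length(aplan->appendplans) == 1)
				return is_projection_capable_plan((Plan *) linitial(aplan->appendplans));
			return false;
		}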

> I haven't looked into whether this does the right things for parallel
> planning --- possibly create_[merge]append_path need to propagate up
> parallel-related path fields from the single child?

I looked at this.  For Append, the current code in
add_paths_to_append_rel() won't raise parallel_workers above what the
single child rel already has.  The current code does:

parallel_workers = Max(parallel_workers, fls(list_length(live_childrels)));

and since fls(1) == 1, that Max() can't push the value any higher than
the child's own setting.  However, I wonder if it's better to set it
explicitly in create_append_path() for the
list_length(pathnode->subpaths) == 1 case (sketched below).

The other issue I found when looking into the parallel code was that
partial paths with pathkeys will get made but never used.  The reason
is that Append normally can't do anything with these, since it can't
maintain their sort order, and MergeAppend is not a parallel-aware
node.  I ended up adding some new code at the bottom of
generate_append_paths() to add these paths as single subpaths to
Append nodes.  This also required changing create_append_path() so
that it records the child's pathkeys when dealing with a
single-subpath Append.
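
Roughly, both create_append_path() tweaks might look something like
this (a hypothetical sketch only, not the code in the attached patch):

    /*
     * Special-case a single-subpath Append: since the node will be
     * discarded in setrefs.c, just inherit the lone child's properties.
     */
    if (list_length(pathnode->subpaths) == 1)
    {
        Path       *child = (Path *) linitial(pathnode->subpaths);

        /* no point in claiming more workers than the child can use */
        pathnode->path.parallel_workers = child->parallel_workers;

        /* the output order is simply whatever the child produces */
        pathnode->path.pathkeys = child->pathkeys;
    }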

> Also, I wonder why you didn't teach ExecSupportsMarkRestore that a
> single-child MergeAppend can support mark/restore.  I didn't add such
> code here either, but I suspect that's an oversight.

Added code for that.

> One other remark is that the division of labor between
> create_[merge]append_path and their costsize.c subroutines
> seems pretty unprincipled.  I'd be inclined to push all the
> relevant logic into costsize.c, but have not done so here.
> Moving the existing cost calculations in create_mergeappend_path
> into costsize.c would better be done as a separate refactoring patch,
> perhaps.

I've not touched that, but happy to do it as a separate patch.

I've attached my changes to your v14 patch.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

David Rowley <david.rowley@2ndquadrant.com> writes:
> [ drop-useless-merge-appends-15.patch ]

Pushed with some minor adjustments.

> I can get a plan that does end up with Result nodes above a Scan node.

> create table rangep (a int, b int) partition by range (a);
> create table rangep1 partition of rangep for values from(0) to (1000000);
> explain select r1::text from rangep r1 inner join rangep r2 on
> r1::text = r2::Text order by r1::text;

> However, if we modify is_projection_capable_plan() similarly to how
> ExecSupportsMarkRestore() has been modified, then we'd need to ensure
> that any changes made to the Append/MergeAppend's tlist are also made
> to the underlying node's tlist.  I'm unsure if that's worth the
> complexity.  What do you think?

I'm inclined to think not, at least not without more compelling
examples than that one ;-).  Note that we had Result-above-the-Append
before, so we're not regressing for such cases, we're just leaving
something on the table.

>> One other remark is that the division of labor between
>> create_[merge]append_path and their costsize.c subroutines
>> seems pretty unprincipled.  I'd be inclined to push all the
>> relevant logic into costsize.c, but have not done so here.
>> Moving the existing cost calculations in create_mergeappend_path
>> into costsize.c would better be done as a separate refactoring patch,
>> perhaps.

> I've not touched that, but happy to do it as a separate patch.

After looking at this, I'm less excited than I was before.
It seems it'd be only marginally less crufty than the way it is now.
In particular, costsize.c would have to be aware of the rule about
single-child MergeAppend being removable, which seems a little outside
its purview: typically we only ask costsize.c to estimate the cost
of a plan node as-presented, not to make decisions about what the
shape of the plan tree will be.  So I left it alone.

            regards, tom lane


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From
David Rowley
Date:
On Tue, 26 Mar 2019 at 09:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > [ drop-useless-merge-appends-15.patch ]
>
> Pushed with some minor adjustments.

Thanks for all your work on this, and thanks for the final push.

FWIW, you should probably have credited yourself as the main author
since I don't think there was much of my patch left.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services