Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

From Amit Langote
Subject Re: ModifyTable overheads in generic plans
Date
Msg-id CA+HiwqFKAmzK6JykmX8kROzHVWF6L7nT9yCF_p2nxd2M6_QpAA@mail.gmail.com
Whole thread Raw
In response to Re: ModifyTable overheads in generic plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ModifyTable overheads in generic plans
List pgsql-hackers
On Mon, Apr 5, 2021 at 1:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Sun, Apr 4, 2021 at 10:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> In some desultory performance testing here, it seemed like a
> >> significant part of the cost is ExecOpenIndices, and I don't see
> >> a reason offhand why we could not delay/skip that.  I also concur
> >> with delaying construction of ri_ChildToRootMap and the
> >> partition_tuple_routing data structures, since many queries will
> >> never need those at all.
>
> > As I mentioned in [1], creating ri_projectNew can be expensive too,
> > especially as column count (and partition count for the generic plan
> > case) grows.  I think we should have an static inline
> > initialize-on-first-access accessor function for that field too.
>
> > Actually, I remember considering having such accessor functions (all
> > static inline) for ri_WithCheckOptionExprs, ri_projectReturning,
> > ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT
> > UPDATE) as well, prompted by Heikki's comments earlier in the
> > discussion.  I also remember, before even writing this patch, not
> > liking that WCO and RETURNING expressions are initialized in their own
> > separate loops, rather than being part of the earlier loop that says:
>
> Sure, we might as well try to improve the cosmetics here.
>
> >> Anyway, I think the way to proceed for now is to grab the low-hanging
> >> fruit of things that clearly won't change any semantics.  But tail end
> >> of the dev cycle is no time to be making really fundamental changes
> >> in how FDW direct modify works.
>
> > I agree.
>
> OK.  Do you want to pull out the bits of the patch that we can still
> do without postponing BeginDirectModify?

I ended up with the attached, whereby ExecInitResultRelation() is now
performed for all relations before calling ExecInitNode() on the
subplan.  As mentioned, I moved other per-result-rel initializations
into the same loop that does ExecInitResultRelation(), while moving
code related to some initializations into initialize-on-first-access
accessor functions for the concerned fields.  I chose to do that for
ri_WIthCheckOptionExprs, ri_projectReturning, and ri_projectNew.

ExecInitNode() is called on the subplan (to set
outerPlanState(mtstate) that is) after all of the per-result-rel
initializations are done.  One of the initializations is calling
BeginForeignModify() for non-direct modifications, an API to which we
currently pass mtstate.  Moving that to before setting
outerPlanState(mtstate) so as to be in the same loop as other
initializations had me worried just a little bit given a modification
I had to perform in postgresBeginForeignModify():

@@ -1879,7 +1879,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                    rte,
                                    resultRelInfo,
                                    mtstate->operation,
-                                   outerPlanState(mtstate)->plan,
+                                   outerPlan(mtstate->ps.plan),
                                    query,
                                    target_attrs,
                                    values_end_len,

Though I think that this is harmless, because I'd think that the
implementers of this API shouldn't really rely too strongly on
assuming that outerPlanState(mtstate) is valid when it is called, if
postgres_fdw's implementation is any indication.

Another slightly ugly bit is the dependence of direct modify API on
ri_projectReturning being set even if it doesn't care for anything
else in the ResultRelInfo.  So in ExecInitModifyTable()
ri_projectReturning initialization is not skipped for
directly-modified foreign result relations.

Notes on regression test changes:

* Initializing WCO quals during execution instead of during
ExecInitNode() of ModifyTable() causes a couple of regression test
changes in updatable_view.out that were a bit unexpected for me --
Subplans that are referenced in WCO quals are no longer shown in the
plain EXPLAIN output.  Even though that's a user-visible change, maybe
we can live with that?

* ri_RootResultRelInfo in *all* child relations instead of just in
tuple-routing result relations has caused changes to inherit.out and
privileges.out.  I think that's basically down to ExecConstraints() et
al doing one thing for child relations in which ri_RootResultRelInfo
is set and another for those in which it is not.  Now it's set in
*all* child relations, so it always does the former thing.  I remember
having checked that those changes are only cosmetic when I first
encountered them.

* Moving PartitionTupleRouting initialization to be done lazily for
cross-partition update cases causes changes to update.out.  They have
to do with the fact that the violations of the actual target table's
partition constraint are now shown as such, instead of reporting them
as occurring on one of the leaf partitions.  Again, only cosmetic.

> Another thing we could consider, perhaps, is keeping the behavior
> the same for foreign tables but postponing init of local ones.
> To avoid opening the relations to figure out which kind they are,
> we'd have to rely on the RTE copies of relkind, which is a bit
> worrisome --- I'm not certain that those are guaranteed to be
> up-to-date --- but it's probably okay since there is no way to
> convert a regular table to foreign or vice versa.  Anyway, that
> idea seems fairly messy so I'm inclined to just pursue the
> lower-hanging fruit for now.

It would be nice to try that idea out, but I tend to agree with the last part.

Also, I'm fairly happy with the kind of performance improvement I see
even with the lower-hanging fruit patch for my earlier earlier shared
benchmark that tests the performance of generic plan execution:

HEAD (especially with 86dc90056 now in):

nparts  10cols      20cols      40cols

64      6926        6394        6253
128     3758        3501        3482
256     1938        1822        1776
1024    406         371         406

Patched:

64      13147       12554       14787
128     7850        9788        9631
256     4472        5599        5638
1024    1218        1503        1309

I also tried with a version where the new tuple projections are built
in ExecInitModifyTable() as opposed to lazily:

64      10937       9969        8535
128     6586        5903        4887
256     3613        3118        2654
1024    884         749         652

This tells us that delaying initializing new tuple projection for
updates can have a sizable speedup and better scalability.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: "Euler Taveira"
Date:
Subject: Re: use AV worker items infrastructure for GIN pending list's cleanup
Next
From: Bharath Rupireddy
Date:
Subject: Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation