Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

From Amit Langote
Subject Re: ModifyTable overheads in generic plans
Date
Msg-id CA+HiwqEcawatEaUh1uTbZMEZTJeLzbroRTz9_X9Z5CFjTWJkhw@mail.gmail.com
In response to Re: ModifyTable overheads in generic plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ModifyTable overheads in generic plans
Re: ModifyTable overheads in generic plans
List pgsql-hackers
On Wed, Apr 7, 2021 at 8:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Mon, Apr 5, 2021 at 1:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> OK.  Do you want to pull out the bits of the patch that we can still
> >> do without postponing BeginDirectModify?
>
> > I ended up with the attached, whereby ExecInitResultRelation() is now
> > performed for all relations before calling ExecInitNode() on the
> > subplan.  As mentioned, I moved other per-result-rel initializations
> > into the same loop that does ExecInitResultRelation(), while moving
> > code related to some initializations into initialize-on-first-access
> > accessor functions for the concerned fields.  I chose to do that for
> > ri_WithCheckOptionExprs, ri_projectReturning, and ri_projectNew.
>
> I pushed the parts of this that I thought were safe and productive.

Thank you.

+/*
+ * ExecInitInsertProjection
+ *     Do one-time initialization of projection data for INSERT tuples.
+ *
+ * INSERT queries may need a projection to filter out junk attrs in the tlist.
+ *
+ * This is "one-time" for any given result rel, but we might touch
+ * more than one result rel in the course of a partitioned INSERT.

I don't think we need this last bit for INSERT, because the result
rels for leaf partitions will never have to go through
ExecInitInsertProjection().  Leaf partitions are never directly fed
tuples that ExecModifyTable() extracts out of the subplan, because
those tuples will have gone through the root target table's projection
before being passed to tuple routing.  So, if an INSERT ever needs a
projection, only the partitioned table being inserted into will need
to have one built for it.
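To illustrate the ordering described above, here is a toy model (invented
names, not executor code): a tuple pulled from the subplan is projected once
using the root target table's projection, and only then handed to tuple
routing, so a leaf partition never needs an INSERT projection of its own.

```c
#include <stddef.h>

/* Toy stand-in for a tuple; not a real executor structure. */
typedef struct ToyTuple
{
    int     natts;
    int     junk_removed;   /* has the root projection run on this tuple? */
} ToyTuple;

static ToyTuple
root_project(ToyTuple t)
{
    t.junk_removed = 1;     /* junk attrs filtered here, exactly once */
    return t;
}

static int
route_to_leaf(ToyTuple t)
{
    /* leaves only ever see tuples that already went through root_project() */
    return t.junk_removed;
}
```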

Also, I think we should update the commentary around ri_projectNew a
bit to make it clear that nowhere besides ExecGet{Insert|Update}Tuple
should be touching it and the associated slots.
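The initialize-on-first-access idea can be sketched in miniature like this
(a standalone toy with invented names; the real field and accessors live in
ResultRelInfo and nodeModifyTable.c):

```c
#include <stdlib.h>

/* Miniature stand-ins (invented names) for a lazily-built projection. */
typedef struct Projection
{
    int     ncols;              /* pretend this is expensive to build */
} Projection;

typedef struct ResultRel
{
    int         nattrs;
    Projection *projectNew;     /* NULL until first use, like ri_projectNew */
} ResultRel;

/*
 * The accessor is the only code that touches projectNew directly: it builds
 * the projection the first time it is asked for and returns the cached copy
 * afterwards, so result rels that never receive a tuple pay nothing.
 */
static Projection *
GetNewTupleProjection(ResultRel *rel)
{
    if (rel->projectNew == NULL)
    {
        rel->projectNew = malloc(sizeof(Projection));
        rel->projectNew->ncols = rel->nattrs;   /* one-time setup */
    }
    return rel->projectNew;
}
```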

+ * This is "one-time" for any given result rel, but we might touch more than
+ * one result rel in the course of a partitioned UPDATE, and each one needs
+ * its own projection due to possible column order variation.

Minor quibble, but should we write it as "...in the course of an
inherited UPDATE"?

Attached patch contains these changes.

> The business about moving the subplan tree initialization to after
> calling FDWs' BeginForeignModify functions seems to me to be a
> nonstarter.  Existing FDWs are going to expect their scan initializations
> to have been done first.  I'm surprised that postgres_fdw seemed to
> need only a one-line fix; it could have been far worse.  The amount of
> trouble that could cause is absolutely not worth it to remove one loop
> over the result relations.

Okay, that sounds fair.  After all, we write this about 'mtstate' in
the description of BeginForeignModify(), which I had failed to notice:

"mtstate is the overall state of the ModifyTable plan node being
executed; global data about the plan and execution state is available
via this structure."

> I also could not get excited about postponing initialization of RETURNING
> or WITH CHECK OPTIONS expressions.  I grant that that can be helpful
> when those features are used, but I doubt that RETURNING is used that
> heavily, and WITH CHECK OPTIONS is surely next door to nonexistent
> in performance-critical queries.  If the feature isn't used, the cost
> of the existing code is about zero.  So I couldn't see that it was worth
> the amount of code thrashing and risk of new bugs involved.

Okay.

> The bit you
> noted about EXPLAIN missing a subplan is pretty scary in this connection;
> I'm not at all sure that that's just cosmetic.

Yeah, this and...

> (Having said that, I'm wondering if there are bugs in these cases for
> cross-partition updates that target a previously-not-used partition.
> So we might have things to fix anyway.)

...this would need to be looked at a bit more closely, which I'll try
to do sometime later this week.

> Anyway, looking at the test case you posted at the very top of this
> thread, I was getting this with HEAD on Friday:
>
> nparts  TPS
> 0       12152
> 10      8672
> 100     2753
> 1000    314
>
> and after the two patches I just pushed, it looks like:
>
> 0       12105
> 10      9928
> 100     5433
> 1000    938
>
> So while there's certainly work left to do, that's not bad for
> some low-hanging-fruit grabbing.

Yes, certainly.

I reran my usual benchmark and got the following numbers, this time
comparing v13.2 against the latest HEAD:

nparts  10cols      20cols      40cols

v13.2

64      3231        2747        2217
128     1528        1269        1121
256     709         652         491
1024    96          78          67

v14dev HEAD

64      14835       14360       14563
128     9469        9601        9490
256     5523        5383        5268
1024    1482        1415        1366

Clearly, we've made some very good progress here.  Thanks.



--
Amit Langote
EDB: http://www.enterprisedb.com
