Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

From Amit Langote
Subject Re: ModifyTable overheads in generic plans
Date
Msg-id CA+HiwqH6NvE2yQQvTW5GaWkD5dkzdqdLOVBvYHP7yARntdMheQ@mail.gmail.com
Whole thread Raw
In response to Re: ModifyTable overheads in generic plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ModifyTable overheads in generic plans  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Apr 4, 2021 at 10:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Thu, Apr 1, 2021 at 3:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Amit Langote <amitlangote09@gmail.com> writes:
> > [ v14-0002-Initialize-result-relation-information-lazily.patch ]
> >> Needs YA rebase over 86dc90056.
>
> > Done.
>
> I spent some time looking this over.

Thanks.

> There are bits of it we can
> adopt without too much trouble, but I'm afraid that 0001 (delay
> FDW BeginDirectModify until the first actual update) is a nonstarter,
> which makes the main idea of delaying ExecInitResultRelation unworkable.
>
> My fear about 0001 is that it will destroy any hope of direct updates
> on different remote partitions executing with consistent semantics
> (i.e. compatible snapshots), because some row updates triggered by the
> local query may have already happened before a given partition gets to
> start its remote query.  Maybe we can work around that, but I do not
> want to commit a major restructuring that assumes we can dodge this
> problem when we don't yet even have a fix for cross-partition updates
> that does rely on the assumption of synchronous startup.

Hmm, okay, I can understand the concern.

> In some desultory performance testing here, it seemed like a
> significant part of the cost is ExecOpenIndices, and I don't see
> a reason offhand why we could not delay/skip that.  I also concur
> with delaying construction of ri_ChildToRootMap and the
> partition_tuple_routing data structures, since many queries will
> never need those at all.

As I mentioned in [1], creating ri_projectNew can be expensive too,
especially as column count (and partition count for the generic plan
case) grows.  I think we should have an static inline
initialize-on-first-access accessor function for that field too.

Actually, I remember considering having such accessor functions (all
static inline) for ri_WithCheckOptionExprs, ri_projectReturning,
ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT
UPDATE) as well, prompted by Heikki's comments earlier in the
discussion.  I also remember, before even writing this patch, not
liking that WCO and RETURNING expressions are initialized in their own
separate loops, rather than being part of the earlier loop that says:

    /*
     * Do additional per-result-relation initialization.
     */
    for (i = 0; i < nrels; i++)
    {

I guess ri_RowIdAttNo initialization can go into the same loop.

> > * PartitionTupleRouting.subplan_resultrel_htab is removed in favor
> > of using ModifyTableState.mt_resultOidHash to look up an UPDATE
> > result relation by OID.
>
> Hmm, that sounds promising too, though I didn't look at the details.
>
> Anyway, I think the way to proceed for now is to grab the low-hanging
> fruit of things that clearly won't change any semantics.  But tail end
> of the dev cycle is no time to be making really fundamental changes
> in how FDW direct modify works.

I agree.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA+HiwqHLUNhMxy46Mrb04VJpN=HUdm9bD7xdZ6f5h2o4imX79g@mail.gmail.com



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Next
From: Bharath Rupireddy
Date:
Subject: Re: TRUNCATE on foreign table