Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CA+HiwqEYxMX-KZ8RcXio0=at0FgtegJSffr5h7JDjciduEygAA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)
List pgsql-hackers
On Fri, Feb 5, 2021 at 4:53 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> On Fri, Feb 5, 2021 at 5:21 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > BTW, the original query's cteList is copied into sub_action query but
> > not into rule_action for reasons I haven't looked very closely into,
> > even though we'd like to ultimately set the latter's hasModifyingCTE
> > to reflect the original query's, right?  So we should do the following
> > at some point before returning:
> >
> > if (sub_action->hasModifyingCTE)
> >     rule_action->hasModifyingCTE = true;
>
> Actually, rule_action will usually point to sub_action (in which case,
> no need to copy to rule_action), except if the rule action is an
> INSERT...SELECT, which seems to be handled by some "kludge" according
> to the following comment (and KLUDGE ALERT comment in the function
> that is called):
>
>     /*
>      * Adjust rule action and qual to offset its varnos, so that we can merge
>      * its rtable with the main parsetree's rtable.
>      *
>      * If the rule action is an INSERT...SELECT, the OLD/NEW rtable entries
>      * will be in the SELECT part, and we have to modify that rather than the
>      * top-level INSERT (kluge!).
>      */
>     sub_action = getInsertSelectQuery(rule_action, &sub_action_ptr);
>
> So in that case (sub_action_ptr != NULL), within rule_action there is
> a pointer to sub_action (RTE for the subquery), so whenever sub_action
> is re-created, this pointer needs to be fixed-up.
> It looks like I might need to copy hasModifyingCTE back to rule_action
> in this case - but not 100% sure on it yet - still checking that. All
> tests run so far pass without doing that though.

I guess we just don't have a test case where the rule_action query is
actually parallelized, like one that houzj shared a few emails ago.

> This is one reason for my original approach (though I admit, it was
> not optimal) because at least it was reliable and detected the
> modifyingCTE after all the rewriting and kludgy code had finished.

Yeah it's hard to go through all of this highly recursive legacy code
to be sure that hasModifyingCTE is consistent with reality in *all*
cases, but let's try to do it.  No other has* flags are set
after-the-fact, so I wouldn't bet on a committer letting this one
through.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Single transaction in the tablesync worker?
Next
From: Amul Sul
Date:
Subject: Re: Correct comment in StartupXLOG().