Re: partition routing layering in nodeModifyTable.c - Mailing list pgsql-hackers

From Amit Langote
Subject Re: partition routing layering in nodeModifyTable.c
Date
Msg-id CA+HiwqEgZhHWPFg-S5Kw8z0O-1X3dzfJomHjGL74X-Vw0wT2BQ@mail.gmail.com
Whole thread Raw
In response to Re: partition routing layering in nodeModifyTable.c  (Andres Freund <andres@anarazel.de>)
Responses Re: partition routing layering in nodeModifyTable.c  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Re: partition routing layering in nodeModifyTable.c  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Thu, Jul 18, 2019 at 2:53 PM Andres Freund <andres@anarazel.de> wrote:
> On 2019-07-18 14:24:29 +0900, Amit Langote wrote:
> > On Thu, Jul 18, 2019 at 10:09 AM Andres Freund <andres@anarazel.de> wrote:
> > > 1) How come partition routing is done outside of ExecInsert()?
> > >
> > >             case CMD_INSERT:
> > >                 /* Prepare for tuple routing if needed. */
> > >                 if (proute)
> > >                     slot = ExecPrepareTupleRouting(node, estate, proute,
> > >                                                    resultRelInfo, slot);
> > >                 slot = ExecInsert(node, slot, planSlot,
> > >                                   estate, node->canSetTag);
> > >                 /* Revert ExecPrepareTupleRouting's state change. */
> > >                 if (proute)
> > >                     estate->es_result_relation_info = resultRelInfo;
> > >                 break;
> > >
> > > That already seems like a layering violation,
> >
> > The decision to move partition routing out of ExecInsert() came about
> > when we encountered a bug [1] whereby ExecInsert() would fail to reset
> > estate->es_result_relation_info back to the root table if it had to
> > take an abnormal path out (early return), of which there are quite a
> > few instances.  The first solution I came up with was to add a goto
> > label  for the code to reset estate->es_result_relation_info and jump
> > to it from the various places that do an early return, which was
> > complained about as reducing readability.  So, the solution we
> > eventually settled on in 6666ee49f was to perform ResultRelInfos
> > switching at a higher level.
>
> I think that was the wrong path, given that the code now lives in
> multiple places. Without even a comment explaining that if one has to be
> changed, the other has to be changed too.
>
>
> > > It seems to me that if we just moved the ExecPrepareTupleRouting() into
> > > ExecInsert(), we could remove the duplication.
> >
> > I agree that there's duplication here.  Given what I wrote above, I
> > can think of doing this: move all of ExecInsert()'s code into
> > ExecInsertInternal() and make the former instead look like this:
>
> For me just having the gotos is cleaner than that here.
>
> But perhaps the right fix would be to not have ExecPrepareTupleRouting()
> change global state at all, and instead change it much more locally
> inside ExecInsert(), around the calls that need it to be set
> differently.
>
> Or perhaps the actually correct fix is to remove es_result_relation_info
> alltogether, and just pass it down the places that need it - we've a lot
> more code setting it than using the value.  And it'd not be hard to
> actually pass it to the places that read it.  Given all the
> setting/resetting of it it's pretty obvious that a query-global resource
> isn't the right place for it.

I tend to agree that managing state through es_result_relation_info
across various operations on a result relation has turned a bit messy
at this point.  That said, while most of the places that access the
currently active result relation from es_result_relation_info can be
easily modified to receive it directly, the FDW API BeginDirectModify
poses bit of a challenge.  BeginDirectlyModify() is called via
ExecInitForeignScan() that in turn can't be changed to add a result
relation (Index or ResultRelInfo *) argument, so the only way left for
BeginDirectlyModify() is to access it via es_result_relation_info.

Maybe we can do to ExecPrepareTupleRouting() what you say -- remove
all code in it that changes ModifyTable-global and EState-global
states.  Also, maybe call ExecPrepareTupleRouting() inside
ExecInsert() at the beginning instead of outside of it.  I agree that
setting and reverting global states around the exact piece of code
that need that to be done is better for clarity.  All of that assuming
you're not saying that we scrap ExecPrepareTupleRouting altogether.

Thoughts?  Other opinions?

> > Would you like me to write a patch for some or all items?
>
> Yes, that would be awesome.

OK, I will try to post a patch soon.

Thanks,
Amit



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Performance issue in foreign-key-aware join estimation
Next
From: Fabien COELHO
Date:
Subject: Re: refactoring - share str2*int64 functions