Re: partition routing layering in nodeModifyTable.c - Mailing list pgsql-hackers

From Andres Freund
Subject Re: partition routing layering in nodeModifyTable.c
Date
Msg-id 20190718055334.4arwrufbhe7zcsjx@alap3.anarazel.de
Whole thread Raw
In response to Re: partition routing layering in nodeModifyTable.c  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: partition routing layering in nodeModifyTable.c  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
Hi,

On 2019-07-18 14:24:29 +0900, Amit Langote wrote:
> On Thu, Jul 18, 2019 at 10:09 AM Andres Freund <andres@anarazel.de> wrote:
> > 1) How come partition routing is done outside of ExecInsert()?
> >
> >             case CMD_INSERT:
> >                 /* Prepare for tuple routing if needed. */
> >                 if (proute)
> >                     slot = ExecPrepareTupleRouting(node, estate, proute,
> >                                                    resultRelInfo, slot);
> >                 slot = ExecInsert(node, slot, planSlot,
> >                                   estate, node->canSetTag);
> >                 /* Revert ExecPrepareTupleRouting's state change. */
> >                 if (proute)
> >                     estate->es_result_relation_info = resultRelInfo;
> >                 break;
> >
> > That already seems like a layering violation,
> 
> The decision to move partition routing out of ExecInsert() came about
> when we encountered a bug [1] whereby ExecInsert() would fail to reset
> estate->es_result_relation_info back to the root table if it had to
> take an abnormal path out (early return), of which there are quite a
> few instances.  The first solution I came up with was to add a goto
> label  for the code to reset estate->es_result_relation_info and jump
> to it from the various places that do an early return, which was
> complained about as reducing readability.  So, the solution we
> eventually settled on in 6666ee49f was to perform ResultRelInfos
> switching at a higher level.

I think that was the wrong path, given that the code now lives in
multiple places. Without even a comment explaining that if one has to be
changed, the other has to be changed too.


> > It seems to me that if we just moved the ExecPrepareTupleRouting() into
> > ExecInsert(), we could remove the duplication.
> 
> I agree that there's duplication here.  Given what I wrote above, I
> can think of doing this: move all of ExecInsert()'s code into
> ExecInsertInternal() and make the former instead look like this:

For me just having the gotos is cleaner than that here.

But perhaps the right fix would be to not have ExecPrepareTupleRouting()
change global state at all, and instead change it much more locally
inside ExecInsert(), around the calls that need it to be set
differently.

Or perhaps the actually correct fix is to remove es_result_relation_info
alltogether, and just pass it down the places that need it - we've a lot
more code setting it than using the value.  And it'd not be hard to
actually pass it to the places that read it.  Given all the
setting/resetting of it it's pretty obvious that a query-global resource
isn't the right place for it.



> >     /*
> >      * triggers behave differently depending on this being a delete as
> >      * part of a partion move, or a deletion proper.
> >     if (mtstate->operation == CMD_UPDATE)
> >     {
> >         /*
> >          * If this insert is the result of a partition key update that moved the
> >          * tuple to a new partition, put this row into the transition NEW TABLE,
> >          * if there is one. We need to do this separately for DELETE and INSERT
> >          * because they happen on different tables.
> >          */
> >         ExecARUpdateTriggers(estate, resultRelInfo, NULL,
> >                              NULL,
> >                              slot,
> >                              NULL,
> >                              mtstate->mt_transition_capture);
> >
> >         /*
> >          * But we do want to fire plain per-row INSERT triggers on the
> >          * new table. By not passing in transition_capture we prevent
> >          * ....
> >          */
> >          ExecARInsertTriggers(estate, resultRelInfo, slot, recheckIndexes,
> >                               NULL);
> >     }
> >     else
> >     {
> >         /* AFTER ROW INSERT Triggers */
> >         ExecARInsertTriggers(estate, resultRelInfo, slot, recheckIndexes,
> >                              ar_insert_trig_tcs);
> >     }
> 
> Maybe you meant to use mtstate->mt_transition_capture instead of
> ar_insert_trig_tcs in the else block.  We don't need
> ar_insert_trig_tcs at all.

Yes, it was just a untested example of how the code could be made
clearer.


> > it seems like it'd be quite a bit clearer (although I do think the
> > comments also need a fair bit of polishing independent of this proposed
> > change).
> 
> Fwiw, I agree with your proposed restructuring, although I'd let Amit
> Kh chime in as he'd be more familiar with this code.  I wasn't aware
> of this partitioning-related bit being present in ExecInsert().
> 
> Would you like me to write a patch for some or all items?

Yes, that would be awesome.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs
Next
From: Andres Freund
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs