Thread: Using results from INSERT ... RETURNING

Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

07 July 2009, 17:37:22

Hello.

Here's a patch(WIP) that implements INSERT .. RETURNING inside a CTE.
Should apply cleanly against CVS head.

The INSERT query isn't rewritten so rules and default values don't work.
Recursive CTEs don't work either.

Regards,
Marko Tiikkaja
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 651,656 **** explain_outNode(StringInfo str,
--- 651,659 ----
          case T_Hash:
              pname = "Hash";
              break;
+         case T_InsertReturning:
+             pname = "INSERT RETURNING";
+             break;
          default:
              pname = "???";
              break;
*** a/src/backend/executor/Makefile
--- b/src/backend/executor/Makefile
***************
*** 15,21 **** include $(top_builddir)/src/Makefile.global
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
--- 15,21 ----
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o nodeInsertReturning.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 86,94 **** static void ExecutePlan(EState *estate, PlanState *planstate,
              DestReceiver *dest);
  static void ExecSelect(TupleTableSlot *slot,
             DestReceiver *dest, EState *estate);
! static void ExecInsert(TupleTableSlot *slot, ItemPointer tupleid,
             TupleTableSlot *planSlot,
!            DestReceiver *dest, EState *estate);
  static void ExecDelete(ItemPointer tupleid,
             TupleTableSlot *planSlot,
             DestReceiver *dest, EState *estate);
--- 86,94 ----
              DestReceiver *dest);
  static void ExecSelect(TupleTableSlot *slot,
             DestReceiver *dest, EState *estate);
! void ExecInsert(TupleTableSlot *slot, ItemPointer tupleid,
             TupleTableSlot *planSlot,
!            DestReceiver *dest, EState *estate, ResultRelInfo* resultRelInfo, bool clearReturningTuple);
  static void ExecDelete(ItemPointer tupleid,
             TupleTableSlot *planSlot,
             DestReceiver *dest, EState *estate);
***************
*** 98,104 **** static void ExecUpdate(TupleTableSlot *slot, ItemPointer tupleid,
  static void ExecProcessReturning(ProjectionInfo *projectReturning,
                       TupleTableSlot *tupleSlot,
                       TupleTableSlot *planSlot,
!                      DestReceiver *dest);
  static TupleTableSlot *EvalPlanQualNext(EState *estate);
  static void EndEvalPlanQual(EState *estate);
  static void ExecCheckRTPerms(List *rangeTable);
--- 98,105 ----
  static void ExecProcessReturning(ProjectionInfo *projectReturning,
                       TupleTableSlot *tupleSlot,
                       TupleTableSlot *planSlot,
!                      DestReceiver *dest,
!                      bool clearTuple);
  static TupleTableSlot *EvalPlanQualNext(EState *estate);
  static void EndEvalPlanQual(EState *estate);
  static void ExecCheckRTPerms(List *rangeTable);
***************
*** 190,196 **** standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
          case CMD_SELECT:
              /* SELECT INTO and SELECT FOR UPDATE/SHARE need to mark tuples */
              if (queryDesc->plannedstmt->intoClause != NULL ||
!                 queryDesc->plannedstmt->rowMarks != NIL)
                  estate->es_output_cid = GetCurrentCommandId(true);
              break;

--- 191,198 ----
          case CMD_SELECT:
              /* SELECT INTO and SELECT FOR UPDATE/SHARE need to mark tuples */
              if (queryDesc->plannedstmt->intoClause != NULL ||
!                 queryDesc->plannedstmt->rowMarks != NIL ||
!                 queryDesc->plannedstmt->hasWritableCtes)
                  estate->es_output_cid = GetCurrentCommandId(true);
              break;

***************
*** 1670,1676 **** lnext:    ;
                  break;

              case CMD_INSERT:
!                 ExecInsert(slot, tupleid, planSlot, dest, estate);
                  break;

              case CMD_DELETE:
--- 1672,1678 ----
                  break;

              case CMD_INSERT:
!                 ExecInsert(slot, tupleid, planSlot, dest, estate, estate->es_result_relation_info, true);
                  break;

              case CMD_DELETE:
***************
*** 1742,1756 **** ExecSelect(TupleTableSlot *slot,
   *        index relations.
   * ----------------------------------------------------------------
   */
! static void
  ExecInsert(TupleTableSlot *slot,
             ItemPointer tupleid,
             TupleTableSlot *planSlot,
             DestReceiver *dest,
!            EState *estate)
  {
      HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
      Relation    resultRelationDesc;
      Oid            newId;

--- 1744,1759 ----
   *        index relations.
   * ----------------------------------------------------------------
   */
! void
  ExecInsert(TupleTableSlot *slot,
             ItemPointer tupleid,
             TupleTableSlot *planSlot,
             DestReceiver *dest,
!            EState *estate,
!            ResultRelInfo* resultRelInfo,
!            bool clearReturningTuple)
  {
      HeapTuple    tuple;
      Relation    resultRelationDesc;
      Oid            newId;

***************
*** 1763,1769 **** ExecInsert(TupleTableSlot *slot,
      /*
       * get information on the (current) result relation
       */
-     resultRelInfo = estate->es_result_relation_info;
      resultRelationDesc = resultRelInfo->ri_RelationDesc;

      /*
--- 1766,1771 ----
***************
*** 1842,1848 **** ExecInsert(TupleTableSlot *slot,
      /* Process RETURNING if present */
      if (resultRelInfo->ri_projectReturning)
          ExecProcessReturning(resultRelInfo->ri_projectReturning,
!                              slot, planSlot, dest);
  }

  /* ----------------------------------------------------------------
--- 1844,1850 ----
      /* Process RETURNING if present */
      if (resultRelInfo->ri_projectReturning)
          ExecProcessReturning(resultRelInfo->ri_projectReturning,
!                              slot, planSlot, dest, clearReturningTuple);
  }

  /* ----------------------------------------------------------------
***************
*** 1968,1974 **** ldelete:;
          ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);

          ExecProcessReturning(resultRelInfo->ri_projectReturning,
!                              slot, planSlot, dest);

          ExecClearTuple(slot);
          ReleaseBuffer(delbuffer);
--- 1970,1976 ----
          ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);

          ExecProcessReturning(resultRelInfo->ri_projectReturning,
!                              slot, planSlot, dest, true);

          ExecClearTuple(slot);
          ReleaseBuffer(delbuffer);
***************
*** 2140,2146 **** lreplace:;
      /* Process RETURNING if present */
      if (resultRelInfo->ri_projectReturning)
          ExecProcessReturning(resultRelInfo->ri_projectReturning,
!                              slot, planSlot, dest);
  }

  /*
--- 2142,2148 ----
      /* Process RETURNING if present */
      if (resultRelInfo->ri_projectReturning)
          ExecProcessReturning(resultRelInfo->ri_projectReturning,
!                              slot, planSlot, dest, true);
  }

  /*
***************
*** 2254,2260 **** static void
  ExecProcessReturning(ProjectionInfo *projectReturning,
                       TupleTableSlot *tupleSlot,
                       TupleTableSlot *planSlot,
!                      DestReceiver *dest)
  {
      ExprContext *econtext = projectReturning->pi_exprContext;
      TupleTableSlot *retSlot;
--- 2256,2262 ----
  ExecProcessReturning(ProjectionInfo *projectReturning,
                       TupleTableSlot *tupleSlot,
                       TupleTableSlot *planSlot,
!                      DestReceiver *dest, bool clearTuple)
  {
      ExprContext *econtext = projectReturning->pi_exprContext;
      TupleTableSlot *retSlot;
***************
*** 2275,2281 **** ExecProcessReturning(ProjectionInfo *projectReturning,
      /* Send to dest */
      (*dest->receiveSlot) (retSlot, dest);

!     ExecClearTuple(retSlot);
  }

  /*
--- 2277,2284 ----
      /* Send to dest */
      (*dest->receiveSlot) (retSlot, dest);

!     if (clearTuple)
!         ExecClearTuple(retSlot);
  }

  /*
*** a/src/backend/executor/execProcnode.c
--- b/src/backend/executor/execProcnode.c
***************
*** 286,291 **** ExecInitNode(Plan *node, EState *estate, int eflags)
--- 286,296 ----
                                                   estate, eflags);
              break;

+         case T_InsertReturning:
+             result = (PlanState *) ExecInitInsertReturning((Limit *) node,
+                                                  estate, eflags);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;        /* keep compiler quiet */
***************
*** 451,456 **** ExecProcNode(PlanState *node)
--- 456,465 ----
              result = ExecLimit((LimitState *) node);
              break;

+         case T_InsertReturningState:
+             result = ExecInsertReturning((InsertReturningState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;
***************
*** 627,632 **** ExecCountSlotsNode(Plan *node)
--- 636,644 ----
          case T_Limit:
              return ExecCountSlotsLimit((Limit *) node);

+         case T_InsertReturning:
+             return ExecCountSlotsInsertReturning((InsertReturning *) node);
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
***************
*** 783,788 **** ExecEndNode(PlanState *node)
--- 795,804 ----
              ExecEndLimit((LimitState *) node);
              break;

+         case T_InsertReturningState:
+             ExecEndInsertReturning((InsertReturningState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
*** /dev/null
--- b/src/backend/executor/nodeInsertReturning.c
***************
*** 0 ****
--- 1,120 ----
+ #include "postgres.h"
+
+ #include "executor/executor.h"
+ #include "executor/execdebug.h"
+ #include "executor/nodeInsertReturning.h"
+ #include "utils/memutils.h"
+
+ static struct DR_insertReturning
+ {
+     DestReceiver        dr;
+     TupleTableSlot      **targetSlot;
+ } receiver;
+
+ static void
+ receive(TupleTableSlot *slot, DestReceiver *self)
+ {
+     struct DR_insertReturning *myState = (struct DR_insertReturning *) self;
+
+     *myState->targetSlot = slot;
+ }
+
+ TupleTableSlot *
+ ExecInsertReturning(InsertReturningState *node)
+ {
+     EState        *estate = node->ps.state;
+
+     TupleTableSlot *slot;
+
+     /* Get a tuple from the subplan */
+     slot = ExecProcNode(outerPlanState(node));
+
+     if (TupIsNull(slot))
+         return NULL;
+
+     ExecInsert(slot, NULL, slot, (DestReceiver *) &receiver, estate, node->resultRelInfo, false);
+
+     return node->ps.ps_ResultTupleSlot;
+ }
+
+ InsertReturningState *
+ ExecInitInsertReturning(InsertReturning *node, EState *estate, int eflags)
+ {
+     InsertReturningState *resstate;
+     ExprContext *exprContext;
+     ResultRelInfo *resultRelInfo;
+     Relation resultRelation;
+
+     /*
+      * create state structure
+      */
+     resstate = makeNode(InsertReturningState);
+     resstate->ps.plan = (Plan *) node;
+     resstate->ps.state = estate;
+     resstate->ps.targetlist = node->plan.targetlist;
+
+     outerPlanState(resstate) = ExecInitNode(outerPlan(node), estate, eflags);
+
+     /*
+      * Initialize result tuple slot and assign
+      * type from the target list.
+      */
+     ExecInitResultTupleSlot(estate, &resstate->ps);
+     ExecAssignResultTypeFromTL(&resstate->ps);
+
+     /*
+      * Prepare the RETURNING expression tree for execution. This
+      * has to be done after calling ExecAssignResultTypeFromTL().
+      */
+     resstate->ps.targetlist = (List *)
+         ExecInitExpr((Expr *) node->plan.targetlist,
+                        (PlanState *) resstate);
+
+     /* Initialize result relation info */
+     resultRelInfo = (ResultRelInfo *) palloc0(sizeof(ResultRelInfo));
+     resultRelation = heap_open(node->resultRelationOid, RowExclusiveLock);
+     InitResultRelInfo(resultRelInfo, resultRelation, node->resultRelationIndex, CMD_INSERT, estate->es_instrument);
+
+     /* Initialize RETURNING projection */
+     exprContext = CreateExprContext(estate);
+     resultRelInfo->ri_projectReturning = ExecBuildProjectionInfo(resstate->ps.targetlist,
+                                 exprContext,
+                                 resstate->ps.ps_ResultTupleSlot,
+                                 NULL);
+
+     resstate->resultRelInfo = resultRelInfo;
+
+     /* Assign tuple receiver info */
+     receiver.dr.receiveSlot = receive;
+     receiver.targetSlot = &resstate->ps.ps_ResultTupleSlot;
+
+     return resstate;
+ }
+
+ int
+ ExecCountSlotsInsertReturning(InsertReturning *node)
+ {
+     return ExecCountSlotsNode(outerPlan(node)) + 2;
+ }
+
+ void
+ ExecEndInsertReturning(InsertReturningState *node)
+ {
+     heap_close(node->resultRelInfo->ri_RelationDesc, NoLock);
+     pfree(node->resultRelInfo);
+
+     /*
+      * Free the exprcontext
+      */
+     ExecFreeExprContext(&node->ps);
+
+     /*
+      * clean out the tuple table
+      */
+     ExecClearTuple(node->ps.ps_ResultTupleSlot);
+
+     /*
+      * shut down subplans
+      */
+     ExecEndNode(outerPlanState(node));
+ }
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 1391,1396 **** _copyXmlExpr(XmlExpr *from)
--- 1391,1407 ----

      return newnode;
  }
+
+ static InsertReturning *
+ _copyInsertReturning(InsertReturning *from)
+ {
+     InsertReturning    *newnode = makeNode(InsertReturning);
+
+     CopyPlanFields((Plan *) from, (Plan *) newnode);
+
+     return newnode;
+ }
+

  /*
   * _copyNullIfExpr (same as OpExpr)
***************
*** 4093,4098 **** copyObject(void *from)
--- 4104,4112 ----
          case T_XmlSerialize:
              retval = _copyXmlSerialize(from);
              break;
+         case T_InsertReturning:
+             retval = _copyInsertReturning(from);
+             break;

          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
***************
*** 2354,2359 **** bool
--- 2354,2403 ----
                      return true;
              }
              break;
+         case T_InsertStmt:
+             {
+                 InsertStmt *stmt = (InsertStmt *) node;
+
+                 if (walker(stmt->relation, context))
+                     return true;
+                 if (walker(stmt->cols, context))
+                     return true;
+                 if (walker(stmt->selectStmt, context))
+                     return true;
+                 if (walker(stmt->returningList, context))
+                     return true;
+             }
+             break;
+         case T_UpdateStmt:
+             {
+                 UpdateStmt *stmt = (UpdateStmt *) node;
+
+                 if (walker(stmt->relation, context))
+                     return true;
+                 if (walker(stmt->targetList, context))
+                     return true;
+                 if (walker(stmt->whereClause, context))
+                     return true;
+                 if (walker(stmt->fromClause, context))
+                     return true;
+                 if (walker(stmt->returningList, context))
+                     return true;
+             }
+             break;
+         case T_DeleteStmt:
+             {
+                 DeleteStmt *stmt = (DeleteStmt *) node;
+
+                 if (walker(stmt->relation, context))
+                     return true;
+                 if (walker(stmt->usingClause, context))
+                     return true;
+                 if (walker(stmt->whereClause, context))
+                     return true;
+                 if (walker(stmt->returningList, context))
+                     return true;
+             }
+             break;
          case T_A_Expr:
              {
                  A_Expr       *expr = (A_Expr *) node;
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 155,160 **** standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
--- 155,161 ----
      glob->finalrtable = NIL;
      glob->relationOids = NIL;
      glob->invalItems = NIL;
+     glob->hasWritableCtes = false;
      glob->lastPHId = 0;
      glob->transientPlan = false;

***************
*** 224,229 **** standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
--- 225,231 ----
      result->resultRelations = root->resultRelations;
      result->utilityStmt = parse->utilityStmt;
      result->intoClause = parse->intoClause;
+     result->hasWritableCtes = glob->hasWritableCtes;
      result->subplans = glob->subplans;
      result->rewindPlanIDs = glob->rewindPlanIDs;
      result->returningLists = root->returningLists;
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 375,380 **** set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset)
--- 375,393 ----
              set_join_references(glob, (Join *) plan, rtoffset);
              break;

+         case T_InsertReturning:
+             {
+                 /*
+                  * grouping_planner() already called
+                  * set_returning_clause_references so the targetList's
+                  * references are already set.
+                  */
+                 InsertReturning *splan = (InsertReturning *) plan;
+
+                 splan->resultRelationIndex += rtoffset;
+             }
+             break;
+
          case T_Hash:
          case T_Material:
          case T_Sort:
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 880,885 **** SS_process_ctes(PlannerInfo *root)
--- 880,886 ----
          Bitmapset  *tmpset;
          int            paramid;
          Param       *prm;
+         InsertReturning *returningNode;

          /*
           * Ignore CTEs that are not actually referenced anywhere.
***************
*** 897,902 **** SS_process_ctes(PlannerInfo *root)
--- 898,904 ----
           */
          subquery = (Query *) copyObject(cte->ctequery);

+
          /*
           * Generate the plan for the CTE query.  Always plan for full
           * retrieval --- we don't have enough info to predict otherwise.
***************
*** 954,959 **** SS_process_ctes(PlannerInfo *root)
--- 956,985 ----
          prm = generate_new_param(root, INTERNALOID, -1);
          splan->setParam = list_make1_int(prm->paramid);

+         /* Handle INSERT .. RETURNING inside CTE */
+         if (subquery->commandType != CMD_SELECT)
+         {
+             Oid resultRelationOid;
+             Index resultRelationIndex;
+
+             Assert(subquery->commandType == CMD_INSERT);
+
+             Assert(subquery->resultRelation > 0);
+             Assert(list_length(subroot->returningLists) == 1);
+
+             returningNode = makeNode(InsertReturning);
+             returningNode->plan.lefttree = plan;
+
+             resultRelationOid = getrelid(subquery->resultRelation, subquery->rtable);
+             returningNode->resultRelationOid = resultRelationOid;
+
+             returningNode->plan.targetlist = linitial(subroot->returningLists);
+
+             root->glob->hasWritableCtes = true;
+
+             plan = returningNode;
+         }
+
          /*
           * Add the subplan and its rtable to the global lists.
           */
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 7026,7031 **** common_table_expr:  name opt_name_list AS select_with_parens
--- 7026,7058 ----
                  n->location = @1;
                  $$ = (Node *) n;
              }
+         | name opt_name_list AS '(' InsertStmt ')'
+             {
+                 CommonTableExpr *n = makeNode(CommonTableExpr);
+                 n->ctename = $1;
+                 n->aliascolnames = $2;
+                 n->ctequery = $5;
+                 n->location = @1;
+                 $$ = (Node *) n;
+             }
+         | name opt_name_list AS '(' UpdateStmt ')'
+             {
+                 CommonTableExpr *n = makeNode(CommonTableExpr);
+                 n->ctename = $1;
+                 n->aliascolnames = $2;
+                 n->ctequery = $5;
+                 n->location = @1;
+                 $$ = (Node *) n;
+             }
+         | name opt_name_list AS '(' DeleteStmt ')'
+             {
+                 CommonTableExpr *n = makeNode(CommonTableExpr);
+                 n->ctename = $1;
+                 n->aliascolnames = $2;
+                 n->ctequery = $5;
+                 n->location = @1;
+                 $$ = (Node *) n;
+             }
          ;

  into_clause:
*** a/src/backend/parser/parse_cte.c
--- b/src/backend/parser/parse_cte.c
***************
*** 18,23 ****
--- 18,24 ----
  #include "nodes/nodeFuncs.h"
  #include "parser/analyze.h"
  #include "parser/parse_cte.h"
+ #include "nodes/plannodes.h"
  #include "utils/builtins.h"


***************
*** 246,268 **** transformWithClause(ParseState *pstate, WithClause *withClause)
  static void
  analyzeCTE(ParseState *pstate, CommonTableExpr *cte)
  {
!     Query       *query;

      /* Analysis not done already */
!     Assert(IsA(cte->ctequery, SelectStmt));

      query = parse_sub_analyze(cte->ctequery, pstate);
      cte->ctequery = (Node *) query;

      /*
       * Check that we got something reasonable.    Many of these conditions are
       * impossible given restrictions of the grammar, but check 'em anyway.
!      * (These are the same checks as in transformRangeSubselect.)
       */
      if (!IsA(query, Query) ||
!         query->commandType != CMD_SELECT ||
!         query->utilityStmt != NULL)
!         elog(ERROR, "unexpected non-SELECT command in subquery in WITH");
      if (query->intoClause)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
--- 247,284 ----
  static void
  analyzeCTE(ParseState *pstate, CommonTableExpr *cte)
  {
!     Query        *query;
!     List        *ctelist;

      /* Analysis not done already */
!     /* This needs to be one of SelectStmt, InsertStmt, UpdateStmt, DeleteStmt instead of:
!      * Assert(IsA(cte->ctequery, SelectStmt)); */

      query = parse_sub_analyze(cte->ctequery, pstate);
      cte->ctequery = (Node *) query;

+     if (query->commandType == CMD_SELECT)
+         ctelist = query->targetList;
+     else
+         ctelist = query->returningList;
+
      /*
       * Check that we got something reasonable.    Many of these conditions are
       * impossible given restrictions of the grammar, but check 'em anyway.
!      * (In addition to the same checks as in transformRangeSubselect,
!      * this adds checks for (INSERT|UPDATE|DELETE)...RETURNING.)
       */
      if (!IsA(query, Query) ||
!         query->utilityStmt != NULL ||
!         (query->commandType != CMD_SELECT &&
!         ((query->commandType == CMD_INSERT ||
!           query->commandType == CMD_UPDATE ||
!           query->commandType == CMD_DELETE) &&
!          query->returningList == NULL)))
!         ereport(ERROR,
!                 (errcode(ERRCODE_SYNTAX_ERROR),
!                  errmsg("unexpected non-row-returning command in subquery in WITH"),
!                  parser_errposition(pstate, 0)));
      if (query->intoClause)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
***************
*** 273,279 **** analyzeCTE(ParseState *pstate, CommonTableExpr *cte)
      if (!cte->cterecursive)
      {
          /* Compute the output column names/types if not done yet */
!         analyzeCTETargetList(pstate, cte, query->targetList);
      }
      else
      {
--- 289,295 ----
      if (!cte->cterecursive)
      {
          /* Compute the output column names/types if not done yet */
!         analyzeCTETargetList(pstate, cte, ctelist);
      }
      else
      {
***************
*** 291,297 **** analyzeCTE(ParseState *pstate, CommonTableExpr *cte)
          lctyp = list_head(cte->ctecoltypes);
          lctypmod = list_head(cte->ctecoltypmods);
          varattno = 0;
!         foreach(lctlist, query->targetList)
          {
              TargetEntry *te = (TargetEntry *) lfirst(lctlist);
              Node       *texpr;
--- 307,313 ----
          lctyp = list_head(cte->ctecoltypes);
          lctypmod = list_head(cte->ctecoltypmods);
          varattno = 0;
!         foreach(lctlist, ctelist)
          {
              TargetEntry *te = (TargetEntry *) lfirst(lctlist);
              Node       *texpr;
*** a/src/backend/parser/parse_relation.c
--- b/src/backend/parser/parse_relation.c
***************
*** 1402,1409 **** addRangeTableEntryForCTE(ParseState *pstate,
      rte->ctelevelsup = levelsup;

      /* Self-reference if and only if CTE's parse analysis isn't completed */
!     rte->self_reference = !IsA(cte->ctequery, Query);
!     Assert(cte->cterecursive || !rte->self_reference);
      /* Bump the CTE's refcount if this isn't a self-reference */
      if (!rte->self_reference)
          cte->cterefcount++;
--- 1402,1409 ----
      rte->ctelevelsup = levelsup;

      /* Self-reference if and only if CTE's parse analysis isn't completed */
!     rte->self_reference = !IsA(cte->ctequery, Query) && !IsA(cte->ctequery, InsertReturning);
!     Assert(cte->cterecursive || !rte->self_reference || IsA(cte->ctequery, InsertReturning));
      /* Bump the CTE's refcount if this isn't a self-reference */
      if (!rte->self_reference)
          cte->cterefcount++;
*** a/src/backend/parser/parse_target.c
--- b/src/backend/parser/parse_target.c
***************
*** 310,319 **** markTargetListOrigin(ParseState *pstate, TargetEntry *tle,
              {
                  CommonTableExpr *cte = GetCTEForRTE(pstate, rte, netlevelsup);
                  TargetEntry *ste;

                  /* should be analyzed by now */
                  Assert(IsA(cte->ctequery, Query));
!                 ste = get_tle_by_resno(((Query *) cte->ctequery)->targetList,
                                         attnum);
                  if (ste == NULL || ste->resjunk)
                      elog(ERROR, "subquery %s does not have attribute %d",
--- 310,321 ----
              {
                  CommonTableExpr *cte = GetCTEForRTE(pstate, rte, netlevelsup);
                  TargetEntry *ste;
+                 Query        *query;

                  /* should be analyzed by now */
                  Assert(IsA(cte->ctequery, Query));
!                 query = (Query *) cte->ctequery;
!                 ste = get_tle_by_resno((query->commandType == CMD_SELECT) ? query->targetList : query->returningList,
                                         attnum);
                  if (ste == NULL || ste->resjunk)
                      elog(ERROR, "subquery %s does not have attribute %d",
***************
*** 1233,1243 **** expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
              {
                  CommonTableExpr *cte = GetCTEForRTE(pstate, rte, netlevelsup);
                  TargetEntry *ste;

                  /* should be analyzed by now */
                  Assert(IsA(cte->ctequery, Query));
!                 ste = get_tle_by_resno(((Query *) cte->ctequery)->targetList,
!                                        attnum);
                  if (ste == NULL || ste->resjunk)
                      elog(ERROR, "subquery %s does not have attribute %d",
                           rte->eref->aliasname, attnum);
--- 1235,1252 ----
              {
                  CommonTableExpr *cte = GetCTEForRTE(pstate, rte, netlevelsup);
                  TargetEntry *ste;
+                 Query        *query;
+                 List        *ctelist;

                  /* should be analyzed by now */
                  Assert(IsA(cte->ctequery, Query));
!                 query = (Query *) cte->ctequery;
!                 if (query->commandType == CMD_SELECT)
!                     ctelist = query->targetList;
!                 else
!                     ctelist = query->returningList;
!
!                 ste = get_tle_by_resno(ctelist, attnum);
                  if (ste == NULL || ste->resjunk)
                      elog(ERROR, "subquery %s does not have attribute %d",
                           rte->eref->aliasname, attnum);
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***************
*** 3800,3808 **** get_name_for_var_field(Var *var, int fieldno,
                  }
                  if (lc != NULL)
                  {
!                     Query       *ctequery = (Query *) cte->ctequery;
!                     TargetEntry *ste = get_tle_by_resno(ctequery->targetList,
!                                                         attnum);

                      if (ste == NULL || ste->resjunk)
                          elog(ERROR, "subquery %s does not have attribute %d",
--- 3800,3814 ----
                  }
                  if (lc != NULL)
                  {
!                     Query        *ctequery = (Query *) cte->ctequery;
!                     List        *ctelist;
!
!                     if (ctequery->commandType == CMD_SELECT)
!                         ctelist = ctequery->targetList;
!                     else
!                         ctelist = ctequery->returningList;
!
!                     TargetEntry *ste = get_tle_by_resno(ctelist, attnum);

                      if (ste == NULL || ste->resjunk)
                          elog(ERROR, "subquery %s does not have attribute %d",
*** /dev/null
--- b/src/include/executor/nodeInsertReturning.h
***************
*** 0 ****
--- 1,11 ----
+ #ifndef NODEINSERTRETURNING_H
+ #define NODEINSERTRETURNING_H
+
+ #include "nodes/execnodes.h"
+
+ extern int    ExecCountSlotsInsertReturning(InsertReturning *node);
+ extern InsertReturningState *ExecInitInsertReturning(InsertReturning *node, EState *estate, int eflags);
+ extern TupleTableSlot *ExecInsertReturning(InsertReturningState *node);
+ extern void ExecEndInsertReturning(InsertReturningState *node);
+
+ #endif
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 978,983 **** typedef struct ResultState
--- 978,994 ----
  } ResultState;

  /* ----------------
+  *     InsertReturningState information
+  * ----------------
+  */
+ typedef struct InsertReturningState
+ {
+     PlanState        ps;                /* its first field is NodeTag */
+     ResultRelInfo  *resultRelInfo;
+ } InsertReturningState;
+
+
+ /* ----------------
   *     AppendState information
   *
   *        nplans            how many plans are in the list
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 71,76 **** typedef enum NodeTag
--- 71,77 ----
      T_Hash,
      T_SetOp,
      T_Limit,
+     T_InsertReturning,
      /* this one isn't a subclass of Plan: */
      T_PlanInvalItem,

***************
*** 190,195 **** typedef enum NodeTag
--- 191,197 ----
      T_NullTestState,
      T_CoerceToDomainState,
      T_DomainConstraintState,
+     T_InsertReturningState,

      /*
       * TAGS FOR PLANNER NODES (relation.h)
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 53,58 **** typedef struct PlannedStmt
--- 53,60 ----

      IntoClause *intoClause;        /* target for SELECT INTO / CREATE TABLE AS */

+     bool        hasWritableCtes; /* true if there's an (INSERT|UPDATE|DELETE) .. RETURNING inside a CTE */
+
      List       *subplans;        /* Plan trees for SubPlan expressions */

      Bitmapset  *rewindPlanIDs;    /* indices of subplans that require REWIND */
***************
*** 164,169 **** typedef struct Result
--- 166,179 ----
      Node       *resconstantqual;
  } Result;

+ typedef struct InsertReturning
+ {
+     Plan       plan;
+
+     Oid            resultRelationOid;
+     int            resultRelationIndex; /* rtable index of the result relation*/
+ } InsertReturning;
+
  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 76,81 **** typedef struct PlannerGlobal
--- 76,83 ----

      List       *invalItems;        /* other dependencies, as PlanInvalItems */

+     bool        hasWritableCtes; /* is there an (INSERT|UPDATE|DELETE) .. RETURNING inside a CTE? */
+
      Index        lastPHId;        /* highest PlaceHolderVar ID assigned */

      bool        transientPlan;    /* redo plan when TransactionXmin changes? */

Re: Using results from INSERT ... RETURNING

From

Peter Eisentraut

Date:

17 July 2009, 04:42:18

On Tuesday 07 July 2009 23:31:54 Marko Tiikkaja wrote:
> Here's a patch(WIP) that implements INSERT .. RETURNING inside a CTE.

Could you supply some test cases to illustrate what this patch accomplishes?

Re: Using results from INSERT ... RETURNING

From

David Fetter

Date:

17 July 2009, 23:12:36

On Fri, Jul 17, 2009 at 10:42:02AM +0300, Peter Eisentraut wrote:
> On Tuesday 07 July 2009 23:31:54 Marko Tiikkaja wrote:
> > Here's a patch(WIP) that implements INSERT .. RETURNING inside a CTE.
> 
> Could you supply some test cases to illustrate what this patch accomplishes?

postgres:54321=# CREATE TABLE t(i INTEGER);
CREATE TABLE

postgres:54321=# WITH t1 AS (   INSERT INTO t VALUES (1),(2),(3)   RETURNING 'INSERT', i
) SELECT * FROM t1;?column? | i 
----------+---INSERT   | 1INSERT   | 2INSERT   | 3
(3 rows)

Not working yet:

CREATE TABLE t(i SERIAL PRIMARY KEY);
NOTICE:  CREATE TABLE will create implicit sequence "t_i_seq" for serial column "t.i"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t_pkey" for table "t"
CREATE TABLE

postgres:54321=# WITH t1 AS (INSERT INTO t VALUES
(DEFAULT),(DEFAULT),(DEFAULT) RETURNING 'INSERT', i) SELECT * FROM t1;
ERROR:  unrecognized node type: 337

Also planned, but no code written yet:

UPDATE ... RETURNING
DELETE ... RETURNING

UNION [ALL] of each of INSERT, UPDATE, and DELETE...RETURNING inside the
CTE, analogous to recursive CTEs with SELECT.

Way Out There Possibility: mix'n'match recursion.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Using results from INSERT ... RETURNING

From

Jaime Casanova

Date:

18 July 2009, 18:21:34

On Tue, Jul 7, 2009 at 3:31 PM, Marko
Tiikkaja<marko.tiikkaja@cs.helsinki.fi> wrote:
> Hello.
>
> Here's a patch(WIP) that implements INSERT .. RETURNING inside a CTE. Should
> apply cleanly against CVS head.
>
> The INSERT query isn't rewritten so rules and default values don't work.
> Recursive CTEs don't work either.
>

my questions first:
- what's the use case for this?
- why you need a node InsertReturning (see nodeInsertReturning.c) at all?
- if we will support this, shouldn't we supporting INSERT RETURNING
inside subqueries too?

and it crashes for triggers (example using regression's int4_tbl)

create function trg_int4_tbl() returns trigger as $$
begin raise notice 'ejecutando'; return new;
end;
$$ language plpgsql;

create trigger trig_int4_tbl before insert on int4_tbl for each row
execute procedure trg_int4_tbl();
with  q as (insert into int4_tbl select generate_series(1, 5) returning *)
select * from q;
NOTICE:  ejecutando
LOG:  server process (PID 20356) was terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes
server closed the connection unexpectedlyThis probably means the server terminated abnormallybefore or while processing
therequest. 
The connection to the server was lost. Attempting reset: FATAL:  the
database system is in recovery mode
Failed.
!> LOG:  all server processes terminated; reinitializing

and for defaults (even if i'm providing the values, actually is worse
in that case)

CREATE TABLE t(i SERIAL PRIMARY KEY);

with t1 as (insert into t values (default), (default), (default)
returning 'INSERT', i)
select * from t1;
ERROR:  unrecognized node type: 337

with t1 as (insert into t values (1), (2), (3) returning 'INSERT', i)
select * from t1;
LOG:  server process (PID 21604) was terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted; last known up at 2009-07-18 15:29:28 ECT
LOG:  database system was not properly shut down; automatic recovery in progress
server closed the connection unexpectedlyThis probably means the server terminated abnormallybefore or while processing
therequest. 
The connection to the server was lost. Attempting reset: FATAL:  the
database system is in recovery mode
Failed.
!> LOG:  redo starts at 0/32A0310

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Using results from INSERT ... RETURNING

From

Merlin Moncure

Date:

18 July 2009, 20:25:34

On Sat, Jul 18, 2009 at 5:21 PM, Jaime
Casanova<jcasanov@systemguards.com.ec> wrote:
> my questions first:
> - what's the use case for this?

Being able to use 'returning' in a subquery is probably the #1 most
requested feature for postgresql (it's also a todo).  Solving it for
'with' queries is a nice step in the right direction, and sidesteps
some of the traps that result from the general case.  There are many
obvious ways this feature is helpful...here's a couple:

move records from one table to another:
with foo as (delete from bar where something returning *) insert
insert into baz select foo.*:

gather defaulted values following an insert for later use:
with foo as (insert into bar(field) select 'hello' from
generate_series(1,n) returning *)  insert into baz select foo.*;

merlin

Re: Using results from INSERT ... RETURNING

From

Jaime Casanova

Date:

18 July 2009, 22:13:09

On Sat, Jul 18, 2009 at 6:25 PM, Merlin Moncure<mmoncure@gmail.com> wrote:
> On Sat, Jul 18, 2009 at 5:21 PM, Jaime
> Casanova<jcasanov@systemguards.com.ec> wrote:
>> my questions first:
>> - what's the use case for this?
>
> Being able to use 'returning' in a subquery is probably the #1 most
> requested feature for postgresql (it's also a todo). Solving it for
> 'with' queries is a nice step in the right direction, and sidesteps
> some of the traps that result from the general case.

ah! that's why i asked: 'if we will support this, shouldn't we
supporting INSERT RETURNING inside subqueries too?'
i'm not too confident with the code but i think the problems for both
cases have to be similar so if we solve one, why not the other?

>
> move records from one table to another:
> with foo as (delete from bar where something returning *) insert
> insert into baz select foo.*:
>

seems like a corner case...

> gather defaulted values following an insert for later use:
> with foo as (insert into bar(field) select 'hello' from
> generate_series(1,n) returning *)  insert into baz select foo.*;
>

ok


--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

18 July 2009, 23:31:52

Jaime Casanova <jcasanov@systemguards.com.ec> writes:
> On Sat, Jul 18, 2009 at 6:25 PM, Merlin Moncure<mmoncure@gmail.com> wrote:
>> Being able to use 'returning' in a subquery is probably the #1 most
>> requested feature for postgresql (it's also a todo). Solving it for
>> 'with' queries is a nice step in the right direction, and sidesteps
>> some of the traps that result from the general case.

> ah! that's why i asked: 'if we will support this, shouldn't we
> supporting INSERT RETURNING inside subqueries too?'

We've been over that: when will you fire triggers?  What happens if the
outer query doesn't want to read the whole output of the DML command,
or wants to read it more than once?

If the DML command is in a single-evaluation WITH clause at the top
level of the command, then it's reasonable to identify the outer
command's own begin and end as the times to fire triggers; and there is
no issue about partial or repeated evaluation.  If it's in a subquery
then things get much messier and more poorly defined.

Note that it can't just be "a WITH clause".  It has to be one at the top
query level, or the problem comes right back.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

19 July 2009, 08:19:27

On Sat, Jul 18, 2009 at 5:21 PM, Jaime
Casanova<jcasanov@systemguards.com.ec> wrote:
> On Tue, Jul 7, 2009 at 3:31 PM, Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi> wrote:
>> [...] rules and default values don't work.
>> Recursive CTEs don't work either.
[...]
> and it crashes for triggers

I think this is a great feature, and it would be REALLY great if it
supported UPDATE and DELETE as well.  DELETE in particular seems like
a really useful case for, e.g., moving records between two partitions
of a partitioned table.

However, it sounds to me like this is going to need more reworking
than is going to get done in the next week or two.  I would encourage
the Marko (the patch author) to ask any specific questions he may have
so we can try to get him some assistance in resolving them, and then I
think we should mark this "Returned with feedback".

...Robert

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

19 July 2009, 10:03:17

Jaime Casanova wrote:
> - why you need a node InsertReturning (see nodeInsertReturning.c) at all?
>   

I couldn't come up with a better way to do this.

> and it crashes for triggers (example using regression's int4_tbl)
>   

Right. I never tested this with triggers. The trigger tuple slot isn't
allocated in InitPlan(). Seems to be one of the many places where the 
code isn't aware that there can be a non-top-level DML statement. Thanks 
for testing.

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

19 July 2009, 13:39:59

Robert Haas <robertmhaas@gmail.com> writes:
> I think this is a great feature, and it would be REALLY great if it
> supported UPDATE and DELETE as well.

It won't get applied until it does, and I imagine the patch author
wasn't expecting any differently.  The submission was clearly marked
"WIP" not "ready to apply".

> However, it sounds to me like this is going to need more reworking
> than is going to get done in the next week or two.

Yeah.  I did a quick scan of the patch and was distressed at how much of
it seemed to be addition of new code; that implies that he's more or
less duplicated the top-level executor processing.

The way that I think this should be approached is
(1) a code-refactoring patch that moves INSERT/UPDATE/DELETE control
into plan nodes; then
(2) a feature patch that makes use of that to expose RETURNING in CTEs.

One thing that's not totally clear to me is whether we'd like to use
control plan nodes all the time, or only for statements with RETURNING.
The nice thing about the latter approach is that there's a well-defined
meaning for the plan node's targetlist.  (Its input is the stuff to be
stored, its output is the RETURNING result; much cleaner than the way
RETURNING is bolted onto the executor now.)  If we do it all the time
then the control nodes would have dummy targetlists for non-RETURNING
cases, which is a little bit ugly.  OTOH it may not be practical to
handle it like that without a lot of code duplication.

Another thing to think about is whether there should be three types
of control nodes or only one.

But anyway, my $0.02 is that the *first* order of business is to propose
a suitable refactoring of execMain.c.  We skipped doing that when we
first added RETURNING, but it's time to make it happen.  Not fasten
another kluge on top of a kluge.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

19 July 2009, 15:39:30

On Sun, Jul 19, 2009 at 12:39 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think this is a great feature, and it would be REALLY great if it
>> supported UPDATE and DELETE as well.
>
> It won't get applied until it does, and I imagine the patch author
> wasn't expecting any differently.  The submission was clearly marked
> "WIP" not "ready to apply".
>
>> However, it sounds to me like this is going to need more reworking
>> than is going to get done in the next week or two.
>
> Yeah.  I did a quick scan of the patch and was distressed at how much of
> it seemed to be addition of new code; that implies that he's more or
> less duplicated the top-level executor processing.
>
> The way that I think this should be approached is
> (1) a code-refactoring patch that moves INSERT/UPDATE/DELETE control
> into plan nodes; then
> (2) a feature patch that makes use of that to expose RETURNING in CTEs.
>
> One thing that's not totally clear to me is whether we'd like to use
> control plan nodes all the time, or only for statements with RETURNING.
> The nice thing about the latter approach is that there's a well-defined
> meaning for the plan node's targetlist.  (Its input is the stuff to be
> stored, its output is the RETURNING result; much cleaner than the way
> RETURNING is bolted onto the executor now.)  If we do it all the time
> then the control nodes would have dummy targetlists for non-RETURNING
> cases, which is a little bit ugly.  OTOH it may not be practical to
> handle it like that without a lot of code duplication.
>
> Another thing to think about is whether there should be three types
> of control nodes or only one.
>
> But anyway, my $0.02 is that the *first* order of business is to propose
> a suitable refactoring of execMain.c.  We skipped doing that when we
> first added RETURNING, but it's time to make it happen.  Not fasten
> another kluge on top of a kluge.

Tom,

This is a great review and great input.  I wish there were enough
hours in the day for you to do something like this for every patch.  I
feel good about saying that this is Returned With Feedback at this
point, and I see that Jaime Casanova has already made that update.

Thanks very much,

...Robert

Re: Using results from INSERT ... RETURNING

From

"Marko Tiikkaja"

Date:

31 July 2009, 12:58:03

On 7/19/2009, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> The way that I think this should be approached is
> (1) a code-refactoring patch that moves INSERT/UPDATE/DELETE control
> into plan nodes; then
> (2) a feature patch that makes use of that to expose RETURNING in CTEs.

I've been working on this and here's a patch I came up with. It's a
WIP that tries to
put together my thoughts about this. It works only for INSERT and DELETE
and
probably breaks with triggers.

In the attached patch, an Append node isn't added for inheritance sets.
Instead,
the UPDATE/DELETE node tries to take care of choosing the correct result
relation.
I tried to keep estate->es_result_relations as it is in order not to
break anything that
relies on it (for example ExecRelationIsTargetRelation) but
estate->es_result_relation_info
doesn't point to the target relation of the DML node any more. This was
replaced with
a pointer in DmlState to make it possible to add more DML nodes in the
future. Also
the result relation info initialization was moved to the node, because
InitResultRelInfo
needs to know the type of operation that is going to be performed on the
result relation.
Currently that info isn't available at the top level, so I went this
way. I'm not happy with it,
but couldn't come up with better ideas. Currently the result relation
for SELECT FOR
UPDATE/SHARE isn't initialized anywhere, so that won't work.

The patch doesn't do this, but the idea was that if the DML node has a
RETURNING
clause, the node returns the projected tuple and ExecutePlan sends it to
the DestReceiver.
In cases where there is no RETURNING clause, the node would return a
dummy tuple.

Comments are welcome.

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

"Marko Tiikkaja"

Date:

31 July 2009, 13:27:56

On 7/31/2009, "Marko Tiikkaja" <marko.tiikkaja@cs.helsinki.fi> wrote:
> ..

I seem to be having problems with my email client. The patch should be
attached this time. Sorry for the noise.

Regards,
Marko Tiikkaja

Attachment

patch3

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

29 August 2009, 20:20:03

Hi,

This WIP patch refactors the executor by creating nodes for DML (INSERT,
UPDATE, DELETE). It is designed both to clean up the executor and to
help with making it possible to use (INSERT|UPDATE|DELETE) ...RETURNING
inside a WITH clause. At first I thought about removing
PlannedStmt::returningLists, but there are a couple of places where it's
still used, and having it there won't hurt so I didn't touch it.
ExecInitDml() could still be better.

Does anyone see something seriously wrong with it? Ideas and further
improvements are welcome too.

Attached to the upcoming commitfest.

Regards,
Marko Tiikkaja

*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 705,710 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 705,727 ----
          case T_Hash:
              pname = sname = "Hash";
              break;
+         case T_Dml:
+             switch( ((Dml *) plan)->operation)
+             {
+                 case CMD_INSERT:
+                     pname = "INSERT";
+                     break;
+                 case CMD_UPDATE:
+                     pname = "UPDATE";
+                     break;
+                 case CMD_DELETE:
+                     pname = "DELETE";
+                     break;
+                 default:
+                     pname = "???";
+                     break;
+             }
+             break;
          default:
              pname = sname = "???";
              break;
***************
*** 1064,1069 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 1081,1091 ----
                                 ((AppendState *) planstate)->appendplans,
                                 outer_plan, es);
              break;
+         case T_Dml:
+             ExplainMemberNodes(((Dml *) plan)->plans,
+                                ((DmlState *) planstate)->dmlplans,
+                                outer_plan, es);
+             break;
          case T_BitmapAnd:
              ExplainMemberNodes(((BitmapAnd *) plan)->bitmapplans,
                                 ((BitmapAndState *) planstate)->bitmapplans,
*** a/src/backend/executor/Makefile
--- b/src/backend/executor/Makefile
***************
*** 15,21 **** include $(top_builddir)/src/Makefile.global
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
--- 15,21 ----
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o nodeDml.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 77,83 **** typedef struct evalPlanQual

  /* decls for local routines only used within this module */
  static void InitPlan(QueryDesc *queryDesc, int eflags);
- static void ExecCheckPlanOutput(Relation resultRel, List *targetList);
  static void ExecEndPlan(PlanState *planstate, EState *estate);
  static void ExecutePlan(EState *estate, PlanState *planstate,
              CmdType operation,
--- 77,82 ----
***************
*** 86,104 **** static void ExecutePlan(EState *estate, PlanState *planstate,
              DestReceiver *dest);
  static void ExecSelect(TupleTableSlot *slot,
             DestReceiver *dest, EState *estate);
- static void ExecInsert(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecUpdate(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest);
  static TupleTableSlot *EvalPlanQualNext(EState *estate);
  static void EndEvalPlanQual(EState *estate);
  static void ExecCheckRTPerms(List *rangeTable);
--- 85,90 ----
***************
*** 695,700 **** InitPlan(QueryDesc *queryDesc, int eflags)
--- 681,687 ----
                                estate->es_instrument);
              resultRelInfo++;
          }
+
          estate->es_result_relations = resultRelInfos;
          estate->es_num_result_relations = numResultRelations;
          /* Initialize to first or only result rel */
***************
*** 756,765 **** InitPlan(QueryDesc *queryDesc, int eflags)

      /*
       * Initialize the executor "tuple" table.  We need slots for all the plan
!      * nodes, plus possibly output slots for the junkfilter(s). At this point
!      * we aren't sure if we need junkfilters, so just add slots for them
!      * unconditionally.  Also, if it's not a SELECT, set up a slot for use for
!      * trigger output tuples.  Also, one for RETURNING-list evaluation.
       */
      {
          int            nSlots;
--- 743,749 ----

      /*
       * Initialize the executor "tuple" table.  We need slots for all the plan
!      * nodes, plus possibly a slot for use for trigger output tuples.
       */
      {
          int            nSlots;
***************
*** 773,787 **** InitPlan(QueryDesc *queryDesc, int eflags)

              nSlots += ExecCountSlotsNode(subplan);
          }
!         /* Add slots for junkfilter(s) */
!         if (plannedstmt->resultRelations != NIL)
!             nSlots += list_length(plannedstmt->resultRelations);
!         else
!             nSlots += 1;
!         if (operation != CMD_SELECT)
!             nSlots++;            /* for es_trig_tuple_slot */
!         if (plannedstmt->returningLists)
!             nSlots++;            /* for RETURNING projection */

          estate->es_tupleTable = ExecCreateTupleTable(nSlots);

--- 757,769 ----

              nSlots += ExecCountSlotsNode(subplan);
          }
!
!         /*
!          * In SELECT, we might need one for a junkfilter. In DML,
!          * the DML node takes care of reserving slots for
!          * junkfilters, but we need one for es_trig_tuple_slot.
!          */
!         nSlots++;

          estate->es_tupleTable = ExecCreateTupleTable(nSlots);

***************
*** 842,937 **** InitPlan(QueryDesc *queryDesc, int eflags)
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT and INSERT queries need a
!      * filter if there are any junk attrs in the tlist.  UPDATE and DELETE
!      * always need a filter, since there's always a junk 'ctid' attribute
!      * present --- no need to look first.
!      *
!      * This section of code is also a convenient place to verify that the
!      * output of an INSERT or UPDATE matches the target table(s).
       */
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         switch (operation)
          {
!             case CMD_SELECT:
!             case CMD_INSERT:
!                 foreach(tlist, plan->targetlist)
!                 {
!                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!                     if (tle->resjunk)
!                     {
!                         junk_filter_needed = true;
!                         break;
!                     }
!                 }
!                 break;
!             case CMD_UPDATE:
!             case CMD_DELETE:
                  junk_filter_needed = true;
                  break;
!             default:
!                 break;
          }

          if (junk_filter_needed)
          {
-             /*
-              * If there are multiple result relations, each one needs its own
-              * junk filter.  Note this is only possible for UPDATE/DELETE, so
-              * we can't be fooled by some needing a filter and some not.
-              */
              if (list_length(plannedstmt->resultRelations) > 1)
              {
-                 PlanState **appendplans;
-                 int            as_nplans;
-                 ResultRelInfo *resultRelInfo;
-
-                 /* Top plan had better be an Append here. */
-                 Assert(IsA(plan, Append));
-                 Assert(((Append *) plan)->isTarget);
-                 Assert(IsA(planstate, AppendState));
-                 appendplans = ((AppendState *) planstate)->appendplans;
-                 as_nplans = ((AppendState *) planstate)->as_nplans;
-                 Assert(as_nplans == estate->es_num_result_relations);
-                 resultRelInfo = estate->es_result_relations;
-                 for (i = 0; i < as_nplans; i++)
-                 {
-                     PlanState  *subplan = appendplans[i];
-                     JunkFilter *j;
-
-                     if (operation == CMD_UPDATE)
-                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
-                                             subplan->plan->targetlist);
-
-                     j = ExecInitJunkFilter(subplan->plan->targetlist,
-                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
-                                   ExecAllocTableSlot(estate->es_tupleTable));
-
-                     /*
-                      * Since it must be UPDATE/DELETE, there had better be a
-                      * "ctid" junk attribute in the tlist ... but ctid could
-                      * be at a different resno for each result relation. We
-                      * look up the ctid resnos now and save them in the
-                      * junkfilters.
-                      */
-                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
-                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
-                         elog(ERROR, "could not find junk ctid column");
-                     resultRelInfo->ri_junkFilter = j;
-                     resultRelInfo++;
-                 }
-
-                 /*
-                  * Set active junkfilter too; at this point ExecInitAppend has
-                  * already selected an active result relation...
-                  */
-                 estate->es_junkFilter =
-                     estate->es_result_relation_info->ri_junkFilter;
-
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
--- 824,852 ----
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT queries need a
!      * filter if there are any junk attrs in the tlist.
       */
+     if (operation == CMD_SELECT)
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         foreach(tlist, plan->targetlist)
          {
!             TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!             if (tle->resjunk)
!             {
                  junk_filter_needed = true;
                  break;
!             }
          }

          if (junk_filter_needed)
          {
              if (list_length(plannedstmt->resultRelations) > 1)
              {
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
***************
*** 944,956 **** InitPlan(QueryDesc *queryDesc, int eflags)
              }
              else
              {
-                 /* Normal case with just one JunkFilter */
                  JunkFilter *j;

-                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
-                     ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                         planstate->plan->targetlist);
-
                  j = ExecInitJunkFilter(planstate->plan->targetlist,
                                         tupType->tdhasoid,
                                    ExecAllocTableSlot(estate->es_tupleTable));
--- 859,866 ----
***************
*** 958,975 **** InitPlan(QueryDesc *queryDesc, int eflags)
                  if (estate->es_result_relation_info)
                      estate->es_result_relation_info->ri_junkFilter = j;

!                 if (operation == CMD_SELECT)
!                 {
!                     /* For SELECT, want to return the cleaned tuple type */
!                     tupType = j->jf_cleanTupType;
!                 }
!                 else if (operation == CMD_UPDATE || operation == CMD_DELETE)
!                 {
!                     /* For UPDATE/DELETE, find the ctid junk attr now */
!                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
!                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
!                         elog(ERROR, "could not find junk ctid column");
!                 }

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
--- 868,875 ----
                  if (estate->es_result_relation_info)
                      estate->es_result_relation_info->ri_junkFilter = j;

!                 /* For SELECT, want to return the cleaned tuple type */
!                 tupType = j->jf_cleanTupType;

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
***************
*** 999,1055 **** InitPlan(QueryDesc *queryDesc, int eflags)
          }
          else
          {
-             if (operation == CMD_INSERT)
-                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                     planstate->plan->targetlist);
-
              estate->es_junkFilter = NULL;
              if (estate->es_rowMarks)
                  elog(ERROR, "SELECT FOR UPDATE/SHARE, but no junk columns");
          }
      }

-     /*
-      * Initialize RETURNING projections if needed.
-      */
-     if (plannedstmt->returningLists)
-     {
-         TupleTableSlot *slot;
-         ExprContext *econtext;
-         ResultRelInfo *resultRelInfo;
-
-         /*
-          * We set QueryDesc.tupDesc to be the RETURNING rowtype in this case.
-          * We assume all the sublists will generate the same output tupdesc.
-          */
-         tupType = ExecTypeFromTL((List *) linitial(plannedstmt->returningLists),
-                                  false);
-
-         /* Set up a slot for the output of the RETURNING projection(s) */
-         slot = ExecAllocTableSlot(estate->es_tupleTable);
-         ExecSetSlotDescriptor(slot, tupType);
-         /* Need an econtext too */
-         econtext = CreateExprContext(estate);
-
-         /*
-          * Build a projection for each result rel.    Note that any SubPlans in
-          * the RETURNING lists get attached to the topmost plan node.
-          */
-         Assert(list_length(plannedstmt->returningLists) == estate->es_num_result_relations);
-         resultRelInfo = estate->es_result_relations;
-         foreach(l, plannedstmt->returningLists)
-         {
-             List       *rlist = (List *) lfirst(l);
-             List       *rliststate;
-
-             rliststate = (List *) ExecInitExpr((Expr *) rlist, planstate);
-             resultRelInfo->ri_projectReturning =
-                 ExecBuildProjectionInfo(rliststate, econtext, slot,
-                                      resultRelInfo->ri_RelationDesc->rd_att);
-             resultRelInfo++;
-         }
-     }
-
      queryDesc->tupDesc = tupType;
      queryDesc->planstate = planstate;

--- 899,910 ----
***************
*** 1151,1225 **** InitResultRelInfo(ResultRelInfo *resultRelInfo,
  }

  /*
-  * Verify that the tuples to be produced by INSERT or UPDATE match the
-  * target relation's rowtype
-  *
-  * We do this to guard against stale plans.  If plan invalidation is
-  * functioning properly then we should never get a failure here, but better
-  * safe than sorry.  Note that this is called after we have obtained lock
-  * on the target rel, so the rowtype can't change underneath us.
-  *
-  * The plan output is represented by its targetlist, because that makes
-  * handling the dropped-column case easier.
-  */
- static void
- ExecCheckPlanOutput(Relation resultRel, List *targetList)
- {
-     TupleDesc    resultDesc = RelationGetDescr(resultRel);
-     int            attno = 0;
-     ListCell   *lc;
-
-     foreach(lc, targetList)
-     {
-         TargetEntry *tle = (TargetEntry *) lfirst(lc);
-         Form_pg_attribute attr;
-
-         if (tle->resjunk)
-             continue;            /* ignore junk tlist items */
-
-         if (attno >= resultDesc->natts)
-             ereport(ERROR,
-                     (errcode(ERRCODE_DATATYPE_MISMATCH),
-                      errmsg("table row type and query-specified row type do not match"),
-                      errdetail("Query has too many columns.")));
-         attr = resultDesc->attrs[attno++];
-
-         if (!attr->attisdropped)
-         {
-             /* Normal case: demand type match */
-             if (exprType((Node *) tle->expr) != attr->atttypid)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
-                                    format_type_be(attr->atttypid),
-                                    attno,
-                              format_type_be(exprType((Node *) tle->expr)))));
-         }
-         else
-         {
-             /*
-              * For a dropped column, we can't check atttypid (it's likely 0).
-              * In any case the planner has most likely inserted an INT4 null.
-              * What we insist on is just *some* NULL constant.
-              */
-             if (!IsA(tle->expr, Const) ||
-                 !((Const *) tle->expr)->constisnull)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
-                                    attno)));
-         }
-     }
-     if (attno != resultDesc->natts)
-         ereport(ERROR,
-                 (errcode(ERRCODE_DATATYPE_MISMATCH),
-           errmsg("table row type and query-specified row type do not match"),
-                  errdetail("Query has too few columns.")));
- }
-
- /*
   *        ExecGetTriggerResultRel
   *
   * Get a ResultRelInfo for a trigger target relation.  Most of the time,
--- 1006,1011 ----
***************
*** 1449,1456 **** ExecutePlan(EState *estate,
      JunkFilter *junkfilter;
      TupleTableSlot *planSlot;
      TupleTableSlot *slot;
-     ItemPointer tupleid = NULL;
-     ItemPointerData tuple_ctid;
      long        current_tuple_count;

      /*
--- 1235,1240 ----
***************
*** 1521,1527 **** lnext:    ;
           *
           * But first, extract all the junk information we need.
           */
!         if ((junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
--- 1305,1311 ----
           *
           * But first, extract all the junk information we need.
           */
!         if (operation == CMD_SELECT && (junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
***************
*** 1630,1661 **** lnext:    ;
                  }
              }

!             /*
!              * extract the 'ctid' junk attribute.
!              */
!             if (operation == CMD_UPDATE || operation == CMD_DELETE)
!             {
!                 Datum        datum;
!                 bool        isNull;
!
!                 datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
!                                              &isNull);
!                 /* shouldn't ever get a null result... */
!                 if (isNull)
!                     elog(ERROR, "ctid is NULL");
!
!                 tupleid = (ItemPointer) DatumGetPointer(datum);
!                 tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
!                 tupleid = &tuple_ctid;
!             }
!
!             /*
!              * Create a new "clean" tuple with all junk attributes removed. We
!              * don't need to do this for DELETE, however (there will in fact
!              * be no non-junk attributes in a DELETE!)
!              */
!             if (operation != CMD_DELETE)
!                 slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
--- 1414,1420 ----
                  }
              }

!             slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
***************
*** 1670,1684 **** lnext:    ;
                  break;

              case CMD_INSERT:
-                 ExecInsert(slot, tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_DELETE:
-                 ExecDelete(tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_UPDATE:
!                 ExecUpdate(slot, tupleid, planSlot, dest, estate);
                  break;

              default:
--- 1429,1438 ----
                  break;

              case CMD_INSERT:
              case CMD_DELETE:
              case CMD_UPDATE:
!                 if (estate->es_plannedstmt->returningLists)
!                     (*dest->receiveSlot) (slot, dest);
                  break;

              default:
***************
*** 1734,2153 **** ExecSelect(TupleTableSlot *slot,
      (estate->es_processed)++;
  }

- /* ----------------------------------------------------------------
-  *        ExecInsert
-  *
-  *        INSERTs are trickier.. we have to insert the tuple into
-  *        the base relation and insert appropriate tuples into the
-  *        index relations.
-  * ----------------------------------------------------------------
-  */
- static void
- ExecInsert(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     Oid            newId;
-     List       *recheckIndexes = NIL;
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /*
-      * If the result relation has OIDs, force the tuple's OID to zero so that
-      * heap_insert will assign a fresh OID.  Usually the OID already will be
-      * zero at this point, but there are corner cases where the plan tree can
-      * return a tuple extracted literally from some table with the same
-      * rowtype.
-      *
-      * XXX if we ever wanted to allow users to assign their own OIDs to new
-      * rows, this'd be the place to do it.  For the moment, we make a point of
-      * doing this before calling triggers, so that a user-supplied trigger
-      * could hack the OID if desired.
-      */
-     if (resultRelationDesc->rd_rel->relhasoids)
-         HeapTupleSetOid(tuple, InvalidOid);
-
-     /* BEFORE ROW INSERT Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      */
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * insert the tuple
-      *
-      * Note: heap_insert returns the tid (location) of the new tuple in the
-      * t_self field.
-      */
-     newId = heap_insert(resultRelationDesc, tuple,
-                         estate->es_output_cid, 0, NULL);
-
-     IncrAppended();
-     (estate->es_processed)++;
-     estate->es_lastoid = newId;
-     setLastTid(&(tuple->t_self));
-
-     /*
-      * insert index entries for tuple
-      */
-     if (resultRelInfo->ri_NumIndices > 0)
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW INSERT Triggers */
-     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
- /* ----------------------------------------------------------------
-  *        ExecDelete
-  *
-  *        DELETE is like UPDATE, except that we delete the tuple and no
-  *        index modifications are needed
-  * ----------------------------------------------------------------
-  */
- static void
- ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW DELETE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
-     {
-         bool        dodelete;
-
-         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
-
-         if (!dodelete)            /* "do nothing" */
-             return;
-     }
-
-     /*
-      * delete the tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be deleted is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
- ldelete:;
-     result = heap_delete(resultRelationDesc, tupleid,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     goto ldelete;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_delete status: %u", result);
-             return;
-     }
-
-     IncrDeleted();
-     (estate->es_processed)++;
-
-     /*
-      * Note: Normally one would think that we have to delete index tuples
-      * associated with the heap tuple now...
-      *
-      * ... but in POSTGRES, we have no need to do this because VACUUM will
-      * take care of it later.  We can't delete index tuples immediately
-      * anyway, since the tuple is still visible to other transactions.
-      */
-
-     /* AFTER ROW DELETE Triggers */
-     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-     {
-         /*
-          * We have to put the target tuple into a slot, which means first we
-          * gotta fetch it.    We can use the trigger tuple slot.
-          */
-         TupleTableSlot *slot = estate->es_trig_tuple_slot;
-         HeapTupleData deltuple;
-         Buffer        delbuffer;
-
-         deltuple.t_self = *tupleid;
-         if (!heap_fetch(resultRelationDesc, SnapshotAny,
-                         &deltuple, &delbuffer, false, NULL))
-             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
-
-         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
-             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
-         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
-
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
-
-         ExecClearTuple(slot);
-         ReleaseBuffer(delbuffer);
-     }
- }
-
- /* ----------------------------------------------------------------
-  *        ExecUpdate
-  *
-  *        note: we can't run UPDATE queries with transactions
-  *        off because UPDATEs are actually INSERTs and our
-  *        scan will mistakenly loop forever, updating the tuple
-  *        it just inserted..    This should be fixed but until it
-  *        is, we don't want to get stuck in an infinite loop
-  *        which corrupts your database..
-  * ----------------------------------------------------------------
-  */
- static void
- ExecUpdate(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-     List *recheckIndexes = NIL;
-
-     /*
-      * abort the operation if not running transactions
-      */
-     if (IsBootstrapProcessingMode())
-         elog(ERROR, "cannot UPDATE during bootstrap");
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW UPDATE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
-                                         tupleid, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      *
-      * If we generate a new candidate tuple after EvalPlanQual testing, we
-      * must loop back here and recheck constraints.  (We don't need to redo
-      * triggers, however.  If there are any BEFORE triggers then trigger.c
-      * will have done heap_lock_tuple to lock the correct tuple, so there's no
-      * need to do them again.)
-      */
- lreplace:;
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * replace the heap tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be updated is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
-     result = heap_update(resultRelationDesc, tupleid, tuple,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     slot = ExecFilterJunk(estate->es_junkFilter, epqslot);
-                     tuple = ExecMaterializeSlot(slot);
-                     goto lreplace;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_update status: %u", result);
-             return;
-     }
-
-     IncrReplaced();
-     (estate->es_processed)++;
-
-     /*
-      * Note: instead of having to update the old index tuples associated with
-      * the heap tuple, all we do is form and insert new index tuples. This is
-      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
-      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
-      * here is insert new index tuples.  -cim 9/27/89
-      */
-
-     /*
-      * insert index entries for tuple
-      *
-      * Note: heap_update returns the tid (location) of the new tuple in the
-      * t_self field.
-      *
-      * If it's a HOT update, we mustn't insert new index entries.
-      */
-     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW UPDATE Triggers */
-     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
-                          recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
  /*
   * ExecRelCheck --- check that tuple meets constraints for result relation
   */
--- 1488,1493 ----
***************
*** 2248,2289 **** ExecConstraints(ResultRelInfo *resultRelInfo,
  }

  /*
-  * ExecProcessReturning --- evaluate a RETURNING list and send to dest
-  *
-  * projectReturning: RETURNING projection info for current result rel
-  * tupleSlot: slot holding tuple actually inserted/updated/deleted
-  * planSlot: slot holding tuple returned by top plan node
-  * dest: where to send the output
-  */
- static void
- ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest)
- {
-     ExprContext *econtext = projectReturning->pi_exprContext;
-     TupleTableSlot *retSlot;
-
-     /*
-      * Reset per-tuple memory context to free any expression evaluation
-      * storage allocated in the previous cycle.
-      */
-     ResetExprContext(econtext);
-
-     /* Make tuple and any needed join variables available to ExecProject */
-     econtext->ecxt_scantuple = tupleSlot;
-     econtext->ecxt_outertuple = planSlot;
-
-     /* Compute the RETURNING expressions */
-     retSlot = ExecProject(projectReturning, NULL);
-
-     /* Send to dest */
-     (*dest->receiveSlot) (retSlot, dest);
-
-     ExecClearTuple(retSlot);
- }
-
- /*
   * Check a modified tuple to see if we want to process its updated version
   * under READ COMMITTED rules.
   *
--- 1588,1593 ----
*** a/src/backend/executor/execProcnode.c
--- b/src/backend/executor/execProcnode.c
***************
*** 91,96 ****
--- 91,97 ----
  #include "executor/nodeHash.h"
  #include "executor/nodeHashjoin.h"
  #include "executor/nodeIndexscan.h"
+ #include "executor/nodeDml.h"
  #include "executor/nodeLimit.h"
  #include "executor/nodeMaterial.h"
  #include "executor/nodeMergejoin.h"
***************
*** 286,291 **** ExecInitNode(Plan *node, EState *estate, int eflags)
--- 287,297 ----
                                                   estate, eflags);
              break;

+         case T_Dml:
+             result = (PlanState *) ExecInitDml((Dml *) node,
+                                                  estate, eflags);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;        /* keep compiler quiet */
***************
*** 451,456 **** ExecProcNode(PlanState *node)
--- 457,466 ----
              result = ExecLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             result = ExecDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;
***************
*** 627,632 **** ExecCountSlotsNode(Plan *node)
--- 637,645 ----
          case T_Limit:
              return ExecCountSlotsLimit((Limit *) node);

+         case T_Dml:
+             return ExecCountSlotsDml((Dml *) node);
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
***************
*** 783,788 **** ExecEndNode(PlanState *node)
--- 796,805 ----
              ExecEndLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             ExecEndDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
*** a/src/backend/executor/nodeAppend.c
--- b/src/backend/executor/nodeAppend.c
***************
*** 103,123 **** exec_append_initialize_next(AppendState *appendstate)
      }
      else
      {
-         /*
-          * initialize the scan
-          *
-          * If we are controlling the target relation, select the proper active
-          * ResultRelInfo and junk filter for this target.
-          */
-         if (((Append *) appendstate->ps.plan)->isTarget)
-         {
-             Assert(whichplan < estate->es_num_result_relations);
-             estate->es_result_relation_info =
-                 estate->es_result_relations + whichplan;
-             estate->es_junkFilter =
-                 estate->es_result_relation_info->ri_junkFilter;
-         }
-
          return TRUE;
      }
  }
--- 103,108 ----
***************
*** 164,189 **** ExecInitAppend(Append *node, EState *estate, int eflags)
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!     /*
!      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
!      * XXX pretty dirty way of determining that this case applies ...
!      */
!     if (node->isTarget && estate->es_evTuple != NULL)
!     {
!         int            tplan;
!
!         tplan = estate->es_result_relation_info - estate->es_result_relations;
!         Assert(tplan >= 0 && tplan < nplans);
!
!         appendstate->as_firstplan = tplan;
!         appendstate->as_lastplan = tplan;
!     }
!     else
!     {
!         /* normal case, scan all subplans */
!         appendstate->as_firstplan = 0;
!         appendstate->as_lastplan = nplans - 1;
!     }

      /*
       * Miscellaneous initialization
--- 149,157 ----
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!
!     appendstate->as_firstplan = 0;
!     appendstate->as_lastplan = nplans - 1;

      /*
       * Miscellaneous initialization
*** /dev/null
--- b/src/backend/executor/nodeDml.c
***************
*** 0 ****
--- 1,829 ----
+ #include "postgres.h"
+
+ #include "access/xact.h"
+ #include "parser/parsetree.h"
+ #include "executor/executor.h"
+ #include "executor/execdebug.h"
+ #include "executor/nodeDml.h"
+ #include "commands/trigger.h"
+ #include "nodes/nodeFuncs.h"
+ #include "utils/memutils.h"
+ #include "utils/builtins.h"
+ #include "utils/tqual.h"
+ #include "storage/bufmgr.h"
+ #include "miscadmin.h"
+
+ /*
+  * Verify that the tuples to be produced by INSERT or UPDATE match the
+  * target relation's rowtype
+  *
+  * We do this to guard against stale plans.  If plan invalidation is
+  * functioning properly then we should never get a failure here, but better
+  * safe than sorry.  Note that this is called after we have obtained lock
+  * on the target rel, so the rowtype can't change underneath us.
+  *
+  * The plan output is represented by its targetlist, because that makes
+  * handling the dropped-column case easier.
+  */
+ static void
+ ExecCheckPlanOutput(Relation resultRel, List *targetList)
+ {
+     TupleDesc    resultDesc = RelationGetDescr(resultRel);
+     int            attno = 0;
+     ListCell   *lc;
+
+     foreach(lc, targetList)
+     {
+         TargetEntry *tle = (TargetEntry *) lfirst(lc);
+         Form_pg_attribute attr;
+
+         if (tle->resjunk)
+             continue;            /* ignore junk tlist items */
+
+         if (attno >= resultDesc->natts)
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("table row type and query-specified row type do not match"),
+                      errdetail("Query has too many columns.")));
+         attr = resultDesc->attrs[attno++];
+
+         if (!attr->attisdropped)
+         {
+             /* Normal case: demand type match */
+             if (exprType((Node *) tle->expr) != attr->atttypid)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
+                                    format_type_be(attr->atttypid),
+                                    attno,
+                              format_type_be(exprType((Node *) tle->expr)))));
+         }
+         else
+         {
+             /*
+              * For a dropped column, we can't check atttypid (it's likely 0).
+              * In any case the planner has most likely inserted an INT4 null.
+              * What we insist on is just *some* NULL constant.
+              */
+             if (!IsA(tle->expr, Const) ||
+                 !((Const *) tle->expr)->constisnull)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
+                                    attno)));
+         }
+     }
+     if (attno != resultDesc->natts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+           errmsg("table row type and query-specified row type do not match"),
+                  errdetail("Query has too few columns.")));
+ }
+
+ static TupleTableSlot*
+ ExecProcessReturning(ProjectionInfo *projectReturning,
+                      TupleTableSlot *tupleSlot,
+                      TupleTableSlot *planSlot)
+ {
+     ExprContext *econtext = projectReturning->pi_exprContext;
+     TupleTableSlot *retSlot;
+
+     /*
+      * Reset per-tuple memory context to free any expression evaluation
+      * storage allocated in the previous cycle.
+      */
+     ResetExprContext(econtext);
+
+     /* Make tuple and any needed join variables available to ExecProject */
+     econtext->ecxt_scantuple = tupleSlot;
+     econtext->ecxt_outertuple = planSlot;
+
+     /* Compute the RETURNING expressions */
+     retSlot = ExecProject(projectReturning, NULL);
+
+     return retSlot;
+ }
+
+ static TupleTableSlot *
+ ExecInsert(TupleTableSlot *slot,
+             ItemPointer tupleid,
+             TupleTableSlot *planSlot,
+             EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     Oid            newId;
+     List        *recheckIndexes = NIL;
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relations;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /*
+      * If the result relation has OIDs, force the tuple's OID to zero so that
+      * heap_insert will assign a fresh OID.  Usually the OID already will be
+      * zero at this point, but there are corner cases where the plan tree can
+      * return a tuple extracted literally from some table with the same
+      * rowtype.
+      *
+      * XXX if we ever wanted to allow users to assign their own OIDs to new
+      * rows, this'd be the place to do it.  For the moment, we make a point of
+      * doing this before calling triggers, so that a user-supplied trigger
+      * could hack the OID if desired.
+      */
+     if (resultRelationDesc->rd_rel->relhasoids)
+         HeapTupleSetOid(tuple, InvalidOid);
+
+     /* BEFORE ROW INSERT Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return NULL;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      */
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * insert the tuple
+      *
+      * Note: heap_insert returns the tid (location) of the new tuple in the
+      * t_self field.
+      */
+     newId = heap_insert(resultRelationDesc, tuple,
+                         estate->es_output_cid, 0, NULL);
+
+     IncrAppended();
+     (estate->es_processed)++;
+     estate->es_lastoid = newId;
+     setLastTid(&(tuple->t_self));
+
+     /*
+      * insert index entries for tuple
+      */
+     if (resultRelInfo->ri_NumIndices > 0)
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false);
+
+     /* AFTER ROW INSERT Triggers */
+     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+     return slot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecDelete
+  *
+  *        DELETE is like UPDATE, except that we delete the tuple and no
+  *        index modifications are needed
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecDelete(ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     ResultRelInfo* resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW DELETE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
+     {
+         bool        dodelete;
+
+         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
+
+         if (!dodelete)            /* "do nothing" */
+             return planSlot;
+     }
+
+     /*
+      * delete the tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be deleted is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+ ldelete:;
+     result = heap_delete(resultRelationDesc, tupleid,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     goto ldelete;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_delete status: %u", result);
+             return NULL;
+     }
+
+     IncrDeleted();
+     (estate->es_processed)++;
+
+     /*
+      * Note: Normally one would think that we have to delete index tuples
+      * associated with the heap tuple now...
+      *
+      * ... but in POSTGRES, we have no need to do this because VACUUM will
+      * take care of it later.  We can't delete index tuples immediately
+      * anyway, since the tuple is still visible to other transactions.
+      */
+
+     /* AFTER ROW DELETE Triggers */
+     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+     {
+         /*
+          * We have to put the target tuple into a slot, which means first we
+          * gotta fetch it.    We can use the trigger tuple slot.
+          */
+         TupleTableSlot *slot = estate->es_trig_tuple_slot;
+         HeapTupleData deltuple;
+         Buffer        delbuffer;
+
+         deltuple.t_self = *tupleid;
+         if (!heap_fetch(resultRelationDesc, SnapshotAny,
+                         &deltuple, &delbuffer, false, NULL))
+             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
+
+         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
+             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
+         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
+
+         planSlot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+         ExecClearTuple(slot);
+         ReleaseBuffer(delbuffer);
+     }
+
+     return planSlot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecUpdate
+  *
+  *        note: we can't run UPDATE queries with transactions
+  *        off because UPDATEs are actually INSERTs and our
+  *        scan will mistakenly loop forever, updating the tuple
+  *        it just inserted..    This should be fixed but until it
+  *        is, we don't want to get stuck in an infinite loop
+  *        which corrupts your database..
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecUpdate(TupleTableSlot *slot,
+            ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+     List *recheckIndexes = NIL;
+
+     /*
+      * abort the operation if not running transactions
+      */
+     if (IsBootstrapProcessingMode())
+         elog(ERROR, "cannot UPDATE during bootstrap");
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW UPDATE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
+                                         tupleid, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return planSlot;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      *
+      * If we generate a new candidate tuple after EvalPlanQual testing, we
+      * must loop back here and recheck constraints.  (We don't need to redo
+      * triggers, however.  If there are any BEFORE triggers then trigger.c
+      * will have done heap_lock_tuple to lock the correct tuple, so there's no
+      * need to do them again.)
+      */
+ lreplace:;
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * replace the heap tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be updated is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+     result = heap_update(resultRelationDesc, tupleid, tuple,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     slot = ExecFilterJunk(estate->es_result_relation_info->ri_junkFilter, epqslot);
+                     tuple = ExecMaterializeSlot(slot);
+                     goto lreplace;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_update status: %u", result);
+             return NULL;
+     }
+
+     IncrReplaced();
+     (estate->es_processed)++;
+
+     /*
+      * Note: instead of having to update the old index tuples associated with
+      * the heap tuple, all we do is form and insert new index tuples. This is
+      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
+      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
+      * here is insert new index tuples.  -cim 9/27/89
+      */
+
+     /*
+      * insert index entries for tuple
+      *
+      * Note: heap_update returns the tid (location) of the new tuple in the
+      * t_self field.
+      *
+      * If it's a HOT update, we mustn't insert new index entries.
+      */
+     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+                                                estate, false);
+
+     /* AFTER ROW UPDATE Triggers */
+     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
+                          recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                                     slot, planSlot);
+
+     return slot;
+ }
+
+ TupleTableSlot *
+ ExecDml(DmlState *node)
+ {
+     CmdType operation = node->operation;
+     EState *estate = node->ps.state;
+     JunkFilter *junkfilter;
+     TupleTableSlot *slot;
+     TupleTableSlot *planSlot;
+     ItemPointer tupleid = NULL;
+     ItemPointerData tuple_ctid;
+
+     for (;;)
+     {
+         planSlot = ExecProcNode(node->dmlplans[node->ds_whichplan]);
+         if (TupIsNull(planSlot))
+         {
+             node->ds_whichplan++;
+             if (node->ds_whichplan < node->ds_nplans)
+             {
+                 estate->es_result_relation_info++;
+                 continue;
+             }
+             else
+                 return NULL;
+         }
+         else
+             break;
+     }
+
+     slot = planSlot;
+
+     if ((junkfilter = estate->es_result_relation_info->ri_junkFilter) != NULL)
+     {
+         /*
+          * extract the 'ctid' junk attribute.
+          */
+         if (operation == CMD_UPDATE || operation == CMD_DELETE)
+         {
+             Datum        datum;
+             bool        isNull;
+
+             datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
+                                              &isNull);
+             /* shouldn't ever get a null result... */
+             if (isNull)
+                 elog(ERROR, "ctid is NULL");
+
+             tupleid = (ItemPointer) DatumGetPointer(datum);
+             tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
+             tupleid = &tuple_ctid;
+         }
+
+         if (operation != CMD_DELETE)
+             slot = ExecFilterJunk(junkfilter, slot);
+     }
+
+     switch (operation)
+     {
+         case CMD_INSERT:
+             return ExecInsert(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_UPDATE:
+             return ExecUpdate(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_DELETE:
+             return ExecDelete(tupleid, slot, estate);
+         default:
+             elog(ERROR, "unknown operation");
+             break;
+     }
+
+     return NULL;
+ }
+
+ DmlState *
+ ExecInitDml(Dml *node, EState *estate, int eflags)
+ {
+     DmlState *dmlstate;
+     ResultRelInfo *resultRelInfo;
+     Plan *subplan;
+     ListCell *l;
+     ListCell *relindex;
+     CmdType operation = node->operation;
+     int i;
+
+     TupleDesc tupDesc;
+
+     /*
+      * create state structure
+      */
+     dmlstate = makeNode(DmlState);
+     dmlstate->ps.plan = (Plan *) node;
+     dmlstate->ps.state = estate;
+     dmlstate->ps.targetlist = node->plan.targetlist;
+
+     dmlstate->ds_nplans = list_length(node->plans);
+     dmlstate->dmlplans = (PlanState **) palloc0(sizeof(PlanState *) * dmlstate->ds_nplans);
+     dmlstate->operation = node->operation;
+
+     estate->es_result_relation_info = estate->es_result_relations;
+     relindex = list_head(node->resultRelations);
+     i = 0;
+     foreach(l, node->plans)
+     {
+         subplan = lfirst(l);
+
+         dmlstate->dmlplans[i] = ExecInitNode(subplan, estate, eflags);
+
+         i++;
+         estate->es_result_relation_info++;
+         relindex = lnext(relindex);
+     }
+
+     estate->es_result_relation_info = estate->es_result_relations;
+
+     dmlstate->ds_whichplan = 0;
+
+     subplan = (Plan *) linitial(node->plans);
+
+     /* Initialize targetlist for RETURNING */
+     if (node->returningLists)
+     {
+         TupleTableSlot *slot;
+         ExprContext *econtext;
+
+         /*
+          * Initialize result tuple slot and assign
+          * type from the target list.
+          */
+         tupDesc = ExecTypeFromTL((List *) linitial(node->returningLists),
+                                  false);
+
+         /*
+          * Set up a slot for the output of the RETURNING projection(s).
+          */
+         slot = ExecAllocTableSlot(estate->es_tupleTable);
+         ExecSetSlotDescriptor(slot, tupDesc);
+
+         econtext = CreateExprContext(estate);
+
+         Assert(list_length(node->returningLists) == estate->es_num_result_relations);
+         resultRelInfo = estate->es_result_relations;
+         foreach(l, node->returningLists)
+         {
+             List       *rlist = (List *) lfirst(l);
+             List       *rliststate;
+
+             rliststate = (List *) ExecInitExpr((Expr *) rlist, &dmlstate->ps);
+             resultRelInfo->ri_projectReturning =
+                 ExecBuildProjectionInfo(rliststate, econtext, slot,
+                                         resultRelInfo->ri_RelationDesc->rd_att);
+             resultRelInfo++;
+         }
+
+         dmlstate->ps.targetlist = estate->es_result_relation_info->ri_projectReturning->pi_targetlist;
+         dmlstate->ps.ps_ResultTupleSlot = slot;
+         dmlstate->ps.ps_ExprContext = econtext;
+     }
+     else
+     {
+         ExecInitResultTupleSlot(estate, &dmlstate->ps);
+         tupDesc = ExecTypeFromTL(subplan->targetlist, false);
+         ExecAssignResultType(&dmlstate->ps, tupDesc);
+
+         dmlstate->ps.targetlist = subplan->targetlist;
+         dmlstate->ps.ps_ExprContext = NULL;
+     }
+
+     /*
+      * Initialize the junk filter if needed. INSERT queries need a filter
+      * if there are any junk attrs in the tlist.  UPDATE and DELETE
+      * always need a filter, since there's always a junk 'ctid' attribute
+      * present --- no need to look first.
+      *
+      * This section of code is also a convenient place to verify that the
+      * output of an INSERT or UPDATE matches the target table(s).
+      */
+     {
+         bool        junk_filter_needed = false;
+         ListCell   *tlist;
+
+         switch (operation)
+         {
+             case CMD_INSERT:
+                 foreach(tlist, subplan->targetlist)
+                 {
+                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);
+
+                     if (tle->resjunk)
+                     {
+                         junk_filter_needed = true;
+                         break;
+                     }
+                 }
+                 break;
+             case CMD_UPDATE:
+             case CMD_DELETE:
+                 junk_filter_needed = true;
+                 break;
+             default:
+                 break;
+         }
+
+         if (junk_filter_needed)
+         {
+             /*
+              * If there are multiple result relations, each one needs its own
+              * junk filter.  Note this is only possible for UPDATE/DELETE, so
+              * we can't be fooled by some needing a filter and some not.
+
+              */
+             if (dmlstate->ds_nplans > 1)
+             {
+                 resultRelInfo = estate->es_result_relations;
+                 for (i = 0; i < dmlstate->ds_nplans; i++)
+                 {
+                     PlanState *ps = dmlstate->dmlplans[i];
+                     JunkFilter *j;
+
+                     if (operation == CMD_UPDATE)
+                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                             ps->plan->targetlist);
+
+                     j = ExecInitJunkFilter(ps->plan->targetlist,
+                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                 ExecAllocTableSlot(estate->es_tupleTable));
+
+
+                     /*
+                      * Since it must be UPDATE/DELETE, there had better be a
+                      * "ctid" junk attribute in the tlist ... but ctid could
+                      * be at a different resno for each result relation. We
+                      * look up the ctid resnos now and save them in the
+                      * junkfilters.
+                      */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                     resultRelInfo->ri_junkFilter = j;
+                     resultRelInfo++;
+                 }
+             }
+             else
+             {
+                 JunkFilter *j;
+                 subplan = dmlstate->dmlplans[0]->plan;
+
+                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
+                     ExecCheckPlanOutput(estate->es_result_relations->ri_RelationDesc,
+                                         subplan->targetlist);
+
+                 j = ExecInitJunkFilter(subplan->targetlist,
estate->es_result_relation_info->ri_RelationDesc->rd_att->tdhasoid,
+                                        ExecAllocTableSlot(estate->es_tupleTable));
+
+                 if (operation == CMD_UPDATE || operation == CMD_DELETE)
+                 {
+                     /* FOR UPDATE/DELETE, find the ctid junk attr now */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                 }
+
+                 estate->es_result_relation_info->ri_junkFilter = j;
+             }
+         }
+         else
+         {
+             if (operation == CMD_INSERT)
+                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
+                         ((Plan *) linitial(((Dml *) dmlstate->ps.plan)->plans))->targetlist);
+         }
+     }
+
+     return dmlstate;
+ }
+
+ int
+ ExecCountSlotsDml(Dml *node)
+ {
+     ListCell* l;
+     int nslots = 5;
+
+     foreach(l, node->plans)
+         nslots += ExecCountSlotsNode((Plan *) lfirst(l));
+
+     return nslots;
+ }
+
+ void
+ ExecEndDml(DmlState *node)
+ {
+     int i;
+
+     /*
+      * Free the exprcontext
+      */
+     ExecFreeExprContext(&node->ps);
+
+     /*
+      * clean out the tuple table
+      */
+     ExecClearTuple(node->ps.ps_ResultTupleSlot);
+
+     /*
+      * shut down subplans
+      */
+     for (i=0;i<node->ds_nplans;++i)
+     {
+         ExecEndNode(node->dmlplans[i]);
+     }
+
+     pfree(node->dmlplans);
+ }
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 171,177 **** _copyAppend(Append *from)
       * copy remainder of node
       */
      COPY_NODE_FIELD(appendplans);
-     COPY_SCALAR_FIELD(isTarget);

      return newnode;
  }
--- 171,176 ----
***************
*** 1391,1396 **** _copyXmlExpr(XmlExpr *from)
--- 1390,1411 ----

      return newnode;
  }
+
+ static Dml *
+ _copyDml(Dml *from)
+ {
+     Dml    *newnode = makeNode(Dml);
+
+     CopyPlanFields((Plan *) from, (Plan *) newnode);
+
+     COPY_NODE_FIELD(plans);
+     COPY_SCALAR_FIELD(operation);
+     COPY_NODE_FIELD(resultRelations);
+     COPY_NODE_FIELD(returningLists);
+
+     return newnode;
+ }
+

  /*
   * _copyNullIfExpr (same as OpExpr)
***************
*** 4083,4088 **** copyObject(void *from)
--- 4098,4106 ----
          case T_XmlSerialize:
              retval = _copyXmlSerialize(from);
              break;
+         case T_Dml:
+             retval = _copyDml(from);
+             break;

          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 326,332 **** _outAppend(StringInfo str, Append *node)
      _outPlanInfo(str, (Plan *) node);

      WRITE_NODE_FIELD(appendplans);
-     WRITE_BOOL_FIELD(isTarget);
  }

  static void
--- 326,331 ----
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 574,580 **** create_append_plan(PlannerInfo *root, AppendPath *best_path)
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, false, tlist);

      return (Plan *) plan;
  }
--- 574,580 ----
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, tlist);

      return (Plan *) plan;
  }
***************
*** 2616,2622 **** make_worktablescan(List *qptlist,
  }

  Append *
! make_append(List *appendplans, bool isTarget, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
--- 2616,2622 ----
  }

  Append *
! make_append(List *appendplans, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
***************
*** 2652,2658 **** make_append(List *appendplans, bool isTarget, List *tlist)
      plan->lefttree = NULL;
      plan->righttree = NULL;
      node->appendplans = appendplans;
-     node->isTarget = isTarget;

      return node;
  }
--- 2652,2657 ----
***************
*** 3659,3664 **** make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
--- 3658,3689 ----
      return node;
  }

+ Dml *
+ make_dml(List *subplans, List *returningLists, List *resultRelations, CmdType operation)
+ {
+     Dml *node = makeNode(Dml);
+
+     Assert(list_length(subplans) == list_length(resultRelations));
+     Assert(!returningLists || list_length(returningLists) == list_length(resultRelations));
+
+     node->plan.lefttree = NULL;
+     node->plan.righttree = NULL;
+     node->plan.qual = NIL;
+
+     if (returningLists)
+         node->plan.targetlist = linitial(returningLists);
+     else
+         node->plan.targetlist = NIL;
+
+     node->plans = subplans;
+     node->resultRelations = resultRelations;
+     node->returningLists = returningLists;
+
+     node->operation = operation;
+
+     return node;
+ }
+
  /*
   * make_result
   *      Build a Result plan node
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 478,485 **** subquery_planner(PlannerGlobal *glob, Query *parse,
--- 478,494 ----
          rt_fetch(parse->resultRelation, parse->rtable)->inh)
          plan = inheritance_planner(root);
      else
+     {
          plan = grouping_planner(root, tuple_fraction);

+         if (parse->commandType != CMD_SELECT)
+             plan = (Plan *) make_dml(list_make1(plan),
+                                      root->returningLists,
+                                      root->resultRelations,
+                                      parse->commandType);
+     }
+
+
      /*
       * If any subplans were generated, or if we're inside a subplan, build
       * initPlan list and extParam/allParam sets for plan nodes, and attach the
***************
*** 624,632 **** preprocess_qual_conditions(PlannerInfo *root, Node *jtnode)
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Also, for both UPDATE
!  * and DELETE, the executor needs the Append plan node at the top, else it
!  * can't keep track of which table is the current target table.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
--- 633,639 ----
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
***************
*** 737,747 **** inheritance_planner(PlannerInfo *root)
       */
      parse->rtable = rtable;

!     /* Suppress Append if there's only one surviving child rel */
!     if (list_length(subplans) == 1)
!         return (Plan *) linitial(subplans);
!
!     return (Plan *) make_append(subplans, true, tlist);
  }

  /*--------------------
--- 744,753 ----
       */
      parse->rtable = rtable;

!     return (Plan *) make_dml(subplans,
!                              root->returningLists,
!                              root->resultRelations,
!                              parse->commandType);
  }

  /*--------------------
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 375,380 **** set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset)
--- 375,403 ----
              set_join_references(glob, (Join *) plan, rtoffset);
              break;

+         case T_Dml:
+             {
+                 /*
+                  * grouping_planner() already called set_returning_clause_references
+                  * so the targetList's references are already set.
+                  */
+                 Dml *splan = (Dml *) plan;
+
+                 foreach(l, splan->resultRelations)
+                 {
+                     lfirst_int(l) += rtoffset;
+                 }
+
+                 Assert(splan->plan.qual == NIL);
+                 foreach(l, splan->plans)
+                 {
+                     lfirst(l) = set_plan_refs(glob,
+                                               (Plan *) lfirst(l),
+                                               rtoffset);
+                 }
+             }
+             break;
+
          case T_Hash:
          case T_Material:
          case T_Sort:
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 2034,2039 **** finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params)
--- 2034,2040 ----
          case T_Unique:
          case T_SetOp:
          case T_Group:
+         case T_Dml:
              break;

          default:
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
***************
*** 448,454 **** generate_union_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
--- 448,454 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
***************
*** 539,545 **** generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
--- 539,545 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
*** /dev/null
--- b/src/include/executor/nodeDml.h
***************
*** 0 ****
--- 1,11 ----
+ #ifndef NODEDML_H
+ #define NODEDML_H
+
+ #include "nodes/execnodes.h"
+
+ extern int ExecCountSlotsDml(Dml *node);
+ extern DmlState *ExecInitDml(Dml *node, EState *estate, int eflags);
+ extern TupleTableSlot *ExecDml(DmlState *node);
+ extern void ExecEndDml(DmlState *node);
+
+ #endif
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 976,981 **** typedef struct ResultState
--- 976,996 ----
  } ResultState;

  /* ----------------
+  *     DmlState information
+  * ----------------
+  */
+ typedef struct DmlState
+ {
+     PlanState        ps;                /* its first field is NodeTag */
+     PlanState      **dmlplans;
+     int                ds_nplans;
+     int                ds_whichplan;
+
+     CmdType            operation;
+ } DmlState;
+
+
+ /* ----------------
   *     AppendState information
   *
   *        nplans            how many plans are in the list
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 71,76 **** typedef enum NodeTag
--- 71,77 ----
      T_Hash,
      T_SetOp,
      T_Limit,
+     T_Dml,
      /* this one isn't a subclass of Plan: */
      T_PlanInvalItem,

***************
*** 190,195 **** typedef enum NodeTag
--- 191,197 ----
      T_NullTestState,
      T_CoerceToDomainState,
      T_DomainConstraintState,
+     T_DmlState,

      /*
       * TAGS FOR PLANNER NODES (relation.h)
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 164,185 **** typedef struct Result
      Node       *resconstantqual;
  } Result;

  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
-  *
-  * Append nodes are sometimes used to switch between several result relations
-  * (when the target of an UPDATE or DELETE is an inheritance set).    Such a
-  * node will have isTarget true.  The Append executor is then responsible
-  * for updating the executor state to point at the correct target relation
-  * whenever it switches subplans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
-     bool        isTarget;
  } Append;

  /* ----------------
--- 164,189 ----
      Node       *resconstantqual;
  } Result;

+ typedef struct Dml
+ {
+     Plan        plan;
+
+     CmdType        operation;
+     List       *plans;
+     List       *resultRelations;
+     List       *returningLists;
+ } Dml;
+
+
  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
  } Append;

  /* ----------------
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 41,47 **** extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, bool isTarget, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
--- 41,47 ----
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
***************
*** 69,74 **** extern Plan *materialize_finished_plan(Plan *subplan);
--- 69,76 ----
  extern Unique *make_unique(Plan *lefttree, List *distinctList);
  extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
             int64 offset_est, int64 count_est);
+ extern Dml *make_dml(List *subplans, List *returningLists, List *resultRelation,
+             CmdType operation);
  extern SetOp *make_setop(SetOpCmd cmd, SetOpStrategy strategy, Plan *lefttree,
             List *distinctList, AttrNumber flagColIdx, int firstFlag,
             long numGroups, double outputRows);

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

06 September 2009, 07:11:25

Hi,

Fixed a couple of bugs and renovated ExecInitDml() a bit.  Patch attached.

Regards,
Marko Tiikkaja

*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 705,710 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 705,727 ----
          case T_Hash:
              pname = sname = "Hash";
              break;
+         case T_Dml:
+             switch( ((Dml *) plan)->operation)
+             {
+                 case CMD_INSERT:
+                     pname = "INSERT";
+                     break;
+                 case CMD_UPDATE:
+                     pname = "UPDATE";
+                     break;
+                 case CMD_DELETE:
+                     pname = "DELETE";
+                     break;
+                 default:
+                     pname = "???";
+                     break;
+             }
+             break;
          default:
              pname = sname = "???";
              break;
***************
*** 1064,1069 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 1081,1091 ----
                                 ((AppendState *) planstate)->appendplans,
                                 outer_plan, es);
              break;
+         case T_Dml:
+             ExplainMemberNodes(((Dml *) plan)->plans,
+                                ((DmlState *) planstate)->dmlplans,
+                                outer_plan, es);
+             break;
          case T_BitmapAnd:
              ExplainMemberNodes(((BitmapAnd *) plan)->bitmapplans,
                                 ((BitmapAndState *) planstate)->bitmapplans,
*** a/src/backend/executor/Makefile
--- b/src/backend/executor/Makefile
***************
*** 15,21 **** include $(top_builddir)/src/Makefile.global
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
--- 15,21 ----
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o nodeDml.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 77,83 **** typedef struct evalPlanQual

  /* decls for local routines only used within this module */
  static void InitPlan(QueryDesc *queryDesc, int eflags);
- static void ExecCheckPlanOutput(Relation resultRel, List *targetList);
  static void ExecEndPlan(PlanState *planstate, EState *estate);
  static void ExecutePlan(EState *estate, PlanState *planstate,
              CmdType operation,
--- 77,82 ----
***************
*** 86,104 **** static void ExecutePlan(EState *estate, PlanState *planstate,
              DestReceiver *dest);
  static void ExecSelect(TupleTableSlot *slot,
             DestReceiver *dest, EState *estate);
- static void ExecInsert(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecUpdate(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest);
  static TupleTableSlot *EvalPlanQualNext(EState *estate);
  static void EndEvalPlanQual(EState *estate);
  static void ExecCheckRTPerms(List *rangeTable);
--- 85,90 ----
***************
*** 695,700 **** InitPlan(QueryDesc *queryDesc, int eflags)
--- 681,687 ----
                                estate->es_instrument);
              resultRelInfo++;
          }
+
          estate->es_result_relations = resultRelInfos;
          estate->es_num_result_relations = numResultRelations;
          /* Initialize to first or only result rel */
***************
*** 758,765 **** InitPlan(QueryDesc *queryDesc, int eflags)
       * Initialize the executor "tuple" table.  We need slots for all the plan
       * nodes, plus possibly output slots for the junkfilter(s). At this point
       * we aren't sure if we need junkfilters, so just add slots for them
!      * unconditionally.  Also, if it's not a SELECT, set up a slot for use for
!      * trigger output tuples.  Also, one for RETURNING-list evaluation.
       */
      {
          int            nSlots;
--- 745,751 ----
       * Initialize the executor "tuple" table.  We need slots for all the plan
       * nodes, plus possibly output slots for the junkfilter(s). At this point
       * we aren't sure if we need junkfilters, so just add slots for them
!      * unconditionally.
       */
      {
          int            nSlots;
***************
*** 779,787 **** InitPlan(QueryDesc *queryDesc, int eflags)
          else
              nSlots += 1;
          if (operation != CMD_SELECT)
!             nSlots++;            /* for es_trig_tuple_slot */
!         if (plannedstmt->returningLists)
!             nSlots++;            /* for RETURNING projection */

          estate->es_tupleTable = ExecCreateTupleTable(nSlots);

--- 765,771 ----
          else
              nSlots += 1;
          if (operation != CMD_SELECT)
!             nSlots++;        /* for es_trig_tuple_slot */

          estate->es_tupleTable = ExecCreateTupleTable(nSlots);

***************
*** 842,937 **** InitPlan(QueryDesc *queryDesc, int eflags)
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT and INSERT queries need a
!      * filter if there are any junk attrs in the tlist.  UPDATE and DELETE
!      * always need a filter, since there's always a junk 'ctid' attribute
!      * present --- no need to look first.
!      *
!      * This section of code is also a convenient place to verify that the
!      * output of an INSERT or UPDATE matches the target table(s).
       */
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         switch (operation)
          {
!             case CMD_SELECT:
!             case CMD_INSERT:
!                 foreach(tlist, plan->targetlist)
!                 {
!                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!                     if (tle->resjunk)
!                     {
!                         junk_filter_needed = true;
!                         break;
!                     }
!                 }
!                 break;
!             case CMD_UPDATE:
!             case CMD_DELETE:
                  junk_filter_needed = true;
                  break;
!             default:
!                 break;
          }

          if (junk_filter_needed)
          {
-             /*
-              * If there are multiple result relations, each one needs its own
-              * junk filter.  Note this is only possible for UPDATE/DELETE, so
-              * we can't be fooled by some needing a filter and some not.
-              */
              if (list_length(plannedstmt->resultRelations) > 1)
              {
-                 PlanState **appendplans;
-                 int            as_nplans;
-                 ResultRelInfo *resultRelInfo;
-
-                 /* Top plan had better be an Append here. */
-                 Assert(IsA(plan, Append));
-                 Assert(((Append *) plan)->isTarget);
-                 Assert(IsA(planstate, AppendState));
-                 appendplans = ((AppendState *) planstate)->appendplans;
-                 as_nplans = ((AppendState *) planstate)->as_nplans;
-                 Assert(as_nplans == estate->es_num_result_relations);
-                 resultRelInfo = estate->es_result_relations;
-                 for (i = 0; i < as_nplans; i++)
-                 {
-                     PlanState  *subplan = appendplans[i];
-                     JunkFilter *j;
-
-                     if (operation == CMD_UPDATE)
-                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
-                                             subplan->plan->targetlist);
-
-                     j = ExecInitJunkFilter(subplan->plan->targetlist,
-                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
-                                   ExecAllocTableSlot(estate->es_tupleTable));
-
-                     /*
-                      * Since it must be UPDATE/DELETE, there had better be a
-                      * "ctid" junk attribute in the tlist ... but ctid could
-                      * be at a different resno for each result relation. We
-                      * look up the ctid resnos now and save them in the
-                      * junkfilters.
-                      */
-                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
-                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
-                         elog(ERROR, "could not find junk ctid column");
-                     resultRelInfo->ri_junkFilter = j;
-                     resultRelInfo++;
-                 }
-
-                 /*
-                  * Set active junkfilter too; at this point ExecInitAppend has
-                  * already selected an active result relation...
-                  */
-                 estate->es_junkFilter =
-                     estate->es_result_relation_info->ri_junkFilter;
-
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
--- 826,854 ----
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT queries need a
!      * filter if there are any junk attrs in the tlist.
       */
+     if (operation == CMD_SELECT)
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         foreach(tlist, plan->targetlist)
          {
!             TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!             if (tle->resjunk)
!             {
                  junk_filter_needed = true;
                  break;
!             }
          }

          if (junk_filter_needed)
          {
              if (list_length(plannedstmt->resultRelations) > 1)
              {
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
***************
*** 944,956 **** InitPlan(QueryDesc *queryDesc, int eflags)
              }
              else
              {
-                 /* Normal case with just one JunkFilter */
                  JunkFilter *j;

-                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
-                     ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                         planstate->plan->targetlist);
-
                  j = ExecInitJunkFilter(planstate->plan->targetlist,
                                         tupType->tdhasoid,
                                    ExecAllocTableSlot(estate->es_tupleTable));
--- 861,868 ----
***************
*** 958,975 **** InitPlan(QueryDesc *queryDesc, int eflags)
                  if (estate->es_result_relation_info)
                      estate->es_result_relation_info->ri_junkFilter = j;

!                 if (operation == CMD_SELECT)
!                 {
!                     /* For SELECT, want to return the cleaned tuple type */
!                     tupType = j->jf_cleanTupType;
!                 }
!                 else if (operation == CMD_UPDATE || operation == CMD_DELETE)
!                 {
!                     /* For UPDATE/DELETE, find the ctid junk attr now */
!                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
!                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
!                         elog(ERROR, "could not find junk ctid column");
!                 }

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
--- 870,877 ----
                  if (estate->es_result_relation_info)
                      estate->es_result_relation_info->ri_junkFilter = j;

!                 /* For SELECT, want to return the cleaned tuple type */
!                 tupType = j->jf_cleanTupType;

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
***************
*** 999,1055 **** InitPlan(QueryDesc *queryDesc, int eflags)
          }
          else
          {
-             if (operation == CMD_INSERT)
-                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                     planstate->plan->targetlist);
-
              estate->es_junkFilter = NULL;
              if (estate->es_rowMarks)
                  elog(ERROR, "SELECT FOR UPDATE/SHARE, but no junk columns");
          }
      }

-     /*
-      * Initialize RETURNING projections if needed.
-      */
-     if (plannedstmt->returningLists)
-     {
-         TupleTableSlot *slot;
-         ExprContext *econtext;
-         ResultRelInfo *resultRelInfo;
-
-         /*
-          * We set QueryDesc.tupDesc to be the RETURNING rowtype in this case.
-          * We assume all the sublists will generate the same output tupdesc.
-          */
-         tupType = ExecTypeFromTL((List *) linitial(plannedstmt->returningLists),
-                                  false);
-
-         /* Set up a slot for the output of the RETURNING projection(s) */
-         slot = ExecAllocTableSlot(estate->es_tupleTable);
-         ExecSetSlotDescriptor(slot, tupType);
-         /* Need an econtext too */
-         econtext = CreateExprContext(estate);
-
-         /*
-          * Build a projection for each result rel.    Note that any SubPlans in
-          * the RETURNING lists get attached to the topmost plan node.
-          */
-         Assert(list_length(plannedstmt->returningLists) == estate->es_num_result_relations);
-         resultRelInfo = estate->es_result_relations;
-         foreach(l, plannedstmt->returningLists)
-         {
-             List       *rlist = (List *) lfirst(l);
-             List       *rliststate;
-
-             rliststate = (List *) ExecInitExpr((Expr *) rlist, planstate);
-             resultRelInfo->ri_projectReturning =
-                 ExecBuildProjectionInfo(rliststate, econtext, slot,
-                                      resultRelInfo->ri_RelationDesc->rd_att);
-             resultRelInfo++;
-         }
-     }
-
      queryDesc->tupDesc = tupType;
      queryDesc->planstate = planstate;

--- 901,912 ----
***************
*** 1151,1225 **** InitResultRelInfo(ResultRelInfo *resultRelInfo,
  }

  /*
-  * Verify that the tuples to be produced by INSERT or UPDATE match the
-  * target relation's rowtype
-  *
-  * We do this to guard against stale plans.  If plan invalidation is
-  * functioning properly then we should never get a failure here, but better
-  * safe than sorry.  Note that this is called after we have obtained lock
-  * on the target rel, so the rowtype can't change underneath us.
-  *
-  * The plan output is represented by its targetlist, because that makes
-  * handling the dropped-column case easier.
-  */
- static void
- ExecCheckPlanOutput(Relation resultRel, List *targetList)
- {
-     TupleDesc    resultDesc = RelationGetDescr(resultRel);
-     int            attno = 0;
-     ListCell   *lc;
-
-     foreach(lc, targetList)
-     {
-         TargetEntry *tle = (TargetEntry *) lfirst(lc);
-         Form_pg_attribute attr;
-
-         if (tle->resjunk)
-             continue;            /* ignore junk tlist items */
-
-         if (attno >= resultDesc->natts)
-             ereport(ERROR,
-                     (errcode(ERRCODE_DATATYPE_MISMATCH),
-                      errmsg("table row type and query-specified row type do not match"),
-                      errdetail("Query has too many columns.")));
-         attr = resultDesc->attrs[attno++];
-
-         if (!attr->attisdropped)
-         {
-             /* Normal case: demand type match */
-             if (exprType((Node *) tle->expr) != attr->atttypid)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
-                                    format_type_be(attr->atttypid),
-                                    attno,
-                              format_type_be(exprType((Node *) tle->expr)))));
-         }
-         else
-         {
-             /*
-              * For a dropped column, we can't check atttypid (it's likely 0).
-              * In any case the planner has most likely inserted an INT4 null.
-              * What we insist on is just *some* NULL constant.
-              */
-             if (!IsA(tle->expr, Const) ||
-                 !((Const *) tle->expr)->constisnull)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
-                                    attno)));
-         }
-     }
-     if (attno != resultDesc->natts)
-         ereport(ERROR,
-                 (errcode(ERRCODE_DATATYPE_MISMATCH),
-           errmsg("table row type and query-specified row type do not match"),
-                  errdetail("Query has too few columns.")));
- }
-
- /*
   *        ExecGetTriggerResultRel
   *
   * Get a ResultRelInfo for a trigger target relation.  Most of the time,
--- 1008,1013 ----
***************
*** 1449,1456 **** ExecutePlan(EState *estate,
      JunkFilter *junkfilter;
      TupleTableSlot *planSlot;
      TupleTableSlot *slot;
-     ItemPointer tupleid = NULL;
-     ItemPointerData tuple_ctid;
      long        current_tuple_count;

      /*
--- 1237,1242 ----
***************
*** 1521,1527 **** lnext:    ;
           *
           * But first, extract all the junk information we need.
           */
!         if ((junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
--- 1307,1313 ----
           *
           * But first, extract all the junk information we need.
           */
!         if (operation == CMD_SELECT && (junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
***************
*** 1630,1661 **** lnext:    ;
                  }
              }

!             /*
!              * extract the 'ctid' junk attribute.
!              */
!             if (operation == CMD_UPDATE || operation == CMD_DELETE)
!             {
!                 Datum        datum;
!                 bool        isNull;
!
!                 datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
!                                              &isNull);
!                 /* shouldn't ever get a null result... */
!                 if (isNull)
!                     elog(ERROR, "ctid is NULL");
!
!                 tupleid = (ItemPointer) DatumGetPointer(datum);
!                 tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
!                 tupleid = &tuple_ctid;
!             }
!
!             /*
!              * Create a new "clean" tuple with all junk attributes removed. We
!              * don't need to do this for DELETE, however (there will in fact
!              * be no non-junk attributes in a DELETE!)
!              */
!             if (operation != CMD_DELETE)
!                 slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
--- 1416,1422 ----
                  }
              }

!             slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
***************
*** 1670,1684 **** lnext:    ;
                  break;

              case CMD_INSERT:
-                 ExecInsert(slot, tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_DELETE:
-                 ExecDelete(tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_UPDATE:
!                 ExecUpdate(slot, tupleid, planSlot, dest, estate);
                  break;

              default:
--- 1431,1440 ----
                  break;

              case CMD_INSERT:
              case CMD_DELETE:
              case CMD_UPDATE:
!                 if (estate->es_plannedstmt->returningLists)
!                     (*dest->receiveSlot) (slot, dest);
                  break;

              default:
***************
*** 1734,2153 **** ExecSelect(TupleTableSlot *slot,
      (estate->es_processed)++;
  }

- /* ----------------------------------------------------------------
-  *        ExecInsert
-  *
-  *        INSERTs are trickier.. we have to insert the tuple into
-  *        the base relation and insert appropriate tuples into the
-  *        index relations.
-  * ----------------------------------------------------------------
-  */
- static void
- ExecInsert(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     Oid            newId;
-     List       *recheckIndexes = NIL;
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /*
-      * If the result relation has OIDs, force the tuple's OID to zero so that
-      * heap_insert will assign a fresh OID.  Usually the OID already will be
-      * zero at this point, but there are corner cases where the plan tree can
-      * return a tuple extracted literally from some table with the same
-      * rowtype.
-      *
-      * XXX if we ever wanted to allow users to assign their own OIDs to new
-      * rows, this'd be the place to do it.  For the moment, we make a point of
-      * doing this before calling triggers, so that a user-supplied trigger
-      * could hack the OID if desired.
-      */
-     if (resultRelationDesc->rd_rel->relhasoids)
-         HeapTupleSetOid(tuple, InvalidOid);
-
-     /* BEFORE ROW INSERT Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      */
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * insert the tuple
-      *
-      * Note: heap_insert returns the tid (location) of the new tuple in the
-      * t_self field.
-      */
-     newId = heap_insert(resultRelationDesc, tuple,
-                         estate->es_output_cid, 0, NULL);
-
-     IncrAppended();
-     (estate->es_processed)++;
-     estate->es_lastoid = newId;
-     setLastTid(&(tuple->t_self));
-
-     /*
-      * insert index entries for tuple
-      */
-     if (resultRelInfo->ri_NumIndices > 0)
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW INSERT Triggers */
-     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
- /* ----------------------------------------------------------------
-  *        ExecDelete
-  *
-  *        DELETE is like UPDATE, except that we delete the tuple and no
-  *        index modifications are needed
-  * ----------------------------------------------------------------
-  */
- static void
- ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW DELETE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
-     {
-         bool        dodelete;
-
-         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
-
-         if (!dodelete)            /* "do nothing" */
-             return;
-     }
-
-     /*
-      * delete the tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be deleted is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
- ldelete:;
-     result = heap_delete(resultRelationDesc, tupleid,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     goto ldelete;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_delete status: %u", result);
-             return;
-     }
-
-     IncrDeleted();
-     (estate->es_processed)++;
-
-     /*
-      * Note: Normally one would think that we have to delete index tuples
-      * associated with the heap tuple now...
-      *
-      * ... but in POSTGRES, we have no need to do this because VACUUM will
-      * take care of it later.  We can't delete index tuples immediately
-      * anyway, since the tuple is still visible to other transactions.
-      */
-
-     /* AFTER ROW DELETE Triggers */
-     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-     {
-         /*
-          * We have to put the target tuple into a slot, which means first we
-          * gotta fetch it.    We can use the trigger tuple slot.
-          */
-         TupleTableSlot *slot = estate->es_trig_tuple_slot;
-         HeapTupleData deltuple;
-         Buffer        delbuffer;
-
-         deltuple.t_self = *tupleid;
-         if (!heap_fetch(resultRelationDesc, SnapshotAny,
-                         &deltuple, &delbuffer, false, NULL))
-             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
-
-         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
-             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
-         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
-
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
-
-         ExecClearTuple(slot);
-         ReleaseBuffer(delbuffer);
-     }
- }
-
- /* ----------------------------------------------------------------
-  *        ExecUpdate
-  *
-  *        note: we can't run UPDATE queries with transactions
-  *        off because UPDATEs are actually INSERTs and our
-  *        scan will mistakenly loop forever, updating the tuple
-  *        it just inserted..    This should be fixed but until it
-  *        is, we don't want to get stuck in an infinite loop
-  *        which corrupts your database..
-  * ----------------------------------------------------------------
-  */
- static void
- ExecUpdate(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-     List *recheckIndexes = NIL;
-
-     /*
-      * abort the operation if not running transactions
-      */
-     if (IsBootstrapProcessingMode())
-         elog(ERROR, "cannot UPDATE during bootstrap");
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW UPDATE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
-                                         tupleid, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      *
-      * If we generate a new candidate tuple after EvalPlanQual testing, we
-      * must loop back here and recheck constraints.  (We don't need to redo
-      * triggers, however.  If there are any BEFORE triggers then trigger.c
-      * will have done heap_lock_tuple to lock the correct tuple, so there's no
-      * need to do them again.)
-      */
- lreplace:;
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * replace the heap tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be updated is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
-     result = heap_update(resultRelationDesc, tupleid, tuple,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     slot = ExecFilterJunk(estate->es_junkFilter, epqslot);
-                     tuple = ExecMaterializeSlot(slot);
-                     goto lreplace;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_update status: %u", result);
-             return;
-     }
-
-     IncrReplaced();
-     (estate->es_processed)++;
-
-     /*
-      * Note: instead of having to update the old index tuples associated with
-      * the heap tuple, all we do is form and insert new index tuples. This is
-      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
-      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
-      * here is insert new index tuples.  -cim 9/27/89
-      */
-
-     /*
-      * insert index entries for tuple
-      *
-      * Note: heap_update returns the tid (location) of the new tuple in the
-      * t_self field.
-      *
-      * If it's a HOT update, we mustn't insert new index entries.
-      */
-     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW UPDATE Triggers */
-     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
-                          recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
  /*
   * ExecRelCheck --- check that tuple meets constraints for result relation
   */
--- 1490,1495 ----
***************
*** 2248,2289 **** ExecConstraints(ResultRelInfo *resultRelInfo,
  }

  /*
-  * ExecProcessReturning --- evaluate a RETURNING list and send to dest
-  *
-  * projectReturning: RETURNING projection info for current result rel
-  * tupleSlot: slot holding tuple actually inserted/updated/deleted
-  * planSlot: slot holding tuple returned by top plan node
-  * dest: where to send the output
-  */
- static void
- ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest)
- {
-     ExprContext *econtext = projectReturning->pi_exprContext;
-     TupleTableSlot *retSlot;
-
-     /*
-      * Reset per-tuple memory context to free any expression evaluation
-      * storage allocated in the previous cycle.
-      */
-     ResetExprContext(econtext);
-
-     /* Make tuple and any needed join variables available to ExecProject */
-     econtext->ecxt_scantuple = tupleSlot;
-     econtext->ecxt_outertuple = planSlot;
-
-     /* Compute the RETURNING expressions */
-     retSlot = ExecProject(projectReturning, NULL);
-
-     /* Send to dest */
-     (*dest->receiveSlot) (retSlot, dest);
-
-     ExecClearTuple(retSlot);
- }
-
- /*
   * Check a modified tuple to see if we want to process its updated version
   * under READ COMMITTED rules.
   *
--- 1590,1595 ----
*** a/src/backend/executor/execProcnode.c
--- b/src/backend/executor/execProcnode.c
***************
*** 91,96 ****
--- 91,97 ----
  #include "executor/nodeHash.h"
  #include "executor/nodeHashjoin.h"
  #include "executor/nodeIndexscan.h"
+ #include "executor/nodeDml.h"
  #include "executor/nodeLimit.h"
  #include "executor/nodeMaterial.h"
  #include "executor/nodeMergejoin.h"
***************
*** 286,291 **** ExecInitNode(Plan *node, EState *estate, int eflags)
--- 287,297 ----
                                                   estate, eflags);
              break;

+         case T_Dml:
+             result = (PlanState *) ExecInitDml((Dml *) node,
+                                                  estate, eflags);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;        /* keep compiler quiet */
***************
*** 451,456 **** ExecProcNode(PlanState *node)
--- 457,466 ----
              result = ExecLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             result = ExecDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;
***************
*** 627,632 **** ExecCountSlotsNode(Plan *node)
--- 637,645 ----
          case T_Limit:
              return ExecCountSlotsLimit((Limit *) node);

+         case T_Dml:
+             return ExecCountSlotsDml((Dml *) node);
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
***************
*** 783,788 **** ExecEndNode(PlanState *node)
--- 796,805 ----
              ExecEndLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             ExecEndDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
*** a/src/backend/executor/nodeAppend.c
--- b/src/backend/executor/nodeAppend.c
***************
*** 103,123 **** exec_append_initialize_next(AppendState *appendstate)
      }
      else
      {
-         /*
-          * initialize the scan
-          *
-          * If we are controlling the target relation, select the proper active
-          * ResultRelInfo and junk filter for this target.
-          */
-         if (((Append *) appendstate->ps.plan)->isTarget)
-         {
-             Assert(whichplan < estate->es_num_result_relations);
-             estate->es_result_relation_info =
-                 estate->es_result_relations + whichplan;
-             estate->es_junkFilter =
-                 estate->es_result_relation_info->ri_junkFilter;
-         }
-
          return TRUE;
      }
  }
--- 103,108 ----
***************
*** 164,189 **** ExecInitAppend(Append *node, EState *estate, int eflags)
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!     /*
!      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
!      * XXX pretty dirty way of determining that this case applies ...
!      */
!     if (node->isTarget && estate->es_evTuple != NULL)
!     {
!         int            tplan;
!
!         tplan = estate->es_result_relation_info - estate->es_result_relations;
!         Assert(tplan >= 0 && tplan < nplans);
!
!         appendstate->as_firstplan = tplan;
!         appendstate->as_lastplan = tplan;
!     }
!     else
!     {
!         /* normal case, scan all subplans */
!         appendstate->as_firstplan = 0;
!         appendstate->as_lastplan = nplans - 1;
!     }

      /*
       * Miscellaneous initialization
--- 149,157 ----
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!
!     appendstate->as_firstplan = 0;
!     appendstate->as_lastplan = nplans - 1;

      /*
       * Miscellaneous initialization
*** /dev/null
--- b/src/backend/executor/nodeDml.c
***************
*** 0 ****
--- 1,855 ----
+ #include "postgres.h"
+
+ #include "access/xact.h"
+ #include "parser/parsetree.h"
+ #include "executor/executor.h"
+ #include "executor/execdebug.h"
+ #include "executor/nodeDml.h"
+ #include "commands/trigger.h"
+ #include "nodes/nodeFuncs.h"
+ #include "utils/memutils.h"
+ #include "utils/builtins.h"
+ #include "utils/tqual.h"
+ #include "storage/bufmgr.h"
+ #include "miscadmin.h"
+
+ /*
+  * Verify that the tuples to be produced by INSERT or UPDATE match the
+  * target relation's rowtype
+  *
+  * We do this to guard against stale plans.  If plan invalidation is
+  * functioning properly then we should never get a failure here, but better
+  * safe than sorry.  Note that this is called after we have obtained lock
+  * on the target rel, so the rowtype can't change underneath us.
+  *
+  * The plan output is represented by its targetlist, because that makes
+  * handling the dropped-column case easier.
+  */
+ static void
+ ExecCheckPlanOutput(Relation resultRel, List *targetList)
+ {
+     TupleDesc    resultDesc = RelationGetDescr(resultRel);
+     int            attno = 0;
+     ListCell   *lc;
+
+     foreach(lc, targetList)
+     {
+         TargetEntry *tle = (TargetEntry *) lfirst(lc);
+         Form_pg_attribute attr;
+
+         if (tle->resjunk)
+             continue;            /* ignore junk tlist items */
+
+         if (attno >= resultDesc->natts)
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("table row type and query-specified row type do not match"),
+                      errdetail("Query has too many columns.")));
+         attr = resultDesc->attrs[attno++];
+
+         if (!attr->attisdropped)
+         {
+             /* Normal case: demand type match */
+             if (exprType((Node *) tle->expr) != attr->atttypid)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
+                                    format_type_be(attr->atttypid),
+                                    attno,
+                              format_type_be(exprType((Node *) tle->expr)))));
+         }
+         else
+         {
+             /*
+              * For a dropped column, we can't check atttypid (it's likely 0).
+              * In any case the planner has most likely inserted an INT4 null.
+              * What we insist on is just *some* NULL constant.
+              */
+             if (!IsA(tle->expr, Const) ||
+                 !((Const *) tle->expr)->constisnull)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
+                                    attno)));
+         }
+     }
+     if (attno != resultDesc->natts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+           errmsg("table row type and query-specified row type do not match"),
+                  errdetail("Query has too few columns.")));
+ }
+
+ static TupleTableSlot*
+ ExecProcessReturning(ProjectionInfo *projectReturning,
+                      TupleTableSlot *tupleSlot,
+                      TupleTableSlot *planSlot)
+ {
+     ExprContext *econtext = projectReturning->pi_exprContext;
+     TupleTableSlot *retSlot;
+
+     /*
+      * Reset per-tuple memory context to free any expression evaluation
+      * storage allocated in the previous cycle.
+      */
+     ResetExprContext(econtext);
+
+     /* Make tuple and any needed join variables available to ExecProject */
+     econtext->ecxt_scantuple = tupleSlot;
+     econtext->ecxt_outertuple = planSlot;
+
+     /* Compute the RETURNING expressions */
+     retSlot = ExecProject(projectReturning, NULL);
+
+     return retSlot;
+ }
+
+ static TupleTableSlot *
+ ExecInsert(TupleTableSlot *slot,
+             ItemPointer tupleid,
+             TupleTableSlot *planSlot,
+             EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     Oid            newId;
+     List        *recheckIndexes = NIL;
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relations;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /*
+      * If the result relation has OIDs, force the tuple's OID to zero so that
+      * heap_insert will assign a fresh OID.  Usually the OID already will be
+      * zero at this point, but there are corner cases where the plan tree can
+      * return a tuple extracted literally from some table with the same
+      * rowtype.
+      *
+      * XXX if we ever wanted to allow users to assign their own OIDs to new
+      * rows, this'd be the place to do it.  For the moment, we make a point of
+      * doing this before calling triggers, so that a user-supplied trigger
+      * could hack the OID if desired.
+      */
+     if (resultRelationDesc->rd_rel->relhasoids)
+         HeapTupleSetOid(tuple, InvalidOid);
+
+     /* BEFORE ROW INSERT Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return NULL;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      */
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * insert the tuple
+      *
+      * Note: heap_insert returns the tid (location) of the new tuple in the
+      * t_self field.
+      */
+     newId = heap_insert(resultRelationDesc, tuple,
+                         estate->es_output_cid, 0, NULL);
+
+     IncrAppended();
+     (estate->es_processed)++;
+     estate->es_lastoid = newId;
+     setLastTid(&(tuple->t_self));
+
+     /*
+      * insert index entries for tuple
+      */
+     if (resultRelInfo->ri_NumIndices > 0)
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false);
+
+     /* AFTER ROW INSERT Triggers */
+     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+     return slot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecDelete
+  *
+  *        DELETE is like UPDATE, except that we delete the tuple and no
+  *        index modifications are needed
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecDelete(ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     ResultRelInfo* resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW DELETE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
+     {
+         bool        dodelete;
+
+         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
+
+         if (!dodelete)            /* "do nothing" */
+             return planSlot;
+     }
+
+     /*
+      * delete the tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be deleted is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+ ldelete:;
+     result = heap_delete(resultRelationDesc, tupleid,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     goto ldelete;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_delete status: %u", result);
+             return NULL;
+     }
+
+     IncrDeleted();
+     (estate->es_processed)++;
+
+     /*
+      * Note: Normally one would think that we have to delete index tuples
+      * associated with the heap tuple now...
+      *
+      * ... but in POSTGRES, we have no need to do this because VACUUM will
+      * take care of it later.  We can't delete index tuples immediately
+      * anyway, since the tuple is still visible to other transactions.
+      */
+
+     /* AFTER ROW DELETE Triggers */
+     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+     {
+         /*
+          * We have to put the target tuple into a slot, which means first we
+          * gotta fetch it.    We can use the trigger tuple slot.
+          */
+         TupleTableSlot *slot = estate->es_trig_tuple_slot;
+         HeapTupleData deltuple;
+         Buffer        delbuffer;
+
+         deltuple.t_self = *tupleid;
+         if (!heap_fetch(resultRelationDesc, SnapshotAny,
+                         &deltuple, &delbuffer, false, NULL))
+             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
+
+         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
+             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
+         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
+
+         planSlot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+         ExecClearTuple(slot);
+         ReleaseBuffer(delbuffer);
+     }
+
+     return planSlot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecUpdate
+  *
+  *        note: we can't run UPDATE queries with transactions
+  *        off because UPDATEs are actually INSERTs and our
+  *        scan will mistakenly loop forever, updating the tuple
+  *        it just inserted..    This should be fixed but until it
+  *        is, we don't want to get stuck in an infinite loop
+  *        which corrupts your database..
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecUpdate(TupleTableSlot *slot,
+            ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+     List *recheckIndexes = NIL;
+
+     /*
+      * abort the operation if not running transactions
+      */
+     if (IsBootstrapProcessingMode())
+         elog(ERROR, "cannot UPDATE during bootstrap");
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW UPDATE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
+                                         tupleid, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return planSlot;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      *
+      * If we generate a new candidate tuple after EvalPlanQual testing, we
+      * must loop back here and recheck constraints.  (We don't need to redo
+      * triggers, however.  If there are any BEFORE triggers then trigger.c
+      * will have done heap_lock_tuple to lock the correct tuple, so there's no
+      * need to do them again.)
+      */
+ lreplace:;
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * replace the heap tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be updated is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+     result = heap_update(resultRelationDesc, tupleid, tuple,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     slot = ExecFilterJunk(estate->es_result_relation_info->ri_junkFilter, epqslot);
+                     tuple = ExecMaterializeSlot(slot);
+                     goto lreplace;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_update status: %u", result);
+             return NULL;
+     }
+
+     IncrReplaced();
+     (estate->es_processed)++;
+
+     /*
+      * Note: instead of having to update the old index tuples associated with
+      * the heap tuple, all we do is form and insert new index tuples. This is
+      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
+      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
+      * here is insert new index tuples.  -cim 9/27/89
+      */
+
+     /*
+      * insert index entries for tuple
+      *
+      * Note: heap_update returns the tid (location) of the new tuple in the
+      * t_self field.
+      *
+      * If it's a HOT update, we mustn't insert new index entries.
+      */
+     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+                                                estate, false);
+
+     /* AFTER ROW UPDATE Triggers */
+     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
+                          recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                                     slot, planSlot);
+
+     return slot;
+ }
+
+ TupleTableSlot *
+ ExecDml(DmlState *node)
+ {
+     CmdType operation = node->operation;
+     EState *estate = node->ps.state;
+     JunkFilter *junkfilter;
+     TupleTableSlot *slot;
+     TupleTableSlot *planSlot;
+     ItemPointer tupleid = NULL;
+     ItemPointerData tuple_ctid;
+
+     for (;;)
+     {
+         planSlot = ExecProcNode(node->dmlplans[node->ds_whichplan]);
+         if (TupIsNull(planSlot))
+         {
+             node->ds_whichplan++;
+             if (node->ds_whichplan < node->ds_nplans)
+             {
+                 estate->es_result_relation_info++;
+                 continue;
+             }
+             else
+                 return NULL;
+         }
+         else
+             break;
+     }
+
+     slot = planSlot;
+
+     if ((junkfilter = estate->es_result_relation_info->ri_junkFilter) != NULL)
+     {
+         /*
+          * extract the 'ctid' junk attribute.
+          */
+         if (operation == CMD_UPDATE || operation == CMD_DELETE)
+         {
+             Datum        datum;
+             bool        isNull;
+
+             datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
+                                              &isNull);
+             /* shouldn't ever get a null result... */
+             if (isNull)
+                 elog(ERROR, "ctid is NULL");
+
+             tupleid = (ItemPointer) DatumGetPointer(datum);
+             tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
+             tupleid = &tuple_ctid;
+         }
+
+         if (operation != CMD_DELETE)
+             slot = ExecFilterJunk(junkfilter, slot);
+     }
+
+     switch (operation)
+     {
+         case CMD_INSERT:
+             return ExecInsert(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_UPDATE:
+             return ExecUpdate(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_DELETE:
+             return ExecDelete(tupleid, slot, estate);
+         default:
+             elog(ERROR, "unknown operation");
+             break;
+     }
+
+     return NULL;
+ }
+
+ DmlState *
+ ExecInitDml(Dml *node, EState *estate, int eflags)
+ {
+     DmlState *dmlstate;
+     ResultRelInfo *resultRelInfo;
+     Plan *subplan;
+     ListCell *l;
+     ListCell *relindex;
+     CmdType operation = node->operation;
+     int nplans;
+     int i;
+
+     TupleDesc tupDesc;
+
+     nplans = list_length(node->plans);
+
+     /*
+      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
+      * XXX pretty dirty way of determining that this case applies ...
+      */
+     if (estate->es_evTuple != NULL)
+     {
+         int tplan;
+
+         tplan = estate->es_result_relation_info - estate->es_result_relations;
+         Assert(tplan >= 0 && tplan < nplans);
+
+         /*
+          * We don't want another DmlNode on top, so just
+          * return a PlanState for the subplan wanted.
+          */
+         return (DmlState *) ExecInitNode(list_nth(node->plans, tplan), estate, eflags);
+     }
+
+     /*
+      * create state structure
+      */
+     dmlstate = makeNode(DmlState);
+     dmlstate->ps.plan = (Plan *) node;
+     dmlstate->ps.state = estate;
+     dmlstate->ps.targetlist = node->plan.targetlist;
+
+     dmlstate->ds_nplans = nplans;
+     dmlstate->dmlplans = (PlanState **) palloc0(sizeof(PlanState *) * nplans);
+     dmlstate->operation = node->operation;
+
+     estate->es_result_relation_info = estate->es_result_relations;
+     relindex = list_head(node->resultRelations);
+     i = 0;
+     foreach(l, node->plans)
+     {
+         subplan = lfirst(l);
+
+         dmlstate->dmlplans[i] = ExecInitNode(subplan, estate, eflags);
+
+         i++;
+         estate->es_result_relation_info++;
+         relindex = lnext(relindex);
+     }
+
+     estate->es_result_relation_info = estate->es_result_relations;
+
+     dmlstate->ds_whichplan = 0;
+
+     subplan = (Plan *) linitial(node->plans);
+
+ #define DMLNODE_NSLOTS 1
+
+     if (node->returningLists)
+     {
+         TupleTableSlot *slot;
+         ExprContext *econtext;
+
+         /*
+          * Initialize result tuple slot and assign
+          * type from the RETURNING list.
+          */
+         tupDesc = ExecTypeFromTL((List *) linitial(node->returningLists),
+                                  false);
+
+         /*
+          * Set up a slot for the output of the RETURNING projection(s).
+          */
+         slot = ExecAllocTableSlot(estate->es_tupleTable);
+         ExecSetSlotDescriptor(slot, tupDesc);
+
+         econtext = CreateExprContext(estate);
+
+         Assert(list_length(node->returningLists) == estate->es_num_result_relations);
+         resultRelInfo = estate->es_result_relations;
+         foreach(l, node->returningLists)
+         {
+             List       *rlist = (List *) lfirst(l);
+             List       *rliststate;
+
+             rliststate = (List *) ExecInitExpr((Expr *) rlist, &dmlstate->ps);
+             resultRelInfo->ri_projectReturning =
+                 ExecBuildProjectionInfo(rliststate, econtext, slot,
+                                         resultRelInfo->ri_RelationDesc->rd_att);
+             resultRelInfo++;
+         }
+
+         dmlstate->ps.ps_ResultTupleSlot = slot;
+         dmlstate->ps.ps_ExprContext = econtext;
+     }
+     else
+     {
+         ExecInitResultTupleSlot(estate, &dmlstate->ps);
+         tupDesc = ExecTypeFromTL(subplan->targetlist, false);
+         ExecAssignResultType(&dmlstate->ps, tupDesc);
+
+         dmlstate->ps.ps_ExprContext = NULL;
+     }
+
+     /*
+      * Initialize the junk filter if needed. INSERT queries need a filter
+      * if there are any junk attrs in the tlist.  UPDATE and DELETE
+      * always need a filter, since there's always a junk 'ctid' attribute
+      * present --- no need to look first.
+      *
+      * This section of code is also a convenient place to verify that the
+      * output of an INSERT or UPDATE matches the target table(s).
+      */
+     {
+         bool        junk_filter_needed = false;
+         ListCell   *tlist;
+
+         switch (operation)
+         {
+             case CMD_INSERT:
+                 foreach(tlist, subplan->targetlist)
+                 {
+                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);
+
+                     if (tle->resjunk)
+                     {
+                         junk_filter_needed = true;
+                         break;
+                     }
+                 }
+                 break;
+             case CMD_UPDATE:
+             case CMD_DELETE:
+                 junk_filter_needed = true;
+                 break;
+             default:
+                 break;
+         }
+
+         resultRelInfo = estate->es_result_relations;
+
+         if (junk_filter_needed)
+         {
+             /*
+              * If there are multiple result relations, each one needs its own
+              * junk filter.  Note this is only possible for UPDATE/DELETE, so
+              * we can't be fooled by some needing a filter and some not.
+
+              */
+             if (nplans > 1)
+             {
+                 for (i = 0; i < nplans; i++)
+                 {
+                     PlanState *ps = dmlstate->dmlplans[i];
+                     JunkFilter *j;
+
+                     if (operation == CMD_UPDATE)
+                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                             ps->plan->targetlist);
+
+                     j = ExecInitJunkFilter(ps->plan->targetlist,
+                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                 ExecAllocTableSlot(estate->es_tupleTable));
+
+
+                     /*
+                      * Since it must be UPDATE/DELETE, there had better be a
+                      * "ctid" junk attribute in the tlist ... but ctid could
+                      * be at a different resno for each result relation. We
+                      * look up the ctid resnos now and save them in the
+                      * junkfilters.
+                      */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                     resultRelInfo->ri_junkFilter = j;
+                     resultRelInfo++;
+                 }
+             }
+             else
+             {
+                 JunkFilter *j;
+                 subplan = dmlstate->dmlplans[0]->plan;
+
+                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
+                     ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                         subplan->targetlist);
+
+                 j = ExecInitJunkFilter(subplan->targetlist,
+                                        resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                        ExecAllocTableSlot(estate->es_tupleTable));
+
+                 if (operation == CMD_UPDATE || operation == CMD_DELETE)
+                 {
+                     /* FOR UPDATE/DELETE, find the ctid junk attr now */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                 }
+
+                 estate->es_result_relation_info->ri_junkFilter = j;
+             }
+         }
+         else
+         {
+             if (operation == CMD_INSERT)
+                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
+                                     subplan->targetlist);
+         }
+     }
+
+     return dmlstate;
+ }
+
+ int
+ ExecCountSlotsDml(Dml *node)
+ {
+     ListCell* l;
+     int nslots = DMLNODE_NSLOTS;
+
+     /* Add a slot RETURNING projection */
+     if (node->returningLists)
+         nslots++;
+
+     foreach(l, node->plans)
+         nslots += ExecCountSlotsNode((Plan *) lfirst(l));
+
+     return nslots;
+ }
+
+ void
+ ExecEndDml(DmlState *node)
+ {
+     int i;
+
+     /*
+      * Free the exprcontext
+      */
+     ExecFreeExprContext(&node->ps);
+
+     /*
+      * clean out the tuple table
+      */
+     ExecClearTuple(node->ps.ps_ResultTupleSlot);
+
+     /*
+      * shut down subplans
+      */
+     for (i=0;i<node->ds_nplans;++i)
+     {
+         ExecEndNode(node->dmlplans[i]);
+     }
+
+     pfree(node->dmlplans);
+ }
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 171,177 **** _copyAppend(Append *from)
       * copy remainder of node
       */
      COPY_NODE_FIELD(appendplans);
-     COPY_SCALAR_FIELD(isTarget);

      return newnode;
  }
--- 171,176 ----
***************
*** 1391,1396 **** _copyXmlExpr(XmlExpr *from)
--- 1390,1411 ----

      return newnode;
  }
+
+ static Dml *
+ _copyDml(Dml *from)
+ {
+     Dml    *newnode = makeNode(Dml);
+
+     CopyPlanFields((Plan *) from, (Plan *) newnode);
+
+     COPY_NODE_FIELD(plans);
+     COPY_SCALAR_FIELD(operation);
+     COPY_NODE_FIELD(resultRelations);
+     COPY_NODE_FIELD(returningLists);
+
+     return newnode;
+ }
+

  /*
   * _copyNullIfExpr (same as OpExpr)
***************
*** 4083,4088 **** copyObject(void *from)
--- 4098,4106 ----
          case T_XmlSerialize:
              retval = _copyXmlSerialize(from);
              break;
+         case T_Dml:
+             retval = _copyDml(from);
+             break;

          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 326,332 **** _outAppend(StringInfo str, Append *node)
      _outPlanInfo(str, (Plan *) node);

      WRITE_NODE_FIELD(appendplans);
-     WRITE_BOOL_FIELD(isTarget);
  }

  static void
--- 326,331 ----
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 574,580 **** create_append_plan(PlannerInfo *root, AppendPath *best_path)
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, false, tlist);

      return (Plan *) plan;
  }
--- 574,580 ----
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, tlist);

      return (Plan *) plan;
  }
***************
*** 2616,2622 **** make_worktablescan(List *qptlist,
  }

  Append *
! make_append(List *appendplans, bool isTarget, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
--- 2616,2622 ----
  }

  Append *
! make_append(List *appendplans, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
***************
*** 2652,2658 **** make_append(List *appendplans, bool isTarget, List *tlist)
      plan->lefttree = NULL;
      plan->righttree = NULL;
      node->appendplans = appendplans;
-     node->isTarget = isTarget;

      return node;
  }
--- 2652,2657 ----
***************
*** 3659,3664 **** make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
--- 3658,3689 ----
      return node;
  }

+ Dml *
+ make_dml(List *subplans, List *returningLists, List *resultRelations, CmdType operation)
+ {
+     Dml *node = makeNode(Dml);
+
+     Assert(list_length(subplans) == list_length(resultRelations));
+     Assert(!returningLists || list_length(returningLists) == list_length(resultRelations));
+
+     node->plan.lefttree = NULL;
+     node->plan.righttree = NULL;
+     node->plan.qual = NIL;
+
+     if (returningLists)
+         node->plan.targetlist = linitial(returningLists);
+     else
+         node->plan.targetlist = NIL;
+
+     node->plans = subplans;
+     node->resultRelations = resultRelations;
+     node->returningLists = returningLists;
+
+     node->operation = operation;
+
+     return node;
+ }
+
  /*
   * make_result
   *      Build a Result plan node
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 478,485 **** subquery_planner(PlannerGlobal *glob, Query *parse,
--- 478,494 ----
          rt_fetch(parse->resultRelation, parse->rtable)->inh)
          plan = inheritance_planner(root);
      else
+     {
          plan = grouping_planner(root, tuple_fraction);

+         if (parse->commandType != CMD_SELECT)
+             plan = (Plan *) make_dml(list_make1(plan),
+                                      root->returningLists,
+                                      root->resultRelations,
+                                      parse->commandType);
+     }
+
+
      /*
       * If any subplans were generated, or if we're inside a subplan, build
       * initPlan list and extParam/allParam sets for plan nodes, and attach the
***************
*** 624,632 **** preprocess_qual_conditions(PlannerInfo *root, Node *jtnode)
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Also, for both UPDATE
!  * and DELETE, the executor needs the Append plan node at the top, else it
!  * can't keep track of which table is the current target table.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
--- 633,639 ----
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
***************
*** 737,747 **** inheritance_planner(PlannerInfo *root)
       */
      parse->rtable = rtable;

!     /* Suppress Append if there's only one surviving child rel */
!     if (list_length(subplans) == 1)
!         return (Plan *) linitial(subplans);
!
!     return (Plan *) make_append(subplans, true, tlist);
  }

  /*--------------------
--- 744,753 ----
       */
      parse->rtable = rtable;

!     return (Plan *) make_dml(subplans,
!                              root->returningLists,
!                              root->resultRelations,
!                              parse->commandType);
  }

  /*--------------------
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 375,380 **** set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset)
--- 375,403 ----
              set_join_references(glob, (Join *) plan, rtoffset);
              break;

+         case T_Dml:
+             {
+                 /*
+                  * grouping_planner() already called set_returning_clause_references
+                  * so the targetList's references are already set.
+                  */
+                 Dml *splan = (Dml *) plan;
+
+                 foreach(l, splan->resultRelations)
+                 {
+                     lfirst_int(l) += rtoffset;
+                 }
+
+                 Assert(splan->plan.qual == NIL);
+                 foreach(l, splan->plans)
+                 {
+                     lfirst(l) = set_plan_refs(glob,
+                                               (Plan *) lfirst(l),
+                                               rtoffset);
+                 }
+             }
+             break;
+
          case T_Hash:
          case T_Material:
          case T_Sort:
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 2034,2039 **** finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params)
--- 2034,2040 ----
          case T_Unique:
          case T_SetOp:
          case T_Group:
+         case T_Dml:
              break;

          default:
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
***************
*** 448,454 **** generate_union_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
--- 448,454 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
***************
*** 539,545 **** generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
--- 539,545 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
*** /dev/null
--- b/src/include/executor/nodeDml.h
***************
*** 0 ****
--- 1,11 ----
+ #ifndef NODEDML_H
+ #define NODEDML_H
+
+ #include "nodes/execnodes.h"
+
+ extern int ExecCountSlotsDml(Dml *node);
+ extern DmlState *ExecInitDml(Dml *node, EState *estate, int eflags);
+ extern TupleTableSlot *ExecDml(DmlState *node);
+ extern void ExecEndDml(DmlState *node);
+
+ #endif
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 976,981 **** typedef struct ResultState
--- 976,996 ----
  } ResultState;

  /* ----------------
+  *     DmlState information
+  * ----------------
+  */
+ typedef struct DmlState
+ {
+     PlanState        ps;                /* its first field is NodeTag */
+     PlanState      **dmlplans;
+     int                ds_nplans;
+     int                ds_whichplan;
+
+     CmdType            operation;
+ } DmlState;
+
+
+ /* ----------------
   *     AppendState information
   *
   *        nplans            how many plans are in the list
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 71,76 **** typedef enum NodeTag
--- 71,77 ----
      T_Hash,
      T_SetOp,
      T_Limit,
+     T_Dml,
      /* this one isn't a subclass of Plan: */
      T_PlanInvalItem,

***************
*** 190,195 **** typedef enum NodeTag
--- 191,197 ----
      T_NullTestState,
      T_CoerceToDomainState,
      T_DomainConstraintState,
+     T_DmlState,

      /*
       * TAGS FOR PLANNER NODES (relation.h)
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 164,185 **** typedef struct Result
      Node       *resconstantqual;
  } Result;

  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
-  *
-  * Append nodes are sometimes used to switch between several result relations
-  * (when the target of an UPDATE or DELETE is an inheritance set).    Such a
-  * node will have isTarget true.  The Append executor is then responsible
-  * for updating the executor state to point at the correct target relation
-  * whenever it switches subplans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
-     bool        isTarget;
  } Append;

  /* ----------------
--- 164,189 ----
      Node       *resconstantqual;
  } Result;

+ typedef struct Dml
+ {
+     Plan        plan;
+
+     CmdType        operation;
+     List       *plans;
+     List       *resultRelations;
+     List       *returningLists;
+ } Dml;
+
+
  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
  } Append;

  /* ----------------
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 41,47 **** extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, bool isTarget, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
--- 41,47 ----
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
***************
*** 69,74 **** extern Plan *materialize_finished_plan(Plan *subplan);
--- 69,76 ----
  extern Unique *make_unique(Plan *lefttree, List *distinctList);
  extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
             int64 offset_est, int64 count_est);
+ extern Dml *make_dml(List *subplans, List *returningLists, List *resultRelation,
+             CmdType operation);
  extern SetOp *make_setop(SetOpCmd cmd, SetOpStrategy strategy, Plan *lefttree,
             List *distinctList, AttrNumber flagColIdx, int firstFlag,
             long numGroups, double outputRows);

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

21 September 2009, 23:34:53

On Sun, Sep 6, 2009 at 6:10 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:
> Fixed a couple of bugs and renovated ExecInitDml() a bit.  Patch attached.

Hi, I'm reviewing this patch for this CommitFest.

With regard to the changes in explain.c, I think that the way you've
capitalized INSERT, UPDATE, and DELETE is not consistent with our
usual style for labelling nodes.  Also, you've failed to set sname, so
this reads from uninitialized memory when using JSON or XML format.  I
think that you should handle XML/JSON format by setting sname to "Dml"
and then emit an "operation" field down around where we do if
(strategy) ExplainPropertyText("Strategy", ...).

I am not sure that I like the name Dml for the node type.  Most of our
node types are descriptions of the action that will be performed, like
Sort or HashJoin; Dml is the name of the feature we're trying to
implement, but it's not the name of the action we're performing.  Not
sure what would be better, though.  Write?  Modify?

Can you explain the motivation for changing the Append stuff as part
of this patch?  It's not immediately clear to me why that needs to be
done as part of this patch or what we get out of it.

What is your general impression about the level of maturity of this
code?  Are you submitting this as complete and ready for commit, or is
it a WIP?  If the latter, what are the known issues?

I'll try to provide some more feedback on this after I look it over some more.

...Robert

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

22 September 2009, 11:30:11

(Sorry, forgot to CC hackers)

Robert Haas wrote:
> With regard to the changes in explain.c, I think that the way you've
> capitalized INSERT, UPDATE, and DELETE is not consistent with our
> usual style for labelling nodes.  Also, you've failed to set sname, so
> this reads from uninitialized memory when using JSON or XML format.  I
> think that you should handle XML/JSON format by setting sname to "Dml"
> and then emit an "operation" field down around where we do if
> (strategy) ExplainPropertyText("Strategy", ...).

You're right, I should fix that.

> I am not sure that I like the name Dml for the node type.  Most of our
> node types are descriptions of the action that will be performed, like
> Sort or HashJoin; Dml is the name of the feature we're trying to
> implement, but it's not the name of the action we're performing.  Not
> sure what would be better, though.  Write?  Modify?

Dml was the first name I came up with and it stuck, but it could be
better.  I don't really like Write or Modify either.

> Can you explain the motivation for changing the Append stuff as part
> of this patch?  It's not immediately clear to me why that needs to be
> done as part of this patch or what we get out of it.

It seemed to me that the Append on top was only a workaround for the
fact that we didn't have a node for DML operations that would select the
correct result relation.  I don't see why an Append node should do this
at all if we have a special node for handling DML.

> What is your general impression about the level of maturity of this
> code?  Are you submitting this as complete and ready for commit, or is
> it a WIP?  If the latter, what are the known issues?

Aside from the EXPLAIN stuff you brought up, there are no issues that
I'm aware of.  There are a few spots that could be prettier, but I have
no good ideas for them.

> I'll try to provide some more feedback on this after I look it over some more.

Thanks!

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

22 September 2009, 12:04:22

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> Robert Haas wrote:
>> Can you explain the motivation for changing the Append stuff as part
>> of this patch?  It's not immediately clear to me why that needs to be
>> done as part of this patch or what we get out of it.

> It seemed to me that the Append on top was only a workaround for the
> fact that we didn't have a node for DML operations that would select the
> correct result relation.  I don't see why an Append node should do this
> at all if we have a special node for handling DML.

The stuff for inherited target relations is certainly ugly, and if this
can clean it up then so much the better ... but is a DML node that has
to deal with multiple targets really better?  It's not only the Append
itself that's funny, it's the fact that the generated tuples don't all
have the same tupdesc in UPDATE cases.


FWIW, I'd think of having three separate node types Insert, Update,
Delete, not a combined Dml node.  The behavior and the required inputs
are sufficiently different that I don't think a combined node type
is saving much.  And it'd avoid the "what is that?" problem.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

22 September 2009, 12:11:28

On Tue, Sep 22, 2009 at 11:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
>> Robert Haas wrote:
>>> Can you explain the motivation for changing the Append stuff as part
>>> of this patch?  It's not immediately clear to me why that needs to be
>>> done as part of this patch or what we get out of it.
>
>> It seemed to me that the Append on top was only a workaround for the
>> fact that we didn't have a node for DML operations that would select the
>> correct result relation.  I don't see why an Append node should do this
>> at all if we have a special node for handling DML.
>
> The stuff for inherited target relations is certainly ugly, and if this
> can clean it up then so much the better ... but is a DML node that has
> to deal with multiple targets really better?  It's not only the Append
> itself that's funny, it's the fact that the generated tuples don't all
> have the same tupdesc in UPDATE cases.
>
>
> FWIW, I'd think of having three separate node types Insert, Update,
> Delete, not a combined Dml node.  The behavior and the required inputs
> are sufficiently different that I don't think a combined node type
> is saving much.  And it'd avoid the "what is that?" problem.

Right now, it looks like most of the code is being shared between all
three plan types.  I'm pretty suspicious of how much code this patch
moves around and how little of it is actually changed.  I can't really
tell if there's an actual design improvement here or if this is all
window-dressing.

...Robert

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

22 September 2009, 12:51:17

Robert Haas <robertmhaas@gmail.com> writes:
> Right now, it looks like most of the code is being shared between all
> three plan types.  I'm pretty suspicious of how much code this patch
> moves around and how little of it is actually changed.  I can't really
> tell if there's an actual design improvement here or if this is all
> window-dressing.

My recollection is that we *told* Marko to set things up so that the
first patch was mainly just code refactoring.  So your second sentence
doesn't surprise me.  As to the third, I've not looked at the patch,
but perhaps it needs to expend more effort on documentation?
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

23 September 2009, 23:47:08

On Tue, Sep 22, 2009 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Right now, it looks like most of the code is being shared between all
>> three plan types.  I'm pretty suspicious of how much code this patch
>> moves around and how little of it is actually changed.  I can't really
>> tell if there's an actual design improvement here or if this is all
>> window-dressing.
>
> My recollection is that we *told* Marko to set things up so that the
> first patch was mainly just code refactoring.  So your second sentence
> doesn't surprise me.  As to the third, I've not looked at the patch,
> but perhaps it needs to expend more effort on documentation?

Well, part of the problem is that I've not had a lot of luck trying to
understand how the executor really works (what's a tuple table slot
and why do we need to know in advance how many of them there are?).
There's this fine comment, which has been in
src/backend/executor/README for 8 years and change:

XXX a great deal more documentation needs to be written here...

However, that's not the whole problem, either.  To your point about
documentation, it seems this path doesn't touch the README at all, and
it needs to, because some of the statements in that file would
certainly become false were this patch to be applied.

So I think we should at a minimum ask the patch author to (1) fix the
explain bugs I found and (2) update the README, as well as (3) revert
needless whitespace changes - there are a couple in execMain.c, from
the looks of it.

However, before he spends too much more time on this feature, it would
probably be good for you to take a quick scan through the patch and
see what you think of the general approach.  I don't think I'm
qualified to judge.

...Robert

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

24 September 2009, 11:23:38

Robert Haas wrote:
> However, that's not the whole problem, either.  To your point about
> documentation, it seems this path doesn't touch the README at all, and
> it needs to, because some of the statements in that file would
> certainly become false were this patch to be applied.
> 
> So I think we should at a minimum ask the patch author to (1) fix the
> explain bugs I found and (2) update the README, as well as (3) revert
> needless whitespace changes - there are a couple in execMain.c, from
> the looks of it.

I overlooked the README completely.  I'll see what I can do about these
and submit an updated patch in a couple of days.


Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

27 September 2009, 01:40:24

Robert Haas <robertmhaas@gmail.com> writes:
> Well, part of the problem is that I've not had a lot of luck trying to
> understand how the executor really works (what's a tuple table slot
> and why do we need to know in advance how many of them there are?).

You know, that's actually a really good question.  There is not, as far
as I can see, much advantage to keeping the Slots in an array.  We could
perfectly well keep them in a List and eliminate the notational overhead
of having to count them in advance.  There'd be a bit more palloc
overhead (for the list cells) but avoiding the counting work would more
or less make up for that.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

27 September 2009, 10:49:26

On Sun, Sep 27, 2009 at 12:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, part of the problem is that I've not had a lot of luck trying to
>> understand how the executor really works (what's a tuple table slot
>> and why do we need to know in advance how many of them there are?).
>
> You know, that's actually a really good question.  There is not, as far
> as I can see, much advantage to keeping the Slots in an array.  We could
> perfectly well keep them in a List and eliminate the notational overhead
> of having to count them in advance.  There'd be a bit more palloc
> overhead (for the list cells) but avoiding the counting work would more
> or less make up for that.

Heh.  I was actually asking an even stupider question, which is why do
we need to keep all of them in ANY centrally known data structure?
What operation do we perform that requires us to find all of the
exstant TTS?

...Robert

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

27 September 2009, 12:30:44

Robert Haas <robertmhaas@gmail.com> writes:
> Heh.  I was actually asking an even stupider question, which is why do
> we need to keep all of them in ANY centrally known data structure?
> What operation do we perform that requires us to find all of the
> exstant TTS?

ExecDropTupleTable is used to release slot-related buffer pins and
tupdesc refcounts at conclusion of a query.  I suppose we could require
the individual plan node End routines to do it instead.  Not sure if
that's an improvement.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

28 September 2009, 12:32:39

Robert Haas wrote:
> So I think we should at a minimum ask the patch author to (1) fix the
> explain bugs I found and (2) update the README, as well as (3) revert
> needless whitespace changes - there are a couple in execMain.c, from
> the looks of it.

In the attached patch, I made the changes to explain as you suggested
and reverted the only whitespace change I could find from execMain.c.
However, English isn't my first language so I'm not very confident about
fixing the README.

Regards,
Marko Tiikkaja
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 581,586 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 581,587 ----
      const char *pname;            /* node type name for text output */
      const char *sname;            /* node type name for non-text output */
      const char *strategy = NULL;
+     const char *operation = NULL; /* DML operation */
      int            save_indent = es->indent;
      bool        haschildren;

***************
*** 705,710 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 706,729 ----
          case T_Hash:
              pname = sname = "Hash";
              break;
+         case T_Dml:
+             sname = "Dml";
+             switch( ((Dml *) plan)->operation)
+             {
+                 case CMD_INSERT:
+                     pname = operation = "Insert";
+                     break;
+                 case CMD_UPDATE:
+                     pname = operation = "Update";
+                     break;
+                 case CMD_DELETE:
+                     pname = operation = "Delete";
+                     break;
+                 default:
+                     pname = "???";
+                     break;
+             }
+             break;
          default:
              pname = sname = "???";
              break;
***************
*** 740,745 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 759,766 ----
              ExplainPropertyText("Parent Relationship", relationship, es);
          if (plan_name)
              ExplainPropertyText("Subplan Name", plan_name, es);
+         if (operation)
+             ExplainPropertyText("Operation", operation, es);
      }

      switch (nodeTag(plan))
***************
*** 1064,1069 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 1085,1095 ----
                                 ((AppendState *) planstate)->appendplans,
                                 outer_plan, es);
              break;
+         case T_Dml:
+             ExplainMemberNodes(((Dml *) plan)->plans,
+                                ((DmlState *) planstate)->dmlplans,
+                                outer_plan, es);
+             break;
          case T_BitmapAnd:
              ExplainMemberNodes(((BitmapAnd *) plan)->bitmapplans,
                                 ((BitmapAndState *) planstate)->bitmapplans,
*** a/src/backend/executor/Makefile
--- b/src/backend/executor/Makefile
***************
*** 15,21 **** include $(top_builddir)/src/Makefile.global
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
--- 15,21 ----
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o nodeDml.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 77,83 **** typedef struct evalPlanQual

  /* decls for local routines only used within this module */
  static void InitPlan(QueryDesc *queryDesc, int eflags);
- static void ExecCheckPlanOutput(Relation resultRel, List *targetList);
  static void ExecEndPlan(PlanState *planstate, EState *estate);
  static void ExecutePlan(EState *estate, PlanState *planstate,
              CmdType operation,
--- 77,82 ----
***************
*** 86,104 **** static void ExecutePlan(EState *estate, PlanState *planstate,
              DestReceiver *dest);
  static void ExecSelect(TupleTableSlot *slot,
             DestReceiver *dest, EState *estate);
- static void ExecInsert(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecUpdate(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest);
  static TupleTableSlot *EvalPlanQualNext(EState *estate);
  static void EndEvalPlanQual(EState *estate);
  static void ExecCheckRTPerms(List *rangeTable);
--- 85,90 ----
***************
*** 814,909 **** InitPlan(QueryDesc *queryDesc, int eflags)
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT and INSERT queries need a
!      * filter if there are any junk attrs in the tlist.  UPDATE and DELETE
!      * always need a filter, since there's always a junk 'ctid' attribute
!      * present --- no need to look first.
!      *
!      * This section of code is also a convenient place to verify that the
!      * output of an INSERT or UPDATE matches the target table(s).
       */
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         switch (operation)
          {
!             case CMD_SELECT:
!             case CMD_INSERT:
!                 foreach(tlist, plan->targetlist)
!                 {
!                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!                     if (tle->resjunk)
!                     {
!                         junk_filter_needed = true;
!                         break;
!                     }
!                 }
!                 break;
!             case CMD_UPDATE:
!             case CMD_DELETE:
                  junk_filter_needed = true;
                  break;
!             default:
!                 break;
          }

          if (junk_filter_needed)
          {
-             /*
-              * If there are multiple result relations, each one needs its own
-              * junk filter.  Note this is only possible for UPDATE/DELETE, so
-              * we can't be fooled by some needing a filter and some not.
-              */
              if (list_length(plannedstmt->resultRelations) > 1)
              {
-                 PlanState **appendplans;
-                 int            as_nplans;
-                 ResultRelInfo *resultRelInfo;
-
-                 /* Top plan had better be an Append here. */
-                 Assert(IsA(plan, Append));
-                 Assert(((Append *) plan)->isTarget);
-                 Assert(IsA(planstate, AppendState));
-                 appendplans = ((AppendState *) planstate)->appendplans;
-                 as_nplans = ((AppendState *) planstate)->as_nplans;
-                 Assert(as_nplans == estate->es_num_result_relations);
-                 resultRelInfo = estate->es_result_relations;
-                 for (i = 0; i < as_nplans; i++)
-                 {
-                     PlanState  *subplan = appendplans[i];
-                     JunkFilter *j;
-
-                     if (operation == CMD_UPDATE)
-                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
-                                             subplan->plan->targetlist);
-
-                     j = ExecInitJunkFilter(subplan->plan->targetlist,
-                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
-                                            ExecInitExtraTupleSlot(estate));
-
-                     /*
-                      * Since it must be UPDATE/DELETE, there had better be a
-                      * "ctid" junk attribute in the tlist ... but ctid could
-                      * be at a different resno for each result relation. We
-                      * look up the ctid resnos now and save them in the
-                      * junkfilters.
-                      */
-                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
-                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
-                         elog(ERROR, "could not find junk ctid column");
-                     resultRelInfo->ri_junkFilter = j;
-                     resultRelInfo++;
-                 }
-
-                 /*
-                  * Set active junkfilter too; at this point ExecInitAppend has
-                  * already selected an active result relation...
-                  */
-                 estate->es_junkFilter =
-                     estate->es_result_relation_info->ri_junkFilter;
-
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
--- 800,828 ----
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT queries need a
!      * filter if there are any junk attrs in the tlist.
       */
+     if (operation == CMD_SELECT)
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         foreach(tlist, plan->targetlist)
          {
!             TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!             if (tle->resjunk)
!             {
                  junk_filter_needed = true;
                  break;
!             }
          }

          if (junk_filter_needed)
          {
              if (list_length(plannedstmt->resultRelations) > 1)
              {
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
***************
*** 916,928 **** InitPlan(QueryDesc *queryDesc, int eflags)
              }
              else
              {
-                 /* Normal case with just one JunkFilter */
                  JunkFilter *j;

-                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
-                     ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                         planstate->plan->targetlist);
-
                  j = ExecInitJunkFilter(planstate->plan->targetlist,
                                         tupType->tdhasoid,
                                         ExecInitExtraTupleSlot(estate));
--- 835,842 ----
***************
*** 930,947 **** InitPlan(QueryDesc *queryDesc, int eflags)
                  if (estate->es_result_relation_info)
                      estate->es_result_relation_info->ri_junkFilter = j;

!                 if (operation == CMD_SELECT)
!                 {
!                     /* For SELECT, want to return the cleaned tuple type */
!                     tupType = j->jf_cleanTupType;
!                 }
!                 else if (operation == CMD_UPDATE || operation == CMD_DELETE)
!                 {
!                     /* For UPDATE/DELETE, find the ctid junk attr now */
!                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
!                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
!                         elog(ERROR, "could not find junk ctid column");
!                 }

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
--- 844,851 ----
                  if (estate->es_result_relation_info)
                      estate->es_result_relation_info->ri_junkFilter = j;

!                 /* For SELECT, want to return the cleaned tuple type */
!                 tupType = j->jf_cleanTupType;

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
***************
*** 971,1027 **** InitPlan(QueryDesc *queryDesc, int eflags)
          }
          else
          {
-             if (operation == CMD_INSERT)
-                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                     planstate->plan->targetlist);
-
              estate->es_junkFilter = NULL;
              if (estate->es_rowMarks)
                  elog(ERROR, "SELECT FOR UPDATE/SHARE, but no junk columns");
          }
      }

-     /*
-      * Initialize RETURNING projections if needed.
-      */
-     if (plannedstmt->returningLists)
-     {
-         TupleTableSlot *slot;
-         ExprContext *econtext;
-         ResultRelInfo *resultRelInfo;
-
-         /*
-          * We set QueryDesc.tupDesc to be the RETURNING rowtype in this case.
-          * We assume all the sublists will generate the same output tupdesc.
-          */
-         tupType = ExecTypeFromTL((List *) linitial(plannedstmt->returningLists),
-                                  false);
-
-         /* Set up a slot for the output of the RETURNING projection(s) */
-         slot = ExecInitExtraTupleSlot(estate);
-         ExecSetSlotDescriptor(slot, tupType);
-         /* Need an econtext too */
-         econtext = CreateExprContext(estate);
-
-         /*
-          * Build a projection for each result rel.    Note that any SubPlans in
-          * the RETURNING lists get attached to the topmost plan node.
-          */
-         Assert(list_length(plannedstmt->returningLists) == estate->es_num_result_relations);
-         resultRelInfo = estate->es_result_relations;
-         foreach(l, plannedstmt->returningLists)
-         {
-             List       *rlist = (List *) lfirst(l);
-             List       *rliststate;
-
-             rliststate = (List *) ExecInitExpr((Expr *) rlist, planstate);
-             resultRelInfo->ri_projectReturning =
-                 ExecBuildProjectionInfo(rliststate, econtext, slot,
-                                      resultRelInfo->ri_RelationDesc->rd_att);
-             resultRelInfo++;
-         }
-     }
-
      queryDesc->tupDesc = tupType;
      queryDesc->planstate = planstate;

--- 875,886 ----
***************
*** 1123,1197 **** InitResultRelInfo(ResultRelInfo *resultRelInfo,
  }

  /*
-  * Verify that the tuples to be produced by INSERT or UPDATE match the
-  * target relation's rowtype
-  *
-  * We do this to guard against stale plans.  If plan invalidation is
-  * functioning properly then we should never get a failure here, but better
-  * safe than sorry.  Note that this is called after we have obtained lock
-  * on the target rel, so the rowtype can't change underneath us.
-  *
-  * The plan output is represented by its targetlist, because that makes
-  * handling the dropped-column case easier.
-  */
- static void
- ExecCheckPlanOutput(Relation resultRel, List *targetList)
- {
-     TupleDesc    resultDesc = RelationGetDescr(resultRel);
-     int            attno = 0;
-     ListCell   *lc;
-
-     foreach(lc, targetList)
-     {
-         TargetEntry *tle = (TargetEntry *) lfirst(lc);
-         Form_pg_attribute attr;
-
-         if (tle->resjunk)
-             continue;            /* ignore junk tlist items */
-
-         if (attno >= resultDesc->natts)
-             ereport(ERROR,
-                     (errcode(ERRCODE_DATATYPE_MISMATCH),
-                      errmsg("table row type and query-specified row type do not match"),
-                      errdetail("Query has too many columns.")));
-         attr = resultDesc->attrs[attno++];
-
-         if (!attr->attisdropped)
-         {
-             /* Normal case: demand type match */
-             if (exprType((Node *) tle->expr) != attr->atttypid)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
-                                    format_type_be(attr->atttypid),
-                                    attno,
-                              format_type_be(exprType((Node *) tle->expr)))));
-         }
-         else
-         {
-             /*
-              * For a dropped column, we can't check atttypid (it's likely 0).
-              * In any case the planner has most likely inserted an INT4 null.
-              * What we insist on is just *some* NULL constant.
-              */
-             if (!IsA(tle->expr, Const) ||
-                 !((Const *) tle->expr)->constisnull)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
-                                    attno)));
-         }
-     }
-     if (attno != resultDesc->natts)
-         ereport(ERROR,
-                 (errcode(ERRCODE_DATATYPE_MISMATCH),
-           errmsg("table row type and query-specified row type do not match"),
-                  errdetail("Query has too few columns.")));
- }
-
- /*
   *        ExecGetTriggerResultRel
   *
   * Get a ResultRelInfo for a trigger target relation.  Most of the time,
--- 982,987 ----
***************
*** 1423,1430 **** ExecutePlan(EState *estate,
      JunkFilter *junkfilter;
      TupleTableSlot *planSlot;
      TupleTableSlot *slot;
-     ItemPointer tupleid = NULL;
-     ItemPointerData tuple_ctid;
      long        current_tuple_count;

      /*
--- 1213,1218 ----
***************
*** 1495,1501 **** lnext:    ;
           *
           * But first, extract all the junk information we need.
           */
!         if ((junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
--- 1283,1289 ----
           *
           * But first, extract all the junk information we need.
           */
!         if (operation == CMD_SELECT && (junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
***************
*** 1604,1635 **** lnext:    ;
                  }
              }

!             /*
!              * extract the 'ctid' junk attribute.
!              */
!             if (operation == CMD_UPDATE || operation == CMD_DELETE)
!             {
!                 Datum        datum;
!                 bool        isNull;
!
!                 datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
!                                              &isNull);
!                 /* shouldn't ever get a null result... */
!                 if (isNull)
!                     elog(ERROR, "ctid is NULL");
!
!                 tupleid = (ItemPointer) DatumGetPointer(datum);
!                 tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
!                 tupleid = &tuple_ctid;
!             }
!
!             /*
!              * Create a new "clean" tuple with all junk attributes removed. We
!              * don't need to do this for DELETE, however (there will in fact
!              * be no non-junk attributes in a DELETE!)
!              */
!             if (operation != CMD_DELETE)
!                 slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
--- 1392,1398 ----
                  }
              }

!             slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
***************
*** 1644,1658 **** lnext:    ;
                  break;

              case CMD_INSERT:
-                 ExecInsert(slot, tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_DELETE:
-                 ExecDelete(tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_UPDATE:
!                 ExecUpdate(slot, tupleid, planSlot, dest, estate);
                  break;

              default:
--- 1407,1416 ----
                  break;

              case CMD_INSERT:
              case CMD_DELETE:
              case CMD_UPDATE:
!                 if (estate->es_plannedstmt->returningLists)
!                     (*dest->receiveSlot) (slot, dest);
                  break;

              default:
***************
*** 1708,2127 **** ExecSelect(TupleTableSlot *slot,
      (estate->es_processed)++;
  }

- /* ----------------------------------------------------------------
-  *        ExecInsert
-  *
-  *        INSERTs are trickier.. we have to insert the tuple into
-  *        the base relation and insert appropriate tuples into the
-  *        index relations.
-  * ----------------------------------------------------------------
-  */
- static void
- ExecInsert(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     Oid            newId;
-     List       *recheckIndexes = NIL;
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /*
-      * If the result relation has OIDs, force the tuple's OID to zero so that
-      * heap_insert will assign a fresh OID.  Usually the OID already will be
-      * zero at this point, but there are corner cases where the plan tree can
-      * return a tuple extracted literally from some table with the same
-      * rowtype.
-      *
-      * XXX if we ever wanted to allow users to assign their own OIDs to new
-      * rows, this'd be the place to do it.  For the moment, we make a point of
-      * doing this before calling triggers, so that a user-supplied trigger
-      * could hack the OID if desired.
-      */
-     if (resultRelationDesc->rd_rel->relhasoids)
-         HeapTupleSetOid(tuple, InvalidOid);
-
-     /* BEFORE ROW INSERT Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      */
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * insert the tuple
-      *
-      * Note: heap_insert returns the tid (location) of the new tuple in the
-      * t_self field.
-      */
-     newId = heap_insert(resultRelationDesc, tuple,
-                         estate->es_output_cid, 0, NULL);
-
-     IncrAppended();
-     (estate->es_processed)++;
-     estate->es_lastoid = newId;
-     setLastTid(&(tuple->t_self));
-
-     /*
-      * insert index entries for tuple
-      */
-     if (resultRelInfo->ri_NumIndices > 0)
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW INSERT Triggers */
-     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
- /* ----------------------------------------------------------------
-  *        ExecDelete
-  *
-  *        DELETE is like UPDATE, except that we delete the tuple and no
-  *        index modifications are needed
-  * ----------------------------------------------------------------
-  */
- static void
- ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW DELETE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
-     {
-         bool        dodelete;
-
-         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
-
-         if (!dodelete)            /* "do nothing" */
-             return;
-     }
-
-     /*
-      * delete the tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be deleted is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
- ldelete:;
-     result = heap_delete(resultRelationDesc, tupleid,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     goto ldelete;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_delete status: %u", result);
-             return;
-     }
-
-     IncrDeleted();
-     (estate->es_processed)++;
-
-     /*
-      * Note: Normally one would think that we have to delete index tuples
-      * associated with the heap tuple now...
-      *
-      * ... but in POSTGRES, we have no need to do this because VACUUM will
-      * take care of it later.  We can't delete index tuples immediately
-      * anyway, since the tuple is still visible to other transactions.
-      */
-
-     /* AFTER ROW DELETE Triggers */
-     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-     {
-         /*
-          * We have to put the target tuple into a slot, which means first we
-          * gotta fetch it.    We can use the trigger tuple slot.
-          */
-         TupleTableSlot *slot = estate->es_trig_tuple_slot;
-         HeapTupleData deltuple;
-         Buffer        delbuffer;
-
-         deltuple.t_self = *tupleid;
-         if (!heap_fetch(resultRelationDesc, SnapshotAny,
-                         &deltuple, &delbuffer, false, NULL))
-             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
-
-         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
-             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
-         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
-
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
-
-         ExecClearTuple(slot);
-         ReleaseBuffer(delbuffer);
-     }
- }
-
- /* ----------------------------------------------------------------
-  *        ExecUpdate
-  *
-  *        note: we can't run UPDATE queries with transactions
-  *        off because UPDATEs are actually INSERTs and our
-  *        scan will mistakenly loop forever, updating the tuple
-  *        it just inserted..    This should be fixed but until it
-  *        is, we don't want to get stuck in an infinite loop
-  *        which corrupts your database..
-  * ----------------------------------------------------------------
-  */
- static void
- ExecUpdate(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-     List *recheckIndexes = NIL;
-
-     /*
-      * abort the operation if not running transactions
-      */
-     if (IsBootstrapProcessingMode())
-         elog(ERROR, "cannot UPDATE during bootstrap");
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW UPDATE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
-                                         tupleid, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      *
-      * If we generate a new candidate tuple after EvalPlanQual testing, we
-      * must loop back here and recheck constraints.  (We don't need to redo
-      * triggers, however.  If there are any BEFORE triggers then trigger.c
-      * will have done heap_lock_tuple to lock the correct tuple, so there's no
-      * need to do them again.)
-      */
- lreplace:;
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * replace the heap tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be updated is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
-     result = heap_update(resultRelationDesc, tupleid, tuple,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     slot = ExecFilterJunk(estate->es_junkFilter, epqslot);
-                     tuple = ExecMaterializeSlot(slot);
-                     goto lreplace;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_update status: %u", result);
-             return;
-     }
-
-     IncrReplaced();
-     (estate->es_processed)++;
-
-     /*
-      * Note: instead of having to update the old index tuples associated with
-      * the heap tuple, all we do is form and insert new index tuples. This is
-      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
-      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
-      * here is insert new index tuples.  -cim 9/27/89
-      */
-
-     /*
-      * insert index entries for tuple
-      *
-      * Note: heap_update returns the tid (location) of the new tuple in the
-      * t_self field.
-      *
-      * If it's a HOT update, we mustn't insert new index entries.
-      */
-     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW UPDATE Triggers */
-     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
-                          recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
  /*
   * ExecRelCheck --- check that tuple meets constraints for result relation
   */
--- 1466,1471 ----
***************
*** 2222,2263 **** ExecConstraints(ResultRelInfo *resultRelInfo,
  }

  /*
-  * ExecProcessReturning --- evaluate a RETURNING list and send to dest
-  *
-  * projectReturning: RETURNING projection info for current result rel
-  * tupleSlot: slot holding tuple actually inserted/updated/deleted
-  * planSlot: slot holding tuple returned by top plan node
-  * dest: where to send the output
-  */
- static void
- ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest)
- {
-     ExprContext *econtext = projectReturning->pi_exprContext;
-     TupleTableSlot *retSlot;
-
-     /*
-      * Reset per-tuple memory context to free any expression evaluation
-      * storage allocated in the previous cycle.
-      */
-     ResetExprContext(econtext);
-
-     /* Make tuple and any needed join variables available to ExecProject */
-     econtext->ecxt_scantuple = tupleSlot;
-     econtext->ecxt_outertuple = planSlot;
-
-     /* Compute the RETURNING expressions */
-     retSlot = ExecProject(projectReturning, NULL);
-
-     /* Send to dest */
-     (*dest->receiveSlot) (retSlot, dest);
-
-     ExecClearTuple(retSlot);
- }
-
- /*
   * Check a modified tuple to see if we want to process its updated version
   * under READ COMMITTED rules.
   *
--- 1566,1571 ----
*** a/src/backend/executor/execProcnode.c
--- b/src/backend/executor/execProcnode.c
***************
*** 90,95 ****
--- 90,96 ----
  #include "executor/nodeHash.h"
  #include "executor/nodeHashjoin.h"
  #include "executor/nodeIndexscan.h"
+ #include "executor/nodeDml.h"
  #include "executor/nodeLimit.h"
  #include "executor/nodeMaterial.h"
  #include "executor/nodeMergejoin.h"
***************
*** 285,290 **** ExecInitNode(Plan *node, EState *estate, int eflags)
--- 286,296 ----
                                                   estate, eflags);
              break;

+         case T_Dml:
+             result = (PlanState *) ExecInitDml((Dml *) node,
+                                                  estate, eflags);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;        /* keep compiler quiet */
***************
*** 450,455 **** ExecProcNode(PlanState *node)
--- 456,465 ----
              result = ExecLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             result = ExecDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;
***************
*** 666,671 **** ExecEndNode(PlanState *node)
--- 676,685 ----
              ExecEndLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             ExecEndDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
*** a/src/backend/executor/nodeAppend.c
--- b/src/backend/executor/nodeAppend.c
***************
*** 103,123 **** exec_append_initialize_next(AppendState *appendstate)
      }
      else
      {
-         /*
-          * initialize the scan
-          *
-          * If we are controlling the target relation, select the proper active
-          * ResultRelInfo and junk filter for this target.
-          */
-         if (((Append *) appendstate->ps.plan)->isTarget)
-         {
-             Assert(whichplan < estate->es_num_result_relations);
-             estate->es_result_relation_info =
-                 estate->es_result_relations + whichplan;
-             estate->es_junkFilter =
-                 estate->es_result_relation_info->ri_junkFilter;
-         }
-
          return TRUE;
      }
  }
--- 103,108 ----
***************
*** 164,189 **** ExecInitAppend(Append *node, EState *estate, int eflags)
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!     /*
!      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
!      * XXX pretty dirty way of determining that this case applies ...
!      */
!     if (node->isTarget && estate->es_evTuple != NULL)
!     {
!         int            tplan;
!
!         tplan = estate->es_result_relation_info - estate->es_result_relations;
!         Assert(tplan >= 0 && tplan < nplans);
!
!         appendstate->as_firstplan = tplan;
!         appendstate->as_lastplan = tplan;
!     }
!     else
!     {
!         /* normal case, scan all subplans */
!         appendstate->as_firstplan = 0;
!         appendstate->as_lastplan = nplans - 1;
!     }

      /*
       * Miscellaneous initialization
--- 149,157 ----
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!
!     appendstate->as_firstplan = 0;
!     appendstate->as_lastplan = nplans - 1;

      /*
       * Miscellaneous initialization
*** /dev/null
--- b/src/backend/executor/nodeDml.c
***************
*** 0 ****
--- 1,834 ----
+ #include "postgres.h"
+
+ #include "access/xact.h"
+ #include "parser/parsetree.h"
+ #include "executor/executor.h"
+ #include "executor/execdebug.h"
+ #include "executor/nodeDml.h"
+ #include "commands/trigger.h"
+ #include "nodes/nodeFuncs.h"
+ #include "utils/memutils.h"
+ #include "utils/builtins.h"
+ #include "utils/tqual.h"
+ #include "storage/bufmgr.h"
+ #include "miscadmin.h"
+
+ /*
+  * Verify that the tuples to be produced by INSERT or UPDATE match the
+  * target relation's rowtype
+  *
+  * We do this to guard against stale plans.  If plan invalidation is
+  * functioning properly then we should never get a failure here, but better
+  * safe than sorry.  Note that this is called after we have obtained lock
+  * on the target rel, so the rowtype can't change underneath us.
+  *
+  * The plan output is represented by its targetlist, because that makes
+  * handling the dropped-column case easier.
+  */
+ static void
+ ExecCheckPlanOutput(Relation resultRel, List *targetList)
+ {
+     TupleDesc    resultDesc = RelationGetDescr(resultRel);
+     int            attno = 0;
+     ListCell   *lc;
+
+     foreach(lc, targetList)
+     {
+         TargetEntry *tle = (TargetEntry *) lfirst(lc);
+         Form_pg_attribute attr;
+
+         if (tle->resjunk)
+             continue;            /* ignore junk tlist items */
+
+         if (attno >= resultDesc->natts)
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("table row type and query-specified row type do not match"),
+                      errdetail("Query has too many columns.")));
+         attr = resultDesc->attrs[attno++];
+
+         if (!attr->attisdropped)
+         {
+             /* Normal case: demand type match */
+             if (exprType((Node *) tle->expr) != attr->atttypid)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
+                                    format_type_be(attr->atttypid),
+                                    attno,
+                              format_type_be(exprType((Node *) tle->expr)))));
+         }
+         else
+         {
+             /*
+              * For a dropped column, we can't check atttypid (it's likely 0).
+              * In any case the planner has most likely inserted an INT4 null.
+              * What we insist on is just *some* NULL constant.
+              */
+             if (!IsA(tle->expr, Const) ||
+                 !((Const *) tle->expr)->constisnull)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
+                                    attno)));
+         }
+     }
+     if (attno != resultDesc->natts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+           errmsg("table row type and query-specified row type do not match"),
+                  errdetail("Query has too few columns.")));
+ }
+
+ static TupleTableSlot*
+ ExecProcessReturning(ProjectionInfo *projectReturning,
+                      TupleTableSlot *tupleSlot,
+                      TupleTableSlot *planSlot)
+ {
+     ExprContext *econtext = projectReturning->pi_exprContext;
+     TupleTableSlot *retSlot;
+
+     /*
+      * Reset per-tuple memory context to free any expression evaluation
+      * storage allocated in the previous cycle.
+      */
+     ResetExprContext(econtext);
+
+     /* Make tuple and any needed join variables available to ExecProject */
+     econtext->ecxt_scantuple = tupleSlot;
+     econtext->ecxt_outertuple = planSlot;
+
+     /* Compute the RETURNING expressions */
+     retSlot = ExecProject(projectReturning, NULL);
+
+     return retSlot;
+ }
+
+ static TupleTableSlot *
+ ExecInsert(TupleTableSlot *slot,
+             ItemPointer tupleid,
+             TupleTableSlot *planSlot,
+             EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     Oid            newId;
+     List        *recheckIndexes = NIL;
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relations;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /*
+      * If the result relation has OIDs, force the tuple's OID to zero so that
+      * heap_insert will assign a fresh OID.  Usually the OID already will be
+      * zero at this point, but there are corner cases where the plan tree can
+      * return a tuple extracted literally from some table with the same
+      * rowtype.
+      *
+      * XXX if we ever wanted to allow users to assign their own OIDs to new
+      * rows, this'd be the place to do it.  For the moment, we make a point of
+      * doing this before calling triggers, so that a user-supplied trigger
+      * could hack the OID if desired.
+      */
+     if (resultRelationDesc->rd_rel->relhasoids)
+         HeapTupleSetOid(tuple, InvalidOid);
+
+     /* BEFORE ROW INSERT Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return NULL;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      */
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * insert the tuple
+      *
+      * Note: heap_insert returns the tid (location) of the new tuple in the
+      * t_self field.
+      */
+     newId = heap_insert(resultRelationDesc, tuple,
+                         estate->es_output_cid, 0, NULL);
+
+     IncrAppended();
+     (estate->es_processed)++;
+     estate->es_lastoid = newId;
+     setLastTid(&(tuple->t_self));
+
+     /*
+      * insert index entries for tuple
+      */
+     if (resultRelInfo->ri_NumIndices > 0)
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false);
+
+     /* AFTER ROW INSERT Triggers */
+     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+     return slot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecDelete
+  *
+  *        DELETE is like UPDATE, except that we delete the tuple and no
+  *        index modifications are needed
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecDelete(ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     ResultRelInfo* resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW DELETE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
+     {
+         bool        dodelete;
+
+         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
+
+         if (!dodelete)            /* "do nothing" */
+             return planSlot;
+     }
+
+     /*
+      * delete the tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be deleted is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+ ldelete:;
+     result = heap_delete(resultRelationDesc, tupleid,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     goto ldelete;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_delete status: %u", result);
+             return NULL;
+     }
+
+     IncrDeleted();
+     (estate->es_processed)++;
+
+     /*
+      * Note: Normally one would think that we have to delete index tuples
+      * associated with the heap tuple now...
+      *
+      * ... but in POSTGRES, we have no need to do this because VACUUM will
+      * take care of it later.  We can't delete index tuples immediately
+      * anyway, since the tuple is still visible to other transactions.
+      */
+
+     /* AFTER ROW DELETE Triggers */
+     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+     {
+         /*
+          * We have to put the target tuple into a slot, which means first we
+          * gotta fetch it.    We can use the trigger tuple slot.
+          */
+         TupleTableSlot *slot = estate->es_trig_tuple_slot;
+         HeapTupleData deltuple;
+         Buffer        delbuffer;
+
+         deltuple.t_self = *tupleid;
+         if (!heap_fetch(resultRelationDesc, SnapshotAny,
+                         &deltuple, &delbuffer, false, NULL))
+             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
+
+         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
+             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
+         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
+
+         planSlot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+         ExecClearTuple(slot);
+         ReleaseBuffer(delbuffer);
+     }
+
+     return planSlot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecUpdate
+  *
+  *        note: we can't run UPDATE queries with transactions
+  *        off because UPDATEs are actually INSERTs and our
+  *        scan will mistakenly loop forever, updating the tuple
+  *        it just inserted..    This should be fixed but until it
+  *        is, we don't want to get stuck in an infinite loop
+  *        which corrupts your database..
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecUpdate(TupleTableSlot *slot,
+            ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+     List *recheckIndexes = NIL;
+
+     /*
+      * abort the operation if not running transactions
+      */
+     if (IsBootstrapProcessingMode())
+         elog(ERROR, "cannot UPDATE during bootstrap");
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW UPDATE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
+                                         tupleid, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return planSlot;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      *
+      * If we generate a new candidate tuple after EvalPlanQual testing, we
+      * must loop back here and recheck constraints.  (We don't need to redo
+      * triggers, however.  If there are any BEFORE triggers then trigger.c
+      * will have done heap_lock_tuple to lock the correct tuple, so there's no
+      * need to do them again.)
+      */
+ lreplace:;
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * replace the heap tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be updated is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+     result = heap_update(resultRelationDesc, tupleid, tuple,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     slot = ExecFilterJunk(estate->es_result_relation_info->ri_junkFilter, epqslot);
+                     tuple = ExecMaterializeSlot(slot);
+                     goto lreplace;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_update status: %u", result);
+             return NULL;
+     }
+
+     IncrReplaced();
+     (estate->es_processed)++;
+
+     /*
+      * Note: instead of having to update the old index tuples associated with
+      * the heap tuple, all we do is form and insert new index tuples. This is
+      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
+      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
+      * here is insert new index tuples.  -cim 9/27/89
+      */
+
+     /*
+      * insert index entries for tuple
+      *
+      * Note: heap_update returns the tid (location) of the new tuple in the
+      * t_self field.
+      *
+      * If it's a HOT update, we mustn't insert new index entries.
+      */
+     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+                                                estate, false);
+
+     /* AFTER ROW UPDATE Triggers */
+     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
+                          recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                                     slot, planSlot);
+
+     return slot;
+ }
+
+ TupleTableSlot *
+ ExecDml(DmlState *node)
+ {
+     CmdType operation = node->operation;
+     EState *estate = node->ps.state;
+     JunkFilter *junkfilter;
+     TupleTableSlot *slot;
+     TupleTableSlot *planSlot;
+     ItemPointer tupleid = NULL;
+     ItemPointerData tuple_ctid;
+
+     for (;;)
+     {
+         planSlot = ExecProcNode(node->dmlplans[node->ds_whichplan]);
+         if (TupIsNull(planSlot))
+         {
+             node->ds_whichplan++;
+             if (node->ds_whichplan < node->ds_nplans)
+             {
+                 estate->es_result_relation_info++;
+                 continue;
+             }
+             else
+                 return NULL;
+         }
+         else
+             break;
+     }
+
+     slot = planSlot;
+
+     if ((junkfilter = estate->es_result_relation_info->ri_junkFilter) != NULL)
+     {
+         /*
+          * extract the 'ctid' junk attribute.
+          */
+         if (operation == CMD_UPDATE || operation == CMD_DELETE)
+         {
+             Datum        datum;
+             bool        isNull;
+
+             datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
+                                              &isNull);
+             /* shouldn't ever get a null result... */
+             if (isNull)
+                 elog(ERROR, "ctid is NULL");
+
+             tupleid = (ItemPointer) DatumGetPointer(datum);
+             tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
+             tupleid = &tuple_ctid;
+         }
+
+         if (operation != CMD_DELETE)
+             slot = ExecFilterJunk(junkfilter, slot);
+     }
+
+     switch (operation)
+     {
+         case CMD_INSERT:
+             return ExecInsert(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_UPDATE:
+             return ExecUpdate(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_DELETE:
+             return ExecDelete(tupleid, slot, estate);
+         default:
+             elog(ERROR, "unknown operation");
+             break;
+     }
+
+     return NULL;
+ }
+
+ DmlState *
+ ExecInitDml(Dml *node, EState *estate, int eflags)
+ {
+     DmlState *dmlstate;
+     ResultRelInfo *resultRelInfo;
+     Plan *subplan;
+     ListCell *l;
+     CmdType operation = node->operation;
+     int nplans;
+     int i;
+
+     TupleDesc tupDesc;
+
+     nplans = list_length(node->plans);
+
+     /*
+      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
+      * XXX pretty dirty way of determining that this case applies ...
+      */
+     if (estate->es_evTuple != NULL)
+     {
+         int tplan;
+
+         tplan = estate->es_result_relation_info - estate->es_result_relations;
+         Assert(tplan >= 0 && tplan < nplans);
+
+         /*
+          * We don't want another DmlNode on top, so just
+          * return a PlanState for the subplan wanted.
+          */
+         return (DmlState *) ExecInitNode(list_nth(node->plans, tplan), estate, eflags);
+     }
+
+     /*
+      * create state structure
+      */
+     dmlstate = makeNode(DmlState);
+     dmlstate->ps.plan = (Plan *) node;
+     dmlstate->ps.state = estate;
+     dmlstate->ps.targetlist = node->plan.targetlist;
+
+     dmlstate->ds_nplans = nplans;
+     dmlstate->dmlplans = (PlanState **) palloc0(sizeof(PlanState *) * nplans);
+     dmlstate->operation = node->operation;
+
+     estate->es_result_relation_info = estate->es_result_relations;
+     i = 0;
+     foreach(l, node->plans)
+     {
+         subplan = lfirst(l);
+
+         dmlstate->dmlplans[i] = ExecInitNode(subplan, estate, eflags);
+
+         i++;
+         estate->es_result_relation_info++;
+     }
+
+     estate->es_result_relation_info = estate->es_result_relations;
+
+     dmlstate->ds_whichplan = 0;
+
+     subplan = (Plan *) linitial(node->plans);
+
+     if (node->returningLists)
+     {
+         TupleTableSlot *slot;
+         ExprContext *econtext;
+
+         /*
+          * Initialize result tuple slot and assign
+          * type from the RETURNING list.
+          */
+         tupDesc = ExecTypeFromTL((List *) linitial(node->returningLists),
+                                  false);
+
+         /*
+          * Set up a slot for the output of the RETURNING projection(s).
+          */
+         slot = ExecAllocTableSlot(&estate->es_tupleTable);
+         ExecSetSlotDescriptor(slot, tupDesc);
+
+         econtext = CreateExprContext(estate);
+
+         Assert(list_length(node->returningLists) == estate->es_num_result_relations);
+         resultRelInfo = estate->es_result_relations;
+         foreach(l, node->returningLists)
+         {
+             List       *rlist = (List *) lfirst(l);
+             List       *rliststate;
+
+             rliststate = (List *) ExecInitExpr((Expr *) rlist, &dmlstate->ps);
+             resultRelInfo->ri_projectReturning =
+                 ExecBuildProjectionInfo(rliststate, econtext, slot,
+                                         resultRelInfo->ri_RelationDesc->rd_att);
+             resultRelInfo++;
+         }
+
+         dmlstate->ps.ps_ResultTupleSlot = slot;
+         dmlstate->ps.ps_ExprContext = econtext;
+     }
+     else
+     {
+         ExecInitResultTupleSlot(estate, &dmlstate->ps);
+         tupDesc = ExecTypeFromTL(subplan->targetlist, false);
+         ExecAssignResultType(&dmlstate->ps, tupDesc);
+
+         dmlstate->ps.ps_ExprContext = NULL;
+     }
+
+     /*
+      * Initialize the junk filter if needed. INSERT queries need a filter
+      * if there are any junk attrs in the tlist.  UPDATE and DELETE
+      * always need a filter, since there's always a junk 'ctid' attribute
+      * present --- no need to look first.
+      *
+      * This section of code is also a convenient place to verify that the
+      * output of an INSERT or UPDATE matches the target table(s).
+      */
+     {
+         bool        junk_filter_needed = false;
+         ListCell   *tlist;
+
+         switch (operation)
+         {
+             case CMD_INSERT:
+                 foreach(tlist, subplan->targetlist)
+                 {
+                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);
+
+                     if (tle->resjunk)
+                     {
+                         junk_filter_needed = true;
+                         break;
+                     }
+                 }
+                 break;
+             case CMD_UPDATE:
+             case CMD_DELETE:
+                 junk_filter_needed = true;
+                 break;
+             default:
+                 break;
+         }
+
+         resultRelInfo = estate->es_result_relations;
+
+         if (junk_filter_needed)
+         {
+             /*
+              * If there are multiple result relations, each one needs its own
+              * junk filter.  Note this is only possible for UPDATE/DELETE, so
+              * we can't be fooled by some needing a filter and some not.
+
+              */
+             if (nplans > 1)
+             {
+                 for (i = 0; i < nplans; i++)
+                 {
+                     PlanState *ps = dmlstate->dmlplans[i];
+                     JunkFilter *j;
+
+                     if (operation == CMD_UPDATE)
+                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                             ps->plan->targetlist);
+
+                     j = ExecInitJunkFilter(ps->plan->targetlist,
+                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                 ExecInitExtraTupleSlot(estate));
+
+
+                     /*
+                      * Since it must be UPDATE/DELETE, there had better be a
+                      * "ctid" junk attribute in the tlist ... but ctid could
+                      * be at a different resno for each result relation. We
+                      * look up the ctid resnos now and save them in the
+                      * junkfilters.
+                      */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                     resultRelInfo->ri_junkFilter = j;
+                     resultRelInfo++;
+                 }
+             }
+             else
+             {
+                 JunkFilter *j;
+                 subplan = dmlstate->dmlplans[0]->plan;
+
+                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
+                     ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                         subplan->targetlist);
+
+                 j = ExecInitJunkFilter(subplan->targetlist,
+                                        resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                        ExecInitExtraTupleSlot(estate));
+
+                 if (operation == CMD_UPDATE || operation == CMD_DELETE)
+                 {
+                     /* FOR UPDATE/DELETE, find the ctid junk attr now */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                 }
+
+                 estate->es_result_relation_info->ri_junkFilter = j;
+             }
+         }
+         else
+         {
+             if (operation == CMD_INSERT)
+                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
+                                     subplan->targetlist);
+         }
+     }
+
+     return dmlstate;
+ }
+
+ void
+ ExecEndDml(DmlState *node)
+ {
+     int i;
+
+     /*
+      * Free the exprcontext
+      */
+     ExecFreeExprContext(&node->ps);
+
+     /*
+      * clean out the tuple table
+      */
+     ExecClearTuple(node->ps.ps_ResultTupleSlot);
+
+     /*
+      * shut down subplans
+      */
+     for (i=0;i<node->ds_nplans;++i)
+     {
+         ExecEndNode(node->dmlplans[i]);
+     }
+
+     pfree(node->dmlplans);
+ }
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 171,177 **** _copyAppend(Append *from)
       * copy remainder of node
       */
      COPY_NODE_FIELD(appendplans);
-     COPY_SCALAR_FIELD(isTarget);

      return newnode;
  }
--- 171,176 ----
***************
*** 1391,1396 **** _copyXmlExpr(XmlExpr *from)
--- 1390,1411 ----

      return newnode;
  }
+
+ static Dml *
+ _copyDml(Dml *from)
+ {
+     Dml    *newnode = makeNode(Dml);
+
+     CopyPlanFields((Plan *) from, (Plan *) newnode);
+
+     COPY_NODE_FIELD(plans);
+     COPY_SCALAR_FIELD(operation);
+     COPY_NODE_FIELD(resultRelations);
+     COPY_NODE_FIELD(returningLists);
+
+     return newnode;
+ }
+

  /*
   * _copyNullIfExpr (same as OpExpr)
***************
*** 4097,4102 **** copyObject(void *from)
--- 4112,4120 ----
          case T_XmlSerialize:
              retval = _copyXmlSerialize(from);
              break;
+         case T_Dml:
+             retval = _copyDml(from);
+             break;

          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 326,332 **** _outAppend(StringInfo str, Append *node)
      _outPlanInfo(str, (Plan *) node);

      WRITE_NODE_FIELD(appendplans);
-     WRITE_BOOL_FIELD(isTarget);
  }

  static void
--- 326,331 ----
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 579,585 **** create_append_plan(PlannerInfo *root, AppendPath *best_path)
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, false, tlist);

      return (Plan *) plan;
  }
--- 579,585 ----
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, tlist);

      return (Plan *) plan;
  }
***************
*** 2621,2627 **** make_worktablescan(List *qptlist,
  }

  Append *
! make_append(List *appendplans, bool isTarget, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
--- 2621,2627 ----
  }

  Append *
! make_append(List *appendplans, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
***************
*** 2657,2663 **** make_append(List *appendplans, bool isTarget, List *tlist)
      plan->lefttree = NULL;
      plan->righttree = NULL;
      node->appendplans = appendplans;
-     node->isTarget = isTarget;

      return node;
  }
--- 2657,2662 ----
***************
*** 3665,3670 **** make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
--- 3664,3720 ----
      return node;
  }

+ Dml *
+ make_dml(List *subplans, List *returningLists, List *resultRelations, CmdType operation)
+ {
+     Dml *node = makeNode(Dml);
+     Plan       *plan = &node->plan;
+     double        total_size;
+     ListCell   *subnode;
+
+     Assert(list_length(subplans) == list_length(resultRelations));
+     Assert(!returningLists || list_length(returningLists) == list_length(resultRelations));
+
+     /*
+      * Compute cost as sum of subplan costs.
+      */
+     plan->startup_cost = 0;
+     plan->total_cost = 0;
+     plan->plan_rows = 0;
+     total_size = 0;
+     foreach(subnode, subplans)
+     {
+         Plan       *subplan = (Plan *) lfirst(subnode);
+
+         if (subnode == list_head(subplans))    /* first node? */
+             plan->startup_cost = subplan->startup_cost;
+         plan->total_cost += subplan->total_cost;
+         plan->plan_rows += subplan->plan_rows;
+         total_size += subplan->plan_width * subplan->plan_rows;
+     }
+     if (plan->plan_rows > 0)
+         plan->plan_width = rint(total_size / plan->plan_rows);
+     else
+         plan->plan_width = 0;
+
+     node->plan.lefttree = NULL;
+     node->plan.righttree = NULL;
+     node->plan.qual = NIL;
+
+     if (returningLists)
+         node->plan.targetlist = linitial(returningLists);
+     else
+         node->plan.targetlist = NIL;
+
+     node->plans = subplans;
+     node->resultRelations = resultRelations;
+     node->returningLists = returningLists;
+
+     node->operation = operation;
+
+     return node;
+ }
+
  /*
   * make_result
   *      Build a Result plan node
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 478,485 **** subquery_planner(PlannerGlobal *glob, Query *parse,
--- 478,494 ----
          rt_fetch(parse->resultRelation, parse->rtable)->inh)
          plan = inheritance_planner(root);
      else
+     {
          plan = grouping_planner(root, tuple_fraction);

+         if (parse->commandType != CMD_SELECT)
+             plan = (Plan *) make_dml(list_make1(plan),
+                                      root->returningLists,
+                                      root->resultRelations,
+                                      parse->commandType);
+     }
+
+
      /*
       * If any subplans were generated, or if we're inside a subplan, build
       * initPlan list and extParam/allParam sets for plan nodes, and attach the
***************
*** 624,632 **** preprocess_qual_conditions(PlannerInfo *root, Node *jtnode)
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Also, for both UPDATE
!  * and DELETE, the executor needs the Append plan node at the top, else it
!  * can't keep track of which table is the current target table.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
--- 633,639 ----
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
***************
*** 737,747 **** inheritance_planner(PlannerInfo *root)
       */
      parse->rtable = rtable;

!     /* Suppress Append if there's only one surviving child rel */
!     if (list_length(subplans) == 1)
!         return (Plan *) linitial(subplans);
!
!     return (Plan *) make_append(subplans, true, tlist);
  }

  /*--------------------
--- 744,753 ----
       */
      parse->rtable = rtable;

!     return (Plan *) make_dml(subplans,
!                              root->returningLists,
!                              root->resultRelations,
!                              parse->commandType);
  }

  /*--------------------
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 375,380 **** set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset)
--- 375,403 ----
              set_join_references(glob, (Join *) plan, rtoffset);
              break;

+         case T_Dml:
+             {
+                 /*
+                  * grouping_planner() already called set_returning_clause_references
+                  * so the targetList's references are already set.
+                  */
+                 Dml *splan = (Dml *) plan;
+
+                 foreach(l, splan->resultRelations)
+                 {
+                     lfirst_int(l) += rtoffset;
+                 }
+
+                 Assert(splan->plan.qual == NIL);
+                 foreach(l, splan->plans)
+                 {
+                     lfirst(l) = set_plan_refs(glob,
+                                               (Plan *) lfirst(l),
+                                               rtoffset);
+                 }
+             }
+             break;
+
          case T_Hash:
          case T_Material:
          case T_Sort:
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 2018,2023 **** finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params)
--- 2018,2024 ----
          case T_Unique:
          case T_SetOp:
          case T_Group:
+         case T_Dml:
              break;

          default:
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
***************
*** 448,454 **** generate_union_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
--- 448,454 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
***************
*** 539,545 **** generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
--- 539,545 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
*** /dev/null
--- b/src/include/executor/nodeDml.h
***************
*** 0 ****
--- 1,10 ----
+ #ifndef NODEDML_H
+ #define NODEDML_H
+
+ #include "nodes/execnodes.h"
+
+ extern DmlState *ExecInitDml(Dml *node, EState *estate, int eflags);
+ extern TupleTableSlot *ExecDml(DmlState *node);
+ extern void ExecEndDml(DmlState *node);
+
+ #endif
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 976,981 **** typedef struct ResultState
--- 976,996 ----
  } ResultState;

  /* ----------------
+  *     DmlState information
+  * ----------------
+  */
+ typedef struct DmlState
+ {
+     PlanState        ps;                /* its first field is NodeTag */
+     PlanState      **dmlplans;
+     int                ds_nplans;
+     int                ds_whichplan;
+
+     CmdType            operation;
+ } DmlState;
+
+
+ /* ----------------
   *     AppendState information
   *
   *        nplans            how many plans are in the list
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 71,76 **** typedef enum NodeTag
--- 71,77 ----
      T_Hash,
      T_SetOp,
      T_Limit,
+     T_Dml,
      /* this one isn't a subclass of Plan: */
      T_PlanInvalItem,

***************
*** 190,195 **** typedef enum NodeTag
--- 191,197 ----
      T_NullTestState,
      T_CoerceToDomainState,
      T_DomainConstraintState,
+     T_DmlState,

      /*
       * TAGS FOR PLANNER NODES (relation.h)
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 164,185 **** typedef struct Result
      Node       *resconstantqual;
  } Result;

  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
-  *
-  * Append nodes are sometimes used to switch between several result relations
-  * (when the target of an UPDATE or DELETE is an inheritance set).    Such a
-  * node will have isTarget true.  The Append executor is then responsible
-  * for updating the executor state to point at the correct target relation
-  * whenever it switches subplans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
-     bool        isTarget;
  } Append;

  /* ----------------
--- 164,189 ----
      Node       *resconstantqual;
  } Result;

+ typedef struct Dml
+ {
+     Plan        plan;
+
+     CmdType        operation;
+     List       *plans;
+     List       *resultRelations;
+     List       *returningLists;
+ } Dml;
+
+
  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
  } Append;

  /* ----------------
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 41,47 **** extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, bool isTarget, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
--- 41,47 ----
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
***************
*** 69,74 **** extern Plan *materialize_finished_plan(Plan *subplan);
--- 69,76 ----
  extern Unique *make_unique(Plan *lefttree, List *distinctList);
  extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
             int64 offset_est, int64 count_est);
+ extern Dml *make_dml(List *subplans, List *returningLists, List *resultRelation,
+             CmdType operation);
  extern SetOp *make_setop(SetOpCmd cmd, SetOpStrategy strategy, Plan *lefttree,
             List *distinctList, AttrNumber flagColIdx, int firstFlag,
             long numGroups, double outputRows);

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

28 September 2009, 13:05:18

On Mon, Sep 28, 2009 at 11:31 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:
> Robert Haas wrote:
>>
>> So I think we should at a minimum ask the patch author to (1) fix the
>> explain bugs I found and (2) update the README, as well as (3) revert
>> needless whitespace changes - there are a couple in execMain.c, from
>> the looks of it.
>
> In the attached patch, I made the changes to explain as you suggested and
> reverted the only whitespace change I could find from execMain.c. However,
> English isn't my first language so I'm not very confident about fixing the
> README.

Can you at least take a stab at it?  We can fix your grammar, but
guessing what's going on without documentation is hard.

...Robert

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

28 September 2009, 16:19:48

Robert Haas wrote:
> Can you at least take a stab at it?  We can fix your grammar, but
> guessing what's going on without documentation is hard.

With some help from David Fetter, I took another try at it.  I hope
someone finds this helpful.  I'm happy to answer any questions.

Regards,
Marko Tiikkaja
*** a/src/backend/executor/README
--- b/src/backend/executor/README
***************
*** 25,38 **** There is a moderately intelligent scheme to avoid rescanning nodes
  unnecessarily (for example, Sort does not rescan its input if no parameters
  of the input have changed, since it can just reread its stored sorted data).

! The plan tree concept implements SELECT directly: it is only necessary to
! deliver the top-level result tuples to the client, or insert them into
! another table in the case of INSERT ... SELECT.  (INSERT ... VALUES is
! handled similarly, but the plan tree is just a Result node with no source
! tables.)  For UPDATE, the plan tree selects the tuples that need to be
! updated (WHERE condition) and delivers a new calculated tuple value for each
! such tuple, plus a "junk" (hidden) tuple CTID identifying the target tuple.
! The executor's top level then uses this information to update the correct
  tuple.  DELETE is similar to UPDATE except that only a CTID need be
  delivered by the plan tree.

--- 25,42 ----
  unnecessarily (for example, Sort does not rescan its input if no parameters
  of the input have changed, since it can just reread its stored sorted data).

! It is only necessary to deliver the top-level result tuples to the client.
! If the query is a SELECT, the topmost plan node is the output of the SELECT.
! If the query is a DML operation, a DML node is added to the top, which calls
! its child nodes to fetch the tuples.  If the DML operation has a RETURNING
! clause the node will output the projected tuples, otherwise it gives out
! dummy tuples until it has processed all tuples from its child nodes.  After
! that, it gives NULL.
!
! Handling INSERT is pretty straightforward: the tuples returned from the
! subtree are inserted into the correct result relation.  In addition to the
! tuple value, UPDATE needs a "junk" (hidden) tuple CTID identifying the
! target tuple.  The DML node needs this information to update the correct
  tuple.  DELETE is similar to UPDATE except that only a CTID need be
  delivered by the plan tree.

Re: Using results from INSERT ... RETURNING

From

Alvaro Herrera

Date:

29 September 2009, 12:30:14

Marko Tiikkaja escribió:
> Robert Haas wrote:
> >So I think we should at a minimum ask the patch author to (1) fix the
> >explain bugs I found and (2) update the README, as well as (3) revert
> >needless whitespace changes - there are a couple in execMain.c, from
> >the looks of it.
> 
> In the attached patch, I made the changes to explain as you
> suggested and reverted the only whitespace change I could find from
> execMain.c. However, English isn't my first language so I'm not very
> confident about fixing the README.

BTW what was the conclusion of the idea about having three separate
nodes Insert, Delete, Update instead of a single Dml node?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

29 September 2009, 12:48:55

Alvaro Herrera <alvherre@commandprompt.com> writes:
> BTW what was the conclusion of the idea about having three separate
> nodes Insert, Delete, Update instead of a single Dml node?

If we stick with a single node type then I'd suggest calling it
something like ModifyTable.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

29 September 2009, 13:33:10

On Tue, Sep 29, 2009 at 11:29 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Marko Tiikkaja escribió:
>> Robert Haas wrote:
>> >So I think we should at a minimum ask the patch author to (1) fix the
>> >explain bugs I found and (2) update the README, as well as (3) revert
>> >needless whitespace changes - there are a couple in execMain.c, from
>> >the looks of it.
>>
>> In the attached patch, I made the changes to explain as you
>> suggested and reverted the only whitespace change I could find from
>> execMain.c. However, English isn't my first language so I'm not very
>> confident about fixing the README.
>
> BTW what was the conclusion of the idea about having three separate
> nodes Insert, Delete, Update instead of a single Dml node?

It wasn't obvious from reading the patch why multiple node types would
be superior; but I'm not 100% sure I understand what Tom had in mind,
either.

...Robert

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

29 September 2009, 13:36:53

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Sep 29, 2009 at 11:29 AM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>> BTW what was the conclusion of the idea about having three separate
>> nodes Insert, Delete, Update instead of a single Dml node?

> It wasn't obvious from reading the patch why multiple node types would
> be superior; but I'm not 100% sure I understand what Tom had in mind,
> either.

I thought they would be simpler/faster individually.  However, that
might not be enough to outweigh 3x repetition of the per-node-type
boilerplate.  I haven't read the patch yet so this isn't an informed
recommendation, just a suggestion of an alternative to consider.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

01 October 2009, 23:48:55

On Mon, Sep 28, 2009 at 3:19 PM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:
> Robert Haas wrote:
>>
>> Can you at least take a stab at it?  We can fix your grammar, but
>> guessing what's going on without documentation is hard.
>
> With some help from David Fetter, I took another try at it.  I hope
> someone finds this helpful.  I'm happy to answer any questions.

Thanks.  I read through this patch some more tonight and I guess I am
a bit confused about what it accomplishes.  AIUI, the point here is to
lay the groundwork for a future patch to allow writeable CTEs, and I
guess I'm not understanding how it's going to do that.

rhaas=# create table project (id serial primary key, name varchar not
null);NOTICE:  CREATE TABLE will create implicit sequence
"project_id_seq" for serial column "project.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"project_pkey" for table "project"
CREATE TABLE
rhaas=# create table shadow (id integer not null primary key, name
varchar not null);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"shadow_pkey" for table "shadow"
CREATE TABLE
rhaas=# create rule clone as on insert to project do also insert into
shadow (id, name) values (NEW.id, NEW.name);
CREATE RULE
rhaas=# insert into project (name) values ('Writeable CTEs') returning id;id
---- 1
(1 row)

INSERT 0 1
rhaas=# explain insert into project (name) values ('Writeable CTEs')
returning id;                  QUERY PLAN
------------------------------------------------Insert  (cost=0.00..0.01 rows=1 width=0)  ->  Result  (cost=0.00..0.01
rows=1width=0) 
Insert  (cost=0.00..0.01 rows=1 width=0)  ->  Result  (cost=0.00..0.01 rows=1 width=0)
(5 rows)

Now the point here is that I eventually want to be able to write
something like this:

with foo as (insert into project (name) values ('Writeable CTEs')
returning id) select * from foo;

...but how does this get me any closer?  It seems to me that the plan
for THAT statement has to be a CTE scan over top of BOTH of the
inserts, but here I have two insert nodes that comprise two separate
plans.  The DML node, as presently implemented, supports a list of
plans, but they all have to be of the same type, so it's really only
useful for handling append, and as previously discussed, it's not
clear that the proposed handling is any better than what we already
have.

What am I missing?

...Robert

Re: Using results from INSERT ... RETURNING

From

David Fetter

Date:

01 October 2009, 23:53:13

On Thu, Oct 01, 2009 at 10:48:41PM -0400, Robert Haas wrote:
> On Mon, Sep 28, 2009 at 3:19 PM, Marko Tiikkaja
> <marko.tiikkaja@cs.helsinki.fi> wrote:
> > Robert Haas wrote:
> >>
> >> Can you at least take a stab at it?  We can fix your grammar, but
> >> guessing what's going on without documentation is hard.
> >
> > With some help from David Fetter, I took another try at it.  I
> > hope someone finds this helpful.  I'm happy to answer any
> > questions.
> 
> Thanks.  I read through this patch some more tonight and I guess I
> am a bit confused about what it accomplishes.  AIUI, the point here
> is to lay the groundwork for a future patch to allow writeable CTEs,
> and I guess I'm not understanding how it's going to do that.

There's another branch in the repository
<http://git.postgresql.org/gitweb?p=writeable_cte.git> called
actually_write which has the beginnings of an implementation based on
this.  If you'd like, I can send along either the patch vs.
writeable_cte, or a patch against the main branch.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

02 October 2009, 07:33:08

Robert Haas wrote:
> Thanks.  I read through this patch some more tonight and I guess I am
> a bit confused about what it accomplishes.  AIUI, the point here is to
> lay the groundwork for a future patch to allow writeable CTEs, and I
> guess I'm not understanding how it's going to do that.

The purpose of this patch is only to move INSERT, UPDATE and DELETE into
plan nodes because for writeable CTEs, we can't use the currently
existing way of handling those operations.  My previous approach was to
only add a special-case node for RETURNING and use the top-level
INSERT/UPDATE/DELETE handling like now, but that would've lead to
copying a lot of code, as Tom pointed out in
http://archives.postgresql.org/pgsql-hackers/2009-07/msg01217.php .  In
that same message, Tom suggested the current approach which to me seems
like the best way to tackle this.  This patch alone doesn't accomplish
anything, but supporting writeable CTEs will be a lot easier as seen in
the git repo David pointed you to.

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

02 October 2009, 08:37:42

Robert Haas wrote:
> Now the point here is that I eventually want to be able to write
> something like this:
> 
> with foo as (insert into project (name) values ('Writeable CTEs')
> returning id) select * from foo;
> 
> ...but how does this get me any closer?  It seems to me that the plan
> for THAT statement has to be a CTE scan over top of BOTH of the
> inserts, but here I have two insert nodes that comprise two separate
> plans.  The DML node, as presently implemented, supports a list of
> plans, but they all have to be of the same type, so it's really only
> useful for handling append, and as previously discussed, it's not
> clear that the proposed handling is any better than what we already
> have.

I don't think you should be able to do this.  I'm not too familiar with
rules, but in your example the rule doesn't modify the output of the
INSERT .. RETURNING so I think it shouldn't do that here either.  How I
see it is that in your example the INSERT INTO shadow would be added to
the top level instead of the CTE and the plan would look something like
this:

------------------------------------------------ CTE Scan on foo  (cost=0.01..0.03 rows=1 width=4)   CTE foo     ->
Insert (cost=0.00..0.01 rows=1 width=0)           ->  Result  (cost=0.00..0.01 rows=1 width=0)

 Insert  (cost=0.00..0.01 rows=1 width=0)   ->  Result  (cost=0.00..0.01 rows=1 width=0)

so you would get the RETURNING output from the CTE and the INSERT to the
shadow table would be executed separately.  I'm not saying that we don't
want to provide the means to do this, but writeable CTEs alone aren't
meant to handle this.

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

04 October 2009, 08:55:40

On Fri, Oct 2, 2009 at 7:37 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:
> Robert Haas wrote:
>>
>> Now the point here is that I eventually want to be able to write
>> something like this:
>>
>> with foo as (insert into project (name) values ('Writeable CTEs')
>> returning id) select * from foo;
>>
>> ...but how does this get me any closer?  It seems to me that the plan
>> for THAT statement has to be a CTE scan over top of BOTH of the
>> inserts, but here I have two insert nodes that comprise two separate
>> plans.  The DML node, as presently implemented, supports a list of
>> plans, but they all have to be of the same type, so it's really only
>> useful for handling append, and as previously discussed, it's not
>> clear that the proposed handling is any better than what we already
>> have.
>
> I don't think you should be able to do this.  I'm not too familiar with
> rules, but in your example the rule doesn't modify the output of the
> INSERT .. RETURNING so I think it shouldn't do that here either.  How I
> see it is that in your example the INSERT INTO shadow would be added to
> the top level instead of the CTE and the plan would look something like
> this:
>
> ------------------------------------------------
>  CTE Scan on foo  (cost=0.01..0.03 rows=1 width=4)
>   CTE foo
>     ->  Insert  (cost=0.00..0.01 rows=1 width=0)
>           ->  Result  (cost=0.00..0.01 rows=1 width=0)
>
>  Insert  (cost=0.00..0.01 rows=1 width=0)
>   ->  Result  (cost=0.00..0.01 rows=1 width=0)
>
> so you would get the RETURNING output from the CTE and the INSERT to the
> shadow table would be executed separately.

Yeah, I think you're right.

> I'm not saying that we don't
> want to provide the means to do this, but writeable CTEs alone aren't
> meant to handle this.

Well, I think a patch to implement writeable CTEs is probably going to
have to handle this case - I don't think we can just ignore rewrite
rules when processing a CTE.  But it does seem to be beyond the scope
of the current patch.

I'm going to go ahead and mark this Ready for Committer.  Thanks for
your patience.

...Robert

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

04 October 2009, 09:14:37

Marko -

I noticed something a little odd about the new append-plan handling.

rhaas=# explain update parent set c = 1;                             QUERY PLAN
-----------------------------------------------------------------------Update  (cost=0.00..60.80 rows=4080 width=12)
-> Seq Scan on parent  (cost=0.00..31.40 rows=2140 width=10)  ->  Seq Scan on child parent  (cost=0.00..29.40 rows=1940
width=14)
(3 rows)

That may be OK, actually, but it does look a little weird.

...Robert

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

04 October 2009, 09:15:23

On Sun, Oct 4, 2009 at 8:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Marko -
>
> I noticed something a little odd about the new append-plan handling.
>
> rhaas=# explain update parent set c = 1;
>                              QUERY PLAN
> -----------------------------------------------------------------------
>  Update  (cost=0.00..60.80 rows=4080 width=12)
>   ->  Seq Scan on parent  (cost=0.00..31.40 rows=2140 width=10)
>   ->  Seq Scan on child parent  (cost=0.00..29.40 rows=1940 width=14)
> (3 rows)
>
> That may be OK, actually, but it does look a little weird.

Argh.  Nevermind.  It was like that before.

Sigh.

...Robert

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

04 October 2009, 10:34:16

Robert Haas wrote:
>> I'm not saying that we don't
>> want to provide the means to do this, but writeable CTEs alone aren't
>> meant to handle this.
> 
> Well, I think a patch to implement writeable CTEs is probably going to
> have to handle this case - I don't think we can just ignore rewrite
> rules when processing a CTE.  But it does seem to be beyond the scope
> of the current patch.

My use of "this" was a bit ambiguous here; what I meant was that
writeable CTEs are going to work just like a top-level INSERT ..
RETURNING would have, i.e. return only rows inserted to "project".

> I'm going to go ahead and mark this Ready for Committer.  Thanks for
> your patience.

Thanks for reviewing!

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

04 October 2009, 12:47:37

Robert Haas <robertmhaas@gmail.com> writes:
> Well, I think a patch to implement writeable CTEs is probably going to
> have to handle this case - I don't think we can just ignore rewrite
> rules when processing a CTE.  But it does seem to be beyond the scope
> of the current patch.

I hadn't been paying too much attention to this thread, but ... why
is anyone worrying about that?  Rewrite rules are not the concern
of either the planner or the executor.  A "do also" rule will result
in two entirely separate Query trees, which will each be planned
separately and executed separately.  Any given executor run only
has to think about one type of DML command --- otherwise the executor
would be broken already, since it takes only one command-type argument.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

04 October 2009, 13:25:00

Tom Lane wrote:
> Any given executor run only
> has to think about one type of DML command --- otherwise the executor
> would be broken already, since it takes only one command-type argument.

If I understood you correctly, this would imply that you wouldn't be
able to do for example:

INSERT INTO foo
WITH t AS ( DELETE FROM bar RETURNING * )
SELECT * FROM t;

which is probably the most useful thing you could do with this feature.
Am I misinterpreting what you said?

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

04 October 2009, 14:13:49

On Oct 4, 2009, at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, I think a patch to implement writeable CTEs is probably going  
>> to
>> have to handle this case - I don't think we can just ignore rewrite
>> rules when processing a CTE.  But it does seem to be beyond the scope
>> of the current patch.
>
> I hadn't been paying too much attention to this thread, but ... why
> is anyone worrying about that?  Rewrite rules are not the concern
> of either the planner or the executor.  A "do also" rule will result
> in two entirely separate Query trees, which will each be planned
> separately and executed separately.  Any given executor run only
> has to think about one type of DML command --- otherwise the executor
> would be broken already, since it takes only one command-type  
> argument.

If an INSERT/UPDATE/DELETE appears within a CTE, it will still need to  
be rewritten. But you're right that this is irrelevant to the present  
patch; I just didn't see that at once.

...Robert

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

04 October 2009, 14:17:02

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> If I understood you correctly, this would imply that you wouldn't be
> able to do for example:

> INSERT INTO foo
> WITH t AS ( DELETE FROM bar RETURNING * )
> SELECT * FROM t;

Um ... forget what I said --- not enough caffeine yet, apparently.

Yeah, rewrite rules are going to be a *serious* stumbling block to
this whole concept.  Maybe we should just punt the project until we
have a clear idea of how to do that.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

David Fetter

Date:

04 October 2009, 14:24:52

On Sun, Oct 04, 2009 at 01:16:50PM -0400, Tom Lane wrote:
> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> > If I understood you correctly, this would imply that you wouldn't
> > be able to do for example:
> 
> > INSERT INTO foo
> > WITH t AS ( DELETE FROM bar RETURNING * )
> > SELECT * FROM t;
> 
> Um ... forget what I said --- not enough caffeine yet, apparently.
> 
> Yeah, rewrite rules are going to be a *serious* stumbling block to
> this whole concept.

> Maybe we should just punt the project until we have a clear idea of
> how to do that.

Maybe rewrite rules just don't fit with this feature, and should cause
an error.  We have other things that don't work together, and the
world hasn't ended yet.

This leads me to another Modest Proposal, which I'll detail in another
post.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

04 October 2009, 14:36:04

On Oct 4, 2009, at 1:24 PM, David Fetter <david@fetter.org> wrote:

> On Sun, Oct 04, 2009 at 01:16:50PM -0400, Tom Lane wrote:
>> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
>>> If I understood you correctly, this would imply that you wouldn't
>>> be able to do for example:
>>
>>> INSERT INTO foo
>>> WITH t AS ( DELETE FROM bar RETURNING * )
>>> SELECT * FROM t;
>>
>> Um ... forget what I said --- not enough caffeine yet, apparently.
>>
>> Yeah, rewrite rules are going to be a *serious* stumbling block to
>> this whole concept.
>
>> Maybe we should just punt the project until we have a clear idea of
>> how to do that.

This was discussed a bit upthread and doesn't seem obviously  
unmanageable to me.

> Maybe rewrite rules just don't fit with this feature, and should cause
> an error.  We have other things that don't work together, and the
> world hasn't ended yet.

That doesn't seem very satisfactory.

...Robert
>

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

04 October 2009, 16:32:02

Tom Lane wrote:
> Yeah, rewrite rules are going to be a *serious* stumbling block to
> this whole concept.  Maybe we should just punt the project until we
> have a clear idea of how to do that.

I've implemented rewrite rules for writeable CTEs, and at least now I
don't see any problems except one.  I can't seem to think of what would
be the correct behaviour in this case:

=> CREATE rule foo_rule AS ON INSERT TO foo DO ALSO SELECT * FROM bar;
CREATE RULE

=> WITH t AS (INSERT INTO foo VALUES(0) RETURNING *) SELECT * FROM t;

If you rewrite the query as it is rewritten in the top-level case, you
get a plan such as this:

-------------------------------------------------------
CTE Scan ON t  (cost=0.01..0.03 rows=1 width=4)  CTE t    ->  INSERT  (cost=0.00..0.01 rows=1 width=0)          ->
Result (cost=0.00..0.01 rows=1 width=0)

Seq Scan ON bar  (cost=0.00..34.00 rows=2400 width=4)

and now you have *two* SELECT statements.  Currently the portal code
gives the output of the "Seq Scan ON bar" here which is IMO very
surprising.  Does ignoring the rule here sound sane or should we error
out?  Or does someone have a better idea?  DO ALSO INSERT/UPDATE/DELETE
works as expected here.

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 15:53:24

I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> BTW what was the conclusion of the idea about having three separate
>> nodes Insert, Delete, Update instead of a single Dml node?

> If we stick with a single node type then I'd suggest calling it
> something like ModifyTable.

I'm starting to work on this patch now.  After looking at it a bit,
it seems to me that keeping it as one node is probably marginally
preferable to making three nodes; but I still do not like the "Dml"
name.  Does anyone have a problem with the ModifyTable suggestion,
or a better idea?
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Alvaro Herrera

Date:

08 October 2009, 15:57:13

Tom Lane escribió:
> I wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> >> BTW what was the conclusion of the idea about having three separate
> >> nodes Insert, Delete, Update instead of a single Dml node?
> 
> > If we stick with a single node type then I'd suggest calling it
> > something like ModifyTable.
> 
> I'm starting to work on this patch now.  After looking at it a bit,
> it seems to me that keeping it as one node is probably marginally
> preferable to making three nodes; but I still do not like the "Dml"
> name.  Does anyone have a problem with the ModifyTable suggestion,
> or a better idea?

Does it affect how is it going to look in EXPLAIN output?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

08 October 2009, 15:59:27

Tom Lane wrote:
> Does anyone have a problem with the ModifyTable suggestion,
> or a better idea?

Lacking a better idea, +1 for ModifyTable from me.


Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 16:31:16

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane escribi�:
> Does it affect how is it going to look in EXPLAIN output?

Hmm, I suppose there's not any law that says EXPLAIN has to print the
same name we use internally.  The explain output could say "Insert",
"Update", or "Delete" independently of what we call the node type.
That would take away about 50% of my objection to the node name,
though on balance I still think "Dml" is a poor choice since plan
node names are typically verbs.

One issue is whether it would be confusing for implementors if the
explain output is completely unrelated to the internal name.  We could
do something like "InsertTable" or "ModifyTable Insert", but both of
these look a bit ugly from a user's standpoint I think.

I notice also that the patch has chosen to represent Dml in XML/JSON
explain output as Node Type = Dml with an entirely new attribute
Operation to indicate Insert/Update/Delete.  Do we really want to
go there?  Adding single-purpose attributes doesn't seem like a great
idea.

While we're discussing explain output ... what about triggers?
An important aspect of this change is that at least the row-level
triggers are now going to be charged as runtime of the
Dml-or-whatever-we-call-it node.  Should we rearrange the explain
output so that the printout for trigger runtimes is associated
with those nodes too?  If so, what about statement-level triggers?
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

08 October 2009, 16:37:32

On Thu, Oct 8, 2009 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Tom Lane escribió:
>> Does it affect how is it going to look in EXPLAIN output?
>
> Hmm, I suppose there's not any law that says EXPLAIN has to print the
> same name we use internally.  The explain output could say "Insert",
> "Update", or "Delete" independently of what we call the node type.

It already does, in text mode.

> That would take away about 50% of my objection to the node name,
> though on balance I still think "Dml" is a poor choice since plan
> node names are typically verbs.

Agreed.

> One issue is whether it would be confusing for implementors if the
> explain output is completely unrelated to the internal name.  We could
> do something like "InsertTable" or "ModifyTable Insert", but both of
> these look a bit ugly from a user's standpoint I think.
>
> I notice also that the patch has chosen to represent Dml in XML/JSON
> explain output as Node Type = Dml with an entirely new attribute
> Operation to indicate Insert/Update/Delete.  Do we really want to
> go there?  Adding single-purpose attributes doesn't seem like a great
> idea.

Well, I was the one who suggested doing it that way, so you can blame
me for that, but it is consistent with how we've handled other things,
like setops and jointypes: the details get moved to another tag so as
to avoid an explosive growth in the number of node types that clients
must be prepared for.

> While we're discussing explain output ... what about triggers?
> An important aspect of this change is that at least the row-level
> triggers are now going to be charged as runtime of the
> Dml-or-whatever-we-call-it node.  Should we rearrange the explain
> output so that the printout for trigger runtimes is associated
> with those nodes too?  If so, what about statement-level triggers?

That's not a bad idea, though it wouldn't bother me much if we left it
for a follow-on patch.

...Robert

Re: Using results from INSERT ... RETURNING

From

Alvaro Herrera

Date:

08 October 2009, 16:40:38

Robert Haas escribió:
> On Thu, Oct 8, 2009 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > I notice also that the patch has chosen to represent Dml in XML/JSON
> > explain output as Node Type = Dml with an entirely new attribute
> > Operation to indicate Insert/Update/Delete.  Do we really want to
> > go there?  Adding single-purpose attributes doesn't seem like a great
> > idea.
> 
> Well, I was the one who suggested doing it that way, so you can blame
> me for that, but it is consistent with how we've handled other things,
> like setops and jointypes: the details get moved to another tag so as
> to avoid an explosive growth in the number of node types that clients
> must be prepared for.

Perhaps how a join is implemented in a plan can be considered a
"detail", but I don't think the same holds true for insert vs. update.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 16:54:10

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Robert Haas escribi�:
>> On Thu, Oct 8, 2009 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I notice also that the patch has chosen to represent Dml in XML/JSON
>>> explain output as Node Type = Dml with an entirely new attribute
>>> Operation to indicate Insert/Update/Delete. �Do we really want to
>>> go there? �Adding single-purpose attributes doesn't seem like a great
>>> idea.
>> 
>> Well, I was the one who suggested doing it that way, so you can blame
>> me for that, but it is consistent with how we've handled other things,
>> like setops and jointypes: the details get moved to another tag so as
>> to avoid an explosive growth in the number of node types that clients
>> must be prepared for.

> Perhaps how a join is implemented in a plan can be considered a
> "detail", but I don't think the same holds true for insert vs. update.

Also, in all the other cases we stuck the detail into a common
attribute called Strategy.  What bothers me about Operation is that
there is only one node type that it is good for.  I would prefer to
keep the text and XML representations of this the same, which at the
moment seems to mean that the node type should be reported as
Insert/Update/Delete.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 17:02:06

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Oct 8, 2009 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> While we're discussing explain output ... what about triggers?
>> An important aspect of this change is that at least the row-level
>> triggers are now going to be charged as runtime of the
>> Dml-or-whatever-we-call-it node. �Should we rearrange the explain
>> output so that the printout for trigger runtimes is associated
>> with those nodes too? �If so, what about statement-level triggers?

> That's not a bad idea, though it wouldn't bother me much if we left it
> for a follow-on patch.

After cogitating on this for a few minutes, it seems to me that the
right thing to do is to push the calling of statement-level triggers
into the Dml node too.  That is, ExecDml() would be responsible for
calling BEFORE STATEMENT triggers at the start of its first call,
and for calling AFTER STATEMENT triggers at the end of its last
call (just before it returns NULL).  This would mean that all trigger
activity is now associated with a plan node and should be reported
that way by EXPLAIN.  We could get rid of the rather warty top-level
XML item for triggers and make the trigger instrumentation outputs
just be another plan node property.

Aside from having a cleaner EXPLAIN output design, this would mean that
triggers in the planned WITH RETURNING scenario fire in the order you'd
expect if we are considering the WITH RETURNING queries to be done
sequentially and before the main query.  From a user's standpoint it
would look just like you'd issued the queries sequentially, except that
the RETURNING data is held for use in the main query.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

08 October 2009, 17:14:15

On Thu, Oct 8, 2009 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Robert Haas escribió:
>>> On Thu, Oct 8, 2009 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I notice also that the patch has chosen to represent Dml in XML/JSON
>>>> explain output as Node Type = Dml with an entirely new attribute
>>>> Operation to indicate Insert/Update/Delete.  Do we really want to
>>>> go there?  Adding single-purpose attributes doesn't seem like a great
>>>> idea.
>>>
>>> Well, I was the one who suggested doing it that way, so you can blame
>>> me for that, but it is consistent with how we've handled other things,
>>> like setops and jointypes: the details get moved to another tag so as
>>> to avoid an explosive growth in the number of node types that clients
>>> must be prepared for.
>
>> Perhaps how a join is implemented in a plan can be considered a
>> "detail", but I don't think the same holds true for insert vs. update.
>
> Also, in all the other cases we stuck the detail into a common
> attribute called Strategy.  What bothers me about Operation is that
> there is only one node type that it is good for.  I would prefer to
> keep the text and XML representations of this the same, which at the
> moment seems to mean that the node type should be reported as
> Insert/Update/Delete.

Hmm... well, when you initially committed the code, you complained
that there were two different and unrelated things both using the
strategy tag - really, to mean different things.  Now you're saying
that strategy is OK because it's grand-fathered, but we shouldn't add
any more.

I think there's value in keeping the types that we output in 1-1
correspondence with the internal node tags.  We have other
node-specific properties that aren't handled that way like "Scan
Direction".  The only thing that's different about strategy is that it
happens to be handled in the first "case" block rather than the second
one for reasons that are entirely cosmetic rather than substantive.

...Robert

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

08 October 2009, 17:16:52

On Thu, Oct 8, 2009 at 4:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Oct 8, 2009 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> While we're discussing explain output ... what about triggers?
>>> An important aspect of this change is that at least the row-level
>>> triggers are now going to be charged as runtime of the
>>> Dml-or-whatever-we-call-it node.  Should we rearrange the explain
>>> output so that the printout for trigger runtimes is associated
>>> with those nodes too?  If so, what about statement-level triggers?
>
>> That's not a bad idea, though it wouldn't bother me much if we left it
>> for a follow-on patch.
>
> After cogitating on this for a few minutes, it seems to me that the
> right thing to do is to push the calling of statement-level triggers
> into the Dml node too.  That is, ExecDml() would be responsible for
> calling BEFORE STATEMENT triggers at the start of its first call,
> and for calling AFTER STATEMENT triggers at the end of its last
> call (just before it returns NULL).  This would mean that all trigger
> activity is now associated with a plan node and should be reported
> that way by EXPLAIN.  We could get rid of the rather warty top-level
> XML item for triggers and make the trigger instrumentation outputs
> just be another plan node property.
>
> Aside from having a cleaner EXPLAIN output design, this would mean that
> triggers in the planned WITH RETURNING scenario fire in the order you'd
> expect if we are considering the WITH RETURNING queries to be done
> sequentially and before the main query.  From a user's standpoint it
> would look just like you'd issued the queries sequentially, except that
> the RETURNING data is held for use in the main query.

That seems like a good design.  I hate to have you spend too much time
on any one patch at this stage of the game, though.  Is this a big
enough change that we should send it back for the patch author (or
other interested parties) to do make that change, or are you just
going to knock it out?

...Robert

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 17:44:17

Robert Haas <robertmhaas@gmail.com> writes:
> That seems like a good design.  I hate to have you spend too much time
> on any one patch at this stage of the game, though.  Is this a big
> enough change that we should send it back for the patch author (or
> other interested parties) to do make that change, or are you just
> going to knock it out?

I was going to look at it and probably do it myself unless it turns out
to be a bigger change than I'm thinking.  The only other stuff that's
Ready for Committer for me now is privileges stuff, which is not on the
critical path for other development work, and besides I'm a bit tired
of that topic right now.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Robert Haas

Date:

08 October 2009, 18:02:10

On Thu, Oct 8, 2009 at 4:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> That seems like a good design.  I hate to have you spend too much time
>> on any one patch at this stage of the game, though.  Is this a big
>> enough change that we should send it back for the patch author (or
>> other interested parties) to do make that change, or are you just
>> going to knock it out?
>
> I was going to look at it and probably do it myself unless it turns out
> to be a bigger change than I'm thinking.  The only other stuff that's
> Ready for Committer for me now is privileges stuff, which is not on the
> critical path for other development work, and besides I'm a bit tired
> of that topic right now.

"Buffer usage in EXPLAIN and pg_stat_statements" might be another one
for you, unless someone else is already handling that.

"Reworks for Access Controls" is important for future work, so it
would be good to try to fit that one in if possible.  It's not marked
as Ready for Committer yet because Stephen hasn't gotten around to
looking at the latest version, but I think he is fairly comfortable
with where it is at, so it is probably good for you to try to
determine whether there's a hole in the bottom of the ship before too
much more time is spend rearranging the deck furniture.

...Robert

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

08 October 2009, 18:07:13

Tom Lane wrote:
> I was going to look at it and probably do it myself unless it turns out
> to be a bigger change than I'm thinking.  The only other stuff that's
> Ready for Committer for me now is privileges stuff, which is not on the
> critical path for other development work, and besides I'm a bit tired
> of that topic right now.

I'm sorry, it didn't occur to me that this is part of this patch, but I
made these changes in my writeable CTE repo, see
http://git.postgresql.org/gitweb?p=writeable_cte.git;a=commitdiff;h=41ad3d24af845c67a3866e0946129cf9809fe7e9


Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 18:22:17

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> I'm sorry, it didn't occur to me that this is part of this patch, but I
> made these changes in my writeable CTE repo, see
> http://git.postgresql.org/gitweb?p=writeable_cte.git;a=commitdiff;h=41ad3d24af845c67a3866e0946129cf9809fe7e9

Could you pull out a patch that includes those changes, please?
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

08 October 2009, 18:53:35

Tom Lane wrote:
> Could you pull out a patch that includes those changes, please?

Sorry for the delay, my master was a bit behind :-( .  I moved the
trigger code to nodeDml.c with minor changes and removed unused
resultRelation stuff from DML nodes completely.  This also has the
README stuff in it.


Regards,
MArko Tiikkaja

*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 581,586 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 581,587 ----
      const char *pname;            /* node type name for text output */
      const char *sname;            /* node type name for non-text output */
      const char *strategy = NULL;
+     const char *operation = NULL; /* DML operation */
      int            save_indent = es->indent;
      bool        haschildren;

***************
*** 705,710 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 706,729 ----
          case T_Hash:
              pname = sname = "Hash";
              break;
+         case T_Dml:
+             sname = "Dml";
+             switch( ((Dml *) plan)->operation)
+             {
+                 case CMD_INSERT:
+                     pname = operation = "Insert";
+                     break;
+                 case CMD_UPDATE:
+                     pname = operation = "Update";
+                     break;
+                 case CMD_DELETE:
+                     pname = operation = "Delete";
+                     break;
+                 default:
+                     pname = "???";
+                     break;
+             }
+             break;
          default:
              pname = sname = "???";
              break;
***************
*** 740,745 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 759,766 ----
              ExplainPropertyText("Parent Relationship", relationship, es);
          if (plan_name)
              ExplainPropertyText("Subplan Name", plan_name, es);
+         if (operation)
+             ExplainPropertyText("Operation", operation, es);
      }

      switch (nodeTag(plan))
***************
*** 1064,1069 **** ExplainNode(Plan *plan, PlanState *planstate,
--- 1085,1095 ----
                                 ((AppendState *) planstate)->appendplans,
                                 outer_plan, es);
              break;
+         case T_Dml:
+             ExplainMemberNodes(((Dml *) plan)->plans,
+                                ((DmlState *) planstate)->dmlplans,
+                                outer_plan, es);
+             break;
          case T_BitmapAnd:
              ExplainMemberNodes(((BitmapAnd *) plan)->bitmapplans,
                                 ((BitmapAndState *) planstate)->bitmapplans,
*** a/src/backend/executor/Makefile
--- b/src/backend/executor/Makefile
***************
*** 15,21 **** include $(top_builddir)/src/Makefile.global
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
--- 15,21 ----
  OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
         execProcnode.o execQual.o execScan.o execTuples.o \
         execUtils.o functions.o instrument.o nodeAppend.o nodeAgg.o \
!        nodeBitmapAnd.o nodeBitmapOr.o nodeDml.o \
         nodeBitmapHeapscan.o nodeBitmapIndexscan.o nodeHash.o \
         nodeHashjoin.o nodeIndexscan.o nodeMaterial.o nodeMergejoin.o \
         nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
*** a/src/backend/executor/README
--- b/src/backend/executor/README
***************
*** 25,38 **** There is a moderately intelligent scheme to avoid rescanning nodes
  unnecessarily (for example, Sort does not rescan its input if no parameters
  of the input have changed, since it can just reread its stored sorted data).

! The plan tree concept implements SELECT directly: it is only necessary to
! deliver the top-level result tuples to the client, or insert them into
! another table in the case of INSERT ... SELECT.  (INSERT ... VALUES is
! handled similarly, but the plan tree is just a Result node with no source
! tables.)  For UPDATE, the plan tree selects the tuples that need to be
! updated (WHERE condition) and delivers a new calculated tuple value for each
! such tuple, plus a "junk" (hidden) tuple CTID identifying the target tuple.
! The executor's top level then uses this information to update the correct
  tuple.  DELETE is similar to UPDATE except that only a CTID need be
  delivered by the plan tree.

--- 25,42 ----
  unnecessarily (for example, Sort does not rescan its input if no parameters
  of the input have changed, since it can just reread its stored sorted data).

! It is only necessary to deliver the top-level result tuples to the client.
! If the query is a SELECT, the topmost plan node is the output of the SELECT.
! If the query is a DML operation, a DML node is added to the top, which calls
! its child nodes to fetch the tuples.  If the DML operation has a RETURNING
! clause the node will output the projected tuples, otherwise it gives out
! dummy tuples until it has processed all tuples from its child nodes.  After
! that, it gives NULL.
!
! Handling INSERT is pretty straightforward: the tuples returned from the
! subtree are inserted into the correct result relation.  In addition to the
! tuple value, UPDATE needs a "junk" (hidden) tuple CTID identifying the
! target tuple.  The DML node needs this information to update the correct
  tuple.  DELETE is similar to UPDATE except that only a CTID need be
  delivered by the plan tree.

*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 77,83 **** typedef struct evalPlanQual

  /* decls for local routines only used within this module */
  static void InitPlan(QueryDesc *queryDesc, int eflags);
- static void ExecCheckPlanOutput(Relation resultRel, List *targetList);
  static void ExecEndPlan(PlanState *planstate, EState *estate);
  static void ExecutePlan(EState *estate, PlanState *planstate,
              CmdType operation,
--- 77,82 ----
***************
*** 86,104 **** static void ExecutePlan(EState *estate, PlanState *planstate,
              DestReceiver *dest);
  static void ExecSelect(TupleTableSlot *slot,
             DestReceiver *dest, EState *estate);
- static void ExecInsert(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecUpdate(TupleTableSlot *slot, ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest, EState *estate);
- static void ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest);
  static TupleTableSlot *EvalPlanQualNext(EState *estate);
  static void EndEvalPlanQual(EState *estate);
  static void ExecCheckRTPerms(List *rangeTable);
--- 85,90 ----
***************
*** 814,909 **** InitPlan(QueryDesc *queryDesc, int eflags)
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT and INSERT queries need a
!      * filter if there are any junk attrs in the tlist.  UPDATE and DELETE
!      * always need a filter, since there's always a junk 'ctid' attribute
!      * present --- no need to look first.
!      *
!      * This section of code is also a convenient place to verify that the
!      * output of an INSERT or UPDATE matches the target table(s).
       */
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         switch (operation)
          {
!             case CMD_SELECT:
!             case CMD_INSERT:
!                 foreach(tlist, plan->targetlist)
!                 {
!                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!                     if (tle->resjunk)
!                     {
!                         junk_filter_needed = true;
!                         break;
!                     }
!                 }
!                 break;
!             case CMD_UPDATE:
!             case CMD_DELETE:
                  junk_filter_needed = true;
                  break;
!             default:
!                 break;
          }

          if (junk_filter_needed)
          {
-             /*
-              * If there are multiple result relations, each one needs its own
-              * junk filter.  Note this is only possible for UPDATE/DELETE, so
-              * we can't be fooled by some needing a filter and some not.
-              */
              if (list_length(plannedstmt->resultRelations) > 1)
              {
-                 PlanState **appendplans;
-                 int            as_nplans;
-                 ResultRelInfo *resultRelInfo;
-
-                 /* Top plan had better be an Append here. */
-                 Assert(IsA(plan, Append));
-                 Assert(((Append *) plan)->isTarget);
-                 Assert(IsA(planstate, AppendState));
-                 appendplans = ((AppendState *) planstate)->appendplans;
-                 as_nplans = ((AppendState *) planstate)->as_nplans;
-                 Assert(as_nplans == estate->es_num_result_relations);
-                 resultRelInfo = estate->es_result_relations;
-                 for (i = 0; i < as_nplans; i++)
-                 {
-                     PlanState  *subplan = appendplans[i];
-                     JunkFilter *j;
-
-                     if (operation == CMD_UPDATE)
-                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
-                                             subplan->plan->targetlist);
-
-                     j = ExecInitJunkFilter(subplan->plan->targetlist,
-                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
-                                            ExecInitExtraTupleSlot(estate));
-
-                     /*
-                      * Since it must be UPDATE/DELETE, there had better be a
-                      * "ctid" junk attribute in the tlist ... but ctid could
-                      * be at a different resno for each result relation. We
-                      * look up the ctid resnos now and save them in the
-                      * junkfilters.
-                      */
-                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
-                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
-                         elog(ERROR, "could not find junk ctid column");
-                     resultRelInfo->ri_junkFilter = j;
-                     resultRelInfo++;
-                 }
-
-                 /*
-                  * Set active junkfilter too; at this point ExecInitAppend has
-                  * already selected an active result relation...
-                  */
-                 estate->es_junkFilter =
-                     estate->es_result_relation_info->ri_junkFilter;
-
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
--- 800,828 ----
      tupType = ExecGetResultType(planstate);

      /*
!      * Initialize the junk filter if needed.  SELECT queries need a
!      * filter if there are any junk attrs in the tlist.
       */
+     if (operation == CMD_SELECT)
      {
          bool        junk_filter_needed = false;
          ListCell   *tlist;

!         foreach(tlist, plan->targetlist)
          {
!             TargetEntry *tle = (TargetEntry *) lfirst(tlist);

!             if (tle->resjunk)
!             {
                  junk_filter_needed = true;
                  break;
!             }
          }

          if (junk_filter_needed)
          {
              if (list_length(plannedstmt->resultRelations) > 1)
              {
                  /*
                   * We currently can't support rowmarks in this case, because
                   * the associated junk CTIDs might have different resnos in
***************
*** 916,947 **** InitPlan(QueryDesc *queryDesc, int eflags)
              }
              else
              {
-                 /* Normal case with just one JunkFilter */
                  JunkFilter *j;

-                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
-                     ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                         planstate->plan->targetlist);
-
                  j = ExecInitJunkFilter(planstate->plan->targetlist,
                                         tupType->tdhasoid,
                                         ExecInitExtraTupleSlot(estate));
                  estate->es_junkFilter = j;
-                 if (estate->es_result_relation_info)
-                     estate->es_result_relation_info->ri_junkFilter = j;

!                 if (operation == CMD_SELECT)
!                 {
!                     /* For SELECT, want to return the cleaned tuple type */
!                     tupType = j->jf_cleanTupType;
!                 }
!                 else if (operation == CMD_UPDATE || operation == CMD_DELETE)
!                 {
!                     /* For UPDATE/DELETE, find the ctid junk attr now */
!                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
!                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
!                         elog(ERROR, "could not find junk ctid column");
!                 }

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
--- 835,849 ----
              }
              else
              {
                  JunkFilter *j;

                  j = ExecInitJunkFilter(planstate->plan->targetlist,
                                         tupType->tdhasoid,
                                         ExecInitExtraTupleSlot(estate));
                  estate->es_junkFilter = j;

!                 /* For SELECT, want to return the cleaned tuple type */
!                 tupType = j->jf_cleanTupType;

                  /* For SELECT FOR UPDATE/SHARE, find the junk attrs now */
                  foreach(l, estate->es_rowMarks)
***************
*** 971,1027 **** InitPlan(QueryDesc *queryDesc, int eflags)
          }
          else
          {
-             if (operation == CMD_INSERT)
-                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
-                                     planstate->plan->targetlist);
-
              estate->es_junkFilter = NULL;
              if (estate->es_rowMarks)
                  elog(ERROR, "SELECT FOR UPDATE/SHARE, but no junk columns");
          }
      }

-     /*
-      * Initialize RETURNING projections if needed.
-      */
-     if (plannedstmt->returningLists)
-     {
-         TupleTableSlot *slot;
-         ExprContext *econtext;
-         ResultRelInfo *resultRelInfo;
-
-         /*
-          * We set QueryDesc.tupDesc to be the RETURNING rowtype in this case.
-          * We assume all the sublists will generate the same output tupdesc.
-          */
-         tupType = ExecTypeFromTL((List *) linitial(plannedstmt->returningLists),
-                                  false);
-
-         /* Set up a slot for the output of the RETURNING projection(s) */
-         slot = ExecInitExtraTupleSlot(estate);
-         ExecSetSlotDescriptor(slot, tupType);
-         /* Need an econtext too */
-         econtext = CreateExprContext(estate);
-
-         /*
-          * Build a projection for each result rel.    Note that any SubPlans in
-          * the RETURNING lists get attached to the topmost plan node.
-          */
-         Assert(list_length(plannedstmt->returningLists) == estate->es_num_result_relations);
-         resultRelInfo = estate->es_result_relations;
-         foreach(l, plannedstmt->returningLists)
-         {
-             List       *rlist = (List *) lfirst(l);
-             List       *rliststate;
-
-             rliststate = (List *) ExecInitExpr((Expr *) rlist, planstate);
-             resultRelInfo->ri_projectReturning =
-                 ExecBuildProjectionInfo(rliststate, econtext, slot,
-                                      resultRelInfo->ri_RelationDesc->rd_att);
-             resultRelInfo++;
-         }
-     }
-
      queryDesc->tupDesc = tupType;
      queryDesc->planstate = planstate;

--- 873,884 ----
***************
*** 1123,1197 **** InitResultRelInfo(ResultRelInfo *resultRelInfo,
  }

  /*
-  * Verify that the tuples to be produced by INSERT or UPDATE match the
-  * target relation's rowtype
-  *
-  * We do this to guard against stale plans.  If plan invalidation is
-  * functioning properly then we should never get a failure here, but better
-  * safe than sorry.  Note that this is called after we have obtained lock
-  * on the target rel, so the rowtype can't change underneath us.
-  *
-  * The plan output is represented by its targetlist, because that makes
-  * handling the dropped-column case easier.
-  */
- static void
- ExecCheckPlanOutput(Relation resultRel, List *targetList)
- {
-     TupleDesc    resultDesc = RelationGetDescr(resultRel);
-     int            attno = 0;
-     ListCell   *lc;
-
-     foreach(lc, targetList)
-     {
-         TargetEntry *tle = (TargetEntry *) lfirst(lc);
-         Form_pg_attribute attr;
-
-         if (tle->resjunk)
-             continue;            /* ignore junk tlist items */
-
-         if (attno >= resultDesc->natts)
-             ereport(ERROR,
-                     (errcode(ERRCODE_DATATYPE_MISMATCH),
-                      errmsg("table row type and query-specified row type do not match"),
-                      errdetail("Query has too many columns.")));
-         attr = resultDesc->attrs[attno++];
-
-         if (!attr->attisdropped)
-         {
-             /* Normal case: demand type match */
-             if (exprType((Node *) tle->expr) != attr->atttypid)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
-                                    format_type_be(attr->atttypid),
-                                    attno,
-                              format_type_be(exprType((Node *) tle->expr)))));
-         }
-         else
-         {
-             /*
-              * For a dropped column, we can't check atttypid (it's likely 0).
-              * In any case the planner has most likely inserted an INT4 null.
-              * What we insist on is just *some* NULL constant.
-              */
-             if (!IsA(tle->expr, Const) ||
-                 !((Const *) tle->expr)->constisnull)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                          errmsg("table row type and query-specified row type do not match"),
-                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
-                                    attno)));
-         }
-     }
-     if (attno != resultDesc->natts)
-         ereport(ERROR,
-                 (errcode(ERRCODE_DATATYPE_MISMATCH),
-           errmsg("table row type and query-specified row type do not match"),
-                  errdetail("Query has too few columns.")));
- }
-
- /*
   *        ExecGetTriggerResultRel
   *
   * Get a ResultRelInfo for a trigger target relation.  Most of the time,
--- 980,985 ----
***************
*** 1423,1430 **** ExecutePlan(EState *estate,
      JunkFilter *junkfilter;
      TupleTableSlot *planSlot;
      TupleTableSlot *slot;
-     ItemPointer tupleid = NULL;
-     ItemPointerData tuple_ctid;
      long        current_tuple_count;

      /*
--- 1211,1216 ----
***************
*** 1438,1462 **** ExecutePlan(EState *estate,
      estate->es_direction = direction;

      /*
-      * Process BEFORE EACH STATEMENT triggers
-      */
-     switch (operation)
-     {
-         case CMD_UPDATE:
-             ExecBSUpdateTriggers(estate, estate->es_result_relation_info);
-             break;
-         case CMD_DELETE:
-             ExecBSDeleteTriggers(estate, estate->es_result_relation_info);
-             break;
-         case CMD_INSERT:
-             ExecBSInsertTriggers(estate, estate->es_result_relation_info);
-             break;
-         default:
-             /* do nothing */
-             break;
-     }
-
-     /*
       * Loop until we've processed the proper number of tuples from the plan.
       */
      for (;;)
--- 1224,1229 ----
***************
*** 1495,1501 **** lnext:    ;
           *
           * But first, extract all the junk information we need.
           */
!         if ((junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
--- 1262,1268 ----
           *
           * But first, extract all the junk information we need.
           */
!         if (operation == CMD_SELECT && (junkfilter = estate->es_junkFilter) != NULL)
          {
              /*
               * Process any FOR UPDATE or FOR SHARE locking requested.
***************
*** 1604,1635 **** lnext:    ;
                  }
              }

!             /*
!              * extract the 'ctid' junk attribute.
!              */
!             if (operation == CMD_UPDATE || operation == CMD_DELETE)
!             {
!                 Datum        datum;
!                 bool        isNull;
!
!                 datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
!                                              &isNull);
!                 /* shouldn't ever get a null result... */
!                 if (isNull)
!                     elog(ERROR, "ctid is NULL");
!
!                 tupleid = (ItemPointer) DatumGetPointer(datum);
!                 tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
!                 tupleid = &tuple_ctid;
!             }
!
!             /*
!              * Create a new "clean" tuple with all junk attributes removed. We
!              * don't need to do this for DELETE, however (there will in fact
!              * be no non-junk attributes in a DELETE!)
!              */
!             if (operation != CMD_DELETE)
!                 slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
--- 1371,1377 ----
                  }
              }

!             slot = ExecFilterJunk(junkfilter, slot);
          }

          /*
***************
*** 1644,1658 **** lnext:    ;
                  break;

              case CMD_INSERT:
-                 ExecInsert(slot, tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_DELETE:
-                 ExecDelete(tupleid, planSlot, dest, estate);
-                 break;
-
              case CMD_UPDATE:
!                 ExecUpdate(slot, tupleid, planSlot, dest, estate);
                  break;

              default:
--- 1386,1395 ----
                  break;

              case CMD_INSERT:
              case CMD_DELETE:
              case CMD_UPDATE:
!                 if (estate->es_plannedstmt->returningLists)
!                     (*dest->receiveSlot) (slot, dest);
                  break;

              default:
***************
*** 1670,1694 **** lnext:    ;
          if (numberTuples && numberTuples == current_tuple_count)
              break;
      }
-
-     /*
-      * Process AFTER EACH STATEMENT triggers
-      */
-     switch (operation)
-     {
-         case CMD_UPDATE:
-             ExecASUpdateTriggers(estate, estate->es_result_relation_info);
-             break;
-         case CMD_DELETE:
-             ExecASDeleteTriggers(estate, estate->es_result_relation_info);
-             break;
-         case CMD_INSERT:
-             ExecASInsertTriggers(estate, estate->es_result_relation_info);
-             break;
-         default:
-             /* do nothing */
-             break;
-     }
  }

  /* ----------------------------------------------------------------
--- 1407,1412 ----
***************
*** 1708,2127 **** ExecSelect(TupleTableSlot *slot,
      (estate->es_processed)++;
  }

- /* ----------------------------------------------------------------
-  *        ExecInsert
-  *
-  *        INSERTs are trickier.. we have to insert the tuple into
-  *        the base relation and insert appropriate tuples into the
-  *        index relations.
-  * ----------------------------------------------------------------
-  */
- static void
- ExecInsert(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     Oid            newId;
-     List       *recheckIndexes = NIL;
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /*
-      * If the result relation has OIDs, force the tuple's OID to zero so that
-      * heap_insert will assign a fresh OID.  Usually the OID already will be
-      * zero at this point, but there are corner cases where the plan tree can
-      * return a tuple extracted literally from some table with the same
-      * rowtype.
-      *
-      * XXX if we ever wanted to allow users to assign their own OIDs to new
-      * rows, this'd be the place to do it.  For the moment, we make a point of
-      * doing this before calling triggers, so that a user-supplied trigger
-      * could hack the OID if desired.
-      */
-     if (resultRelationDesc->rd_rel->relhasoids)
-         HeapTupleSetOid(tuple, InvalidOid);
-
-     /* BEFORE ROW INSERT Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      */
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * insert the tuple
-      *
-      * Note: heap_insert returns the tid (location) of the new tuple in the
-      * t_self field.
-      */
-     newId = heap_insert(resultRelationDesc, tuple,
-                         estate->es_output_cid, 0, NULL);
-
-     IncrAppended();
-     (estate->es_processed)++;
-     estate->es_lastoid = newId;
-     setLastTid(&(tuple->t_self));
-
-     /*
-      * insert index entries for tuple
-      */
-     if (resultRelInfo->ri_NumIndices > 0)
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW INSERT Triggers */
-     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
- /* ----------------------------------------------------------------
-  *        ExecDelete
-  *
-  *        DELETE is like UPDATE, except that we delete the tuple and no
-  *        index modifications are needed
-  * ----------------------------------------------------------------
-  */
- static void
- ExecDelete(ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW DELETE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
-     {
-         bool        dodelete;
-
-         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
-
-         if (!dodelete)            /* "do nothing" */
-             return;
-     }
-
-     /*
-      * delete the tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be deleted is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
- ldelete:;
-     result = heap_delete(resultRelationDesc, tupleid,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     goto ldelete;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_delete status: %u", result);
-             return;
-     }
-
-     IncrDeleted();
-     (estate->es_processed)++;
-
-     /*
-      * Note: Normally one would think that we have to delete index tuples
-      * associated with the heap tuple now...
-      *
-      * ... but in POSTGRES, we have no need to do this because VACUUM will
-      * take care of it later.  We can't delete index tuples immediately
-      * anyway, since the tuple is still visible to other transactions.
-      */
-
-     /* AFTER ROW DELETE Triggers */
-     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-     {
-         /*
-          * We have to put the target tuple into a slot, which means first we
-          * gotta fetch it.    We can use the trigger tuple slot.
-          */
-         TupleTableSlot *slot = estate->es_trig_tuple_slot;
-         HeapTupleData deltuple;
-         Buffer        delbuffer;
-
-         deltuple.t_self = *tupleid;
-         if (!heap_fetch(resultRelationDesc, SnapshotAny,
-                         &deltuple, &delbuffer, false, NULL))
-             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
-
-         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
-             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
-         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
-
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
-
-         ExecClearTuple(slot);
-         ReleaseBuffer(delbuffer);
-     }
- }
-
- /* ----------------------------------------------------------------
-  *        ExecUpdate
-  *
-  *        note: we can't run UPDATE queries with transactions
-  *        off because UPDATEs are actually INSERTs and our
-  *        scan will mistakenly loop forever, updating the tuple
-  *        it just inserted..    This should be fixed but until it
-  *        is, we don't want to get stuck in an infinite loop
-  *        which corrupts your database..
-  * ----------------------------------------------------------------
-  */
- static void
- ExecUpdate(TupleTableSlot *slot,
-            ItemPointer tupleid,
-            TupleTableSlot *planSlot,
-            DestReceiver *dest,
-            EState *estate)
- {
-     HeapTuple    tuple;
-     ResultRelInfo *resultRelInfo;
-     Relation    resultRelationDesc;
-     HTSU_Result result;
-     ItemPointerData update_ctid;
-     TransactionId update_xmax;
-     List *recheckIndexes = NIL;
-
-     /*
-      * abort the operation if not running transactions
-      */
-     if (IsBootstrapProcessingMode())
-         elog(ERROR, "cannot UPDATE during bootstrap");
-
-     /*
-      * get the heap tuple out of the tuple table slot, making sure we have a
-      * writable copy
-      */
-     tuple = ExecMaterializeSlot(slot);
-
-     /*
-      * get information on the (current) result relation
-      */
-     resultRelInfo = estate->es_result_relation_info;
-     resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
-     /* BEFORE ROW UPDATE Triggers */
-     if (resultRelInfo->ri_TrigDesc &&
-         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
-     {
-         HeapTuple    newtuple;
-
-         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
-                                         tupleid, tuple);
-
-         if (newtuple == NULL)    /* "do nothing" */
-             return;
-
-         if (newtuple != tuple)    /* modified by Trigger(s) */
-         {
-             /*
-              * Put the modified tuple into a slot for convenience of routines
-              * below.  We assume the tuple was allocated in per-tuple memory
-              * context, and therefore will go away by itself. The tuple table
-              * slot should not try to clear it.
-              */
-             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
-
-             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
-                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
-             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
-             slot = newslot;
-             tuple = newtuple;
-         }
-     }
-
-     /*
-      * Check the constraints of the tuple
-      *
-      * If we generate a new candidate tuple after EvalPlanQual testing, we
-      * must loop back here and recheck constraints.  (We don't need to redo
-      * triggers, however.  If there are any BEFORE triggers then trigger.c
-      * will have done heap_lock_tuple to lock the correct tuple, so there's no
-      * need to do them again.)
-      */
- lreplace:;
-     if (resultRelationDesc->rd_att->constr)
-         ExecConstraints(resultRelInfo, slot, estate);
-
-     /*
-      * replace the heap tuple
-      *
-      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
-      * the row to be updated is visible to that snapshot, and throw a can't-
-      * serialize error if not.    This is a special-case behavior needed for
-      * referential integrity updates in serializable transactions.
-      */
-     result = heap_update(resultRelationDesc, tupleid, tuple,
-                          &update_ctid, &update_xmax,
-                          estate->es_output_cid,
-                          estate->es_crosscheck_snapshot,
-                          true /* wait for commit */ );
-     switch (result)
-     {
-         case HeapTupleSelfUpdated:
-             /* already deleted by self; nothing to do */
-             return;
-
-         case HeapTupleMayBeUpdated:
-             break;
-
-         case HeapTupleUpdated:
-             if (IsXactIsoLevelSerializable)
-                 ereport(ERROR,
-                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                          errmsg("could not serialize access due to concurrent update")));
-             else if (!ItemPointerEquals(tupleid, &update_ctid))
-             {
-                 TupleTableSlot *epqslot;
-
-                 epqslot = EvalPlanQual(estate,
-                                        resultRelInfo->ri_RangeTableIndex,
-                                        &update_ctid,
-                                        update_xmax);
-                 if (!TupIsNull(epqslot))
-                 {
-                     *tupleid = update_ctid;
-                     slot = ExecFilterJunk(estate->es_junkFilter, epqslot);
-                     tuple = ExecMaterializeSlot(slot);
-                     goto lreplace;
-                 }
-             }
-             /* tuple already deleted; nothing to do */
-             return;
-
-         default:
-             elog(ERROR, "unrecognized heap_update status: %u", result);
-             return;
-     }
-
-     IncrReplaced();
-     (estate->es_processed)++;
-
-     /*
-      * Note: instead of having to update the old index tuples associated with
-      * the heap tuple, all we do is form and insert new index tuples. This is
-      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
-      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
-      * here is insert new index tuples.  -cim 9/27/89
-      */
-
-     /*
-      * insert index entries for tuple
-      *
-      * Note: heap_update returns the tid (location) of the new tuple in the
-      * t_self field.
-      *
-      * If it's a HOT update, we mustn't insert new index entries.
-      */
-     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
-         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
-                                                estate, false);
-
-     /* AFTER ROW UPDATE Triggers */
-     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
-                          recheckIndexes);
-
-     /* Process RETURNING if present */
-     if (resultRelInfo->ri_projectReturning)
-         ExecProcessReturning(resultRelInfo->ri_projectReturning,
-                              slot, planSlot, dest);
- }
-
  /*
   * ExecRelCheck --- check that tuple meets constraints for result relation
   */
--- 1426,1431 ----
***************
*** 2222,2263 **** ExecConstraints(ResultRelInfo *resultRelInfo,
  }

  /*
-  * ExecProcessReturning --- evaluate a RETURNING list and send to dest
-  *
-  * projectReturning: RETURNING projection info for current result rel
-  * tupleSlot: slot holding tuple actually inserted/updated/deleted
-  * planSlot: slot holding tuple returned by top plan node
-  * dest: where to send the output
-  */
- static void
- ExecProcessReturning(ProjectionInfo *projectReturning,
-                      TupleTableSlot *tupleSlot,
-                      TupleTableSlot *planSlot,
-                      DestReceiver *dest)
- {
-     ExprContext *econtext = projectReturning->pi_exprContext;
-     TupleTableSlot *retSlot;
-
-     /*
-      * Reset per-tuple memory context to free any expression evaluation
-      * storage allocated in the previous cycle.
-      */
-     ResetExprContext(econtext);
-
-     /* Make tuple and any needed join variables available to ExecProject */
-     econtext->ecxt_scantuple = tupleSlot;
-     econtext->ecxt_outertuple = planSlot;
-
-     /* Compute the RETURNING expressions */
-     retSlot = ExecProject(projectReturning, NULL);
-
-     /* Send to dest */
-     (*dest->receiveSlot) (retSlot, dest);
-
-     ExecClearTuple(retSlot);
- }
-
- /*
   * Check a modified tuple to see if we want to process its updated version
   * under READ COMMITTED rules.
   *
--- 1526,1531 ----
*** a/src/backend/executor/execProcnode.c
--- b/src/backend/executor/execProcnode.c
***************
*** 90,95 ****
--- 90,96 ----
  #include "executor/nodeHash.h"
  #include "executor/nodeHashjoin.h"
  #include "executor/nodeIndexscan.h"
+ #include "executor/nodeDml.h"
  #include "executor/nodeLimit.h"
  #include "executor/nodeMaterial.h"
  #include "executor/nodeMergejoin.h"
***************
*** 285,290 **** ExecInitNode(Plan *node, EState *estate, int eflags)
--- 286,296 ----
                                                   estate, eflags);
              break;

+         case T_Dml:
+             result = (PlanState *) ExecInitDml((Dml *) node,
+                                                  estate, eflags);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;        /* keep compiler quiet */
***************
*** 450,455 **** ExecProcNode(PlanState *node)
--- 456,465 ----
              result = ExecLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             result = ExecDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              result = NULL;
***************
*** 666,671 **** ExecEndNode(PlanState *node)
--- 676,685 ----
              ExecEndLimit((LimitState *) node);
              break;

+         case T_DmlState:
+             ExecEndDml((DmlState *) node);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
              break;
*** a/src/backend/executor/nodeAppend.c
--- b/src/backend/executor/nodeAppend.c
***************
*** 103,123 **** exec_append_initialize_next(AppendState *appendstate)
      }
      else
      {
-         /*
-          * initialize the scan
-          *
-          * If we are controlling the target relation, select the proper active
-          * ResultRelInfo and junk filter for this target.
-          */
-         if (((Append *) appendstate->ps.plan)->isTarget)
-         {
-             Assert(whichplan < estate->es_num_result_relations);
-             estate->es_result_relation_info =
-                 estate->es_result_relations + whichplan;
-             estate->es_junkFilter =
-                 estate->es_result_relation_info->ri_junkFilter;
-         }
-
          return TRUE;
      }
  }
--- 103,108 ----
***************
*** 164,189 **** ExecInitAppend(Append *node, EState *estate, int eflags)
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!     /*
!      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
!      * XXX pretty dirty way of determining that this case applies ...
!      */
!     if (node->isTarget && estate->es_evTuple != NULL)
!     {
!         int            tplan;
!
!         tplan = estate->es_result_relation_info - estate->es_result_relations;
!         Assert(tplan >= 0 && tplan < nplans);
!
!         appendstate->as_firstplan = tplan;
!         appendstate->as_lastplan = tplan;
!     }
!     else
!     {
!         /* normal case, scan all subplans */
!         appendstate->as_firstplan = 0;
!         appendstate->as_lastplan = nplans - 1;
!     }

      /*
       * Miscellaneous initialization
--- 149,157 ----
      appendstate->appendplans = appendplanstates;
      appendstate->as_nplans = nplans;

!
!     appendstate->as_firstplan = 0;
!     appendstate->as_lastplan = nplans - 1;

      /*
       * Miscellaneous initialization
*** /dev/null
--- b/src/backend/executor/nodeDml.c
***************
*** 0 ****
--- 1,891 ----
+ #include "postgres.h"
+
+ #include "access/xact.h"
+ #include "parser/parsetree.h"
+ #include "executor/executor.h"
+ #include "executor/execdebug.h"
+ #include "executor/nodeDml.h"
+ #include "commands/trigger.h"
+ #include "nodes/nodeFuncs.h"
+ #include "utils/memutils.h"
+ #include "utils/builtins.h"
+ #include "utils/tqual.h"
+ #include "storage/bufmgr.h"
+ #include "miscadmin.h"
+
+ /*
+  * Verify that the tuples to be produced by INSERT or UPDATE match the
+  * target relation's rowtype
+  *
+  * We do this to guard against stale plans.  If plan invalidation is
+  * functioning properly then we should never get a failure here, but better
+  * safe than sorry.  Note that this is called after we have obtained lock
+  * on the target rel, so the rowtype can't change underneath us.
+  *
+  * The plan output is represented by its targetlist, because that makes
+  * handling the dropped-column case easier.
+  */
+ static void
+ ExecCheckPlanOutput(Relation resultRel, List *targetList)
+ {
+     TupleDesc    resultDesc = RelationGetDescr(resultRel);
+     int            attno = 0;
+     ListCell   *lc;
+
+     foreach(lc, targetList)
+     {
+         TargetEntry *tle = (TargetEntry *) lfirst(lc);
+         Form_pg_attribute attr;
+
+         if (tle->resjunk)
+             continue;            /* ignore junk tlist items */
+
+         if (attno >= resultDesc->natts)
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("table row type and query-specified row type do not match"),
+                      errdetail("Query has too many columns.")));
+         attr = resultDesc->attrs[attno++];
+
+         if (!attr->attisdropped)
+         {
+             /* Normal case: demand type match */
+             if (exprType((Node *) tle->expr) != attr->atttypid)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Table has type %s at ordinal position %d, but query expects %s.",
+                                    format_type_be(attr->atttypid),
+                                    attno,
+                              format_type_be(exprType((Node *) tle->expr)))));
+         }
+         else
+         {
+             /*
+              * For a dropped column, we can't check atttypid (it's likely 0).
+              * In any case the planner has most likely inserted an INT4 null.
+              * What we insist on is just *some* NULL constant.
+              */
+             if (!IsA(tle->expr, Const) ||
+                 !((Const *) tle->expr)->constisnull)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_DATATYPE_MISMATCH),
+                          errmsg("table row type and query-specified row type do not match"),
+                          errdetail("Query provides a value for a dropped column at ordinal position %d.",
+                                    attno)));
+         }
+     }
+     if (attno != resultDesc->natts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+           errmsg("table row type and query-specified row type do not match"),
+                  errdetail("Query has too few columns.")));
+ }
+
+ static TupleTableSlot*
+ ExecProcessReturning(ProjectionInfo *projectReturning,
+                      TupleTableSlot *tupleSlot,
+                      TupleTableSlot *planSlot)
+ {
+     ExprContext *econtext = projectReturning->pi_exprContext;
+     TupleTableSlot *retSlot;
+
+     /*
+      * Reset per-tuple memory context to free any expression evaluation
+      * storage allocated in the previous cycle.
+      */
+     ResetExprContext(econtext);
+
+     /* Make tuple and any needed join variables available to ExecProject */
+     econtext->ecxt_scantuple = tupleSlot;
+     econtext->ecxt_outertuple = planSlot;
+
+     /* Compute the RETURNING expressions */
+     retSlot = ExecProject(projectReturning, NULL);
+
+     return retSlot;
+ }
+
+ static TupleTableSlot *
+ ExecInsert(TupleTableSlot *slot,
+             ItemPointer tupleid,
+             TupleTableSlot *planSlot,
+             EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     Oid            newId;
+     List        *recheckIndexes = NIL;
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relations;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /*
+      * If the result relation has OIDs, force the tuple's OID to zero so that
+      * heap_insert will assign a fresh OID.  Usually the OID already will be
+      * zero at this point, but there are corner cases where the plan tree can
+      * return a tuple extracted literally from some table with the same
+      * rowtype.
+      *
+      * XXX if we ever wanted to allow users to assign their own OIDs to new
+      * rows, this'd be the place to do it.  For the moment, we make a point of
+      * doing this before calling triggers, so that a user-supplied trigger
+      * could hack the OID if desired.
+      */
+     if (resultRelationDesc->rd_rel->relhasoids)
+         HeapTupleSetOid(tuple, InvalidOid);
+
+     /* BEFORE ROW INSERT Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return NULL;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      */
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * insert the tuple
+      *
+      * Note: heap_insert returns the tid (location) of the new tuple in the
+      * t_self field.
+      */
+     newId = heap_insert(resultRelationDesc, tuple,
+                         estate->es_output_cid, 0, NULL);
+
+     IncrAppended();
+     (estate->es_processed)++;
+     estate->es_lastoid = newId;
+     setLastTid(&(tuple->t_self));
+
+     /*
+      * insert index entries for tuple
+      */
+     if (resultRelInfo->ri_NumIndices > 0)
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false);
+
+     /* AFTER ROW INSERT Triggers */
+     ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+     return slot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecDelete
+  *
+  *        DELETE is like UPDATE, except that we delete the tuple and no
+  *        index modifications are needed
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecDelete(ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     ResultRelInfo* resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW DELETE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_DELETE] > 0)
+     {
+         bool        dodelete;
+
+         dodelete = ExecBRDeleteTriggers(estate, resultRelInfo, tupleid);
+
+         if (!dodelete)            /* "do nothing" */
+             return planSlot;
+     }
+
+     /*
+      * delete the tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be deleted is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+ ldelete:;
+     result = heap_delete(resultRelationDesc, tupleid,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     goto ldelete;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_delete status: %u", result);
+             return NULL;
+     }
+
+     IncrDeleted();
+     (estate->es_processed)++;
+
+     /*
+      * Note: Normally one would think that we have to delete index tuples
+      * associated with the heap tuple now...
+      *
+      * ... but in POSTGRES, we have no need to do this because VACUUM will
+      * take care of it later.  We can't delete index tuples immediately
+      * anyway, since the tuple is still visible to other transactions.
+      */
+
+     /* AFTER ROW DELETE Triggers */
+     ExecARDeleteTriggers(estate, resultRelInfo, tupleid);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+     {
+         /*
+          * We have to put the target tuple into a slot, which means first we
+          * gotta fetch it.    We can use the trigger tuple slot.
+          */
+         TupleTableSlot *slot = estate->es_trig_tuple_slot;
+         HeapTupleData deltuple;
+         Buffer        delbuffer;
+
+         deltuple.t_self = *tupleid;
+         if (!heap_fetch(resultRelationDesc, SnapshotAny,
+                         &deltuple, &delbuffer, false, NULL))
+             elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
+
+         if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
+             ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
+         ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
+
+         planSlot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                              slot, planSlot);
+
+         ExecClearTuple(slot);
+         ReleaseBuffer(delbuffer);
+     }
+
+     return planSlot;
+ }
+
+ /* ----------------------------------------------------------------
+  *        ExecUpdate
+  *
+  *        note: we can't run UPDATE queries with transactions
+  *        off because UPDATEs are actually INSERTs and our
+  *        scan will mistakenly loop forever, updating the tuple
+  *        it just inserted..    This should be fixed but until it
+  *        is, we don't want to get stuck in an infinite loop
+  *        which corrupts your database..
+  * ----------------------------------------------------------------
+  */
+ static TupleTableSlot *
+ ExecUpdate(TupleTableSlot *slot,
+            ItemPointer tupleid,
+            TupleTableSlot *planSlot,
+            EState *estate)
+ {
+     HeapTuple    tuple;
+     ResultRelInfo *resultRelInfo;
+     Relation    resultRelationDesc;
+     HTSU_Result result;
+     ItemPointerData update_ctid;
+     TransactionId update_xmax;
+     List *recheckIndexes = NIL;
+
+     /*
+      * abort the operation if not running transactions
+      */
+     if (IsBootstrapProcessingMode())
+         elog(ERROR, "cannot UPDATE during bootstrap");
+
+     /*
+      * get the heap tuple out of the tuple table slot, making sure we have a
+      * writable copy
+      */
+     tuple = ExecMaterializeSlot(slot);
+
+     /*
+      * get information on the (current) result relation
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+     /* BEFORE ROW UPDATE Triggers */
+     if (resultRelInfo->ri_TrigDesc &&
+         resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_UPDATE] > 0)
+     {
+         HeapTuple    newtuple;
+
+         newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
+                                         tupleid, tuple);
+
+         if (newtuple == NULL)    /* "do nothing" */
+             return planSlot;
+
+         if (newtuple != tuple)    /* modified by Trigger(s) */
+         {
+             /*
+              * Put the modified tuple into a slot for convenience of routines
+              * below.  We assume the tuple was allocated in per-tuple memory
+              * context, and therefore will go away by itself. The tuple table
+              * slot should not try to clear it.
+              */
+             TupleTableSlot *newslot = estate->es_trig_tuple_slot;
+
+             if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
+                 ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
+             ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
+             slot = newslot;
+             tuple = newtuple;
+         }
+     }
+
+     /*
+      * Check the constraints of the tuple
+      *
+      * If we generate a new candidate tuple after EvalPlanQual testing, we
+      * must loop back here and recheck constraints.  (We don't need to redo
+      * triggers, however.  If there are any BEFORE triggers then trigger.c
+      * will have done heap_lock_tuple to lock the correct tuple, so there's no
+      * need to do them again.)
+      */
+ lreplace:;
+     if (resultRelationDesc->rd_att->constr)
+         ExecConstraints(resultRelInfo, slot, estate);
+
+     /*
+      * replace the heap tuple
+      *
+      * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that
+      * the row to be updated is visible to that snapshot, and throw a can't-
+      * serialize error if not.    This is a special-case behavior needed for
+      * referential integrity updates in serializable transactions.
+      */
+     result = heap_update(resultRelationDesc, tupleid, tuple,
+                          &update_ctid, &update_xmax,
+                          estate->es_output_cid,
+                          estate->es_crosscheck_snapshot,
+                          true /* wait for commit */ );
+     switch (result)
+     {
+         case HeapTupleSelfUpdated:
+             /* already deleted by self; nothing to do */
+             return planSlot;
+
+         case HeapTupleMayBeUpdated:
+             break;
+
+         case HeapTupleUpdated:
+             if (IsXactIsoLevelSerializable)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                          errmsg("could not serialize access due to concurrent update")));
+             else if (!ItemPointerEquals(tupleid, &update_ctid))
+             {
+                 TupleTableSlot *epqslot;
+
+                 epqslot = EvalPlanQual(estate,
+                                        resultRelInfo->ri_RangeTableIndex,
+                                        &update_ctid,
+                                        update_xmax);
+                 if (!TupIsNull(epqslot))
+                 {
+                     *tupleid = update_ctid;
+                     slot = ExecFilterJunk(estate->es_result_relation_info->ri_junkFilter, epqslot);
+                     tuple = ExecMaterializeSlot(slot);
+                     goto lreplace;
+                 }
+             }
+             /* tuple already deleted; nothing to do */
+             return planSlot;
+
+         default:
+             elog(ERROR, "unrecognized heap_update status: %u", result);
+             return NULL;
+     }
+
+     IncrReplaced();
+     (estate->es_processed)++;
+
+     /*
+      * Note: instead of having to update the old index tuples associated with
+      * the heap tuple, all we do is form and insert new index tuples. This is
+      * because UPDATEs are actually DELETEs and INSERTs, and index tuple
+      * deletion is done later by VACUUM (see notes in ExecDelete).    All we do
+      * here is insert new index tuples.  -cim 9/27/89
+      */
+
+     /*
+      * insert index entries for tuple
+      *
+      * Note: heap_update returns the tid (location) of the new tuple in the
+      * t_self field.
+      *
+      * If it's a HOT update, we mustn't insert new index entries.
+      */
+     if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
+         recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+                                                estate, false);
+
+     /* AFTER ROW UPDATE Triggers */
+     ExecARUpdateTriggers(estate, resultRelInfo, tupleid, tuple,
+                          recheckIndexes);
+
+     /* Process RETURNING if present */
+     if (resultRelInfo->ri_projectReturning)
+         slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
+                                     slot, planSlot);
+
+     return slot;
+ }
+
+
+ static void fireBSTriggers(DmlState *node)
+ {
+     /*
+      * Process BEFORE EACH STATEMENT triggers
+      */
+     switch (node->operation)
+     {
+         case CMD_UPDATE:
+             ExecBSUpdateTriggers(node->ps.state, node->ps.state->es_result_relation_info);
+             break;
+         case CMD_DELETE:
+             ExecBSDeleteTriggers(node->ps.state, node->ps.state->es_result_relation_info);
+             break;
+         case CMD_INSERT:
+             ExecBSInsertTriggers(node->ps.state, node->ps.state->es_result_relation_info);
+             break;
+         default:
+             elog(ERROR, "unknown operation");
+             break;
+     }
+ }
+
+
+ static void fireASTriggers(DmlState *node)
+ {
+     /*
+      * Process AFTER EACH STATEMENT triggers
+      */
+     switch (node->operation)
+     {
+         case CMD_UPDATE:
+             ExecASUpdateTriggers(node->ps.state, node->ps.state->es_result_relation_info);
+             break;
+         case CMD_DELETE:
+             ExecASDeleteTriggers(node->ps.state, node->ps.state->es_result_relation_info);
+             break;
+         case CMD_INSERT:
+             ExecASInsertTriggers(node->ps.state, node->ps.state->es_result_relation_info);
+             break;
+         default:
+             elog(ERROR, "unknown operation");
+             break;
+     }
+ }
+
+ TupleTableSlot *
+ ExecDml(DmlState *node)
+ {
+     CmdType operation = node->operation;
+     EState *estate = node->ps.state;
+     JunkFilter *junkfilter;
+     TupleTableSlot *slot;
+     TupleTableSlot *planSlot;
+     ItemPointer tupleid = NULL;
+     ItemPointerData tuple_ctid;
+
+     /* Do we need to fire BEFORE STATEMENT triggers? */
+     if (node->fireBSTriggers)
+     {
+         fireBSTriggers(node);
+         node->fireBSTriggers = false;
+     }
+
+     for (;;)
+     {
+         planSlot = ExecProcNode(node->dmlplans[node->ds_whichplan]);
+         if (TupIsNull(planSlot))
+         {
+             node->ds_whichplan++;
+             if (node->ds_whichplan < node->ds_nplans)
+             {
+                 estate->es_result_relation_info++;
+                 continue;
+             }
+             else
+             {
+                 fireASTriggers(node);
+                 return NULL;
+             }
+         }
+         else
+             break;
+     }
+
+     slot = planSlot;
+
+     if ((junkfilter = estate->es_result_relation_info->ri_junkFilter) != NULL)
+     {
+         /*
+          * extract the 'ctid' junk attribute.
+          */
+         if (operation == CMD_UPDATE || operation == CMD_DELETE)
+         {
+             Datum        datum;
+             bool        isNull;
+
+             datum = ExecGetJunkAttribute(slot, junkfilter->jf_junkAttNo,
+                                              &isNull);
+             /* shouldn't ever get a null result... */
+             if (isNull)
+                 elog(ERROR, "ctid is NULL");
+
+             tupleid = (ItemPointer) DatumGetPointer(datum);
+             tuple_ctid = *tupleid;    /* make sure we don't free the ctid!! */
+             tupleid = &tuple_ctid;
+         }
+
+         if (operation != CMD_DELETE)
+             slot = ExecFilterJunk(junkfilter, slot);
+     }
+
+     switch (operation)
+     {
+         case CMD_INSERT:
+             return ExecInsert(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_UPDATE:
+             return ExecUpdate(slot, tupleid, planSlot, estate);
+             break;
+         case CMD_DELETE:
+             return ExecDelete(tupleid, slot, estate);
+         default:
+             elog(ERROR, "unknown operation");
+             break;
+     }
+
+     return NULL;
+ }
+
+ DmlState *
+ ExecInitDml(Dml *node, EState *estate, int eflags)
+ {
+     DmlState *dmlstate;
+     ResultRelInfo *resultRelInfo;
+     Plan *subplan;
+     ListCell *l;
+     CmdType operation = node->operation;
+     int nplans;
+     int i;
+
+     TupleDesc tupDesc;
+
+     nplans = list_length(node->plans);
+
+     /*
+      * Do we want to scan just one subplan?  (Special case for EvalPlanQual)
+      * XXX pretty dirty way of determining that this case applies ...
+      */
+     if (estate->es_evTuple != NULL)
+     {
+         int tplan;
+
+         tplan = estate->es_result_relation_info - estate->es_result_relations;
+         Assert(tplan >= 0 && tplan < nplans);
+
+         /*
+          * We don't want another DmlNode on top, so just
+          * return a PlanState for the subplan wanted.
+          */
+         return (DmlState *) ExecInitNode(list_nth(node->plans, tplan), estate, eflags);
+     }
+
+     /*
+      * create state structure
+      */
+     dmlstate = makeNode(DmlState);
+     dmlstate->ps.plan = (Plan *) node;
+     dmlstate->ps.state = estate;
+     dmlstate->ps.targetlist = node->plan.targetlist;
+
+     dmlstate->ds_nplans = nplans;
+     dmlstate->dmlplans = (PlanState **) palloc0(sizeof(PlanState *) * nplans);
+     dmlstate->operation = node->operation;
+     dmlstate->fireBSTriggers = true;
+
+     estate->es_result_relation_info = estate->es_result_relations;
+     i = 0;
+     foreach(l, node->plans)
+     {
+         subplan = lfirst(l);
+
+         dmlstate->dmlplans[i] = ExecInitNode(subplan, estate, eflags);
+
+         i++;
+         estate->es_result_relation_info++;
+     }
+
+     estate->es_result_relation_info = estate->es_result_relations;
+
+     dmlstate->ds_whichplan = 0;
+
+     subplan = (Plan *) linitial(node->plans);
+
+     if (node->returningLists)
+     {
+         TupleTableSlot *slot;
+         ExprContext *econtext;
+
+         /*
+          * Initialize result tuple slot and assign
+          * type from the RETURNING list.
+          */
+         tupDesc = ExecTypeFromTL((List *) linitial(node->returningLists),
+                                  false);
+
+         /*
+          * Set up a slot for the output of the RETURNING projection(s).
+          */
+         slot = ExecAllocTableSlot(&estate->es_tupleTable);
+         ExecSetSlotDescriptor(slot, tupDesc);
+
+         econtext = CreateExprContext(estate);
+
+         Assert(list_length(node->returningLists) == estate->es_num_result_relations);
+         resultRelInfo = estate->es_result_relations;
+         foreach(l, node->returningLists)
+         {
+             List       *rlist = (List *) lfirst(l);
+             List       *rliststate;
+
+             rliststate = (List *) ExecInitExpr((Expr *) rlist, &dmlstate->ps);
+             resultRelInfo->ri_projectReturning =
+                 ExecBuildProjectionInfo(rliststate, econtext, slot,
+                                         resultRelInfo->ri_RelationDesc->rd_att);
+             resultRelInfo++;
+         }
+
+         dmlstate->ps.ps_ResultTupleSlot = slot;
+         dmlstate->ps.ps_ExprContext = econtext;
+     }
+     else
+     {
+         ExecInitResultTupleSlot(estate, &dmlstate->ps);
+         tupDesc = ExecTypeFromTL(subplan->targetlist, false);
+         ExecAssignResultType(&dmlstate->ps, tupDesc);
+
+         dmlstate->ps.ps_ExprContext = NULL;
+     }
+
+     /*
+      * Initialize the junk filter if needed. INSERT queries need a filter
+      * if there are any junk attrs in the tlist.  UPDATE and DELETE
+      * always need a filter, since there's always a junk 'ctid' attribute
+      * present --- no need to look first.
+      *
+      * This section of code is also a convenient place to verify that the
+      * output of an INSERT or UPDATE matches the target table(s).
+      */
+     {
+         bool        junk_filter_needed = false;
+         ListCell   *tlist;
+
+         switch (operation)
+         {
+             case CMD_INSERT:
+                 foreach(tlist, subplan->targetlist)
+                 {
+                     TargetEntry *tle = (TargetEntry *) lfirst(tlist);
+
+                     if (tle->resjunk)
+                     {
+                         junk_filter_needed = true;
+                         break;
+                     }
+                 }
+                 break;
+             case CMD_UPDATE:
+             case CMD_DELETE:
+                 junk_filter_needed = true;
+                 break;
+             default:
+                 break;
+         }
+
+         resultRelInfo = estate->es_result_relations;
+
+         if (junk_filter_needed)
+         {
+             /*
+              * If there are multiple result relations, each one needs its own
+              * junk filter.  Note this is only possible for UPDATE/DELETE, so
+              * we can't be fooled by some needing a filter and some not.
+
+              */
+             if (nplans > 1)
+             {
+                 for (i = 0; i < nplans; i++)
+                 {
+                     PlanState *ps = dmlstate->dmlplans[i];
+                     JunkFilter *j;
+
+                     if (operation == CMD_UPDATE)
+                         ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                             ps->plan->targetlist);
+
+                     j = ExecInitJunkFilter(ps->plan->targetlist,
+                             resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                 ExecInitExtraTupleSlot(estate));
+
+
+                     /*
+                      * Since it must be UPDATE/DELETE, there had better be a
+                      * "ctid" junk attribute in the tlist ... but ctid could
+                      * be at a different resno for each result relation. We
+                      * look up the ctid resnos now and save them in the
+                      * junkfilters.
+                      */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                     resultRelInfo->ri_junkFilter = j;
+                     resultRelInfo++;
+                 }
+             }
+             else
+             {
+                 JunkFilter *j;
+                 subplan = dmlstate->dmlplans[0]->plan;
+
+                 if (operation == CMD_INSERT || operation == CMD_UPDATE)
+                     ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
+                                         subplan->targetlist);
+
+                 j = ExecInitJunkFilter(subplan->targetlist,
+                                        resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
+                                        ExecInitExtraTupleSlot(estate));
+
+                 if (operation == CMD_UPDATE || operation == CMD_DELETE)
+                 {
+                     /* FOR UPDATE/DELETE, find the ctid junk attr now */
+                     j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
+                     if (!AttributeNumberIsValid(j->jf_junkAttNo))
+                         elog(ERROR, "could not find junk ctid column");
+                 }
+
+                 resultRelInfo->ri_junkFilter = j;
+             }
+         }
+         else
+         {
+             if (operation == CMD_INSERT)
+                 ExecCheckPlanOutput(estate->es_result_relation_info->ri_RelationDesc,
+                                     subplan->targetlist);
+         }
+     }
+
+     return dmlstate;
+ }
+
+ void
+ ExecEndDml(DmlState *node)
+ {
+     int i;
+
+     /*
+      * Free the exprcontext
+      */
+     ExecFreeExprContext(&node->ps);
+
+     /*
+      * clean out the tuple table
+      */
+     ExecClearTuple(node->ps.ps_ResultTupleSlot);
+
+     /*
+      * shut down subplans
+      */
+     for (i=0;i<node->ds_nplans;++i)
+     {
+         ExecEndNode(node->dmlplans[i]);
+     }
+
+     pfree(node->dmlplans);
+ }
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 171,177 **** _copyAppend(Append *from)
       * copy remainder of node
       */
      COPY_NODE_FIELD(appendplans);
-     COPY_SCALAR_FIELD(isTarget);

      return newnode;
  }
--- 171,176 ----
***************
*** 1407,1412 **** _copyXmlExpr(XmlExpr *from)
--- 1406,1426 ----

      return newnode;
  }
+
+ static Dml *
+ _copyDml(Dml *from)
+ {
+     Dml    *newnode = makeNode(Dml);
+
+     CopyPlanFields((Plan *) from, (Plan *) newnode);
+
+     COPY_NODE_FIELD(plans);
+     COPY_SCALAR_FIELD(operation);
+     COPY_NODE_FIELD(returningLists);
+
+     return newnode;
+ }
+

  /*
   * _copyNullIfExpr (same as OpExpr)
***************
*** 4131,4136 **** copyObject(void *from)
--- 4145,4153 ----
          case T_XmlSerialize:
              retval = _copyXmlSerialize(from);
              break;
+         case T_Dml:
+             retval = _copyDml(from);
+             break;

          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 326,332 **** _outAppend(StringInfo str, Append *node)
      _outPlanInfo(str, (Plan *) node);

      WRITE_NODE_FIELD(appendplans);
-     WRITE_BOOL_FIELD(isTarget);
  }

  static void
--- 326,331 ----
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 579,585 **** create_append_plan(PlannerInfo *root, AppendPath *best_path)
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, false, tlist);

      return (Plan *) plan;
  }
--- 579,585 ----
          subplans = lappend(subplans, create_plan(root, subpath));
      }

!     plan = make_append(subplans, tlist);

      return (Plan *) plan;
  }
***************
*** 2621,2627 **** make_worktablescan(List *qptlist,
  }

  Append *
! make_append(List *appendplans, bool isTarget, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
--- 2621,2627 ----
  }

  Append *
! make_append(List *appendplans, List *tlist)
  {
      Append       *node = makeNode(Append);
      Plan       *plan = &node->plan;
***************
*** 2657,2663 **** make_append(List *appendplans, bool isTarget, List *tlist)
      plan->lefttree = NULL;
      plan->righttree = NULL;
      node->appendplans = appendplans;
-     node->isTarget = isTarget;

      return node;
  }
--- 2657,2662 ----
***************
*** 3665,3670 **** make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
--- 3664,3716 ----
      return node;
  }

+ Dml *
+ make_dml(List *subplans, List *returningLists, CmdType operation)
+ {
+     Dml *node = makeNode(Dml);
+     Plan       *plan = &node->plan;
+     double        total_size;
+     ListCell   *subnode;
+
+     /*
+      * Compute cost as sum of subplan costs.
+      */
+     plan->startup_cost = 0;
+     plan->total_cost = 0;
+     plan->plan_rows = 0;
+     total_size = 0;
+     foreach(subnode, subplans)
+     {
+         Plan       *subplan = (Plan *) lfirst(subnode);
+
+         if (subnode == list_head(subplans))    /* first node? */
+             plan->startup_cost = subplan->startup_cost;
+         plan->total_cost += subplan->total_cost;
+         plan->plan_rows += subplan->plan_rows;
+         total_size += subplan->plan_width * subplan->plan_rows;
+     }
+     if (plan->plan_rows > 0)
+         plan->plan_width = rint(total_size / plan->plan_rows);
+     else
+         plan->plan_width = 0;
+
+     node->plan.lefttree = NULL;
+     node->plan.righttree = NULL;
+     node->plan.qual = NIL;
+
+     if (returningLists)
+         node->plan.targetlist = linitial(returningLists);
+     else
+         node->plan.targetlist = NIL;
+
+     node->plans = subplans;
+     node->returningLists = returningLists;
+
+     node->operation = operation;
+
+     return node;
+ }
+
  /*
   * make_result
   *      Build a Result plan node
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 478,485 **** subquery_planner(PlannerGlobal *glob, Query *parse,
--- 478,493 ----
          rt_fetch(parse->resultRelation, parse->rtable)->inh)
          plan = inheritance_planner(root);
      else
+     {
          plan = grouping_planner(root, tuple_fraction);

+         if (parse->commandType != CMD_SELECT)
+             plan = (Plan *) make_dml(list_make1(plan),
+                                      root->returningLists,
+                                      parse->commandType);
+     }
+
+
      /*
       * If any subplans were generated, or if we're inside a subplan, build
       * initPlan list and extParam/allParam sets for plan nodes, and attach the
***************
*** 625,633 **** preprocess_qual_conditions(PlannerInfo *root, Node *jtnode)
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Also, for both UPDATE
!  * and DELETE, the executor needs the Append plan node at the top, else it
!  * can't keep track of which table is the current target table.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
--- 633,639 ----
   * is an inheritance set. Source inheritance is expanded at the bottom of the
   * plan tree (see allpaths.c), but target inheritance has to be expanded at
   * the top.  The reason is that for UPDATE, each target relation needs a
!  * different targetlist matching its own column set.  Fortunately,
   * the UPDATE/DELETE target can never be the nullable side of an outer join,
   * so it's OK to generate the plan this way.
   *
***************
*** 738,748 **** inheritance_planner(PlannerInfo *root)
       */
      parse->rtable = rtable;

!     /* Suppress Append if there's only one surviving child rel */
!     if (list_length(subplans) == 1)
!         return (Plan *) linitial(subplans);
!
!     return (Plan *) make_append(subplans, true, tlist);
  }

  /*--------------------
--- 744,752 ----
       */
      parse->rtable = rtable;

!     return (Plan *) make_dml(subplans,
!                              root->returningLists,
!                              parse->commandType);
  }

  /*--------------------
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 375,380 **** set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset)
--- 375,398 ----
              set_join_references(glob, (Join *) plan, rtoffset);
              break;

+         case T_Dml:
+             {
+                 /*
+                  * grouping_planner() already called set_returning_clause_references
+                  * so the targetList's references are already set.
+                  */
+                 Dml *splan = (Dml *) plan;
+
+                 Assert(splan->plan.qual == NIL);
+                 foreach(l, splan->plans)
+                 {
+                     lfirst(l) = set_plan_refs(glob,
+                                               (Plan *) lfirst(l),
+                                               rtoffset);
+                 }
+             }
+             break;
+
          case T_Hash:
          case T_Material:
          case T_Sort:
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 2018,2023 **** finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params)
--- 2018,2024 ----
          case T_Unique:
          case T_SetOp:
          case T_Group:
+         case T_Dml:
              break;

          default:
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
***************
*** 448,454 **** generate_union_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
--- 448,454 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /*
       * For UNION ALL, we just need the Append plan.  For UNION, need to add
***************
*** 539,545 **** generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, false, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
--- 539,545 ----
      /*
       * Append the child results together.
       */
!     plan = (Plan *) make_append(planlist, tlist);

      /* Identify the grouping semantics */
      groupList = generate_setop_grouplist(op, tlist);
*** /dev/null
--- b/src/include/executor/nodeDml.h
***************
*** 0 ****
--- 1,10 ----
+ #ifndef NODEDML_H
+ #define NODEDML_H
+
+ #include "nodes/execnodes.h"
+
+ extern DmlState *ExecInitDml(Dml *node, EState *estate, int eflags);
+ extern TupleTableSlot *ExecDml(DmlState *node);
+ extern void ExecEndDml(DmlState *node);
+
+ #endif
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 976,981 **** typedef struct ResultState
--- 976,997 ----
  } ResultState;

  /* ----------------
+  *     DmlState information
+  * ----------------
+  */
+ typedef struct DmlState
+ {
+     PlanState        ps;                /* its first field is NodeTag */
+     PlanState      **dmlplans;
+     int                ds_nplans;
+     int                ds_whichplan;
+     bool            fireBSTriggers;
+
+     CmdType            operation;
+ } DmlState;
+
+
+ /* ----------------
   *     AppendState information
   *
   *        nplans            how many plans are in the list
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 71,76 **** typedef enum NodeTag
--- 71,77 ----
      T_Hash,
      T_SetOp,
      T_Limit,
+     T_Dml,
      /* this one isn't a subclass of Plan: */
      T_PlanInvalItem,

***************
*** 191,196 **** typedef enum NodeTag
--- 192,198 ----
      T_NullTestState,
      T_CoerceToDomainState,
      T_DomainConstraintState,
+     T_DmlState,

      /*
       * TAGS FOR PLANNER NODES (relation.h)
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 164,185 **** typedef struct Result
      Node       *resconstantqual;
  } Result;

  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
-  *
-  * Append nodes are sometimes used to switch between several result relations
-  * (when the target of an UPDATE or DELETE is an inheritance set).    Such a
-  * node will have isTarget true.  The Append executor is then responsible
-  * for updating the executor state to point at the correct target relation
-  * whenever it switches subplans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
-     bool        isTarget;
  } Append;

  /* ----------------
--- 164,188 ----
      Node       *resconstantqual;
  } Result;

+ typedef struct Dml
+ {
+     Plan        plan;
+
+     CmdType        operation;
+     List       *plans;
+     List       *returningLists;
+ } Dml;
+
+
  /* ----------------
   *     Append node -
   *        Generate the concatenation of the results of sub-plans.
   * ----------------
   */
  typedef struct Append
  {
      Plan        plan;
      List       *appendplans;
  } Append;

  /* ----------------
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 41,47 **** extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, bool isTarget, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
--- 41,47 ----
  extern Plan *create_plan(PlannerInfo *root, Path *best_path);
  extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
                    Index scanrelid, Plan *subplan, List *subrtable);
! extern Append *make_append(List *appendplans, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
                       Plan *lefttree, Plan *righttree, int wtParam,
                       List *distinctList, long numGroups);
***************
*** 69,74 **** extern Plan *materialize_finished_plan(Plan *subplan);
--- 69,75 ----
  extern Unique *make_unique(Plan *lefttree, List *distinctList);
  extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
             int64 offset_est, int64 count_est);
+ extern Dml *make_dml(List *subplans, List *returningLists, CmdType operation);
  extern SetOp *make_setop(SetOpCmd cmd, SetOpStrategy strategy, Plan *lefttree,
             List *distinctList, AttrNumber flagColIdx, int firstFlag,
             long numGroups, double outputRows);

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 18:59:33

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> Sorry for the delay, my master was a bit behind :-( .  I moved the
> trigger code to nodeDml.c with minor changes and removed unused
> resultRelation stuff from DML nodes completely.  This also has the
> README stuff in it.

Hm, I've not compared the two versions of the patch, but what I was
thinking was that I'd like to get the resultRelation stuff out of EState
and have it *only* in the DML nodes.  It sounds like you went in the
other direction --- what was the reason for that?
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

08 October 2009, 19:05:03

Tom Lane wrote:
> Hm, I've not compared the two versions of the patch, but what I was
> thinking was that I'd like to get the resultRelation stuff out of EState
> and have it *only* in the DML nodes.  It sounds like you went in the
> other direction --- what was the reason for that?

I've taken a compromise of that in the writeable CTE code; the DML nodes
have the index to the start of their result relation array which is part
of estate->es_result_relations.  This way the code that currently
depends on estate->es_result_relations works normally.  Also, we set
estate->es_result_relation_info only during ExecInitDml().  I didn't
want to break that either.  The "index to es_result_relations" is a bit
kludgy, but I didn't see any better ways to do this and wanted to move
on with the code and not be stuck in a single place for too long.  But
I'm thinking that this should be part of the writeable CTE patch.

Regards,
Marko Tiikkaja

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

08 October 2009, 19:25:03

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> Tom Lane wrote:
>> Hm, I've not compared the two versions of the patch, but what I was
>> thinking was that I'd like to get the resultRelation stuff out of EState
>> and have it *only* in the DML nodes.  It sounds like you went in the
>> other direction --- what was the reason for that?

> I've taken a compromise of that in the writeable CTE code; the DML nodes
> have the index to the start of their result relation array which is part
> of estate->es_result_relations.  This way the code that currently
> depends on estate->es_result_relations works normally.

OK, that will work I guess, though you'll need to consider how to combine
result_relations arrays from multiple DML nodes.  A quick look shows me
that there are a couple of places that do want to look at all the result
relations.  We could possibly get rid of that but it's not clear it'd
be much less ugly than what you suggest here.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Tom Lane

Date:

09 October 2009, 23:01:44

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> Tom Lane wrote:
>> Could you pull out a patch that includes those changes, please?

> Sorry for the delay, my master was a bit behind :-( .  I moved the
> trigger code to nodeDml.c with minor changes and removed unused
> resultRelation stuff from DML nodes completely.  This also has the
> README stuff in it.

Applied with a moderate amount of editorialization.  Notably, I didn't
like what you'd done with the EvalPlanQual stuff, and after a bit of
reflection decided that the right thing was to teach EvalPlanQual to
execute just the desired subplan.  Also, I put back the marking of
the ModifyTuple node with its target relations, which you'd removed
in the v4 patch --- I'm convinced that that will be necessary in some
form or other later, so taking it out now seemed like moving backward.

I did not do anything about changing EXPLAIN's output of trigger
information.  The stumbling block there is that EXPLAIN executes
queued AFTER trigger events only after finishing the main plan tree
execution.  The part of that that is driven by ModifyTable is just
the *queuing* of the triggers, not their *execution*.  So my previous
claim that all the trigger execution would now be part of ModifyTable
was wrong.  There are several things we could do here:

1. Nothing.  I don't care for this, though, because it will lead to
the inconsistent behavior that BEFORE triggers count as part of the
displayed runtime for ModifyTuple and AFTER triggers don't.

2. Move actual execution of (non-deferred) AFTER triggers inside
ModifyTuple.  This might be a good idea in order to have the most
consistent results for a series of WITH queries, but I'm not sure.

3. Have EXPLAIN show BEFORE triggers as associated with ModifyTuple
while still showing AFTER triggers as "free standing".  Seems a bit
inconsistent.

Comments?

Also, working on this patch made me really want to pull SELECT FOR
UPDATE/SHARE locking out as a separate node too.  We've talked about
that before but never got round to it.  It's really necessary to do
that in order to have something likeINSERT INTO foo SELECT * FROM bar FOR UPDATE;
behave sanely.  I don't recall at the moment whether that worked
sanely before, but it is definitely broken as of CVS tip.  Perhaps
I'll work on that this weekend.
        regards, tom lane

Re: Using results from INSERT ... RETURNING

From

Marko Tiikkaja

Date:

10 October 2009, 05:43:06

Tom Lane wrote:
> Applied with a moderate amount of editorialization.

Thank you!

> Notably, I didn't
> like what you'd done with the EvalPlanQual stuff, and after a bit of
> reflection decided that the right thing was to teach EvalPlanQual to
> execute just the desired subplan.

I didn't really like that either but I didn't have any good ideas.  This
is a lot better.

> Also, I put back the marking of
> the ModifyTuple node with its target relations, which you'd removed
> in the v4 patch --- I'm convinced that that will be necessary in some
> form or other later, so taking it out now seemed like moving backward.

Ok.

> 2. Move actual execution of (non-deferred) AFTER triggers inside
> ModifyTuple.  This might be a good idea in order to have the most
> consistent results for a series of WITH queries, but I'm not sure.

This definitely seems like the best option to me.


Regards,
Marko Tiikkaja