Re: Remembering bug #6123 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Remembering bug #6123
Msg-id 26568.1326475948@sss.pgh.pa.us
In response to Re: Remembering bug #6123  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Remembering bug #6123  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Remembering bug #6123  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> You were right.  One of the isolation tests failed an assertion;
> things pass with the attached change, setting the cmax
> conditionally.  Some comments updated.  I hope this is helpful.

I worked over this patch a bit, mostly cosmetically; updated version
attached.  However, we're not there yet.  I have a couple of remaining
concerns:

1. I think the error message needs more thought.  I believe it's
possible to trigger this error not only with BEFORE triggers, but also
with a volatile function used in the query that reaches around and
updates row(s) of the target table.  Now, I don't have the slightest
qualms about breaking any apps that do anything so dirty, but should
we try to generalize the message text to cope with that?

2. The HINT needs work too; it seems pretty useless as it stands.
I'd have expected some short reference to the technique of re-executing
the update in the trigger and then returning NULL.  (BTW, does this
patch require any documentation changes, and if so where?)
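To spell out the technique the HINT could reference: a BEFORE row trigger that has itself modified the target row can re-execute the operation and return NULL, so the outer command's action on that row is skipped.  A minimal sketch of that shape (the table "t" and its columns are illustrative only, and the pg_trigger_depth() recursion guard is an assumption on my part, not something the patch provides):

```sql
-- Illustrative only: "t" and its columns are hypothetical, and the
-- pg_trigger_depth() guard is assumed to be available.
create function t_upd_func() returns trigger
  language plpgsql as
$$
begin
  -- Inner invocation, fired by our own re-executed UPDATE below:
  -- let it proceed normally.
  if pg_trigger_depth() > 1 then
    return new;
  end if;
  -- Outer invocation: re-execute the update ourselves, then return
  -- NULL so the original UPDATE's action on this row is canceled.
  update t set data = new.data where id = old.id;
  return null;
end;
$$;

create trigger t_upd_trig before update on t
  for each row execute procedure t_upd_func();
```

The DELETE case would look the same, with the trigger re-executing the DELETE before returning NULL.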

3. I modified heap_lock_tuple to also return a HeapUpdateFailureData,
principally on the grounds that its API should be largely parallel to
heap_update, but having done that I can't help wondering whether its
callers are okay with skipping self-updated tuples.  I see that you
changed EvalPlanQualFetch, but I'm not entirely sure that that is right;
shouldn't it continue to ignore rows modified by the current command
itself?  And you did not touch the other two callers, which definitely
have got issues.  Here is an example, which is basically the reference
count case refactored into a single self-referencing table, so that we
can hit both parent and child rows in one operation:

create table test (
    id int primary key,
    parent int references test,
    data text,
    nchildren int not null default 0
);

create function test_ins_func()
  returns trigger language plpgsql as
$$
begin
  if new.parent is not null then
    update test set nchildren = nchildren + 1 where id = new.parent;
  end if;
  return new;
end;
$$;
create trigger test_ins_trig before insert on test
  for each row execute procedure test_ins_func();

create function test_del_func()
  returns trigger language plpgsql as
$$
begin
  if old.parent is not null then
    update test set nchildren = nchildren - 1 where id = old.parent;
  end if;
  return old;
end;
$$;
create trigger test_del_trig before delete on test
  for each row execute procedure test_del_func();

insert into test values (1, null, 'root');
insert into test values (2, 1, 'root child A');
insert into test values (3, 1, 'root child B');
insert into test values (4, 2, 'grandchild 1');
insert into test values (5, 3, 'grandchild 2');

update test set data = 'root!' where id = 1;

select * from test;

delete from test;

select * from test;

drop table test;
drop function test_ins_func();
drop function test_del_func();

When you run this, with or without the current patch, you get:

 id | parent |     data     | nchildren
----+--------+--------------+-----------
  2 |      1 | root child A |         1
  4 |      2 | grandchild 1 |         0
  3 |      1 | root child B |         1
  5 |      3 | grandchild 2 |         0
  1 |        | root!        |         2
(5 rows)

DELETE 4
 id | parent | data  | nchildren
----+--------+-------+-----------
  1 |        | root! |         0
(1 row)

the reason being that when the outer delete arrives at the last (root!)
row, GetTupleForTrigger fails because the tuple is already self-updated,
and we treat that as canceling the outer delete action.

I'm not sure what to do about this.  If we throw an error there, there
will be no way that the trigger can override the error because it will
never get to run.  Possibly we could plow ahead with the expectation
of throwing an error later if the trigger doesn't cancel the
update/delete, but is it safe to do so if we don't hold lock on the
tuple?  In any case that idea doesn't help with the remaining caller,
ExecLockRows.

            regards, tom lane

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 5f6ac2ec1fdaf7598084150d017d8c85bfeeebe5..eb94060371a41913c63b2187fb4b66012ebfaed3 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*************** simple_heap_insert(Relation relation, He
*** 2317,2343 ****
   *
   *    relation - table to be modified (caller must hold suitable lock)
   *    tid - TID of tuple to be deleted
-  *    ctid - output parameter, used only for failure case (see below)
-  *    update_xmax - output parameter, used only for failure case (see below)
   *    cid - delete command ID (used for visibility test, and stored into
   *        cmax if successful)
   *    crosscheck - if not InvalidSnapshot, also check tuple against this
   *    wait - true if should wait for any conflicting update to commit/abort
   *
   * Normal, successful return value is HeapTupleMayBeUpdated, which
   * actually means we did delete it.  Failure return codes are
   * HeapTupleSelfUpdated, HeapTupleUpdated, or HeapTupleBeingUpdated
   * (the last only possible if wait == false).
   *
!  * In the failure cases, the routine returns the tuple's t_ctid and t_xmax.
!  * If t_ctid is the same as tid, the tuple was deleted; if different, the
!  * tuple was updated, and t_ctid is the location of the replacement tuple.
!  * (t_xmax is needed to verify that the replacement tuple matches.)
   */
  HTSU_Result
  heap_delete(Relation relation, ItemPointer tid,
!             ItemPointer ctid, TransactionId *update_xmax,
!             CommandId cid, Snapshot crosscheck, bool wait)
  {
      HTSU_Result result;
      TransactionId xid = GetCurrentTransactionId();
--- 2317,2342 ----
   *
   *    relation - table to be modified (caller must hold suitable lock)
   *    tid - TID of tuple to be deleted
   *    cid - delete command ID (used for visibility test, and stored into
   *        cmax if successful)
   *    crosscheck - if not InvalidSnapshot, also check tuple against this
   *    wait - true if should wait for any conflicting update to commit/abort
+  *    hufd - output parameter, filled in failure cases (see below)
   *
   * Normal, successful return value is HeapTupleMayBeUpdated, which
   * actually means we did delete it.  Failure return codes are
   * HeapTupleSelfUpdated, HeapTupleUpdated, or HeapTupleBeingUpdated
   * (the last only possible if wait == false).
   *
!  * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
!  * t_xmax, and t_cmax (the last only for HeapTupleSelfUpdated, since we
!  * cannot obtain cmax from a combocid generated by another transaction).
!  * See comments for struct HeapUpdateFailureData for additional info.
   */
  HTSU_Result
  heap_delete(Relation relation, ItemPointer tid,
!             CommandId cid, Snapshot crosscheck, bool wait,
!             HeapUpdateFailureData *hufd)
  {
      HTSU_Result result;
      TransactionId xid = GetCurrentTransactionId();
*************** l1:
*** 2498,2505 ****
                 result == HeapTupleUpdated ||
                 result == HeapTupleBeingUpdated);
          Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID));
!         *ctid = tp.t_data->t_ctid;
!         *update_xmax = HeapTupleHeaderGetXmax(tp.t_data);
          UnlockReleaseBuffer(buffer);
          if (have_tuple_lock)
              UnlockTuple(relation, &(tp.t_self), ExclusiveLock);
--- 2497,2508 ----
                 result == HeapTupleUpdated ||
                 result == HeapTupleBeingUpdated);
          Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID));
!         hufd->ctid = tp.t_data->t_ctid;
!         hufd->xmax = HeapTupleHeaderGetXmax(tp.t_data);
!         if (result == HeapTupleSelfUpdated)
!             hufd->cmax = HeapTupleHeaderGetCmax(tp.t_data);
!         else
!             hufd->cmax = 0;        /* for lack of an InvalidCommandId value */
          UnlockReleaseBuffer(buffer);
          if (have_tuple_lock)
              UnlockTuple(relation, &(tp.t_self), ExclusiveLock);
*************** void
*** 2631,2643 ****
  simple_heap_delete(Relation relation, ItemPointer tid)
  {
      HTSU_Result result;
!     ItemPointerData update_ctid;
!     TransactionId update_xmax;

      result = heap_delete(relation, tid,
-                          &update_ctid, &update_xmax,
                           GetCurrentCommandId(true), InvalidSnapshot,
!                          true /* wait for commit */ );
      switch (result)
      {
          case HeapTupleSelfUpdated:
--- 2634,2645 ----
  simple_heap_delete(Relation relation, ItemPointer tid)
  {
      HTSU_Result result;
!     HeapUpdateFailureData hufd;

      result = heap_delete(relation, tid,
                           GetCurrentCommandId(true), InvalidSnapshot,
!                          true /* wait for commit */,
!                          &hufd);
      switch (result)
      {
          case HeapTupleSelfUpdated:
*************** simple_heap_delete(Relation relation, It
*** 2668,2679 ****
   *    relation - table to be modified (caller must hold suitable lock)
   *    otid - TID of old tuple to be replaced
   *    newtup - newly constructed tuple data to store
-  *    ctid - output parameter, used only for failure case (see below)
-  *    update_xmax - output parameter, used only for failure case (see below)
   *    cid - update command ID (used for visibility test, and stored into
   *        cmax/cmin if successful)
   *    crosscheck - if not InvalidSnapshot, also check old tuple against this
   *    wait - true if should wait for any conflicting update to commit/abort
   *
   * Normal, successful return value is HeapTupleMayBeUpdated, which
   * actually means we *did* update it.  Failure return codes are
--- 2670,2680 ----
   *    relation - table to be modified (caller must hold suitable lock)
   *    otid - TID of old tuple to be replaced
   *    newtup - newly constructed tuple data to store
   *    cid - update command ID (used for visibility test, and stored into
   *        cmax/cmin if successful)
   *    crosscheck - if not InvalidSnapshot, also check old tuple against this
   *    wait - true if should wait for any conflicting update to commit/abort
+  *    hufd - output parameter, filled in failure cases (see below)
   *
   * Normal, successful return value is HeapTupleMayBeUpdated, which
   * actually means we *did* update it.  Failure return codes are
*************** simple_heap_delete(Relation relation, It
*** 2686,2700 ****
   * update was done.  However, any TOAST changes in the new tuple's
   * data are not reflected into *newtup.
   *
!  * In the failure cases, the routine returns the tuple's t_ctid and t_xmax.
!  * If t_ctid is the same as otid, the tuple was deleted; if different, the
!  * tuple was updated, and t_ctid is the location of the replacement tuple.
!  * (t_xmax is needed to verify that the replacement tuple matches.)
   */
  HTSU_Result
  heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
!             ItemPointer ctid, TransactionId *update_xmax,
!             CommandId cid, Snapshot crosscheck, bool wait)
  {
      HTSU_Result result;
      TransactionId xid = GetCurrentTransactionId();
--- 2687,2701 ----
   * update was done.  However, any TOAST changes in the new tuple's
   * data are not reflected into *newtup.
   *
!  * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
!  * t_xmax, and t_cmax (the last only for HeapTupleSelfUpdated, since we
!  * cannot obtain cmax from a combocid generated by another transaction).
!  * See comments for struct HeapUpdateFailureData for additional info.
   */
  HTSU_Result
  heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
!             CommandId cid, Snapshot crosscheck, bool wait,
!             HeapUpdateFailureData *hufd)
  {
      HTSU_Result result;
      TransactionId xid = GetCurrentTransactionId();
*************** l2:
*** 2873,2880 ****
                 result == HeapTupleUpdated ||
                 result == HeapTupleBeingUpdated);
          Assert(!(oldtup.t_data->t_infomask & HEAP_XMAX_INVALID));
!         *ctid = oldtup.t_data->t_ctid;
!         *update_xmax = HeapTupleHeaderGetXmax(oldtup.t_data);
          UnlockReleaseBuffer(buffer);
          if (have_tuple_lock)
              UnlockTuple(relation, &(oldtup.t_self), ExclusiveLock);
--- 2874,2885 ----
                 result == HeapTupleUpdated ||
                 result == HeapTupleBeingUpdated);
          Assert(!(oldtup.t_data->t_infomask & HEAP_XMAX_INVALID));
!         hufd->ctid = oldtup.t_data->t_ctid;
!         hufd->xmax = HeapTupleHeaderGetXmax(oldtup.t_data);
!         if (result == HeapTupleSelfUpdated)
!             hufd->cmax = HeapTupleHeaderGetCmax(oldtup.t_data);
!         else
!             hufd->cmax = 0;        /* for lack of an InvalidCommandId value */
          UnlockReleaseBuffer(buffer);
          if (have_tuple_lock)
              UnlockTuple(relation, &(oldtup.t_self), ExclusiveLock);
*************** void
*** 3344,3356 ****
  simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
  {
      HTSU_Result result;
!     ItemPointerData update_ctid;
!     TransactionId update_xmax;

      result = heap_update(relation, otid, tup,
-                          &update_ctid, &update_xmax,
                           GetCurrentCommandId(true), InvalidSnapshot,
!                          true /* wait for commit */ );
      switch (result)
      {
          case HeapTupleSelfUpdated:
--- 3349,3360 ----
  simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
  {
      HTSU_Result result;
!     HeapUpdateFailureData hufd;

      result = heap_update(relation, otid, tup,
                           GetCurrentCommandId(true), InvalidSnapshot,
!                          true /* wait for commit */,
!                          &hufd);
      switch (result)
      {
          case HeapTupleSelfUpdated:
*************** simple_heap_update(Relation relation, It
*** 3388,3405 ****
   * Output parameters:
   *    *tuple: all fields filled in
   *    *buffer: set to buffer holding tuple (pinned but not locked at exit)
!  *    *ctid: set to tuple's t_ctid, but only in failure cases
!  *    *update_xmax: set to tuple's xmax, but only in failure cases
   *
   * Function result may be:
   *    HeapTupleMayBeUpdated: lock was successfully acquired
   *    HeapTupleSelfUpdated: lock failed because tuple updated by self
   *    HeapTupleUpdated: lock failed because tuple updated by other xact
   *
!  * In the failure cases, the routine returns the tuple's t_ctid and t_xmax.
!  * If t_ctid is the same as t_self, the tuple was deleted; if different, the
!  * tuple was updated, and t_ctid is the location of the replacement tuple.
!  * (t_xmax is needed to verify that the replacement tuple matches.)
   *
   *
   * NOTES: because the shared-memory lock table is of finite size, but users
--- 3392,3408 ----
   * Output parameters:
   *    *tuple: all fields filled in
   *    *buffer: set to buffer holding tuple (pinned but not locked at exit)
!  *    *hufd: filled in failure cases (see below)
   *
   * Function result may be:
   *    HeapTupleMayBeUpdated: lock was successfully acquired
   *    HeapTupleSelfUpdated: lock failed because tuple updated by self
   *    HeapTupleUpdated: lock failed because tuple updated by other xact
   *
!  * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
!  * t_xmax, and t_cmax (the last only for HeapTupleSelfUpdated, since we
!  * cannot obtain cmax from a combocid generated by another transaction).
!  * See comments for struct HeapUpdateFailureData for additional info.
   *
   *
   * NOTES: because the shared-memory lock table is of finite size, but users
*************** simple_heap_update(Relation relation, It
*** 3435,3443 ****
   * conflict for a tuple, we don't incur any extra overhead.
   */
  HTSU_Result
! heap_lock_tuple(Relation relation, HeapTuple tuple, Buffer *buffer,
!                 ItemPointer ctid, TransactionId *update_xmax,
!                 CommandId cid, LockTupleMode mode, bool nowait)
  {
      HTSU_Result result;
      ItemPointer tid = &(tuple->t_self);
--- 3438,3446 ----
   * conflict for a tuple, we don't incur any extra overhead.
   */
  HTSU_Result
! heap_lock_tuple(Relation relation, HeapTuple tuple,
!                 CommandId cid, LockTupleMode mode, bool nowait,
!                 Buffer *buffer, HeapUpdateFailureData *hufd)
  {
      HTSU_Result result;
      ItemPointer tid = &(tuple->t_self);
*************** l3:
*** 3622,3629 ****
      {
          Assert(result == HeapTupleSelfUpdated || result == HeapTupleUpdated);
          Assert(!(tuple->t_data->t_infomask & HEAP_XMAX_INVALID));
!         *ctid = tuple->t_data->t_ctid;
!         *update_xmax = HeapTupleHeaderGetXmax(tuple->t_data);
          LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
          if (have_tuple_lock)
              UnlockTuple(relation, tid, tuple_lock_type);
--- 3625,3636 ----
      {
          Assert(result == HeapTupleSelfUpdated || result == HeapTupleUpdated);
          Assert(!(tuple->t_data->t_infomask & HEAP_XMAX_INVALID));
!         hufd->ctid = tuple->t_data->t_ctid;
!         hufd->xmax = HeapTupleHeaderGetXmax(tuple->t_data);
!         if (result == HeapTupleSelfUpdated)
!             hufd->cmax = HeapTupleHeaderGetCmax(tuple->t_data);
!         else
!             hufd->cmax = 0;        /* for lack of an InvalidCommandId value */
          LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
          if (have_tuple_lock)
              UnlockTuple(relation, tid, tuple_lock_type);
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index f5e12e5681ddfbcc041bf636140d5179d0e45152..083095db0a93365cd4a253990cafbae77fa7aaac 100644
*** a/src/backend/commands/trigger.c
--- b/src/backend/commands/trigger.c
*************** GetTupleForTrigger(EState *estate,
*** 2565,2572 ****
      if (newSlot != NULL)
      {
          HTSU_Result test;
!         ItemPointerData update_ctid;
!         TransactionId update_xmax;

          *newSlot = NULL;

--- 2565,2571 ----
      if (newSlot != NULL)
      {
          HTSU_Result test;
!         HeapUpdateFailureData hufd;

          *newSlot = NULL;

*************** GetTupleForTrigger(EState *estate,
*** 2578,2587 ****
           */
  ltrmark:;
          tuple.t_self = *tid;
!         test = heap_lock_tuple(relation, &tuple, &buffer,
!                                &update_ctid, &update_xmax,
                                 estate->es_output_cid,
!                                LockTupleExclusive, false);
          switch (test)
          {
              case HeapTupleSelfUpdated:
--- 2577,2586 ----
           */
  ltrmark:;
          tuple.t_self = *tid;
!         test = heap_lock_tuple(relation, &tuple,
                                 estate->es_output_cid,
!                                LockTupleExclusive, false /* wait */,
!                                &buffer, &hufd);
          switch (test)
          {
              case HeapTupleSelfUpdated:
*************** ltrmark:;
*** 2598,2604 ****
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (!ItemPointerEquals(&update_ctid, &tuple.t_self))
                  {
                      /* it was updated, so look at the updated version */
                      TupleTableSlot *epqslot;
--- 2597,2603 ----
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (!ItemPointerEquals(&hufd.ctid, &tuple.t_self))
                  {
                      /* it was updated, so look at the updated version */
                      TupleTableSlot *epqslot;
*************** ltrmark:;
*** 2607,2617 ****
                                             epqstate,
                                             relation,
                                             relinfo->ri_RangeTableIndex,
!                                            &update_ctid,
!                                            update_xmax);
                      if (!TupIsNull(epqslot))
                      {
!                         *tid = update_ctid;
                          *newSlot = epqslot;

                          /*
--- 2606,2616 ----
                                             epqstate,
                                             relation,
                                             relinfo->ri_RangeTableIndex,
!                                            &hufd.ctid,
!                                            hufd.xmax);
                      if (!TupIsNull(epqslot))
                      {
!                         *tid = hufd.ctid;
                          *newSlot = epqslot;

                          /*
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 569d0ba7ade1e0cd64c5566f2ecbbd38773f2b25..0b9e1e48901424e942c6d971952de3bd8d9d0374 100644
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
*************** EvalPlanQualFetch(EState *estate, Relati
*** 1857,1864 ****
          if (heap_fetch(relation, &SnapshotDirty, &tuple, &buffer, true, NULL))
          {
              HTSU_Result test;
!             ItemPointerData update_ctid;
!             TransactionId update_xmax;

              /*
               * If xmin isn't what we're expecting, the slot must have been
--- 1857,1863 ----
          if (heap_fetch(relation, &SnapshotDirty, &tuple, &buffer, true, NULL))
          {
              HTSU_Result test;
!             HeapUpdateFailureData hufd;

              /*
               * If xmin isn't what we're expecting, the slot must have been
*************** EvalPlanQualFetch(EState *estate, Relati
*** 1911,1928 ****
              /*
               * This is a live tuple, so now try to lock it.
               */
!             test = heap_lock_tuple(relation, &tuple, &buffer,
!                                    &update_ctid, &update_xmax,
                                     estate->es_output_cid,
!                                    lockmode, false);
              /* We now have two pins on the buffer, get rid of one */
              ReleaseBuffer(buffer);

              switch (test)
              {
                  case HeapTupleSelfUpdated:
-                     /* treat it as deleted; do not process */
                      ReleaseBuffer(buffer);
                      return NULL;

                  case HeapTupleMayBeUpdated:
--- 1910,1935 ----
              /*
               * This is a live tuple, so now try to lock it.
               */
!             test = heap_lock_tuple(relation, &tuple,
                                     estate->es_output_cid,
!                                    lockmode, false /* wait */,
!                                    &buffer, &hufd);
              /* We now have two pins on the buffer, get rid of one */
              ReleaseBuffer(buffer);

              switch (test)
              {
                  case HeapTupleSelfUpdated:
                      ReleaseBuffer(buffer);
+                     if (!ItemPointerEquals(&hufd.ctid, &tuple.t_self))
+                     {
+                         /* it was updated, so look at the updated version */
+                         tuple.t_self = hufd.ctid;
+                         /* updated row should have xmin matching this xmax */
+                         priorXmax = hufd.xmax;
+                         continue;
+                     }
+                     /* tuple was deleted, so give up */
                      return NULL;

                  case HeapTupleMayBeUpdated:
*************** EvalPlanQualFetch(EState *estate, Relati
*** 1935,1946 ****
                          ereport(ERROR,
                                  (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                                   errmsg("could not serialize access due to concurrent update")));
!                     if (!ItemPointerEquals(&update_ctid, &tuple.t_self))
                      {
                          /* it was updated, so look at the updated version */
!                         tuple.t_self = update_ctid;
                          /* updated row should have xmin matching this xmax */
!                         priorXmax = update_xmax;
                          continue;
                      }
                      /* tuple was deleted, so give up */
--- 1942,1953 ----
                          ereport(ERROR,
                                  (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                                   errmsg("could not serialize access due to concurrent update")));
!                     if (!ItemPointerEquals(&hufd.ctid, &tuple.t_self))
                      {
                          /* it was updated, so look at the updated version */
!                         tuple.t_self = hufd.ctid;
                          /* updated row should have xmin matching this xmax */
!                         priorXmax = hufd.xmax;
                          continue;
                      }
                      /* tuple was deleted, so give up */
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index d2a33cb35682e23168d509cffe5f5f179f25249b..9893d767a12d584486adf91a6a0572e63cabc288 100644
*** a/src/backend/executor/nodeLockRows.c
--- b/src/backend/executor/nodeLockRows.c
*************** lnext:
*** 70,77 ****
          bool        isNull;
          HeapTupleData tuple;
          Buffer        buffer;
!         ItemPointerData update_ctid;
!         TransactionId update_xmax;
          LockTupleMode lockmode;
          HTSU_Result test;
          HeapTuple    copyTuple;
--- 70,76 ----
          bool        isNull;
          HeapTupleData tuple;
          Buffer        buffer;
!         HeapUpdateFailureData hufd;
          LockTupleMode lockmode;
          HTSU_Result test;
          HeapTuple    copyTuple;
*************** lnext:
*** 116,125 ****
          else
              lockmode = LockTupleShared;

!         test = heap_lock_tuple(erm->relation, &tuple, &buffer,
!                                &update_ctid, &update_xmax,
                                 estate->es_output_cid,
!                                lockmode, erm->noWait);
          ReleaseBuffer(buffer);
          switch (test)
          {
--- 115,124 ----
          else
              lockmode = LockTupleShared;

!         test = heap_lock_tuple(erm->relation, &tuple,
                                 estate->es_output_cid,
!                                lockmode, erm->noWait,
!                                &buffer, &hufd);
          ReleaseBuffer(buffer);
          switch (test)
          {
*************** lnext:
*** 136,142 ****
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (ItemPointerEquals(&update_ctid,
                                        &tuple.t_self))
                  {
                      /* Tuple was deleted, so don't return it */
--- 135,141 ----
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (ItemPointerEquals(&hufd.ctid,
                                        &tuple.t_self))
                  {
                      /* Tuple was deleted, so don't return it */
*************** lnext:
*** 145,151 ****

                  /* updated, so fetch and lock the updated version */
                  copyTuple = EvalPlanQualFetch(estate, erm->relation, lockmode,
!                                               &update_ctid, update_xmax);

                  if (copyTuple == NULL)
                  {
--- 144,150 ----

                  /* updated, so fetch and lock the updated version */
                  copyTuple = EvalPlanQualFetch(estate, erm->relation, lockmode,
!                                               &hufd.ctid, hufd.xmax);

                  if (copyTuple == NULL)
                  {
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 37b70b88d53f6f8589c2fbb3036e5f1dc27f95ad..30ff02a45697320aeffc75155bcdbfeb733747db 100644
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
*************** ExecDelete(ItemPointer tupleid,
*** 294,301 ****
      ResultRelInfo *resultRelInfo;
      Relation    resultRelationDesc;
      HTSU_Result result;
!     ItemPointerData update_ctid;
!     TransactionId update_xmax;

      /*
       * get information on the (current) result relation
--- 294,300 ----
      ResultRelInfo *resultRelInfo;
      Relation    resultRelationDesc;
      HTSU_Result result;
!     HeapUpdateFailureData hufd;

      /*
       * get information on the (current) result relation
*************** ExecDelete(ItemPointer tupleid,
*** 347,360 ****
           */
  ldelete:;
          result = heap_delete(resultRelationDesc, tupleid,
-                              &update_ctid, &update_xmax,
                               estate->es_output_cid,
                               estate->es_crosscheck_snapshot,
!                              true /* wait for commit */ );
          switch (result)
          {
              case HeapTupleSelfUpdated:
!                 /* already deleted by self; nothing to do */
                  return NULL;

              case HeapTupleMayBeUpdated:
--- 346,389 ----
           */
  ldelete:;
          result = heap_delete(resultRelationDesc, tupleid,
                               estate->es_output_cid,
                               estate->es_crosscheck_snapshot,
!                              true /* wait for commit */,
!                              &hufd);
          switch (result)
          {
              case HeapTupleSelfUpdated:
!                 /*
!                  * The target tuple was already updated or deleted by the
!                  * current command, or by a later command in the current
!                  * transaction.  The former case is possible in a join DELETE
!                  * where multiple tuples join to the same target tuple.
!                  * This is somewhat questionable, but Postgres has always
!                  * allowed it: we just ignore additional deletion attempts.
!                  *
!                  * The latter case arises if the tuple is modified by a
!                  * command in a BEFORE trigger, or perhaps by a command in a
!                  * volatile function used in the query.  In such situations we
!                  * should not ignore the deletion, but it is equally unsafe to
!                  * proceed.  We don't want to discard the original DELETE
!                  * while keeping the triggered actions based on its deletion;
!                  * and it would be no better to allow the original DELETE
!                  * while discarding updates that it triggered.  Those updates
!                  * may well carry information that matters to the application's
!                  * business rules; so throwing an error is the only safe
!                  * course.
!                  *
!                  * If a trigger actually intends this type of interaction,
!                  * it can re-execute the DELETE and then return NULL to
!                  * cancel the outer delete.
!                  */
!                 if (hufd.cmax != estate->es_output_cid)
!                     ereport(ERROR,
!                             (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
!                              errmsg("cannot update or delete a row from its BEFORE DELETE trigger"),
!                              errhint("Consider moving code to an AFTER DELETE trigger.")));
!
!                 /* Else, already deleted by self; nothing to do */
                  return NULL;

              case HeapTupleMayBeUpdated:
*************** ldelete:;
*** 365,371 ****
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (!ItemPointerEquals(tupleid, &update_ctid))
                  {
                      TupleTableSlot *epqslot;

--- 394,400 ----
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (!ItemPointerEquals(tupleid, &hufd.ctid))
                  {
                      TupleTableSlot *epqslot;

*************** ldelete:;
*** 373,383 ****
                                             epqstate,
                                             resultRelationDesc,
                                             resultRelInfo->ri_RangeTableIndex,
!                                            &update_ctid,
!                                            update_xmax);
                      if (!TupIsNull(epqslot))
                      {
!                         *tupleid = update_ctid;
                          goto ldelete;
                      }
                  }
--- 402,412 ----
                                             epqstate,
                                             resultRelationDesc,
                                             resultRelInfo->ri_RangeTableIndex,
!                                            &hufd.ctid,
!                                            hufd.xmax);
                      if (!TupIsNull(epqslot))
                      {
!                         *tupleid = hufd.ctid;
                          goto ldelete;
                      }
                  }
*************** ExecUpdate(ItemPointer tupleid,
*** 481,488 ****
      ResultRelInfo *resultRelInfo;
      Relation    resultRelationDesc;
      HTSU_Result result;
!     ItemPointerData update_ctid;
!     TransactionId update_xmax;
      List       *recheckIndexes = NIL;

      /*
--- 510,516 ----
      ResultRelInfo *resultRelInfo;
      Relation    resultRelationDesc;
      HTSU_Result result;
!     HeapUpdateFailureData hufd;
      List       *recheckIndexes = NIL;

      /*
*************** lreplace:;
*** 563,576 ****
           * mode transactions.
           */
          result = heap_update(resultRelationDesc, tupleid, tuple,
-                              &update_ctid, &update_xmax,
                               estate->es_output_cid,
                               estate->es_crosscheck_snapshot,
!                              true /* wait for commit */ );
          switch (result)
          {
              case HeapTupleSelfUpdated:
!                 /* already deleted by self; nothing to do */
                  return NULL;

              case HeapTupleMayBeUpdated:
--- 591,633 ----
           * mode transactions.
           */
          result = heap_update(resultRelationDesc, tupleid, tuple,
                               estate->es_output_cid,
                               estate->es_crosscheck_snapshot,
!                              true /* wait for commit */,
!                              &hufd);
          switch (result)
          {
              case HeapTupleSelfUpdated:
!                 /*
!                  * The target tuple was already updated or deleted by the
!                  * current command, or by a later command in the current
!                  * transaction.  The former case is possible in a join UPDATE
!                  * where multiple tuples join to the same target tuple.
!                  * This is pretty questionable, but Postgres has always
!                  * allowed it: we just execute the first update action and
!                  * ignore additional update attempts.
!                  *
!                  * The latter case arises if the tuple is modified by a
!                  * command in a BEFORE trigger, or perhaps by a command in a
!                  * volatile function used in the query.  In such situations we
!                  * should not ignore the update, but it is equally unsafe to
!                  * proceed.  We don't want to discard the original UPDATE
!                  * while keeping the triggered actions based on it; and we
!                  * have no principled way to merge this update with the
!                  * previous ones.  So throwing an error is the only safe
!                  * course.
!                  *
!                  * If a trigger actually intends this type of interaction,
!                  * it can re-execute the UPDATE (assuming it can figure out
!                  * how) and then return NULL to cancel the outer update.
!                  */
!                 if (hufd.cmax != estate->es_output_cid)
!                     ereport(ERROR,
!                             (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
!                              errmsg("cannot update or delete a row from its BEFORE UPDATE trigger"),
!                              errhint("Consider moving code to an AFTER UPDATE trigger.")));
!
!                 /* Else, already updated by self; nothing to do */
                  return NULL;

              case HeapTupleMayBeUpdated:
*************** lreplace:;
*** 581,587 ****
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (!ItemPointerEquals(tupleid, &update_ctid))
                  {
                      TupleTableSlot *epqslot;

--- 638,644 ----
                      ereport(ERROR,
                              (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                               errmsg("could not serialize access due to concurrent update")));
!                 if (!ItemPointerEquals(tupleid, &hufd.ctid))
                  {
                      TupleTableSlot *epqslot;

*************** lreplace:;
*** 589,599 ****
                                             epqstate,
                                             resultRelationDesc,
                                             resultRelInfo->ri_RangeTableIndex,
!                                            &update_ctid,
!                                            update_xmax);
                      if (!TupIsNull(epqslot))
                      {
!                         *tupleid = update_ctid;
                          slot = ExecFilterJunk(resultRelInfo->ri_junkFilter, epqslot);
                          tuple = ExecMaterializeSlot(slot);
                          goto lreplace;
--- 646,656 ----
                                             epqstate,
                                             resultRelationDesc,
                                             resultRelInfo->ri_RangeTableIndex,
!                                            &hufd.ctid,
!                                            hufd.xmax);
                      if (!TupIsNull(epqslot))
                      {
!                         *tupleid = hufd.ctid;
                          slot = ExecFilterJunk(resultRelInfo->ri_junkFilter, epqslot);
                          tuple = ExecMaterializeSlot(slot);
                          goto lreplace;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index fa38803b249ffd2407d77f6efd4e0203975f2349..25f3b988e0e3de7d1d24d6402d22bae8f67f9945 100644
*** a/src/include/access/heapam.h
--- b/src/include/access/heapam.h
*************** typedef enum
*** 35,40 ****
--- 35,63 ----
      LockTupleExclusive
  } LockTupleMode;

+ /*
+  * When heap_update, heap_delete, or heap_lock_tuple fails because the target
+  * tuple is already outdated, it fills in this struct to provide information
+  * to the caller about what happened.
+  * ctid is the target's ctid link: it is the same as the target's TID if the
+  * target was deleted, or the location of the replacement tuple if the target
+  * was updated.
+  * xmax is the outdating transaction's XID.  If the caller wants to visit the
+  * replacement tuple, it must check that this matches before believing the
+  * replacement is really a match.
+  * cmax is the outdating command's CID, but only when the failure code is
+  * HeapTupleSelfUpdated (i.e., something in the current transaction outdated
+  * the tuple); otherwise cmax is zero.  (We make this restriction because
+  * HeapTupleHeaderGetCmax doesn't work for tuples outdated in other
+  * transactions.)
+  */
+ typedef struct HeapUpdateFailureData
+ {
+     ItemPointerData        ctid;
+     TransactionId        xmax;
+     CommandId            cmax;
+ } HeapUpdateFailureData;
+

  /* ----------------
   *        function prototypes for heap access method
*************** extern Oid heap_insert(Relation relation
*** 100,115 ****
  extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
                    CommandId cid, int options, BulkInsertState bistate);
  extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
!             ItemPointer ctid, TransactionId *update_xmax,
!             CommandId cid, Snapshot crosscheck, bool wait);
  extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
              HeapTuple newtup,
!             ItemPointer ctid, TransactionId *update_xmax,
!             CommandId cid, Snapshot crosscheck, bool wait);
  extern HTSU_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
!                 Buffer *buffer, ItemPointer ctid,
!                 TransactionId *update_xmax, CommandId cid,
!                 LockTupleMode mode, bool nowait);
  extern void heap_inplace_update(Relation relation, HeapTuple tuple);
  extern bool heap_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid,
                    Buffer buf);
--- 123,137 ----
  extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
                    CommandId cid, int options, BulkInsertState bistate);
  extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
!             CommandId cid, Snapshot crosscheck, bool wait,
!             HeapUpdateFailureData *hufd);
  extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
              HeapTuple newtup,
!             CommandId cid, Snapshot crosscheck, bool wait,
!             HeapUpdateFailureData *hufd);
  extern HTSU_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
!                 CommandId cid, LockTupleMode mode, bool nowait,
!                 Buffer *buffer, HeapUpdateFailureData *hufd);
  extern void heap_inplace_update(Relation relation, HeapTuple tuple);
  extern bool heap_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid,
                    Buffer buf);
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index b4d391974d21e9e94e75124c1cd0e022527f68e9..2392598061ff648c36fc863d908e7ff38ff2b727 100644
*** a/src/test/regress/expected/triggers.out
--- b/src/test/regress/expected/triggers.out
*************** NOTICE:  drop cascades to 2 other object
*** 1443,1445 ****
--- 1443,1575 ----
  DETAIL:  drop cascades to view city_view
  drop cascades to view european_city_view
  DROP TABLE country_table;
+ --
+ -- Test updates to rows during firing of BEFORE ROW triggers.
+ -- As of 9.2, such cases should be rejected (see bug #6123).
+ --
+ create temp table parent (
+     aid int not null primary key,
+     val1 text,
+     val2 text,
+     val3 text,
+     val4 text,
+     bcnt int not null default 0);
+ NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "parent_pkey" for table "parent"
+ create temp table child (
+     bid int not null primary key,
+     aid int not null,
+     val1 text);
+ NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "child_pkey" for table "child"
+ create function parent_upd_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   if old.val1 <> new.val1 then
+     new.val2 = new.val1;
+     delete from child where child.aid = new.aid and child.val1 = new.val1;
+   end if;
+   return new;
+ end;
+ $$;
+ create trigger parent_upd_trig before update on parent
+   for each row execute procedure parent_upd_func();
+ create function parent_del_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   delete from child where aid = old.aid;
+   return old;
+ end;
+ $$;
+ create trigger parent_del_trig before delete on parent
+   for each row execute procedure parent_del_func();
+ create function child_ins_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   update parent set bcnt = bcnt + 1 where aid = new.aid;
+   return new;
+ end;
+ $$;
+ create trigger child_ins_trig after insert on child
+   for each row execute procedure child_ins_func();
+ create function child_del_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   update parent set bcnt = bcnt - 1 where aid = old.aid;
+   return old;
+ end;
+ $$;
+ create trigger child_del_trig after delete on child
+   for each row execute procedure child_del_func();
+ insert into parent values (1, 'a', 'a', 'a', 'a', 0);
+ insert into child values (10, 1, 'b');
+ select * from parent; select * from child;
+  aid | val1 | val2 | val3 | val4 | bcnt
+ -----+------+------+------+------+------
+    1 | a    | a    | a    | a    |    1
+ (1 row)
+
+  bid | aid | val1
+ -----+-----+------
+   10 |   1 | b
+ (1 row)
+
+ update parent set val1 = 'b' where aid = 1; -- should fail
+ ERROR:  cannot update or delete a row from its BEFORE UPDATE trigger
+ HINT:  Consider moving code to an AFTER UPDATE trigger.
+ select * from parent; select * from child;
+  aid | val1 | val2 | val3 | val4 | bcnt
+ -----+------+------+------+------+------
+    1 | a    | a    | a    | a    |    1
+ (1 row)
+
+  bid | aid | val1
+ -----+-----+------
+   10 |   1 | b
+ (1 row)
+
+ delete from parent where aid = 1; -- should fail
+ ERROR:  cannot update or delete a row from its BEFORE DELETE trigger
+ HINT:  Consider moving code to an AFTER DELETE trigger.
+ select * from parent; select * from child;
+  aid | val1 | val2 | val3 | val4 | bcnt
+ -----+------+------+------+------+------
+    1 | a    | a    | a    | a    |    1
+ (1 row)
+
+  bid | aid | val1
+ -----+-----+------
+   10 |   1 | b
+ (1 row)
+
+ -- replace the trigger function with one that restarts the deletion after
+ -- having modified a child
+ create or replace function parent_del_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   delete from child where aid = old.aid;
+   if found then
+     delete from parent where aid = old.aid;
+     return null; -- cancel outer deletion
+   end if;
+   return old;
+ end;
+ $$;
+ delete from parent where aid = 1;
+ select * from parent; select * from child;
+  aid | val1 | val2 | val3 | val4 | bcnt
+ -----+------+------+------+------+------
+ (0 rows)
+
+  bid | aid | val1
+ -----+-----+------
+ (0 rows)
+
+ drop table parent, child;
+ drop function parent_upd_func();
+ drop function parent_del_func();
+ drop function child_ins_func();
+ drop function child_del_func();
diff --git a/src/test/regress/sql/triggers.sql b/src/test/regress/sql/triggers.sql
index e52131dbba2d7a45c5f9365f24721de8a07e3253..1848868eca8e4d852ca614ad0336e1168a028539 100644
*** a/src/test/regress/sql/triggers.sql
--- b/src/test/regress/sql/triggers.sql
*************** SELECT * FROM city_view;
*** 961,963 ****
--- 961,1062 ----

  DROP TABLE city_table CASCADE;
  DROP TABLE country_table;
+
+ --
+ -- Test updates to rows during firing of BEFORE ROW triggers.
+ -- As of 9.2, such cases should be rejected (see bug #6123).
+ --
+
+ create temp table parent (
+     aid int not null primary key,
+     val1 text,
+     val2 text,
+     val3 text,
+     val4 text,
+     bcnt int not null default 0);
+ create temp table child (
+     bid int not null primary key,
+     aid int not null,
+     val1 text);
+
+ create function parent_upd_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   if old.val1 <> new.val1 then
+     new.val2 = new.val1;
+     delete from child where child.aid = new.aid and child.val1 = new.val1;
+   end if;
+   return new;
+ end;
+ $$;
+ create trigger parent_upd_trig before update on parent
+   for each row execute procedure parent_upd_func();
+
+ create function parent_del_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   delete from child where aid = old.aid;
+   return old;
+ end;
+ $$;
+ create trigger parent_del_trig before delete on parent
+   for each row execute procedure parent_del_func();
+
+ create function child_ins_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   update parent set bcnt = bcnt + 1 where aid = new.aid;
+   return new;
+ end;
+ $$;
+ create trigger child_ins_trig after insert on child
+   for each row execute procedure child_ins_func();
+
+ create function child_del_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   update parent set bcnt = bcnt - 1 where aid = old.aid;
+   return old;
+ end;
+ $$;
+ create trigger child_del_trig after delete on child
+   for each row execute procedure child_del_func();
+
+ insert into parent values (1, 'a', 'a', 'a', 'a', 0);
+ insert into child values (10, 1, 'b');
+ select * from parent; select * from child;
+
+ update parent set val1 = 'b' where aid = 1; -- should fail
+ select * from parent; select * from child;
+
+ delete from parent where aid = 1; -- should fail
+ select * from parent; select * from child;
+
+ -- replace the trigger function with one that restarts the deletion after
+ -- having modified a child
+ create or replace function parent_del_func()
+   returns trigger language plpgsql as
+ $$
+ begin
+   delete from child where aid = old.aid;
+   if found then
+     delete from parent where aid = old.aid;
+     return null; -- cancel outer deletion
+   end if;
+   return old;
+ end;
+ $$;
+
+ delete from parent where aid = 1;
+ select * from parent; select * from child;
+
+ drop table parent, child;
+
+ drop function parent_upd_func();
+ drop function parent_del_func();
+ drop function child_ins_func();
+ drop function child_del_func();
