Thread: Phantom Command IDs, updated patch

Phantom Command IDs, updated patch

From
Heikki Linnakangas
Date:
Here's an updated version of the phantom command ids patch.

I found one more subtle safety issue. The array and hash table for
phantom command ids is dynamically grown when HeapTupleHeaderSetCmax is
called. Unfortunately, since HeapTupleHeaderSetCmax is used inside a
critical sections, running out of memory while trying to grow them would
cause a PANIC. That's why I moved the SetXmax/SetCmax calls outside
critical sections in heapam.c. I believe that's safe; if a backend
aborts after setting the xmax/cmax, no-one is going to care about the
xid of an aborted transaction in there.

Per Tom's suggestion, I replaced the function cache code in fmgr.c and
similar code in plperl.c, pltcl.c, plpgsql/pl_comp.c and plpython.c to
use xmin+tid instead of xmin+cmin for the up-to-dateness check. I don't
have any tcl, perl or python test cases handy to test them, but the
change is small and essentially same for all of the above. Is there any
regression tests for the PL languages?

I made cmin and cmax system attributes aliases for the same physical
commandid field. I support the idea of a complete overhaul of those
system attributes, but let's do that in a separate patch.

To measure the overhead, I ran a plpgsql test case that updates a single
row 10000 times in a loop, generating a new phantom command id in each
iteration. The test took ~5% longer with the patch, so I think that's
acceptable. I couldn't measure a difference with pgbench (as expected).

I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
and ifdef blocks before applying.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/common/heaptuple.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/common/heaptuple.c,v
retrieving revision 1.114
diff -c -r1.114 heaptuple.c
*** src/backend/access/common/heaptuple.c    9 Jan 2007 22:00:59 -0000    1.114
--- src/backend/access/common/heaptuple.c    29 Jan 2007 18:01:52 -0000
***************
*** 563,568 ****
--- 563,569 ----
  heap_getsysattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
  {
      Datum        result;
+     CommandId    commandId;

      Assert(tup);

***************
*** 582,596 ****
          case MinTransactionIdAttributeNumber:
              result = TransactionIdGetDatum(HeapTupleHeaderGetXmin(tup->t_data));
              break;
          case MinCommandIdAttributeNumber:
!             result = CommandIdGetDatum(HeapTupleHeaderGetCmin(tup->t_data));
              break;
          case MaxTransactionIdAttributeNumber:
              result = TransactionIdGetDatum(HeapTupleHeaderGetXmax(tup->t_data));
              break;
-         case MaxCommandIdAttributeNumber:
-             result = CommandIdGetDatum(HeapTupleHeaderGetCmax(tup->t_data));
-             break;
          case TableOidAttributeNumber:
              result = ObjectIdGetDatum(tup->t_tableOid);
              break;
--- 583,598 ----
          case MinTransactionIdAttributeNumber:
              result = TransactionIdGetDatum(HeapTupleHeaderGetXmin(tup->t_data));
              break;
+         /* cmin and cmax are now both aliases for the same commandid field,
+          * which can in fact also be a phantom command id */
          case MinCommandIdAttributeNumber:
!         case MaxCommandIdAttributeNumber:
!             commandId = tup->t_data->t_choice.t_heap.t_field4.t_commandid;
!             result = CommandIdGetDatum(commandId);
              break;
          case MaxTransactionIdAttributeNumber:
              result = TransactionIdGetDatum(HeapTupleHeaderGetXmax(tup->t_data));
              break;
          case TableOidAttributeNumber:
              result = ObjectIdGetDatum(tup->t_tableOid);
              break;
Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.225
diff -c -r1.225 heapam.c
*** src/backend/access/heap/heapam.c    25 Jan 2007 02:17:25 -0000    1.225
--- src/backend/access/heap/heapam.c    30 Jan 2007 10:18:20 -0000
***************
*** 1407,1414 ****
      tup->t_data->t_infomask |= HEAP_XMAX_INVALID;
      HeapTupleHeaderSetXmin(tup->t_data, xid);
      HeapTupleHeaderSetCmin(tup->t_data, cid);
!     HeapTupleHeaderSetXmax(tup->t_data, 0);        /* zero out Datum fields */
!     HeapTupleHeaderSetCmax(tup->t_data, 0);        /* for cleanliness */
      tup->t_tableOid = RelationGetRelid(relation);

      /*
--- 1407,1413 ----
      tup->t_data->t_infomask |= HEAP_XMAX_INVALID;
      HeapTupleHeaderSetXmin(tup->t_data, xid);
      HeapTupleHeaderSetCmin(tup->t_data, cid);
!     HeapTupleHeaderSetXmax(tup->t_data, 0);
      tup->t_tableOid = RelationGetRelid(relation);

      /*
***************
*** 1725,1732 ****
          return result;
      }

-     START_CRIT_SECTION();
-
      /* store transaction information of xact deleting the tuple */
      tp.t_data->t_infomask &= ~(HEAP_XMAX_COMMITTED |
                                 HEAP_XMAX_INVALID |
--- 1724,1729 ----
***************
*** 1734,1743 ****
                                 HEAP_IS_LOCKED |
                                 HEAP_MOVED);
      HeapTupleHeaderSetXmax(tp.t_data, xid);
!     HeapTupleHeaderSetCmax(tp.t_data, cid);
      /* Make sure there is no forward chain link in t_ctid */
      tp.t_data->t_ctid = tp.t_self;

      MarkBufferDirty(buffer);

      /* XLOG stuff */
--- 1731,1742 ----
                                 HEAP_IS_LOCKED |
                                 HEAP_MOVED);
      HeapTupleHeaderSetXmax(tp.t_data, xid);
!     HeapTupleHeaderSetCmax(tp.t_data, cid);    /* HeapTupleHeaderSetCmax can ereport(ERROR) */
      /* Make sure there is no forward chain link in t_ctid */
      tp.t_data->t_ctid = tp.t_self;

+     START_CRIT_SECTION();
+
      MarkBufferDirty(buffer);

      /* XLOG stuff */
***************
*** 2059,2066 ****
      newtup->t_data->t_infomask |= (HEAP_XMAX_INVALID | HEAP_UPDATED);
      HeapTupleHeaderSetXmin(newtup->t_data, xid);
      HeapTupleHeaderSetCmin(newtup->t_data, cid);
!     HeapTupleHeaderSetXmax(newtup->t_data, 0);    /* zero out Datum fields */
!     HeapTupleHeaderSetCmax(newtup->t_data, 0);    /* for cleanliness */

      /*
       * If the toaster needs to be activated, OR if the new tuple will not fit
--- 2058,2064 ----
      newtup->t_data->t_infomask |= (HEAP_XMAX_INVALID | HEAP_UPDATED);
      HeapTupleHeaderSetXmin(newtup->t_data, xid);
      HeapTupleHeaderSetCmin(newtup->t_data, cid);
!     HeapTupleHeaderSetXmax(newtup->t_data, 0);

      /*
       * If the toaster needs to be activated, OR if the new tuple will not fit
***************
*** 2171,2181 ****
       * one pin is held.
       */

-     /* NO EREPORT(ERROR) from here till changes are logged */
-     START_CRIT_SECTION();
-
-     RelationPutHeapTuple(relation, newbuf, heaptup);    /* insert new tuple */
-
      if (!already_marked)
      {
          oldtup.t_data->t_infomask &= ~(HEAP_XMAX_COMMITTED |
--- 2169,2174 ----
***************
*** 2184,2192 ****
--- 2177,2191 ----
                                         HEAP_IS_LOCKED |
                                         HEAP_MOVED);
          HeapTupleHeaderSetXmax(oldtup.t_data, xid);
+         /* HeapTupleHeaderSetCmax can ereport(ERROR) */
          HeapTupleHeaderSetCmax(oldtup.t_data, cid);
      }

+     /* NO EREPORT(ERROR) from here till changes are logged */
+     START_CRIT_SECTION();
+
+     RelationPutHeapTuple(relation, newbuf, heaptup);    /* insert new tuple */
+
      /* record address of new tuple in t_ctid of old one */
      oldtup.t_data->t_ctid = heaptup->t_self;

***************
*** 2683,2690 ****
          new_infomask |= HEAP_XMAX_EXCL_LOCK;
      }

-     START_CRIT_SECTION();
-
      /*
       * Store transaction information of xact locking the tuple.
       *
--- 2682,2687 ----
***************
*** 2697,2702 ****
--- 2694,2701 ----
      /* Make sure there is no forward chain link in t_ctid */
      tuple->t_data->t_ctid = *tid;

+     START_CRIT_SECTION();
+
      MarkBufferDirty(*buffer);

      /*
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.231
diff -c -r1.231 xact.c
*** src/backend/access/transam/xact.c    5 Jan 2007 22:19:23 -0000    1.231
--- src/backend/access/transam/xact.c    29 Jan 2007 12:27:03 -0000
***************
*** 43,48 ****
--- 43,49 ----
  #include "utils/memutils.h"
  #include "utils/relcache.h"
  #include "utils/guc.h"
+ #include "utils/phantomcid.h"


  /*
***************
*** 1628,1633 ****
--- 1629,1635 ----
      AtEOXact_Namespace(true);
      /* smgrcommit already done */
      AtEOXact_Files();
+     AtEOXact_PhantomCid();
      pgstat_count_xact_commit();
      pgstat_report_txn_timestamp(0);

***************
*** 1844,1849 ****
--- 1846,1852 ----
      AtEOXact_Namespace(true);
      /* smgrcommit already done */
      AtEOXact_Files();
+     AtEOXact_PhantomCid();

      CurrentResourceOwner = NULL;
      ResourceOwnerDelete(TopTransactionResourceOwner);
***************
*** 1995,2000 ****
--- 1998,2004 ----
      AtEOXact_Namespace(false);
      smgrabort();
      AtEOXact_Files();
+     AtEOXact_PhantomCid();
      pgstat_count_xact_rollback();
      pgstat_report_txn_timestamp(0);

Index: src/backend/utils/fmgr/fmgr.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/utils/fmgr/fmgr.c,v
retrieving revision 1.103
diff -c -r1.103 fmgr.c
*** src/backend/utils/fmgr/fmgr.c    5 Jan 2007 22:19:43 -0000    1.103
--- src/backend/utils/fmgr/fmgr.c    29 Jan 2007 15:58:26 -0000
***************
*** 64,70 ****
      /* fn_oid is the hash key and so must be first! */
      Oid            fn_oid;            /* OID of an external C function */
      TransactionId fn_xmin;        /* for checking up-to-dateness */
!     CommandId    fn_cmin;
      PGFunction    user_fn;        /* the function's address */
      const Pg_finfo_record *inforec;        /* address of its info record */
  } CFuncHashTabEntry;
--- 64,70 ----
      /* fn_oid is the hash key and so must be first! */
      Oid            fn_oid;            /* OID of an external C function */
      TransactionId fn_xmin;        /* for checking up-to-dateness */
!     ItemPointerData    fn_tid;
      PGFunction    user_fn;        /* the function's address */
      const Pg_finfo_record *inforec;        /* address of its info record */
  } CFuncHashTabEntry;
***************
*** 483,489 ****
      if (entry == NULL)
          return NULL;            /* no such entry */
      if (entry->fn_xmin == HeapTupleHeaderGetXmin(procedureTuple->t_data) &&
!         entry->fn_cmin == HeapTupleHeaderGetCmin(procedureTuple->t_data))
          return entry;            /* OK */
      return NULL;                /* entry is out of date */
  }
--- 483,489 ----
      if (entry == NULL)
          return NULL;            /* no such entry */
      if (entry->fn_xmin == HeapTupleHeaderGetXmin(procedureTuple->t_data) &&
!         ItemPointerEquals(&entry->fn_tid, &procedureTuple->t_self))
          return entry;            /* OK */
      return NULL;                /* entry is out of date */
  }
***************
*** 521,527 ****
                      &found);
      /* OID is already filled in */
      entry->fn_xmin = HeapTupleHeaderGetXmin(procedureTuple->t_data);
!     entry->fn_cmin = HeapTupleHeaderGetCmin(procedureTuple->t_data);
      entry->user_fn = user_fn;
      entry->inforec = inforec;
  }
--- 521,527 ----
                      &found);
      /* OID is already filled in */
      entry->fn_xmin = HeapTupleHeaderGetXmin(procedureTuple->t_data);
!     entry->fn_tid = procedureTuple->t_self;
      entry->user_fn = user_fn;
      entry->inforec = inforec;
  }
Index: src/backend/utils/time/Makefile
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/utils/time/Makefile,v
retrieving revision 1.11
diff -c -r1.11 Makefile
*** src/backend/utils/time/Makefile    20 Jan 2007 17:16:15 -0000    1.11
--- src/backend/utils/time/Makefile    29 Jan 2007 12:27:07 -0000
***************
*** 12,18 ****
  top_builddir = ../../../..
  include $(top_builddir)/src/Makefile.global

! OBJS = tqual.o

  all: SUBSYS.o

--- 12,18 ----
  top_builddir = ../../../..
  include $(top_builddir)/src/Makefile.global

! OBJS = tqual.o phantomcid.o

  all: SUBSYS.o

Index: src/backend/utils/time/phantomcid.c
===================================================================
RCS file: src/backend/utils/time/phantomcid.c
diff -N src/backend/utils/time/phantomcid.c
*** /dev/null    1 Jan 1970 00:00:00 -0000
--- src/backend/utils/time/phantomcid.c    30 Jan 2007 10:59:39 -0000
***************
*** 0 ****
--- 1,306 ----
+ /*-------------------------------------------------------------------------
+  *
+  * phantomcid.c
+  *      Phantom command id support routines
+  *
+  * Before version 8.3, HeapTupleHeaderData had separate fields for cmin
+  * and cmax. To reduce the header size, cmin and cmax are now overlayed
+  * in the same field in the header. That usually works because you rarely
+  * insert and delete a tuple in the transaction. To make it work when you
+  * do, we create a phantom command id and store that in the tuple header
+  * instead of cmin and cmax. The phantom command id maps to the real cmin
+  * and cmax in a backend-private array. Other backends don't need them,
+  * because cmin and cmax are only interesting to the inserting/deleting
+  * transaction.
+  *
+  * To allow reusing existing phantom cids, we also keep a hash table that
+  * maps cmin,cmax pairs to phantom cids.
+  *
+  * The array and hash table are kept in TopTransactionContext, and are
+  * destroyed at the end of transaction.
+  *
+  * A 32-bit phantom command id allows us to represent 2^32 distinct
+  * cmin, cmax combinations. In the most perverse case where each command
+  * deletes a tuple generated by each previous command, the number of phantom
+  * command ids required for n commands is n(1+n)/2. That means that in the
+  * worst case, that's enough for 92682 commands. In practice, you'll run out
+  * of memory and/or disk space way before you reach that limit.
+  *
+  * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * IDENTIFICATION
+  *      $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+
+ #include "postgres.h"
+
+ #include "access/htup.h"
+ #include "access/xact.h"
+ #include "utils/memutils.h"
+ #include "utils/hsearch.h"
+ #include "utils/phantomcid.h"
+
+ /* #define PHANTOMCID_DEBUG */
+
+
+ /* Hash table to lookup phantom cids by cmin and cmax */
+ static HTAB *phantomHash = NULL;
+
+ /* Key and entry structures for the hash table */
+ typedef struct {
+     CommandId cmin;
+     CommandId cmax;
+ } PhantomCidKeyData;
+
+ typedef struct {
+     PhantomCidKeyData key;
+     CommandId phantomcid;
+ } PhantomCidEntryData;
+
+ typedef PhantomCidKeyData *PhantomCidKey;
+ typedef PhantomCidEntryData *PhantomCidEntry;
+
+ #define PCID_HASH_SIZE                100
+
+
+ /* An array of cmin,cmax pairs, indexed by phantom command id.
+  * To convert a phantom cid to cmin and cmax, you do a simple array
+  * lookup. */
+ static PhantomCidKey phantomCids = NULL;
+ static int usedPhantomCids = 0; /* number of elements in phantomCids */
+ static int sizePhantomCids = 0; /* size of phantomCids array */
+
+ /* Initial size of the array. It will be grown if it fills up */
+ #define PCID_ARRAY_INITIAL_SIZE        100
+
+
+ /* prototypes for internal functions */
+ static CommandId GetPhantomCommandId(CommandId cmin, CommandId cmax);
+ static CommandId GetRealCmin(CommandId phantomcid);
+ static CommandId GetRealCmax(CommandId phantomcid);
+
+
+ /**** External API ****/
+
+ /* All these functions rely on the caller to not call functions that don't make
+  * sense. You should only call cmin related functions in the inserting
+  * transaction and cmax related functions in the deleting transaction.
+  */
+
+ CommandId
+ HeapTupleHeaderGetCmin(HeapTupleHeader tup)
+ {
+     CommandId cid = tup->t_choice.t_heap.t_field4.t_commandid;
+
+     Assert(!(tup->t_infomask & HEAP_MOVED));
+     Assert(TransactionIdIsCurrentTransactionId(tup->t_choice.t_heap.t_xmin));
+
+     if (tup->t_infomask & HEAP_PHANTOMCID)
+         return GetRealCmin(cid);
+     else
+         return cid;
+ }
+
+ void
+ HeapTupleHeaderSetCmin(HeapTupleHeader tup, CommandId cmin)
+ {
+     /* We never need to create a phantom cid for a new tuple. */
+     tup->t_choice.t_heap.t_field4.t_commandid = cmin;
+     tup->t_infomask &= ~HEAP_PHANTOMCID;
+ }
+
+ CommandId
+ HeapTupleHeaderGetCmax(HeapTupleHeader tup)
+ {
+     CommandId cid = tup->t_choice.t_heap.t_field4.t_commandid;
+
+     Assert(!(tup->t_infomask & HEAP_MOVED));
+     Assert(TransactionIdIsCurrentTransactionId(tup->t_choice.t_heap.t_xmax));
+
+     if (tup->t_infomask & HEAP_PHANTOMCID)
+         return GetRealCmax(cid);
+     else
+         return cid;
+ }
+
+ void
+ HeapTupleHeaderSetCmax(HeapTupleHeader tup, CommandId cmax)
+ {
+     HeapTupleFields *t_heap = &tup->t_choice.t_heap;
+
+     /* If we're marking a tuple deleted that was inserted by our transaction,
+      * we need to use a phantom command id. Test for HEAP_XMIN_COMMITTED
+      * first, because it's cheaper than a TransactionIdIsCurrentTransactionId
+      * call.
+      */
+     if (!(tup->t_infomask & HEAP_XMIN_COMMITTED)
+         && TransactionIdIsCurrentTransactionId(t_heap->t_xmin))
+     {
+         CommandId cmin = t_heap->t_field4.t_commandid;
+
+         t_heap->t_field4.t_commandid = GetPhantomCommandId(cmin, cmax);
+         tup->t_infomask |= HEAP_PHANTOMCID;
+     }
+     else
+     {
+         t_heap->t_field4.t_commandid = cmax;
+         tup->t_infomask &= ~HEAP_PHANTOMCID;
+     }
+ }
+
+ /*
+  * Phantom command ids are only interesting to the inserting and deleting
+  * transaction, so we can forget about them at the end of transaction.
+  */
+ void
+ AtEOXact_PhantomCid()
+ {
+     /* Don't bother to pfree. These are allocated in TopTransactionContext,
+      * so they're going to be freed at the end of transaction anyway.
+      */
+     phantomCids = NULL;
+     phantomHash = NULL;
+
+     usedPhantomCids = 0;
+     sizePhantomCids = 0;
+ }
+
+
+ /**** Internal routines ****/
+
+ /*
+  * Get a phantom command id that maps to cmin and cmax.
+  *
+  * We try to reuse old phantom command ids when possible.
+  */
+ static
+ CommandId GetPhantomCommandId(CommandId cmin, CommandId cmax)
+ {
+     CommandId phantomcid;
+     PhantomCidKeyData key;
+     PhantomCidEntry entry = NULL;
+     bool found;
+
+ #ifdef PHANTOMCID_DEBUG
+     elog(LOG, "GetPhantomCommandId cmin: %d cmax: %d", cmin, cmax);
+ #endif
+
+     /* Create the array and hash table the first time we need to use
+      * phantom cids in the transaction.
+      */
+     if (phantomCids == NULL)
+     {
+         HASHCTL hash_ctl;
+         MemoryContext oldcontext;
+
+         oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+         phantomCids =
+             palloc(sizeof(PhantomCidKeyData) * PCID_ARRAY_INITIAL_SIZE);
+         sizePhantomCids = PCID_ARRAY_INITIAL_SIZE;
+
+         memset(&hash_ctl, 0, sizeof(hash_ctl));
+         hash_ctl.keysize = sizeof(PhantomCidKeyData);
+         hash_ctl.entrysize = sizeof(PhantomCidEntryData);
+         hash_ctl.hash = tag_hash;
+         hash_ctl.hcxt = TopTransactionContext;
+
+         phantomHash =
+             hash_create("Phantom command id hash table",
+                         PCID_HASH_SIZE, &hash_ctl,
+                         HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
+
+         MemoryContextSwitchTo(TopTransactionContext);
+     }
+
+     /* Try to find an old phantom cid with the same cmin and cmax for reuse */
+
+     key.cmin = cmin;
+     key.cmax = cmax;
+     entry = (PhantomCidEntry)
+         hash_search(phantomHash, (void *) &key, HASH_ENTER, &found);
+
+     if (found)
+     {
+ #ifdef PHANTOMCID_DEBUG
+         elog(LOG, "result: %d (reused)", entry->phantomcid);
+ #endif
+         return entry->phantomcid;
+     }
+     else
+     {
+         /* We have to create a new phantom cid. Check that there's room
+          * for it in the array, and grow it if there isn't */
+         if (usedPhantomCids >= sizePhantomCids)
+         {
+             uint64 newsz;
+             MemoryContext oldcontext;
+
+             /* We need to grow the array */
+
+             newsz = ((uint64)sizePhantomCids) * 2;
+
+             if (newsz > UINT_MAX)
+             {
+                 if (sizePhantomCids == UINT_MAX)
+                     ereport(ERROR,
+                             (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                              errmsg("maximum number of phantom command ids exceeded")));
+
+                 newsz = UINT_MAX;
+             }
+
+             sizePhantomCids = (uint32) newsz;
+
+             oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+             phantomCids =
+                 repalloc(phantomCids,
+                          sizeof(PhantomCidKeyData) * sizePhantomCids);
+
+             MemoryContextSwitchTo(oldcontext);
+         }
+
+         phantomcid = usedPhantomCids;
+         entry->phantomcid = phantomcid;
+
+         phantomCids[phantomcid].cmin = cmin;
+         phantomCids[phantomcid].cmax = cmax;
+
+         usedPhantomCids++;
+
+ #ifdef PHANTOMCID_DEBUG
+         elog(LOG, "result: %d", phantomcid);
+ #endif
+     }
+
+     return phantomcid;
+ }
+
+ static CommandId
+ GetRealCmin(CommandId phantomcid)
+ {
+     Assert(phantomcid < usedPhantomCids);
+
+ #ifdef PHANTOMCID_DEBUG
+     elog(LOG, "GetRealCmin phantomcid: %d -> %d", phantomcid, phantomCids[phantomcid].cmin);
+ #endif
+
+
+     return phantomCids[phantomcid].cmin;
+ }
+
+ static CommandId
+ GetRealCmax(CommandId phantomcid)
+ {
+     Assert(phantomcid < usedPhantomCids);
+
+ #ifdef PHANTOMCID_DEBUG
+     elog(LOG, "GetRealCmax phantomcid: %d -> %d", phantomcid, phantomCids[phantomcid].cmax);
+ #endif
+
+     return phantomCids[phantomcid].cmax;
+ }
+
Index: src/include/access/htup.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/htup.h,v
retrieving revision 1.89
diff -c -r1.89 htup.h
*** src/include/access/htup.h    9 Jan 2007 22:01:00 -0000    1.89
--- src/include/access/htup.h    30 Jan 2007 10:29:20 -0000
***************
*** 65,77 ****
   *            object ID (if HEAP_HASOID is set in t_infomask)
   *            user data fields
   *
!  * We store five "virtual" fields Xmin, Cmin, Xmax, Cmax, and Xvac in four
!  * physical fields.  Xmin, Cmin and Xmax are always really stored, but
!  * Cmax and Xvac share a field.  This works because we know that there are
!  * only a limited number of states that a tuple can be in, and that Cmax
!  * is only interesting for the lifetime of the deleting transaction.
!  * This assumes that VACUUM FULL never tries to move a tuple whose Cmax
!  * is still interesting (ie, delete-in-progress).
   *
   * Note that in 7.3 and 7.4 a similar idea was applied to Xmax and Cmin.
   * However, with the advent of subtransactions, a tuple may need both Xmax
--- 65,81 ----
   *            object ID (if HEAP_HASOID is set in t_infomask)
   *            user data fields
   *
!  * We store five "virtual" fields Xmin, Cmin, Xmax, Cmax, and Xvac in three
!  * physical fields.  Xmin and Xmax are always really stored, but Cmin, Cmax
!  * and Xvac share a field.  This works because we know that there are only
!  * a limited number of states that a tuple can be in, and that Cmax and Cmin
!  * are only interesting for the lifetime of the deleting or inserting
!  * transaction. If a tuple is inserted and deleted in the same transaction,
!  * we use a phantom command id that maps to the real cmin and cmax. The
!  * mapping is local to the backend. See phantomcid.c for more details.
!  *
!  * This assumes that VACUUM FULL never tries to move a tuple whose Cmax or
!  * Cmin is still interesting (ie, delete- or insert-in-progress).
   *
   * Note that in 7.3 and 7.4 a similar idea was applied to Xmax and Cmin.
   * However, with the advent of subtransactions, a tuple may need both Xmax
***************
*** 103,114 ****
  typedef struct HeapTupleFields
  {
      TransactionId t_xmin;        /* inserting xact ID */
-     CommandId    t_cmin;            /* inserting command ID */
      TransactionId t_xmax;        /* deleting or locking xact ID */

      union
      {
!         CommandId    t_cmax;        /* deleting or locking command ID */
          TransactionId t_xvac;    /* VACUUM FULL xact ID */
      }            t_field4;
  } HeapTupleFields;
--- 107,121 ----
  typedef struct HeapTupleFields
  {
      TransactionId t_xmin;        /* inserting xact ID */
      TransactionId t_xmax;        /* deleting or locking xact ID */

      union
      {
!         /* t_commandid is the inserting command ID (cmin), the deleting
!          * command ID (cmax), or a phantom cid that maps to cmin and cmax if
!          * the tuple was inserted and deleted in the same transaction
!          */
!         CommandId    t_commandid;
          TransactionId t_xvac;    /* VACUUM FULL xact ID */
      }            t_field4;
  } HeapTupleFields;
***************
*** 163,169 ****
  #define HEAP_HASCOMPRESSED        0x0008    /* has compressed stored attribute(s) */
  #define HEAP_HASEXTENDED        0x000C    /* the two above combined */
  #define HEAP_HASOID                0x0010    /* has an object-id field */
! /* 0x0020 is presently unused */
  #define HEAP_XMAX_EXCL_LOCK        0x0040    /* xmax is exclusive locker */
  #define HEAP_XMAX_SHARED_LOCK    0x0080    /* xmax is shared locker */
  /* if either LOCK bit is set, xmax hasn't deleted the tuple, only locked it */
--- 170,176 ----
  #define HEAP_HASCOMPRESSED        0x0008    /* has compressed stored attribute(s) */
  #define HEAP_HASEXTENDED        0x000C    /* the two above combined */
  #define HEAP_HASOID                0x0010    /* has an object-id field */
! #define HEAP_PHANTOMCID            0x0020    /* t_commandid is a phantom cid */
  #define HEAP_XMAX_EXCL_LOCK        0x0040    /* xmax is exclusive locker */
  #define HEAP_XMAX_SHARED_LOCK    0x0080    /* xmax is shared locker */
  /* if either LOCK bit is set, xmax hasn't deleted the tuple, only locked it */
***************
*** 219,252 ****
      TransactionIdStore((xid), &(tup)->t_choice.t_heap.t_xmax) \
  )

- #define HeapTupleHeaderGetCmin(tup) \
- ( \
-     (tup)->t_choice.t_heap.t_cmin \
- )

! #define HeapTupleHeaderSetCmin(tup, cid) \
! ( \
!     (tup)->t_choice.t_heap.t_cmin = (cid) \
! )
!
! /*
!  * Note: GetCmax will produce wrong answers after SetXvac has been executed
!  * by a transaction other than the inserting one.  We could check
!  * HEAP_XMAX_INVALID and return FirstCommandId if it's clear, but since that
!  * bit will be set again if the deleting transaction aborts, there'd be no
!  * real gain in safety from the extra test.  So, just rely on the caller not
!  * to trust the value unless it's meaningful.
   */
- #define HeapTupleHeaderGetCmax(tup) \
- ( \
-     (tup)->t_choice.t_heap.t_field4.t_cmax \
- )
-
- #define HeapTupleHeaderSetCmax(tup, cid) \
- do { \
-     Assert(!((tup)->t_infomask & HEAP_MOVED)); \
-     (tup)->t_choice.t_heap.t_field4.t_cmax = (cid); \
- } while (0)

  #define HeapTupleHeaderGetXvac(tup) \
  ( \
--- 226,235 ----
      TransactionIdStore((xid), &(tup)->t_choice.t_heap.t_xmax) \
  )


! /* HeapTupleHeaderGetCmin, HeapTupleHeaderGetCmax, and ..SetCmin and ..SetCmax
!  * are defined in phantomcid.h
   */

  #define HeapTupleHeaderGetXvac(tup) \
  ( \
***************
*** 640,643 ****
--- 623,633 ----

  #define SizeOfHeapFreeze (offsetof(xl_heap_freeze, cutoff_xid) + sizeof(TransactionId))

+ /* prototypes of HeapTupleHeader* functions implemented in
+  * utils/time/phantomcid.c */
+ extern CommandId HeapTupleHeaderGetCmin(HeapTupleHeader tup);
+ extern void HeapTupleHeaderSetCmin(HeapTupleHeader tup, CommandId cmin);
+ extern CommandId HeapTupleHeaderGetCmax(HeapTupleHeader tup);
+ extern void HeapTupleHeaderSetCmax(HeapTupleHeader tup, CommandId cmax);
+
  #endif   /* HTUP_H */
Index: src/include/storage/bufpage.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/storage/bufpage.h,v
retrieving revision 1.69
diff -c -r1.69 bufpage.h
*** src/include/storage/bufpage.h    5 Jan 2007 22:19:57 -0000    1.69
--- src/include/storage/bufpage.h    29 Jan 2007 20:18:22 -0000
***************
*** 134,141 ****
   * Releases 7.3 and 7.4 use 1, denoting a new HeapTupleHeader layout.
   * Release 8.0 changed the HeapTupleHeader layout again.
   * Release 8.1 redefined HeapTupleHeader infomask bits.
   */
! #define PG_PAGE_LAYOUT_VERSION        3


  /* ----------------------------------------------------------------
--- 134,143 ----
   * Releases 7.3 and 7.4 use 1, denoting a new HeapTupleHeader layout.
   * Release 8.0 changed the HeapTupleHeader layout again.
   * Release 8.1 redefined HeapTupleHeader infomask bits.
+  * Release 8.3 changed the HeapTupleHeader layout, overlaying cmin and cmax
+  *  to a single field and replaced t_natts with t_infomask2 .
   */
! #define PG_PAGE_LAYOUT_VERSION        4


  /* ----------------------------------------------------------------
Index: src/include/utils/phantomcid.h
===================================================================
RCS file: src/include/utils/phantomcid.h
diff -N src/include/utils/phantomcid.h
*** /dev/null    1 Jan 1970 00:00:00 -0000
--- src/include/utils/phantomcid.h    29 Jan 2007 20:13:58 -0000
***************
*** 0 ****
--- 1,25 ----
+ /*-------------------------------------------------------------------------
+  *
+  * phantomcid.h
+  *      Phantom cid function definitions
+  *
+  *
+  * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef PHANTOMCID_H
+ #define PHANTOMCID_H
+
+ /* HeapTupleHeaderGetCmin, *SetCmin, *GetCmax and *SetCmax
+  * function prototypes are in access/htup.h, because that's
+  * where the macro definitions that the functions replaced
+  * used to be.
+  */
+
+ extern void AtEOXact_PhantomCid(void);
+
+ #endif   /* PHANTOMCID_H */
Index: src/pl/plperl/plperl.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/pl/plperl/plperl.c,v
retrieving revision 1.125
diff -c -r1.125 plperl.c
*** src/pl/plperl/plperl.c    27 Jan 2007 16:46:21 -0000    1.125
--- src/pl/plperl/plperl.c    29 Jan 2007 17:53:05 -0000
***************
*** 41,47 ****
  {
      char       *proname;
      TransactionId fn_xmin;
!     CommandId    fn_cmin;
      bool        fn_readonly;
      bool        lanpltrusted;
      bool        fn_retistuple;    /* true, if function returns tuple */
--- 41,47 ----
  {
      char       *proname;
      TransactionId fn_xmin;
!     ItemPointerData fn_tid;
      bool        fn_readonly;
      bool        lanpltrusted;
      bool        fn_retistuple;    /* true, if function returns tuple */
***************
*** 1445,1451 ****
           * function's pg_proc entry without changing its OID.
           ************************************************************/
          uptodate = (prodesc->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!                 prodesc->fn_cmin == HeapTupleHeaderGetCmin(procTup->t_data));

          if (!uptodate)
          {
--- 1445,1451 ----
           * function's pg_proc entry without changing its OID.
           ************************************************************/
          uptodate = (prodesc->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!                 ItemPointerEquals(&prodesc->fn_tid, &procTup->t_self));

          if (!uptodate)
          {
***************
*** 1485,1491 ****
          MemSet(prodesc, 0, sizeof(plperl_proc_desc));
          prodesc->proname = strdup(internal_proname);
          prodesc->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!         prodesc->fn_cmin = HeapTupleHeaderGetCmin(procTup->t_data);

          /* Remember if function is STABLE/IMMUTABLE */
          prodesc->fn_readonly =
--- 1485,1491 ----
          MemSet(prodesc, 0, sizeof(plperl_proc_desc));
          prodesc->proname = strdup(internal_proname);
          prodesc->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!         prodesc->fn_tid = procTup->t_self;

          /* Remember if function is STABLE/IMMUTABLE */
          prodesc->fn_readonly =
Index: src/pl/plpgsql/src/pl_comp.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/pl/plpgsql/src/pl_comp.c,v
retrieving revision 1.109
diff -c -r1.109 pl_comp.c
*** src/pl/plpgsql/src/pl_comp.c    5 Jan 2007 22:20:02 -0000    1.109
--- src/pl/plpgsql/src/pl_comp.c    29 Jan 2007 17:56:22 -0000
***************
*** 162,168 ****
      {
          /* We have a compiled function, but is it still valid? */
          if (!(function->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!               function->fn_cmin == HeapTupleHeaderGetCmin(procTup->t_data)))
          {
              /* Nope, drop the function and associated storage */
              delete_function(function);
--- 162,168 ----
      {
          /* We have a compiled function, but is it still valid? */
          if (!(function->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!               ItemPointerEquals(&function->fn_tid, &procTup->t_self)))
          {
              /* Nope, drop the function and associated storage */
              delete_function(function);
***************
*** 308,314 ****
      function->fn_name = pstrdup(NameStr(procStruct->proname));
      function->fn_oid = fcinfo->flinfo->fn_oid;
      function->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!     function->fn_cmin = HeapTupleHeaderGetCmin(procTup->t_data);
      function->fn_functype = functype;
      function->fn_cxt = func_cxt;
      function->out_param_varno = -1;        /* set up for no OUT param */
--- 308,314 ----
      function->fn_name = pstrdup(NameStr(procStruct->proname));
      function->fn_oid = fcinfo->flinfo->fn_oid;
      function->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!     function->fn_tid = procTup->t_self;
      function->fn_functype = functype;
      function->fn_cxt = func_cxt;
      function->out_param_varno = -1;        /* set up for no OUT param */
Index: src/pl/plpgsql/src/plpgsql.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/pl/plpgsql/src/plpgsql.h,v
retrieving revision 1.83
diff -c -r1.83 plpgsql.h
*** src/pl/plpgsql/src/plpgsql.h    28 Jan 2007 16:15:49 -0000    1.83
--- src/pl/plpgsql/src/plpgsql.h    29 Jan 2007 17:56:00 -0000
***************
*** 547,553 ****
      char       *fn_name;
      Oid            fn_oid;
      TransactionId fn_xmin;
!     CommandId    fn_cmin;
      int            fn_functype;
      PLpgSQL_func_hashkey *fn_hashkey;    /* back-link to hashtable key */
      MemoryContext fn_cxt;
--- 547,553 ----
      char       *fn_name;
      Oid            fn_oid;
      TransactionId fn_xmin;
!     ItemPointerData fn_tid;
      int            fn_functype;
      PLpgSQL_func_hashkey *fn_hashkey;    /* back-link to hashtable key */
      MemoryContext fn_cxt;
Index: src/pl/plpython/plpython.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.93
diff -c -r1.93 plpython.c
*** src/pl/plpython/plpython.c    28 Jan 2007 19:36:46 -0000    1.93
--- src/pl/plpython/plpython.c    29 Jan 2007 17:54:30 -0000
***************
*** 123,129 ****
      char       *proname;        /* SQL name of procedure */
      char       *pyname;            /* Python name of procedure */
      TransactionId fn_xmin;
!     CommandId    fn_cmin;
      bool        fn_readonly;
      PLyTypeInfo result;            /* also used to store info for trigger tuple
                                   * type */
--- 123,129 ----
      char       *proname;        /* SQL name of procedure */
      char       *pyname;            /* Python name of procedure */
      TransactionId fn_xmin;
!     ItemPointerData fn_tid;
      bool        fn_readonly;
      PLyTypeInfo result;            /* also used to store info for trigger tuple
                                   * type */
***************
*** 1100,1106 ****
              elog(FATAL, "proc->me != plproc");
          /* did we find an up-to-date cache entry? */
          if (proc->fn_xmin != HeapTupleHeaderGetXmin(procTup->t_data) ||
!             proc->fn_cmin != HeapTupleHeaderGetCmin(procTup->t_data))
          {
              Py_DECREF(plproc);
              proc = NULL;
--- 1100,1106 ----
              elog(FATAL, "proc->me != plproc");
          /* did we find an up-to-date cache entry? */
          if (proc->fn_xmin != HeapTupleHeaderGetXmin(procTup->t_data) ||
!             !ItemPointerEquals(&proc->fn_tid, &procTup->t_self))
          {
              Py_DECREF(plproc);
              proc = NULL;
***************
*** 1151,1157 ****
      proc->proname = PLy_strdup(NameStr(procStruct->proname));
      proc->pyname = PLy_strdup(procName);
      proc->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!     proc->fn_cmin = HeapTupleHeaderGetCmin(procTup->t_data);
      /* Remember if function is STABLE/IMMUTABLE */
      proc->fn_readonly =
          (procStruct->provolatile != PROVOLATILE_VOLATILE);
--- 1151,1157 ----
      proc->proname = PLy_strdup(NameStr(procStruct->proname));
      proc->pyname = PLy_strdup(procName);
      proc->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!     proc->fn_tid = procTup->t_self;
      /* Remember if function is STABLE/IMMUTABLE */
      proc->fn_readonly =
          (procStruct->provolatile != PROVOLATILE_VOLATILE);
Index: src/pl/tcl/pltcl.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/pl/tcl/pltcl.c,v
retrieving revision 1.108
diff -c -r1.108 pltcl.c
*** src/pl/tcl/pltcl.c    4 Oct 2006 00:30:14 -0000    1.108
--- src/pl/tcl/pltcl.c    29 Jan 2007 17:51:14 -0000
***************
*** 76,82 ****
  {
      char       *proname;
      TransactionId fn_xmin;
!     CommandId    fn_cmin;
      bool        fn_readonly;
      bool        lanpltrusted;
      FmgrInfo    result_in_func;
--- 76,82 ----
  {
      char       *proname;
      TransactionId fn_xmin;
!     ItemPointerData fn_tid;
      bool        fn_readonly;
      bool        lanpltrusted;
      FmgrInfo    result_in_func;
***************
*** 962,968 ****
          prodesc = (pltcl_proc_desc *) Tcl_GetHashValue(hashent);

          uptodate = (prodesc->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!                 prodesc->fn_cmin == HeapTupleHeaderGetCmin(procTup->t_data));

          if (!uptodate)
          {
--- 962,968 ----
          prodesc = (pltcl_proc_desc *) Tcl_GetHashValue(hashent);

          uptodate = (prodesc->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!                 ItemPointerEquals(&prodesc->fn_tid, &procTup->t_self));

          if (!uptodate)
          {
***************
*** 1004,1010 ****
          MemSet(prodesc, 0, sizeof(pltcl_proc_desc));
          prodesc->proname = strdup(internal_proname);
          prodesc->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!         prodesc->fn_cmin = HeapTupleHeaderGetCmin(procTup->t_data);

          /* Remember if function is STABLE/IMMUTABLE */
          prodesc->fn_readonly =
--- 1004,1010 ----
          MemSet(prodesc, 0, sizeof(pltcl_proc_desc));
          prodesc->proname = strdup(internal_proname);
          prodesc->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
!         prodesc->fn_tid = procTup->t_self;

          /* Remember if function is STABLE/IMMUTABLE */
          prodesc->fn_readonly =
Index: src/test/regress/parallel_schedule
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/test/regress/parallel_schedule,v
retrieving revision 1.38
diff -c -r1.38 parallel_schedule
*** src/test/regress/parallel_schedule    28 Jan 2007 16:16:54 -0000    1.38
--- src/test/regress/parallel_schedule    29 Jan 2007 12:27:13 -0000
***************
*** 61,67 ****
  # ----------
  # The fourth group of parallel test
  # ----------
! test: select_into select_distinct select_distinct_on select_implicit select_having subselect union case join
aggregatestransactions random portals arrays btree_index hash_index update namespace prepared_xacts delete 

  test: privileges
  test: misc
--- 61,67 ----
  # ----------
  # The fourth group of parallel test
  # ----------
! test: select_into select_distinct select_distinct_on select_implicit select_having subselect union case join
aggregatestransactions random portals arrays btree_index hash_index update namespace prepared_xacts delete phantomcid 

  test: privileges
  test: misc
Index: src/test/regress/serial_schedule
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/test/regress/serial_schedule,v
retrieving revision 1.36
diff -c -r1.36 serial_schedule
*** src/test/regress/serial_schedule    28 Jan 2007 16:16:54 -0000    1.36
--- src/test/regress/serial_schedule    29 Jan 2007 12:27:13 -0000
***************
*** 107,109 ****
--- 107,110 ----
  test: xml
  test: stats
  test: tablespace
+ test: phantomcid
Index: src/test/regress/expected/phantomcid.out
===================================================================
RCS file: src/test/regress/expected/phantomcid.out
diff -N src/test/regress/expected/phantomcid.out
*** /dev/null    1 Jan 1970 00:00:00 -0000
--- src/test/regress/expected/phantomcid.out    28 Sep 2006 08:51:09 -0000
***************
*** 0 ****
--- 1,28 ----
+ CREATE TEMP TABLE phantomcidtest (foobar int);
+ BEGIN;
+ INSERT INTO phantomcidtest VALUES (1);
+ SELECT * FROM phantomcidtest;
+  foobar
+ --------
+       1
+ (1 row)
+
+ DELETE FROM phantomcidtest;
+ SELECT * FROM phantomcidtest;
+  foobar
+ --------
+ (0 rows)
+
+ COMMIT;
+ /* Test phantom cids with portals */
+ BEGIN;
+ INSERT INTO phantomcidtest VALUES (1);
+ DECLARE c CURSOR FOR SELECT * FROM phantomcidtest;
+ DELETE FROM phantomcidtest;
+ FETCH ALL FROM c;
+  foobar
+ --------
+       1
+ (1 row)
+
+ COMMIT;
Index: src/test/regress/sql/phantomcid.sql
===================================================================
RCS file: src/test/regress/sql/phantomcid.sql
diff -N src/test/regress/sql/phantomcid.sql
*** /dev/null    1 Jan 1970 00:00:00 -0000
--- src/test/regress/sql/phantomcid.sql    27 Sep 2006 12:54:18 -0000
***************
*** 0 ****
--- 1,27 ----
+ CREATE TEMP TABLE phantomcidtest (foobar int);
+
+
+ BEGIN;
+
+ INSERT INTO phantomcidtest VALUES (1);
+
+ SELECT * FROM phantomcidtest;
+
+ DELETE FROM phantomcidtest;
+
+ SELECT * FROM phantomcidtest;
+
+ COMMIT;
+
+ /* Test phantom cids with portals */
+ BEGIN;
+
+ INSERT INTO phantomcidtest VALUES (1);
+
+ DECLARE c CURSOR FOR SELECT * FROM phantomcidtest;
+
+ DELETE FROM phantomcidtest;
+
+ FETCH ALL FROM c;
+
+ COMMIT;

Re: Phantom Command IDs, updated patch

From
Andrew Dunstan
Date:
Heikki Linnakangas wrote:
> Is there any regression tests for the PL languages?
Certainly. See sql and expected directories for each PL. Works using
standard pg_regress.

cheers

andrew

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Heikki Linnakangas wrote:
> Here's an updated version of the phantom command ids patch.
>
> I found one more subtle safety issue. The array and hash table for
> phantom command ids is dynamically grown when HeapTupleHeaderSetCmax is
> called. Unfortunately, since HeapTupleHeaderSetCmax is used inside a
> critical sections, running out of memory while trying to grow them would
> cause a PANIC. That's why I moved the SetXmax/SetCmax calls outside
> critical sections in heapam.c. I believe that's safe; if a backend
> aborts after setting the xmax/cmax, no-one is going to care about the
> xid of an aborted transaction in there.
>
> Per Tom's suggestion, I replaced the function cache code in fmgr.c and
> similar code in plperl.c, pltcl.c, plpgsql/pl_comp.c and plpython.c to
> use xmin+tid instead of xmin+cmin for the up-to-dateness check. I don't
> have any tcl, perl or python test cases handy to test them, but the
> change is small and essentially same for all of the above. Is there any
> regression tests for the PL languages?
>
> I made cmin and cmax system attributes aliases for the same physical
> commandid field. I support the idea of a complete overhaul of those
> system attributes, but let's do that in a separate patch.
>
> To measure the overhead, I ran a plpgsql test case that updates a single
> row 10000 times in a loop, generating a new phantom command id in each
> iteration. The test took ~5% longer with the patch, so I think that's
> acceptable. I couldn't measure a difference with pgbench (as expected).
>
> I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
> and ifdef blocks before applying.
>
> --
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com


>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
> and ifdef blocks before applying.

BTW, I don't care much for the terminology "phantom cid" ... there's
nothing particularly "phantom" about them, seeing they get onto disk.
Can anyone think of a better name?  The best I can do offhand is
"merged cid" or "cid pair", which aren't inspiring.  Now would be a
good time to change it while it'd still be an easy search-and-replace
over a patch file ...

            regards, tom lane

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
>> and ifdef blocks before applying.
>
> BTW, I don't care much for the terminology "phantom cid" ... there's
> nothing particularly "phantom" about them, seeing they get onto disk.
> Can anyone think of a better name?  The best I can do offhand is
> "merged cid" or "cid pair", which aren't inspiring.

MultiCid, like the MultiXacts? Maybe not, they're quite different beasts...

Alias cid? Mapped cid? Compressed cid? Hero cid? :)

I'm happy with phantom cid myself. It sounds cool, and they are a bit
phantom-like because the true meaning of a phantom cid is lost when the
transaction ends.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
"Joshua D. Drake"
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>> I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
>>> and ifdef blocks before applying.
>>
>> BTW, I don't care much for the terminology "phantom cid" ... there's
>> nothing particularly "phantom" about them, seeing they get onto disk.
>> Can anyone think of a better name?  The best I can do offhand is
>> "merged cid" or "cid pair", which aren't inspiring.
>
> MultiCid, like the MultiXacts? Maybe not, they're quite different beasts...
>
> Alias cid? Mapped cid? Compressed cid? Hero cid? :)


>
> I'm happy with phantom cid myself. It sounds cool, and they are a bit
> phantom-like because the true meaning of a phantom cid is lost when the
> transaction ends.
>

Phantom was also a super hero ;)

Sincerely,

Joshua D. Drake


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
> >Heikki Linnakangas <heikki@enterprisedb.com> writes:
> >>I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
> >>and ifdef blocks before applying.
> >
> >BTW, I don't care much for the terminology "phantom cid" ... there's
> >nothing particularly "phantom" about them, seeing they get onto disk.
> >Can anyone think of a better name?  The best I can do offhand is
> >"merged cid" or "cid pair", which aren't inspiring.
>
> MultiCid, like the MultiXacts? Maybe not, they're quite different beasts...

Dual cid?  Double cid?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Heikki Linnakangas wrote:
>> Tom Lane wrote:
>>> BTW, I don't care much for the terminology "phantom cid" ... there's
>>> nothing particularly "phantom" about them, seeing they get onto disk.
>>> Can anyone think of a better name?  The best I can do offhand is
>>> "merged cid" or "cid pair", which aren't inspiring.
>>
>> MultiCid, like the MultiXacts? Maybe not, they're quite different beasts...

> Dual cid?  Double cid?

"Double cid" doesn't sound too bad.  Another thought that just came to
mind is "cid interval" or some variant of that.

            regards, tom lane

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Bruce Momjian
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Heikki Linnakangas wrote:
> >> Tom Lane wrote:
> >>> BTW, I don't care much for the terminology "phantom cid" ... there's
> >>> nothing particularly "phantom" about them, seeing they get onto disk.
> >>> Can anyone think of a better name?  The best I can do offhand is
> >>> "merged cid" or "cid pair", which aren't inspiring.
> >>
> >> MultiCid, like the MultiXacts? Maybe not, they're quite different beasts...
>
> > Dual cid?  Double cid?
>
> "Double cid" doesn't sound too bad.  Another thought that just came to
> mind is "cid interval" or some variant of that.

I don't like "double ctid" because it is really just one ctid, but
represents two.  I am thinking "packed ctid" is the right wording.  It
doesn't have the same impact as "phantom", but it is probably better.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> Here's an updated version of the phantom command ids patch.
>
> I found one more subtle safety issue. The array and hash table for
> phantom command ids is dynamically grown when HeapTupleHeaderSetCmax is
> called. Unfortunately, since HeapTupleHeaderSetCmax is used inside a
> critical sections, running out of memory while trying to grow them would
> cause a PANIC. That's why I moved the SetXmax/SetCmax calls outside
> critical sections in heapam.c. I believe that's safe; if a backend
> aborts after setting the xmax/cmax, no-one is going to care about the
> xid of an aborted transaction in there.
>
> Per Tom's suggestion, I replaced the function cache code in fmgr.c and
> similar code in plperl.c, pltcl.c, plpgsql/pl_comp.c and plpython.c to
> use xmin+tid instead of xmin+cmin for the up-to-dateness check. I don't
> have any tcl, perl or python test cases handy to test them, but the
> change is small and essentially same for all of the above. Is there any
> regression tests for the PL languages?
>
> I made cmin and cmax system attributes aliases for the same physical
> commandid field. I support the idea of a complete overhaul of those
> system attributes, but let's do that in a separate patch.
>
> To measure the overhead, I ran a plpgsql test case that updates a single
> row 10000 times in a loop, generating a new phantom command id in each
> iteration. The test took ~5% longer with the patch, so I think that's
> acceptable. I couldn't measure a difference with pgbench (as expected).
>
> I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
> and ifdef blocks before applying.

Heikki, I found something odd in your patch.  You had an extra
parentheses at the end of the line in the orginal and new version of the
patch (attached).  I removed it before applying, but I just wanted to
confirm this was OK.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
***************
*** 162,168 ****
      {
          /* We have a compiled function, but is it still valid? */
          if (!(function->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!               function->fn_cmin == HeapTupleHeaderGetCmin(procTup->t_data)))
          {
              /* Nope, drop the function and associated storage */
              delete_function(function);
--- 162,168 ----
      {
          /* We have a compiled function, but is it still valid? */
          if (!(function->fn_xmin == HeapTupleHeaderGetXmin(procTup->t_data) &&
!               ItemPointerEquals(&function->fn_tid, &procTup->t_self)))
          {
              /* Nope, drop the function and associated storage */
              delete_function(function);

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Heikki Linnakangas wrote:
> > Here's an updated version of the phantom command ids patch.
> >
> > I found one more subtle safety issue. The array and hash table for
> > phantom command ids is dynamically grown when HeapTupleHeaderSetCmax is
> > called. Unfortunately, since HeapTupleHeaderSetCmax is used inside a
> > critical sections, running out of memory while trying to grow them would
> > cause a PANIC. That's why I moved the SetXmax/SetCmax calls outside
> > critical sections in heapam.c. I believe that's safe; if a backend
> > aborts after setting the xmax/cmax, no-one is going to care about the
> > xid of an aborted transaction in there.
> >
> > Per Tom's suggestion, I replaced the function cache code in fmgr.c and
> > similar code in plperl.c, pltcl.c, plpgsql/pl_comp.c and plpython.c to
> > use xmin+tid instead of xmin+cmin for the up-to-dateness check. I don't
> > have any tcl, perl or python test cases handy to test them, but the
> > change is small and essentially same for all of the above. Is there any
> > regression tests for the PL languages?
> >
> > I made cmin and cmax system attributes aliases for the same physical
> > commandid field. I support the idea of a complete overhaul of those
> > system attributes, but let's do that in a separate patch.
> >
> > To measure the overhead, I ran a plpgsql test case that updates a single
> > row 10000 times in a loop, generating a new phantom command id in each
> > iteration. The test took ~5% longer with the patch, so I think that's
> > acceptable. I couldn't measure a difference with pgbench (as expected).
> >
> > I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
> > and ifdef blocks before applying.
>
> Heikki, I found something odd in your patch.  You had an extra
> parentheses at the end of the line in the orginal and new version of the
> patch (attached).  I removed it before applying, but I just wanted to
> confirm this was OK.

Huh, you already applied it?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Heikki Linnakangas
Date:
Bruce Momjian wrote:
> Heikki, I found something odd in your patch.  You had an extra
> parentheses at the end of the line in the orginal and new version of the
> patch (attached).  I removed it before applying, but I just wanted to
> confirm this was OK.

Looking at the CVS history, it looks like Tom changed that piece of code
recently in this commit:

> revision 1.110
> date: 2007-01-30 22:05:12 +0000;  author: tgl;  state: Exp;  lines: +88 -21;
> Repair oversights in the mechanism used to store compiled plpgsql functions.
> The original coding failed (tried to access deallocated memory) if there were
> two active call sites (fn_extra pointers) for the same function and the
> function definition was updated.  Also, if an update of a recursive function
> was detected upon nested entry to the function, the existing compiled version
> was summarily deallocated, resulting in crash upon return to the outer
> instance.  Problem observed while studying a bug report from Sergiy
> Vyshnevetskiy.
>
> Bug does not exist before 8.1 since older versions just leaked the memory of
> obsoleted compiled functions, rather than trying to reclaim it.

Note that the condition in the if-clause is now the other way round, and
the delete_function call is now in the else-branch. Did you get that
right in your commit?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> Bruce Momjian wrote:
> > Heikki, I found something odd in your patch.  You had an extra
> > parentheses at the end of the line in the orginal and new version of the
> > patch (attached).  I removed it before applying, but I just wanted to
> > confirm this was OK.
>
> Looking at the CVS history, it looks like Tom changed that piece of code
> recently in this commit:
>
> > revision 1.110
> > date: 2007-01-30 22:05:12 +0000;  author: tgl;  state: Exp;  lines: +88 -21;
> > Repair oversights in the mechanism used to store compiled plpgsql functions.
> > The original coding failed (tried to access deallocated memory) if there were
> > two active call sites (fn_extra pointers) for the same function and the
> > function definition was updated.  Also, if an update of a recursive function
> > was detected upon nested entry to the function, the existing compiled version
> > was summarily deallocated, resulting in crash upon return to the outer
> > instance.  Problem observed while studying a bug report from Sergiy
> > Vyshnevetskiy.
> >
> > Bug does not exist before 8.1 since older versions just leaked the memory of
> > obsoleted compiled functions, rather than trying to reclaim it.
>
> Note that the condition in the if-clause is now the other way round, and
> the delete_function call is now in the else-branch. Did you get that
> right in your commit?

No, I did not see that, but I see it now.  I haven't committed anything
yet.  I will research that and get it right.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Heikki, I found something odd in your patch.  You had an extra
> parentheses at the end of the line in the orginal and new version of the
> patch (attached).  I removed it before applying, but I just wanted to
> confirm this was OK.

Please do not apply that patch --- I want to review it first.

            regards, tom lane

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> > > Per Tom's suggestion, I replaced the function cache code in fmgr.c and
> > > similar code in plperl.c, pltcl.c, plpgsql/pl_comp.c and plpython.c to
> > > use xmin+tid instead of xmin+cmin for the up-to-dateness check. I don't
> > > have any tcl, perl or python test cases handy to test them, but the
> > > change is small and essentially same for all of the above. Is there any
> > > regression tests for the PL languages?
> > >
> > > I made cmin and cmax system attributes aliases for the same physical
> > > commandid field. I support the idea of a complete overhaul of those
> > > system attributes, but let's do that in a separate patch.
> > >
> > > To measure the overhead, I ran a plpgsql test case that updates a single
> > > row 10000 times in a loop, generating a new phantom command id in each
> > > iteration. The test took ~5% longer with the patch, so I think that's
> > > acceptable. I couldn't measure a difference with pgbench (as expected).
> > >
> > > I think the patch is ready. Please remove the PHANTOMCID_DEBUG define
> > > and ifdef blocks before applying.
> >
> > Heikki, I found something odd in your patch.  You had an extra
> > parentheses at the end of the line in the orginal and new version of the
> > patch (attached).  I removed it before applying, but I just wanted to
> > confirm this was OK.
>
> Huh, you already applied it?

When I said "applying", I meant before applying the patch to my CVS
tree, not before commiting, because I didn't commit it.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Heikki, I found something odd in your patch.  You had an extra
> > parentheses at the end of the line in the orginal and new version of the
> > patch (attached).  I removed it before applying, but I just wanted to
> > confirm this was OK.
>
> Please do not apply that patch --- I want to review it first.

OK.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> I found one more subtle safety issue. The array and hash table for
> phantom command ids is dynamically grown when HeapTupleHeaderSetCmax is
> called. Unfortunately, since HeapTupleHeaderSetCmax is used inside a
> critical sections, running out of memory while trying to grow them would
> cause a PANIC. That's why I moved the SetXmax/SetCmax calls outside
> critical sections in heapam.c. I believe that's safe; if a backend
> aborts after setting the xmax/cmax, no-one is going to care about the
> xid of an aborted transaction in there.

I don't like that one bit; I think a saner solution is to refactor the
API of combocid.c so that we can obtain the required CID before
modifying the page.  It'll mean it's not a drop-in replacement for
HeapTupleSetCmax, but all callers of that are going to need a close look
anyway.

            regards, tom lane

Re: [pgsql-patches] Phantom Command IDs, updated patch

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Here's an updated version of the phantom command ids patch.

Applied with some revisions (notably, renaming 'em to "combo" command IDs).

            regards, tom lane