Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Date
Msg-id 47D7DE12.80700@enterprisedb.com
Whole thread Raw
In response to Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit  ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit  (Bruce Momjian <bruce@momjian.us>)
List pgsql-patches
Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> Elsewhere in our codebase where we use arrays that are enlarged as
>> needed, we keep track of the "allocated" size and the "used" size of the
>> array separately, and only call repalloc when the array fills up, and
>> repalloc a larger than necessary array when it does. I chose to just
>> call repalloc every time instead, as repalloc is smart enough to fall
>> out quickly if the chunk the allocation was made in is already larger
>> than the new size. There might be some gain avoiding the repeated
>> repalloc calls, but I doubt it's worth the code complexity, and calling
>> repalloc with a larger than necessary size can actually force it to
>> unnecessarily allocate a new, larger chunk instead of reusing the old
>> one. Thoughts on that?
>
> Seems like a pretty bad idea to me, as the behavior you're counting on
> only applies to chunks up to 8K or thereabouts.

Oh, you're right. Though I'm sure libc realloc has all kinds of smarts
as well, it does seem better to not rely too much on that.

> In a situation where
> you are subcommitting lots of XIDs one at a time, this is likely to have
> quite awful behavior (or at least, you're at the mercy of the local
> malloc library as to how bad it is).  I'd go with the same
> double-it-each-time-needed approach we use elsewhere.

Yep, patch attached. I also changed xactGetCommittedChildren to return
the original array instead of copying it, as Alvaro suggested.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.39
diff -c -r1.39 twophase.c
*** src/backend/access/transam/twophase.c    1 Jan 2008 19:45:48 -0000    1.39
--- src/backend/access/transam/twophase.c    12 Mar 2008 12:45:13 -0000
***************
*** 827,833 ****
          save_state_data(children, hdr.nsubxacts * sizeof(TransactionId));
          /* While we have the child-xact data, stuff it in the gxact too */
          GXactLoadSubxactData(gxact, hdr.nsubxacts, children);
-         pfree(children);
      }
      if (hdr.ncommitrels > 0)
      {
--- 827,832 ----
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.258
diff -c -r1.258 xact.c
*** src/backend/access/transam/xact.c    4 Mar 2008 19:54:06 -0000    1.258
--- src/backend/access/transam/xact.c    12 Mar 2008 13:41:10 -0000
***************
*** 130,136 ****
      int            gucNestLevel;    /* GUC context nesting depth */
      MemoryContext curTransactionContext;        /* my xact-lifetime context */
      ResourceOwner curTransactionOwner;    /* my query resources */
!     List       *childXids;        /* subcommitted child XIDs */
      Oid            prevUser;        /* previous CurrentUserId setting */
      bool        prevSecDefCxt;    /* previous SecurityDefinerContext setting */
      bool        prevXactReadOnly;        /* entry-time xact r/o state */
--- 130,138 ----
      int            gucNestLevel;    /* GUC context nesting depth */
      MemoryContext curTransactionContext;        /* my xact-lifetime context */
      ResourceOwner curTransactionOwner;    /* my query resources */
!     TransactionId    *childXids;    /* subcommitted child XIDs, in XID order */
!     int            nChildXids;        /* # of subcommitted child XIDs */
!     int            maxChildXids;    /* allocated size of childXids */
      Oid            prevUser;        /* previous CurrentUserId setting */
      bool        prevSecDefCxt;    /* previous SecurityDefinerContext setting */
      bool        prevXactReadOnly;        /* entry-time xact r/o state */
***************
*** 156,162 ****
      0,                            /* GUC context nesting depth */
      NULL,                        /* cur transaction context */
      NULL,                        /* cur transaction resource owner */
!     NIL,                        /* subcommitted child Xids */
      InvalidOid,                    /* previous CurrentUserId setting */
      false,                        /* previous SecurityDefinerContext setting */
      false,                        /* entry-time xact r/o state */
--- 158,166 ----
      0,                            /* GUC context nesting depth */
      NULL,                        /* cur transaction context */
      NULL,                        /* cur transaction resource owner */
!     NULL,                        /* subcommitted child Xids */
!     0,                            /* # of subcommitted child Xids */
!     0,                            /* allocated size of childXids */
      InvalidOid,                    /* previous CurrentUserId setting */
      false,                        /* previous SecurityDefinerContext setting */
      false,                        /* entry-time xact r/o state */
***************
*** 559,565 ****
       */
      for (s = CurrentTransactionState; s != NULL; s = s->parent)
      {
!         ListCell   *cell;

          if (s->state == TRANS_ABORT)
              continue;
--- 563,569 ----
       */
      for (s = CurrentTransactionState; s != NULL; s = s->parent)
      {
!         int low, high;

          if (s->state == TRANS_ABORT)
              continue;
***************
*** 567,576 ****
              continue;            /* it can't have any child XIDs either */
          if (TransactionIdEquals(xid, s->transactionId))
              return true;
!         foreach(cell, s->childXids)
          {
!             if (TransactionIdEquals(xid, lfirst_xid(cell)))
                  return true;
          }
      }

--- 571,596 ----
              continue;            /* it can't have any child XIDs either */
          if (TransactionIdEquals(xid, s->transactionId))
              return true;
!
!         /*
!          * Return true for any committed child of this subtransaction. As
!          * the childXids array is ordered, we can binary search.
!          */
!         low = 0;
!         high = s->nChildXids - 1;
!         while (low <= high)
          {
!             int                middle;
!             TransactionId    probe;
!
!             middle = low + (high - low) / 2;
!             probe = s->childXids[middle];
!             if (TransactionIdEquals(probe, xid))
                  return true;
+             else if (TransactionIdPrecedes(probe, xid))
+                 low = middle + 1;
+             else
+                 high = middle - 1;
          }
      }

***************
*** 985,992 ****
      /* Clean up local data */
      if (rels)
          pfree(rels);
-     if (children)
-         pfree(children);

      return latestXid;
  }
--- 1005,1010 ----
***************
*** 1067,1100 ****
  AtSubCommit_childXids(void)
  {
      TransactionState s = CurrentTransactionState;
!     MemoryContext old_cxt;

      Assert(s->parent != NULL);

      /*
!      * We keep the child-XID lists in TopTransactionContext; this avoids
!      * setting up child-transaction contexts for what might be just a few
!      * bytes of grandchild XIDs.
       */
!     old_cxt = MemoryContextSwitchTo(TopTransactionContext);

!     s->parent->childXids = lappend_xid(s->parent->childXids,
!                                        s->transactionId);
!
!     if (s->childXids != NIL)
      {
!         s->parent->childXids = list_concat(s->parent->childXids,
!                                            s->childXids);

          /*
!          * list_concat doesn't free the list header for the second list; do so
!          * here to avoid memory leakage (kluge)
           */
!         pfree(s->childXids);
!         s->childXids = NIL;
      }

!     MemoryContextSwitchTo(old_cxt);
  }

  /*
--- 1085,1155 ----
  AtSubCommit_childXids(void)
  {
      TransactionState s = CurrentTransactionState;
!     TransactionId new_nChildXids;

      Assert(s->parent != NULL);

      /*
!      * The parent childXids array will need to hold all my XID and all my
!      * childXids, in addition to XIDs already there.
       */
!     new_nChildXids = s->parent->nChildXids + s->nChildXids + 1;

!     /* Allocate or enlarge the parent array if necessary */
!     if (s->parent->maxChildXids < new_nChildXids)
      {
!         int                new_maxChildXids;
!         TransactionId  *new_childXids;

          /*
!          * Make it 2x what's needed right now, to avoid having to enlarge it
!          * repeatedly. But we can't go above MaxAllocSize.
           */
!         new_maxChildXids = Min(new_nChildXids * 2,
!                                MaxAllocSize / sizeof(TransactionId));
!
!         if (new_maxChildXids < new_nChildXids)
!             ereport(ERROR,
!                     (errcode(ERRCODE_OUT_OF_MEMORY),
!                      errmsg("maximum number of committed subtransactions (%d) reached",
!                             new_maxChildXids)));
!
!         /*
!          * We keep the child-XID lists in TopTransactionContext; this avoids
!          * setting up child-transaction contexts for what might be just a few
!          * bytes of grandchild XIDs.
!          */
!         if (s->parent->childXids == NULL)
!             new_childXids =
!                 MemoryContextAlloc(TopTransactionContext,
!                                    new_maxChildXids * sizeof(TransactionId));
!         else
!             new_childXids = repalloc(s->parent->childXids,
!                                      new_maxChildXids * sizeof(TransactionId));
!
!         s->parent->childXids  = new_childXids;
!         s->parent->maxChildXids = new_maxChildXids;
      }

!     /*
!      * Copy all my XIDs to parent's array.
!      *
!      * Note: We rely on the fact that the XID of a child always follows that
!      * of its parent's. By copying the XID of this subtransaction before the
!      * XIDs of its children, we ensure that the array stays ordered. Likewise,
!      * all XIDs already in the array belong to our parent, or subtransactions
!      * started and committed before us, so their XIDs must precede ours.
!      */
!     s->parent->childXids[s->parent->nChildXids] = s->transactionId;
!
!     if (s->nChildXids > 0)
!         memcpy(&s->parent->childXids[s->parent->nChildXids + 1],
!                s->childXids, s->nChildXids * sizeof(TransactionId));
!
!     s->parent->nChildXids = new_nChildXids;
!
!     if (s->childXids != NULL)
!         pfree(s->childXids);
  }

  /*
***************
*** 1259,1266 ****
      /* And clean up local data */
      if (rels)
          pfree(rels);
-     if (children)
-         pfree(children);

      return latestXid;
  }
--- 1314,1319 ----
***************
*** 1336,1343 ****
       * AtSubCommit_childXids).    This means we'd better free the list
       * explicitly at abort to avoid leakage.
       */
!     list_free(s->childXids);
!     s->childXids = NIL;
  }

  /* ----------------------------------------------------------------
--- 1389,1399 ----
       * AtSubCommit_childXids).    This means we'd better free the list
       * explicitly at abort to avoid leakage.
       */
!     if (s->childXids != NULL)
!         pfree(s->childXids);
!     s->childXids = NULL;
!     s->nChildXids = 0;
!     s->maxChildXids = 0;
  }

  /* ----------------------------------------------------------------
***************
*** 1506,1512 ****
       */
      s->nestingLevel = 1;
      s->gucNestLevel = 1;
!     s->childXids = NIL;
      GetUserIdAndContext(&s->prevUser, &s->prevSecDefCxt);
      /* SecurityDefinerContext should never be set outside a transaction */
      Assert(!s->prevSecDefCxt);
--- 1562,1570 ----
       */
      s->nestingLevel = 1;
      s->gucNestLevel = 1;
!     s->childXids = NULL;
!     s->nChildXids = 0;
!     s->maxChildXids = 0;
      GetUserIdAndContext(&s->prevUser, &s->prevSecDefCxt);
      /* SecurityDefinerContext should never be set outside a transaction */
      Assert(!s->prevSecDefCxt);
***************
*** 1702,1708 ****
      s->subTransactionId = InvalidSubTransactionId;
      s->nestingLevel = 0;
      s->gucNestLevel = 0;
!     s->childXids = NIL;

      /*
       * done with commit processing, set current transaction state back to
--- 1760,1768 ----
      s->subTransactionId = InvalidSubTransactionId;
      s->nestingLevel = 0;
      s->gucNestLevel = 0;
!     s->childXids = NULL;
!     s->nChildXids = 0;
!     s->maxChildXids = 0;

      /*
       * done with commit processing, set current transaction state back to
***************
*** 1931,1937 ****
      s->subTransactionId = InvalidSubTransactionId;
      s->nestingLevel = 0;
      s->gucNestLevel = 0;
!     s->childXids = NIL;

      /*
       * done with 1st phase commit processing, set current transaction state
--- 1991,1999 ----
      s->subTransactionId = InvalidSubTransactionId;
      s->nestingLevel = 0;
      s->gucNestLevel = 0;
!     s->childXids = NULL;
!     s->nChildXids = 0;
!     s->maxChildXids = 0;

      /*
       * done with 1st phase commit processing, set current transaction state
***************
*** 2101,2107 ****
      s->subTransactionId = InvalidSubTransactionId;
      s->nestingLevel = 0;
      s->gucNestLevel = 0;
!     s->childXids = NIL;

      /*
       * done with abort processing, set current transaction state back to
--- 2163,2171 ----
      s->subTransactionId = InvalidSubTransactionId;
      s->nestingLevel = 0;
      s->gucNestLevel = 0;
!     s->childXids = NULL;
!     s->nChildXids = 0;
!     s->maxChildXids = 0;

      /*
       * done with abort processing, set current transaction state back to
***************
*** 4051,4056 ****
--- 4115,4133 ----
  static void
  ShowTransactionStateRec(TransactionState s)
  {
+     StringInfoData buf;
+
+     initStringInfo(&buf);
+
+     if (s->nChildXids > 0)
+     {
+         int i;
+
+         appendStringInfo(&buf, "%u", s->childXids[0]);
+         for (i = 1; i < s->nChildXids; i++)
+             appendStringInfo(&buf, " %u", s->childXids[i]);
+     }
+
      if (s->parent)
          ShowTransactionStateRec(s->parent);

***************
*** 4064,4071 ****
                               (unsigned int) s->subTransactionId,
                               (unsigned int) currentCommandId,
                               currentCommandIdUsed ? " (used)" : "",
!                              s->nestingLevel,
!                              nodeToString(s->childXids))));
  }

  /*
--- 4141,4147 ----
                               (unsigned int) s->subTransactionId,
                               (unsigned int) currentCommandId,
                               currentCommandIdUsed ? " (used)" : "",
!                              s->nestingLevel, buf.data)));
  }

  /*
***************
*** 4144,4179 ****
   * xactGetCommittedChildren
   *
   * Gets the list of committed children of the current transaction.    The return
!  * value is the number of child transactions.  *children is set to point to a
!  * palloc'd array of TransactionIds.  If there are no subxacts, *children is
!  * set to NULL.
   */
  int
  xactGetCommittedChildren(TransactionId **ptr)
  {
      TransactionState s = CurrentTransactionState;
-     int            nchildren;
-     TransactionId *children;
-     ListCell   *p;

!     nchildren = list_length(s->childXids);
!     if (nchildren == 0)
!     {
          *ptr = NULL;
!         return 0;
!     }
!
!     children = (TransactionId *) palloc(nchildren * sizeof(TransactionId));
!     *ptr = children;
!
!     foreach(p, s->childXids)
!     {
!         TransactionId child = lfirst_xid(p);
!
!         *children++ = child;
!     }

!     return nchildren;
  }

  /*
--- 4220,4241 ----
   * xactGetCommittedChildren
   *
   * Gets the list of committed children of the current transaction.    The return
!  * value is the number of child transactions.  *children is set to point to an
!  * array of TransactionIds. The array is allocated in TopTransactioncontext;
!  * the caller should not try to pfree() it. If there are no subxacts,
!  * *children is set to NULL.
   */
  int
  xactGetCommittedChildren(TransactionId **ptr)
  {
      TransactionState s = CurrentTransactionState;

!     if (s->nChildXids == 0)
          *ptr = NULL;
!     else
!         *ptr = s->childXids;

!     return s->nChildXids;
  }

  /*
Index: src/include/nodes/pg_list.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/nodes/pg_list.h,v
retrieving revision 1.57
diff -c -r1.57 pg_list.h
*** src/include/nodes/pg_list.h    1 Jan 2008 19:45:58 -0000    1.57
--- src/include/nodes/pg_list.h    12 Mar 2008 10:32:25 -0000
***************
*** 26,36 ****
   * (At the moment, ints and Oids are the same size, but they may not
   * always be so; try to be careful to maintain the distinction.)
   *
-  * There is also limited support for lists of TransactionIds; since these
-  * are used in only one or two places, we don't provide a full implementation,
-  * but map them onto Oid lists.  This effectively assumes that TransactionId
-  * is no wider than Oid and both are unsigned types.
-  *
   *
   * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
--- 26,31 ----
***************
*** 160,171 ****
  #define list_make4_oid(x1,x2,x3,x4) lcons_oid(x1, list_make3_oid(x2, x3, x4))

  /*
-  * Limited support for lists of TransactionIds, mapped onto lists of Oids
-  */
- #define lfirst_xid(lc)                ((TransactionId) lfirst_oid(lc))
- #define lappend_xid(list, datum)    lappend_oid(list, (Oid) (datum))
-
- /*
   * foreach -
   *      a convenience macro which loops through the list
   */
--- 155,160 ----

pgsql-patches by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Proposed patch for LISTEN/NOTIFY race condition
Next
From: "Pavan Deolasee"
Date:
Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit