A little COPY speedup - Mailing list pgsql-patches

From Heikki Linnakangas
Subject A little COPY speedup
Date
Msg-id 45E706DD.6080404@enterprisedb.com
List pgsql-patches
One complaint we've heard from clients trying out EDB or PostgreSQL is
that loading data is slower than on other DBMSs.

I ran oprofile on a COPY FROM to get an overview of where the CPU time
is spent. To my amazement, the function at the top of the list was
PageAddItem with 16% of samples.

On every row, PageAddItem scans all the line pointers on the target
page, only to find that they're all in use, and then creates a new line
pointer. That adds up, especially with narrow tuples like the ones I used
in the test.

Attached is a fix for that. It adds a flag to each heap page that
indicates that "there aren't any free line pointers on this page, so
don't bother trying". Heap pages haven't had any heap-specific per-page
data before, so this patch adds a HeapPageOpaqueData struct that's
stored in the special space.

My simple test case of a COPY FROM of 10000000 tuples took 19.6 s
without the patch, and 17.7 s with the patch applied, roughly 10% less
time. Your mileage may vary.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.228
diff -c -r1.228 heapam.c
*** src/backend/access/heap/heapam.c    9 Feb 2007 03:35:33 -0000    1.228
--- src/backend/access/heap/heapam.c    1 Mar 2007 16:27:38 -0000
***************
*** 3291,3296 ****
--- 3291,3297 ----
      Relation    reln;
      Buffer        buffer;
      Page        page;
+     HeapPageOpaque opq;

      if (record->xl_info & XLR_BKP_BLOCK_1)
          return;
***************
*** 3300,3305 ****
--- 3301,3307 ----
      if (!BufferIsValid(buffer))
          return;
      page = (Page) BufferGetPage(buffer);
+     opq = (HeapPageOpaque) PageGetSpecialPointer(page);

      if (XLByteLE(lsn, PageGetLSN(page)))
      {
***************
*** 3327,3332 ****
--- 3329,3337 ----

      PageRepairFragmentation(page, NULL);

+     /* clear the hint flag since we just freed some line pointers */
+     opq->hpo_flags &= ~HP_NOFREELINEPOINTERS;
+
      PageSetLSN(page, lsn);
      PageSetTLI(page, ThisTimeLineID);
      MarkBufferDirty(buffer);
Index: src/backend/access/heap/hio.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/hio.c,v
retrieving revision 1.65
diff -c -r1.65 hio.c
*** src/backend/access/heap/hio.c    5 Feb 2007 04:22:18 -0000    1.65
--- src/backend/access/heap/hio.c    1 Mar 2007 16:44:47 -0000
***************
*** 33,51 ****
                       HeapTuple tuple)
  {
      Page        pageHeader;
!     OffsetNumber offnum;
      ItemId        itemId;
      Item        item;

-     /* Add the tuple to the page */
      pageHeader = BufferGetPage(buffer);

      offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
!                          tuple->t_len, InvalidOffsetNumber, LP_USED);

      if (offnum == InvalidOffsetNumber)
          elog(PANIC, "failed to add tuple to page");

      /* Update tuple->t_self to the actual position where it was stored */
      ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);

--- 33,70 ----
                       HeapTuple tuple)
  {
      Page        pageHeader;
!     OffsetNumber offnum, maxoff;
      ItemId        itemId;
      Item        item;
+     HeapPageOpaque opq;

      pageHeader = BufferGetPage(buffer);
+     opq = (HeapPageOpaque) PageGetSpecialPointer(pageHeader);
+     maxoff = PageGetMaxOffsetNumber(pageHeader);
+
+     /* If we know there are no free line pointers, don't waste cycles
+      * searching for one. The flag is set only when there definitely
+      * aren't any free line pointers on the page; the absence of the
+      * flag doesn't mean anything. There might still not be any free line
+      * pointers left. We'll set the flag to save work for future inserts
+      * when that happens.
+      */
+     if (opq->hpo_flags & HP_NOFREELINEPOINTERS)
+         offnum = OffsetNumberNext(maxoff);
+     else
+         offnum = InvalidOffsetNumber;
+
+     /* Add the tuple to the page */

      offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
!                          tuple->t_len, offnum, LP_USED);

      if (offnum == InvalidOffsetNumber)
          elog(PANIC, "failed to add tuple to page");

+     if (offnum > maxoff)
+         opq->hpo_flags |= HP_NOFREELINEPOINTERS;
+
      /* Update tuple->t_self to the actual position where it was stored */
      ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);

***************
*** 309,315 ****
               BufferGetBlockNumber(buffer),
               RelationGetRelationName(relation));

!     PageInit(pageHeader, BufferGetPageSize(buffer), 0);

      if (len > PageGetFreeSpace(pageHeader))
      {
--- 328,335 ----
               BufferGetBlockNumber(buffer),
               RelationGetRelationName(relation));

!     PageInit(pageHeader, BufferGetPageSize(buffer),
!              sizeof(HeapPageOpaqueData));

      if (len > PageGetFreeSpace(pageHeader))
      {
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.346
diff -c -r1.346 vacuum.c
*** src/backend/commands/vacuum.c    15 Feb 2007 23:23:22 -0000    1.346
--- src/backend/commands/vacuum.c    1 Mar 2007 16:29:03 -0000
***************
*** 2416,2425 ****
--- 2416,2428 ----
                          maxoff;
              int            uncnt;
              int            num_tuples = 0;
+             HeapPageOpaque opq;

              buf = ReadBuffer(onerel, vacpage->blkno);
              LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
              page = BufferGetPage(buf);
+             opq = (HeapPageOpaque) PageGetSpecialPointer(page);
+
              maxoff = PageGetMaxOffsetNumber(page);
              for (offnum = FirstOffsetNumber;
                   offnum <= maxoff;
***************
*** 2453,2458 ****
--- 2456,2463 ----
              START_CRIT_SECTION();

              uncnt = PageRepairFragmentation(page, unused);
+             if (uncnt > 0)
+                 opq->hpo_flags &= ~HP_NOFREELINEPOINTERS;

              MarkBufferDirty(buf);

***************
*** 2907,2912 ****
--- 2912,2918 ----
      Page        page = BufferGetPage(buffer);
      ItemId        itemid;
      int            i;
+     HeapPageOpaque opq = (HeapPageOpaque) PageGetSpecialPointer(page);

      /* There shouldn't be any tuples moved onto the page yet! */
      Assert(vacpage->offsets_used == 0);
***************
*** 2920,2925 ****
--- 2926,2933 ----
      }

      uncnt = PageRepairFragmentation(page, unused);
+     if (uncnt > 0)
+         opq->hpo_flags &= ~HP_NOFREELINEPOINTERS;

      MarkBufferDirty(buffer);

Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.85
diff -c -r1.85 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c    21 Feb 2007 22:47:45 -0000    1.85
--- src/backend/commands/vacuumlazy.c    1 Mar 2007 16:28:02 -0000
***************
*** 588,593 ****
--- 588,594 ----
      int            uncnt;
      Page        page = BufferGetPage(buffer);
      ItemId        itemid;
+     HeapPageOpaque opq = (HeapPageOpaque) PageGetSpecialPointer(page);

      START_CRIT_SECTION();

***************
*** 605,610 ****
--- 606,613 ----
      }

      uncnt = PageRepairFragmentation(page, unused);
+     if (uncnt > 0)
+         opq->hpo_flags &= ~HP_NOFREELINEPOINTERS;

      MarkBufferDirty(buffer);

Index: src/include/access/htup.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/htup.h,v
retrieving revision 1.92
diff -c -r1.92 htup.h
*** src/include/access/htup.h    27 Feb 2007 23:48:09 -0000    1.92
--- src/include/access/htup.h    1 Mar 2007 16:49:11 -0000
***************
*** 18,23 ****
--- 18,38 ----
  #include "storage/relfilenode.h"

  /*
+  * Heap page special space. At the end of the page, we store some per-page
+  * information that's specific to heapam. At the moment there's just one flag.
+  */
+ typedef struct HeapPageOpaqueData
+ {
+     uint8 hpo_flags;
+ } HeapPageOpaqueData;
+
+ typedef HeapPageOpaqueData *HeapPageOpaque;
+
+ /* Bits defined in hpo_flags */
+ #define HP_NOFREELINEPOINTERS    (1 << 0)    /* all line pointers are in use */
+
+
+ /*
   * MaxTupleAttributeNumber limits the number of (user) columns in a tuple.
   * The key limit on this value is that the size of the fixed overhead for
   * a tuple, plus the size of the null-values bitmap (at 1 bit per column),
***************
*** 317,331 ****
  /*
   * MaxHeapTupleSize is the maximum allowed size of a heap tuple, including
   * header and MAXALIGN alignment padding.  Basically it's BLCKSZ minus the
!  * other stuff that has to be on a disk page.  Since heap pages use no
!  * "special space", there's no deduction for that.
   *
   * NOTE: we do not need to count an ItemId for the tuple because
   * sizeof(PageHeaderData) includes the first ItemId on the page.  But beware
   * of assuming that, say, you can fit 2 tuples of size MaxHeapTupleSize/2
   * on the same page.
   */
! #define MaxHeapTupleSize  (BLCKSZ - MAXALIGN(sizeof(PageHeaderData)))

  /*
   * MaxHeapTuplesPerPage is an upper bound on the number of tuples that can
--- 332,346 ----
  /*
   * MaxHeapTupleSize is the maximum allowed size of a heap tuple, including
   * header and MAXALIGN alignment padding.  Basically it's BLCKSZ minus the
!  * other stuff that has to be on a disk page.
   *
   * NOTE: we do not need to count an ItemId for the tuple because
   * sizeof(PageHeaderData) includes the first ItemId on the page.  But beware
   * of assuming that, say, you can fit 2 tuples of size MaxHeapTupleSize/2
   * on the same page.
   */
! #define MaxHeapTupleSize  (BLCKSZ - MAXALIGN(sizeof(PageHeaderData)) \
!                                   - MAXALIGN(sizeof(HeapPageOpaqueData)))

  /*
   * MaxHeapTuplesPerPage is an upper bound on the number of tuples that can
