Thread: Changes improve the performance of INSERT and UPDATE

Changes improve the performance of INSERT and UPDATE

From
Hiroki Kataoka
Date:
Hi all,

This small patch improves the performance of INSERT and UPDATE.  By my
machine, these changes raised the performance about 5%~10% in pgbench.
Please take a benchmark in a reliable environment.  Since I may forget
some required changes, I specify that this patch is experimental.  But
note that all regression tests have passed.

Thanks,

--
Hiroki Kataoka <kataoka@interwiz.jp>
diff -ru postgresql-cvs.orig/src/backend/storage/page/bufpage.c postgresql-cvs/src/backend/storage/page/bufpage.c
--- postgresql-cvs.orig/src/backend/storage/page/bufpage.c    2005-06-15 16:53:00.000000000 +0900
+++ postgresql-cvs/src/backend/storage/page/bufpage.c    2005-07-14 07:41:11.000000000 +0900
@@ -42,6 +42,7 @@
     p->pd_lower = SizeOfPageHeaderData;
     p->pd_upper = pageSize - specialSize;
     p->pd_special = pageSize - specialSize;
+    p->pd_insoff = FirstOffsetNumber;
     PageSetPageSizeAndVersion(page, pageSize, PG_PAGE_LAYOUT_VERSION);
 }

@@ -100,7 +101,7 @@
  *    If offsetNumber is valid and <= current max offset in the page,
  *    insert item into the array at that position by shuffling ItemId's
  *    down to make room.
- *    If offsetNumber is not valid, then assign one by finding the first
+ *    If offsetNumber is not valid, then assign one by finding the next
  *    one that is both unused and deallocated.
  *
  *    !!! EREPORT(ERROR) IS DISALLOWED HERE !!!
@@ -120,6 +121,7 @@
     OffsetNumber limit;
     bool        needshuffle = false;
     bool        overwritemode = (flags & OverwritePageMode) != 0;
+    bool        findunusedlinp = false;

     flags &= ~OverwritePageMode;

@@ -165,9 +167,15 @@
     }
     else
     {
-        /* offsetNumber was not passed in, so find a free slot */
-        /* look for "recyclable" (unused & deallocated) ItemId */
-        for (offsetNumber = 1; offsetNumber < limit; offsetNumber++)
+        /* offsetNumber was not passed in, try to use pd_insoff */
+        if (OffsetNumberIsValid(phdr->pd_insoff))
+            offsetNumber = phdr->pd_insoff;
+        else
+            offsetNumber = FirstOffsetNumber;
+
+        /* so find a free slot look for "recyclable" (unused & deallocated) */
+        /* ItemId */
+        for (; offsetNumber < limit; offsetNumber++)
         {
             itemId = PageGetItemId(phdr, offsetNumber);
             if (((itemId->lp_flags & LP_USED) == 0) &&
@@ -175,6 +183,8 @@
                 break;
         }
         /* if no free slot, we'll put it at limit (1st open slot) */
+
+        findunusedlinp = true;
     }

     if (offsetNumber > limit)
@@ -222,6 +232,8 @@
     /* adjust page header */
     phdr->pd_lower = (LocationIndex) lower;
     phdr->pd_upper = (LocationIndex) upper;
+    if (findunusedlinp)
+        phdr->pd_insoff = OffsetNumberNext(offsetNumber);

     return offsetNumber;
 }
@@ -345,8 +357,13 @@
             lp->lp_flags &= ~(LP_USED | LP_DELETE);
         if (lp->lp_flags & LP_USED)
             nused++;
-        else if (unused)
-            unused[i - nused] = (OffsetNumber) i;
+        else
+        {
+            if (i - nused == 0)    /* check if it's first unused line pointer */
+                ((PageHeader) page)->pd_insoff = (OffsetNumber) i + 1;
+            if (unused)
+                unused[i - nused] = (OffsetNumber) i;
+        }
     }

     if (nused == 0)
diff -ru postgresql-cvs.orig/src/include/storage/bufpage.h postgresql-cvs/src/include/storage/bufpage.h
--- postgresql-cvs.orig/src/include/storage/bufpage.h    2005-04-29 06:47:18.000000000 +0900
+++ postgresql-cvs/src/include/storage/bufpage.h    2005-07-14 06:57:03.000000000 +0900
@@ -125,6 +125,7 @@
     LocationIndex pd_lower;        /* offset to start of free space */
     LocationIndex pd_upper;        /* offset to end of free space */
     LocationIndex pd_special;    /* offset to start of special space */
+    OffsetNumber pd_insoff;        /* offset number for next new item */
     uint16        pd_pagesize_version;
     ItemIdData    pd_linp[1];        /* beginning of line pointer array */
 } PageHeaderData;


Re: Changes improve the performance of INSERT and UPDATE

From
Tom Lane
Date:
Hiroki Kataoka <kataoka@interwiz.koganei.tokyo.jp> writes:
> This small patch improves the performance of INSERT and UPDATE.  By my
> machine, these changes raised the performance about 5%~10% in pgbench.

Wouldn't it be a lot less invasive to search down from the end, instead
of changing the page header format?
        regards, tom lane


Re: Changes improve the performance of INSERT and UPDATE

From
Tom Lane
Date:
Hiroki Kataoka <kataoka@interwiz.koganei.tokyo.jp> writes:
> This small patch improves the performance of INSERT and UPDATE.  By my
> machine, these changes raised the performance about 5%~10% in pgbench.

BTW, in profiling the backend I've never seen PageAddItem take more than
about 1% of the runtime, and in pgbench in particular it seems to be
down around 0.1% ... so the above seems a bit optimistic ...
        regards, tom lane


Re: Changes improve the performance of INSERT and UPDATE

From
Hiroki Kataoka
Date:
Tom Lane wrote:
> Hiroki Kataoka <kataoka@interwiz.koganei.tokyo.jp> writes:
> 
>>This small patch improves the performance of INSERT and UPDATE.  By my
>>machine, these changes raised the performance about 5%~10% in pgbench.
> 
> Wouldn't it be a lot less invasive to search down from the end, instead
> of changing the page header format?

If the last of line pointer array always be free slot, your method is
effective.  But usual line pointer array is full, or worm-eaten after
vacuum, so there is no benefit.

My method is to have the current insertion position of line pointer
array.  This prevents that PageAddItem scan same range of line pointer
array vainly.  A free slot is rarely produced, so it is not necessary to
search from a head (or bottom in your method) each time.

-- 
Hiroki Kataoka <kataoka@interwiz.jp>


Re: Changes improve the performance of INSERT and UPDATE

From
Hiroki Kataoka
Date:
Tom Lane wrote:
>>This small patch improves the performance of INSERT and UPDATE.  By my
>>machine, these changes raised the performance about 5%~10% in pgbench.
> 
> BTW, in profiling the backend I've never seen PageAddItem take more than
> about 1% of the runtime, and in pgbench in particular it seems to be
> down around 0.1% ... so the above seems a bit optimistic ...

I have the nearly same result, but pgbench says different.  I don't know 
why my test generates 5~10% performance improvement.  Therefore, I want 
to take a benchmark in a reliable environment.

By reference, PageAddItem takes 4%~5% of the runtime in the heavy 
writing operation likes CREATE TABLE AS SELECT.

-- 
Hiroki Kataoka <kataoka@interwiz.jp>


Re: Changes improve the performance of INSERT and UPDATE

From
Tom Lane
Date:
Hiroki Kataoka <kataoka@interwiz.jp> writes:
>>> This small patch improves the performance of INSERT and UPDATE.  By my
>>> machine, these changes raised the performance about 5%~10% in pgbench.
>> 
>> BTW, in profiling the backend I've never seen PageAddItem take more than
>> about 1% of the runtime, and in pgbench in particular it seems to be
>> down around 0.1% ... so the above seems a bit optimistic ...

> I have the nearly same result, but pgbench says different.  I don't know 
> why my test generates 5~10% performance improvement.  Therefore, I want 
> to take a benchmark in a reliable environment.

I've been testing this patch a bit, and I'm unable to measure any
consistent improvement in pgbench times (sometimes it seems to win,
and some other times it doesn't).  And gprof still swears up and down
that PageAddItem is only about 0.1% of the runtime, which would make
it impossible to obtain more than an 0.1% speedup.  I'm inclined to
write off your result as measurement error --- it's notoriously hard
to get reproducible results out of pgbench.

> By reference, PageAddItem takes 4%~5% of the runtime in the heavy 
> writing operation likes CREATE TABLE AS SELECT.

I tried making a million-row table with just two int4 columns and then
duplicating it with CREATE TABLE AS SELECT.  In this context gprof
shows PageAddItem as taking 7% of the runtime, which your patch knocks
down to 1.5%.  This seems to be about the best possible real-world case,
though (the wider the rows, the fewer times PageAddItem can loop), and
so I'm still unconvinced that there's a generic gain here.  Adding an
additional word to page headers has a very definite cost --- we can
assume about a .05% increase in net I/O demands across *every*
application, whether they do a lot of inserts or not --- and so a
patch that provides a noticeable improvement in only a very small set
of circumstances is going to have to be rejected.

Has anyone else experimented with this patch?  Have you gotten any
better impression of the cost/benefit ratio than I'm getting?
        regards, tom lane

PS: If we were going to apply the patch, I'd be inclined to compensate
for the space usage by removing the pd_tli field, which isn't actually
ever used anywhere in the current code.  Then the argument would become
one about opportunity costs --- will we ever need pd_tli in the future?
I don't think we yet have enough experience with the "timeline" feature
to be sure either way.


Re: Changes improve the performance of INSERT and UPDATE

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> Hiroki Kataoka <kataoka@interwiz.jp> writes:
> 
>>>>This small patch improves the performance of INSERT and UPDATE.  By my
>>>>machine, these changes raised the performance about 5%~10% in pgbench.
>>>
>>>BTW, in profiling the backend I've never seen PageAddItem take more than
>>>about 1% of the runtime, and in pgbench in particular it seems to be
>>>down around 0.1% ... so the above seems a bit optimistic ...
> 
> 
>>I have the nearly same result, but pgbench says different.  I don't know 
>>why my test generates 5~10% performance improvement.  Therefore, I want 
>>to take a benchmark in a reliable environment.
> 
> 
> I've been testing this patch a bit, and I'm unable to measure any
> consistent improvement in pgbench times (sometimes it seems to win,
> and some other times it doesn't).  And gprof still swears up and down
> that PageAddItem is only about 0.1% of the runtime, which would make
> it impossible to obtain more than an 0.1% speedup.  I'm inclined to
> write off your result as measurement error --- it's notoriously hard
> to get reproducible results out of pgbench.


I played with the patch two weeks ago (pgbench and some bulkloading of
production data as well as restores from large dumps) - afair I have
been unable to measure any noticable real-life improvement.
Trusting pgbench is quite difficult for such tests - I can get easily up
to 15% variation on consecutive runs on my boxes here ...


Stefan


Re: Changes improve the performance of INSERT and UPDATE

From
Hiroki Kataoka
Date:
Tom Lane wrote:
> I've been testing this patch a bit, and I'm unable to measure any
> consistent improvement in pgbench times (sometimes it seems to win,
> and some other times it doesn't).  And gprof still swears up and down
> that PageAddItem is only about 0.1% of the runtime, which would make
> it impossible to obtain more than an 0.1% speedup.  I'm inclined to
> write off your result as measurement error --- it's notoriously hard
> to get reproducible results out of pgbench.
  OK.  I think that your environment is more realistic than mine.  I 
have been convinced to your result.  Surely, my patch may not be allround.

> I tried making a million-row table with just two int4 columns and then
> duplicating it with CREATE TABLE AS SELECT.  In this context gprof
> shows PageAddItem as taking 7% of the runtime, which your patch knocks
> down to 1.5%.  This seems to be about the best possible real-world case,
> though (the wider the rows, the fewer times PageAddItem can loop), and
> so I'm still unconvinced that there's a generic gain here.
  In fact, I have created this patch since I wants to make COPY FROM 
faster.  So, it is not rare case for me.  There may be no generic gain 
here, but I think it is very important to make COPY FROM and so on more 
speedy.  Of course, if we can prevent the increase in a page header, 
it's better.

> PS: If we were going to apply the patch, I'd be inclined to compensate
> for the space usage by removing the pd_tli field, which isn't actually
> ever used anywhere in the current code.
  It is very a good idea if it's possible.  I want you to surely think 
positively very much.

-- 
Hiroki Kataoka <kataoka@interwiz.jp>