Thread: Vacuum I/O throttling

Vacuum I/O throttling

From
Guy Thornley
Date:
Below is a patch for the lazy vacuum. It implements a simple I/O throttle so
boxen arnt killed for hours a day when VACUUM runs. Patch includes a
paragraph for the manual. The new setting is VACUUM_THROTTLE. It can be SET
from a client connection, too.

The usleep() could be replaced with a select() call with a timeout an no
fd_set's to aid portability..

The intention is I can simply startup a daemon like shellscript that spends
its whole life doing VACUUM, like:

$ while :; do echo "SET vacuum_throttle = 20; VACUUM ANALYZE VERBOSE"; done

It is against version 7.3.2 with a bunch of Debian specific patches applied,
hopefully it will apply fine you people. [Debian's promiscuous patching of
everything always irks me]

Now, some things I noticed while writing this patch:

- Is it correct that the database backend appears to have absolutely no idea
  what tables have free space until VACUUM runs for the first time?

- If a table is VACUUM'd multiple times simultaneously, what happens? [Can
  this even happen? I didn't look for per-vacuum locks]. From what I can
  see, this is a dangerous thing to do...

- Would this patch be more useful as a general I/O throttle for reading
  pages in from disk? [Down in ReadBuffer() somewhere I guess, but I didnt
  look to close (yet;)] I didn't do it this way, because I didnt want sleeps
  forced upon processes that could have important locks held.

Note that I'm fairly noob to these database thingys, and comments are
appreciated.

- Guy


diff -bBur postgresql-7.3.2/doc/src/sgml/runtime.sgml postgresql-7.3.2-guy/doc/src/sgml/runtime.sgml
--- postgresql-7.3.2/doc/src/sgml/runtime.sgml    2003-01-11 05:04:26.000000000 +0000
+++ postgresql-7.3.2-guy/doc/src/sgml/runtime.sgml    2003-08-28 04:16:28.000000000 +0000
@@ -2045,6 +2045,23 @@
      </varlistentry>

      <varlistentry>
+      <term><varname>VACUUM_THROTTLE</varname> (<type>integer</type>)</term>
+      <listitem>
+       <para>
+        Optionally throttle the rate at which the lazy
+        <command>VACUUM</command> will scan database pages. The value
+        specified is either 0 to disable the throttle (the default) or the
+        number of pages/second <command>VACUUM</command> is permitted to
+        look at. If you are having problems with <command>VACUUM</command>
+        nuking your I/O subsystem, try tuning this parameter. Values larger
+        than your OS scheduling frequency will probably not be useful. This
+        does not affect <command>VACUUM FULL</command> or
+        <command>ANALYZE</command>.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><varname>VIRTUAL_HOST</varname> (<type>string</type>)</term>
       <listitem>
        <para>
diff -bBur postgresql-7.3.2/src/backend/commands/vacuumlazy.c postgresql-7.3.2-guy/src/backend/commands/vacuumlazy.c
--- postgresql-7.3.2/src/backend/commands/vacuumlazy.c    2002-09-20 19:56:01.000000000 +0000
+++ postgresql-7.3.2-guy/src/backend/commands/vacuumlazy.c    2003-08-28 03:34:27.000000000 +0000
@@ -204,6 +205,7 @@
     bool        did_vacuum_index = false;
     int            i;
     VacRUsage    ru0;
+    int            page_delay = 0;

     vac_init_rusage(&ru0);

@@ -221,6 +223,9 @@

     lazy_space_alloc(vacrelstats, nblocks);

+    if (VacuumThrottle > 0)
+        page_delay = 1000000 / VacuumThrottle;
+
     for (blkno = 0; blkno < nblocks; blkno++)
     {
         Buffer        buf;
@@ -232,6 +237,9 @@
                     hastup;
         int            prev_dead_count;

+        if(page_delay > 0)
+            usleep(page_delay);
+
         CHECK_FOR_INTERRUPTS();

         /*
diff -bBur postgresql-7.3.2/src/backend/utils/init/globals.c postgresql-7.3.2-guy/src/backend/utils/init/globals.c
--- postgresql-7.3.2/src/backend/utils/init/globals.c    2002-10-03 17:07:53.000000000 +0000
+++ postgresql-7.3.2-guy/src/backend/utils/init/globals.c    2003-08-28 02:55:16.000000000 +0000
@@ -70,4 +70,5 @@
 bool        allowSystemTableMods = false;
 int            SortMem = 1024;
 int            VacuumMem = 8192;
+int            VacuumThrottle = 0;
 int            NBuffers = DEF_NBUFFERS;
diff -bBur postgresql-7.3.2/src/backend/utils/misc/guc.c postgresql-7.3.2-guy/src/backend/utils/misc/guc.c
--- postgresql-7.3.2/src/backend/utils/misc/guc.c    2003-01-28 18:04:13.000000000 +0000
+++ postgresql-7.3.2-guy/src/backend/utils/misc/guc.c    2003-08-28 03:07:27.000000000 +0000
@@ -602,6 +602,11 @@
     },

     {
+        {"vacuum_throttle", PGC_USERSET}, &VacuumThrottle,
+        0, 0, INT_MAX, NULL, NULL
+    },
+
+    {
         {"max_files_per_process", PGC_BACKEND}, &max_files_per_process,
         1000, 25, INT_MAX, NULL, NULL
     },
diff -bBur postgresql-7.3.2/src/include/c.h postgresql-7.3.2-guy/src/include/c.h
--- postgresql-7.3.2/src/include/c.h    2002-10-24 03:11:05.000000000 +0000
+++ postgresql-7.3.2-guy/src/include/c.h    2003-08-28 03:17:03.000000000 +0000
@@ -58,6 +58,7 @@
 #include <string.h>
 #include <stddef.h>
 #include <stdarg.h>
+#include <unistd.h>
 #ifdef HAVE_STRINGS_H
 #include <strings.h>
 #endif
diff -bBur postgresql-7.3.2/src/include/miscadmin.h postgresql-7.3.2-guy/src/include/miscadmin.h
--- postgresql-7.3.2/src/include/miscadmin.h    2002-10-03 17:07:53.000000000 +0000
+++ postgresql-7.3.2-guy/src/include/miscadmin.h    2003-08-28 03:12:25.000000000 +0000
@@ -165,6 +165,7 @@
 extern bool allowSystemTableMods;
 extern DLLIMPORT int SortMem;
 extern int    VacuumMem;
+extern int    VacuumThrottle;

 /*
  *    A few postmaster startup options are exported here so the

Re: Vacuum I/O throttling

From
Tom Lane
Date:
Guy Thornley <guy@esphion.com> writes:
> Below is a patch for the lazy vacuum. It implements a simple I/O throttle so
> boxen arnt killed for hours a day when VACUUM runs.

Wasn't this idea tried and rejected already?  You haven't given us any
information about actual performance.

> The usleep() could be replaced with a select() call with a timeout an no
> fd_set's to aid portability..

usleep is not portable, AFAIR.

            regards, tom lane

Re: Vacuum I/O throttling

From
Bruce Momjian
Date:
Tom Lane wrote:
> Guy Thornley <guy@esphion.com> writes:
> > Below is a patch for the lazy vacuum. It implements a simple I/O throttle so
> > boxen arnt killed for hours a day when VACUUM runs.
>
> Wasn't this idea tried and rejected already?  You haven't given us any
> information about actual performance.

What would be more valuable would be if VACUUM ran only when it know it
could reap a certain number of tuples, or there where FSM need for those
tuples.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Vacuum I/O throttling

From
Guy Thornley
Date:
On Mon, Sep 01, 2003 at 09:05:33AM -0400, Tom Lane wrote:
> Guy Thornley <guy@esphion.com> writes:
> > Below is a patch for the lazy vacuum. It implements a simple I/O throttle so
> > boxen arnt killed for hours a day when VACUUM runs.
>
> Wasn't this idea tried and rejected already?  You haven't given us any
> information about actual performance.
I don't know, sorry; when I looked at the archives I only saw posts about
tuning vacuums, memory usage, etc, and people griping about the way it nukes
the I/O system. I'm new here.

What sort of performance numbers are you looking for? Without the throttle,
I/O is nuked and other database activity takes an age, and with it, its much
happier?

More seriously, this patch isnt meant to actually deal with vacuumed tuples.
The application being developed by the company I am working for requires
24x7x365 unattended operation. Even if vacuum ran every 6 months, for the
transaction renumbering stuff, the way it nukes I/O is not acceptable.
Especially on (potentially) several-hundred gig databases.

We are beginning to learn that "DBMS" and "unattended" dont belong in the
same sentence.

> > The usleep() could be replaced with a select() call with a timeout an no
> > fd_set's to aid portability..
>
> usleep is not portable, AFAIR.
>
>             regards, tom lane

Re: Vacuum I/O throttling

From
Tom Lane
Date:
Guy Thornley <guy@esphion.com> writes:
> What sort of performance numbers are you looking for? Without the throttle,
> I/O is nuked and other database activity takes an age, and with it, its much
> happier?

Some people say that VACUUM nukes their performance, and some don't
find it to be a problem.  AFAICT, it's only an issue if you have little
reserve disk bandwidth, which in itself is a dangerous situation for a
database that you don't want to pay attention to.

I don't want to sound like I'm rejecting your patch out of hand.  What
I do want is to get some idea of its range of usefulness.

> We are beginning to learn that "DBMS" and "unattended" dont belong in the
> same sentence.

"Unattended" and "running on the edge of your resources" don't play nice
together, for sure.

            regards, tom lane

Re: Vacuum I/O throttling]

From
Guy Thornley
Date:
On Tue, Sep 02, 2003 at 12:17:28AM -0400, Tom Lane wrote:
> Guy Thornley <guy@esphion.com> writes:
> > What sort of performance numbers are you looking for? Without the throttle,
> > I/O is nuked and other database activity takes an age, and with it, its much
> > happier?
>
> Some people say that VACUUM nukes their performance, and some don't
> find it to be a problem.  AFAICT, it's only an issue if you have little
> reserve disk bandwidth, which in itself is a dangerous situation for a
> database that you don't want to pay attention to.

Well, I finally got chance to take some performance numbers.

The numbers were taken for a set of queries typical for our app on a test
dataset we actually do our testing with. One of the tables involved is
(rows=35805 width=356) [from explain select * from ...] and the other is
larger, (rows=5407836 width=136).

Test box is a dual Athlon MP2400+ with 512MB of ram. Disk is a bit lacking,
it is a single 40GB 7200rpm IDE disk.

Vacuum               Actual     User   System     Find
------------------------------------------------------
No vacuum           3:26.11  0:00.31  0:00.09  0:08.48
Vacuum throttled    3:31.84  0:00.27  0:00.10  0:09.58
Vacuum            167:22.36  0:00.24  0:00.09  2:11.18

Actual,User,System should be self-explanatory; Find is the actual time taken
to perform a "find /usr /var -type f > /dev/null"

For the throttled test, i used set vacuum_throttle = 20.

Theres a box turned up that has dual 10k rpm scsi disks, but it will be a
few days until I can test the dataset on that one.