Vacuum I/O throttling - Mailing list pgsql-bugs

From Guy Thornley
Subject Vacuum I/O throttling
Date
Msg-id 20030901042219.GF18335@conker.esphion.com
Responses Re: Vacuum I/O throttling
List pgsql-bugs
Below is a patch for the lazy vacuum. It implements a simple I/O throttle so
boxen aren't killed for hours a day when VACUUM runs. The patch includes a
paragraph for the manual. The new setting is VACUUM_THROTTLE. It can also be
SET from a client connection.

The usleep() could be replaced with a select() call with a timeout and no
fd_sets, to aid portability.
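
As a rough sketch of that select() idea (not part of the patch below): the
helper names usec_to_timeval() and throttle_sleep() are made up for
illustration, and this assumes a platform where select() accepts a call with
no descriptor sets at all, which may not hold everywhere.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: split a microsecond count into a struct timeval. */
static struct timeval
usec_to_timeval(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    return tv;
}

/*
 * Hypothetical helper: sleep for roughly `usec` microseconds by calling
 * select() with a timeout and no fd_sets. Returns select()'s result:
 * 0 when the timeout expired, -1 on error (for example, EINTR if a
 * signal arrived first). Does nothing and returns 0 when usec <= 0.
 */
static int
throttle_sleep(long usec)
{
    struct timeval delay;

    if (usec <= 0)
        return 0;
    delay = usec_to_timeval(usec);
    return select(0, NULL, NULL, NULL, &delay);
}
```

Unlike usleep(), the timeout here carries whole seconds separately, so a
delay of one second or more needs no special handling.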

The intention is that I can simply start up a daemon-like shell script that
spends its whole life doing VACUUM, like:

$ while :; do echo "SET vacuum_throttle = 20; VACUUM VERBOSE ANALYZE;"; done | psql

It is against version 7.3.2 with a bunch of Debian-specific patches applied;
hopefully it will apply fine for you people. [Debian's promiscuous patching of
everything always irks me]

Now, some things I noticed while writing this patch:

- Is it correct that the database backend appears to have absolutely no idea
  what tables have free space until VACUUM runs for the first time?

- If a table is VACUUM'd multiple times simultaneously, what happens? [Can
  this even happen? I didn't look for per-vacuum locks]. From what I can
  see, this is a dangerous thing to do...

- Would this patch be more useful as a general I/O throttle for reading
  pages in from disk? [Down in ReadBuffer() somewhere I guess, but I didn't
  look too closely (yet;)] I didn't do it this way, because I didn't want
  sleeps forced upon processes that could have important locks held.

Note that I'm fairly noob to these database thingys, and comments are
appreciated.

- Guy


diff -bBur postgresql-7.3.2/doc/src/sgml/runtime.sgml postgresql-7.3.2-guy/doc/src/sgml/runtime.sgml
--- postgresql-7.3.2/doc/src/sgml/runtime.sgml    2003-01-11 05:04:26.000000000 +0000
+++ postgresql-7.3.2-guy/doc/src/sgml/runtime.sgml    2003-08-28 04:16:28.000000000 +0000
@@ -2045,6 +2045,23 @@
      </varlistentry>

      <varlistentry>
+      <term><varname>VACUUM_THROTTLE</varname> (<type>integer</type>)</term>
+      <listitem>
+       <para>
+        Optionally throttle the rate at which the lazy
+        <command>VACUUM</command> will scan database pages. The value
+        specified is either 0 to disable the throttle (the default) or the
+        number of pages/second <command>VACUUM</command> is permitted to
+        look at. If you are having problems with <command>VACUUM</command>
+        nuking your I/O subsystem, try tuning this parameter. Values larger
+        than your OS scheduling frequency will probably not be useful. This
+        does not affect <command>VACUUM FULL</command> or
+        <command>ANALYZE</command>.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><varname>VIRTUAL_HOST</varname> (<type>string</type>)</term>
       <listitem>
        <para>
diff -bBur postgresql-7.3.2/src/backend/commands/vacuumlazy.c postgresql-7.3.2-guy/src/backend/commands/vacuumlazy.c
--- postgresql-7.3.2/src/backend/commands/vacuumlazy.c    2002-09-20 19:56:01.000000000 +0000
+++ postgresql-7.3.2-guy/src/backend/commands/vacuumlazy.c    2003-08-28 03:34:27.000000000 +0000
@@ -204,6 +205,7 @@
     bool        did_vacuum_index = false;
     int            i;
     VacRUsage    ru0;
+    int            page_delay = 0;

     vac_init_rusage(&ru0);

@@ -221,6 +223,9 @@

     lazy_space_alloc(vacrelstats, nblocks);

+    if (VacuumThrottle > 0)
+        page_delay = 1000000 / VacuumThrottle;
+
     for (blkno = 0; blkno < nblocks; blkno++)
     {
         Buffer        buf;
@@ -232,6 +237,9 @@
                     hastup;
         int            prev_dead_count;

+        if (page_delay > 0)
+            usleep(page_delay);
+
         CHECK_FOR_INTERRUPTS();

         /*
diff -bBur postgresql-7.3.2/src/backend/utils/init/globals.c postgresql-7.3.2-guy/src/backend/utils/init/globals.c
--- postgresql-7.3.2/src/backend/utils/init/globals.c    2002-10-03 17:07:53.000000000 +0000
+++ postgresql-7.3.2-guy/src/backend/utils/init/globals.c    2003-08-28 02:55:16.000000000 +0000
@@ -70,4 +70,5 @@
 bool        allowSystemTableMods = false;
 int            SortMem = 1024;
 int            VacuumMem = 8192;
+int            VacuumThrottle = 0;
 int            NBuffers = DEF_NBUFFERS;
diff -bBur postgresql-7.3.2/src/backend/utils/misc/guc.c postgresql-7.3.2-guy/src/backend/utils/misc/guc.c
--- postgresql-7.3.2/src/backend/utils/misc/guc.c    2003-01-28 18:04:13.000000000 +0000
+++ postgresql-7.3.2-guy/src/backend/utils/misc/guc.c    2003-08-28 03:07:27.000000000 +0000
@@ -602,6 +602,11 @@
     },

     {
+        {"vacuum_throttle", PGC_USERSET}, &VacuumThrottle,
+        0, 0, INT_MAX, NULL, NULL
+    },
+
+    {
         {"max_files_per_process", PGC_BACKEND}, &max_files_per_process,
         1000, 25, INT_MAX, NULL, NULL
     },
diff -bBur postgresql-7.3.2/src/include/c.h postgresql-7.3.2-guy/src/include/c.h
--- postgresql-7.3.2/src/include/c.h    2002-10-24 03:11:05.000000000 +0000
+++ postgresql-7.3.2-guy/src/include/c.h    2003-08-28 03:17:03.000000000 +0000
@@ -58,6 +58,7 @@
 #include <string.h>
 #include <stddef.h>
 #include <stdarg.h>
+#include <unistd.h>
 #ifdef HAVE_STRINGS_H
 #include <strings.h>
 #endif
diff -bBur postgresql-7.3.2/src/include/miscadmin.h postgresql-7.3.2-guy/src/include/miscadmin.h
--- postgresql-7.3.2/src/include/miscadmin.h    2002-10-03 17:07:53.000000000 +0000
+++ postgresql-7.3.2-guy/src/include/miscadmin.h    2003-08-28 03:12:25.000000000 +0000
@@ -165,6 +165,7 @@
 extern bool allowSystemTableMods;
 extern DLLIMPORT int SortMem;
 extern int    VacuumMem;
+extern int    VacuumThrottle;

 /*
  *    A few postmaster startup options are exported here so the
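
For anyone tuning the setting, here is a small standalone sketch of the delay
arithmetic the vacuumlazy.c hunk uses (1000000 / VacuumThrottle). The function
name page_delay_usec() is made up for illustration and is not in the patch.

```c
#define USECS_PER_SEC 1000000L

/*
 * Hypothetical helper mirroring the patch's calculation: microseconds to
 * sleep before each page, or 0 when the throttle is disabled.
 */
static long
page_delay_usec(int vacuum_throttle)
{
    if (vacuum_throttle <= 0)
        return 0;               /* 0 (the default) means no throttle */
    return USECS_PER_SEC / vacuum_throttle;     /* integer division */
}
```

With vacuum_throttle = 20 this gives 50000 microseconds per page. Two edge
cases fall out of the integer division: a setting of 1 yields a delay of
exactly 1000000, which POSIX allows usleep() to reject with EINVAL, and any
setting above 1000000 yields 0, which silently turns the throttle off.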
