Re: auto-sizing wal_buffers - Mailing list pgsql-hackers

From Greg Smith
Subject Re: auto-sizing wal_buffers
Date
Msg-id 4D314406.3000703@2ndquadrant.com
Whole thread Raw
In response to Re: auto-sizing wal_buffers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: auto-sizing wal_buffers  (Fujii Masao <masao.fujii@gmail.com>)
Re: auto-sizing wal_buffers  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Tom Lane wrote:
> I think we need to keep the override capability until the autotune
> algorithm has proven itself in the field for a couple of years.
>
> I agree with Josh that a negative value should be used to select the
> autotune method.
>

Agreed on both fronts.  Attached patch does the magic.  Also available
in branch "walbuffers" from git://github.com/greg2ndQuadrant/postgres.git

By changing only shared_buffers I get the following quite reasonable
automatic behavior:

$ psql -c "SELECT name,unit,boot_val,setting,current_setting(name) FROM
pg_settings WHERE name IN ('wal_buffers','shared_buffers')"
      name      | unit | boot_val | setting | current_setting
----------------+------+----------+---------+-----------------
 shared_buffers | 8kB  | 1024     | 3072    | 24MB
 wal_buffers    | 8kB  | -1       | 96      | 768kB

 shared_buffers | 8kB  | 1024     | 4096    | 32MB
 wal_buffers    | 8kB  | -1       | 128     | 1MB

 shared_buffers | 8kB  | 1024     | 16384   | 128MB
 wal_buffers    | 8kB  | -1       | 512     | 4MB

 shared_buffers | 8kB  | 1024     | 131072  | 1GB
 wal_buffers    | 8kB  | -1       | 2048    | 16MB

 shared_buffers | 8kB  | 1024     | 262144  | 2GB
 wal_buffers    | 8kB  | -1       | 2048    | 16MB

If you've set it to the auto-tuning behavior, you don't see that setting
of -1 in the SHOW output; you see the value it's actually been set to.
The only way to know that was set automatically is to look at boot_val
as I've shown here.  I consider this what admins would prefer, as the
easy way to expose the value that was used.  I would understand if
people considered it a little odd though.  Since you can't change it
without a postgresql.conf edit and a server start anyway, and it's
tersely documented in the sample postgresql.conf what -1 does, I don't
see this being a problem for anyone in the field.

To try and clear up some of the confusion around how the earlier
documentation suggests larger values of this aren't needed, I added the
following updated description of how this has been observed to work for
admins in practice:

!         Since the data is written out to disk at every transaction commit,
!         the setting many only need to be be large enough to hold the
amount
!         of WAL data generated by one typical transaction.  Larger values,
!         typically at least a few megabytes, can improve write performance
!         on a busy server where many clients are committing at once.
!         Extremely large settings are unlikely to provide additional
benefit.

And to make this easy as possible to apply if I got this right, here's
some proposed commit text:

Automatically set wal_buffers to be proportional
to the size of shared_buffers.  Make it 1/32
as large when the auto-tuned behavior, which
is the default and set with a value of -1,
is used.  The previous default of 64kB is still
enforced as a minimum value.  The maximum
automatic value is limited to 16MB.

(Note that this not exactly what I put in my own commit message if you
grab from my repo, that had a typo)

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8e2a2c5..c3f5632 100644
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** SET ENABLE_SEQSCAN TO OFF;
*** 1638,1649 ****
        </indexterm>
        <listitem>
         <para>
!         The amount of memory used in shared memory for WAL data.  The
!         default is 64 kilobytes (<literal>64kB</>).  The setting need only
!         be large enough to hold the amount of WAL data generated by one
!         typical transaction, since the data is written out to disk at
!         every transaction commit.  This parameter can only be set at server
!         start.
         </para>

         <para>
--- 1638,1659 ----
        </indexterm>
        <listitem>
         <para>
!         The amount of shared memory used for storing WAL data.  The
!         default setting of -1 adjusts this automatically based on the size
!         of <varname>shared_buffers</varname>, making it 1/32 (about 3%) of
!         the size of that normally larger shared memory block.  Automatically
!         set values are limited to a maximum of 16 megabytes
!         (<literal>16MB</>), sufficient to hold one WAL segment worth of data.
!         The smallest allowable setting is 64 kilobytes (<literal>64kB</>).
!        </para>
!
!        <para>
!         Since the data is written out to disk at every transaction commit,
!         the setting many only need to be be large enough to hold the amount
!         of WAL data generated by one typical transaction.  Larger values,
!         typically at least a few megabytes, can improve write performance
!         on a busy server where many clients are committing at once.
!         Extremely large settings are unlikely to provide additional benefit.
         </para>

         <para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b49b933..060e627 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 68,74 ****
  /* User-settable parameters */
  int            CheckPointSegments = 3;
  int            wal_keep_segments = 0;
! int            XLOGbuffers = 8;
  int            XLogArchiveTimeout = 0;
  bool        XLogArchiveMode = false;
  char       *XLogArchiveCommand = NULL;
--- 68,75 ----
  /* User-settable parameters */
  int            CheckPointSegments = 3;
  int            wal_keep_segments = 0;
! int            XLOGbuffers = -1;
! int            XLOGbuffersMin = 8;
  int            XLogArchiveTimeout = 0;
  bool        XLogArchiveMode = false;
  char       *XLogArchiveCommand = NULL;
*************** GetSystemIdentifier(void)
*** 4779,4789 ****
--- 4780,4812 ----
  /*
   * Initialization of shared memory for XLOG
   */
+
+ void XLOGTuneNumBuffers(void)
+ {
+     /*
+      * If automatic setting was requested, use about 3% as much memory as
+      * requested for the buffer cache.  Clamp the automatic maximum to the
+      * size of one 16MB XLOG segment, while still allowing a larger manual
+      * setting.
+      */
+     if (XLOGbuffers == -1)
+         {
+         XLOGbuffers = NBuffers / 32;
+         if (XLOGbuffers > 2048)
+             XLOGbuffers = 2048;
+         }
+     /* Enforce a 64KB minimum */
+     if (XLOGbuffers < XLOGbuffersMin)
+         XLOGbuffers = XLOGbuffersMin;
+ }
+
  Size
  XLOGShmemSize(void)
  {
      Size        size;

+     XLOGTuneNumBuffers();
+
      /* XLogCtl */
      size = sizeof(XLogCtlData);
      /* xlblocks array */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 942acb9..2421460 100644
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_int ConfigureNamesI
*** 1766,1772 ****
              GUC_UNIT_XBLOCKS
          },
          &XLOGbuffers,
!         8, 4, INT_MAX, NULL, NULL
      },

      {
--- 1766,1772 ----
              GUC_UNIT_XBLOCKS
          },
          &XLOGbuffers,
!         -1, -1, INT_MAX, NULL, NULL
      },

      {
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f436b83..6c6f9a9 100644
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 162,168 ****
                      #   fsync_writethrough
                      #   open_sync
  #full_page_writes = on            # recover from partial page writes
! #wal_buffers = 64kB            # min 32kB
                      # (change requires restart)
  #wal_writer_delay = 200ms        # 1-10000 milliseconds

--- 162,168 ----
                      #   fsync_writethrough
                      #   open_sync
  #full_page_writes = on            # recover from partial page writes
! #wal_buffers = -1            # min 32kB, -1 sets based on shared_buffers
                      # (change requires restart)
  #wal_writer_delay = 200ms        # 1-10000 milliseconds

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e9d8d15..ed7a32a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void GetXLogReceiptTime(Timestamp
*** 293,298 ****
--- 293,299 ----

  extern void UpdateControlFile(void);
  extern uint64 GetSystemIdentifier(void);
+ extern void XLOGTuneNumBuffers(void);
  extern Size XLOGShmemSize(void);
  extern void XLOGShmemInit(void);
  extern void BootStrapXLOG(void);

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: ALTER TYPE 0: Introduction; test cases
Next
From: Greg Smith
Date:
Subject: Re: auto-sizing wal_buffers