Thread: Disable page writes when fsync off, add GUC

Disable page writes when fsync off, add GUC

From
Bruce Momjian
Date:
This patch disables page writes to WAL when fsync is off, because with
no fsync guarantee, the page write recovery isn't useful.

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes, but still might want fsync.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.335
diff -c -c -r1.335 runtime.sgml
*** doc/src/sgml/runtime.sgml    2 Jul 2005 19:16:36 -0000    1.335
--- doc/src/sgml/runtime.sgml    4 Jul 2005 03:58:34 -0000
***************
*** 1687,1692 ****
--- 1687,1723 ----
        </listitem>
       </varlistentry>

+      <varlistentry id="guc-full-page-writes" xreflabel="full_page_writes">
+       <indexterm>
+        <primary><varname>full_page_writes</> configuration parameter</primary>
+       </indexterm>
+       <term><varname>full_page_writes</varname> (<type>boolean</type>)</term>
+       <listitem>
+        <para>
+         A page write in process during an operating system crash might
+         be only partially written to disk, leading to an on-disk page
+         that contains a mix of old and new data. During recovery, the
+         row changes stored in WAL are not enough to recover from this
+         situation.
+        </para>
+
+        <para>
+         When this option is on, the <productname>PostgreSQL</> server
+         writes full pages when first modified after a checkpoint to WAL
+         so full recovery is possible. Turning this option off might lead
+         to a corrupt system after an operating system crash because
+         uncorrected partial pages might contain inconsistent or corrupt
+         data. The risks are less but similar to <varname>fsync</>.
+        </para>
+
+        <para>
+         This option can only be set at server start or in the
+         <filename>postgresql.conf</filename> file.  The default is
+         <literal>on</>.
+        </para>
+       </listitem>
+      </varlistentry>
+
       <varlistentry id="guc-wal-buffers" xreflabel="wal_buffers">
        <term><varname>wal_buffers</varname> (<type>integer</type>)</term>
        <indexterm>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.205
diff -c -c -r1.205 xlog.c
*** src/backend/access/transam/xlog.c    30 Jun 2005 00:00:50 -0000    1.205
--- src/backend/access/transam/xlog.c    4 Jul 2005 03:58:38 -0000
***************
*** 97,102 ****
--- 97,103 ----
  char       *XLogArchiveCommand = NULL;
  char       *XLOG_sync_method = NULL;
  const char    XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
+ bool        fullPageWrites = true;

  #ifdef WAL_DEBUG
  bool        XLOG_DEBUG = false;
***************
*** 593,599 ****
                  {
                      /* OK, put it in this slot */
                      dtbuf[i] = rdt->buffer;
!                     if (XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
                      {
                          dtbuf_bkp[i] = true;
                          rdt->data = NULL;
--- 594,602 ----
                  {
                      /* OK, put it in this slot */
                      dtbuf[i] = rdt->buffer;
!                     /* No need to back up pages if fsync or full_page_writes is off. */
!                     if (enableFsync && fullPageWrites &&
!                         XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
                      {
                          dtbuf_bkp[i] = true;
                          rdt->data = NULL;
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.271
diff -c -c -r1.271 guc.c
*** src/backend/utils/misc/guc.c    28 Jun 2005 05:09:02 -0000    1.271
--- src/backend/utils/misc/guc.c    4 Jul 2005 03:58:46 -0000
***************
*** 82,87 ****
--- 82,88 ----
  extern int    CommitDelay;
  extern int    CommitSiblings;
  extern char *default_tablespace;
+ extern bool    fullPageWrites;

  static const char *assign_log_destination(const char *value,
                         bool doit, GucSource source);
***************
*** 482,487 ****
--- 483,500 ----
          false, NULL, NULL
      },
      {
+         {"full_page_writes", PGC_SIGHUP, WAL_SETTINGS,
+             gettext_noop("Fully writes pages when first modified after a checkpoint."),
+             gettext_noop("A page write in process during an operating system crash might be "
+                          "only partially written to disk.  During recovery, the row changes "
+                          "stored in WAL are not enough to recover.  This option writes "
+                          "pages when first modified after a checkpoint to WAL so full recovery "
+                          "is possible.")
+         },
+         &fullPageWrites,
+         true, NULL, NULL
+     },
+     {
          {"silent_mode", PGC_POSTMASTER, LOGGING_WHEN,
              gettext_noop("Runs the server silently."),
              gettext_noop("If this parameter is set, the server will automatically run in the "
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.151
diff -c -c -r1.151 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample    2 Jul 2005 18:46:45 -0000    1.151
--- src/backend/utils/misc/postgresql.conf.sample    4 Jul 2005 03:58:46 -0000
***************
*** 121,126 ****
--- 121,127 ----
  #wal_sync_method = fsync    # the default varies across platforms:
                  # fsync, fdatasync, fsync_writethrough,
                  # open_sync, open_datasync
+ #full_page_writes = on        # recover from partial page writes
  #wal_buffers = 8        # min 4, 8KB each
  #commit_delay = 0        # range 0-100000, in microseconds
  #commit_siblings = 5        # range 1-1000
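
To illustrate why row changes in WAL alone may not repair a partially
written page, here is a toy model in C. It is only a sketch under stated
assumptions: ToyPage, ToyWalRecord, and every function name below are made
up for illustration and are not PostgreSQL code. The 512-byte sector size
and the rule that redo skips a record when the page LSN is already at or
past the record's LSN are simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   8192    /* PostgreSQL's default block size */
#define SECTOR_SIZE 512     /* assumed atomic write unit of the disk */

/* Toy page: the first 8 bytes hold the LSN of the last change applied. */
typedef struct ToyPage
{
    unsigned char bytes[PAGE_SIZE];
} ToyPage;

/* Hypothetical WAL record: a row change, optionally with a full-page image. */
typedef struct ToyWalRecord
{
    uint64_t      lsn;
    size_t        offset;       /* where the row bytes live in the page */
    unsigned char row[16];
    bool          has_image;
    ToyPage       image;        /* whole page as it looked after the change */
} ToyWalRecord;

uint64_t page_get_lsn(const ToyPage *p)
{
    uint64_t lsn;
    memcpy(&lsn, p->bytes, sizeof lsn);
    return lsn;
}

/* Apply the row change and stamp the page with the record's LSN. */
void apply_change(ToyPage *p, const ToyWalRecord *rec)
{
    memcpy(p->bytes + rec->offset, rec->row, sizeof rec->row);
    memcpy(p->bytes, &rec->lsn, sizeof rec->lsn);
}

/* Crash in mid-write: only the first sectors_written sectors reach disk. */
void torn_write(ToyPage *disk, const ToyPage *mem, size_t sectors_written)
{
    assert(sectors_written <= PAGE_SIZE / SECTOR_SIZE);
    memcpy(disk->bytes, mem->bytes, sectors_written * SECTOR_SIZE);
}

/* Redo one record against the on-disk page. */
void replay(ToyPage *disk, const ToyWalRecord *rec)
{
    if (rec->has_image)
    {
        *disk = rec->image;     /* the image overwrites the torn page */
        return;
    }
    if (page_get_lsn(disk) >= rec->lsn)
        return;                 /* page claims to be current: change skipped */
    apply_change(disk, rec);
}

/* True if the page on disk matches the intended page after crash + redo. */
bool recovers_after_torn_write(bool full_page_writes)
{
    static ToyPage disk, mem;
    static ToyWalRecord rec;

    memset(&disk, 0, sizeof disk);
    memset(&rec, 0, sizeof rec);
    rec.lsn = 42;
    rec.offset = 4096;          /* the row lives in a later sector */
    memset(rec.row, 'N', sizeof rec.row);

    mem = disk;
    apply_change(&mem, &rec);
    rec.has_image = full_page_writes;
    if (full_page_writes)
        rec.image = mem;

    torn_write(&disk, &mem, 1); /* only sector 0 (with the new LSN) lands */
    replay(&disk, &rec);
    return memcmp(disk.bytes, mem.bytes, PAGE_SIZE) == 0;
}
```

In this model recovers_after_torn_write(true) returns true, while
recovers_after_torn_write(false) returns false: the new LSN reached disk in
the first sector, so redo treats the page as current even though the row's
sector still holds old data.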

Re: Disable page writes when fsync off, add GUC

From
Stephen Frost
Date:
* Bruce Momjian (pgman@candle.pha.pa.us) wrote:
> This patch disables page writes to WAL when fsync is off, because with
> no fsync guarantee, the page write recovery isn't useful.

This doesn't seem quite right to me.  What happens with PITR?  And with
Postgres crashes?  While many people seriously distrust running with fsync
off, I'm sure there are quite a few folks who do.

> This also adds a full_page_writes GUC to turn off page writes to WAL.
> Some people might not want full_page_writes, but still might want fsync.

Adding an option to not do page writes to WAL seems fine to me, but I
think page writes to WAL should be on by default, even in the fsync=off
case.  If people want to turn them off, fine, in either case, since we
expect them to understand what that means, but I don't think the two
options should be coupled as is being proposed.

    Thanks,

        Stephen


Re: Disable page writes when fsync off, add GUC

From
Peter Eisentraut
Date:
Bruce Momjian wrote:
> This patch disables page writes to WAL when fsync is off, because
> with no fsync guarantee, the page write recovery isn't useful.
>
> This also adds a full_page_writes GUC to turn off page writes to WAL.
> Some people might not want full_page_writes, but still might want
> fsync.

Do you have some numbers to suggest that there is a performance benefit
to be had?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Disable page writes when fsync off, add GUC

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > This patch disables page writes to WAL when fsync is off, because
> > with no fsync guarantee, the page write recovery isn't useful.
> >
> > This also adds a full_page_writes GUC to turn off page writes to WAL.
> > Some people might not want full_page_writes, but still might want
> > fsync.
>
> Do you have some numbers to suggest that there is a performance benefit
> to be had?

Josh reported page writes to be a big performance hit (which we already
knew), but I don't have any numbers with fsync off, though the benefit
seems like a no-brainer.  However, I now think decoupling the two
settings is best.


Re: Disable page writes when fsync off, add GUC

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> This also adds a full_page_writes GUC to turn off page writes to WAL.
> Some people might not want full_page_writes.

Fsync linkage removed, patch attached and applied.

Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.335
diff -c -c -r1.335 runtime.sgml
*** doc/src/sgml/runtime.sgml    2 Jul 2005 19:16:36 -0000    1.335
--- doc/src/sgml/runtime.sgml    5 Jul 2005 23:15:33 -0000
***************
*** 1660,1666 ****

         <para>
          This option can only be set at server start or in the
!         <filename>postgresql.conf</filename> file.
         </para>
        </listitem>
       </varlistentry>
--- 1660,1668 ----

         <para>
          This option can only be set at server start or in the
!         <filename>postgresql.conf</filename> file.  If this option
!         is <literal>off</>, consider also turning off
!         <xref linkend="guc-full-page-writes">.
         </para>
        </listitem>
       </varlistentry>
***************
*** 1687,1692 ****
--- 1689,1725 ----
        </listitem>
       </varlistentry>

+      <varlistentry id="guc-full-page-writes" xreflabel="full_page_writes">
+       <indexterm>
+        <primary><varname>full_page_writes</> configuration parameter</primary>
+       </indexterm>
+       <term><varname>full_page_writes</varname> (<type>boolean</type>)</term>
+       <listitem>
+        <para>
+         A page write in process during an operating system crash might
+         be only partially written to disk, leading to an on-disk page
+         that contains a mix of old and new data. During recovery, the
+         row changes stored in WAL are not enough to completely restore
+         the page.
+        </para>
+
+        <para>
+         When this option is on, the <productname>PostgreSQL</> server
+         writes full pages to WAL when they first modified after a checkpoint
+         so full recovery is possible. Turning this option off might lead
+         to a corrupt system after an operating system crash because
+         uncorrected partial pages might contain inconsistent or corrupt
+         data. The risks are less but similar to <varname>fsync</>.
+        </para>
+
+        <para>
+         This option can only be set at server start or in the
+         <filename>postgresql.conf</filename> file.  The default is
+         <literal>on</>.
+        </para>
+       </listitem>
+      </varlistentry>
+
       <varlistentry id="guc-wal-buffers" xreflabel="wal_buffers">
        <term><varname>wal_buffers</varname> (<type>integer</type>)</term>
        <indexterm>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.206
diff -c -c -r1.206 xlog.c
*** src/backend/access/transam/xlog.c    4 Jul 2005 04:51:44 -0000    1.206
--- src/backend/access/transam/xlog.c    5 Jul 2005 23:15:36 -0000
***************
*** 103,108 ****
--- 103,109 ----
  char       *XLogArchiveCommand = NULL;
  char       *XLOG_sync_method = NULL;
  const char    XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
+ bool        fullPageWrites = true;

  #ifdef WAL_DEBUG
  bool        XLOG_DEBUG = false;
***************
*** 594,600 ****
                  {
                      /* OK, put it in this slot */
                      dtbuf[i] = rdt->buffer;
!                     if (XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
                      {
                          dtbuf_bkp[i] = true;
                          rdt->data = NULL;
--- 595,603 ----
                  {
                      /* OK, put it in this slot */
                      dtbuf[i] = rdt->buffer;
!                     /* Back up the page only if full_page_writes is on. */
!                     if (fullPageWrites &&
!                         XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
                      {
                          dtbuf_bkp[i] = true;
                          rdt->data = NULL;
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.272
diff -c -c -r1.272 guc.c
*** src/backend/utils/misc/guc.c    4 Jul 2005 04:51:51 -0000    1.272
--- src/backend/utils/misc/guc.c    5 Jul 2005 23:15:39 -0000
***************
*** 83,88 ****
--- 83,89 ----
  extern int    CommitDelay;
  extern int    CommitSiblings;
  extern char *default_tablespace;
+ extern bool    fullPageWrites;

  static const char *assign_log_destination(const char *value,
                         bool doit, GucSource source);
***************
*** 483,488 ****
--- 484,501 ----
          false, NULL, NULL
      },
      {
+         {"full_page_writes", PGC_SIGHUP, WAL_SETTINGS,
+             gettext_noop("Writes full pages to WAL when first modified after a checkpoint."),
+             gettext_noop("A page write in process during an operating system crash might be "
+                          "only partially written to disk.  During recovery, the row changes "
+                          "stored in WAL are not enough to recover.  This option writes "
+                          "pages when first modified after a checkpoint to WAL so full recovery "
+                          "is possible.")
+         },
+         &fullPageWrites,
+         true, NULL, NULL
+     },
+     {
          {"silent_mode", PGC_POSTMASTER, LOGGING_WHEN,
              gettext_noop("Runs the server silently."),
              gettext_noop("If this parameter is set, the server will automatically run in the "
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.151
diff -c -c -r1.151 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample    2 Jul 2005 18:46:45 -0000    1.151
--- src/backend/utils/misc/postgresql.conf.sample    5 Jul 2005 23:15:39 -0000
***************
*** 121,126 ****
--- 121,127 ----
  #wal_sync_method = fsync    # the default varies across platforms:
                  # fsync, fdatasync, fsync_writethrough,
                  # open_sync, open_datasync
+ #full_page_writes = on        # recover from partial page writes
  #wal_buffers = 8        # min 4, 8KB each
  #commit_delay = 0        # range 0-100000, in microseconds
  #commit_siblings = 5        # range 1-1000
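
The phrase "first modified after a checkpoint" can be sketched as an LSN
comparison. The code below is a simplified model, not the server's code:
ToyLsn, page_needs_backup, should_backup_page, and count_page_images are
hypothetical names, LSNs are reduced to a single integer, and the
assumption is that a page needs a backup when its LSN is at or before the
redo pointer of the latest checkpoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;    /* simplified WAL position */

/* Assumed rule: a page untouched since the checkpoint needs a backup. */
bool page_needs_backup(ToyLsn page_lsn, ToyLsn redo_ptr)
{
    return page_lsn <= redo_ptr;
}

/* Shape of the patched condition: the GUC gates the per-page check. */
bool should_backup_page(bool full_page_writes, ToyLsn page_lsn, ToyLsn redo_ptr)
{
    return full_page_writes && page_needs_backup(page_lsn, redo_ptr);
}

/*
 * Count the full-page images written while one page is modified
 * `modifications` times between two checkpoints.
 */
int count_page_images(bool full_page_writes, int modifications)
{
    ToyLsn redo_ptr = 100;  /* where the last checkpoint's redo starts */
    ToyLsn page_lsn = 50;   /* page was last changed before the checkpoint */
    ToyLsn next_lsn = 101;  /* next WAL position to hand out */
    int    images = 0;

    assert(modifications >= 0);
    for (int i = 0; i < modifications; i++)
    {
        if (should_backup_page(full_page_writes, page_lsn, redo_ptr))
            images++;
        page_lsn = next_lsn++;  /* the change stamps the page with its LSN */
    }
    return images;
}
```

With the option on, five modifications of the same page yield one image,
since only the first change finds the page LSN at or before the redo
pointer. With the option off, no images are written at all.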

Re: Disable page writes when fsync off, add GUC

From
"Michael Paesold"
Date:
Bruce Momjian wrote:

> Bruce Momjian wrote:
>> This also adds a full_page_writes GUC to turn off page writes to WAL.
>> Some people might not want full_page_writes.
>
> Fsync linkage removed, patch attached and applied.

...
+     When this option is on, the <productname>PostgreSQL</> server
+     writes full pages to WAL when they first modified after a checkpoint
+     so full recovery is possible.

I believe this should be "when they _are_ first modified after".

Perhaps you should also mention power failure, not only an operating system
crash, as a disaster scenario, even if the latter includes the former.

Best Regards,
Michael Paesold


Re: Disable page writes when fsync off, add GUC

From
Bruce Momjian
Date:
Michael Paesold wrote:
> Bruce Momjian wrote:
>
> > Bruce Momjian wrote:
> >> This also adds a full_page_writes GUC to turn off page writes to WAL.
> >> Some people might not want full_page_writes.
> >
> > Fsync linkage removed, patch attached and applied.
>
> ...
> +     When this option is on, the <productname>PostgreSQL</> server
> +     writes full pages to WAL when they first modified after a checkpoint
> +     so full recovery is possible.
>
> I believe this should be "when they _are_ first modified after".
>
> Perhaps you should also mention power failure, not only an operating system
> crash, as a disaster scenario, even if the latter includes the former.
>

Thanks.  Done.

Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.336
diff -c -c -r1.336 runtime.sgml
*** doc/src/sgml/runtime.sgml    5 Jul 2005 23:18:09 -0000    1.336
--- doc/src/sgml/runtime.sgml    6 Jul 2005 14:40:15 -0000
***************
*** 1705,1715 ****

         <para>
          When this option is on, the <productname>PostgreSQL</> server
!         writes full pages to WAL when they first modified after a checkpoint
!         so full recovery is possible. Turning this option off might lead
!         to a corrupt system after an operating system crash because
!         uncorrected partial pages might contain inconsistent or corrupt
!         data. The risks are less but similar to <varname>fsync</>.
         </para>

         <para>
--- 1705,1716 ----

         <para>
          When this option is on, the <productname>PostgreSQL</> server
!         writes full pages to WAL when they are first modified after a
!         checkpoint so full recovery is possible. Turning this option off
!         might lead to a corrupt system after an operating system crash
!         or power failure because uncorrected partial pages might contain
!         inconsistent or corrupt data. The risks are less but similar to
!         <varname>fsync</>.
         </para>

         <para>