Re: WAL logging volume and CREATE TABLE - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: WAL logging volume and CREATE TABLE
Date
Msg-id 201108030246.p732kto14874@momjian.us
Whole thread Raw
In response to Re: WAL logging volume and CREATE TABLE  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: WAL logging volume and CREATE TABLE
List pgsql-hackers
Robert Haas wrote:
> On Tue, Aug 2, 2011 at 11:30 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > Tom Lane wrote:
> >> Bruce Momjian <bruce@momjian.us> writes:
> >> > Our docs suggest an optimization to reduce WAL logging when you are
> >> > creating and populating a table:
> >>
> >> > ? ? http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
> >>
> >> > ? ? In minimal level, WAL-logging of some bulk operations, like CREATE
> >> > ? ? INDEX, CLUSTER and COPY on a table that was created or truncated in the
> >> > ? ? same transaction can be safely skipped, which can make those operations
> >> > ? ? much faster (see Section 14.4.7). But minimal WAL does not contain
> >> > ? ? enough information to reconstruct the data from a base backup and the
> >> > ? ? WAL logs, so either archive or hot_standby level must be used to enable
> >> > ? ? WAL archiving (archive_mode) and streaming replication.
> >>
> >> > I am confused why we issue significant WAL traffic for CREATE INDEX?
> >>
> >> The point is that in minimal level we *don't*. ?We just fsync the index
> >> file before committing. ?In higher levels we have to write the whole
> >> index contents to the WAL, not only the disk file, so that the info
> >> reaches the archive or standby slaves.
> >>
> >> Same for the other cases.
> >
> > I realize the need for WAL logging CREATE INDEX for non-'minimal'
> > wal_level values.
> >
> > But the documentation states the WAL logging is reduced for CREATE INDEX
> > by doing CREATE TABLE in the same transaction block. ?Why is this true?
> > Why would the CREATE TABLE affect the "CREATE INDEX" WAL volume?
> >
> > I am wondering if the documention is correct about CLUSTER and COPY, but
> > incorrect for CREATE INDEX.
>
> I think the problem here might be ambiguous wording.  I believe that
> the modifier "on a table that was created or truncated in the same
> transaction" is intended to apply only to "COPY", but the way it's
> written, someone (such as you) might be forgiven for thinking that it
> applied to the larger phrase "CREATE INDEX, CLUSTER, or COPY".

I have created a documentation patch to clarify this, and to mention
CREATE TABLE AS which also has this optimization.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 4fadca9..a1f51ec
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** SET ENABLE_SEQSCAN TO OFF;
*** 1452,1461 ****
         </para>
         <para>
          In <literal>minimal</> level, WAL-logging of some bulk operations, like
!         <command>CREATE INDEX</>, <command>CLUSTER</> and <command>COPY</> on
!         a table that was created or truncated in the same transaction can be
!         safely skipped, which can make those operations much faster (see
!         <xref linkend="populate-pitr">). But minimal WAL does not contain
          enough information to reconstruct the data from a base backup and the
          WAL logs, so either <literal>archive</> or <literal>hot_standby</>
          level must be used to enable
--- 1452,1463 ----
         </para>
         <para>
          In <literal>minimal</> level, WAL-logging of some bulk operations, like
!         <command>CREATE INDEX</>, <command>CLUSTER</>, and <command>CREATE
!         TABLE AS</>, can be safely skipped, which can make those
!         operations much faster (see <xref linkend="populate-pitr">).
!         In minimal WAL-logging mode, it is also possible to skip WAL-logging of
!         and <command>COPY</> operations on tables that were created
!         or truncated in the same transaction.  But minimal WAL does not contain
          enough information to reconstruct the data from a base backup and the
          WAL logs, so either <literal>archive</> or <literal>hot_standby</>
          level must be used to enable

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Transient plans versus the SPI API
Next
From: Robert Haas
Date:
Subject: Re: WIP fix proposal for bug #6123