Documentation Update: WAL & Checkpoints - Mailing list pgsql-hackers

From Michael Renner
Subject Documentation Update: WAL & Checkpoints
Msg-id 49C532B7.5030608@amd.co.at
List pgsql-hackers
Hi,

This is a small update to the first paragraph of the WAL configuration
chapter. It goes into more detail on redo vs. checkpoint records, since
the underlying behavior can currently only be deduced from the source. I'm
not entirely sure I got everything right, so feel free to change it as
necessary.

I think it would be more appropriate to split the chapter and separate
the basics from implementation details and tunables, but for the time being
this ought to suffice. Is somebody "in charge" of the documentation and its
overall structure, or is it a community effort like everything else?


Best regards,
Michael Renner
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index cff6fde..69b8b0a 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -322,19 +322,24 @@
   </para>

   <para>
-   <firstterm>Checkpoints</firstterm><indexterm><primary>checkpoint</></>
-   are points in the sequence of transactions at which it is guaranteed
-   that the data files have been updated with all information written before
-   the checkpoint.  At checkpoint time, all dirty data pages are flushed to
-   disk and a special checkpoint record is written to the log file.
-   In the event of a crash, the crash recovery procedure looks at the latest
-   checkpoint record to determine the point in the log (known as the redo
-   record) from which it should start the REDO operation.  Any changes made to
-   data files before that point are known to be already on disk.  Hence, after
-   a checkpoint has been made, any log segments preceding the one containing
-   the redo record are no longer needed and can be recycled or removed. (When
-   <acronym>WAL</acronym> archiving is being done, the log segments must be
-   archived before being recycled or removed.)
+   <firstterm>Checkpoints</firstterm><indexterm><primary>checkpoint</></> are
+   points in the logical sequence of transactions at which it is guaranteed
+   that the data files have been updated with all information created before
+   the start of the checkpoint command.  Since flushing all dirty data (meaning
+   "changed only in the WAL") to disk can take a while on databases with
+   write-heavy loads, checkpoints are not a single operation but rather a
+   series of events.  When a checkpoint starts, a redo record is written to the
+   WAL and PostgreSQL starts writing out dirty data which has accumulated up to
+   the redo record.  At checkpoint completion time, all changed files are
+   fsynced and a special checkpoint record is written to the log file. In the
+   event of a crash, the crash recovery procedure looks at the latest
+   checkpoint record to determine from which redo record it should start the
+   REDO operation.  Any changes made to data files before that point are known
+   to be already on disk.  Hence, after a checkpoint has been made, any log
+   segments preceding the one containing the redo record are no longer needed
+   and can be recycled or removed. (When <acronym>WAL</acronym> archiving is
+   being done, the log segments must be archived before being recycled or
+   removed.)
   </para>

   <para>

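To make the sequence described in the patch easier to follow, here is a toy
Python model of it. This is only a sketch and not how PostgreSQL is actually
implemented: the class ToyDatabase and all of its method names are
hypothetical, the redo point is modelled as a plain position in the log rather
than as a real record, and every buffered page is treated as dirty.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Toy model of WAL and checkpoints; NOT PostgreSQL's implementation."""
    wal: list = field(default_factory=list)      # append-only log, survives a crash
    buffers: dict = field(default_factory=dict)  # in-memory pages, lost in a crash
    disk: dict = field(default_factory=dict)     # data files, survive a crash

    def write(self, key, value):
        # Log the change first, then touch only the in-memory page.
        self.wal.append(("change", key, value))
        self.buffers[key] = value

    def begin_checkpoint(self):
        # Assumption: the redo point is modelled as a plain WAL position.
        redo_pos = len(self.wal)
        # Toy simplification: every buffered page counts as dirty.
        dirty = dict(self.buffers)
        return redo_pos, dirty

    def finish_checkpoint(self, redo_pos, dirty):
        # Flush the pages captured at checkpoint start, then log completion.
        self.disk.update(dirty)
        self.wal.append(("checkpoint", redo_pos))

    def crash_and_recover(self):
        self.buffers = {}  # simulate the crash
        # Find the latest checkpoint record; without one, replay everything.
        redo_pos = 0
        for record in reversed(self.wal):
            if record[0] == "checkpoint":
                redo_pos = record[1]
                break
        # REDO: reapply every change logged at or after the redo point.
        replayed = 0
        for record in self.wal[redo_pos:]:
            if record[0] == "change":
                _, key, value = record
                self.disk[key] = value
                replayed += 1
        self.buffers = dict(self.disk)
        return redo_pos, replayed


db = ToyDatabase()
db.write("a", 1)
db.write("b", 2)
redo_pos, dirty = db.begin_checkpoint()
db.write("a", 3)                       # arrives while the checkpoint is running
db.finish_checkpoint(redo_pos, dirty)
db.write("c", 4)
print(db.crash_and_recover())          # (2, 2)
print(db.disk)                         # {'a': 3, 'b': 2, 'c': 4}
```

The change to "a" made while the checkpoint was running is not covered by the
flush, which is why recovery has to start at the redo point and not at the
checkpoint record itself. Everything in the log before position 2 is no longer
needed in this model.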