Thread: [patch] [doc] Further note required activity aspect of automatic checkpoint and archiving

From: "David G. Johnston"

Hackers,

Over in general [1] Robert Inder griped about the not-so-recent change to our automatic checkpointing, and thus archiving, behavior where non-activity results in nothing happening.  In looking over the documentation I felt a few changes could be made to increase the chance that a reader learns this key dynamic.  Attached is a patch with those changes.  Copied inline for ease of review.

commit 8af7f653907688252d8663a80e945f6f5782b0de
Author: David G. Johnston <david.g.johnston@gmail.com>
Date:   Mon Oct 12 21:32:32 2020 +0000

    Further note required activity aspect of automatic checkpoint and archiving
   
    A few spots in the documentation could use a reminder that checkpoints
    and archiving require that actual WAL records be written in order to happen
    automatically.

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 42a8ed328d..c312fc9387 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -722,6 +722,8 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
     short <varname>archive_timeout</varname> &mdash; it will bloat your archive
     storage.  <varname>archive_timeout</varname> settings of a minute or so are
     usually reasonable.
+    This is mitigated by the fact that empty WAL segments will not be archived
+    even if the archive_timeout period has elapsed.
    </para>
 
    <para>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ee914740cc..306f78765c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3131,6 +3131,8 @@ include_dir 'conf.d'
       <listitem>
        <para>
         Maximum time between automatic WAL checkpoints.
+        The automatic checkpoint will do nothing if no new WAL has been
+        written since the last recorded checkpoint.
         If this value is specified without units, it is taken as seconds.
         The valid range is between 30 seconds and one day.
         The default is five minutes (<literal>5min</literal>).
@@ -3337,18 +3339,17 @@ include_dir 'conf.d'
       </term>
       <listitem>
        <para>
+        Force the completion of the current, non-empty, WAL segment when
+        this amount of time (if non-zero) has elapsed since the last
+        segment file switch.
         The <xref linkend="guc-archive-command"/> is only invoked for
         completed WAL segments. Hence, if your server generates little WAL
         traffic (or has slack periods where it does so), there could be a
         long delay between the completion of a transaction and its safe
         recording in archive storage.  To limit how old unarchived
         data can be, you can set <varname>archive_timeout</varname> to force the
-        server to switch to a new WAL segment file periodically.  When this
-        parameter is greater than zero, the server will switch to a new
-        segment file whenever this amount of time has elapsed since the last
-        segment file switch, and there has been any database activity,
-        including a single checkpoint (checkpoints are skipped if there is
-        no database activity).  Note that archived files that are closed
+        server to switch to a new WAL segment file periodically.
+        Note that archived files that are closed
         early due to a forced switch are still the same length as completely
         full files.  Therefore, it is unwise to use a very short
         <varname>archive_timeout</varname> &mdash; it will bloat your archive

David J.

On 2020-10-12 23:54, David G. Johnston wrote:
> --- a/doc/src/sgml/backup.sgml
> +++ b/doc/src/sgml/backup.sgml
> @@ -722,6 +722,8 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
>       short <varname>archive_timeout</varname> &mdash; it will bloat your archive
>       storage.  <varname>archive_timeout</varname> settings of a minute or so are
>       usually reasonable.
> +    This is mitigated by the fact that empty WAL segments will not be archived
> +    even if the archive_timeout period has elapsed.
>      </para>

This is hopefully not what happens.  What this would mean is that I'd 
then have a sequence of WAL files named, say,

1, 2, 3, 7, 8, ...

because a few in the middle were not archived because they were empty.

> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -3131,6 +3131,8 @@ include_dir 'conf.d'
>         <listitem>
>          <para>
>           Maximum time between automatic WAL checkpoints.
> +        The automatic checkpoint will do nothing if no new WAL has been
> +        written since the last recorded checkpoint.
>           If this value is specified without units, it is taken as seconds.
>           The valid range is between 30 seconds and one day.
>           The default is five minutes (<literal>5min</literal>).

I think what happens is that the checkpoint is skipped, not that the 
checkpoint happens but does nothing.  That is the wording you cited in 
the other thread from 
<https://www.postgresql.org/docs/13/wal-configuration.html>.



On Fri, Jan 15, 2021 at 12:16 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
> On 2020-10-12 23:54, David G. Johnston wrote:
>> --- a/doc/src/sgml/backup.sgml
>> +++ b/doc/src/sgml/backup.sgml
>> @@ -722,6 +722,8 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
>>       short <varname>archive_timeout</varname> &mdash; it will bloat your archive
>>       storage.  <varname>archive_timeout</varname> settings of a minute or so are
>>       usually reasonable.
>> +    This is mitigated by the fact that empty WAL segments will not be archived
>> +    even if the archive_timeout period has elapsed.
>>      </para>
>
> This is hopefully not what happens.  What this would mean is that I'd
> then have a sequence of WAL files named, say,
>
> 1, 2, 3, 7, 8, ...
>
> because a few in the middle were not archived because they were empty.

This addition assumes the reader knows that the archive process first fills a file to its maximum size and then archives it.  Filling the file is what causes the next file in the sequence to be created.  So if archiving doesn't happen, the file does not get filled and the status quo prevails.

If that needs to be made more explicit in this change, maybe:

"This is mitigated by the fact that archiving, and thus filling, the active WAL segment will not happen if that segment is empty; it will continue as the active segment."
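
As a rough illustration of the dynamic described above, here is a toy model in Python. It is only a sketch under stated assumptions: the class ToyWal and its methods are hypothetical names invented for this example and do not correspond to PostgreSQL internals. It shows that when a forced switch is skipped for an empty active segment, no segment number is consumed, so the archived sequence has no gaps.

```python
# Toy model only: names and structure are hypothetical, not PostgreSQL internals.
class ToyWal:
    def __init__(self, archive_timeout):
        self.archive_timeout = archive_timeout  # seconds; 0 disables forced switches
        self.active_segment = 1                 # number of the segment being written
        self.active_has_data = False            # any WAL written to the active segment?
        self.last_switch = 0                    # time of the last segment switch
        self.archived = []                      # segment numbers handed to the archiver

    def write_wal(self):
        """Record that some WAL landed in the active segment."""
        self.active_has_data = True

    def tick(self, now):
        """Force a switch only if the timeout elapsed and the segment is non-empty."""
        if self.archive_timeout <= 0:
            return
        if now - self.last_switch < self.archive_timeout:
            return
        if not self.active_has_data:
            return  # empty segment stays active; no segment number is consumed
        self.archived.append(self.active_segment)
        self.active_segment += 1
        self.active_has_data = False
        self.last_switch = now


wal = ToyWal(archive_timeout=60)
wal.write_wal()
wal.tick(60)    # segment 1 is switched out and archived
wal.tick(120)   # idle period: switch skipped
wal.tick(180)   # idle period: switch skipped
wal.write_wal()
wal.tick(240)   # segment 2 is switched out and archived
print(wal.archived)  # [1, 2]
```

In this model the idle timeouts leave segment 2 as the active segment, which is the "status quo prevails" outcome, rather than producing a sequence such as 1, 2, 3, 7, 8.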


>> --- a/doc/src/sgml/config.sgml
>> +++ b/doc/src/sgml/config.sgml
>> @@ -3131,6 +3131,8 @@ include_dir 'conf.d'
>>         <listitem>
>>          <para>
>>           Maximum time between automatic WAL checkpoints.
>> +        The automatic checkpoint will do nothing if no new WAL has been
>> +        written since the last recorded checkpoint.
>>           If this value is specified without units, it is taken as seconds.
>>           The valid range is between 30 seconds and one day.
>>           The default is five minutes (<literal>5min</literal>).
>
> I think what happens is that the checkpoint is skipped, not that the
> checkpoint happens but does nothing.  That is the wording you cited in
> the other thread from
> <https://www.postgresql.org/docs/13/wal-configuration.html>.

Consistency is good, and on further consideration the "skipped" wording is generally better anyway.

"The automatic checkpoint will be skipped if no new WAL has been written since the last recorded checkpoint."
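
To make the "skipped" wording concrete, here is a minimal sketch in Python. It is a simplification, not PostgreSQL code: the function run_checkpointer and its input format are hypothetical, and the model deliberately ignores the WAL record that a real checkpoint writes itself. Each list entry stands for the WAL insert position observed when the checkpoint timeout fires.

```python
# Toy model only: a hypothetical sketch of the "skipped" wording, not PostgreSQL code.
def run_checkpointer(wal_positions):
    """wal_positions[i] is the WAL insert position seen at the i-th timeout.

    Returns one "performed" or "skipped" string per timeout.
    """
    outcomes = []
    last_checkpoint_pos = 0  # WAL position at the last recorded checkpoint
    for pos in wal_positions:
        if pos == last_checkpoint_pos:
            # No new WAL since the last checkpoint: skip entirely.
            outcomes.append("skipped")
        else:
            outcomes.append("performed")
            last_checkpoint_pos = pos
    return outcomes


outcomes = run_checkpointer([100, 100, 100, 250])
print(outcomes)  # ['performed', 'skipped', 'skipped', 'performed']
```

The two idle timeouts do not produce a checkpoint that "does nothing"; in this model no checkpoint happens at all.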

David J.

Hi David,

On 1/15/21 2:50 PM, David G. Johnston wrote:
> 
> If the above wants to be made more explicit in this change maybe:
> 
> "This is mitigated by the fact that archiving, and thus filling, the 
> active WAL segment will not happen if that segment is empty; it will 
> continue as the active segment."

"archiving, and thus filling" seems awkward to me. Perhaps:

This is mitigated by the fact that WAL segments will not be archived 
until they have been filled with some data, even if the archive_timeout 
period has elapsed.

> Consistency is good; and considering it further the skipped wording is 
> generally better anyway.
> 
> "The automatic checkpoint will be skipped if no new WAL has been written 
> since the last recorded checkpoint."

Looks good to me.

Could you produce a new patch so Peter has something complete to look at?

Regards,
-- 
-David
david@pgmasters.net



> On 18 Mar 2021, at 16:36, David Steele <david@pgmasters.net> wrote:

> Could you produce a new patch so Peter has something complete to look at?

As this thread has been stalled for a few commitfests by now I'm marking
this patch as returned with feedback.  Feel free to open a new entry for an
updated patch.

--
Daniel Gustafsson        https://vmware.com/