Thread: Documentation Update: Document pg_start_backup checkpoint behavior

Documentation Update: Document pg_start_backup checkpoint behavior

From
Michael Renner
Date:
Hi,

small patch for the documentation describing the current pg_start_backup
checkpoint behavior as per
http://archives.postgresql.org//pgsql-general/2008-09/msg01124.php .

Should we note down a TODO to revisit the current checkpoint handling?

best regards,
Michael
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 02545f1..6ea9488 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -737,12 +737,8 @@ SELECT pg_start_backup('label');
      (see the configuration parameter
      <xref linkend="guc-checkpoint-completion-target">).  Usually
      this is what you want because it minimizes the impact on query
-     processing.  If you just want to start the backup as soon as
-     possible, execute a <command>CHECKPOINT</> command
-     (which performs a checkpoint as quickly as possible) and then
-     immediately execute <function>pg_start_backup</>.  Then there
-     will be very little for <function>pg_start_backup</>'s checkpoint
-     to do, and it won't take long.
+     processing.  Unfortunately it's currently not possible to expedite
+     the checkpointing done by pg_start_backup.
     </para>
    </listitem>
    <listitem>

Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Bruce Momjian
Date:
Michael Renner wrote:
> Hi,
>
> small patch for the documentation describing the current pg_start_backup
> checkpoint behavior as per
> http://archives.postgresql.org//pgsql-general/2008-09/msg01124.php .
>
> Should we note down a TODO to revisit the current checkpoint handling?
>
> best regards,
> Michael

> diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
> index 02545f1..6ea9488 100644
> --- a/doc/src/sgml/backup.sgml
> +++ b/doc/src/sgml/backup.sgml
> @@ -737,12 +737,8 @@ SELECT pg_start_backup('label');
>       (see the configuration parameter
>       <xref linkend="guc-checkpoint-completion-target">).  Usually
>       this is what you want because it minimizes the impact on query
> -     processing.  If you just want to start the backup as soon as
> -     possible, execute a <command>CHECKPOINT</> command
> -     (which performs a checkpoint as quickly as possible) and then
> -     immediately execute <function>pg_start_backup</>.  Then there
> -     will be very little for <function>pg_start_backup</>'s checkpoint
> -     to do, and it won't take long.
> +     processing.  Unfortunately it's currently not possible to expedite
> +     the checkpointing done by pg_start_backup.
>      </para>
>     </listitem>
>     <listitem>

I have combined the above patch with another change that reports a
checkpoint is taking place:

    test=> select pg_start_backup('12');
    NOTICE:  performing checkpoint
     pg_start_backup
    -----------------
     0/2000020
    (1 row)

Patch attached.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.123
diff -c -c -r2.123 backup.sgml
*** doc/src/sgml/backup.sgml    5 Mar 2009 19:50:03 -0000    2.123
--- doc/src/sgml/backup.sgml    3 Apr 2009 03:35:42 -0000
***************
*** 737,748 ****
       (see the configuration parameter
       <xref linkend="guc-checkpoint-completion-target">).  Usually
       this is what you want because it minimizes the impact on query
!      processing.  If you just want to start the backup as soon as
!      possible, execute a <command>CHECKPOINT</> command
!      (which performs a checkpoint as quickly as possible) and then
!      immediately execute <function>pg_start_backup</>.  Then there
!      will be very little for <function>pg_start_backup</>'s checkpoint
!      to do, and it won't take long.
      </para>
     </listitem>
     <listitem>
--- 737,744 ----
       (see the configuration parameter
       <xref linkend="guc-checkpoint-completion-target">).  Usually
       this is what you want because it minimizes the impact on query
!      processing.  Unfortunately it's currently not possible to expedite
!      the checkpointing done by pg_start_backup.
      </para>
     </listitem>
     <listitem>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.334
diff -c -c -r1.334 xlog.c
*** src/backend/access/transam/xlog.c    11 Mar 2009 23:19:24 -0000    1.334
--- src/backend/access/transam/xlog.c    3 Apr 2009 03:35:42 -0000
***************
*** 6977,6982 ****
--- 6977,6984 ----
      /* Ensure we release forcePageWrites if fail below */
      PG_ENSURE_ERROR_CLEANUP(pg_start_backup_callback, (Datum) 0);
      {
+         ereport(NOTICE,
+                 (errmsg("performing checkpoint")));
          /*
           * Force a CHECKPOINT.    Aside from being necessary to prevent torn
           * page problems, this guarantees that two successive backup runs will

Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> +         ereport(NOTICE,
> +                 (errmsg("performing checkpoint")));

You've *got* to be kidding.
        regards, tom lane


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Heikki Linnakangas
Date:
Bruce Momjian wrote:
> Michael Renner wrote:
>> +     processing.  Unfortunately it's currently not possible to expedite
>> +     the checkpointing done by pg_start_backup.
>>      </para>
>>     </listitem>
>>     <listitem>
> 
> I have combined the above patch with another change that reports a
> checkpoint is taking place:
> 
>     test=> select pg_start_backup('12');
>     NOTICE:  performing checkpoint
>      pg_start_backup
>     -----------------
>      0/2000020
>     (1 row)

Rather than deplore that you can't expedite the checkpoint, why don't we 
just make it possible? It's trivial to do, and in hindsight I think we 
should've implemented that option when we got smoothed checkpoints. 
Let's just decide what the command should look like.

The first question is what the default behavior should be? We've seen 
enough complaints and I've been bitten by that myself during development 
of other stuff often enough that I think we should change the default to 
immediate. From backwards-compatibility point of view, we shouldn't 
change the default, but then again an immediate checkpoint was what you 
got before 8.3.

For the interface, I can see two options:

1. New function

pg_start_backup('label') -> immediate checkpoint
pg_start_backup_lazy('label') -> lazy checkpoint

2. New argument

pg_start_backup('label') -> immediate checkpoint
pg_start_backup('label', false) -> immediate checkpoint
pg_start_backup('label', true) -> lazy checkpoint

The first looks nicer, IMHO, because the word 'lazy' makes it 
self-documenting. In the second form, you have to look at the manual to 
figure out what the 2nd argument does.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
I wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> +         ereport(NOTICE,
>> +                 (errmsg("performing checkpoint")));

> You've *got* to be kidding.

Sigh.  I have to apologize for that over-hasty complaint: I misread
where you intended to put the message.  (Seems like there is too
much stuff in xlog.c that executes in too many different contexts.
Maybe we could split it up sometime.)

Still, I don't much like this solution.  I agree with Heikki:
let's just fix it.
        regards, tom lane


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Rather than deplore that you can't expedite the checkpoint, why don't we 
> just make it possible?

+1

> The first question is what the default behavior should be? We've seen 
> enough complaints and I've been bitten by that myself during development 
> of other stuff often enough that I think we should change the default to 
> immediate. From backwards-compatibility point of view, we shouldn't 
> change the default, but then again an immediate checkpoint was what you 
> got before 8.3.

I think we shouldn't change the default.  Which puts a hole in your
suggestion for function naming.  But then again, I like the extra
argument better anyway ...
        regards, tom lane


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Bernd Helmle
Date:
--On Friday, April 03, 2009 08:30:14 +0300 Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

> The first looks nicer, IMHO, because the word 'lazy' makes it
> self-documenting. In the second form, you have to look at the manual to
> figure out what the 2nd argument does.

Given that many catalog functions are already overloaded, and in the
name of consistency, I vote for your 2nd option (the new argument).

--  Thanks
                   Bernd


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Bruce Momjian
Date:
Tom Lane wrote:
> I wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> >> +         ereport(NOTICE,
> >> +                 (errmsg("performing checkpoint")));
> 
> > You've *got* to be kidding.
> 
> Sigh.  I have to apologize for that over-hasty complaint: I misread
> where you intended to put the message.  (Seems like there is too
> much stuff in xlog.c that executes in too many different contexts.
> Maybe we could split it up sometime.)
> 
> Still, I don't much like this solution.  I agree with Heikki:
> let's just fix it.

Agreed, fixing it is better than trying to document/report odd behavior.

There was talk about making pg_start_backup do an immediate checkpoint
but there was some discussion that you wouldn't want an I/O storm from
pg_start_backup().  However, figuring you are going to do the tar backup
anyway, the pg_start_backup I/O seems trivial.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Bruce Momjian
Date:
Tom Lane wrote:
> I wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> >> +         ereport(NOTICE,
> >> +                 (errmsg("performing checkpoint")));
> 
> > You've *got* to be kidding.
> 
> Sigh.  I have to apologize for that over-hasty complaint: I misread
> where you intended to put the message.  (Seems like there is too
> much stuff in xlog.c that executes in too many different contexts.
> Maybe we could split it up sometime.)

No question xlog.c has been a dumping ground for our increasing xlog
features.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Still, I don't much like this solution.  I agree with Heikki:
>> let's just fix it.

> Agreed, fixing it is better than trying to document/report odd behavior.

> There was talk about making pg_start_backup do an immediate checkpoint
> but there was some discussion that you wouldn't want an I/O storm from
> pg_start_backup().  However, figuring you are going to do the tar backup
> anyway, the pg_start_backup I/O seems trivial.

The solution Heikki is proposing is to let the user choose immediate
or slow checkpoint.  I agree that there's not much point in the latter
if you are using something dumb like tar to take the filesystem backup,
but maybe the user has something smarter that won't cause such a big
I/O storm.
        regards, tom lane


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> There was talk about making pg_start_backup do an immediate checkpoint
>> but there was some discussion that you wouldn't want an I/O storm from
>> pg_start_backup().  However, figuring you are going to do the tar backup
>> anyway, the pg_start_backup I/O seems trivial.

Good point.

> The solution Heikki is proposing is to let the user choose immediate
> or slow checkpoint.  I agree that there's not much point in the latter
> if you are using something dumb like tar to take the filesystem backup,
> but maybe the user has something smarter that won't cause such a big
> I/O storm.

If the user is knowledgeable enough to use a smarter backup tool, he's 
probably knowledgeable enough to put pg_start_backup('foo', true) 
instead of just pg_start_backup('foo') in his scripts. But a new user 
who's just playing around and making his first backup, probably using 
tar, isn't.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom Lane wrote:
>> The solution Heikki is proposing is to let the user choose immediate
>> or slow checkpoint.  I agree that there's not much point in the latter
>> if you are using something dumb like tar to take the filesystem backup,
>> but maybe the user has something smarter that won't cause such a big
>> I/O storm.

> If the user is knowledgeable enough to use a smarter backup tool, he's 
> probably knowledgeable enough to put pg_start_backup('foo', true) 
> instead of just pg_start_backup('foo') in his scripts. But a new user 
> who's just playing around and making his first backup, probably using 
> tar, isn't.

It's not actually that difficult to have a tar backup be rate-limited.
If you're dumping the tar output onto a tape drive, or sending it across
a network, or indeed doing anything except dropping it onto another
local disk drive, you are going to find that tar is not saturating
your disk.
        regards, tom lane


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
I wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Rather than deplore that you can't expedite the checkpoint, why don't we 
>> just make it possible?

> +1

>> The first question is what the default behavior should be? We've seen 
>> enough complaints and I've been bitten by that myself during development 
>> of other stuff often enough that I think we should change the default to 
>> immediate. From backwards-compatibility point of view, we shouldn't 
>> change the default, but then again an immediate checkpoint was what you 
>> got before 8.3.

> I think we shouldn't change the default.  Which puts a hole in your
> suggestion for function naming.  But then again, I like the extra
> argument better anyway ...

I'm going to go ahead and make this happen, using the arrangement

pg_start_backup('label') -> slow checkpoint (backwards compatible)
pg_start_backup('label', false) -> slow checkpoint
pg_start_backup('label', true) -> immediate checkpoint

Bruce suggested what seemed like an excellent idea, which is to make
this self-documenting using the new default-arguments feature ---
it'll look something like this in \df:

regression=# create function foo (label text, fast bool = false) returns int as $$select 1$$ language sql;
CREATE FUNCTION
regression=# \df foo
                            List of functions
 Schema | Name | Result data type |          Argument data types
--------+------+------------------+----------------------------------------
 public | foo  | integer          | label text, fast boolean DEFAULT false
(1 row)

        regards, tom lane


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> I'm going to go ahead and make this happen, using the arrangement
> 
> pg_start_backup('label') -> slow checkpoint (backwards compatible)
> pg_start_backup('label', false) -> slow checkpoint
> pg_start_backup('label', true) -> immediate checkpoint

Probably a dumb question, but just to be sure: none of these functions
will return before the checkpoint is complete, right?  (In other
words, when the function returns, it is safe to begin the base
backup?)
-Kevin


Re: Documentation Update: Document pg_start_backup checkpoint behavior

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
>> I'm going to go ahead and make this happen, using the arrangement
>> 
>> pg_start_backup('label') -> slow checkpoint (backwards compatible)
>> pg_start_backup('label', false) -> slow checkpoint
>> pg_start_backup('label', true) -> immediate checkpoint
> Probably a dumb question, but just to be sure: none of these functions
> will return before the checkpoint is complete, right?  (In other
> words, when the function returns, it is safe to begin the base
> backup?)

Correct.  The only change here is whether or not to pass the
CHECKPOINT_IMMEDIATE flag to RequestCheckpoint.  We do CHECKPOINT_WAIT
in any case.
        regards, tom lane
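
For readers following along, the flag choice described in the last message
can be sketched roughly as below. This is only an illustration, not the
committed xlog.c code: the names CHECKPOINT_IMMEDIATE and CHECKPOINT_WAIT
come from the thread, but the numeric values are made up here, and
backup_checkpoint_flags() and checkpoint_mode() are hypothetical helpers
that do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative flag bits only.  The names appear in the thread; the
 * numeric values are assumptions and need not match the PostgreSQL headers.
 */
#define CHECKPOINT_IMMEDIATE 0x01   /* do the checkpoint without throttling */
#define CHECKPOINT_WAIT      0x02   /* caller blocks until it completes */

/* Hypothetical helper: choose the flags for pg_start_backup(label, fast). */
int
backup_checkpoint_flags(bool fast)
{
    int         flags = CHECKPOINT_WAIT;    /* always wait, per the thread */

    if (fast)
        flags |= CHECKPOINT_IMMEDIATE;
    return flags;
}

/* Hypothetical helper: name the checkpoint mode implied by the flags. */
const char *
checkpoint_mode(int flags)
{
    /* The base backup may only start once the checkpoint has finished. */
    assert(flags & CHECKPOINT_WAIT);
    return (flags & CHECKPOINT_IMMEDIATE) ? "immediate" : "slow";
}
```

Under this sketch, pg_start_backup('label') and pg_start_backup('label', false)
both correspond to backup_checkpoint_flags(false), i.e. a slow checkpoint,
while pg_start_backup('label', true) adds the immediate bit. The wait bit is
set in every case, which matches the answer to Kevin's question.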