Re: For what should pg_stop_backup wait? - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: For what should pg_stop_backup wait?
Date
Msg-id 3f0b79eb0808071947o1f70c19fj89e919851fe80448@mail.gmail.com
In response to Re: For what should pg_stop_backup wait?  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: For what should pg_stop_backup wait?  (Simon Riggs <simon@2ndquadrant.com>)
Re: For what should pg_stop_backup wait?  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Thu, Aug 7, 2008 at 11:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Thu, 2008-08-07 at 14:59 +0100, Simon Riggs wrote:
>
>> I'll do a patch. Thanks for your input.
>
> Please review attached patch.

Thank you for your patch!

However, I think there are two problems with this patch.

> !      * Wait until the last WAL file has been archived. We assume that the
> !      * alphabetic sorting property of the WAL files ensures the history
> !      * file is guaranteed archived by the time the last WAL file is archived.
> !      * The history file name depends upon the startpoint, whereas the last
> !      * file depends upon the stoppoint. They are always different because we
> !      * make an explicit xlog switch earlier in this function.

If only a few transactions occur during the backup, the startpoint and the
stoppoint may fall in the same WAL file:

   postgres=# SELECT pg_xlogfile_name(pg_start_backup('test')) AS startpoint;
           startpoint
   --------------------------
    000000010000000000000004
   (1 row)

   *** A few transactions occur ***

   postgres=# SELECT pg_xlogfile_name(pg_stop_backup()) AS stoppoint;
           stoppoint
   --------------------------
    000000010000000000000004
   (1 row)

In this situation, the history file (000000010000000000000004.00000020.backup)
sorts after the stoppoint file (000000010000000000000004) in alphabetical
order, so archiving of the stoppoint file does not guarantee that the history
file has been archived. I think pg_stop_backup should therefore wait for both
the stoppoint file and the history file.
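
The ordering can be checked with a small sketch. This is only an illustration in Python, not the server code: the helper names are hypothetical, and the history file name format is an assumption inferred from the example name above.

```python
def backup_history_file_name(start_wal_file: str, start_offset: int) -> str:
    # Assumed format, inferred from the example name in this mail:
    # <start WAL file>.<offset as 8 hex digits>.backup
    return f"{start_wal_file}.{start_offset:08X}.backup"


def files_to_wait_for(stop_wal_file: str, history_file: str) -> list[str]:
    # Hypothetical helper: list both names explicitly instead of
    # relying on the sort order to imply one from the other.
    return sorted({stop_wal_file, history_file})


start_wal = "000000010000000000000004"  # startpoint file from the session above
stop_wal = "000000010000000000000004"   # stoppoint file, same segment
history = backup_history_file_name(start_wal, 0x20)

# A string that is a strict prefix of another sorts before it, so the
# stoppoint file comes first and the history file comes last.
print(history)
print(stop_wal < history)
print(files_to_wait_for(stop_wal, history))
```

Because the stoppoint name is a strict prefix of the history file name, it compares lower, so "last WAL file archived" does not cover the history file in this case.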


> !     while (!XLogArchiveCheckDone(stopxlogfilename, false))

If a concurrent checkpoint removes the status file before XLogArchiveCheckDone
is called, pg_stop_backup keeps waiting forever, which is undesirable.
statement_timeout may help, but I don't want to rely on it, because it would
cancel a backup that has actually *succeeded*.

How about checking whether the stoppoint file was archived by comparing it
with the last archived WAL file? The archiver process can report the last
archived WAL file, or we can calculate it from the status files.
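
A rough sketch of that idea follows, in Python rather than server C. All function names are hypothetical, the status directory is modeled as a plain set of file names, and the comparison assumes a single timeline so that the fixed-width WAL file names sort in archive order.

```python
from typing import Optional

WAL_NAME_LEN = 24  # timeline + log + segment, 8 hex digits each
DONE_SUFFIX = ".done"


def status_file_says_done(status_files: set, wal_file: str) -> bool:
    # Simplified model of a check that needs the .done file to still exist.
    return wal_file + DONE_SUFFIX in status_files


def last_archived_from_status(status_files: set) -> Optional[str]:
    # Highest WAL segment name that has a .done status file, if any.
    # Names of other lengths (e.g. backup history files) are skipped.
    done = [name[:-len(DONE_SUFFIX)] for name in status_files
            if name.endswith(DONE_SUFFIX)
            and len(name) == WAL_NAME_LEN + len(DONE_SUFFIX)]
    return max(done, default=None)


def archived_by_comparison(stop_wal_file: str,
                           last_archived: Optional[str]) -> bool:
    # Assumption: same timeline, so lexical order matches archive order.
    return last_archived is not None and stop_wal_file <= last_archived


stop_wal = "000000010000000000000004"
# The stoppoint's own status file is already gone (removed by a checkpoint
# in this model), but a later segment has been archived.
status_files = {"000000010000000000000005.done",
                "000000010000000000000006.ready"}

print(status_file_says_done(status_files, stop_wal))
print(archived_by_comparison(stop_wal, last_archived_from_status(status_files)))
```

In this model the per-file check never becomes true once the status file is removed, while the comparison against the last archived file still answers correctly.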

On the other hand, pg_stop_backup cannot wait forever for the history file,
because only pg_stop_backup removes its status file, and a concurrent
pg_stop_backup never happens.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

