Re: Cascade replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Cascade replication
Date
Msg-id CAHGQGwFMJoryh0f3xSRfhb-rw0Pid3E8+fm11xeAiuD0=Dn8zQ@mail.gmail.com
In response to Re: Cascade replication  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Cascade replication
List pgsql-hackers
On Wed, Jul 6, 2011 at 4:53 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jul 6, 2011 at 2:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> 1. De-archive the file to RECOVERYXLOG
>>> 2. If RECOVERYXLOG is valid, remove a pre-existing one and rename
>>>    RECOVERYXLOG to the correct name
>>> 3. Replay the file with the correct name
>>
>> Yes please, that makes sense.

In #2, if the server is killed with SIGKILL just after removing the pre-existing
file and before renaming RECOVERYXLOG, we lose the file with the correct name.
Even in this case we would be able to restore it from the archive, but what if
the archive happens to be unavailable? We would lose the file permanently. So
should we introduce the following safeguard?
   2'. If RECOVERYXLOG is valid, move the pre-existing file to pg_xlog/backup,
       rename RECOVERYXLOG to the correct name, and remove the pre-existing
       file from pg_xlog/backup

       Currently we give up recovery if the target file is in neither the
       archive nor pg_xlog. But if we adopt the above safeguard, in that case
       we should also try to read the file from pg_xlog/backup.
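
To make the ordering in 2' concrete, here is a rough sketch in Python (not the
actual startup-process code, which is C). The function names, the
restored_name parameter and the directory layout are made up for illustration,
RECOVERYXLOG is assumed to have been validated by the caller, and the fsync
calls that a real implementation would need are left out.

```python
import os


def install_restored_segment(xlog_dir, segment_name, restored_name="RECOVERYXLOG"):
    """Hypothetical helper: install a restored segment as in step 2'.

    At every point in the sequence at least one copy of the segment exists
    under pg_xlog or pg_xlog/backup, so a crash cannot leave us with neither.
    """
    restored = os.path.join(xlog_dir, restored_name)
    target = os.path.join(xlog_dir, segment_name)
    backup_dir = os.path.join(xlog_dir, "backup")
    backup = os.path.join(backup_dir, segment_name)

    os.makedirs(backup_dir, exist_ok=True)
    if os.path.exists(target):
        # A crash after this line leaves the old copy in backup/.
        os.replace(target, backup)
    # A crash after this line leaves the new copy in place plus the backup.
    os.replace(restored, target)
    if os.path.exists(backup):
        os.remove(backup)


def find_segment(xlog_dir, segment_name):
    """Hypothetical lookup used when the archive cannot supply the file:
    try pg_xlog first, then pg_xlog/backup."""
    for candidate in (
        os.path.join(xlog_dir, segment_name),
        os.path.join(xlog_dir, "backup", segment_name),
    ):
        if os.path.exists(candidate):
            return candidate
    return None
```

If the server dies between the two os.replace calls, find_segment still
returns the copy under backup/, which is the extra read path described above.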

In #2, there is another problem: walsender might have the pre-existing file
open, so the startup process would need to request walsenders to close the
file before removing (or renaming) it, wait for the new file to appear, and
open it again. This might make the code complicated. Does anyone have a better
approach?
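
As a small illustration of why a reader would have to reopen the file, the
Python sketch below replaces a file while a handle to it is still open. It
assumes POSIX semantics, where the old handle keeps pointing at the old file
contents after the rename; on Windows the replace itself may fail while the
file is open. The "reader" handle only stands in for a walsender, and the file
name "segment" is made up.

```python
import os


def read_after_replace(directory):
    """Return (bytes seen by the old handle, bytes seen after reopening)."""
    target = os.path.join(directory, "segment")
    incoming = os.path.join(directory, "RECOVERYXLOG")
    with open(target, "wb") as f:
        f.write(b"old")
    with open(incoming, "wb") as f:
        f.write(b"new")

    reader = open(target, "rb")  # stands in for walsender's open file
    try:
        os.replace(incoming, target)  # startup process installs the new file
        stale = reader.read()  # the old handle still reads the old file
    finally:
        reader.close()

    with open(target, "rb") as reopened:  # reopening picks up the new file
        fresh = reopened.read()
    return stale, fresh
```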

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

