Re: Patch for fail-back without fresh backup - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Patch for fail-back without fresh backup
Date
Msg-id CABOikdP=2ZfcFKaNFe2f_HJs60qkdaokbV0mmdBCpAxRAUKrXg@mail.gmail.com
In response to Re: Patch for fail-back without fresh backup  (Sawada Masahiko <sawada.mshk@gmail.com>)
Responses Re: Patch for fail-back without fresh backup  (Andres Freund <andres@2ndquadrant.com>)
Re: Patch for fail-back without fresh backup  (Sawada Masahiko <sawada.mshk@gmail.com>)
List pgsql-hackers

On Tue, Oct 8, 2013 at 2:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> On Fri, Oct 4, 2013 at 4:32 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> >
> I attached the v12 patch, which I have modified based on the above suggestions.

There are still some parts of this design/patch which I am concerned about.

1. The design couples the synchronous standby and the failback safe standby rather tightly. IIRC this is based on the feedback you received earlier, so my apologies for raising it again so late.
 a. The GUC synchronous_standby_names is used to name synchronous as well as failback safe standbys. I don't know if that will confuse users.
 b. synchronous_commit's value will also control whether a sync/async failback safe standby waits for remote write or flush. Is that reasonable? Or should there be a different way to configure the failback safe standby's WAL safety?

2. With the current design/implementation, a user can't configure a synchronous standby and an async failback safe standby at the same time. I think we discussed this earlier and there was an agreement on the limitation. I just wanted to get that confirmed again.

3. SyncRepReleaseWaiters() does not know whether it's waking up backends waiting for sync rep or for failback safe rep. Is that OK? For example, I found that the elog() message the function emits to announce the next takeover may look wrong. Since changing synchronous_transfer requires a server restart, we can teach SyncRepReleaseWaiters() to look at that parameter to figure out whether the standby is a sync and/or failback safe standby.

4. The documentation still needs more work to clearly explain the use case.

5. Have we done any sort of stress testing of the patch? If there is a bug, data corruption at the master can go unnoticed. So IMHO we need many crash recovery tests to ensure that the patch is functionally correct.
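
To make point 3 a bit more concrete, here is a rough standalone sketch of what I mean, not code from the patch. The enum, its three values and all function names are hypothetical stand-ins; I am assuming only that synchronous_transfer can tell us whether backends wait at commit, before data page flush, or both. The message text is modelled on the existing "is now the synchronous standby" announcement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the value of synchronous_transfer. */
typedef enum
{
    TRANSFER_COMMIT,        /* wait only at commit: plain sync rep */
    TRANSFER_DATA_FLUSH,    /* wait only before data page flush: failback safe */
    TRANSFER_ALL            /* wait in both cases */
} SyncTransferMode;

/* Does this setting make the standby a synchronous standby? */
bool
standby_is_sync(SyncTransferMode mode)
{
    return mode == TRANSFER_COMMIT || mode == TRANSFER_ALL;
}

/* Does this setting make the standby a failback safe standby? */
bool
standby_is_failback_safe(SyncTransferMode mode)
{
    return mode == TRANSFER_DATA_FLUSH || mode == TRANSFER_ALL;
}

/* Pick the role wording to use in the takeover announcement. */
const char *
standby_role_name(SyncTransferMode mode)
{
    bool sync = standby_is_sync(mode);
    bool failback = standby_is_failback_safe(mode);

    /* Every mode in this sketch implies at least one of the two roles. */
    assert(sync || failback);

    if (sync && failback)
        return "synchronous and failback safe standby";
    if (sync)
        return "synchronous standby";
    return "failback safe standby";
}

/*
 * Build the announcement that the release function would log.
 * Returns the snprintf() result, i.e. the untruncated message length.
 */
int
format_takeover_message(char *buf, size_t len, const char *standby,
                        int priority, SyncTransferMode mode)
{
    return snprintf(buf, len, "standby \"%s\" is now the %s with priority %d",
                    standby, standby_role_name(mode), priority);
}
```

Because the parameter can only change at restart, the role could be computed once at startup and cached, so the release path would pay no extra cost per wakeup.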

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
