Re: Patch for fail-back without fresh backup - Mailing list pgsql-hackers

From Samrat Revagade
Subject Re: Patch for fail-back without fresh backup
Date
Msg-id CAF8Q-GxF2pq0o28b0e5n__PT6vs4iPFGSPRh4tO6x9BUiNWV5w@mail.gmail.com
In response to Re: Patch for fail-back without fresh backup  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Patch for fail-back without fresh backup  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers



On Tue, Oct 8, 2013 at 3:16 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-10-08 15:07:02 +0530, Pavan Deolasee wrote:
> On Tue, Oct 8, 2013 at 2:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>
> > On Fri, Oct 4, 2013 at 4:32 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > >
> > I attached the v12 patch, which has been modified based on the above suggestions.
> >
>
> There are still some parts of this design/patch which I am concerned about.
>
> 1. The design couples the synchronous standby and the failback-safe standby
> rather tightly. IIRC this is based on the feedback you received earlier, so my
> apologies for raising it again so late.

It is my impression that there still are several people having pretty
fundamental doubts about this approach in general. From what I remember
neither Heikki, Simon, Tom nor me were really convinced about this
approach.


Here is a list of all the objections raised so far and how each has been addressed:

Major objections to the proposal:
*Tom Lane*
# Additional code complexity and performance overhead - On average the patch causes a 0.5 - 1% performance overhead for a fast-transaction workload, as the wait is mostly in the backend process. The latest refactored code looks less complex.
# Use of rsync with checksums - but many pages on the two servers may differ in their binary values because of hint bits.
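
To illustrate the hint-bit point, here is a minimal sketch, not taken from the patch or from rsync itself. It assumes a single made-up hint bit at a made-up offset (HINT_OFFSET and HINT_MASK are hypothetical) and uses an MD5 digest of one page as a stand-in for a checksum-based comparison. It only shows that two pages holding the same logical data compare as different once one server has set a hint bit.

```python
import hashlib

PAGE_SIZE = 8192     # PostgreSQL's default block size
HINT_OFFSET = 100    # hypothetical byte that holds a hint flag (illustration only)
HINT_MASK = 0x01     # hypothetical hint bit within that byte


def checksum(page: bytes) -> str:
    """Digest of the whole page, standing in for a checksum-based comparison."""
    return hashlib.md5(page).hexdigest()


def strip_hint(page: bytes) -> bytes:
    """Return a copy of the page with the hypothetical hint bit cleared."""
    cleaned = bytearray(page)
    cleaned[HINT_OFFSET] &= ~HINT_MASK & 0xFF
    return bytes(cleaned)


# The same logical page as it might exist on two servers.
master_page = bytes(PAGE_SIZE)
standby_copy = bytearray(master_page)
standby_copy[HINT_OFFSET] |= HINT_MASK   # only one server has set the hint bit
standby_page = bytes(standby_copy)

# The raw bytes differ, so a checksum comparison reports a mismatch ...
pages_differ_by_checksum = checksum(master_page) != checksum(standby_page)
# ... although the pages are identical once the hint bit is ignored.
pages_equal_ignoring_hint = strip_hint(master_page) == strip_hint(standby_page)
```

In this toy model the checksum comparison flags the page as changed even though no user-visible data differs.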

*Heikki*
# Use pg_rewind to do the same - It has the well-known problem of hint bit updates.
To use it we would need to enable checksums or explicitly WAL-log hint bits, which leads to performance overhead.

*Amit Kapila*
# How to take care of extra WAL on the old master during recovery? - We can solve this by deleting all WAL files on the old master before it starts as the new standby.
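
The cleanup step above could be sketched roughly as follows. This is only an illustration under stated assumptions, not part of the patch: the function name remove_wal_segments and its dry_run flag are hypothetical, the caller is assumed to pass the WAL directory of a stopped old master (pg_xlog in releases of this era), and only files whose names look like WAL segments (24 uppercase hex characters) are touched, so timeline history files are left alone.

```python
import re
from pathlib import Path
from typing import List

# WAL segment file names consist of 24 uppercase hexadecimal characters.
WAL_SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def remove_wal_segments(wal_dir: Path, dry_run: bool = True) -> List[str]:
    """Return the WAL segment names found in wal_dir, deleting them unless dry_run.

    Hypothetical helper for illustration; wal_dir is assumed to belong to a
    server that has already been shut down.
    """
    matched = []
    for entry in sorted(wal_dir.iterdir()):
        if entry.is_file() and WAL_SEGMENT_NAME.match(entry.name):
            if not dry_run:
                entry.unlink()
            matched.append(entry.name)
    return matched
```

With the default dry_run=True the function only reports what it would delete, which seems the safer starting point when experimenting.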

*Simon Riggs*
# Renaming patch - done
# remove extra set of parameters - done
# Performance drop - On average the patch causes a 0.5 - 1% performance overhead for a fast-transaction workload, as the wait is mostly in the backend process.
# The way of configuring the standby - With the synchronous_transfer parameter we can configure 4 types of standby servers, depending on the need.

*Fujii Masao*
# How the patch interacts with cascaded standbys - The patch works the same way as synchronous replication.
# CHECKPOINT on the standby got stuck indefinitely - Fixed.
# Complicated conditions in SyncRepWaitForLSN() - The code has been refactored in v11.
# Improve source code comments - done

*Pavan Deolasee*
# Interaction of synchronous_commit with synchronous_transfer - synchronous_commit now controls only whether and how
to wait for the standby when a transaction commits. synchronous_transfer, OTOH, tells how to interpret the standbys listed in
the synchronous_standby_names parameter.
# Further improvements in the documentation - We will do that.
# More stress testing - We will do that.

Any inputs on stress testing would help.
