Re: Synchronous Log Shipping Replication - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Synchronous Log Shipping Replication
Msg-id 1221035107.3913.591.camel@ebony.2ndQuadrant
In response to Re: Synchronous Log Shipping Replication  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Synchronous Log Shipping Replication  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Wed, 2008-09-10 at 11:10 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2008-09-10 at 13:28 +0900, Fujii Masao wrote:
> >> On Tue, Sep 9, 2008 at 8:38 PM, Heikki Linnakangas
> >> <heikki.linnakangas@enterprisedb.com> wrote:
> >>> There's one thing I haven't figured out in this discussion. Does the write
> >>> to the disk happen before or after the write to the slave? Can you guarantee
> >>> that if a transaction is committed in the master, it's also committed in the
> >>> slave, or vice versa?
> > 
> > The write happens concurrently and independently on both.
> > 
> > Yes, you wait for the write *and* send pointer to be "flushed" before
> > you allow a synch commit with synch replication. (Definition of flushed
> > is changeable by parameters).
> 
> The thing that bothers me is the behavior when the synchronous slave 
> doesn't respond. A timeout has been discussed, after which the master 
> just gives up on sending, and starts acting as if there's no slave. 
> How's that different from asynchronous mode where WAL is sent to the 
> server concurrently when it's flushed to disk, but we don't wait for the 
> send to finish? ISTM that in both cases the only guarantee we can give 
> is that when a transaction is acknowledged as committed, it's committed 
> in the master but not necessarily in the slave.

We should differentiate between what the WALsender does and what the
user does in response to a network timeout.

Saying "I want to wait for a synchronous commit and I am willing to wait
forever to ensure it" leads to long hangs in some cases.

I was suggesting that some users may wish to wait up to time X before
the commit returns to them. The WALsender may keep retrying long after
that point, but that doesn't mean all current users need to keep waiting
as well. The user would need to say whether the timeout should be
reported as an error, or simply accepted so the session can get on with it.
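
To make the distinction concrete, here is a rough sketch of the user-side
wait only. It is an illustration in Python, not proposed PostgreSQL code:
the names wait_for_standby_ack, max_wait_s and on_timeout are hypothetical,
and the threading.Event simply stands in for "the standby has acknowledged
the WAL for this commit". The sender is deliberately not touched here, so it
remains free to keep retrying after the user has stopped waiting.

```python
import threading


class ReplicationTimeout(Exception):
    """The standby did not acknowledge within the user's wait limit."""


def wait_for_standby_ack(ack_event, max_wait_s, on_timeout="error"):
    """Wait up to max_wait_s seconds for the standby's acknowledgement.

    ack_event is set by whatever plays the role of the WAL sender once the
    standby confirms receipt. Returns True if the acknowledgement arrived
    in time, or False if the wait expired and the policy is "continue".
    Raises ReplicationTimeout if the wait expired and the policy is "error".
    This function never stops the sender; it only bounds the user's wait.
    """
    if on_timeout not in ("error", "continue"):
        raise ValueError("on_timeout must be 'error' or 'continue'")

    # Event.wait returns True if the event was set before the timeout.
    if ack_event.wait(timeout=max_wait_s):
        return True

    if on_timeout == "error":
        raise ReplicationTimeout(
            "no acknowledgement from standby within %.3f s" % max_wait_s
        )

    # Policy "continue": accept the local commit and get on with it.
    return False
```

A session that prefers availability would call this with
on_timeout="continue" and treat False as "committed locally, standby not
yet confirmed"; a session that prefers the stronger guarantee would leave
the default and handle the exception.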

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


