Re: Synchronous Log Shipping Replication - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Synchronous Log Shipping Replication
Date
Msg-id 48C66019.40400@enterprisedb.com
In response to Re: Synchronous Log Shipping Replication  ("Fujii Masao" <masao.fujii@gmail.com>)
Responses Re: Synchronous Log Shipping Replication  ("Fujii Masao" <masao.fujii@gmail.com>)
List pgsql-hackers
Fujii Masao wrote:
> What makes the sender process bottleneck?

The keyword here is "might". There are many possibilities, for example:
- A slow network.
- A ridiculously fast WAL disk on the master, like a RAM disk. If you have a
synchronous slave you can fail over to, putting WAL on a RAM disk isn't that crazy.
- A slower WAL disk on the slave.
etc.

>> Backends then wait
>> * not at all for asynch commit
>> * just for Write for local synch commit
>> * for both Write and Send for remote synch commit
>> (various additional options for what happens to confirm Send)
> 
> I'd like to introduce a new parameter "synchronous_replication" which specifies
> whether backends wait for the response from the WAL sender process. By
> combining synchronous_commit and synchronous_replication, users can
> choose various options.
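
For illustration only, here is one way the combinations could map onto the wait list quoted above. This is a sketch under assumptions: the names Wait and commit_waits are hypothetical, and treating the two parameters as independent on/off switches is my reading of the proposal, not something the patch is known to do.

```python
from enum import Flag


class Wait(Flag):
    """What a backend waits for at commit (hypothetical model)."""
    NONE = 0
    WRITE = 1  # local WAL write has completed
    SEND = 2   # WAL sender has reported a response from the slave


def commit_waits(synchronous_commit: bool, synchronous_replication: bool) -> Wait:
    """Combine the two settings into the set of events a backend waits for."""
    waits = Wait.NONE
    if synchronous_commit:
        waits |= Wait.WRITE
    if synchronous_replication:
        waits |= Wait.SEND
    return waits
```

Under this reading, off/off is the asynch commit case, on/off is local synch commit, and on/on is remote synch commit. The off/on combination (wait for Send only) is not in the quoted list, and whether it should be allowed is open.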

There's one thing I haven't figured out in this discussion. Does the 
local write to disk on the master happen before or after the WAL is sent 
to the slave? Can you guarantee that if a transaction is committed in the 
master, it's also committed in the slave, or vice versa?
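
To make the question concrete, here is a toy model of the two orderings. It is a deliberately simplified sketch: the function commit and the step names are hypothetical, each step is assumed to be atomic, and "send_to_slave" is assumed to mean the record is safely on the slave.

```python
def commit(order, crash_after_first_step):
    """Return (on_master, on_slave): where the commit record ended up.

    order is a tuple of the two steps, in the order the master performs them.
    If crash_after_first_step is true, the master dies between the steps.
    """
    done = {"local_write": False, "send_to_slave": False}
    for i, step in enumerate(order):
        done[step] = True
        if crash_after_first_step and i == 0:
            break  # master crashes before the second step runs
    return done["local_write"], done["send_to_slave"]
```

With the local write first, a crash in between leaves the commit on the master but not on the slave; with the send first, it is the other way around. Neither ordering alone gives the guarantee in both directions.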

>> Another thought occurs that we might measure the time a Send takes and
>> specify a limit on how long we are prepared to wait for confirmation.
>> Limit=0 => asynchronous. Limit > 0 implies synchronous-up-to-the-limit.
>> This would give better user behaviour across a highly variable network
>> connection.
> 
> From the viewpoint of detecting a network failure, this feature is necessary.
> When the network goes down, the WAL sender can be blocked until it detects
> the network failure, i.e. the WAL sender keeps waiting for a response which
> never comes. A timeout notification is necessary in order to detect a
> network failure quickly.

Agreed. But what happens if you hit that timeout? Should we enforce that 
timeout within the server, or should we leave that to the external 
heartbeat system?
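
A minimal sketch of the "synchronous up to the limit" wait, to show where that decision would sit. Everything here is hypothetical: wait_for_ack is not an existing function, and a threading.Event stands in for whatever mechanism would signal the slave's response.

```python
import threading


def wait_for_ack(ack: threading.Event, limit_seconds: float) -> str:
    """Wait for the slave's confirmation, but no longer than the limit.

    A limit of 0 means asynchronous: do not wait at all.
    """
    if limit_seconds == 0:
        return "async"
    # Event.wait returns True if the event was set before the timeout.
    if ack.wait(timeout=limit_seconds):
        return "confirmed"
    return "timed_out"
```

The sketch only reports "timed_out"; it does not decide what follows. The caller could carry on without the slave, or do nothing and let the external heartbeat system decide, which is exactly the open question.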

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

