Re: Synchronous Log Shipping Replication - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Synchronous Log Shipping Replication
Date
Msg-id 48C6613A.2000704@bluegap.ch
In response to Re: Synchronous Log Shipping Replication  ("Fujii Masao" <masao.fujii@gmail.com>)
Responses Re: Synchronous Log Shipping Replication  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Synchronous Log Shipping Replication  ("Fujii Masao" <masao.fujii@gmail.com>)
List pgsql-hackers
Hi,

Fujii Masao wrote:
> Really? In the benchmark result of my prototype, the bottleneck is
> still disk I/O.
> The communication (between the master and the slave) latency is smaller than
> WAL writing (fsyncing) one. Of course, I assume that we use not-poor network
> like 1000BASE-T.

Sure. If you send WAL to the standby and write WAL to the local disk in
parallel, and you wait for both, only the slower of the two is relevant.
If that happens to be the disk, you won't see any performance
degradation compared to standalone operation.

If you want the standby to confirm having written (and flushed) the WAL 
to disk as well, that can't possibly be faster than the active node's 
local disk (assuming equally fast and busy disk subsystems).
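
To make the latency argument concrete, here is a minimal sketch in Python.
The function name and the millisecond figures are made up for
illustration; they are not measurements from any prototype. It assumes
the backend waits for both the local WAL flush and the standby's
response, and that the two run in parallel.

```python
def commit_latency_ms(local_fsync_ms, network_rtt_ms, standby_fsync_ms=None):
    """Commit latency when WAL sending and local WAL writing run in
    parallel and the backend waits for both (hypothetical model).

    standby_fsync_ms=None: the standby acknowledges on receipt.
    Otherwise the standby flushes the WAL to disk before acknowledging.
    """
    remote_ms = network_rtt_ms
    if standby_fsync_ms is not None:
        remote_ms += standby_fsync_ms
    # Only the slower of the two parallel paths matters.
    return max(local_fsync_ms, remote_ms)


# Illustrative numbers only: 5 ms fsync, 0.2 ms network round trip.
receipt_only = commit_latency_ms(5.0, 0.2)         # disk dominates
standby_flush = commit_latency_ms(5.0, 0.2, 5.0)   # round trip plus remote fsync
```

With equally fast disks on both nodes, the second case is always at least
the local fsync time plus the round trip, so it cannot beat the local disk.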

> I'd like to introduce new parameter "synchronous_replication" which specifies
> whether backends waits for the response from WAL sender process. By
> combining synchronous_commit and synchronous_replication, users can
> choose various options.

Various config options have already been proposed. I personally don't
think that helps us much. Instead, I'd prefer to see prototype code or
at least concepts. We can juggle the GUC variable names and other
config options later on.

> In the viewpoint of detection of a network failure, this feature is necessary.
> When the network goes down, WAL sender can be blocked until it detects
> the network failure, i.e. WAL sender keeps waiting for the response which
> never comes. A timeout notification is necessary in order to detect a
> network failure soon.

That's one of the areas I find missing from the overall concept. I'm glad
it comes up. You certainly realize that such a timeout must be set high
enough so as not to trigger false alarms every now and then? Or do
you expect some sort of retry loop in case the link to the standby comes
up again? How about multiple standby servers?
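
To illustrate the timeout question, a minimal sketch in Python. The
function wait_for_ack and the one-byte acknowledgement are hypothetical
and not taken from any proposed patch; the sketch only shows a bounded
wait for the standby's response.

```python
import socket


def wait_for_ack(sock, timeout_s):
    """Return True if the standby acknowledged within timeout_s seconds.

    Hypothetical protocol: the standby sends one byte per acknowledged
    WAL record. False means no response in time, or a broken link.
    """
    sock.settimeout(timeout_s)
    try:
        # An empty read means the peer closed the connection.
        return sock.recv(1) != b""
    except OSError:
        # Covers the timeout as well as connection errors.
        return False
```

The open questions are what the caller does on False: give up on the
standby immediately, or retry for a while, and how large timeout_s has to
be so that a briefly busy standby is not mistaken for a dead link.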

Regards

Markus Wanner

