Re: Synchronous Standalone Master Redoux - Mailing list pgsql-hackers

From	Shaun Thomas
Subject	Re: Synchronous Standalone Master Redoux
Date	July 10, 2012 10:29:07
Msg-id	4FFC2E0E.2090509@optionshouse.com
In response to	Re: Synchronous Standalone Master Redoux (Daniel Farina <daniel@heroku.com>)
Responses	Re: Synchronous Standalone Master Redoux Re: Synchronous Standalone Master Redoux Re: Synchronous Standalone Master Redoux
List	pgsql-hackers

On 07/10/2012 01:11 AM, Daniel Farina wrote:

> So if I get this straight, what you are saying is "be asynchronous
> replication unless someone is around, in which case be synchronous"
> is the mode you want.

Er, no. I think I see where you might have gotten that, but no.

> This is a pretty tricky definition: consider if you bring a standby
> on-line from archive replay and it shows up in streaming with pretty
> high lag, and stops all commit traffic while it reaches the bounded
> window of what "acceptable" lag is. That sounds pretty terrible, too.
> How does DRBD handle this? It seems like the catchup phase might be
> interesting prior art.

Well, DRBD actually has a very definitive sync mode, and no 
"attenuation" is involved at all. Here's what a fully working cluster 
looks like, according to /proc/drbd:

cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate

Here's what happens when I disconnect the secondary:

cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown

So there are a few things here:

1. Primary is waiting for the secondary to reconnect.
2. It knows its own data is still up to date.
3. It's waiting to assess the secondary when it re-appears.
4. It's still capable of writing to the device.
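
For anyone who wants to check those fields from a script, here is a minimal sketch that picks apart the two status lines quoted above. The function names (parse_drbd_status, is_degraded) are made up for illustration, and it only handles the cs/ro/ds tokens shown here, not the full /proc/drbd format.

```python
def parse_drbd_status(line):
    """Split a status line into {field: (local, peer)}.

    Fields without a '/' (such as cs) get None for the peer half.
    """
    fields = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if not sep:
            continue  # not a key:value token, ignore it
        local, _, peer = value.partition("/")
        fields[key] = (local, peer or None)
    return fields


def is_degraded(fields):
    """True when the primary is running without a connected, up-to-date peer."""
    connected = fields["cs"][0] == "Connected"
    peer_current = fields["ds"][1] == "UpToDate"
    return not (connected and peer_current)


healthy = parse_drbd_status(
    "cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate")
degraded = parse_drbd_status(
    "cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown")
```

With the two lines from this mail, is_degraded(healthy) is False and is_degraded(degraded) is True, while the local half of ds stays UpToDate in both cases.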

This is more akin to degraded RAID-1. Writes are synchronous as long as 
two devices exist, but if one vanishes, you can still use the disk at 
your own risk. Checking the status of DRBD will show this readily. I 
also want to point out it is *fully* synchronous when both nodes are 
available. I.e., you can't even call a filesystem sync without the sync 
succeeding on both nodes.

When you re-connect a secondary device, it catches up as fast as 
possible by replaying waiting transactions, and then re-attaches to the 
cluster. Until it's fully caught up, it doesn't exist. DRBD acknowledges 
the secondary is there and attempting to catch up, but does not leave 
"degraded" mode until the secondary reaches "UpToDate" status.
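
To make the behavior I'm describing concrete, here is a toy model of it. This is only a sketch under my own assumptions: the Primary and Standby classes, the in-memory lists standing in for the write log, and the "sync"/"degraded" return values are all invented for illustration and are not how DRBD or PostgreSQL implement anything.

```python
class Standby:
    def __init__(self):
        self.log = []  # writes received so far


class Primary:
    def __init__(self):
        self.log = []         # every write accepted by the primary
        self.standby = None   # attached standby, or None
        self.degraded = True  # no in-sync standby yet

    def attach(self, standby):
        # The standby is visible now, but still counts as absent
        # for commit purposes until catch_up() brings it level.
        self.standby = standby

    def detach(self):
        # Standby vanished: keep accepting writes, but flag the risk.
        self.standby = None
        self.degraded = True

    def catch_up(self, batch=1):
        """Ship up to `batch` missing writes; leave degraded mode once level."""
        if self.standby is None:
            return
        missing = self.log[len(self.standby.log):]
        self.standby.log.extend(missing[:batch])
        if len(self.standby.log) == len(self.log):
            self.degraded = False

    def commit(self, record):
        """Accept a write and report which mode it was committed in."""
        self.log.append(record)
        if not self.degraded:
            # Fully synchronous: the standby has the write before we return.
            self.standby.log.append(record)
            return "sync"
        return "degraded"
```

The point of the model is that commit() never blocks: a missing or lagging standby only changes the reported mode, and the switch back to "sync" happens once catch_up() has shipped everything the standby missed.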

This is a much more graceful failure scenario than is currently possible 
with PostgreSQL. With DRBD, you'd still need a tool to notice the master 
node is in an invalid state and perform a failover, but the secondary 
going belly-up will not suddenly halt the master.

But I'm not even hoping for *that* level of functionality. I just want 
to be able to tell PostgreSQL to notice when the secondary becomes 
unavailable *on its own*, and then perform in "degraded non-sync mode", 
because PostgreSQL itself can react much faster than any external 
monitor I could possibly attach to perform the same function. I plan on 
using DRBD until either PG can do that, or a better alternative 
presents itself.

Async is simply too slow for our OLTP system except for the disaster 
recovery node, which isn't expected to carry on within seconds of the 
primary's failure. I briefly considered sync mode when it appeared as a 
feature, but I see it's still too early in its development cycle, 
because there are no degraded operation modes. That's fine, I'm willing 
to wait.

I just don't understand the push-back, I guess. RAID-1 is the poster 
child for synchronous writes for fault tolerance. It will whine 
constantly to anyone who will listen when operating only on one device, 
but at least it still works. I'm pretty sure nobody would use RAID-1 if 
its failure mode was: block writes until someone installs a replacement 
disk.

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com
