Synchronous Standalone Master Redoux - Mailing list pgsql-hackers

From Shaun Thomas
Subject Synchronous Standalone Master Redoux
Msg-id 4FFB3F49.4050108@optionshouse.com
List pgsql-hackers
Hey everyone,

Upon doing some usability tests with PostgreSQL 9.1 recently, I ran 
across this discussion:

http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php

And after reading the entire thing, I found it odd that the overriding 
pushback was because nobody could think of a use case. The argument was: 
if you don't care if the slave dies, why not just use asynchronous 
replication?

I'd like to introduce all of you to DRBD. DRBD is, for those who aren't 
familiar, distributed (network) block-level replication. Right now, this 
is what we're using, and will use in the future, to ensure a stable 
synchronous PostgreSQL copy on our backup node. I was excited to read 
about synchronous replication because it raised the possibility that we 
could have two readable nodes with the servers we already have. You 
can't do that with DRBD; secondary nodes can't even mount the device.

So here's your use case:

1. Slave wants to be synchronous with master. Master wants replication 
on at least one slave. They have this, and are happy.
2. For whatever reason, slave crashes or becomes unavailable.
3. Master notices no more slaves are available, and operates in 
standalone mode, accumulating WAL files until a suitable slave appears.
4. Slave finishes rebooting/rebuilding/upgrading/whatever, and 
re-subscribes to the feed.
5. Slave stays in degraded sync (asynchronous) mode until it is caught 
up, and then switches to synchronous. This makes both master and slave 
happy, because the *intent* of synchronous replication is fulfilled.
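
The five steps above can be read as a small state machine. What follows is 
only a toy sketch of that proposed behaviour, not PostgreSQL code: the 
names Mode, Master, slave_lost, slave_reconnected, slave_replayed and 
retained_wal are all hypothetical, and counting one unit of WAL per commit 
is a simplifying assumption.

```python
from enum import Enum


class Mode(Enum):
    SYNC = "sync"              # commits wait for the slave to confirm
    STANDALONE = "standalone"  # no slave; master commits locally, keeps WAL
    CATCHUP = "catchup"        # slave is back, replaying retained WAL (async)


class Master:
    """Toy model of the proposed behaviour (hypothetical, for illustration)."""

    def __init__(self):
        self.mode = Mode.SYNC
        self.retained_wal = 0  # units of WAL the slave has not yet received

    def slave_lost(self):
        # Steps 2-3: keep running in standalone mode instead of blocking.
        self.mode = Mode.STANDALONE

    def slave_reconnected(self):
        # Step 4: the slave re-subscribes to the feed.
        if self.mode is Mode.STANDALONE:
            self.mode = Mode.CATCHUP if self.retained_wal else Mode.SYNC

    def commit(self):
        # Returns True only when the commit waited for the slave.
        if self.mode is Mode.SYNC:
            return True
        self.retained_wal += 1  # standalone or catching up: WAL accumulates
        return False

    def slave_replayed(self, units):
        # Step 5: once the backlog is drained, switch back to synchronous.
        self.retained_wal = max(0, self.retained_wal - units)
        if self.mode is Mode.CATCHUP and self.retained_wal == 0:
            self.mode = Mode.SYNC
```

The point of the sketch is that commit() never blocks: a lost slave 
changes the mode, and synchronous behaviour resumes on its own once the 
backlog reaches zero.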

PostgreSQL's implementation means the master will block until 
someone/something notices and tells it to stop waiting, or the slave 
comes back. For pretty much any high-availability environment, this is 
not viable. Based on that alone, I can't imagine a scenario where 
synchronous replication would be considered beneficial.

The current setup doubles unplanned system outage scenarios in such a 
way that I'd never use it in a production environment. Right now, we only 
care if the master server dies. With sync rep, we'd have to watch both 
servers like a hawk and be ready to tell the master to disable sync rep, 
lest our 10k TPS system come to an absolute halt because the slave died.
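
To make the "watch both servers like a hawk" burden concrete, here is a 
rough sketch of the check an external watchdog would need to run. It 
assumes the rows come from the pg_stat_replication view (its state and 
sync_state columns) and that waiting is switched off by emptying 
synchronous_standby_names and reloading the configuration. The function 
names are hypothetical, and a real watchdog would also have to query the 
server and edit postgresql.conf itself.

```python
def sync_standby_present(replication_rows):
    """True if any connected standby is streaming as the synchronous one.

    replication_rows: list of dicts with 'state' and 'sync_state' keys,
    as they might be read from pg_stat_replication on the master.
    """
    return any(
        row["state"] == "streaming" and row["sync_state"] == "sync"
        for row in replication_rows
    )


def standalone_fallback_steps(replication_rows):
    """Return the manual steps needed to stop commits from blocking.

    Hypothetical helper: it only describes the steps, it does not run them.
    """
    if sync_standby_present(replication_rows):
        return []  # a synchronous standby is confirming commits; do nothing
    return [
        "set synchronous_standby_names = '' in postgresql.conf",
        "SELECT pg_reload_conf();",
    ]


# Example: the only standby has dropped off, so the view returns no rows.
steps = standalone_fallback_steps([])
```

Every one of these steps happens outside the database, which is exactly 
the extra failure mode described above.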

With DRBD, when a slave node goes offline, the master operates in 
standalone until the secondary re-appears, after which it 
re-synchronizes missing data, and then operates in sync mode afterwards. 
Just because the data is temporarily out of sync does *not* mean we want 
asynchronous replication. I think you'd be hard pressed to find many 
users taking advantage of DRBD's async mode. Just because data is 
temporarily catching up doesn't mean it will remain in that state.

I would *love* to have the functionality discussed in the patch. If I 
can make a case for it, I might even be able to convince my company to 
sponsor its addition, provided someone has time to integrate it. Right 
now, we're using DRBD so we can have a very short outage window while 
the offline node gets promoted, and it works, but that means a basically 
idle server at all times. I'd gladly accept a 10-20% performance hit for 
sync rep if it meant that other server could reliably act as a read 
slave. That's currently impossible because async replication is too 
slow, and sync is too fragile for reasons stated above.

Am I totally off base here? I was shocked when I actually read the 
documentation on how sync rep worked, and saw that no servers would 
function properly until at least two were online.

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com

