Re: Synchronous Standalone Master Redoux - Mailing list pgsql-hackers

From	Shaun Thomas
Subject	Re: Synchronous Standalone Master Redoux
Date	July 11, 2012 13:41:59
Msg-id	4FFD8294.3050902@optionshouse.com
In response to	Re: Synchronous Standalone Master Redoux (Daniel Farina <daniel@heroku.com>)
List	pgsql-hackers

On 07/10/2012 06:02 PM, Daniel Farina wrote:

> For example, what if DRBD can only complete one page per second for
> some reason?  Does it simply have the primary wait at this glacial
> pace, or drop synchronous replication and go degraded?  Or does it do
> something more clever than just a timeout?

That's a good question, and way beyond what I know about the internals. 
:) In practice though, there are configurable thresholds, and if 
exceeded, it will invalidate the secondary. When using Pacemaker, we've 
actually had instances where the 10G link we had between the servers 
died, so each node thought the other was down. That led to the 
secondary node self-promoting and trying to steal the VIP from the 
primary. Throw in a gratuitous ARP, and you get a huge mess.

That led to what DRBD calls split-brain, because both nodes were 
running and writing to the block device. Thankfully, you can actually 
tell one node to discard its changes and re-subscribe. Doing that will 
replay the transactions from the "good" node on the "bad" one. And even 
then, it's a good idea to run an online verify to do a block-by-block 
checksum and correct any differences.
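
To make the block-by-block verify idea concrete, here is a minimal
sketch in Python. It is a toy model only, not how DRBD implements its
online verify: the "devices" are in-memory byte buffers, and the block
size, the SHA-1 digest, and the function names are all assumptions made
for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size for this toy model


def block_checksums(device: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-1 digest per fixed-size block of the buffer."""
    return [
        hashlib.sha1(device[off:off + block_size]).hexdigest()
        for off in range(0, len(device), block_size)
    ]


def verify_and_resync(good: bytes, bad: bytearray,
                      block_size: int = BLOCK_SIZE) -> list:
    """Compare checksums block by block and overwrite differing blocks
    on the "bad" copy with the data from the "good" copy.

    Returns the indexes of the blocks that had to be corrected.
    """
    if len(good) != len(bad):
        raise ValueError("both copies must be the same size")
    good_sums = block_checksums(good, block_size)
    bad_sums = block_checksums(bytes(bad), block_size)
    out_of_sync = [
        i for i, (g, b) in enumerate(zip(good_sums, bad_sums)) if g != b
    ]
    for i in out_of_sync:
        start = i * block_size
        bad[start:start + block_size] = good[start:start + block_size]
    return out_of_sync


if __name__ == "__main__":
    good = bytes(range(256)) * 64      # 16 KiB, i.e. four 4 KiB blocks
    bad = bytearray(good)
    bad[5000] ^= 0xFF                  # corrupt one byte in block 1
    bad[13000] ^= 0xFF                 # corrupt one byte in block 3
    print("corrected blocks:", verify_and_resync(good, bad))
```

The point of the sketch is that the comparison needs no knowledge of
what the blocks contain, which is why this kind of repair is natural at
the block layer.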

Of course, all of that's only possible because it's a block-level 
replication. I can't even imagine PG doing anything like that. It would 
have to know the last good transaction from the primary and do an 
implied point-in-time recovery to reach that state, then re-attach for 
sync commits.

> Regardless of what DRBD does, I think the problem with the
> async/sync duality as-is is there is no nice way to manage exposure
> to transaction loss under various situations and requirements.

Which would be handy. With synchronous commits, it's a given that the 
protocol is bi-directional. Then again, PG can detect when clients 
disconnect the instant they do so, and having such an event implicitly 
disable synchronous_standby_names until reconnect would be an easy fix. 
The database already keeps transaction logs, so replaying would still 
happen on re-attach. It could easily throw a warning for every 
sync-required commit so long as it's in "degraded" mode. Those alone are 
very small changes that don't really harm the intent of sync commit.

That's basically what a RAID-1 does, and people have been fine with that 
for decades.
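
The degraded-mode behavior proposed above can be sketched as a small
state machine. This is a hypothetical Python model, not PostgreSQL code
and not an existing PostgreSQL feature: the Primary class, its method
names, and the "sync"/"local" return values are invented for
illustration. The in-memory list stands in for the transaction log.

```python
import logging

log = logging.getLogger("sync_sketch")


class Primary:
    """Toy model of a primary with one synchronous standby (hypothetical)."""

    def __init__(self):
        self.wal = []                  # transaction log kept on the primary
        self.standby_connected = True
        self.standby_applied = 0       # log records the standby has received
        self.warnings = 0              # warnings issued while degraded

    @property
    def degraded(self):
        return not self.standby_connected

    def standby_disconnected(self):
        # Equivalent in spirit to implicitly disabling the sync requirement.
        self.standby_connected = False

    def standby_reattached(self):
        # Replay everything the standby missed, then resume sync commits.
        self.standby_applied = len(self.wal)
        self.standby_connected = True

    def commit(self, record):
        self.wal.append(record)
        if self.degraded:
            # Commit locally, but warn on every sync-required commit.
            self.warnings += 1
            log.warning("commit %r was not synchronously replicated", record)
            return "local"
        self.standby_applied = len(self.wal)
        return "sync"


if __name__ == "__main__":
    p = Primary()
    print(p.commit("tx1"))     # standby attached
    p.standby_disconnected()
    print(p.commit("tx2"))     # degraded: commits locally and warns
    p.standby_reattached()
    print(p.commit("tx3"))     # back to synchronous commits
```

As with a RAID-1 that has lost a member, writes keep succeeding while
degraded, and the missing copy is brought up to date from the log when
it comes back.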

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com
