Re: Standalone synchronous master - Mailing list pgsql-hackers
From | Jim Nasby |
---|---|
Subject | Re: Standalone synchronous master |
Date | |
Msg-id | 52CDF4E1.8000604@nasby.net |
In response to | Re: Standalone synchronous master (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Standalone synchronous master
|
List | pgsql-hackers |
On 1/8/14, 6:05 PM, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> On 01/08/2014 03:27 PM, Tom Lane wrote:
>>> What we lack, and should work on, is a way for sync mode to have M larger
>>> than one. AFAICS, right now we'll report commit as soon as there's one
>>> up-to-date replica, and some high-reliability cases are going to want
>>> more.
>
>> "Sync N times" is really just a guarantee against data loss as long as
>> you lose N-1 servers or fewer. And it becomes an even
>> lower-availability solution if you don't have at least N+1 replicas.
>> For that reason, I'd like to see some realistic actual user demand
>> before we take the idea seriously.
>
> Sure. I wasn't volunteering to implement it, just saying that what
> we've got now is not designed to guarantee data survival across failure
> of more than one server. Changing things around the margins isn't
> going to improve such scenarios very much.
>
> It struck me after re-reading your example scenario that the most
> likely way to figure out what you had left would be to see if some
> additional system (think Nagios monitor, or monitors) had records
> of when the various database servers went down. This might be
> what you were getting at when you said "logging", but the key point
> is it has to be logging done on an external server that could survive
> failure of the database server. postmaster.log ain't gonna do it.

Yeah, and I think that the logging command that was suggested allows for that *if configured correctly*.

Automatic degradation to async is useful for protecting you against all modes of a single failure: Master fails, you've got the replica. Replica fails, you've got the master.

But fit hits the shan as soon as you get a double failure, and that double failure can be very subtle. Josh's case is not subtle: You lost power AND the master died. You KNOW you have two failures.

But what happens if there's a network blip that's not large enough to notice (but large enough to degrade your replication) and the master dies? Now you have no clue if you've lost data.

Compare this to async: if the master goes down (one failure), you have zero clue if you lost data or not. At least with auto-degradation you know you have to have 2 failures to suffer data loss.
--
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net
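
The failure-counting argument in this message can be sketched as a toy model. This is only an illustration under stated assumptions: the mode names, event names, and the function `may_have_lost_data` are hypothetical and are not PostgreSQL settings or APIs. A "replication_degraded" event stands for anything that stops the replica from keeping up (replica failure, or a network blip nobody noticed).

```python
def may_have_lost_data(mode, events):
    """Return True if acknowledged commits might be lost after `events`.

    Hypothetical toy model, not PostgreSQL behavior:
      mode   -- "async" or "sync_auto_degrade"
      events -- ordered list of "replication_degraded" / "master_died"
    """
    if mode not in ("async", "sync_auto_degrade"):
        raise ValueError(f"unknown mode: {mode}")

    # In async mode the master never waits for the replica, so the replica
    # may be behind at any moment. In sync mode with auto-degradation the
    # replica is behind only after replication has degraded.
    replica_may_lag = (mode == "async")

    for event in events:
        if event == "replication_degraded":
            replica_may_lag = True
        elif event == "master_died":
            # Data is at risk only if the surviving replica may be behind.
            return replica_may_lag
        else:
            raise ValueError(f"unknown event: {event}")
    return False  # master still alive, nothing lost


# One failure is enough to leave you guessing in async mode.
print(may_have_lost_data("async", ["master_died"]))
# With auto-degradation, a single master failure is safe...
print(may_have_lost_data("sync_auto_degrade", ["master_died"]))
# ...but an unnoticed blip followed by a master failure is not.
print(may_have_lost_data("sync_auto_degrade",
                         ["replication_degraded", "master_died"]))
```

The model deliberately ignores whether anyone recorded the "replication_degraded" event. That record is exactly what the external logging discussed in the quoted text would have to supply.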