Re: Sync Rep Design - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: Sync Rep Design
Date
Msg-id 4D204367.4030308@kaltenbrunner.cc
In response to Re: Sync Rep Design  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On 01/02/2011 09:35 AM, Heikki Linnakangas wrote:
> On 02.01.2011 00:40, Josh Berkus wrote:
>> On 1/1/11 5:59 AM, Stefan Kaltenbrunner wrote:
>>> well you keep saying that but to be honest I cannot really even see a
>>> usecase for me - what is "only a random one of a set of servers is sync
>>> at any time and I don't really know which one".
>>> My use cases would all involve 2 sync standbys and 1 or more async ones,
>>> but the second sync one would be in a different datacenter and I NEED to
>>> protect against a datacenter failure, which your proposal says I cannot
>>> do :(
>>
>> As far as I know, *nobody* has written the bookkeeping code to actually
>> track which standbys have ack'd. We need to get single-ack synch
>> standby merged, tested and working before we add anything as complicated
>> as "each standby on this list must ack". That means that it's extremely
>> unlikely for 9.1 at this point.
>
> The bookkeeping will presumably consist of an XLogRecPtr in shared
> memory for each standby, tracking how far the standby has acknowledged.
> At commit, you scan the standby slots in shared memory and check that
> the required standbys have acknowledged your commit record. The
> bookkeeping required is the same whether or not we support a list of
> standbys that must ack or just one.
>
>> Frankly, if Simon hadn't already submitted code, I'd be pushing for
>> single-standby-only for 9.1, instead of "any one".
>
> Yes, we are awfully late, but let's not panic.
>
> BTW, there's a bunch of replication related stuff that we should work to
> close, that are IMHO more important than synchronous replication. Like
> making the standby follow timeline changes, to make failovers smoother,
> and the facility to stream a base-backup over the wire. I wish someone
> worked on those...

yeah, I agree that those two are much more of a problem for the general
user base. Whatever people think about our current system - it is very
easy to configure (in terms of knobs to toggle) but extremely hard to get
set up and to deal with during failovers (and I know nobody who got it
right the first few times or has not fucked up one thing in the process).
Syncrep is important but I would argue that getting those two fixed
is even more so ;)
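
On the bookkeeping Heikki describes above: a rough sketch of the idea,
only to illustrate why the same per-standby state can serve both "any one
standby must ack" and "each standby on this list must ack". This is a toy
model in Python, not PostgreSQL code. The class name, the method names and
the standby names are all made up for illustration, and a plain integer
stands in for an XLogRecPtr.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class StandbySlots:
    """Toy stand-in for the per-standby slots in shared memory (hypothetical)."""

    # standby name -> highest WAL position that standby has acknowledged.
    # A plain int is an assumption standing in for an XLogRecPtr.
    acked: Dict[str, int] = field(default_factory=dict)

    def record_ack(self, standby: str, lsn: int) -> None:
        # Acknowledged positions only ever move forward.
        self.acked[standby] = max(self.acked.get(standby, 0), lsn)

    def all_acked(self, commit_lsn: int, required: Iterable[str]) -> bool:
        # "Each standby on this list must ack": scan the slots and check
        # that every required standby has reached the commit record.
        return all(self.acked.get(name, 0) >= commit_lsn for name in required)

    def any_acked(self, commit_lsn: int) -> bool:
        # "Any one": the same slots, with a weaker condition.
        return any(lsn >= commit_lsn for lsn in self.acked.values())
```

With one standby per datacenter, a commit at position 100 would pass
any_acked() as soon as either standby has acknowledged it, but pass
all_acked() only once both have. The state kept per standby is identical
in both cases; only the check at commit time differs.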



Stefan

