Re: sync rep design architecture (was "disposition of remaining patches") - Mailing list pgsql-hackers

From Greg Smith
Subject Re: sync rep design architecture (was "disposition of remaining patches")
Msg-id 4D67DB7B.8030801@2ndquadrant.com
In response to Re: disposition of remaining patches  (Daniel Farina <daniel@heroku.com>)
Responses Re: sync rep design architecture (was "disposition of remaining patches")
List pgsql-hackers
Daniel Farina wrote:
> Server A syncreps to Server B
>
> Now I want to provision server A-prime, which will eventually take the
> place of A.
>
> Server A syncreps to Server B
> Server A syncreps to Server A-prime
>
> Right now, as it stands, the syncrep patch will be happy as soon as
> the data has been fsynced to either B or A-prime; I don't think we can
> guarantee at any point that A-prime can become the leader, and feed B.
>   

One of the very fundamental breaks between how this patch implements 
sync rep and what some people might expect is this concern.  Having such 
tight control over the exact order of failover isn't quite here yet, so 
sometimes people will need to be creative to work within the 
restrictions of what is available.  The path for this case is probably:

1) Wait until A' is caught up.
2) Switch over to B as the new master, with A' as its standby and A 
going off-line at the same time.
3) Switch the master role over from B to A'.  Bring up B as its standby. 

There are other possible transition plans available too.
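
As a rough illustration of the three steps above, here is a toy model of 
the two-stage switchover.  It is only a sketch: the Cluster class, the 
switchover() method, and replace_master() are hypothetical names made up 
for this example, and nothing here corresponds to an actual PostgreSQL 
command or to the sync rep patch itself.  It just tracks which node holds 
which role after each step.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of one master and its standbys (all names hypothetical)."""
    master: str
    standbys: set = field(default_factory=set)
    caught_up: set = field(default_factory=set)
    offline: set = field(default_factory=set)

    def switchover(self, new_master, retire_old=False):
        """Promote a caught-up standby; retire or demote the old master."""
        if new_master not in self.standbys:
            raise ValueError(f"{new_master} is not a standby of {self.master}")
        if new_master not in self.caught_up:
            raise RuntimeError(f"{new_master} has not caught up yet")
        old = self.master
        self.standbys.remove(new_master)
        self.master = new_master
        if retire_old:
            self.offline.add(old)
        else:
            # Assumption: the old master rejoins cleanly as a standby.
            self.standbys.add(old)
            self.caught_up.add(old)


def replace_master(cluster, interim, replacement):
    """Run the plan: old master -> interim -> replacement."""
    history = []
    # Step 1: do not start until the replacement has caught up.
    if replacement not in cluster.caught_up:
        raise RuntimeError(f"{replacement} has not caught up yet")
    # Step 2: the interim node becomes master; the old master goes off-line.
    cluster.switchover(interim, retire_old=True)
    history.append(cluster.master)
    # Step 3: the replacement becomes master; the interim node is its standby.
    cluster.switchover(replacement)
    history.append(cluster.master)
    return history


if __name__ == "__main__":
    c = Cluster(master="A", standbys={"B", "A'"}, caught_up={"B", "A'"})
    print(replace_master(c, interim="B", replacement="A'"))
    print(c)
```

The middle entry of the returned history is the window in which B, the 
possibly inferior standby, is acting as master.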

I appreciate that you would like to do this as an atomic operation, 
rather than handling it as two steps--one of which puts you in a middle 
point where B, a possibly inferior standby, is operating as the master.  
There are a dozen other complicated "my use case says I want <X> and it 
must be done as <Y>" requests for Sync Rep floating around here, too.  
They're all getting ignored in favor of something smaller that can get 
built today. 

The first question I'd ask is whether you could settle for this 
switchover plan for now, even though it's more cumbersome than you'd 
prefer.  The second is whether implementing what this feature currently 
does would get in the way of coding what you really want eventually. 

I didn't get the Streaming Rep + Hot Standby features I wanted in 9.0 
either.  But committing what was reasonable to include in that version 
let me march forward with very useful new code, doing another year of 
development on my own projects and getting some new things fixed in 
core.  And so far it looks like 9.1 will sort out all of the kinks I was 
unhappy about.  The same sort of thing will need to happen to get Sync 
Rep committed and then appropriate for more use cases.  There isn't any 
margin left for discussions of scope creep here; really it's "is this 
subset useful for some situations and stable enough to commit" now.

> 2. The unprivileged user can disable syncrep, in any situation. This
> flexibility is *great*, but you don't really want people to do it when
> one is performing the switchover.

For the moment you may have to live with a situation where user 
connections must be blocked during the brief moment of switchover to 
eliminate this issue.  That's what I end up doing with 9.0 production 
systems to get a really clean switchover; there's a second of hiccup 
even in the best case.  I'm not sure yet of the best way to build a UI 
to make that more transparent in the sync rep case.  It's sure not a 
problem that's going to get solved in this release though.
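
One way to picture "blocking user connections during the brief moment of 
switchover" is a gate in front of whatever hands out connections.  The 
sketch below is purely illustrative and assumes the blocking happens in 
application or proxy code; ConnectionGate and its methods are invented 
for this example and are not part of PostgreSQL or of the sync rep patch.

```python
import threading


class ConnectionGate:
    """Hypothetical gate that holds new connections while a switchover runs."""

    def __init__(self):
        self._open = threading.Event()
        self._open.set()  # connections are admitted by default

    def pause(self):
        """Call just before the switchover starts."""
        self._open.clear()

    def resume(self):
        """Call once the new master is accepting writes."""
        self._open.set()

    def admit(self, timeout=None):
        """Block until the gate is open; return False if the wait timed out."""
        return self._open.wait(timeout)
```

A caller would check admit() before opening a session, so that no 
unprivileged user gets the chance to turn off sync rep in the middle of 
the role change.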

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us