On Fri, 2010-10-08 at 16:34 -0400, Greg Smith wrote:
> Tom Lane wrote:
> > How are you going to "mark the standby as degraded"? The
> > standby can't keep that information, because it's not even connected
> > when the master makes the decision.
>
> From a high level, I'm assuming only that the master has a list in
> memory of the standby system(s) it believes are up to date, and that it
> is supposed to commit to synchronously. When I say mark as degraded, I
> mean that the master merely closes whatever communications channel it
> had open with that system and removes the standby from that list.

My current coding works with two sets of parameters:

The "master marks standby as degraded" part is handled by the TCP
keepalives. When the master notices no response, it kicks out the
standby. We already had this, so I never mentioned it before as being
part of the solution.

The second part is synchronous_replication_timeout, a user-settable
parameter defining how long the app is prepared to wait, which could be
longer or shorter than the keepalive timeout.

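
As a rough illustration of how the two timeouts relate, here is a minimal
Python sketch. It is not the actual patch: the class, the method names and
the single keepalive_timeout value are hypothetical, and only
synchronous_replication_timeout is a name taken from this mail. What the
master does after a commit wait times out is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Master:
    # Hypothetical stand-in for the effect of the TCP keepalive settings:
    # seconds of silence after which a standby is kicked out.
    keepalive_timeout: float
    # Name from the mail: how long the app is prepared to wait for a commit.
    synchronous_replication_timeout: float
    # In-memory list of standbys believed to be up to date,
    # mapped to the time we last heard from each one.
    last_heard: Dict[str, float] = field(default_factory=dict)

    def heard_from(self, standby: str, now: float) -> None:
        """Record a response from a standby at time `now`."""
        self.last_heard[standby] = now

    def drop_silent_standbys(self, now: float) -> List[str]:
        """Keepalive side: remove standbys that have been silent too long."""
        dropped = [name for name, seen in self.last_heard.items()
                   if now - seen > self.keepalive_timeout]
        for name in dropped:
            del self.last_heard[name]
        return dropped

    def commit_wait(self, ack_after: Optional[float]) -> str:
        """Commit side: `ack_after` is the delay until the standby's
        acknowledgement arrives, or None if it never arrives."""
        if not self.last_heard:
            return "no_sync_standby"
        if (ack_after is not None
                and ack_after <= self.synchronous_replication_timeout):
            return "acked"
        return "timed_out"
```

The two limits are independent in this sketch: a commit can time out while
the standby is still on the list, and a standby can be dropped without any
commit having waited on it.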
> If that standby now reconnects again, I don't see how resolving what
> happens at that point is any different from when a standby is first
> started after both systems were turned off. If the standby is current
> with the data available on the master when it has an initial
> conversation, great; it's now available for synchronous commit too
> then. If it's not, it goes into a catchup mode first instead. When the
> master sees you're back to current again, if you're on the list of sync
> servers too you go back onto the list of active sync systems.
>
> There shouldn't be any state information to save here. If the master
> and standby can't figure out if they are in or out of sync with one
> another based on the conversation they have when they first connect to
> one another, that suggests to me there need to be improvements made in
> the communications protocol they use to exchange messages.
Agreed.
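
To make the "no saved state" point concrete, here is a small Python sketch
of the reconnect logic Greg describes. It assumes WAL positions can be
compared as plain integers, and SyncTracker and its methods are
hypothetical names, not anything in the current code.

```python
class SyncTracker:
    """Master-side view, rebuilt purely from what happens at connect time."""

    def __init__(self, configured_sync):
        # Standbys that are supposed to be synchronous (the configured list).
        self.configured_sync = set(configured_sync)
        # Standbys currently being committed to synchronously.
        self.active_sync = set()
        # Standbys connected but still behind the master.
        self.catching_up = set()

    def connect(self, name, master_pos, standby_pos):
        """Initial conversation: decide the standby's state from positions."""
        if standby_pos >= master_pos:
            self._mark_current(name)
            return "current"
        self.catching_up.add(name)
        return "catchup"

    def caught_up(self, name):
        """Called when the master sees a catching-up standby is current."""
        self.catching_up.discard(name)
        self._mark_current(name)

    def disconnect(self, name):
        """Degraded: close the channel and forget the standby."""
        self.active_sync.discard(name)
        self.catching_up.discard(name)

    def _mark_current(self, name):
        # Only standbys on the configured list become active sync standbys.
        if name in self.configured_sync:
            self.active_sync.add(name)
```

A reconnecting standby goes through connect() exactly like one started for
the first time; nothing about its earlier session is kept in this sketch.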
-- 
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services