On Sat, 2010-06-19 at 14:53 -0400, Robert Haas wrote:
> On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark <gsstark@mit.edu> wrote:
> > On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> 4. Streaming Replication needs to detect death of master. We need
> >> some sort of keep-alive, here. Whether it's at the TCP level (as
> >> advocated by Tom Lane and others) or at the protocol level (as
> >> advocated by Greg Stark) is something that we have yet to decide; once
> >> it's decided, someone will need to do it...
> >
> > This sounds like a useful feature but I don't see why it's not 9.1
> > material. The status quo is that the expected usage pattern is manual
> > failover. As long as the slave responds to manual intervention when in
> > this state I don't think this is a blocking issue. Monitoring and
> > automatic failover are clearly things we plan to add features to
> > handle better in the future.
>
> Right now, if the SR master reboots unexpectedly (say, power plug pull
> and restart), the slave never notices. It just sits there forever
> waiting for the next byte of data from the master to arrive (which it
> never will). You have to manually restart the server or hit
> walreceiver with a SIGTERM to get it to start streaming again. I
> guess we could decide we're just not going to deal with that, but it
> seems like a fairly large misfeature to me.

Are you saying it doesn't respond to a trigger file at any point? That
would be a problem.

Sounds like we should have a pg_restart_walreceiver() function. We
shouldn't be encouraging people to send signals to backends; it's too
easy to get wrong.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
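
For readers following the keep-alive question in item 4, here is a minimal sketch of what the TCP-level option could look like on a client socket. It is not taken from walreceiver or libpq; the helper name enable_tcp_keepalive and the timing values are assumptions chosen for illustration, and the per-socket tuning options are platform-specific, so the sketch only sets those the platform exposes.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive probes for sock (hypothetical helper).

    Returns a dict of the options that were actually applied, since the
    tuning knobs below are not available on every platform.
    """
    applied = {}

    # SO_KEEPALIVE itself is widely available; it asks the kernel to probe
    # an idle connection instead of waiting forever for the next byte.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied["SO_KEEPALIVE"] = 1

    # Illustrative values: start probing after `idle` seconds of silence,
    # probe every `interval` seconds, give up after `count` missed probes.
    tuning = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)  # missing on some platforms
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value

    return applied
```

With the example values above, and on a platform that honours all three tuning options, a vanished peer would be reported as a socket error after roughly idle + interval * count = 110 seconds rather than never. A protocol-level keep-alive would instead exchange explicit messages inside the replication stream; that alternative is not sketched here.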