Re: Re: Hot Standby query cancellation and Streaming Replication integration - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Re: Hot Standby query cancellation and Streaming Replication integration
Date
Msg-id 201003021830.o22IUFW28509@momjian.us
In response to Re: Re: Hot Standby query cancellation and Streaming Replication integration  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: Re: Hot Standby query cancellation and Streaming Replication integration
Re: Re: Hot Standby query cancellation and Streaming Replication integration
List pgsql-hackers
Greg Smith wrote:
> > I assumed they would set max_standby_delay = -1 and be happy.
> >   
> 
> The admin in this situation might be happy until the first time the 
> primary fails and a failover is forced, at which point there is an 
> unbounded amount of recovery data to apply that was stuck waiting behind 
> whatever long-running queries were active.  I don't know if you've ever 
> watched what happens to a pre-8.2 cold standby when you start it up with 
> hundreds or thousands of backed up WAL files to process before the 
> server can start, but it's not a fast process.  I watched a production 
> 8.1 standby get >4000 files behind once due to an archive_command bug, 
> and it's not something I'd like to ever chew my nails off to again.  If 
> your goal was HA and you're trying to bring up the standby, the server 
> is down the whole time that's going on.
> 
> This is why no admin who prioritizes HA would consider 
> 'max_standby_delay = -1' a reasonable setting, and those are the sort of 
> users Joachim's example was discussing.  Only takes one rogue query that 
> runs for a long time to make the standby so far behind it's useless for 
> HA purposes.  And you also have to ask yourself "if recovery is halted 
> while waiting for this query to run, how stale is the data on the 
> standby getting?".  That's true for any large setting for this 
> parameter, but using -1 for the unlimited setting also gives the maximum 
> possible potential for such staleness.
> 
> 'max_standby_delay = -1' is really only a reasonable idea if you are 
> absolutely certain all queries are going to be short, which we can't 
> dismiss as an unfounded use case so it has value.  I would expect you 
> have to also combine it with a matching reasonable statement_timeout to 
> enforce that expectation to make that situation safer.

Well, as you stated in your blog, you are going to have one of these
downsides:
	o  master bloat
	o  delayed recovery
	o  cancelled queries

Right now you can't choose "master bloat", but you can choose the other
two.  I think that is acceptable for 9.0, assuming the other two don't
have the problems that Tom foresees.
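
As a rough sketch of what those two choices look like on the standby
(parameter names as used in this thread; the values and the units of
max_standby_delay are illustrative assumptions, not recommendations):

```
# postgresql.conf on the standby

# Choice 1: accept delayed recovery.  Replay waits indefinitely for
# standby queries; a statement_timeout caps how long any one query can
# hold it up, as Greg suggests above.
max_standby_delay = -1
statement_timeout = '5min'      # assumed value, pick your own limit

# Choice 2: accept cancelled queries.  Replay waits only up to the
# limit, then conflicting standby queries are cancelled.
#max_standby_delay = 30         # assumed value; units assumed
```

Only one of the two max_standby_delay lines would be active at a time.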

Our documentation should probably just come out and state that clearly.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do

