Re: time-delayed standbys - Mailing list pgsql-hackers

From Robert Haas
Subject Re: time-delayed standbys
Msg-id BANLkTin4OhdSywpOP2n2gt+vt+fTBYBKHA@mail.gmail.com
In response to Re: time-delayed standbys  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Thu, Jun 30, 2011 at 1:00 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 6/30/11 2:00 AM, Simon Riggs wrote:
>>>> Manual (or scripted) intervention is always necessary if you reach disk
>>>> 100% full.
>>>
>>> Wow, that's a pretty crappy failure mode... but I don't think we need
>>> to fix it just on account of this patch.  It would be nice to fix, of
>>> course.
>> How is that different to running out of space in the main database?
>>
>> If I try to pour a pint of milk into a small cup, I don't blame the cup.
>
> I have to agree with Simon here.  ;-)
>
> We can do some things to make this easier for administrators, but
> there's no way to "solve" the problem.  And the things we could do would
> have to be advanced optional modes which aren't on by default, so they
> wouldn't really help the DBA with poor planning skills.  Here's my
> suggestions:
>
> 1) Have a utility (pg_archivecleanup?) which checks if we have more than
> a specific setting's worth of archive logs, and breaks replication and
> deletes the archive logs if we hit that number.  This would also require
> some way for the standby to stop replicating *without* becoming a
> standalone server, which I don't think we currently have.
>
> 2) Have a setting where, regardless of standby_delay settings, the
> standby will interrupt any running queries and start applying logs as
> fast as possible if it hits a certain number of unapplied archive logs.
>  Of course, given the issues we had with standby_delay, I'm not sure I
> want to complicate it further.
>
> I think we've already fixed the biggest issue in 9.1, since we now have
> a limit on the number of WALs the master will keep if archiving is
> failing ... yes?  That's the only big *avoidable* failure mode we have,
> where a failing standby effectively shuts down the master.

I'm not sure we changed anything in this area for 9.1.  Am I wrong?
wal_keep_segments was present in 9.0.  Using that instead of archiving
is a reasonable way to bound the amount of disk space that can get
used, at the cost of possibly needing to rebuild the standby if things
get too far behind.  Of course, in any version, you could also use an
archive_command that will remove old files to make space if the disk
is full, with the same downside: if the standby isn't done with those
files, you're now in for a rebuild.
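
A minimal sketch of that second option, in Python, might look like the following. It is only an illustration under stated assumptions: the function names (archive_segment, prune_oldest, free_fraction), the 10% free-space threshold, and the oldest-first deletion policy are all hypothetical and not taken from any PostgreSQL tool. It frees space by deleting the oldest archived files before copying the new segment in, which is exactly the trade-off described above: anything deleted that the standby has not yet replayed means a rebuild.

```python
import os
import shutil


def free_fraction(path):
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def prune_oldest(archive_dir, min_free, free_fn=free_fraction):
    """Delete the oldest archived files until `min_free` of the disk is free.

    Returns the names of the files that were removed, oldest first.
    Assumption: "oldest" is decided by modification time, then by name.
    """
    with os.scandir(archive_dir) as it:
        entries = [e for e in it if e.is_file()]
    entries.sort(key=lambda e: (e.stat().st_mtime, e.name))

    removed = []
    for entry in entries:
        if free_fn(archive_dir) >= min_free:
            break
        # A standby that still needs this file will have to be rebuilt.
        os.remove(entry.path)
        removed.append(entry.name)
    return removed


def archive_segment(src_path, archive_dir, min_free=0.10, free_fn=free_fraction):
    """Copy one WAL segment into the archive, reclaiming space first if needed."""
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Never overwrite a segment that has already been archived.
        raise FileExistsError(dest)

    removed = prune_oldest(archive_dir, min_free, free_fn)

    # Copy under a temporary name and rename, so a partially written
    # file is never visible under the final segment name.
    tmp = dest + ".tmp"
    shutil.copy2(src_path, tmp)
    os.rename(tmp, dest)
    return removed
```

A small wrapper script that calls archive_segment() with the path substituted for %p could then be named in archive_command. It should exit non-zero on any failure so that the server retries the segment later; an uncaught Python exception already has that effect.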

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

