Re: WAL archiving to network drive - Mailing list pgsql-general

From Greg Smith
Subject Re: WAL archiving to network drive
Date
Msg-id Pine.GSO.4.64.0808282159450.11207@westnet.com
In response to Re: WAL archiving to network drive  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Wed, 20 Aug 2008, Tom Lane wrote:

> Greg Smith <gsmith@gregsmith.com> writes:
>> You also don't want to be the guy who has to explain why the database is
>> taking hours to come back up again after it crashed and has 4000 WAL
>> segments to replay, because archiving failed for a long time and prevented
>> proper checkpoints (ask Robert Treat if you don't believe me, he also once
>> was that guy).
>
> Say what?  Archiver failure can't/shouldn't prevent checkpointing.

Shouldn't, sure.  The wacky case Robert ran into, which I was alluding to,
involved the system no longer checkpointing and just piling the archive
files up.  While I think it's safe to say that was all a hardware
problem, stuff like that makes me nervous.

It is true that archiver failure prevents *normal* checkpointing, where
WAL files get recycled rather than piling up.  I know that shouldn't make
any difference, but I've also been through two similarly awful situations
resulting from odd archiver problems that seemed mysterious at the time
(staring at the source later cleared up what really happened) that left me
even more paranoid than usual when working in this area.
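
A backlog like the one described above can be spotted before it reaches
thousands of segments.  A minimal sketch, assuming the server marks each
segment still waiting for the archiver with a ".ready" file in the
archive_status directory under the WAL directory; the function names and
the threshold are hypothetical and would need tuning per installation:

```python
import os


def count_pending_segments(archive_status_dir):
    """Count WAL segments flagged as still waiting to be archived."""
    return sum(
        1
        for name in os.listdir(archive_status_dir)
        if name.endswith(".ready")
    )


def backlog_is_alarming(archive_status_dir, threshold=100):
    """Return True when the unarchived backlog exceeds the (assumed) threshold."""
    return count_pending_segments(archive_status_dir) > threshold
```

Run from cron or a monitoring agent, a check like this turns a silent
pile-up into an alert while replay time is still short.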

The stance I've adopted is that anything involving uncertain network
resources should be moved outside of the code the database itself
runs.  Any time you're following a different path than the usual one
through the server code (in this case exercising the archive failure and
resubmission section), I see that as an opportunity to run into more
obscure bugs.  That's just not code that gets run/tested as often.
Keeping the network out of that path also minimizes the amount of
admin-written software that has to be right (bugs in the archive_command
script are really bad) in order for the database to keep running.
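
One way to apply this is to have archive_command copy each segment only to
a local spool directory and let a separate job outside the database (cron
plus rsync, say) ship the spool to the network drive.  The following is a
rough sketch under that assumption, not a tested production script; the
function name and the spool layout are hypothetical:

```python
#!/usr/bin/env python3
"""Hypothetical archive_command helper: copy one WAL segment to a local spool."""
import os
import shutil
import sys


def archive_segment(src_path, spool_dir):
    """Copy src_path into spool_dir; return 0 on success, 1 on failure."""
    dest = os.path.join(spool_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Never overwrite a segment that is already in the spool.
        return 1
    tmp = dest + ".tmp"
    try:
        with open(src_path, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
            fout.flush()
            os.fsync(fout.fileno())
        # Rename only after the data is on disk, so the shipping job
        # never sees a partially written segment under its final name.
        os.rename(tmp, dest)
    except OSError:
        if os.path.exists(tmp):
            os.remove(tmp)
        return 1
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # The exit status is what the server uses to decide success or failure.
    sys.exit(archive_segment(sys.argv[1], sys.argv[2]))
```

It would be wired up with something like
archive_command = 'python3 /path/to/archive_segment.py %p /var/spool/wal'
where both paths are placeholders.  The only thing the database then
depends on is a local disk write; a flaky network mount stalls the
shipping job, not the archiver.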

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
