Re: WAL and archive disks full - Mailing list pgsql-admin

From Kieren Scott
Subject Re: WAL and archive disks full
Msg-id BAY149-w32A853DBBB87C2AD0C2DECAE820@phx.gbl
In response to Re: WAL and archive disks full  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: WAL and archive disks full  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-admin
Apologies for the hypothetical scenario. I was trying to gain a better
understanding of what postgres would need in order to get the instance
started without any errors (such as archiver errors caused by WAL files having
been wrongly deleted by hand to free up space).

I'd be happy with a situation which lets us start over again with a new base backup.

We have separate mount points for the WAL and archived WAL filesystems. Nothing
apart from WAL files is written to those filesystems.
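
To illustrate the kind of check that would flag either mount point filling up,
here is a minimal Python sketch. The thresholds (80% and 90%) and the mount
point paths mentioned afterwards are assumptions for illustration only, not
our real configuration.

```python
import shutil
from typing import NamedTuple


class UsageReport(NamedTuple):
    path: str
    percent_used: float
    status: str  # "ok", "warning", or "critical"


def classify(percent_used, warn_at=80.0, crit_at=90.0):
    """Map a usage percentage to a status; thresholds are illustrative."""
    if percent_used >= crit_at:
        return "critical"
    if percent_used >= warn_at:
        return "warning"
    return "ok"


def check_mount(path, warn_at=80.0, crit_at=90.0):
    """Report how full the filesystem containing `path` is."""
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total size of zero.
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return UsageReport(path, percent_used,
                       classify(percent_used, warn_at, crit_at))
```

Calling check_mount("/pg_wal") and check_mount("/pg_archive") (hypothetical
paths) from cron and alerting on anything other than "ok" would be one way to
use it. The percentage may differ slightly from what df shows, because df
accounts for blocks reserved for root.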

I noticed a situation recently where our backup scripts had been failing, so the script
had not been clearing down the archive WAL filesystem as it does after a successful backup.
The archive filesystem filled up, the archive_command could no longer copy WAL files
to it, and as a result the WAL filesystem was almost full as well.
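
As a rough sketch of the "clearing down" step, the Python below picks out the
archived WAL segments that sort before the first segment a given base backup
needs. It assumes the standard 24-hex-character segment names and a single
timeline, and it only returns candidates rather than deleting anything. The
function name and the example file names are made up for illustration; the
pg_archivecleanup utility is the supported tool for this job where it is
available.

```python
import re

# Archived WAL segments have 24 uppercase hex characters in their names:
# 8 for the timeline, then 16 for the segment position.
WAL_SEGMENT = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(filenames, oldest_kept):
    """Return segment names that precede `oldest_kept` on the same timeline.

    `oldest_kept` is the first segment the latest good base backup needs.
    Anything that is not a plain segment name (backup history files, stray
    files) is never returned.
    """
    if not WAL_SEGMENT.match(oldest_kept):
        raise ValueError("oldest_kept is not a WAL segment name")
    timeline = oldest_kept[:8]
    return sorted(
        name for name in filenames
        if WAL_SEGMENT.match(name)
        and name[:8] == timeline   # conservative: same timeline only
        and name < oldest_kept     # fixed-width hex sorts like numbers
    )
```

For example, given segments ending in ...0A, ...0B and ...0C on timeline 1 and
an oldest_kept ending in ...0B, only the ...0A segment is returned.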

Sorry it's a bit of a what-if scenario. I can envisage hitting this problem in the
future, and I was trying to put a plan in place for how to deal with it.

Thanks in advance.



> Date: Mon, 23 Aug 2010 16:47:57 -0500
> From: Kevin.Grittner@wicourts.gov
> To: kierenscott@hotmail.com; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] WAL and archive disks full
>
> Kieren Scott <kierenscott@hotmail.com> wrote:
>
> > What would be the best course of action for resolving a situation
> > whereby your postgres instance had crashed due to the wal disk and
> > archive wal disk becoming 100% full? Say your backups have been
> > failing and your 'monitoring' had not reported it correctly.
> >
> > You can't start the instance because it needs to write to the WAL
> > disk (which is full), but if you manually move WAL files off the
> > WAL disk, the archiver will fail because it can't find WAL files
> > it needs to archive. The instance may also still be in backup
> > mode, because the backups had not completed due to the disk full
> > situation.
> >
> > Being new to postgres, I'm trying to understand what actions need
> > to be taken to get the instance back up and running without
> > compromising recoverability...?
>
> You will get more detailed advice if you avoid hypotheticals and say
> exactly what's going on and what your priorities are. For starters,
> are you OK with a situation which gets your primary database running
> again and lets you start over with a new base backup, or is it
> critical that you continue your backup stream without having to take
> a new base backup? My advice would depend on the answer to that.
>
> Also, it would be helpful to have an idea what your various mount
> points are, how big they are, and what's on them. (If there's
> something else *also* on the same mount point as the WAL files, that
> might make a difference.) What do you mean, exactly, when you say
> your wal disk and archive wal disk are 100% full? (Are those
> separate mount points? Did the archive fail to restore, thereby
> building up to where the archive later began to fail, or is it a
> shared drive?)
>
> -Kevin
