Re: Hard limit on WAL space used (because PANIC sucks) - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Hard limit on WAL space used (because PANIC sucks)
Date
Msg-id 1390399098.58583.YahooMailNeo@web122305.mail.ne1.yahoo.com
Whole thread Raw
In response to Re: Hard limit on WAL space used (because PANIC sucks)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Well, PANIC is certainly bad, but what I'm suggesting is that we
> just focus on getting that down to ERROR and not worry about
> trying to get out of the disk-shortage situation automatically.
> Nor do I believe that it's such a good idea to have the database
> freeze up until space appears rather than reporting errors.

Dusting off my DBA hat for a moment, I would say that I would be
happy if each process either generated an ERROR or went into a wait
state.  They would not all need to do the same thing; whichever is
easier in each process's context would do.  It would be nice if a
process which went into a long wait state until disk space became
available would issue a WARNING about that, where that is possible.

I feel that anyone running a database that matters to them should
be monitoring disk space and getting an alert from the monitoring
system in advance of actually running out of disk space; but we all
know that poorly managed systems are out there.  To accomodate them
we don't want to add risk or performance hits for those who do
manage their systems well.

The attempt to make more space by deleting WAL files scares me a
bit.  The dynamics of that seem like they could be
counterproductive if the pg_xlog directory is on the same
filesystem as everything else and there is a rapidly growing file.
Every time you cleaned up, the fast-growing file would eat more of
the space pg_xlog had been using, until perhaps it had no space to
keep even a single segment, no?  And how would that work with a
situation where the archive_command was failing, causing a build-up
WAL files?  It just seems like too much mechanism and incomplete
coverage of the problem space.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information)
Next
From: Robert Haas
Date:
Subject: Re: ALTER SYSTEM SET typos and fix for temporary file name management