Re: Better handling of archive_command problems - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: Better handling of archive_command problems
Msg-id: CA+TgmoZ=ZZOXFocUe0LpTynhrUcHGy=uHZmEYS5QLu4XF=t6mA@mail.gmail.com
In response to: Re: Better handling of archive_command problems (Daniel Farina <daniel@heroku.com>)
Responses: Re: Better handling of archive_command problems (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
On Tue, May 14, 2013 at 12:23 AM, Daniel Farina <daniel@heroku.com> wrote:
> On Mon, May 13, 2013 at 3:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Has anyone else thought about approaches to mitigating the problems
>> that arise when an archive_command continually fails, and the DBA must
>> manually clean up the mess?
>
> Notably, the most common problem of this kind at Heroku has nothing
> to do with archive_command failing, and everything to do with the
> ratio of block-device write performance (and hence backlog) to
> archiving performance.  When the CPU is uncontended the deficit is
> not huge, but it is there, and it causes quite a bit of stress.
>
> A failing archive_command is definitely a special case there, where
> it might be nice to bring write traffic to exactly zero for a time.

One possible objection to this line of attack is that, IIUC, waits to
acquire an LWLock are non-interruptible.  If someone tells PostgreSQL
to wait for some period of time before performing each WAL write, any
other backend that tries to grab WALWriteLock will not respond to
query cancels during that time.  Worse, the waits have a tendency to
back up.  What I have observed is that if WAL isn't flushed in a
timely fashion, someone will try to grab WALWriteLock while holding
WALInsertLock.  Now anyone who attempts to insert WAL is in a
non-interruptible wait.  If the system is busy, it won't be long
before someone tries to extend pg_clog, and to do that they'll try to
grab WALInsertLock while holding CLogControlLock.  At that point, any
CLOG lookup that misses the already-resident pages will send that
backend into a non-interruptible wait.  I have seen this pile-up
occur during a heavy pgbench workload and paralyze the entire system,
including all read-only queries, until the WAL write completes.
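
To make the chain concrete, here is a rough illustration in terms of
the LWLock API (illustrative pseudocode, not actual backend source;
the lock names are the real ones discussed above):

    /*
     * Backend A: holds WALInsertLock and then waits for the (slow)
     * flush.  LWLockAcquire() sleeps on a semaphore without checking
     * for interrupts, so nothing queued behind it can be canceled.
     */
    LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
    LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);  /* stuck on slow write */

    /* Backend B: any WAL insert now waits non-interruptibly. */
    LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);

    /* Backend C: extending pg_clog while holding the CLOG lock. */
    LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
    LWLockAcquire(WALInsertLock, LW_EXCLUSIVE); /* queued behind B */

    /*
     * Backend D: read-only query whose CLOG lookup misses the
     * resident pages; it now hangs on the CLOG lock behind C.
     */
    LWLockAcquire(CLogControlLock, LW_SHARED);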

Now despite all that, I can see this being useful enough that Heroku
might want to insert a very small patch into their version of
PostgreSQL to do it this way, and just live with the downsides.  But
anything that can propagate non-interruptible waits across the entire
system does not sound to me like a feature that is sufficiently
polished that we want to expose it to users less sophisticated than
Heroku (i.e., nearly all of them).  If we do this, I think we ought
to find a way to make the waits interruptible, and to insert them in
places where they really don't interfere with read-only backends.
I'd probably also argue that we ought to design it so that the GUC is
expressed in MB/s rather than as a delay per WAL-writer cycle.
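
As a rough sketch of what that might look like (the GUC name
wal_write_rate_limit is invented for illustration; pg_usleep() and
CHECK_FOR_INTERRUPTS() are the real primitives), the key idea is to
sleep in short slices while holding no LWLock, so cancel and die
interrupts are honored between slices:

    /*
     * Hypothetical throttle, called after a backend writes some WAL
     * and while it holds no LWLock.  Sleeping in short slices keeps
     * the wait interruptible.
     */
    static void
    ThrottleWALWrite(Size bytes_written)
    {
        long        sleep_us;

        if (wal_write_rate_limit <= 0)      /* 0 disables throttling */
            return;

        /* microseconds of sleep needed to stay under the MB/s limit */
        sleep_us = (long) (bytes_written * 1000000.0 /
                           (wal_write_rate_limit * 1024.0 * 1024.0));

        while (sleep_us > 0)
        {
            long        slice = Min(sleep_us, 10000L);  /* 10 ms */

            CHECK_FOR_INTERRUPTS();         /* honor query cancel/die */
            pg_usleep(slice);
            sleep_us -= slice;
        }
    }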

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


