Re: Weird XFS WAL problem - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Weird XFS WAL problem
Msg-id 4C07B0D30200002500031EAA@gw.wicourts.gov
In response to Re: Weird XFS WAL problem  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: Weird XFS WAL problem
Greg Smith <greg@2ndquadrant.com> wrote:
> Kevin Grittner wrote:
>> I've seen this, too (with xfs).  Our RAID controller, in spite of
>> having BBU cache configured for writeback, waits for actual
>> persistence on disk for write barriers (unlike for fsync).  This
>> does strike me as surprising to the point of bordering on
>> qualifying as a bug.
> Completely intentional, and documented at
> http://xfs.org/index.php/XFS_FAQ#Q._Should_barriers_be_enabled_with_storage_which_has_a_persistent_write_cache.3F

Yeah, I read that long ago and I've disabled write barriers because
of it; however, it still seems wrong that the RAID controller
insists on flushing to the drives on a write barrier even when it is
in write-back mode.  Here are my reasons for wishing it were
otherwise:

(1)  We've had batteries on our RAID controllers fail occasionally.
The controller automatically degrades to write-through, and we get
an email from the server and schedule a tech to travel to the site
and replace the battery; but until the battery is replaced we are
exposed to possible database corruption.  Barriers don't
automatically come on when the controller flips to write-through
mode.
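
The exposure in point (1) can at least be detected from the OS side.
What follows is a minimal sketch, not anything from our setup: it
assumes a Linux host where XFS still accepts the "nobarrier" mount
option and reports it in /proc/mounts.  The function names are
hypothetical, and finding out whether the controller has actually
dropped to write-through is vendor-specific and not shown.

```python
def xfs_mounts_without_barriers(mounts_text):
    """Return mount points of XFS filesystems mounted with 'nobarrier'.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, fs type, options, ...).
    """
    exposed = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "xfs" and "nobarrier" in options.split(","):
            exposed.append(mount_point)
    return exposed


def read_mounts(path="/proc/mounts"):
    """Read the kernel's mount table (Linux only)."""
    with open(path) as f:
        return f.read()
```

A monitoring job could call xfs_mounts_without_barriers(read_mounts())
when the battery alert arrives and flag any filesystem it returns as
running without barriers on a controller that may no longer have a
safe cache.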

(2)  It precludes any possibility of moving from fsync techniques to
write barrier techniques for ensuring database integrity.  If the OS
respected write barriers and the controller considered the write
satisfied when it hit BBU cache, write barrier techniques would
work, and checkpoints could be made smoother.  Think how nicely that
would interoperate with point (1).
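
For reference, the "fsync technique" in point (2) amounts to blocking
until the OS reports the data as durable before moving on.  The
following is a simplified sketch of that pattern, not PostgreSQL's
actual WAL code; durable_append is a hypothetical name.  A write
barrier technique would instead only need to guarantee ordering
between writes, which is not something this sketch can express from
user space.

```python
import os


def durable_append(path, record):
    """Append bytes to a file and block until fsync reports them durable.

    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop.
        while written < len(record):
            written += os.write(fd, record[written:])
        # Ask the OS to flush this file's data to stable storage.
        # Whether the call returns once the data reaches BBU cache or
        # only once it reaches the platters is up to the storage stack.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```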

So, while I understand it's Working As Designed, I think the design
is surprising and sub-optimal.

-Kevin
