Re: fsync reliability - Mailing list pgsql-hackers

From Greg Smith
Subject Re: fsync reliability
Date
Msg-id 4DB592A9.3080307@2ndQuadrant.com
In response to Re: fsync reliability  (Daniel Farina <daniel@heroku.com>)
Responses Re: fsync reliability  (Daniel Farina <daniel@heroku.com>)
List pgsql-hackers
On 04/24/2011 10:06 PM, Daniel Farina wrote:
> On Thu, Apr 21, 2011 at 8:51 PM, Greg Smith<greg@2ndquadrant.com>  wrote:
>    
>> There's still the "fsync'd a data block but not the directory entry yet"
>> issue as fall-out from this too.  Why doesn't PostgreSQL run into this
>> problem?  Because the exact code sequence used is this one:
>>
>> open
>> write
>> fsync
>> close
>>
>> And Linux shouldn't ever screw that up, or the similar rename path.  Here's
>> what the close man page says, from http://linux.die.net/man/2/close :
>>      
> Theodore Ts'o addresses this *exact* sequence of events, and suggests
> if you want that rename to definitely stick that you must fsync the
> directory:
>
> http://www.linuxfoundation.org/news-media/blogs/browse/2009/03/don%E2%80%99t-fear-fsync
>    

Not exactly.  That's talking about the sequence used for creating a 
file, plus a rename.  When new WAL files are being created, I believe 
the ugly part of this is avoided.  The path where WAL files are recycled 
using rename does seem to be the one most likely to hit an edge case.
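
To make the rename concern concrete, here is a minimal sketch of the
"fsync the directory after the rename" advice from the quoted message.
It is written in Python for brevity and is not PostgreSQL's actual code;
the function name durable_rename is hypothetical, and it assumes a
Linux/POSIX system where a directory can be opened read-only and
fsync'd, with both paths in the same directory.

```python
import os

def durable_rename(old_path, new_path):
    """Rename a file, then fsync the containing directory.

    Hypothetical helper for illustration only. Assumes both paths live
    in the same directory and that the platform allows fsync on a
    directory file descriptor (true on Linux).
    """
    os.rename(old_path, new_path)
    dir_path = os.path.dirname(os.path.abspath(new_path))
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        # Flush the directory entry change, not the file's data blocks.
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the directory fsync, the file's contents can be on disk while the
new name is not, which is the edge case being discussed for recycled files.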

The difficult case Ts'o's discussion is trying to satisfy involves 
creating a new file and then swapping it for an old one atomically.  
PostgreSQL never does exactly that.  It creates new files, pads them 
with zeros, and then starts writing to them; it also renames old files 
that are already of the correct length.  Combine that with the fact that 
there are always fsyncs after writes to the files, and this case really 
isn't exactly the same as any of the others people are complaining about.
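
As a rough illustration of the "create, pad with zeros, fsync" pattern
described above, here is a minimal Python sketch.  It is not PostgreSQL's
implementation: the function name create_zeroed_file and the size and
block constants are assumptions chosen for the example.

```python
import os

# Assumed values for illustration; not taken from the message above.
SEGMENT_SIZE = 16 * 1024 * 1024
BLOCK_SIZE = 8192

def create_zeroed_file(path, size=SEGMENT_SIZE, block=BLOCK_SIZE):
    """Create a new file, fill it with zeros, and fsync before closing.

    Hypothetical helper: follows the open / write / fsync / close
    sequence, so the file already has its final length before any
    real data is written into it.
    """
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        zeros = bytes(block)
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested; since the
            # buffer is all zeros, just count what was actually written.
            written = os.write(fd, zeros[:min(block, remaining)])
            remaining -= written
        os.fsync(fd)
    finally:
        os.close(fd)
```

Because the file is fully allocated and flushed up front, later writes
only overwrite existing blocks rather than extending the file, which is
part of why this differs from the create-then-swap case.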

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



