Re: [HACKERS] Potential data loss of 2PC files - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [HACKERS] Potential data loss of 2PC files
Date
Msg-id 1c3b83e9-11c0-e75a-e856-db021c6fa799@iki.fi
In response to Re: [HACKERS] Potential data loss of 2PC files  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] Potential data loss of 2PC files  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 12/27/2016 01:31 PM, Andres Freund wrote:
> On 2016-12-27 14:09:05 +0900, Michael Paquier wrote:
>> On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund <andres@anarazel.de> wrote:
>>> Not quite IIRC: that doesn't deal with file size increase.  All this
>>> would be easier if hardlinks wouldn't exist IIUC.  It's basically a
>>> question whether dentry, inode or contents need to be synced.  Yes, it
>>> sucks.
>>
>> I did more monitoring of the code... Looking at unlogged tables and
>> empty() routines of access methods, isn't there a risk as well for
>> unlogged tables? mdimmedsync() does not fsync() the parent directory
>> either!
>
> But the files aren't created there, so I don't generally see the
> problem.  And the creation of the metapages etc. should be WAL logged
> anyway.  So I don't think we should / need to do anything special
> there.  You can argue however that we wouldn't necessarily fsync the
> parent directory for the file creation, ever.  But that'd be more
> smgrcreate's responsibility than anything.

So, if I understood correctly, the problem scenario is:

1. Create and write to a file.
2. fsync() the file.
3. Crash.
4. After restart, the file is gone.

If that can happen, don't we have the same problem in many other places? 
Like, all the SLRUs? They don't fsync the directory either.
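
To make the scenario concrete, here is a minimal C sketch of the usual mitigation on Linux: fsync() the file, then fsync() its parent directory, because fsync() on the file alone does not necessarily flush the directory entry. The helper names fsync_dir() and create_durably() are hypothetical, chosen for this illustration; this is not PostgreSQL's code.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: flush a directory so that entries added to it
 * (or removed from it) reach stable storage. */
static int
fsync_dir(const char *dirpath)
{
    int fd = open(dirpath, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/* Hypothetical helper: create dirpath/name, write len bytes, and make
 * both the contents and the directory entry durable. Returns 0 on
 * success and -1 on any failure. */
static int
create_durably(const char *dirpath, const char *name,
               const void *buf, size_t len)
{
    char path[1024];
    int  n = snprintf(path, sizeof(path), "%s/%s", dirpath, name);
    int  fd;

    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;

    /* Step 1: create and write to the file. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t) len)
    {
        close(fd);
        return -1;
    }

    /* Step 2: fsync() the file; this covers contents and inode only. */
    if (fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;

    /* Extra step: fsync() the parent directory, so the new entry
     * itself survives a crash. */
    return fsync_dir(dirpath);
}
```

A call such as create_durably("/tmp", "example", "abc", 3) returns 0 only once both fsync() calls have succeeded. Without the final fsync_dir() call, the sketch matches steps 1 and 2 above and leaves the file exposed to step 4.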


Is unlink() guaranteed to be durable, without fsyncing the directory? If 
not, then we need to fsync() the directory even if there are no files in 
it at the moment, because some might've been removed earlier in the 
checkpoint cycle.
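
If the answer turns out to be "no", the removal side could look like the following C sketch. It assumes that unlink() only becomes durable once the containing directory has been fsync()'d, which is exactly the open question here. The name remove_durably() is hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: remove dirpath/name, then flush the directory so
 * the removal itself is on stable storage. Assumes unlink() alone is
 * not durable. Returns 0 on success and -1 on any failure. */
static int
remove_durably(const char *dirpath, const char *name)
{
    char path[1024];
    int  n = snprintf(path, sizeof(path), "%s/%s", dirpath, name);
    int  dirfd;
    int  rc;

    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;

    if (unlink(path) != 0)
        return -1;

    /* The directory is flushed regardless of how many files remain in
     * it: even an empty directory may have unflushed removals. */
    dirfd = open(dirpath, O_RDONLY);
    if (dirfd < 0)
        return -1;
    rc = fsync(dirfd);
    close(dirfd);
    return rc;
}
```

The point of the sketch is the unconditional directory fsync(): whether any files are currently present in the directory says nothing about whether earlier removals have reached disk.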

- Heikki



