Re: silent data loss with ext4 / all current versions - Mailing list pgsql-hackers

From Andres Freund
Subject Re: silent data loss with ext4 / all current versions
Date
Msg-id 20160308055552.akvmwjer6km76qqi@alap3.anarazel.de
In response to Re: silent data loss with ext4 / all current versions  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: silent data loss with ext4 / all current versions
List pgsql-hackers
On 2016-03-08 12:26:34 +0900, Michael Paquier wrote:
> On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote:
> >> I have spent a couple of hours looking at that in detail, and the
> >> patch is neat.
> >
> > Cool. Doing some more polishing right now. Will be back with an updated
> > version soonish.
> >
> > Did you do some testing?
>
> Not much in detail yet; I just ran a check-world with fsync enabled
> for the recovery tests, plus quick manual tests with a cluster
> manually set up. I'll do more with your new version now that I know
> there will be one.

Here's my updated version.

Note that I've split the patch into two. One for the infrastructure, and
one for the callsites.

> >> +   /* XXX: Add racy file existence check? */
> >> +   if (rename(oldfile, newfile) < 0)
> >
> >> I am not sure we should worry about that. What do you think could
> >> cause the old file to go missing all of a sudden? Other backend
> >> processes are not playing with it in the code paths where this routine
> >> is called. Perhaps adding a comment in the header to let users know
> >> that would help?
> >
> > What I'm thinking of is adding a check whether the *target* file already
> > exists, and error out in that case. Just like the link() based path
> > normally does.
>
> Ah, OK. Well, why not. I'd rather have an assertion instead of an error though.

I think it should definitely be an error if anything. But I'd rather
only add it in master...
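
To make the target-existence check discussed above concrete, here is a minimal sketch in plain POSIX C. It is not the code from the attached patch: the names fsync_path and rename_noreplace_durable are hypothetical, and the fsync ordering (source file, then the renamed file, then the parent directory) is an assumption about what a durable rename needs. The existence check is racy by construction, since another process could create the target between stat() and rename().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open a file or directory and fsync it. */
static int
fsync_path(const char *path, int flags)
{
    int fd = open(path, flags);

    if (fd < 0)
        return -1;
    if (fsync(fd) < 0)
    {
        int save_errno = errno;     /* close() may overwrite errno */

        close(fd);
        errno = save_errno;
        return -1;
    }
    return close(fd);
}

/*
 * Hypothetical sketch: rename oldfile to newfile, failing with EEXIST if
 * the target already exists.  Returns 0 on success, -1 with errno set.
 */
static int
rename_noreplace_durable(const char *oldfile, const char *newfile,
                         const char *parentdir)
{
    struct stat st;

    /* Racy check: the target could still appear before rename() runs. */
    if (stat(newfile, &st) == 0)
    {
        errno = EEXIST;
        return -1;
    }
    if (errno != ENOENT)
        return -1;                  /* stat() failed for another reason */

    /* Flush the source file's contents before giving it its new name. */
    if (fsync_path(oldfile, O_RDWR) < 0)
        return -1;

    if (rename(oldfile, newfile) < 0)
        return -1;

    /* Flush the renamed file, then the directory entry itself. */
    if (fsync_path(newfile, O_RDWR) < 0)
        return -1;
    return fsync_path(parentdir, O_RDONLY);
}
```

A caller would treat a -1 return as a hard error and report errno, which is the "error rather than assertion" behaviour argued for above; an assertion would vanish in non-assert builds and leave the overwrite silent.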

Andres

Attachment
