Re: fsync reliability - Mailing list pgsql-hackers

From Greg Stark
Subject Re: fsync reliability
Msg-id BANLkTinCTTbPd_uc3p3cvutnkRTzSMk-sw@mail.gmail.com
In response to Re: fsync reliability  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
On Mon, Apr 25, 2011 at 5:00 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Stop right there; the slow path was the only one that had any hope of being
> correct.  It can actually slow things by a factor of 100X or more,
> worst-case.  "So, we currently have the choice between filesystem corruption
> or major performance loss":  yes, you do.  Writing files is tricky and it
> can either be slow or safe.  If you're going to avoid even trying to enforce
> the right thing here, you're really going to get really burned.

Well, no. That's like saying the whole database can't possibly process
transactions faster than the rate at which fsyncs can happen. That's
not true because we can process transactions in parallel and fsync a
whole bunch simultaneously.
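
To make the batching point concrete, here is a minimal sketch of the idea: several transactions' commit records are appended to a log and then made durable with a single fsync. This is an illustration only, not PostgreSQL's actual WAL code; the function name `commit_batch` and the record format are assumptions made up for the example.

```python
import os


def commit_batch(log_file, records):
    """Append every pending commit record, then sync once for the whole batch.

    log_file is a binary file object opened for appending; records is a list
    of bytes objects. Both are hypothetical stand-ins for illustration.
    """
    for rec in records:
        log_file.write(rec)
    log_file.flush()               # move Python's buffer into the kernel
    os.fsync(log_file.fileno())    # one fsync covers every record written above
    return len(records)
```

With N records in the batch, the cost of the sync is paid once rather than N times, which is why throughput is not bounded by the fsync rate.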

The API tytso and company are suggesting is that if you want
reasonable performance you should create a thread for each file, fsync
in that thread and then do your rename. Hardly the sanest API one
could imagine.
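
For reference, the pattern being described looks roughly like the sketch below: each file is written to a temporary name, fsynced, and only then renamed into place, with one worker thread per file so the fsync waits overlap. The helper names, the `.tmp` suffix, and the worker count are assumptions for illustration, not anything proposed in the thread.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def durable_replace(path, data):
    """Write data to a temp file, fsync it, then rename it over path."""
    tmp = path + ".tmp"            # hypothetical temp-name convention
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()                  # move Python's buffer into the kernel
        os.fsync(f.fileno())       # wait until the file contents are on disk
    os.replace(tmp, path)          # rename only after the data is durable
    return path


def durable_replace_many(files, max_workers=8):
    """Run fsync-then-rename for each file in its own worker thread.

    files maps destination path -> bytes. Making the renames themselves
    durable would also need an fsync on the containing directory, which
    this sketch leaves out.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda kv: durable_replace(kv[0], kv[1]),
                             files.items()))
```

The threads exist only to overlap the blocking fsync calls; the ordering constraint (data synced before the rename) is still enforced per file.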

And if you fail to do that you don't just risk losing data. You get a
filesystem state that *never* existed. It's as if we said that if the
database crashes your transaction might be rolled back, it might be
committed, and we might just replace your data with zeros. Huh?

--
greg

