Thread: fsync and hardware write cache
Something to think about:

if you run PostgreSQL with fsync on, but you use the hardware write cache on your disk drives, how likely are you to lose data? Obviously, this is a fairly limited problem, as it only applies to power down (which you can control) or power loss where the risks may be reduced but not eliminated with a UPS.

Does it make sense to add a platform specific call that will flush a write cache when fsync is enabled?

pgsql@mohawksoft.com writes:
> Something to think about:
>
> if you run PostgreSQL with fsync on, but you use the hardware write cache
> on your disk drives, how likely are you to lose data? Obviously, this is a
> fairly limited problem, as it only applies to power down (which you can
> control) or power loss where the risks may be reduced but not eliminated
> with a UPS.
>
> Does it make sense to add a platform specific call that will flush a write
> cache when fsync is enabled?

AIUI, recent versions of the Linux kernel are supposed to do this for you, but not all drives honor the "flush" command, so you're still at the mercy of your disk vendor...

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863

pgsql@mohawksoft.com wrote:
> Something to think about:
>
> if you run PostgreSQL with fsync on, but you use the hardware write cache
> on your disk drives, how likely are you to lose data? Obviously, this is a
> fairly limited problem, as it only applies to power down (which you can
> control) or power loss where the risks may be reduced but not eliminated
> with a UPS.
>
> Does it make sense to add a platform specific call that will flush a write
> cache when fsync is enabled?

We have discussed this in the past and just require hardware to honor the operating system fsync. If it doesn't honor that, how do we fix it other than telling them to properly configure their hardware?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

pgsql@mohawksoft.com wrote:

>Something to think about:
>
>if you run PostgreSQL with fsync on, but you use the hardware write cache
>on your disk drives, how likely are you to lose data? Obviously, this is a
>fairly limited problem, as it only applies to power down (which you can
>control) or power loss where the risks may be reduced but not eliminated
>with a UPS.
>
>Does it make sense to add a platform specific call that will flush a write
>cache when fsync is enabled?

Pete Zaitsev from mysql wrote that there is a special call on Mac OS:
Quoting him:

>Mac OS X also has this "optimization", but at least it provides an
>alternative flush method for Database Servers:
>
>fcntl(fd, F_FULLFSYNC, NULL)
>
>can be used instead of fsync() to get true fsync() behavior.

I couldn't confirm this with a quick google search - perhaps someone with MacOS docs (or mysql sources) should check it.

What might be useful is a test tool that benchmarks fsync: if it's faster than the rotational speed of a 15k rpm disk then probably someone caches the write calls.

--
    Manfred

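For illustration, here is a minimal sketch of what a platform-specific flush call of the kind asked about could look like, built around the fcntl(fd, F_FULLFSYNC, ...) call quoted above. The helper name full_flush is hypothetical and is not taken from PostgreSQL or MySQL sources; the fallback to plain fsync() when F_FULLFSYNC is unavailable or fails is an assumption about reasonable behavior, not something stated in the thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (name invented for this sketch): push the data of
 * `fd` to stable storage, asking the drive to empty its write cache on
 * platforms that define F_FULLFSYNC (Mac OS X).
 * Returns 0 on success, -1 on error with errno set.
 */
int full_flush(int fd)
{
    assert(fd >= 0);            /* a negative fd is a caller bug */

#ifdef F_FULLFSYNC
    /* fsync + ask the drive to flush to the media */
    if (fcntl(fd, F_FULLFSYNC, 0) != -1)
        return 0;
    /* Assumption: if the call is refused, plain fsync() is still
     * better than nothing, so fall through instead of failing. */
#endif

    /* Everywhere else we rely on the OS and the drive honoring fsync(). */
    return fsync(fd);
}
```

A caller would use full_flush(fd) wherever it currently calls fsync(fd). On platforms without F_FULLFSYNC the helper is exactly fsync(), so whether the drive cache is really flushed still depends on the kernel and the disk, as noted earlier in the thread.
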
On Mon, Aug 23, 2004 at 10:19:20PM +0200, Manfred Spraul wrote:
> >Does it make sense to add a platform specific call that will flush a write
> >cache when fsync is enabled?
> >
> Pete Zaitsev from mysql wrote that there is a special call on Mac OS:
> Quoting him:
>
> >Mac OS X also has this "optimization", but at least it provides an
> >alternative flush method for Database Servers:
> >
> >fcntl(fd, F_FULLFSYNC, NULL)
> >
> >can be used instead of fsync() to get true fsync() behavior.
>
> I couldn't confirm this with a quick google search - perhaps someone
> with MacOS docs (or mysql sources) should check it.

I can confirm it exists.

#define F_FULLFSYNC 51 /* fsync + ask the drive to flush to the media */

> What might be useful is a test tool that benchmarks fsync: if it's
> faster than the rotational speed of a 15k rpm disk then probably someone
> caches the write calls.

I played with doing that - and can't find any system where a naive looped write, fsync, write, fsync took more than about 600us, so I guess I'm missing something somewhere.

Cheers,
  Steve

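The benchmark idea discussed above can be sketched roughly as follows. This is not Steve's or Manfred's actual test program; the function names, the 512-byte block size, and the choice to rewrite the same block at offset 0 are all assumptions made for the sketch. The arithmetic behind the heuristic: a 15k rpm disk turns 15000 / 60 = 250 times per second, so one revolution takes 4000 us, and a write+fsync pair that averages well under that (such as the roughly 600us reported above) would, by Manfred's rule of thumb, suggest that a cache is acknowledging the writes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Microseconds needed for one revolution of a disk spinning at `rpm`. */
double usec_per_revolution(double rpm)
{
    assert(rpm > 0.0);
    return 60.0 * 1e6 / rpm;
}

/*
 * Rewrite the same 512-byte block `iterations` times, calling fsync()
 * after every write. Returns the mean time of one write+fsync pair in
 * microseconds, or -1.0 on error. The file at `path` is left in place;
 * the caller removes it.
 */
double mean_fsync_usec(const char *path, int iterations)
{
    char buf[512] = {0};
    struct timespec start, end;
    int fd, i;

    assert(iterations > 0);
    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++) {
        buf[0] = (char) i;      /* vary the data on each pass */
        if (pwrite(fd, buf, sizeof buf, 0) != (ssize_t) sizeof buf ||
            fsync(fd) != 0) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    return ((end.tv_sec - start.tv_sec) * 1e6 +
            (end.tv_nsec - start.tv_nsec) / 1e3) / iterations;
}

/* Heuristic from the thread: faster than one revolution looks cached. */
int probably_cached(double mean_usec, double rpm)
{
    return mean_usec < usec_per_revolution(rpm);
}
```

A caller might run mean_fsync_usec("testfile", 1000) on the volume that holds the database and pass the result to probably_cached(..., 15000.0). The result is only a hint: battery-backed controller caches and non-rotating storage also complete an fsync in far less than one revolution without putting data at risk.
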