Thread: Lying drives [Was: Re: Which OS provides the _fastest_ PostgreSQL performance?]
Lying drives [Was: Re: Which OS provides the _fastest_ PostgreSQL performance?]
From
Ron Mayer
Date:
toby wrote:
>
> That's not quite what I meant by "trust". Some drives lie about the
> flush.

Is that really true, or a misdiagnosed software bug?

I know many _drivers_ lie about flushing - for example EXT3 on Linux before early 2005 "did not have write barrier support that issues the FLUSH CACHE (IDE) or SYNCHRONIZE CACHE (SCSI) commands even on fsync" according to the writer of the Linux SATA driver.[1]

To any application code (including code designed to test for lying drives), this has the same effect as a lying disk drive, but it is merely a software bug.

Does anyone have an example of a current (on the market so I can get one) drive that lies about sync? I'd be interested in getting my hands on one to see if it's an OS-software or a drive-hw/firmware issue.

[1] http://hardware.slashdot.org/comments.pl?sid=149349&cid=12519114
Re: Lying drives [Was: Re: Which OS provides the _fastest_ PostgreSQL performance?]
From
Guy Thornley
Date:
> > That's not quite what I meant by "trust". Some drives lie about the
> > flush.
>
> Is that really true, or a misdiagnosed software bug?

I've yet to find a drive that lies about write completion. (*) The problem is that the drive's boot-up default is write-caching enabled (or perhaps the system BIOS sets it that way). If you turn an IDE disk's write cache off explicitly, using hdparm or similar, it behaves.

The problem, I think, is a bug in hdparm or the Linux kernel: if you use the little-'i' option, the output indicates the WC is disabled. However, if you use big-'I' to actually interrogate the drive, you get the correct setting.

I tested this a while ago by writing a program that did fsync() to test write latency and random reads to test read latency, and then comparing them.

- Guy

* I did experience a too-close-to-call case, where after write-cache was disabled, the write latency was the same as the read latency. For every other drive the write latency was much, MUCH higher. However, before I disabled the WC, the write latency was a fraction of the read latency.
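For readers who want to repeat this kind of comparison, here is a minimal sketch of the test described above. It is not Guy's original program; the function names, block size, and round counts are assumptions chosen for illustration. It times write()+fsync() of a single block and compares that against random single-block reads from the same file.

```python
import os
import random
import sys
import time

BLOCK = 4096  # assumed block size; not taken from the original test


def write_fsync_latency(path, rounds=50):
    """Mean seconds per write()+fsync() of one block rewritten in place."""
    buf = os.urandom(BLOCK)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, buf)
            os.fsync(fd)  # should not return until the data is on the platter
        return (time.perf_counter() - start) / rounds
    finally:
        os.close(fd)


def random_read_latency(path, rounds=50):
    """Mean seconds per one-block read at a random block offset.

    Reads served from the OS page cache say nothing about the disk, so the
    file should be much larger than RAM (or the cache dropped beforehand).
    """
    blocks = max(os.path.getsize(path) // BLOCK, 1)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.lseek(fd, random.randrange(blocks) * BLOCK, os.SEEK_SET)
            os.read(fd, BLOCK)
        return (time.perf_counter() - start) / rounds
    finally:
        os.close(fd)


if __name__ == "__main__":
    target = sys.argv[1]  # an existing large file on the drive under test
    w = write_fsync_latency(target)
    r = random_read_latency(target)
    print(f"fsync write: {w * 1000:.2f} ms   random read: {r * 1000:.2f} ms")
```

If the fsync'd write latency comes out at a small fraction of the random-read latency, as in the write-cache-enabled case in the footnote above, the writes are probably being acknowledged from cache somewhere between the application and the platter. The test alone cannot say whether the drive, the driver, or the filesystem is responsible.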
On Mon, 13 Nov 2006, Guy Thornley wrote:

> I've yet to find a drive that lies about write completion. The problem
> is that the drive's boot-up default is write-caching enabled (or perhaps
> the system BIOS sets it that way). If you turn an IDE disk's write cache
> off explicitly, using hdparm or similar, it behaves.

I found a rather ominous warning from SGI on this subject at http://oss.sgi.com/projects/xfs/faq.html#wcache_query

"[Disabling the write cache] is kept persistent for a SCSI disk. However, for a SATA/PATA disk this needs to be done after every reset as it will reset back to the default of the write cache enabled. And a reset can happen after reboot or on error recovery of the drive. This makes it rather difficult to guarantee that the write cache is maintained as disabled."

As I've been learning more about this subject recently, I've become increasingly queasy about using IDE drives for databases unless they're hooked up to a high-end (S|P)ATA controller. As far as I know the BIOS doesn't mess with the write caches; it's strictly that the drives default to having them on. Some manufacturers let you adjust the default, which should prevent the behavior SGI warns about from happening; Hitachi's "Feature Tool" at http://www.hitachigst.com/hdd/support/download.htm is one example I've used successfully before.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
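Since the SGI warning means the setting can silently revert after a reset, one option is to re-check it periodically. The sketch below asks the drive directly via `hdparm -I` (the big-'I' form mentioned earlier in the thread) and looks for the "Write cache" feature line. It assumes the layout hdparm typically prints, where an asterisk in the "Enabled" column marks an active feature; the function names are hypothetical, and the output format should be verified against your hdparm version.

```python
import re
import subprocess
from typing import Optional


def write_cache_enabled(identify_text: str) -> Optional[bool]:
    """Parse `hdparm -I` text; True/False for the Write cache line, None if absent.

    Assumes feature lines look like '   *    Write cache', with the asterisk
    present only when the feature is currently enabled.
    """
    for line in identify_text.splitlines():
        m = re.match(r"^\s*(\*?)\s*Write cache\s*$", line)
        if m:
            return m.group(1) == "*"
    return None


def query_drive(device: str) -> Optional[bool]:
    """Run `hdparm -I <device>` (usually needs root) and parse the result."""
    out = subprocess.run(
        ["hdparm", "-I", device],
        capture_output=True, text=True, check=True,
    ).stdout
    return write_cache_enabled(out)


if __name__ == "__main__":
    import sys
    state = query_drive(sys.argv[1])  # e.g. /dev/hda
    print({True: "write cache ENABLED", False: "write cache disabled",
           None: "no Write cache line found"}[state])
```

Run from cron or a boot script, a check like this could flag a drive whose cache came back on after a reset, at which point it can be disabled again with `hdparm -W0`.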
Greg Smith wrote:
> On Mon, 13 Nov 2006, Guy Thornley wrote:
>
> > I've yet to find a drive that lies about write completion. The problem
> > is that the drive's boot-up default is write-caching enabled (or perhaps
> > the system BIOS sets it that way). If you turn an IDE disk's write cache
> > off explicitly, using hdparm or similar, it behaves.
>
> I found a rather ominous warning from SGI on this subject at
> http://oss.sgi.com/projects/xfs/faq.html#wcache_query
>
> "[Disabling the write cache] is kept persistent for a SCSI disk. However,
> for a SATA/PATA disk this needs to be done after every reset as it will
> reset back to the default of the write cache enabled. And a reset can
> happen after reboot or on error recovery of the drive. This makes it
> rather difficult to guarantee that the write cache is maintained as
> disabled."
>
> As I've been learning more about this subject recently, I've become
> increasingly queasy about using IDE drives for databases unless they're
> hooked up to a high-end (S|P)ATA controller. As far as I know the BIOS

Yes, avoiding IDE for serious database servers is a conclusion I reached long ago.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +