Thread: Lying drives [Was: Re: Which OS provides the _fastest_ PostgreSQL performance?]

toby wrote:
>
> That's not quite what I meant by "trust". Some drives lie about the
> flush.

Is that really true, or a misdiagnosed software bug?

I know many _drivers_ lie about flushing - for example EXT3
on Linux before early 2005 "did not have write barrier support
that issues the FLUSH CACHE (IDE) or SYNCHRONIZE CACHE (SCSI)
commands even on fsync" according to the writer of
the Linux SATA driver.[1]

To any application code (including code designed to test for
lying drives), this has the same effect as a lying disk drive,
but it is merely a software bug.


Does anyone have an example of a current (on the market so
I can get one) drive that lies about sync?  I'd be interested
in getting my hands on one to see if it's an OS-software or
a drive-hw/firmware issue.


[1] http://hardware.slashdot.org/comments.pl?sid=149349&cid=12519114

> > That's not quite what I meant by "trust". Some drives lie about the
> > flush.
>
> Is that really true, or a misdiagnosed software bug?

I've yet to find a drive that lies about write completion. (*)

The problem is that the drive's boot-up default is write-caching enabled (or
perhaps the system BIOS sets it that way).

If you turn an IDE disk's write cache off explicitly, using hdparm or similar,
it behaves.

The problem, I think, is a bug in hdparm or the Linux kernel: if you use the
little-'i' option, the output indicates the WC is disabled. However, if you
use big-'I' to actually interrogate the drive, you get the correct setting.

I tested this a while ago by writing a program that did fsync() to test
write latency and random-reads to test read latency, and then comparing
them.
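
A test along those lines can be sketched roughly as follows. This is not
the original program, only a minimal Python approximation of the idea: time
a series of write()+fsync() calls, time a series of random reads, and
compare the two averages. The function names, the 4096-byte block size and
the sample counts are assumptions made for illustration. Note that reads of
a small or recently written file are usually served from the OS page cache,
so a meaningful read figure probably needs a file much larger than RAM (or
a raw device).

```python
import os
import random
import time

BLOCK = 4096  # bytes per write/read; an assumed size, not from the thread


def fsync_write_latency(path, count=50):
    """Average seconds per write()+fsync() pair, rewriting the same block."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        buf = b"x" * BLOCK
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, buf)
            os.fsync(fd)  # ask the OS to push the data to the device
        return (time.perf_counter() - start) / count
    finally:
        os.close(fd)


def random_read_latency(path, count=50):
    """Average seconds per block read at a random block-aligned offset."""
    blocks = max(1, os.path.getsize(path) // BLOCK)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, random.randrange(blocks) * BLOCK, os.SEEK_SET)
            os.read(fd, BLOCK)
        return (time.perf_counter() - start) / count
    finally:
        os.close(fd)
```

If the average fsync'd write comes out at a small fraction of an uncached
random read, some layer (driver, controller or drive cache) is most likely
acknowledging the write before it reaches the platter.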

- Guy

* I did experience a too-close-to-call case, where after write-cache was
  disabled, the write latency was the same as the read latency. For every
  other drive the write latency was much, MUCH higher. However, before I
  disabled the WC, the write latency was a fraction of the read latency.

Re: Lying drives [Was: Re: Which OS provides the _fastest_

From: Greg Smith
On Mon, 13 Nov 2006, Guy Thornley wrote:

> I've yet to find a drive that lies about write completion. The problem
> is that the drives boot-up default is write-caching enabled (or perhaps
> the system BIOS sets it that way). If you turn an IDE disks write cache
> off explicity, using hdparm or similar, they behave.

I found a rather ominous warning from SGI on this subject at
http://oss.sgi.com/projects/xfs/faq.html#wcache_query

"[Disabling the write cache] is kept persistent for a SCSI disk. However,
for a SATA/PATA disk this needs to be done after every reset as it will
reset back to the default of the write cache enabled. And a reset can
happen after reboot or on error recovery of the drive. This makes it
rather difficult to guarantee that the write cache is maintained as
disabled."

As I've been learning more about this subject recently, I've become
increasingly queasy about using IDE drives for databases unless they're
hooked up to a high-end (S|P)ATA controller.  As far as I know the BIOS
doesn't mess with the write caches, it's strictly that the drives default
to having them on.  Some manufacturers let you adjust the default, which
should prevent the behavior SGI warns about from happening; Hitachi's
"Feature Tool" at http://www.hitachigst.com/hdd/support/download.htm is
one example I've used successfully before.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Lying drives [Was: Re: Which OS provides the

From: Bruce Momjian
Greg Smith wrote:
> On Mon, 13 Nov 2006, Guy Thornley wrote:
>
> > I've yet to find a drive that lies about write completion. The problem
> > is that the drives boot-up default is write-caching enabled (or perhaps
> > the system BIOS sets it that way). If you turn an IDE disks write cache
> > off explicity, using hdparm or similar, they behave.
>
> I found a rather ominous warning from SGI on this subject at
> http://oss.sgi.com/projects/xfs/faq.html#wcache_query
>
> "[Disabling the write cache] is kept persistent for a SCSI disk. However,
> for a SATA/PATA disk this needs to be done after every reset as it will
> reset back to the default of the write cache enabled. And a reset can
> happen after reboot or on error recovery of the drive. This makes it
> rather difficult to guarantee that the write cache is maintained as
> disabled."
>
> As I've been learning more about this subject recently, I've become
> increasingly queasy about using IDE drives for databases unless they're
> hooked up to a high-end (S|P)ATA controller.  As far as I know the BIOS

Yes, avoiding IDE for serious database servers is a conclusion I made
long ago.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +