Re: Maximum transaction rate - Mailing list pgsql-general

From Ron Mayer
Subject Re: Maximum transaction rate
Msg-id 49C29896.1020607@cheapcomplexdevices.com
In response to Re: Maximum transaction rate  (Marco Colombo <pgsql@esiway.net>)
Responses Re: Maximum transaction rate  (Baron Schwartz <baron@xaprb.com>)
Re: Maximum transaction rate  (Marco Colombo <pgsql@esiway.net>)
List pgsql-general
Marco Colombo wrote:
> Yes, but we knew it already, didn't we? It's always been like
> that, with IDE disks and write-back cache enabled, fsync just
> waits for the disk reporting completion and disks lie about

I've looked hard, and I have yet to see a disk that lies.

ext3, OTOH, seems to lie.

IDE drives happily report whether they support write barriers
or not, which you can see with the command:
%hdparm -I /dev/hdf | grep FLUSH_CACHE_EXT
I've tested about a dozen drives, and I've never seen one that
claims to support flushing but doesn't.  And I haven't seen
one that doesn't support it that was made less than half a
decade ago.  IIRC, ATA-5 specs from 2000 made supporting
this mandatory.

Linux kernels since 2005 or so check for this feature.  It'll
happily tell you which of your devices don't support it.
  %dmesg | grep 'disabling barriers'
  JBD: barrier-based sync failed on md1 - disabling barriers
And for devices that do, it will happily send IDE FLUSH CACHE
commands to IDE drives that support the feature.   At the same
time Linux kernels started sending the very similar SCSI
SYNCHRONIZE CACHE commands.
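
If you have a lot of boxes to check, the dmesg grep above is easy to
script.  A minimal sketch, assuming the kernel words the message exactly
like the one line quoted above (other kernel versions may phrase it
differently); the function name devices_without_barriers is made up
for illustration:

```python
import re

# Pattern modeled on the single log line quoted in this message.
# Assumption: other kernels may word the message differently.
BARRIER_FAIL = re.compile(
    r"JBD: barrier-based sync failed on (\S+) - disabling barriers"
)

def devices_without_barriers(log_text):
    """Return device names from 'disabling barriers' lines, without duplicates."""
    seen = []
    for line in log_text.splitlines():
        match = BARRIER_FAIL.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen

# Usage with the line quoted above; feed it real dmesg output instead.
sample = "JBD: barrier-based sync failed on md1 - disabling barriers\n"
print(devices_without_barriers(sample))  # prints ['md1']
```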


> Anyway, it's the block device job to control disk caches. A
> filesystem is just a client to the block device, it posts a
> flush request, what happens depends on the block device code.
> The FS doesn't talk to disks directly. And a write barrier is
> not a flush request, is a "please do not reorder" request.
> On fsync(), ext3 issues a flush request to the block device,
> that's all it's expected to do.

But AFAICT ext3 fsync() only tells the block device to
flush disk caches if the inode was changed.

Or, at least empirically, if I modify a file and do
fsync(fd); on ext3 it does not wait until the disk has
spun to where it's supposed to spin.   But if I put
a couple fchmod()'s right before the fsync() it does.
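
A rough sketch of how that comparison could be reproduced.  This is not
the test program used for the observation above; the function names
(time_fsync, compare), the 512-byte write, and the round count are all
made up for illustration.  It overwrites the same block in place so the
file size doesn't change, then times fsync() with and without a pair of
fchmod() calls in front of it:

```python
import os
import tempfile
import time

def time_fsync(path, touch_inode, rounds=20):
    """Average seconds per fsync() after an in-place overwrite.

    If touch_inode is true, flip the mode with two fchmod() calls first
    so the inode is dirty when fsync() runs.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # Write once and sync so later overwrites don't grow the file.
        os.write(fd, b"x" * 512)
        os.fsync(fd)
        total = 0.0
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, b"y" * 512)
            if touch_inode:
                os.fchmod(fd, 0o644)
                os.fchmod(fd, 0o600)
            start = time.perf_counter()
            os.fsync(fd)
            total += time.perf_counter() - start
        return total / rounds
    finally:
        os.close(fd)

def compare(directory=".", rounds=20):
    """Return (plain, with_fchmod) average fsync() times for a temp file."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        plain = time_fsync(path, touch_inode=False, rounds=rounds)
        touched = time_fsync(path, touch_inode=True, rounds=rounds)
    finally:
        os.unlink(path)
    return plain, touched

if __name__ == "__main__":
    plain, touched = compare()
    print("fsync only:     %.6f s" % plain)
    print("fchmod + fsync: %.6f s" % touched)
```

Run it in a directory on the filesystem you care about (not a tmpfs).
If the plain number is far below one disk rotation and the fchmod
number isn't, that matches what I described; the numbers will depend on
the kernel, mount options, and drive.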
