Re: Raid 10 chunksize - Mailing list pgsql-performance

From Greg Smith
Subject Re: Raid 10 chunksize
Msg-id alpine.GSO.2.01.0904020437080.17611@westnet.com
In response to Re: Raid 10 chunksize  (Scott Carey <scott@richrelevance.com>)
List pgsql-performance
On Wed, 1 Apr 2009, Scott Carey wrote:

> Write caching on SATA is totally fine.  There were some old ATA drives that
> when paired with some file systems or OS's would not be safe.  There are
> some combinations that have unsafe write barriers.  But there is a standard
> well supported ATA command to sync and only return after the data is on
> disk.  If you are running an OS that is anything recent at all, and any
> disks that are not really old, you're fine.

While I would like to believe this, I don't trust any claims in this area
that don't have matching tests that demonstrate things working as
expected.  And I've never seen this work.

My laptop has a 7200 RPM drive, which means that if fsync is being passed
through to the disk correctly I can only fsync <120 times/second.  Here's
what I get when I run sysbench on it, starting with the default ext3
configuration:

$ uname -a
Linux gsmith-t500 2.6.28-11-generic #38-Ubuntu SMP Fri Mar 27 09:00:52 UTC 2009 i686 GNU/Linux

$ mount
/dev/sda3 on / type ext3 (rw,relatime,errors=remount-ro)

$ sudo hdparm -I /dev/sda | grep FLUSH
        *    Mandatory FLUSH_CACHE
        *    FLUSH_CACHE_EXT

$ ~/sysbench-0.4.8/sysbench/sysbench --test=fileio --file-fsync-freq=1 --file-num=1 --file-total-size=16384 \
    --file-test-mode=rndwr run
sysbench v0.4.8:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Extra file open flags: 0
1 files, 16Kb each
16Kb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 1 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random write test
Threads started!
Done.

Operations performed:  0 Read, 10000 Write, 10000 Other = 20000 Total
Read 0b  Written 156.25Mb  Total transferred 156.25Mb  (39.176Mb/sec)
  2507.29 Requests/sec executed


OK, that's clearly cached writes where the drive is lying about fsync.
The claim is that since my drive supports both the flush calls, I just
need to turn on barrier support, right?

[Edit /etc/fstab to remount with barriers]

$ mount
/dev/sda3 on / type ext3 (rw,relatime,errors=remount-ro,barrier=1)

[sysbench again]

  2612.74 Requests/sec executed
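
For reference, the arithmetic behind the <120 fsyncs/second ceiling can be sketched as below. This is only an illustration: the function name is made up here, and it assumes the simplest model, where each fsync'd rewrite of the same block has to wait for one full platter rotation.

```python
def max_fsyncs_per_second(rpm):
    """Upper bound on honest fsync'd rewrites of one block per second.

    Assumption: every commit waits for one full platter rotation,
    so the bound is simply rotations per second.
    """
    return rpm / 60.0


# 7200 RPM laptop drive from the tests above
ceiling = max_fsyncs_per_second(7200)

# Requests/sec figures copied from the two sysbench runs
results = [("default ext3", 2507.29), ("barrier=1", 2612.74)]

for label, measured in results:
    ratio = measured / ceiling
    print(f"{label}: {measured:.2f} req/s is {ratio:.1f}x the {ceiling:.0f}/s ceiling")
```

Both runs come out more than 20 times above what the platter could deliver if every fsync really reached the disk.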

-----

This is basically how this always works for me:  somebody claims barriers
and/or SATA disks work now, no really this time.  I test, the tests give
results that wouldn't be possible if fsync were working properly, and I
conclude that turning off the write cache is just as necessary as it
always was.  If you can suggest something wrong with how I'm testing here,
I'd love to hear about it.  I'd like to believe you but I can't seem to
produce any evidence that supports your claims here.
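
For anyone who wants to cross-check without sysbench, a minimal sketch of the same kind of test follows. It is not what sysbench does internally. The function name and defaults are hypothetical, chosen to mirror the run above (one 16Kb block rewritten with an fsync after every write).

```python
import os
import time


def fsync_rate(path, writes=1000, block_size=16 * 1024):
    """Rewrite one block `writes` times, fsync after each write,
    and return the achieved writes per second."""
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # always overwrite the same block
            os.write(fd, block)
            os.fsync(fd)                  # ask the OS to push it to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return writes / elapsed
```

Call it with a path on the filesystem under test, for example `fsync_rate("/var/tmp/fsync-test.dat")` (an arbitrary example path), not one on tmpfs. A result far above the rotational ceiling for the drive suggests the writes are being acknowledged from cache.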

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
