Re: HFS+ pg_test_fsync performance - Mailing list pgsql-performance

From desmodemone
Subject Re: HFS+ pg_test_fsync performance
Msg-id CAEs9oFnYFb__R1RouZbL_vhradyRs5MOcYMHV4oob+SF9QrRwA@mail.gmail.com
In response to HFS+ pg_test_fsync performance  (Mel Llaguno <mllaguno@coverity.com>)
Responses Re: HFS+ pg_test_fsync performance  (Mel Llaguno <mllaguno@coverity.com>)
List pgsql-performance



2014-04-15 0:32 GMT+02:00 Mel Llaguno <mllaguno@coverity.com>:
I was given anecdotal information regarding HFS+ performance under OSX as
being unsuitable for production PG deployments and that pg_test_fsync
could be used to measure the relative speed versus other operating systems
(such as Linux). In my performance lab, I have a number of similarly
equipped Linux hosts (Ubuntu 12.04 64-bit LTS Server w/128GB RAM / 2 OWC
6G Mercury Extreme SSDs / 7200rpm SATA3 HDD / 16 E5-series cores) that I
used to capture baseline Linux numbers. As we generally recommend our
customers use SSD (the S3700 recommended by PG), I wanted to perform a
comparison. On these beefy machines I ran the following tests:

SSD:

# pg_test_fsync -f ./fsync.out -s 30
30 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                        2259.652 ops/sec     443 usecs/op
        fsync                            1949.664 ops/sec     513 usecs/op
        fsync_writethrough                            n/a
        open_sync                        2245.162 ops/sec     445 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                        2161.941 ops/sec     463 usecs/op
        fsync                            1891.894 ops/sec     529 usecs/op
        fsync_writethrough                            n/a
        open_sync                        1118.826 ops/sec     894 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        2171.558 ops/sec     460 usecs/op
         2 *  8kB open_sync writes       1126.490 ops/sec     888 usecs/op
         4 *  4kB open_sync writes        569.594 ops/sec    1756 usecs/op
         8 *  2kB open_sync writes        285.149 ops/sec    3507 usecs/op
        16 *  1kB open_sync writes        142.528 ops/sec    7016 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close              1947.557 ops/sec     513 usecs/op
        write, close, fsync              1951.082 ops/sec     513 usecs/op

Non-Sync'ed 8kB writes:
        write                           481296.909 ops/sec       2 usecs/op


HDD:

# pg_test_fsync -f /tmp/fsync.out -s 30
30 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                         105.783 ops/sec    9453 usecs/op
        fsync                              27.692 ops/sec   36111 usecs/op
        fsync_writethrough                            n/a
        open_sync                         103.399 ops/sec    9671 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                         104.647 ops/sec    9556 usecs/op
        fsync                              27.223 ops/sec   36734 usecs/op
        fsync_writethrough                            n/a
        open_sync                          55.839 ops/sec   17909 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write         103.581 ops/sec    9654 usecs/op
         2 *  8kB open_sync writes         55.207 ops/sec   18113 usecs/op
         4 *  4kB open_sync writes         28.320 ops/sec   35311 usecs/op
         8 *  2kB open_sync writes         14.581 ops/sec   68582 usecs/op
        16 *  1kB open_sync writes          7.407 ops/sec  135003 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close                27.228 ops/sec   36727 usecs/op
        write, close, fsync                27.108 ops/sec   36890 usecs/op

Non-Sync'ed 8kB writes:
        write                           466108.001 ops/sec       2 usecs/op


-------

So far, so good. Local HDD vs. SSD shows a significant difference in fsync
performance. Here are the corresponding fstab entries:

/dev/mapper/cim-base          /opt/cim  ext4  defaults,noatime,nodiratime,discard  0  2  (SSD)
/dev/mapper/p--app--lin-root  /         ext4  errors=remount-ro                    0  1  (HDD)

I then ran pg_test_fsync on my OSX Mavericks machine (quad-core i7 /
Intel 520 SSD / 16GB RAM) and got the following results:

# pg_test_fsync -s 30 -f ./fsync.out
30 seconds per test
Direct I/O is not supported on this platform.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                    8752.240 ops/sec     114 usecs/op
        fdatasync                        8556.469 ops/sec     117 usecs/op
        fsync                            8831.080 ops/sec     113 usecs/op
        fsync_writethrough                735.362 ops/sec    1360 usecs/op
        open_sync                        8967.000 ops/sec     112 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                    4256.906 ops/sec     235 usecs/op
        fdatasync                        7485.242 ops/sec     134 usecs/op
        fsync                            7335.658 ops/sec     136 usecs/op
        fsync_writethrough                716.530 ops/sec    1396 usecs/op
        open_sync                        4303.408 ops/sec     232 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        7559.381 ops/sec     132 usecs/op
         2 *  8kB open_sync writes       4537.573 ops/sec     220 usecs/op
         4 *  4kB open_sync writes       2539.780 ops/sec     394 usecs/op
         8 *  2kB open_sync writes       1307.499 ops/sec     765 usecs/op
        16 *  1kB open_sync writes        659.985 ops/sec    1515 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close              9003.622 ops/sec     111 usecs/op
        write, close, fsync              8035.427 ops/sec     124 usecs/op

Non-Sync'ed 8kB writes:
        write                           271112.074 ops/sec       4 usecs/op

-------


These results were unexpected and surprising. In almost every metric (with
the exception of the Non-Sync'ed 8kB writes), OSX Mavericks 10.9.2 using
HFS+ out-performed my Ubuntu servers. While the SSDs come from different
manufacturers, both use the SandForce SF-2281 controller.
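
To put a number on "out-performed", the following is a minimal sketch that divides the OSX figures by the Linux SSD figures for the "one 8kB write" test. The values are copied from the runs above; the dictionary and function names are only illustrative.

```python
# ops/sec from the "Compare file sync methods using one 8kB write" sections above.
linux_ssd = {"fdatasync": 2259.652, "fsync": 1949.664, "open_sync": 2245.162}
osx_ssd = {
    "fdatasync": 8556.469,
    "fsync": 8831.080,
    "open_sync": 8967.000,
    "fsync_writethrough": 735.362,
}


def speedup(osx, linux):
    """Return OSX ops/sec divided by Linux ops/sec for methods present in both."""
    return {method: osx[method] / linux[method] for method in linux if method in osx}


ratios = speedup(osx_ssd, linux_ssd)

# fsync_writethrough is n/a on Linux, so it is compared against Linux fsync instead.
writethrough_vs_fsync = osx_ssd["fsync_writethrough"] / linux_ssd["fsync"]

if __name__ == "__main__":
    for method, ratio in ratios.items():
        print(f"{method:10s} OSX / Linux = {ratio:.2f}x")
    print(f"OSX fsync_writethrough / Linux fsync = {writethrough_vs_fsync:.2f}x")
```

On these numbers OSX comes out roughly four times faster for fdatasync, fsync and open_sync, while fsync_writethrough is the one OSX figure that falls below the Linux SSD results.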

Plausible explanations of the apparent disparity in fsync performance
would be welcome.

Thanks, Mel

P.S. One more thing: I found this article, which maps fsync mechanisms
to operating systems:
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm

This article suggests that both open_datasync and fdatasync are _not_
supported for OSX, but the pg_test_fsync results suggest otherwise.
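
One quick way to see which of these primitives a given platform exposes at all is to ask a language runtime that mirrors the C library's definitions. The sketch below uses Python's `os` and `fcntl` modules; the function name is made up for illustration, and a `True` only means the symbol was defined when Python was built, not that the filesystem or drive honors it.

```python
import os


def available_sync_primitives():
    """Report which sync-related flags and calls this Python build exposes."""
    names = ["O_SYNC", "O_DSYNC", "O_DIRECT", "fsync", "fdatasync"]
    result = {name: hasattr(os, name) for name in names}
    try:
        import fcntl  # Unix only; not available on Windows
        # F_FULLFSYNC is the OSX fcntl command behind fsync_writethrough.
        result["F_FULLFSYNC"] = hasattr(fcntl, "F_FULLFSYNC")
    except ImportError:
        result["F_FULLFSYNC"] = False
    return result


if __name__ == "__main__":
    for name, present in available_sync_primitives().items():
        print(f"{name:12s} {'yes' if present else 'no'}")
```

Running it on both the Linux hosts and the OSX machine would show whether the two platforms differ in what they define, which is a separate question from how each one implements the call.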



My 2 cents:

The results are not surprising. In the Linux environment, the I/O calls of pg_test_fsync use O_DIRECT (PG_O_DIRECT) together with O_SYNC or O_DSYNC (for the open_datasync and open_sync tests, as the output above notes), so in practice each write waits for the "answer" from the storage, bypassing the cache, in sync mode. On Mac OS X this is not the case: it uses only O_SYNC or O_DSYNC without O_DIRECT, so in practice it goes through the filesystem cache even though it requests synchronous I/O.
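
The difference described above can be tried out with a small script. This is only a rough sketch, not how pg_test_fsync itself is implemented: the function names, the probe file name and the write count are all assumptions made for illustration. It opens a file with O_DSYNC (falling back to O_SYNC), optionally adds O_DIRECT where the platform defines it, and times repeated 8kB writes.

```python
import mmap
import os
import tempfile
import time

BLOCK = 8192  # 8kB, the write size used in the pg_test_fsync output


def sync_open_flags(use_direct):
    """Flags for a synchronous-write open, adding O_DIRECT only where it exists."""
    sync_flag = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))
    flags = os.O_RDWR | os.O_CREAT | sync_flag
    if use_direct and hasattr(os, "O_DIRECT"):
        flags |= os.O_DIRECT
    return flags


def time_sync_writes(directory, use_direct, count=50):
    """Return ops/sec for `count` 8kB writes, or None if the open or write is refused."""
    path = os.path.join(directory, "sync_probe.out")  # hypothetical file name
    buf = mmap.mmap(-1, BLOCK)  # anonymous mapping: page-aligned, which O_DIRECT needs
    buf.write(b"x" * BLOCK)
    fd = None
    try:
        fd = os.open(path, sync_open_flags(use_direct), 0o600)
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, buf)
        elapsed = time.perf_counter() - start
        return count / elapsed if elapsed > 0 else None
    except OSError:
        # Some filesystems reject O_DIRECT at open or write time.
        return None
    finally:
        if fd is not None:
            os.close(fd)
        buf.close()
        if os.path.exists(path):
            os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory(dir=".") as workdir:
        for use_direct in (False, True):
            rate = time_sync_writes(workdir, use_direct)
            label = "sync + O_DIRECT" if use_direct else "sync only"
            print(label, "n/a" if rate is None else f"{rate:.1f} ops/sec")
```

On a platform without O_DIRECT, both runs use the same flags, so any gap between them is noise. The absolute numbers from an interpreted loop like this will not match pg_test_fsync, but the relative difference between the two modes on one machine can still be informative.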


Bye

Mat Dba
