Re: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options - Mailing list pgsql-performance
From | Bruce Momjian |
---|---|
Subject | Re: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options |
Date | |
Msg-id | 200409131438.i8DEc8r04384@candle.pha.pa.us |
In response to | Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options (mudfoot@rawbw.com) |
Responses |
Re: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
Re: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options |
List | pgsql-performance |
Have you seen /src/tools/fsync?

---------------------------------------------------------------------------

mudfoot@rawbw.com wrote:
> Hi, I'd like to help with the topic in the Subject: line.  It seems to be a
> TODO item.  I've reviewed some threads discussing the matter, so I hope I've
> acquired enough history concerning it.  I've taken an initial swipe at
> figuring out how to optimize sync'ing methods.  It's based largely on
> recommendations I've read on previous threads about fsync/O_SYNC and so on.
> After reviewing, if anybody has recommendations on how to proceed then I'd
> love to hear them.
>
> Attached is a little program that basically does a bunch of sequential writes
> to a file.  All of the sync'ing methods supported by PostgreSQL WAL can be
> used.  Results are printed in microseconds.  Size and quantity of writes are
> configurable.  The documentation is in the code (how to configure, build, run,
> etc.).  I realize that this program doesn't reflect all of the possible
> activities of a production database system, but I hope it's a step in the
> right direction for this task.  I've used it to see differences in behavior
> between the various sync'ing methods on various platforms.
>
> Here's what I've found running the benchmark on some systems to which
> I have access.  The differences in behavior between platforms are quite vast.
>
> Summary first...
>
> <halfjoke>
> PostgreSQL should be run on an old Apple Macintosh attached to
> its own Hitachi disk array with 2GB cache or so.  Use any sync method
> except for fsync().
> </halfjoke>
>
> Anyway, there is *a lot* of variance in file syncing behavior across
> different hardware and O/S platforms.  It's probably not safe
> to conclude much.  That said, here are some findings so far based on
> tests I've run:
>
>   1. under no circumstances do fsync() or fdatasync() seem to perform
>      better than opening files with O_SYNC or O_DSYNC
>   2. where there are differences, opening files with O_SYNC or O_DSYNC
>      tends to be considerably faster.
>   3. fsync() seems to be the slowest where there are differences.  And
>      O_DSYNC seems to be the fastest where results differ.
>   4. the safest thing to assert at this point is that
>      Solaris systems ought to use the O_DSYNC method for WAL.
>
> -----------
>
> Test system(s)
>
> Athlon Linux:
>   AMD Athlon XP2000, 512MB RAM, single (5400 or 7200?) RPM 20GB IDE disk,
>   reiserfs filesystem (3 something I think)
>   SuSE Linux kernel 2.4.21-99
>
> Mac Linux:
>   I don't know the specific model.  400MHz G3, 512MB, single IDE disk,
>   ext2 filesystem
>   Debian GNU/Linux 2.4.16-powerpc
>
> HP Intel Linux:
>   ProLiant HPDL380G3, 2 x 3GHz Xeon, 2GB RAM, SmartArray 5i 64MB cache,
>   2 x 15,000RPM 36GB U320 SCSI drives mirrored.  I'm not sure if
>   writes are cached or not.  There's no battery backup.
>   ext3 filesystem.
>   Red Hat Enterprise Linux 3.0 kernel based on 2.4.21
>
> Dell Intel OpenBSD:
>   PowerEdge ?, single 1GHz PIII, 128MB RAM, single 7200RPM 80GB IDE disk,
>   ffs filesystem
>   OpenBSD 3.2 GENERIC kernel
>
> SUN Ultra2:
>   Ultra2, 2 x 296MHz UltraSPARC II, 2GB RAM, 2 x 10,000RPM 18GB U160
>   SCSI drives mirrored with Solstice DiskSuite.  UFS filesystem.
>   Solaris 8.
>
> SUN E4500 + HDS Thunder 9570v
>   E4500, 8 x 400MHz UltraSPARC II, 3GB RAM,
>   HDS Thunder 9570v, 2GB mirrored battery-backed cache, RAID5 with a
>   bunch of 146GB 10,000RPM FC drives.  LUN is on single 2GB FC fabric
>   connection.
>   Veritas filesystem (VxFS)
>   Solaris 8.
>
> Test methodology:
>
> All test runs were done with CHUNKSIZE 8 * 1024, CHUNKS 2 * 1024,
> FILESIZE_MULTIPLIER 2, and SLEEP 5.  So a total of 16MB was sequentially
> written for each benchmark.
>
> Results are in microseconds.
>
> PLATFORM:       Athlon Linux
> buffered:       48220
> fsync:          74854397
> fdatasync:      75061357
> open_sync:      73869239
> open_datasync:  74748145
> Notes:  System mostly idle.  Even during tests, top showed about 95%
> idle.  Something's not right on this box.  All sync methods similarly
> horrible on this system.
>
> PLATFORM:       Mac Linux
> buffered:       58912
> fsync:          1539079
> fdatasync:      769058
> open_sync:      767094
> open_datasync:  763074
> Notes:  system mostly idle.  fsync seems worst.  Otherwise, they seem
> pretty equivalent.  This is the fastest system tested.
>
> PLATFORM:       HP Intel Linux
> buffered:       33026
> fsync:          29330067
> fdatasync:      28673880
> open_sync:      8783417
> open_datasync:  8747971
> Notes:  system idle.  O_SYNC and O_DSYNC methods seem to be a lot
> better on this platform than fsync & fdatasync.
>
> PLATFORM:       Dell Intel OpenBSD
> buffered:       511890
> fsync:          1769190
> fdatasync:      --------
> open_sync:      1748764
> open_datasync:  1747433
> Notes:  system idle.  I couldn't locate fdatasync() on this box, so I
> couldn't test it.  All sync methods seem equivalent and are very fast --
> though they still trail the old Mac.
>
> PLATFORM:       SUN Ultra2
> buffered:       1814824
> fsync:          73954800
> fdatasync:      52594532
> open_sync:      34405585
> open_datasync:  13883758
> Notes:  system mostly idle, with occasional spikes from 1-10% utilization.
> There looks to be a substantial difference between each sync method, with
> O_DSYNC the best and fsync() the worst.  There is a substantial
> difference between the open* and f* methods.
>
> PLATFORM:       SUN E4500 + HDS Thunder 9570v
> buffered:       233947
> fsync:          57802065
> fdatasync:      56631013
> open_sync:      2362207
> open_datasync:  1976057
> Notes:  host about 30% idle, but the array tested on was completely idle.
> Something looks seriously not right about fsync and fdatasync -- write
> cache seems to have no effect on them.  As for write cache, that
> probably explains the 2 seconds or so for the open_sync and
> open_datasync methods.
>
> --------------
>
> Thanks for reading...I look forward to feedback, and hope to be helpful in
> this effort!
>
> Mark
>

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
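
For readers who do not have the attachment, the write pattern behind the five result labels above can be sketched roughly as follows. This is neither Mark's program nor src/tools/fsync. The function name time_sync_writes, the enum and its members, and the fallback from O_DSYNC to O_SYNC are all assumptions made for illustration. It also assumes a POSIX system that provides fdatasync() and clock_gettime(), and that one sync is issued per write.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#ifndef O_DSYNC
#define O_DSYNC O_SYNC  /* assumption: fall back where O_DSYNC is missing */
#endif

/* Hypothetical names that mirror the labels used in the results. */
enum sync_method {
    SYNC_BUFFERED,      /* plain write(), no explicit sync */
    SYNC_FSYNC,         /* write() followed by fsync() */
    SYNC_FDATASYNC,     /* write() followed by fdatasync() */
    SYNC_OPEN_SYNC,     /* file opened with O_SYNC */
    SYNC_OPEN_DATASYNC  /* file opened with O_DSYNC */
};

/*
 * Sequentially write `chunks` blocks of `chunk_size` bytes to `path`
 * using the given method. Returns elapsed wall-clock time in
 * microseconds, or -1 if any system call fails.
 */
long long time_sync_writes(const char *path, enum sync_method method,
                           size_t chunk_size, int chunks)
{
    assert(chunk_size > 0 && chunks > 0);

    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    if (method == SYNC_OPEN_SYNC)
        flags |= O_SYNC;
    if (method == SYNC_OPEN_DATASYNC)
        flags |= O_DSYNC;

    char *buf = malloc(chunk_size);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', chunk_size);

    int fd = open(path, flags, 0600);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    struct timespec start, end;
    int failed = 0;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < chunks && !failed; i++) {
        if (write(fd, buf, chunk_size) != (ssize_t) chunk_size)
            failed = 1;
        else if (method == SYNC_FSYNC && fsync(fd) != 0)
            failed = 1;
        else if (method == SYNC_FDATASYNC && fdatasync(fd) != 0)
            failed = 1;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fd);
    free(buf);
    if (failed)
        return -1;

    /* Compute the difference in nanoseconds, then convert to microseconds. */
    long long ns = (end.tv_sec - start.tv_sec) * 1000000000LL
                 + (end.tv_nsec - start.tv_nsec);
    return ns / 1000;
}
```

Under these assumptions, a call such as time_sync_writes("testfile", SYNC_FSYNC, 8 * 1024, 2 * 1024) writes 16MB in 8KB chunks, which matches the CHUNKSIZE and CHUNKS values quoted in the test methodology. The timings depend heavily on the disk, controller cache, and filesystem, so a single run says little on its own.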