Hi all,
O_DIRECT for WAL writes was discussed at
http://archives.postgresql.org/pgsql-patches/2005-06/msg00064.php
but some items still need discussion, so I am re-posting it to HACKERS.
Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> I think the conclusion from the discussion is that O_DIRECT is in
> addition to the sync method, rather than in place of it, because
> O_DIRECT doesn't have the same media write guarantees as fsync(). Would
> you update the patch to do and see if there is a performance win?
I tested two combinations,
  - fsync_direct: O_DIRECT+fsync()
  - open_direct: O_DIRECT+O_SYNC
to compare them with O_DIRECT alone on my Linux machine.
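For reference, the two combinations differ only in where the sync happens. The following is a minimal sketch, not code from the patch: the enum and function names are hypothetical, and the fallback definition of O_DIRECT is an assumption for platforms that lack it.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <fcntl.h>
#include <stdbool.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* assumption: degrade to a no-op where unsupported */
#endif

/* Hypothetical names for the variants compared in this mail. */
typedef enum
{
    WAL_DIRECT_ONLY,            /* O_DIRECT only */
    WAL_FSYNC_DIRECT,           /* O_DIRECT + fsync() */
    WAL_OPEN_DIRECT             /* O_DIRECT + O_SYNC */
} WalDirectMethod;

/* Flags to pass to open() for a WAL segment under each method. */
static int
wal_open_flags(WalDirectMethod method)
{
    int flags = O_RDWR | O_DIRECT;

    if (method == WAL_OPEN_DIRECT)
        flags |= O_SYNC;        /* write() itself waits for the sync */
    return flags;
}

/* Whether an explicit fsync() must follow the write(). */
static bool
wal_needs_fsync(WalDirectMethod method)
{
    return method == WAL_FSYNC_DIRECT;
}
```

With fsync_direct the caller issues write() and then fsync(); with open_direct the sync is implied by the open flags, so no separate call is made.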
The pgbench results still show a performance win:
scale| DBsize | open_sync | fsync=false  | O_DIRECT only| fsync_direct | open_direct
-----+--------+-----------+--------------+--------------+--------------+---------------
   10| 150MB  | 252.6 tps | 263.5(+ 4.3%)| 253.4(+ 0.3%)| 253.6(+ 0.4%)| 253.3(+ 0.3%)
  100| 1.5GB  | 102.7 tps | 117.8(+14.7%)| 147.6(+43.7%)| 148.9(+45.0%)| 150.8(+46.8%)
60runs * pgbench -c 10 -t 1000
on one Pentium4, 1GB mem, 2 ATA disks, Linux 2.6.8
O_DIRECT, fsync_direct and open_direct show the same performance trend.
There was a win at scale=100, but no win at scale=10, which is a fully
in-memory benchmark.
The following items still need discussion:
  - Are their names appropriate? Simplify to 'direct'?
  - Are both fsync_direct and open_direct necessary?
    MySQL seems to use only the O_DIRECT+fsync() combination.
  - Is it ok to set the dio buffer alignment to BLCKSZ?
    This is a simple way to choose an alignment that works in many
    environments. If it is not enough, BLCKSZ would also be a problem
    for direct io.
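The alignment question can be made concrete with a small sketch. This is an illustration only, not code from the patch: BLCKSZ is defined here as 8192 (PostgreSQL's default block size) as an assumption, and the helper names are hypothetical. On Linux, O_DIRECT generally also expects the transfer length and file offset to be suitably aligned, so the sketch rounds lengths up to a BLCKSZ multiple as well.

```c
#define _POSIX_C_SOURCE 200112L /* for posix_memalign() */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifndef BLCKSZ
#define BLCKSZ 8192             /* assumption: PostgreSQL's default block size */
#endif

/* Round len up to the next multiple of BLCKSZ (must be a power of two). */
static size_t
dio_round_up(size_t len)
{
    return (len + BLCKSZ - 1) & ~((size_t) BLCKSZ - 1);
}

/*
 * Allocate a buffer whose address and size are both BLCKSZ-aligned.
 * Returns NULL on failure; release the buffer with free().
 */
static void *
dio_alloc(size_t len)
{
    void *buf = NULL;

    if (posix_memalign(&buf, BLCKSZ, dio_round_up(len)) != 0)
        return NULL;
    return buf;
}

/* True if p sits on a BLCKSZ boundary. */
static int
dio_is_aligned(const void *p)
{
    return ((uintptr_t) p % BLCKSZ) == 0;
}
```

Since BLCKSZ is a multiple of the usual sector and page sizes, a buffer aligned this way should satisfy stricter-than-needed requirements on most systems, at the cost of some padding.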
BTW, IMHO the major benefit of direct io is saving memory. O_DIRECT gives
a hint that the OS should not cache WAL files. Without direct io, the OS
might make an effort to cache WAL files, which will never be used, and
might discard data file cache.
---
ITAGAKI Takahiro
NTT Cyber Space Laboratories