On Thu, 2005-06-02 at 11:49 -0700, Mary Edie Meredith wrote:
> My understanding is that O_DIRECT means "direct" as in "no buffering by
> the OS" which implies that if you write from your buffer, the write is
> not going to return unless the OS thinks the write is completed
Right, I think that's definitely the case. The question is whether a
write() under O_DIRECT will also flush the disk's write cache -- i.e.
when the write() completes, we need it to be durable over a spontaneous
power loss. fsync() or O_SYNC should provide this (modulo braindamaged
IDE hardware), but I wouldn't be surprised if O_DIRECT by itself will
not (otherwise you would hurt the performance of applications using
O_DIRECT that don't need these durability guarantees).
> Bottom line: if you do not implement direct/async IO so that you
> optimize caching of hot database objects and minimize memory utilization
> of objects used once, you are probably leaving performance on the table
> for datafiles.
Absolutely -- patches are welcome :) I agree async IO + O_DIRECT in some
form would be interesting, but the changes required are far from trivial
-- my guess is there are lower hanging fruit.
-Neil