Re: win32 performance - fsync question - Mailing list pgsql-hackers

From Zeugswetter Andreas DAZ SD
Subject Re: win32 performance - fsync question
Date
Msg-id 46C15C39FEB2C44BA555E356FBCD6FA40184D303@m0114.s-mxs.net
In response to win32 performance - fsync question  ("E.Rodichev" <er@sai.msu.su>)
List pgsql-hackers
> >> One point that I no longer recall the reasoning behind is that xlog.c
> >> doesn't think O_SYNC is a preferable default over fsync.
> >
> >For larger (>8k) transactions O_SYNC|O_DIRECT is only good with the recent
> >pending patch to group WAL writes together. The fsync method gives the OS a
> >chance to do the grouping. (Of course it does not matter if you have small
> >tx < 8k WAL)
>
> This would be true for fdatasync() but not for fsync(), I think.

No, it is only worse with fsync, since that adds a mandatory seek.
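
For readers who want the two access patterns side by side, here is a minimal sketch in Python (not PostgreSQL code). The function names, the 8 kB block constant and the scratch file are assumptions made for illustration. `os.O_SYNC` does not exist on every platform, so the sketch falls back to 0 there, in which case the first function is no longer synchronous.

```python
import os
import tempfile

BLOCK = 8192  # assumed WAL block size (8 kB), as discussed in this thread


def write_osync(path, nblocks):
    """Write nblocks blocks to a file opened with O_SYNC.

    With O_SYNC set, every write() is its own synchronous request,
    so the OS has no opportunity to group the blocks.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        for _ in range(nblocks):
            os.write(fd, b"\0" * BLOCK)
    finally:
        os.close(fd)
    return nblocks  # number of synchronous requests issued


def write_then_fsync(path, nblocks):
    """Write nblocks blocks through the OS cache, then fsync once.

    The OS may merge the buffered blocks into fewer, larger device writes
    before the single fsync() forces them out.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(nblocks):
            os.write(fd, b"\0" * BLOCK)
        os.fsync(fd)
    finally:
        os.close(fd)
    return 1  # one explicit sync point
```

Both functions leave the same bytes on disk; what differs is how many times the caller waits for the device, which is the point under discussion.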

> On win32 (which started this discussion), fsync will sync the directory
> entry as well, which will lead to *at least* two seeks on the disk.
> Writing two blocks after each other to an O_SYNC opened file should give
> exactly two seeks.

I think you are making the following untenable assumptions:
1. There is no other outstanding I/O on that drive that the OS happily inserts between your two 8k writes.
2. The rotational delay is negligible.
3. The per-call overhead is negligible.

You will at least wait until the heads reach the write position again,
since you will not be able to supply the next 8k in time for the drive to
continue writing (with the single backend large tx I was referring to).
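
A rough back-of-the-envelope calculation shows what that rotational wait costs. It is a sketch under loud assumptions: a 7200 rpm drive (a figure chosen for illustration, not taken from this thread), exactly one full rotation lost per synchronous 8k write, and no seeks, call overhead or write cache. The function name is hypothetical.

```python
def max_sync_throughput(rpm, block_bytes):
    """Upper bound in bytes/s when each synchronous write costs one rotation.

    Assumption: the next block always arrives too late, so the drive
    completes at most one block per revolution.
    """
    rotations_per_second = rpm / 60.0
    return rotations_per_second * block_bytes


# Assumed 7200 rpm drive writing 8 kB blocks one rotation apart.
rate = max_sync_throughput(7200, 8192)
print(f"{rate / 1024:.0f} kB/s")  # prints "960 kB/s"
```

Under these assumptions the drive is limited to about 1 MB/s no matter how fast it can stream sequentially, which is why grouping the writes matters for a single backend with a large transaction.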

If you doubt what I am saying, run dd block size tests on a raw device.
The results show that increasing the block size up to ~256 kB improves
throughput on a drive that does not have a power-fail-safe cache and
does not lie about write success.
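
If no raw device is at hand, a similar block size sweep can be approximated with a short Python sketch. It is not equivalent to the dd test: it writes to a scratch file through the filesystem instead of a raw device, the function names, the block sizes and the 1 MiB total are assumptions, and on platforms without `os.O_SYNC` the flag falls back to 0, so the timings would then measure only the OS cache.

```python
import os
import tempfile
import time


def time_sync_writes(path, total_bytes, block_bytes):
    """Time writing total_bytes in block_bytes chunks with O_SYNC (if available)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    buf = b"\0" * block_bytes
    fd = os.open(path, flags, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(total_bytes // block_bytes):
            os.write(fd, buf)
    finally:
        os.close(fd)
    return time.perf_counter() - start


def sweep(total_bytes=1 << 20, sizes=(8192, 32768, 131072, 262144)):
    """Return {block size in bytes: elapsed seconds} for each size."""
    results = {}
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "blocktest.dat")
        for size in sizes:
            results[size] = time_sync_writes(path, total_bytes, size)
    return results
```

Calling `sweep()` and printing the result gives one timing per block size. The numbers depend heavily on the drive, its cache settings and the filesystem, so they should be read as an indication only.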

Andreas

