Re: fsync or fdatasync - Mailing list pgsql-admin
From | Bruce Momjian |
---|---|
Subject | Re: fsync or fdatasync |
Date | |
Msg-id | 200209102107.g8AL7UN23077@candle.pha.pa.us |
In response to | Re: fsync or fdatasync (Ragnar Kjørstad <postgres@ragnark.vestdata.no>) |
Responses | Re: fsync or fdatasync |
List | pgsql-admin |
Ragnar Kjørstad wrote:
> > open_datasync is the first choice if available.
>
> I assume open_datasync means open with O_SYNC flag..

Yes.

> > > Why? That will slow things down...
> >
> > On what evidence do you assert that?
> >
> > In theory open_datasync can be the fastest alternative for WAL writing,
> > because it should cause the kernel to force each WAL write() request
> > down to disk immediately. fdatasync will result in the same amount of
> > I/O, but it will also require the kernel to scan its disk cache to see
> > if there are any other dirty blocks that need to be written. On many
> > kernels this check is not very efficient and can chew substantial
> > amounts of CPU time.
>
> Yes, I see your argument.
> However, I've just checked the linux-implementation of fsync() and I
> can't really see how it could chew substantial amounts of CPU time. The
> way it works every inode has a list of dirty data buffers - all it does
> is traverse that list and do a write on each.

Remember we support >15 platforms, and I know there is at least one
(HPUX?) which does the fsync/fdatasync block finding inefficiently. It
may have even been old Linux; I cannot remember.

> Anyway - I'm sure this is not enough to convince you, so I'll have to
> set up a test instead. But not tonight.

Again, that is a test case for only one OS. It is helpful if we are
going to start doing per-OS defaults, which is something we have talked
about.

What would be great is a test program we can run on different OS's to
find out which is more efficient.

> > The tradeoff is that open_datasync syncs each WAL
> > block individually, which is unnecessary if you are committing
> > multiple blocks worth of WAL entries at once --- but there's no hard
> > evidence that that slows things down, especially not when the WAL logs
> > are on their own disk spindle.
>
> Well, in theory fsync() will allow the disk to reorder the writes, and
> that should give significantly better performance, because it will
> reduce the required number of seeks. If the WAL is on a separate spindle
> there will be very few seeks in the first place, so there is less to gain,
> but for the case with the WAL on the same disk as something else there
> is probably some gain. But it makes sense to optimize for the
> WAL-on-separate-disk case...

Remember, in most cases, we are fsync'ing only one block so there is no
_gathering_ to do.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
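
The message asks for a test program that can be run on different OS's to compare the sync methods. What follows is only a rough sketch of what such a test might look like, not the program the list eventually used. It is written in Python for brevity. The names `time_sync_method`, `available_methods`, `METHODS` and `BLOCK_SIZE` are hypothetical, the 8192-byte block size is an assumption, and "open_datasync" is assumed here to mean opening the file with `O_DSYNC` where the platform offers it. The "fdatasync_batched" variant is an extra case added to illustrate the multi-block commit discussed in the thread: all blocks are written first, then synced once.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8192  # assumed block size, for illustration only

# Hypothetical labels for the strategies compared in this sketch.
METHODS = ("open_datasync", "fdatasync", "fsync", "fdatasync_batched")


def available_methods():
    """Return the strategies this platform's os module can express."""
    methods = ["fsync"]
    if hasattr(os, "fdatasync"):
        methods += ["fdatasync", "fdatasync_batched"]
    if hasattr(os, "O_DSYNC"):
        methods.append("open_datasync")
    return methods


def time_sync_method(method, n_blocks=16, directory=None):
    """Write n_blocks blocks with one sync strategy; return elapsed seconds."""
    if method not in METHODS:
        raise ValueError("unknown sync method: %r" % (method,))

    flags = os.O_WRONLY
    if method == "open_datasync":
        # Each write() should reach the disk before it returns.
        flags |= os.O_DSYNC

    # Create a scratch file, then reopen it with the flags under test.
    tmp_fd, path = tempfile.mkstemp(dir=directory)
    os.close(tmp_fd)
    fd = os.open(path, flags)
    block = b"\0" * BLOCK_SIZE
    try:
        start = time.perf_counter()
        for _ in range(n_blocks):
            os.write(fd, block)
            if method == "fdatasync":
                os.fdatasync(fd)  # sync after every block
            elif method == "fsync":
                os.fsync(fd)  # sync data and metadata after every block
        if method == "fdatasync_batched":
            os.fdatasync(fd)  # one sync for the whole batch
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed


if __name__ == "__main__":
    for name in available_methods():
        print("%-18s %.6f s" % (name, time_sync_method(name)))
```

Timings from a sketch like this depend heavily on the file system, the drive's write cache, and the directory chosen with `directory`, so they say little beyond the machine they were measured on. The scratch file also grows with every write, which adds metadata work that a preallocated log file would avoid; a more careful test would fill the file first and then overwrite it.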