Re: fsync or fdatasync - Mailing list pgsql-admin

From Bruce Momjian
Subject Re: fsync or fdatasync
Date
Msg-id 200209102107.g8AL7UN23077@candle.pha.pa.us
In response to Re: fsync or fdatasync  (Ragnar Kjørstad <postgres@ragnark.vestdata.no>)
Responses Re: fsync or fdatasync
List pgsql-admin
Ragnar Kjørstad wrote:
> > open_datasync is the first choice if available.
>
> I assume open_datasync means open with O_SYNC flag..

Yes.

> > > Why? That will slow things down...
> >
> > On what evidence do you assert that?
> >
> > In theory open_datasync can be the fastest alternative for WAL writing,
> > because it should cause the kernel to force each WAL write() request
> > down to disk immediately.  fdatasync will result in the same amount of
> > I/O, but it will also require the kernel to scan its disk cache to see
> > if there are any other dirty blocks that need to be written.  On many
> > kernels this check is not very efficient and can chew substantial
> > amounts of CPU time.
>
> Yes, I see your argument.
> However, I've just checked the linux-implementation of fsync() and I
> can't really see how it could chew substantial amounts of CPU time. The
> way it works every inode has a list of dirty data buffers - all it does
> is traverse that list and do a write on each.

Remember we support >15 platforms, and I know there is at least one
(HPUX?) which does the fsync/fdatasync block finding inefficiently. It
may have even been old Linux; I can not remember.

> Anyway - I'm sure this is not enough to convince you, so I'll have to
> set up a test instead. But not tonight.

Again, that is a test case for only one OS.  It is helpful if we are
going to start doing per-OS defaults, which is something we have talked
about.  What would be great is a test program we can run on different
OS's to find out which is more efficient.
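A rough sketch of the kind of test program meant here, in Python for brevity. It assumes open_datasync corresponds to opening the file with O_DSYNC where the platform defines that flag; the 8 kB block size, the write count, the file names, and all function names are illustrative choices, not anything PostgreSQL ships. Methods the platform lacks are skipped.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # one 8 kB block; an assumed WAL-like write size


def time_method(path, method, writes):
    """Return the seconds taken to write `writes` blocks using `method`."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if method == "open_datasync":
        # Assumption: open_datasync maps to O_DSYNC, so every write()
        # is forced to disk before it returns.
        flags |= os.O_DSYNC
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, BLOCK)
            if method == "fsync":
                os.fsync(fd)
            elif method == "fdatasync":
                os.fdatasync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def available_methods():
    """List the sync methods this platform exposes through the os module."""
    methods = ["fsync"]
    if hasattr(os, "fdatasync"):
        methods.append("fdatasync")
    if hasattr(os, "O_DSYNC"):
        methods.append("open_datasync")
    return methods


def run_benchmark(directory, writes=200):
    """Time every available method once, writing into `directory`."""
    results = {}
    for method in available_methods():
        path = os.path.join(directory, "sync_test_" + method)
        try:
            results[method] = time_method(path, method, writes)
        finally:
            if os.path.exists(path):
                os.remove(path)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for method, seconds in run_benchmark(tmp).items():
            print(f"{method}: {seconds:.3f} s")
```

For meaningful numbers, point `run_benchmark` at a directory on the disk that would hold the WAL; the default temporary directory may be memory-backed, in which case the sync calls cost almost nothing.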
>
>
> > The tradeoff is that open_datasync syncs each WAL
> > block individually, which is unnecessary if you are committing
> > multiple blocks worth of WAL entries at once --- but there's no hard
> > evidence that that slows things down, especially not when the WAL logs
> > are on their own disk spindle.
>
> Well, in theory fsync() will allow the disk to reorder the writes, and
> that should give significantly better performance, because it will
> reduce the required number of seeks. If the WAL is on a separate spindle
> there will be very few seeks in the first place, so there is less to gain,
> but for the case with the WAL on the same disk as something else there
> is probably some gain. But it makes sense to optimize for the
> WAL-on-separate-disk case...


Remember, in most cases, we are fsync'ing only one block so there is no
_gathering_ to do.
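To make the arithmetic concrete, here is a toy model (not a measurement) of how many times a commit has to wait for the disk under each method. The function name and the model itself are illustrative assumptions: open_datasync is taken to wait once per block written, while fsync and fdatasync wait once per commit regardless of block count.

```python
def forced_flushes(blocks_per_commit, method):
    """Number of times one commit waits for the disk, in this toy model."""
    if method == "open_datasync":
        # Every write() is synchronous, so each block is a separate wait.
        return blocks_per_commit
    if method in ("fsync", "fdatasync"):
        # One sync call covers all blocks written since the previous one.
        return 1
    raise ValueError("unknown method: " + method)


for blocks in (1, 4):
    for method in ("open_datasync", "fsync"):
        print(blocks, method, forced_flushes(blocks, method))
```

With one block per commit both methods wait once, so fsync has nothing to gather; the counts only diverge for multi-block commits.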


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
