Thread: fsync or fdatasync

fsync or fdatasync

From
"Gaetano Mendola"
Date:
Hi all,
apparently the default value for wal_sync_method is fsync,
and apparently the best method is fdatasync.
Is there any special reason for not using fdatasync
as the default?


Ciao
Gaetano



Re: fsync or fdatasync

From
Stephan Szabo
Date:
On Mon, 9 Sep 2002, Gaetano Mendola wrote:

> Hi all,
> apparently the default value for wal_sync_method is fsync,
> and apparently the best method is fdatasync.
> Is there any special reason for not using fdatasync
> as the default?

IIRC, there were systems that did not have a functional
fdatasync.



Re: fsync or fdatasync

From
Dmitry Morozovsky
Date:
On Mon, 9 Sep 2002, Gaetano Mendola wrote:

GM> apparently the default value for wal_sync_method is fsync,
GM> and apparently the best method is fdatasync.
GM> Is there any special reason for not using fdatasync
GM> as the default?

#wal_sync_method = fsync   # the default varies across platforms:
#                          # fsync, fdatasync, open_sync, or open_datasync

I suppose fdatasync *is* the default on platforms where it exists. On
*BSD it does not.

Sincerely,
D.Marck                                   [DM5020, DM268-RIPE, DM3-RIPN]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------


Re: fsync or fdatasync

From
Andrew Sullivan
Date:
On Mon, Sep 09, 2002 at 11:19:50AM +0200, Gaetano Mendola wrote:
> Hi all,
> apparently the default value for wal_sync_method is fsync,
> and apparently the best method is fdatasync.
                     ^^^^^^^^^^
On which platform?  There are all sorts of variables to consider
here, most notably whether the given platform happens to support
fdatasync.

I have found that on some platforms, open_datasync is faster than
anything.

But you can certainly use fdatasync if your platform supports it.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: fsync or fdatasync

From
Tom Lane
Date:
"Gaetano Mendola" <mendola@bigfoot.com> writes:
> apparently the default value for wal_sync_method is fsync,
> and apparently the best method is fdatasync.

Best on what platform, and according to what evidence?

We'll be glad to consider changing the default for a specific platform
if we have a reasonably convincing argument that the other value is
better.  So far, not much study has been done of which method is best
on which platforms (and under what load conditions).

            regards, tom lane

Re: fsync or fdatasync

From
Tom Lane
Date:
Dmitry Morozovsky <marck@rinet.ru> writes:
> #wal_sync_method = fsync   # the default varies across platforms:
> #                          # fsync, fdatasync, open_sync, or open_datasync

> I suppose fdatasync *is* the default on platforms where it exists. On
> *BSD it does not.

[ looks at code... ]  Actually, the current algorithm for choosing the
default is "open_datasync if it exists, else fdatasync if it exists,
else fsync".  There probably are platforms where this method yields a
non-optimal answer, but we need more data before fooling with it.

            regards, tom lane
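
The selection order described above ("open_datasync if it exists, else fdatasync if it exists, else fsync") can be sketched roughly as follows. This is an illustration only, not the server's code: the function names are hypothetical, and using Python's os module attributes to stand in for the platform checks is an assumption.

```python
import os


def choose_default_sync_method(has_open_datasync: bool, has_fdatasync: bool) -> str:
    """Preference order quoted in the thread: open_datasync, fdatasync, fsync."""
    if has_open_datasync:
        return "open_datasync"
    if has_fdatasync:
        return "fdatasync"
    return "fsync"


def probe_platform() -> str:
    # Assumption: the presence of os.O_DSYNC / os.fdatasync approximates
    # "the platform has open_datasync / fdatasync".
    return choose_default_sync_method(
        hasattr(os, "O_DSYNC"), hasattr(os, "fdatasync")
    )


if __name__ == "__main__":
    print(probe_platform())
```

The pure function is kept separate from the probing so the preference order can be checked without depending on the machine it runs on.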

Re: fsync or fdatasync

From
Sean Chittenden
Date:
> > apparently the default value for wal_sync_method is fsync,
> > and apparently the best method is fdatasync.
>
> Best on what platform, and according to what evidence?
>
> We'll be glad to consider changing the default for a specific platform
> if we have a reasonably convincing argument that the other value is
> better.  So far, not much study has been done of which method is best
> on which platforms (and under what load conditions).

heh, just a quick note: I had one of FreeBSD's kernel gurus point out
that fsync() on linux is a no-op.  I haven't had that src tree in
years so I can't confirm, but I'm inclined to believe him.  Just an
FYI.  -sc

--
Sean Chittenden

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Mon, Sep 09, 2002 at 04:43:04PM -0700, Sean Chittenden wrote:
> > We'll be glad to consider changing the default for a specific platform
> > if we have a reasonably convincing argument that the other value is
> > better.  So far, not much study has been done of which method is best
> > on which platforms (and under what load conditions).
>
> heh, just a quick note: I had one of FreeBSD's kernel gurus point out
> that fsync() on linux is a no-op.  I haven't had that src tree in
> years so I can't confirm, but I'm inclined to believe him.  Just an

No, fsync() is not a no-op on linux.
Unless the filesystem is mounted with o_sync, I suppose - then
everything is written at write() so fsync() is not needed. But
generally, it does sync.


--
Ragnar Kjørstad
Big Storage

Re: fsync or fdatasync

From
Sean Chittenden
Date:
> > > We'll be glad to consider changing the default for a specific
> > > platform if we have a reasonably convincing argument that the
> > > other value is better.  So far, not much study has been done of
> > > which method is best on which platforms (and under what load
> > > conditions).
> >
> > heh, just a quick note: I had one of FreeBSD's kernel gurus point
> > out that fsync() on linux is a no-op.  I haven't had that src tree
> > in years so I can't confirm, but I'm inclined to believe him.
> > Just an
>
> No, fsync() is not a no-op on linux.
> Unless the filesystem is mounted with o_sync, I suppose - then
> everything is written at write() so fsync() is not needed. But
> generally, it does sync.

Hrm, alright.  From what I've figured out, about ~6wk ago fsync() was
added to linux to have it actually fsync()... mind you someone quickly
turned around and created a new patchset that ripped the functionality
out and added it to an extreme linux distro.  ::shrug:: <opinion>Linux
is out of control.</opinion> -sc


PS This wasn't a flame/troll, just trying to figure out what the best
recommendation would be for FreeBSD and it is fsync().

--
Sean Chittenden

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Mon, Sep 09, 2002 at 05:11:27PM -0700, Sean Chittenden wrote:
> > No, fsync() is not a no-op on linux.
> > Unless the filesystem is mounted with o_sync, I suppose - then
> > everything is written at write() so fsync() is not needed. But
> > generally, it does sync.
>
> Hrm, alright.  From what I've figured out, about ~6wk ago fsync() was
> added to linux to have it actually fsync()... mind you someone quickly
> turned around and created a new patchset that ripped the functionality
> out and added it to an extreme linux distro.  ::shrug:: <opinion>Linux
> is out of control.</opinion> -sc

"6wk"?

Linux has had fsync for as long as I can remember.

Maybe you have it confused with fsync() over NFS? The NFSv2
implementation on linux used to have "async" flag for nfs as default -
making it non NFS-compliant without reconfiguration.


--
Ragnar Kjørstad
Big Storage

Re: fsync or fdatasync

From
Sean Chittenden
Date:
> > > No, fsync() is not a no-op on linux.  Unless the filesystem is
> > > mounted with o_sync, I suppose - then everything is written at
> > > write() so fsync() is not needed. But generally, it does sync.
> >
> > Hrm, alright.  From what I've figured out, about ~6wk ago fsync()
> > was added to linux to have it actually fsync()... mind you someone
> > quickly turned around and created a new patchset that ripped the
> > functionality out and added it to an extreme linux distro.
> > ::shrug:: <opinion>Linux is out of control.</opinion> -sc
>
> "6wk"?
>
> Linux has had fsync for as long as I can remember.
>
> Maybe you have it confused with fsync() over NFS? The NFSv2
> implementation on linux used to have "async" flag for nfs as default
> - making it non NFS-compliant without reconfiguration.

The fsync() call has existed, but in the kernel it didn't actually do
anything is what I've been told.  -sc

--
Sean Chittenden

Re: fsync or fdatasync

From
"Gaetano Mendola"
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> wrote in message
news:11753.1031590251@sss.pgh.pa.us...
> "Gaetano Mendola" <mendola@bigfoot.com> writes:
> > apparently the default value for wal_sync_method is fsync,
> > and apparently the best method is fdatasync.
>
> Best on what platform, and according to what evidence?

Well, the man page says (Linux):


       fdatasync flushes all data buffers of a file to disk (before the
       system call returns).  It resembles fsync but is not required to
       update the metadata such as access time.

       Applications that access databases or log files often write a tiny
       data fragment (e.g., one line in a log file) and then call fsync
       immediately in order to ensure that the written data is physically
       stored on the harddisk.  Unfortunately, fsync will always initiate
       two write operations: one for the newly written data and another one
       in order to update the modification time stored in the inode.  If the
       modification time is not a part of the transaction concept fdatasync
       can be used to avoid unnecessary inode disk write operations.


So, what is wrong here? It seems clear that one write is better than two.

Ciao
Gaetano




Re: fsync or fdatasync

From
Bruce Momjian
Date:
The original poster was wrong about the default.

We use fdatasync where available, and fsync when it is not.  We also use
O_SYNC on open if it is available.

---------------------------------------------------------------------------

Gaetano Mendola wrote:
>
> "Tom Lane" <tgl@sss.pgh.pa.us> wrote in message
> news:11753.1031590251@sss.pgh.pa.us...
> > "Gaetano Mendola" <mendola@bigfoot.com> writes:
> > > apparently the default value for wal_sync_method is fsync,
> > > and apparently the best method is fdatasync.
> >
> > Best on what platform, and according to what evidence?
>
> Well, the man page says (Linux):
>
>
>        fdatasync flushes all data buffers of a file to disk (before the
>        system call returns).  It resembles fsync but is not required to
>        update the metadata such as access time.
>
>        Applications that access databases or log files often write a tiny
>        data fragment (e.g., one line in a log file) and then call fsync
>        immediately in order to ensure that the written data is physically
>        stored on the harddisk.  Unfortunately, fsync will always initiate
>        two write operations: one for the newly written data and another one
>        in order to update the modification time stored in the inode.  If the
>        modification time is not a part of the transaction concept fdatasync
>        can be used to avoid unnecessary inode disk write operations.
>
>
> So, what is wrong here? It seems clear that one write is better than two.
>
> Ciao
> Gaetano
>
>
>
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Tue, Sep 10, 2002 at 11:40:24AM -0400, Bruce Momjian wrote:
>
> The original poster was wrong about the default.
>
> We use fdatasync where available, and fsync when it is not.

Makes sense.

> We also use
> O_SYNC on open if it is available.

Why? That will slow things down...


--
Ragnar Kjørstad

Re: fsync or fdatasync

From
Bruce Momjian
Date:
Ragnar Kjørstad wrote:
> On Tue, Sep 10, 2002 at 11:40:24AM -0400, Bruce Momjian wrote:
> >
> > The original poster was wrong about the default.
> >
> > We use fdatasync where available, and fsync when it is not.
>
> Makes sense.
>
> > We also use
> > O_SYNC on open if it is available.
>
> Why? That will slow things down...

Actually, no, we are only O_SYNC'ing the WAL writes and sometimes that
is faster because you are not writing then fsyncing, you are just
writing.  The fdatasync only is better than O_SYNC when you are doing
multiple WAL writes before an fdatasync and we normally don't do that.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Tue, Sep 10, 2002 at 01:17:56PM -0400, Bruce Momjian wrote:
> Ragnar Kjørstad wrote:
> > On Tue, Sep 10, 2002 at 11:40:24AM -0400, Bruce Momjian wrote:
> > > We also use
> > > O_SYNC on open if it is available.
> >
> > Why? That will slow things down...
>
> Actually, no, we are only O_SYNC'ing the WAL writes and sometimes that
> is faster because you are not writing then fsyncing, you are just
> writing.  The fdatasync only is better than O_SYNC when you are doing
> multiple WAL writes before an fdatasync and we normally don't do that.

OK, if it is a single write it makes sense. (But I doubt it makes much
of a difference - the overhead of a system call is almost nothing
compared to a write to disk...)


--
Ragnar Kjørstad

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Mon, Sep 09, 2002 at 05:33:18PM -0700, Sean Chittenden wrote:
> > Linux has had fsync for as long as I can remember.
> >
> > Maybe you have it confused with fsync() over NFS? The NFSv2
> > implementation on linux used to have "async" flag for nfs as default
> > - making it non NFS-compliant without reconfiguration.
>
> The fsync() call has existed, but in the kernel it didn't actually do
> anything is what I've been told.  -sc

I think that's just wrong.

I just downloaded linux-2.0.1 (from 1996) to check, and it _does_ fsync.
OK, so it does it pretty inefficiently, but it still does the job.




--
Ragnar Kjørstad
Big Storage

Re: fsync or fdatasync

From
Tom Lane
Date:
Ragnar Kjørstad <postgres@ragnark.vestdata.no> writes:
> On Tue, Sep 10, 2002 at 11:40:24AM -0400, Bruce Momjian wrote:
>> We use fdatasync where available, and fsync when it is not.

> Makes sense.

>> We also use O_SYNC on open if it is available.

s/also/instead/ ... open_datasync is the first choice if available.

> Why? That will slow things down...

On what evidence do you assert that?

In theory open_datasync can be the fastest alternative for WAL writing,
because it should cause the kernel to force each WAL write() request
down to disk immediately.  fdatasync will result in the same amount of
I/O, but it will also require the kernel to scan its disk cache to see
if there are any other dirty blocks that need to be written.  On many
kernels this check is not very efficient and can chew substantial
amounts of CPU time.  The tradeoff is that open_datasync syncs each WAL
block individually, which is unnecessary if you are committing
multiple blocks worth of WAL entries at once --- but there's no hard
evidence that that slows things down, especially not when the WAL logs
are on their own disk spindle.  Giving the kernel scheduling freedom
for a small number of blocks doesn't help much anyway in that case.

Check the pghackers archives (a year or two back) for lots and lots of
discussion, but I recall we demonstrated that the current default
choices are reasonable for at least some set of Unixen.  If you've got
more information showing that the present default is wrong on some
kernel, let's have it ... but don't waste our time with blanket
assertions that "X is the right (or wrong) choice", because we know
that's not so across all the platforms we support.  We'd not have
bothered with four sync methods if there weren't good evidence that each
is the best available choice on some platforms.

            regards, tom lane
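
The tradeoff described above (each write forced to disk individually versus several writes followed by one sync) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the helper names, the 8192-byte block size, and falling back to O_SYNC or fsync where O_DSYNC or fdatasync is missing. It is not how the server itself writes WAL.

```python
import os
import tempfile

BLOCK = b"\0" * 8192  # assumed block size, for illustration only


def write_with_open_datasync(path, n_blocks):
    """Open with O_DSYNC so each write() is forced to disk on its own."""
    # Falls back to O_SYNC, then to no sync flag, if the platform lacks them.
    sync_flag = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag, 0o600)
    try:
        for _ in range(n_blocks):
            os.write(fd, BLOCK)
    finally:
        os.close(fd)


def write_then_fdatasync(path, n_blocks):
    """Write all blocks into the cache, then flush them with one call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(n_blocks):
            os.write(fd, BLOCK)
        # Falls back to fsync where fdatasync does not exist.
        getattr(os, "fdatasync", os.fsync)(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        write_with_open_datasync(os.path.join(tmpdir, "a"), 4)
        write_then_fdatasync(os.path.join(tmpdir, "b"), 4)
```

With a single block per commit the two paths do the same amount of I/O; they only diverge when several blocks are written before the sync.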

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Tue, Sep 10, 2002 at 03:17:00PM -0400, Tom Lane wrote:
> Ragnar Kjørstad <postgres@ragnark.vestdata.no> writes:
> > On Tue, Sep 10, 2002 at 11:40:24AM -0400, Bruce Momjian wrote:
> >> We use fdatasync where available, and fsync when it is not.
>
> > Makes sense.
>
> >> We also use O_SYNC on open if it is available.
>
> s/also/instead/ ...

Yes, I understood that...

> open_datasync is the first choice if available.

I assume open_datasync means open with O_SYNC flag..

> > Why? That will slow things down...
>
> On what evidence do you assert that?
>
> In theory open_datasync can be the fastest alternative for WAL writing,
> because it should cause the kernel to force each WAL write() request
> down to disk immediately.  fdatasync will result in the same amount of
> I/O, but it will also require the kernel to scan its disk cache to see
> if there are any other dirty blocks that need to be written.  On many
> kernels this check is not very efficient and can chew substantial
> amounts of CPU time.

Yes, I see your argument.
However, I've just checked the linux-implementation of fsync() and I
can't really see how it could chew substantial amounts of CPU time. The
way it works, every inode has a list of dirty data buffers; all it does
is traverse that list and do a write on each.

Anyway - I'm sure this is not enough to convince you, so I'll have to
set up a test instead. But not tonight.


> The tradeoff is that open_datasync syncs each WAL
> block individually, which is unnecessary if you are committing
> multiple blocks worth of WAL entries at once --- but there's no hard
> evidence that that slows things down, especially not when the WAL logs
> are on their own disk spindle.

Well, in theory fsync() will allow the disk to reorder the writes, and
that should give significantly better performance, because it will
reduce the required number of seeks. If the WAL is on a separate spindle
there will be very few seeks in the first place, so there is less to gain,
but for the case with the WAL on the same disk as something else there
is probably some gain. But it makes sense to optimize for the
WAL-on-separate-disk case...

Another advantage is that fsync() would allow the elevator to merge
multiple I/O requests. Still the same number of bytes to write, but fewer,
bigger requests are typically faster.

But again, numbers speak. I'll get back to you once I find the time to
test it.


> Check the pghackers archives (a year or two back) for lots and lots of
> discussion, but I recall we demonstrated that the current default
> choices are reasonable for at least some set of Unixen.  If you've got
> more information showing that the present default is wrong on some
> kernel, let's have it ... but don't waste our time with blanket
> assertions that "X is the right (or wrong) choice", because we know
> that's not so across all the platforms we support.  We'd not have
> bothered with four sync methods if there weren't good evidence that each
> is the best available choice on some platforms.

No argument there; I'm sure there are applications for all of them.
My point is that I think fdatasync() would be the fastest choice for the
linux kernel.



--
Ragnar Kjørstad

Re: fsync or fdatasync

From
Bruce Momjian
Date:
Ragnar Kjørstad wrote:
> > open_datasync is the first choice if available.
>
> I assume open_datasync means open with O_SYNC flag..

Yes.

> > > Why? That will slow things down...
> >
> > On what evidence do you assert that?
> >
> > In theory open_datasync can be the fastest alternative for WAL writing,
> > because it should cause the kernel to force each WAL write() request
> > down to disk immediately.  fdatasync will result in the same amount of
> > I/O, but it will also require the kernel to scan its disk cache to see
> > if there are any other dirty blocks that need to be written.  On many
> > kernels this check is not very efficient and can chew substantial
> > amounts of CPU time.
>
> Yes, I see your argument.
> However, I've just checked the linux-implementation of fsync() and I
> can't really see how it could chew substantial amounts of CPU time. The
> way it works, every inode has a list of dirty data buffers; all it does
> is traverse that list and do a write on each.

Remember we support >15 platforms, and I know there is at least one
(HPUX?) which does the fsync/fdatasync block finding inefficiently. It
may have even been old Linux; I can not remember.

> Anyway - I'm sure this is not enough to convince you, so I'll have to
> set up a test instead. But not tonight.

Again, that is a test case for only one OS.  It is helpful if we are
going to start doing per-OS defaults, which is something we have talked
about.  What would be great is a test program we can run on different
OS's to find out which is more efficient.
>
>
> > The tradeoff is that open_datasync syncs each WAL
> > block individually, which is unnecessary if you are committing
> > multiple blocks worth of WAL entries at once --- but there's no hard
> > evidence that that slows things down, especially not when the WAL logs
> > are on their own disk spindle.
>
> Well, in theory fsync() will allow the disk to reorder the writes, and
> that should give significantly better performance, because it will
> reduce the required number of seeks. If the WAL is on a separate spindle
> there will be very few seeks in the first place, so there is less to gain,
> but for the case with the WAL on the same disk as something else there
> is probably some gain. But it makes sense to optimize for the
> WAL-on-separate-disk case...


Remember, in most cases, we are fsync'ing only one block so there is no
_gathering_ to do.


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
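
A test program of the kind asked for above might look roughly like this sketch. It is only an assumption of how such a test could be written: the function names, the 8192-byte block, and the count of 20 writes are all made up for illustration, and it times one write per "commit" against a scratch file. Note that the temporary directory may live on a different filesystem (possibly a RAM-backed one) than the real WAL, so numbers from it say little about a production disk.

```python
import os
import tempfile
import time

METHODS = ("fsync", "fdatasync", "open_sync", "open_datasync")


def time_method(method, n_commits=20, block_size=8192):
    """Time n_commits single-block writes with the given sync method.

    Hypothetical helper, not PostgreSQL code. Returns elapsed seconds,
    or None if the platform lacks the call or flag.
    """
    open_flag_name = {"open_sync": "O_SYNC", "open_datasync": "O_DSYNC"}.get(method)
    sync_name = method if method in ("fsync", "fdatasync") else None
    if open_flag_name is None and sync_name is None:
        raise ValueError(f"unknown method: {method}")
    if not hasattr(os, open_flag_name or sync_name):
        return None

    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if open_flag_name:
        flags |= getattr(os, open_flag_name)
    sync = getattr(os, sync_name) if sync_name else None
    block = b"x" * block_size

    with tempfile.TemporaryDirectory() as tmpdir:
        fd = os.open(os.path.join(tmpdir, "wal_test"), flags, 0o600)
        try:
            start = time.perf_counter()
            for _ in range(n_commits):
                os.write(fd, block)
                if sync is not None:
                    sync(fd)  # explicit flush after every write
            return time.perf_counter() - start
        finally:
            os.close(fd)


def run_all(n_commits=20):
    """Return {method: seconds or None} for all four methods."""
    return {m: time_method(m, n_commits) for m in METHODS}


if __name__ == "__main__":
    for name, secs in run_all().items():
        print(name, "unsupported" if secs is None else f"{secs:.4f}s")
```

To get meaningful numbers, the scratch file would need to sit on the same filesystem and disk as the WAL, and the run would need to be repeated under realistic load.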

Re: fsync or fdatasync

From
Ragnar Kjørstad
Date:
On Tue, Sep 10, 2002 at 05:07:30PM -0400, Bruce Momjian wrote:
> > Anyway - I'm sure this is not enough to convince you, so I'll have to
> > set up a test instead. But not tonight.
>
> Again, that is a test case for only one OS.  It is helpful if we are
> going to start doing per-OS defaults, which is something we have talked
> about.

Oh; I assumed that was already the case.

> What would be great is a test program we can run on different
> OS's to find out which is more efficient.

Yes. Bear in mind though, that this is as much a filesystem issue as a
kernel issue. Two different filesystems on the same kernel may behave
very differently.

Of course one can't distribute separate versions of PostgreSQL
optimized for different filesystems, so perhaps it's just as good to
leave the default as it is and instead publish some info (e.g. benchmarks
for different settings on different filesystems on different operating
systems, and the benchmark script itself so people can do their own
tests). This way the default is all right, and users that need to tweak a
little extra have the info they need.


> Remember, in most cases, we are fsync'ing only one block so there is no
> _gathering_ to do.

Yes, I know you said so. But if that's the case for only most cases
there are some cases where it's not - so there is still some potential.



--
Ragnar Kjørstad

Re: fsync or fdatasync

From
Bruce Momjian
Date:
Ragnar Kjørstad wrote:
> On Tue, Sep 10, 2002 at 05:07:30PM -0400, Bruce Momjian wrote:
> > > Anyway - I'm sure this is not enough to convince you, so I'll have to
> > > set up a test instead. But not tonight.
> >
> > Again, that is a test case for only one OS.  It is helpful if we are
> > going to start doing per-OS defaults, which is something we have talked
> > about.
>
> Oh; I assumed that was already the case.
>
> > What would be great is a test program we can run on different
> > OS's to find out which is more efficient.
>
> Yes. Bear in mind though, that this is as much a filesystem issue as a
> kernel issue. Two different filesystems on the same kernel may behave
> very differently.
>
> Of course one can't distribute separate versions of PostgreSQL
> optimized for different filesystems, so perhaps it's just as good to
> leave the default as it is and instead publish some info (e.g. benchmarks
> for different settings on different filesystems on different operating
> systems, and the benchmark script itself so people can do their own
> tests). This way the default is all right, and users that need to tweak a
> little extra have the info they need.
>

What we could do ideally is set the default by running some test during
initdb perhaps.

> > Remember, in most cases, we are fsync'ing only one block so there is no
> > _gathering_ to do.
>
> Yes, I know you said so. But if that's the case for only most cases
> there are some cases where it's not - so there is still some potential.

Yes, but the cases are so rare it is probably not worth bothering about
especially since O_SYNC has to be set on file open so you can't switch
between that and fdatasync depending on how many blocks you have.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
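
The idea floated above, running a test at initdb time and setting the default from it, comes down to picking the fastest method that actually works on the machine. A minimal sketch, assuming timings have already been collected into a dict with None for unsupported methods. The function name and the dict shape are hypothetical, and nothing in the thread says initdb does this.

```python
def pick_fastest(timings, fallback="fsync"):
    """Return the method with the smallest measured time.

    `timings` maps method name -> seconds, or None if unsupported.
    Falls back to `fallback` when nothing could be measured.
    """
    usable = {method: secs for method, secs in timings.items() if secs is not None}
    if not usable:
        return fallback
    return min(usable, key=usable.get)


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    example = {"fsync": 0.50, "fdatasync": 0.20, "open_sync": None}
    print(pick_fastest(example))
```

A single timing run is noisy, so a real version would probably want repeated measurements and a margin before overriding the compiled-in default.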

Re: fsync or fdatasync

From
Mats Lofkvist
Date:
pgman@candle.pha.pa.us (Bruce Momjian) writes:

> Ragnar Kjørstad wrote:
> > On Tue, Sep 10, 2002 at 11:40:24AM -0400, Bruce Momjian wrote:
> > >
> > > The original poster was wrong about the default.
> > >
> > > We use fdatasync where available, and fsync when it is not.
> >
> > Makes sense.
> >
> > > We also use
> > > O_SYNC on open if it is available.
> >
> > Why? That will slow things down...
>
> Actually, no, we are only O_SYNC'ing the WAL writes and sometimes that
> is faster because you are not writing then fsyncing, you are just
> writing.  The fdatasync only is better than O_SYNC when you are doing
> multiple WAL writes before an fdatasync and we normally don't do that.
>

I may be wrong on this, but my understanding is that the difference
between fsync() and O_SYNC on the one hand and fdatasync() and O_DSYNC
on the other hand is that the latter don't have to sync metadata
(e.g. file access times) which saves a write to the inode that is
more or less guaranteed to require an extra seek.

Iff this is true you never want to use fsync() or O_SYNC when
fdatasync() and O_DSYNC are available (unless you really need the
metadata to be synced too).

      _
Mats Lofkvist
mal@algonet.se

Re: fsync or fdatasync

From
Bruce Momjian
Date:
Mats Lofkvist wrote:
> > Actually, no, we are only O_SYNC'ing the WAL writes and sometimes that
> > is faster because you are not writing then fsyncing, you are just
> > writing.  The fdatasync only is better than O_SYNC when you are doing
> > multiple WAL writes before an fdatasync and we normally don't do that.
> >
>
> I may be wrong on this, but my understanding is that the difference
> between fsync() and O_SYNC on the one hand and fdatasync() and O_DSYNC
> on the other hand is that the latter don't have to sync metadata
> (e.g. file access times) which saves a write to the inode that is
> more or less guaranteed to require an extra seek.
>
> Iff this is true you never want to use fsync() or O_SYNC when
> fdatasync() and O_DSYNC are available (unless you really need the
> metadata to be synced too).

Yes, I didn't mention O_DSYNC.  It is in the cards.  If you are
interested, look at the code and how the defaults are chosen.

postgresql.conf says:

#wal_sync_method = fsync        # the default varies across platforms:
#                               # fsync, fdatasync, open_sync, or open_datasync

Which means exactly that: it varies based on the platform.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fsync or fdatasync

From
"Gaetano Mendola"
Date:
"Bruce Momjian" <pgman@candle.pha.pa.us> wrote in message
news:200209101540.g8AFeOo21086@candle.pha.pa.us...
>
> The original poster was wrong about the default.
>
> We use fdatasync where available, and fsync when it is not.  We also use
> O_SYNC on open if it is available.


The original poster is me! :-)

I was pointed to a document (which I can't find anymore)
where I understood that the default was fsync.
I'm sorry for the little flame.... :-(


PS. Are there plans to use raw disks?

Ciao
Gaetano





Re: fsync or fdatasync

From
Bruce Momjian
Date:
Gaetano Mendola wrote:
>
> "Bruce Momjian" <pgman@candle.pha.pa.us> wrote in message
> news:200209101540.g8AFeOo21086@candle.pha.pa.us...
> >
> > The original poster was wrong about the default.
> >
> > We use fdatasync where available, and fsync when it is not.  We also use
> > O_SYNC on open if it is available.
>
>
> The original poster is me! :-)
>
> I was pointed to a document (which I can't find anymore)
> where I understood that the default was fsync.
> I'm sorry for the little flame.... :-(
>
>
> PS. Are there plans to use raw disks?

No plans.  Raw disk is only marginally faster and a lot more
complicated.  See the TODO performance link for details.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073