Re: fsync reliability - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: fsync reliability
Date	April 21, 2011 12:56:06
Msg-id	24001.1303401355@sss.pgh.pa.us
In response to	fsync reliability (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: fsync reliability Re: fsync reliability Re: fsync reliability
List	pgsql-hackers

Simon Riggs <simon@2ndQuadrant.com> writes:
> Daniel Farina points out to me that the Linux man page for fsync() says
> "Calling fsync() does not necessarily ensure that the entry in the directory
> containing the file has also reached disk.  For that an explicit fsync() on a
> file descriptor for the directory is also needed."
> http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html

> This point appears to have been discussed before

Yes ...

> Tom said
> "We don't try to "fsync the
> directory" after a normal table create for instance"
> which is fine because we don't need to. In the event of a crash a
> missing table would be recreated during crash recovery.

Nonsense.  Once a checkpoint occurs after the WAL record that says to
create the table, we won't replay that action.  Or are you proposing
to have checkpoints run around and fsync every directory in the data
tree?

The traditional standard is that the filesystem is supposed to take
care of its own metadata, and even Linux filesystems have pretty much
figured that out.  I don't really see a need for us to be nursemaiding
the filesystem.  At most there's a documentation issue here, ie, we
ought to be more explicit about which filesystems and which mount
options we recommend.
        regards, tom lane
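
For readers unfamiliar with what "fsync the directory" means in the man page text quoted above, here is a minimal sketch of the pattern. It is not PostgreSQL code and not something proposed in this thread. The helper name durable_create and the file name in the usage example are made up for illustration, and the sketch assumes a POSIX system where a directory can be opened read-only and passed to fsync(), as on Linux.

```python
import os
import tempfile


def durable_create(path, data):
    """Hypothetical helper: create a file, then fsync both the file and
    the directory that holds its entry (POSIX/Linux assumed)."""
    # Create the file and flush its contents to stable storage.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # os.write may write fewer bytes
            view = view[written:]
        os.fsync(fd)
    finally:
        os.close(fd)

    # fsync() on the file does not necessarily cover the directory entry,
    # so open the parent directory and fsync it as well.
    parent = os.path.dirname(os.path.abspath(path))
    dir_flags = os.O_RDONLY | getattr(os, "O_DIRECTORY", 0)
    dir_fd = os.open(parent, dir_flags)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        durable_create(os.path.join(tmp, "example_file"), b"hello")
```

Whether the second fsync() is needed in practice depends on the filesystem and its mount options, which is the documentation question raised in the message above.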
