Re: sync() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: sync()
Msg-id 18825.1042437163@sss.pgh.pa.us
In response to Re: sync()  (Kevin Brown <kevin@sysexperts.com>)
Responses Re: sync()  (Kevin Brown <kevin@sysexperts.com>)
Re: sync()  (Giles Lean <giles@nemeton.com.au>)
List pgsql-hackers
Kevin Brown <kevin@sysexperts.com> writes:
> So the backends have to keep a common list of all the files they
> touch.  Admittedly, that could be a problem if it means using a bunch
> of shared memory, and it may have additional performance implications
> depending on the implementation ...

It would have to be a list of all files that have been touched since the
last checkpoint.  That's a serious problem for storage in shared memory,
which is by definition fixed-size.

>> Even if it did know, it's not clear to me that we can
>> portably assume that process A issuing an fsync on a file descriptor F
>> it's opened for file X will force to disk previous writes issued against
>> the same physical file X by a different process B using a different file
>> descriptor G.

> If the manpages are to be believed, then under FreeBSD, Linux, and
> HP-UX, calling fsync() will force to disk *all* unwritten buffers
> associated with the file pointed to by the filedescriptor.

> Sadly, however, the Solaris and IRIX manpages suggest that only
> buffers associated with the specific file descriptor itself are
> written, not necessarily all buffers associated with the file pointed
> at by the file descriptor (and interestingly, the Solaris version
> appears to be implemented as a library function and not a system call,
> if the manpage's section is any indication).

Right.  "Portably" was the key word in my comment (sorry for not
emphasizing this more clearly).  The real problem here is how to
determine the actual behavior of each platform.  I'm certainly not
prepared to trust reading-between-the-lines-of-some-man-pages.  And I
can't think of a simple yet reliable direct test.  You'd really have to
invest in a detailed study of the kernel source code to know for sure ...
and many of our platforms don't have open-source kernels.
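
To make the scenario concrete, here is a minimal C sketch of the call pattern in question: process B writes through its own descriptor G, then process A opens the same file separately and calls fsync() on its descriptor F. The function name and the idea of passing in a scratch path are assumptions for illustration. Note what it does not show: a zero return from fsync() only says the call succeeded, not that B's data reached the platter, so this illustrates the pattern rather than testing the platform.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if every call succeeded, -1 otherwise. */
static int fsync_after_foreign_write(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Process B: write through descriptor G, never fsync, then exit. */
        const char msg[] = "written by B\n";
        int g = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (g < 0)
            _exit(1);
        if (write(g, msg, sizeof msg - 1) != (ssize_t) (sizeof msg - 1))
            _exit(1);
        close(g);
        _exit(0);
    }

    /* Process A: wait for B, then fsync via an independent descriptor F. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    int f = open(path, O_RDWR);
    if (f < 0)
        return -1;
    int rc = fsync(f);   /* success here proves nothing about B's buffers */
    close(f);
    return rc;
}
```

Verifying the on-disk result would require cutting power (or something equivalent) after the fsync() returns, which is what makes a simple direct test hard to write.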

> Under Linux (and perhaps HP-UX), it may be necessary to fsync() the
> directories leading to the file as well, so that the state of the
> filesystem on disk is consistent and safe in the event that the files
> in question are newly-created.

AFAIK, all Unix implementations are paranoid about consistency of
filesystem metadata, including directory contents.  So fsync'ing
directories from a user process strikes me as a waste of time, even
assuming that it were portable, which I doubt.  What we need to worry
about is whether fsync'ing a bunch of our own data files is a practical
substitute for a global sync() call.
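
For comparison with a single sync() call, the per-file alternative looks roughly like the following C sketch. The function name fsync_files and the idea of passing an array of paths are assumptions for illustration; it also assumes the platform flushes all dirty buffers of a file when any descriptor for it is fsync'd, which is exactly the behavior in doubt above.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Open and fsync each listed file in turn.
 * Returns the number of files that could not be opened or synced. */
static size_t fsync_files(const char *const *paths, size_t n)
{
    size_t failures = 0;

    for (size_t i = 0; i < n; i++) {
        int fd = open(paths[i], O_RDWR);
        if (fd < 0) {
            failures++;          /* cannot even open it: count and move on */
            continue;
        }
        if (fsync(fd) != 0)
            failures++;
        close(fd);
    }
    return failures;
}
```

The cost is one open/fsync/close cycle per file touched since the last checkpoint, where sync() is a single call that also schedules writes for every other dirty buffer on the machine.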
        regards, tom lane

