On Wed, Jun 18, 2008 at 02:17:00PM -0400, Greg Smith wrote:
> On Wed, 18 Jun 2008, Sam Mason wrote:
>
> >Isn't fsync only a side-effect of having a write-back cache between
> >programs and the disk? This means its only purpose is to ensure that
> >the cache is consistent with what's on disk. Because all programs
> >running within a system are running on top of the cache they don't know
> >or care whether the cache actually matches up to the disk.
>
> Most programs don't. PostgreSQL writes to the database in two stages:
> the WAL, followed by an fsync, then later to the main database files.
Sorry, I wasn't being clear. When I said "they don't know or care" I
meant that if you've got a PG process writing its database files and a
backup process running on the same machine, then the backup process will
see the data written by PG regardless of whether fsync is called or
not, because both processes read and write through the same cache.
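
To illustrate, here is a minimal sketch in Python (not anything PG
actually does; the helper name is made up, and a second file descriptor
in the same process stands in for the backup process). The data is
written without any fsync and is still visible to the second reader:

```python
import os
import tempfile

def visible_without_fsync(payload: bytes) -> bool:
    """Write payload with no fsync, then read it back via a second descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        # The write lands in the OS cache; nothing forces it to disk.
        os.write(fd, payload)
        # Independent descriptor: a stand-in for the backup process.
        with open(path, "rb") as reader:
            seen = reader.read()
        return seen == payload
    finally:
        os.close(fd)
        os.unlink(path)

print(visible_without_fsync(b"row 1\n"))
```

What this can't show is what survives a power failure, which is the
only thing fsync changes.
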
> You can't trust the WAL will be around for recovery until the first fsync
> returns. The checkpoint process makes sure everything that went into the
> WAL then made it to the main database files, and again it doesn't trust
> that it's really on disk until the fsync returns.
Yes
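
For anyone following along, the ordering Greg describes looks roughly
like this. It's only a sketch under my own assumptions: commit(),
checkpoint() and the file names are hypothetical, and the real WAL and
checkpoint code in PG is far more involved.

```python
import os
import tempfile

def commit(wal_path: str, record: bytes) -> None:
    """Append a record to the log and wait for fsync before reporting success."""
    fd = os.open(wal_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record)
        os.fsync(fd)  # the record can't be relied on for recovery until this returns
    finally:
        os.close(fd)

def checkpoint(wal_path: str, data_path: str) -> None:
    """Copy everything logged so far into the main file, then fsync that too."""
    with open(wal_path, "rb") as wal:
        records = wal.read()
    fd = os.open(data_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, records)
        os.fsync(fd)  # again, not trusted to be on disk until this returns
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as d:
    wal = os.path.join(d, "wal.log")
    data = os.path.join(d, "table.dat")
    commit(wal, b"insert 1\n")
    commit(wal, b"insert 2\n")
    checkpoint(wal, data)
    with open(data, "rb") as f:
        print(f.read())
```
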
> >Therefore, if I understand things correctly, the state of fsync
> >shouldn't matter in this use case. It's equally borken independent of
> >the state of fsync.
>
> Quite borken indeed, and fsync has nothing to do with it.
My original note was mainly in response to Craig's comment, which implied
that fsync does far more than it actually does. I remember seeing a few
comments recently saying similar things about fsync, so sorry for
picking specifically on you, Craig. Device/filesystem level snapshotting
is exactly what's needed and is independent of any fsync settings.
Sam