Re: -F option, RAM usage, more... - Mailing list pgsql-general

From Tom Lane
Subject Re: -F option, RAM usage, more...
Date
Msg-id 28035.970726731@sss.pgh.pa.us
In response to Re: -F option, RAM usage, more...  ("Mitch Vincent" <mitch@venux.net>)
List pgsql-general
"Mitch Vincent" <mitch@venux.net> writes:
> Now I'm pretty confused (as I'm sure others are) -- can someone that knows
> beyond a reasonable doubt beat us with a clue stick on this?  Are we taking
> a huge risk if we use -F and disable fsync() or no?

Postgres will write() modified pages out to the kernel at transaction
commit, -F or no.  The difference is whether it then issues an fsync()
to force the kernel to write all the modified pages to disk before it
believes the transaction is committed.  With -F (i.e., no fsync) the
changes are out of the application and into the kernel's disk buffers,
but not necessarily physically down on disk, at the time Postgres
updates pg_log to show the transaction as committed.
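
To make the write()/fsync() distinction concrete, here is a minimal sketch in Python. It is not Postgres code; the function name write_page and its use_fsync parameter are made up for illustration. It only shows that the write happens in both cases and that the fsync is the one extra step.

```python
import os

def write_page(path, data, use_fsync):
    """Hand `data` to the kernel; optionally wait until it is on disk.

    Hypothetical helper for illustration, not part of Postgres.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        offset = 0
        # os.write() may accept fewer bytes than requested, so loop.
        while offset < len(data):
            offset += os.write(fd, data[offset:])
        # At this point the data is in the kernel's buffers, whatever
        # the value of use_fsync.  It survives a crash of this process
        # but not necessarily a crash of the whole machine.
        if use_fsync:
            # Block until the kernel reports the data as written out.
            os.fsync(fd)
    finally:
        os.close(fd)
    return offset
```

Calling write_page(path, data, use_fsync=False) corresponds roughly to running with -F: the call returns as soon as the kernel has accepted the data.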

If you have a subsequent system crash then it's possible that the
pg_log update got written out but only some of the data pages modified
by the transaction got written --- in which case you have an
inconsistent DB, because the transaction's effects weren't
all-or-nothing like they're supposed to be.  The behavior would
depend on exactly what order the kernel chose to flush dirty buffers
out to disk.
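
A toy model may help show why the flush order matters. The sketch below assumes the kernel can flush dirty buffers in any order and that a crash can hit after any number of flushes; the names crash_states and is_inconsistent are hypothetical, and real kernels are of course more complicated than this.

```python
from itertools import permutations

def crash_states(data_pages, log_name="pg_log"):
    """Every set of buffers that could be on disk at crash time,
    assuming an arbitrary flush order and an arbitrary crash point."""
    dirty = list(data_pages) + [log_name]
    states = set()
    for order in permutations(dirty):
        for flushed in range(len(dirty) + 1):
            states.add(frozenset(order[:flushed]))
    return states

def is_inconsistent(on_disk, data_pages, log_name="pg_log"):
    """The commit record reached disk but some data page did not."""
    return log_name in on_disk and not set(data_pages) <= on_disk

pages = ["page_a", "page_b"]
bad = [s for s in crash_states(pages) if is_inconsistent(s, pages)]
print(len(bad))  # 3 of the 8 possible on-disk states are inconsistent
```

The three bad states are exactly the ones where pg_log is on disk and at least one of the data pages is missing.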

If you're not using -F then Postgres fsync()s all the data files it's
touched, then writes pg_log, then fsync()s pg_log.  If the kernel
respects fsync 100% then this should guarantee atomic effects of
a transaction: pg_log will not show the transaction as committed unless
all the data changes it made are safely down on disk.
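
The ordering described above can be sketched as follows. This is only an illustration of the sequence (write data, fsync data, write the log, fsync the log), not the actual Postgres implementation; commit_transaction, its parameters and the one-line-per-transaction log format are all assumptions made for the example.

```python
import os

def commit_transaction(data_files, log_path, xid, use_fsync=True):
    """Write modified pages, then record the commit in a log file.

    data_files maps a file path to the bytes to write at offset 0.
    Hypothetical sketch; file layout and log format are invented.
    """
    fds = []
    try:
        # 1. Hand every modified page to the kernel.
        for path, page in data_files.items():
            fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
            fds.append(fd)
            os.write(fd, page)
        # 2. Force the data files to disk before touching the log.
        if use_fsync:
            for fd in fds:
                os.fsync(fd)
    finally:
        for fd in fds:
            os.close(fd)

    # 3. Only now mark the transaction as committed, and force that too.
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(log_fd, f"{xid} committed\n".encode())
        if use_fsync:
            os.fsync(log_fd)
    finally:
        os.close(log_fd)
```

With use_fsync=False the same writes happen in the same order, but nothing forces the kernel to put the data pages on disk before the log entry, which is the -F case.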

If you have a reliable kernel, reliable power (i.e., a UPS) and aren't
too worried about hardware failures then there's no good reason to
insist on fsync.  A Postgres server crash wouldn't mess up already-
committed data, since that data is safely out of the server and
into the hands of the kernel.  However, if you don't want to trust
the kernel (+hardware) to get the data it's accepted down to disk
sooner or later, then you'd better be using fsync.

It's interesting that someone reported reiserfs to show different
behavior on crash than ext2.  I'd have thought this was mainly an
issue of what order the kernel chose to flush dirty buffers in,
which doesn't seem like it'd depend on the filesystem organization
... but maybe it does.

            regards, tom lane
