Re: [HACKERS] Postgres Performance - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Postgres Performance
Msg-id 13483.936380608@sss.pgh.pa.us
In response to Re: [HACKERS] Postgres Performance  (Thomas Lockhart <lockhart@alumni.caltech.edu>)
List pgsql-hackers
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> there is a (small) risk that if your computer crashes after some
> updates but before things are flushed then the db might become
> inconsistent. afaik we have never had an unambiguous report that this
> has actually happened (but others might remember differently). There
> is already that risk to some extent, but instead of the window being
> O(1sec) it becomes O(30sec).

I believe we use fsync not so much to reduce the time window where you
could lose a supposedly-committed update as to ensure that writes are
performed in a known order.  With fsync enabled, the data-file pages
touched by an update query will hit the disk before the pg_log entry
saying the transaction is committed hits the disk.  If you crash
somewhere during that sequence, the transaction appears uncommitted
and there is no loss of consistency.  (We assume here that writing
a single page to disk is an atomic operation, which is only sort-of
true, but it's the best we can do atop a Unix kernel.  Other than that,
there is no "window" for possible inconsistency.)
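
To make the ordering concrete, here is a minimal sketch of the "data pages first, commit record last" sequence. It is not the backend's actual code: the file layout, the 8192-byte page size, the one-byte-per-transaction log format, and the names write_and_sync, commit, and is_committed are all assumptions made for illustration.

```python
import os

PAGE_SIZE = 8192  # assumed page size for this sketch


def write_and_sync(path, offset, data):
    """Write data at offset, then block until fsync reports it flushed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)
        os.fsync(fd)  # do not return until the kernel has pushed this out
    finally:
        os.close(fd)


def commit(data_path, log_path, xid, pages):
    """pages maps page number -> bytes. Data pages go first, commit record last."""
    # Step 1: force every data page touched by the transaction to disk.
    for pageno, content in sorted(pages.items()):
        padded = content.ljust(PAGE_SIZE, b"\0")
        write_and_sync(data_path, pageno * PAGE_SIZE, padded)
    # Step 2: only now mark the transaction committed (hypothetical log
    # format: one byte per transaction id, b"C" meaning committed).
    write_and_sync(log_path, xid, b"C")


def is_committed(log_path, xid):
    """A reader treats the transaction as committed only if the log says so."""
    if not os.path.exists(log_path):
        return False
    with open(log_path, "rb") as f:
        f.seek(xid)
        return f.read(1) == b"C"
```

A crash anywhere before step 2 completes leaves is_committed() false, so whatever data pages did reach disk are simply ignored.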

Without fsync, the kernel writes the pages to disk in whatever order
it finds convenient, so following a crash there might be a pg_log entry
saying transaction N was committed, when in fact only some of
transaction N's tuples made it to disk.  Then you see an inconsistent
database: some of the transaction's updates are there, some are not.
This might be relatively harmless, or deadly, depending on your
application logic and just what the missing updates are.
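
The difference can be sketched as a toy model that enumerates which writes might be on disk at the moment of a crash. With enforced ordering, the on-disk set is always a prefix of the write sequence; without it, any subset is possible. The function names and the string labels below are assumptions for illustration only.

```python
from itertools import combinations


def visible_state(on_disk, commit_record, data_pages):
    """Classify what a reader sees, given the set of writes that reached disk."""
    if commit_record not in on_disk:
        # No commit record: the transaction appears uncommitted, and any
        # pages that did land are ignored.
        return "uncommitted"
    if all(page in on_disk for page in data_pages):
        return "committed"
    # Commit record present but some of the transaction's pages are missing.
    return "inconsistent"


def crash_outcomes(data_pages, commit_record, ordered):
    """Return the set of states reachable across all possible crash points."""
    writes = list(data_pages) + [commit_record]
    if ordered:
        # Ordered writes: only prefixes of the sequence can be on disk.
        candidates = [set(writes[:n]) for n in range(len(writes) + 1)]
    else:
        # Kernel-chosen order: any subset of the writes can be on disk.
        candidates = [
            set(combo)
            for n in range(len(writes) + 1)
            for combo in combinations(writes, n)
        ]
    return {visible_state(s, commit_record, data_pages) for s in candidates}
```

Calling crash_outcomes(["p1", "p2"], "commit", ordered=True) can only yield "uncommitted" or "committed"; with ordered=False the "inconsistent" state becomes reachable as well.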

Another risk without fsync is that a client application might have been
told that the transaction was committed, when in fact it gets lost due to
a crash moments later before pg_log gets physically updated.  Again, the
possible consequences would depend on your application.

The total number of writes performed without fsync is usually far lower
than with it, since we tend to write certain pages (esp. pg_log) over and
over --- the kernel will reduce that to one physical disk write every
sync interval (~ 30sec) unless we force its hand with fsync.  That's
where most of the performance improvement comes from.
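
A rough way to see the effect is to count writes under a simplified model: with fsync, every commit pushes each page it touched to disk; without it, the kernel is assumed to write each dirty page exactly once per sync interval. Real kernels are not this tidy, and the function name and page labels are made up for the example.

```python
def physical_writes(commits, sync_interval, use_fsync):
    """Count disk writes for a list of (time_in_seconds, set_of_page_ids)."""
    if use_fsync:
        # Each commit forces all of its touched pages out immediately.
        return sum(len(pages) for _, pages in commits)
    # Without fsync, repeated writes to the same page within one sync
    # interval collapse into a single physical write.
    dirty_by_interval = {}
    for t, pages in commits:
        interval = int(t // sync_interval)
        dirty_by_interval.setdefault(interval, set()).update(pages)
    return sum(len(pages) for pages in dirty_by_interval.values())


# 100 small transactions in ten seconds, each updating the same data page
# and the shared pg_log page.
commits = [(i * 0.1, {"pg_log", "data0"}) for i in range(100)]
with_fsync = physical_writes(commits, 30, use_fsync=True)
without_fsync = physical_writes(commits, 30, use_fsync=False)
```

Under these assumptions with_fsync is 200 and without_fsync is 2, because both pages are rewritten on every commit but flushed only once in the interval.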

If you have a reliable kernel and reliable hardware/power supply, then
you might as well turn off fsync.  A crash in Postgres itself would
not cause a problem --- the writes are out there in the kernel's disk
buffers, and the only question is whether you trust the platform to get
the data onto stable storage.
        regards, tom lane

