Re: FAQ Q - Mailing list pgsql-admin

From Tim Ellis
Subject Re: FAQ Q
Date
Msg-id 20020610164607.6439eb6b.Tim.Ellis@gamet.com
Whole thread Raw
In response to Re: FAQ Q  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: FAQ Q
List pgsql-admin
On Mon, 10 Jun 2002 17:16:23 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> You do not have to trust Postgres itself: in all cases we push the log
> entries out to the OS before declaring a transaction committed.

In all cases while fsync mode is enabled, not in all cases (including
no-fsync) right?

> question at hand is whether we use fsync() or other methods to try to
> force the OS to write to disk before we report the transaction
> committed.  Without that, a system-level crash immediately after a
> commit report might mean the "committed" transaction isn't reflected as
> committed on disk.

So no-fsync simply means that "commit" != "sync'd to disk"? How is this
different than the big boys do it? From Sybase internals classes and
Oracle documentation, I know that when you say "COMMIT" you don't get any
response back until the RDBMS has written that transaction TO DISK.

Is the FAQ being too generous in speed and too unfair in recoverability to
the commercial offerings here?

From the FAQ: "Normal fsync mode flushes every completed transaction to
disk, guaranteeing that if the OS crashes or loses power in the next few
seconds, all your data is safely stored on disk. In this mode, we are
slower than most commercial databases, partly because few of them do such
conservative flushing to disk in their default modes."

Seems we're claiming that most commercial databases don't actually flush
anything to disk before returning a result to a COMMIT, and at the same
time that therefore they're faster.

My training says Sybase/DB2/Oracle/Informix won't be any less conservative
and at the same time, won't be any faster.

Sybase at least just tries to avoid writing tranlogs to disk except in
batches, so when you say COMMIT, it pauses until X amount of time passes
or Y number of committed transactions have piled up, then it writes those
trans to disk. But your client is blocked on the COMMIT until the
transaction actually makes it to disk.

Perhaps we are including in "most commercial databases" several RDBMSs not
Sybase/Oracle/DB2/Informix?

> The performance cost of fsync is not nearly what it used to be pre-7.1,
> btw, because we need flush only the WAL log file not data files.

Oh, so here's where my confusion continues. The fsync pre-7.1 was flushing
data pages, whereas 7.1/7.1 only flushes WAL? So basically the FAQ quotes
above and previously apply only to pre-7.1, and not 7.1/7.2?

> (In case of a crash, any missing data-file updates will be reconstructed
> from the recent WAL entries during restart.)

This is what I'd thought the FAQ was referring to when it said "all your
data is safely stored on disk." So am I right in saying:

In 7.1/7.2, when you COMMIT, you are guaranteed that WAL (and probably
nothing but WAL) is written to disk before you get a return and that
therefore if your instance immediately thereafter crashes, that after
recovery (which is required), you will have a consistent (not-corrupted)
database.

???

Cheers,
Tim Ellis
DBA, Gamet

ps -- I'm trying to find a writeup that kind'f compares Postgres to
Sybase/Oracle/DB2/Informix and/or MySQL to all the aforementioned
databases, but can find nothing substantial. I'm interested in
ADMINISTRATIVE viewpoints. Most of the info on the web concentrates on
what DEVELOPERS think of the aforementioned offerings, but nothing about
what ADMINISTRATORS think. For a good example of a bad discussion, see the
Slashdot thread on Postgres vs. MySQL.

Any hints for me?

FAQ 1.14 (at http://postgresql.org/docs/faq-english.html) is close to what
I'm talking about, but I'd love something far more in-depth and which
compares to specific RDBMSs (ie: isn't politically correct).

pgsql-admin by date:

Previous
From: Tom Lane
Date:
Subject: Re: FAQ Q
Next
From: Tom Lane
Date:
Subject: Re: FAQ Q