Re: Mount options for Ext3? - Mailing list pgsql-performance

From	Kevin Brown
Subject	Re: Mount options for Ext3?
Date	January 24, 2003 19:30:11
Msg-id	20030125003011.GA28252@filer
In response to	Mount options for Ext3? (Josh Berkus <josh@agliodbs.com>)
Responses	Re: Mount options for Ext3? Re: Mount options for Ext3?
List	pgsql-performance

Josh Berkus wrote:
> Folks,
>
> What mount options do people use for Ext3, particularly what do you set "data
> = " for a high-transaction database?  I'm used to ReiserFS ("noatime,
> notail") and am not really sure where to go with Ext3.

For ReiserFS, I can certainly understand using "noatime", but I'm not
sure why you use "notail" except to allow LILO to operate properly on
it.

The default for ext3 is ordered writes: file data is written to disk
before the associated metadata transaction commits, but the data itself
isn't journalled.  But because PostgreSQL synchronously writes the
transaction log (using fsync() by default, if I'm not mistaken) and
uses sync() during a checkpoint, I would think that ordered writes at
the filesystem level would probably buy you very little in the way of
additional data integrity in the event of a crash.
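
To make the "synchronous write" point concrete, here is a minimal
sketch of appending a record to a log file and forcing it to disk with
fsync().  It is only an illustration: the function name and record
format are made up, and this is not PostgreSQL's actual code.

```python
import os


def append_log_record(log_path: str, record: bytes) -> int:
    """Append one record to a log file and force it to stable storage.

    Hypothetical helper for illustration; returns the bytes written.
    """
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        written = 0
        # os.write() may write fewer bytes than requested, so loop.
        while written < len(view):
            written += os.write(fd, view[written:])
        # Block until the kernel reports the file's data as flushed.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

Once fsync() has returned for the log file, the record no longer
depends on how the filesystem orders its own data and metadata writes.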

So if I'm right about that, then you might consider using the
"data=writeback" option for the filesystem that contains the actual
data (usually /usr/local/pgsql/data).  For everything else I'd use at
least the default ("data=ordered").  I suppose there's no harm in using
"data=journal" there if you're willing to put up with the performance
hit, but it's not clear to me what benefit, if any, there is.
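
If you want to check which mode a filesystem ended up with, a small
sketch like the following can pull the "data=" option out of a mount
table in the usual "device mountpoint fstype options ..." layout.  The
function name, device names and sample table are hypothetical, and the
option may simply not be listed when the default is in effect.

```python
from typing import Optional

# Hypothetical mount table; the devices and layout are assumptions.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw,data=ordered 0 0
/dev/sdb1 /usr/local/pgsql/data ext3 rw,noatime,data=writeback 0 0
"""


def journal_mode(mounts_text: str, mount_point: str) -> Optional[str]:
    """Return the data= option for mount_point, or None if not listed."""
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        # A later entry for the same mount point replaces an earlier one.
        mode = None
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
    return mode
```

On Linux the text could come from reading /proc/mounts; a result of
None means the option was not shown, not that journalling is off.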

I use ReiserFS too, so the above is based on what I know about the
ext3 filesystem and the way PostgreSQL writes data rather than on
hands-on experience with ext3.

The more interesting question in my mind is: if you use PostgreSQL on
an ext3 filesystem with "data=ordered" or "data=journal", can you get
away with turning off PostgreSQL's fsync altogether and still get the
same kind of data integrity that you'd get with fsync enabled?  If the
operating system is able to guarantee data integrity, is it still
necessary to worry about it at the database level?

I suspect the answer to that is that you can safely turn off fsync
only if the operating system guarantees that writes from a process
actually reach disk in the order that process issued them.  Otherwise
you'd have to worry about the transaction log write that marks a
checkpoint as completed reaching disk before the data file writes made
during that checkpoint.  If the system crashed between the two, the
overall database would be left in an inconsistent state.  And my
suspicion is that the operating system rarely makes any such
guarantee, journalled filesystem or not.
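
The ordering concern can be sketched as follows: flush the data files
first, and only then write and flush the log record that says the
flush is complete.  This is a toy illustration under my own
assumptions (the names and file layout are invented), not how
PostgreSQL actually implements it.

```python
import os
from typing import Dict


def _write_and_sync(path: str, data: bytes, flags: int) -> None:
    """Write data to path and fsync it before returning."""
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        offset = 0
        while offset < len(view):
            offset += os.write(fd, view[offset:])
        os.fsync(fd)
    finally:
        os.close(fd)


def flush_then_mark_complete(data_files: Dict[str, bytes],
                             log_path: str, marker: bytes) -> None:
    """Hypothetical helper: make data durable, then log completion."""
    # Step 1: every data file is written and fsync'd first.
    for path, contents in data_files.items():
        _write_and_sync(path, contents,
                        os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    # Step 2: only now append the completion marker to the log.
    # Without the fsync calls in step 1, the OS could write this
    # marker to disk before the data it claims is already there.
    # (Syncing the containing directory is omitted for brevity.)
    _write_and_sync(log_path, marker,
                    os.O_WRONLY | os.O_CREAT | os.O_APPEND)
```

Here fsync() is acting as the ordering barrier; take it away and the
only thing left to enforce the order is whatever the operating system
happens to promise.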

--
Kevin Brown                          kevin@sysexperts.com
