Re: Re: New Linux xfs/reiser file systems - Mailing list pgsql-hackers

From Michael Samuel
Subject Re: Re: New Linux xfs/reiser file systems
Date
Msg-id 20010504213534.A4596@miknet.net
In response to Re: New Linux xfs/reiser file systems  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
On Thu, May 03, 2001 at 11:41:24AM -0400, Bruce Momjian wrote:
> ext2 has serious problems with corrupt file systems after a crash, so I
> understand the need to move to another file system type.  I have been
> waiting for Linux to get a more modern file system. Unfortunately, the
> new ones seem to be worse for PostgreSQL.

If you fsync() a directory in Linux, all the metadata within that directory
will be written out to disk.
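To make that concrete, here is a minimal sketch of the idea in Python (the os module wraps the same open()/fsync() system calls). The helper names fsync_dir and create_durably are made up for illustration, and I'm assuming a Linux system where a directory can be opened read-only and passed to fsync():

```python
import os


def fsync_dir(path):
    """Open a directory read-only and fsync() it (assumes Linux semantics)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def create_durably(path, data):
    """Write a new file, then flush the file and its parent directory."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                # push Python's buffer to the kernel
        os.fsync(f.fileno())     # flush the file's contents
    # Flush the directory so the new directory entry is written out too.
    fsync_dir(os.path.dirname(os.path.abspath(path)))
```

Calling create_durably("/some/dir/newfile", b"...") would then fsync both the file and /some/dir. How much this buys you on a given filesystem is something I'd test rather than assume.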

As for filesystem corruption, I can say that e2fsck is among the best fsck
programs out there, and I've only ever had one occasion where I lost any
data on an ext2 filesystem, and that was due to bad sectors causing me to
lose the root directory. (Well, apart from human errors, but those don't
count.)

> OK, we have considered this, but frankly, the new, modern file systems
> like FFS/softupdates have i/o rates near raw speed, with all the
> advantages a file system gives us.  I believe most commercial dbs are
> moving away from raw devices and toward file systems.  In the old days
> the SysV file system was pretty bad at i/o & fragmentation, so they used
> raw devices.

And Solaris' 1/01 media has better support for O_DIRECT (?), which they claim
gives you 93% of the speed of a raw device. (Or something like that; I read
this in marketing material a couple of months ago.)

Raw devices are designed to have filesystems on them.  The only excuses for
userland tools accessing them directly are fs-specific tools (e.g. dump, fsck)
and non-unix filesystem tools, where the unix VFS doesn't handle things
properly (hfstools).

> > The ability to put indexes on a separate volume from data.
> > The ability to put different tables on different volumes.
> > And so on.
> 
> We certainly need that, but raw devices would not make this any easier,
> I think.

It would be cool if, either at compile time or at database creation time, we
could specify a printf-like format for placing tables, indexes, etc.
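A rough sketch of what I mean, in Python for brevity. Everything here is hypothetical: the specifiers %d (database), %k (object kind, e.g. table or index) and %n (object name), and the function expand_placement, are invented for this example and are not anything PostgreSQL implements.

```python
# Hypothetical specifiers, invented for this sketch only.
SPECIFIERS = {"d": "database", "k": "kind", "n": "name"}


def expand_placement(fmt, **fields):
    """Expand a printf-like placement format into a filesystem path."""
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(fmt):
            raise ValueError("dangling '%' at end of format")
        spec = fmt[i + 1]
        if spec == "%":
            out.append("%")          # '%%' is a literal percent sign
        elif spec in SPECIFIERS:
            out.append(str(fields[SPECIFIERS[spec]]))
        else:
            raise ValueError("unknown specifier '%" + spec + "'")
        i += 2
    return "".join(out)
```

With a format such as "/vol/%k/%d/%n", an index named orders_pkey in database sales would land in /vol/index/sales/orders_pkey, so mounting a separate volume at /vol/index would put all indexes on their own disk.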

> It could become a serious problem as people start using reiser/xfs for
> their file systems and don't understand the performance problems.  Even
> more likely is that they will turn off fsync, thinking reiser doesn't
> need it, when in fact, I think it does.

ReiserFS only supports metadata logging.  The performance slowdown must be
due to logging things like mtime or atime, because otherwise ReiserFS is a
very high performance FS. (Although I admittedly haven't used it since it
was early in its development.)

-- 
Michael Samuel <michael@miknet.net>

