Re: New Linux Filesystem: NILFS - Mailing list pgsql-hackers

From Christopher Browne
Subject Re: New Linux Filesystem: NILFS
Msg-id 87ac5czaps.fsf@wolfe.cbbrowne.com
In response to New Linux Filesystem: NILFS  (Chris Browne <cbbrowne@acm.org>)
Responses Re: New Linux Filesystem: NILFS  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
pgsql@j-davis.com (Jeff Davis) wrote:
> On Wed, 2006-09-06 at 18:55 -0400, Chris Browne wrote:
>> pgsql@j-davis.com (Jeff Davis) writes:
>> >> > Do you see an advantage in using LFS for PostgreSQL?
>> >> 
>> >> Hey guys - I think the original poster only meant to suggest that it
>> >> was *interesting*... :-)
>> >> 
>> >
>> > I see, my mistake.
>> 
>> From a reliability perspective, I can see some value to it...
>> 
>> I have seen far too many databases corrupted by journalling gone bad
>> in the past year...  :-(
>
> Can you elaborate a little? Which filesystems have been problematic?
> Which filesystems are you more confident in?

Well, more or less *all* of them, on AMD-64/Linux.

The "pulling the fibrechannel cable" test broke all of them: XFS, ext3,
JFS.  ReiserFS was, if I recall correctly, marginally better, but only
marginally.

On AIX, we have seen JFS2 falling over when there were enough levels
of buffering in the way on disk arrays.

>> > And if there is an improvement, shouldn't that be a project for
>> > something like Linux, where other databases could also benefit? 
>> > It could just be implemented as a database-specific filesystem.
>> 
>> The classic problem with log structured filesystems is that
>> sequential reads tend to be less efficient than in overwriting
>> systems; perhaps if they can get "vacuuming" to be done frequently
>> enough, that might change the shape of things.
>> 
>> That would be a relevant lesson that _we_ have discovered that is
>> potentially applicable to filesystem implementors.
>> 
>> And I don't consider this purely of academic interest; the ability to:
>>  a) Avoid the double writing of journalling, and
>>  b) Avoid the risks of failures due to misordered writes
>> are both genuinely valuable.
>
> Right, LFS is promising in a number of ways. I've read about it in
> the past, and it would be nice if this NILFS implementation sparks
> some new research in the area.

Indeed.

I don't see it being a "production-ready" answer yet, but yeah, I'd
certainly like to see the research continue.  A vital problem is in
the area of vacuuming; there may be things to be learned in both
directions.
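
To make the vacuuming point concrete, here is a toy sketch of an
append-only store.  It is only an illustration, not how NILFS or any
real log-structured filesystem is implemented; the class ToyLog and
its methods (write, read, dead_records, seek_distance, vacuum) are
hypothetical names invented for this example.  It shows that updates
leave dead records behind and scatter logically adjacent blocks, and
that a vacuum pass reclaims the space and restores sequential order.

```python
class ToyLog:
    """Toy append-only store: updates never overwrite, they append."""

    def __init__(self):
        self.log = []    # physical layout: list of (block_id, data)
        self.index = {}  # block_id -> position of the live copy in self.log

    def write(self, block_id, data):
        # Point the index at the new copy; the old copy (if any) becomes dead.
        self.index[block_id] = len(self.log)
        self.log.append((block_id, data))

    def read(self, block_id):
        return self.log[self.index[block_id]][1]

    def dead_records(self):
        # Records in the log that no index entry points at any more.
        return len(self.log) - len(self.index)

    def seek_distance(self):
        # Distance travelled through the log when reading blocks in
        # logical order; a crude stand-in for sequential-read cost.
        positions = [self.index[b] for b in sorted(self.index)]
        return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

    def vacuum(self):
        # Copy live records, in logical order, into a fresh log.
        live = [(b, self.read(b)) for b in sorted(self.index)]
        self.log, self.index = [], {}
        for block_id, data in live:
            self.write(block_id, data)


store = ToyLog()
for b in range(4):
    store.write(b, f"v1-{b}")
store.write(1, "v2-1")  # update block 1: appended, old copy is now dead
store.write(0, "v2-0")  # update block 0: appended, old copy is now dead

before = (store.dead_records(), store.seek_distance())
store.vacuum()
after = (store.dead_records(), store.seek_distance())
print("before vacuum (dead, seek):", before)
print("after vacuum  (dead, seek):", after)
```

In this toy model the cost of vacuum is a full copy of the live data,
which is exactly why how often it runs matters.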
-- 
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/fs.html
Health is merely the slowest possible rate at which one can die.

