pgsql@j-davis.com (Jeff Davis) wrote:
> On Wed, 2006-09-06 at 18:55 -0400, Chris Browne wrote:
>> pgsql@j-davis.com (Jeff Davis) writes:
>> >> > Do you see an advantage in using LFS for PostgreSQL?
>> >>
>> >> Hey guys - I think the original poster only meant to suggest that it
>> >> was *interesting*... :-)
>> >>
>> >
>> > I see, my mistake.
>>
>> From a reliability perspective, I can see some value to it...
>>
>> I have seen far too many databases corrupted by journalling gone bad
>> in the past year... :-(
>
> Can you elaborate a little? Which filesystems have been problematic?
> Which filesystems are you more confident in?

Well, more or less *all* of them, on AMD-64/Linux.

The "pulling the fibrechannel cable" test broke them all: XFS, ext3,
and JFS. ReiserFS was, if I recall correctly, marginally better, but
only marginally.

On AIX, we have seen JFS2 falling over when there were enough levels
of buffering in the way on disk arrays.

>> > And if there is an improvement, shouldn't that be a project for
>> > something like Linux, where other databases could also benefit?
>> > It could just be implemented as a database-specific filesystem.
>>
>> The classic problem with log structured filesystems is that
>> sequential reads tend to be less efficient than in overwriting
>> systems; perhaps if they can get "vacuuming" to be done frequently
>> enough, that might change the shape of things.
>>
>> That would be a relevant lesson that _we_ have discovered that is
>> potentially applicable to filesystem implementors.
>>
>> And I don't consider this purely of academic interest; the ability to:
>> a) Avoid the double writing of journalling, and
>> b) Avoid the risks of failures due to misordered writes
>> are both genuinely valuable.
>
> Right, LFS is promising in a number of ways. I've read about it in
> the past, and it would be nice if this NILFS implementation sparks
> some new research in the area.

Indeed.

I don't see it being a "production-ready" answer yet, but yeah, I'd
certainly like to see the research continue. A key open problem is
vacuuming; there may be things to be learned in both directions, by
filesystem implementors and database implementors alike.
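
To make the vacuuming point a bit more concrete, here is a toy sketch
in Python. It is not NILFS or any real filesystem; the class ToyLog and
its methods are made up purely for illustration, and it assumes that
one "seek" is paid whenever consecutive logical blocks are not adjacent
in the log. It only shows why updates scatter a sequential scan in a
log-structured layout, and how a vacuum pass that rewrites live blocks
in logical order undoes that.

```python
class ToyLog:
    """Toy log-structured store (hypothetical): writes only ever append."""

    def __init__(self):
        self.log = []    # physical log: (block_no, data), or None if dead
        self.index = {}  # logical block number -> position of the live copy

    def write(self, block_no, data):
        old = self.index.get(block_no)
        if old is not None:
            self.log[old] = None  # the previous copy becomes dead space
        self.index[block_no] = len(self.log)
        self.log.append((block_no, data))

    def read(self, block_no):
        return self.log[self.index[block_no]][1]

    def seeks_for_scan(self):
        """Count breaks in physical adjacency when scanning blocks in order."""
        positions = [self.index[b] for b in sorted(self.index)]
        return sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)

    def vacuum(self):
        """Rewrite live blocks in logical order and drop dead entries."""
        live = [(b, self.read(b)) for b in sorted(self.index)]
        self.log = live
        self.index = {b: i for i, (b, _) in enumerate(live)}


store = ToyLog()
for b in range(8):
    store.write(b, f"v0-{b}")      # initial load: the log is in logical order
for b in (1, 4, 6):
    store.write(b, f"v1-{b}")      # updates land at the tail of the log
seeks_before = store.seeks_for_scan()
store.vacuum()
seeks_after = store.seeks_for_scan()
```

With eight blocks and three updates, the scan goes from no breaks to
six, and back to none after the vacuum. How often a real system can
afford that rewrite is exactly the open question.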
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/fs.html
Health is merely the slowest possible rate at which one can die.