On Fri, 28 Nov 2003, Marco Colombo wrote:
> On Fri, 28 Nov 2003, Craig O'Shannessy wrote:
>
> > >
> > > From my point of view, it's just support for my demands to have each
> > > mission-critical server supported by a UPS, if not redundant power
> > > supplies and two UPSes.
> > >
> >
> > Never had a kernel panic? I've had a few. Probably flakey hardware. I
> > feel safer since journalling file systems hit linux.
>
> On any hardware flakey enough to cause panics, no FS code will save
> you. The FS may "reliably" write total rubbish to disk. It may have been
> doing that for hours, thrashing the whole FS structure, before something
> triggered the panic.
> You are no safer with journal than you are with a plain FAT (or any
> other FS technology). Journal files get corrupted themselves.
>
This isn't always true. For example, my most recent panic was due to an
IDE CD-ROM driver on a fairly expensive Intel dual Xeon box running 2.4.18.
I mounted the CD-ROM and boom, panic. If I'd been running ext2, I would
have had a very lengthy reboot and lots of pissed off users, but as it's
ext3, the system was back up in a couple of minutes, and I just removed
the CD-ROM drive from fstab (I've got other CD-ROM drives :)
I can't remember exactly what the problem was, but it was a known, if
unusual, one. From memory, I think it might have been the drive firmware.
Of course, cosmic rays etc. can and do flip bits in memory, so any non-ECC
system can panic if the wrong bit flips. Incredibly rare, but again, I'm
glad I'm running a journalling file system, if just for the reboot time.