Dave Page wrote:
> So, the disk goes back to Seagate, and is replaced with another
> identical one, and a similar problem reoccurs (logs below) :-(. I
> haven't run badblocks yet as it takes a fair while, but wanted to find
> out if anyone thought this could be an OS issue or something else.
> Previously I've been using the 2.4.19 Linux kernel, however this machine
> is 2.4.20 (Slackware Linux 9). The SCSI adaptor is an Adaptec 29160, and
> the disks are 34Gb Seagate Cheetah X15's.
>
> Any thoughts or suggestions would be appreciated.

I'd definitely run badblocks against the new drive -- multiple times.
Either it should yield the same bad block list each time (in which
case you've got a set of unrecoverable bad blocks -- this usually means
there are no spare blocks left), or you should see the number of bad
blocks drop to zero (as the SCSI bad block remapping takes effect),
unless something really funky is going on.
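As a rough sketch of how the passes could be compared: assuming each pass was saved with `badblocks -o <file>` (one block number per line), something like the following reports whether the list is stable, has dropped to zero, or keeps changing. The function names and labels are made up for illustration and are not part of badblocks.

```python
from pathlib import Path


def read_badblocks_list(path):
    """Read one badblocks output file: one block number per line."""
    text = Path(path).read_text()
    return {int(line) for line in text.splitlines() if line.strip()}


def classify_runs(runs):
    """Classify a sequence of bad block sets, one set per badblocks pass.

    Labels (hypothetical, for illustration only):
      "clean"        - no pass reported any bad blocks
      "stable"       - every pass reported the same non-empty list
      "remapped"     - earlier passes reported bad blocks, the last reported none
      "inconsistent" - the list keeps changing without reaching zero
    """
    runs = [set(r) for r in runs]
    if not runs:
        raise ValueError("need at least one badblocks pass")
    if all(not r for r in runs):
        return "clean"
    if all(r == runs[0] for r in runs):
        return "stable"
    if not runs[-1]:
        return "remapped"
    return "inconsistent"
```

For example, `classify_runs([read_badblocks_list(p) for p in ("pass1.txt", "pass2.txt", "pass3.txt")])`, where the file names are placeholders. A result of "inconsistent" would be the "something really funky" case.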
I'd also start looking carefully through the system logs for SCSI
errors. You should see some if you're getting bad block problems (in
particular, you should see bad block remapping attempts that couldn't
read the data from the original bad block -- this, or running out of
spare blocks, is the only reason you should see errors at all on an
otherwise functional setup).
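A minimal sketch for pulling candidate lines out of a log, assuming a plain-text syslog file. The keyword list is a guess on my part: the exact wording of kernel and driver messages varies between versions, so adjust it to whatever your logs actually contain.

```python
import re

# Assumed keywords; real kernel/driver messages differ between versions,
# so treat this list as a starting point rather than a definitive filter.
SCSI_HINTS = re.compile(r"scsi|i/o error|parity|timed out", re.IGNORECASE)


def find_scsi_lines(lines):
    """Return (line_number, text) for each log line matching an assumed keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SCSI_HINTS.search(line)
    ]
```

Usage would be along the lines of `with open("/var/log/messages", errors="replace") as log: hits = find_scsi_lines(log)`. The log path is an assumption and depends on how syslog is configured on the machine.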
If badblocks shows errors but you don't see any SCSI errors in the
system logs, then it's time to start suspecting the disk controller or
perhaps even the PCI bus controller, because it means something really
weird is happening on the backend that is entirely invisible. Cabling
or termination could be an issue, but I'd expect to see parity errors,
timed-out commands, etc. if that's the problem.
--
Kevin Brown kevin@sysexperts.com