Re: hardware checks (was Re: invalid memory alloc request - Mailing list pgsql-general

From Bruce Momjian
Subject Re: hardware checks (was Re: invalid memory alloc request
Msg-id 200601250354.k0P3srQ04494@candle.pha.pa.us
In response to Re: hardware checks (was Re: invalid memory alloc request size)  (Greg Stark <gsstark@mit.edu>)
List pgsql-general
Greg Stark wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > Janning Vygen <vygen@gmx.de> writes:
> > > one more question: You mentioned standard disk and memory checks. Can you
> > > point to some link where i can find more about it or which software do you
> > > mean? I guess i have to start looking at it.
> >
> > The stuff I've heard recommended is memtest86 for memory checks and
> > badblocks for disk checks.  But perhaps someone on the list has better
> > ideas.
>
> I second memtest86, though even the author says memory errors can be tricksy
> things. Sometimes a large compile finds memory errors that even memtest86
> doesn't find (the symptom is gcc crashing).
>
> However I fear using badblocks alone is pretty useless these days. Modern IDE
> drives detect bad blocks and remap them to other locations. If you just use
> badblocks you'll see mysterious errors that disappear or might not see any
> errors at all. You need to use tools like smartctl to query the drive's SMART
> firmware about errors. It's not easy to interpret but if you watch the numbers
> for a while you can tell if a drive is going bad and continually remapping bad
> blocks. badblocks is useful still as a way of ensuring that every block is
> read and written to, but then you have to look at the SMART data to see what
> happened.

In my experience, SCSI drive controllers will beep if they hit a bad
block that can't be read cleanly.
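
For Greg's point about watching the SMART numbers over time, here is a
rough Python sketch of comparing two snapshots of the attribute table.
It assumes smartmontools is installed and that `smartctl -A` prints the
usual table with the raw value in the tenth column. The function names
are made up for illustration, and the attribute names watched by default
vary by drive vendor, so treat them as assumptions too.

```python
import subprocess


def parse_smart_attributes(text):
    """Map attribute name -> integer raw value from `smartctl -A` style text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # header and banner lines do not, so they are skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                pass  # some raw values are not plain integers; ignore those
    return attrs


def growing(previous, current,
            names=("Reallocated_Sector_Ct", "Current_Pending_Sector")):
    """Return {name: (old, new)} for watched attributes whose raw value rose."""
    changed = {}
    for name in names:
        if name in previous and name in current and current[name] > previous[name]:
            changed[name] = (previous[name], current[name])
    return changed


def read_drive(device):
    """Run `smartctl -A` on a device (usually needs root) and parse the table."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

One way to use it: call read_drive() on the device (for example a
hypothetical /dev/sda) once a day, keep the previous result, and pass
both to growing(). A reallocated-sector count that keeps climbing is the
"continually remapping" case Greg describes. Running badblocks first
forces every block to be touched before you take the second snapshot.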

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
