Re: hardware checks (was Re: invalid memory alloc request size) - Mailing list pgsql-general

From Greg Stark
Subject Re: hardware checks (was Re: invalid memory alloc request size)
Msg-id 87lkx6us7y.fsf@stark.xeocode.com
In response to hardware checks (was Re: invalid memory alloc request size)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: hardware checks (was Re: invalid memory alloc request  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-general
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Janning Vygen <vygen@gmx.de> writes:
> > one more question: You mentioned standard disk and memory checks. Can you
> > point to some link where i can find more about it or which software do you
> > mean? I guess i have to start looking at it.
>
> The stuff I've heard recommended is memtest86 for memory checks and
> badblocks for disk checks.  But perhaps someone on the list has better
> ideas.

I second memtest86, though even the author says memory errors can be tricksy
things. Sometimes a large compile finds memory errors that even memtest86
doesn't find (the symptom is gcc crashing).

However, I fear using badblocks alone is pretty useless these days. Modern IDE
drives detect bad blocks and remap them to other locations. If you just use
badblocks you'll see mysterious errors that disappear, or you might not see
any errors at all. You need to use a tool like smartctl to query the drive's
SMART firmware about errors. It's not easy to interpret, but if you watch the
numbers for a while you can tell whether a drive is going bad and continually
remapping bad blocks. badblocks is still useful as a way of ensuring that
every block is read and written to (it only reads by default; a write test
needs -n or -w), but then you have to look at the SMART data to see what
happened.
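
To make "watch the numbers for a while" concrete, here is a rough sketch in
Python, not a definitive tool. It assumes the usual ten-column attribute table
printed by `smartctl -A`, and the three attribute names in WATCHED are only an
assumption: they are common, but names and meanings vary by vendor. The helper
names (parse_attributes, read_attributes, growing) are made up for
illustration.

```python
import subprocess

# Assumed attribute names; check what your own drive actually reports.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def parse_attributes(text):
    """Map attribute name -> raw value from `smartctl -A` style output."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            try:
                attrs[parts[1]] = int(parts[9])
            except ValueError:
                continue  # raw value is not a plain integer; skip it
    return attrs


def read_attributes(device):
    """Run smartctl on a device (usually needs root) and parse its table."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_attributes(result.stdout)


def growing(before, after, names=WATCHED):
    """Return watched attributes whose raw value rose between snapshots."""
    return {name: (before[name], after[name])
            for name in names
            if name in before and name in after
            and after[name] > before[name]}
```

Take one snapshot with read_attributes("/dev/hda") before a badblocks run and
another afterwards (the device path is just an example). A non-empty result
from growing() means the drive remapped or flagged sectors in between.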

--
greg
