Hello Scott!
Thank you. I know Memtest86. I think we will use it to test our
hardware too.
Meanwhile, I got some other useful information from someone also running a
DL380 server, which had a defective backplane causing similar issues.
He also gave me the hint that there's a test suite CD from Compaq for running
hardware diagnostic checks on our machine. I will try this out as soon as
possible.
I will let you know when I know more :)
-- Matthias
> -----Original Message-----
> From: Scott Marlowe [mailto:smarlowe@g2switchworks.com]
> Sent: Wednesday, September 20, 2006 4:12 PM
> To: Matthias.Pitzl@izb.de
> Cc: pgsql-general@postgresql.org
> Subject: RE: [GENERAL] Strange database corruption with PostgreSQL 7.4.x on
>
> Keep in mind, a single bad memory location is all it takes to cause data
> corruption, so it could well be memory. CPU is less likely if the
> machine is otherwise running stable.
>
> The standard tool on x86 hardware is memtest86 www.memtest86.com
>
> So, you'd have to schedule a maintenance window to run the test in since
> you have to basically down the machine and run just memtest86. I think
> a few live linux distros have it built in (FC has a memtest label in
> some versions I think)
>
> My first suspicion is always memory. We ordered a batch of memory from
> a very off brand supplier, and over 75% tested bad. And it took >24
> hours to find some of the bad memory.
>
> good luck with your testing, let us know how it goes.
>