Thread: constant crashing hardware issue and thank you TAKE AWAY
I discovered that one of the memory sticks in the machine was damaged.
Running memtest86 on the machine generated many RAM errors.
This was causing the strange, erratic errors in PostgreSQL.
The hardware technician explained that he sees this often and that there is no one cause for such problems.
As I am not a hardware specialist, I never thought that RAM could cause such problems.
I always assumed that the OS (Ubuntu or Windows) would advise me if there was ever an issue with memory.
TAKE AWAY:
As a result of this I will be checking the RAM on all my machines once a month or the moment a machine starts to act strange.
Thanks again to all who helped with this issue.
You get that kind of support for “damaged RAM” with ECC memory, on CPUs that support it.
Xeon CPUs, for example.
On 17 Apr 2024, at 15:06, jack <jack4pg@a7q.com> wrote:
> I always assumed that the OS (Ubuntu or Windows) would advise me if there was ever an issue with memory.
On 2024-04-17 23:06, jack wrote:
<snip>
> As a result of this I will be checking the RAM on all my machines once
> a month or the moment a machine starts to act strange.

Once a month is overkill, and unlikely to be useful. :)

Server or enterprise grade hardware will support "ECC" memory. That has extra memory chips + supporting circuitry on the memory board so it can detect + correct most errors which happen, without them causing problems.

For the errors that it can't *correct*, it'll still generate warnings to your system software to let you know (if you've configured it).

If you do get such a warning - or if the system starts acting funny like you saw - that's when you'd want to run memtest on the system.

---

The other time to run memtest on the system is when you first buy or receive a new server. You'd generally do a "burn in" test of all the things (memory, hard disks/SSDs, CPU, GPU, etc.) just to make sure everything is OK before you start using it for important stuff.

Regards and best wishes,

Justin Clift
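For anyone who wants to see those ECC warnings on a Linux box, here is a minimal sketch of one way to check. It assumes the kernel's EDAC driver for the platform is loaded, in which case each memory controller exposes `ce_count` (corrected errors) and `ue_count` (uncorrected errors) under `/sys/devices/system/edac/mc/`. The function names `read_edac_counts` and `needs_memtest` are made up for this example, and the "run memtest on any uncorrected error" rule is just one possible reading of the advice above, not an established policy.

```python
from pathlib import Path

# Default location of the Linux EDAC memory-controller counters.
# Assumption: the EDAC driver is loaded; otherwise this directory is absent.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller_name: (corrected, uncorrected)} error counts."""
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts  # no EDAC counters exposed on this machine
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


def needs_memtest(counts):
    """Hypothetical policy: any uncorrected error means run memtest."""
    return any(ue > 0 for _ce, ue in counts.values())


if __name__ == "__main__":
    counts = read_edac_counts()
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
    if needs_memtest(counts):
        print("Uncorrected errors reported; time to run memtest.")
```

On a machine without ECC memory (or without the EDAC driver) the function returns an empty dict, which only means nothing is being reported, not that the RAM is healthy.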