On Tue, 2002-08-06 at 17:38, Markus Wollny wrote:
> What I'd like to know is if I need to look any further than RAM - shall
> I just chuck the new modules out of the machine? Or is there some other
> issue that could cause this behaviour? I am quite sure that I didn't do
> anything wrong during installation, configuration and import and the
> same application code is running without errors on a different machine
> at this very moment. I don't like the "record with zero length" and
> "Cannot allocate memory"-bits in the logfile at all, let alone the "was
> terminated by signal 9"-thingy.
>
Signal 9 is SIGKILL - that is significant because a process cannot
raise it by faulting, so something else (typically the OS) is
terminating the process. Signal 11 (SIGSEGV) would be the likely result
of a bad pointer dereference, which could well indicate RAM problems.
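
If you want to double-check what a signal number in the logfile means,
a minimal Python sketch along these lines should do. The helper name
signal_name is just an illustration, and the numbering shown is the
Linux one:

```python
import signal


def signal_name(number):
    """Return the symbolic name for a signal number, if the platform has one."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # Number does not correspond to a signal known on this platform.
        return f"unknown signal {number}"


if __name__ == "__main__":
    print(signal_name(9))   # SIGKILL on Linux: sent from outside the process
    print(signal_name(11))  # SIGSEGV on Linux: e.g. a bad pointer dereference
```
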
I don't think that you should immediately suspect your hardware. This
all looks suspiciously like an OS out-of-memory situation - that would
also fit with the machine being under load. Three things to check:
1) Is swap enabled, and sized suitably for the load on the machine?
(What does "free" say?)
2) There is a Linux sysctl (vm.overcommit_memory) which determines
whether the kernel will "overcommit" memory. Also check that ulimit
isn't imposing any per-process memory or CPU limits.
3) If it's a stock Linux install, you may be running excessive daemons,
but I'd be surprised if things got quite this bad.
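
The first two checks can be gathered in one go with a rough Python
sketch like the one below. It assumes a Linux /proc filesystem, and the
function names (parse_meminfo, read_text, describe_limit, report) are
made up for illustration. Note that the limits it prints are those of
the Python process itself, so it would need to be run as the same user
and from the same environment that starts the database for them to
mean anything.

```python
import resource


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value} (most values in kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def read_text(path):
    """Return the file's contents, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


def describe_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def report():
    """Collect swap, overcommit and per-process limit settings as text lines."""
    lines = []
    meminfo = read_text("/proc/meminfo")
    if meminfo is not None:
        info = parse_meminfo(meminfo)
        for key in ("MemTotal", "MemFree", "SwapTotal", "SwapFree"):
            lines.append(f"{key}: {info.get(key, 'n/a')} kB")
    overcommit = read_text("/proc/sys/vm/overcommit_memory")
    if overcommit is not None:
        lines.append(f"vm.overcommit_memory: {overcommit.strip()}")
    limits = (
        ("address space (bytes)", resource.RLIMIT_AS),
        ("data segment (bytes)", resource.RLIMIT_DATA),
        ("CPU time (seconds)", resource.RLIMIT_CPU),
    )
    for label, which in limits:
        soft, hard = resource.getrlimit(which)
        lines.append(
            f"limit {label}: soft={describe_limit(soft)} hard={describe_limit(hard)}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A SwapTotal of 0 kB would mean no swap is configured at all, and any
limit that is not "unlimited" is worth comparing against what the
backend processes actually need.
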
Regards
John
--
John Gray
Azuli IT
www.azuli.co.uk