Thread: FATAL 1
I found a couple of entries in my dmesg output stating that Linux (kernel 2.4) killed two postmasters because the system ran out of memory. I think the postmaster should log such events as FATAL 1. I am trying to recover the approximate date and time at which these FATAL 1 entries were made so that I can figure out what went wrong, but the postgres log file I have doesn't have time stamps. Does anyone know how I can recover these time stamps? If not, is there a log level at which time stamps are written? Thanks
newsreader@mediaone.net writes:

> I found a couple of entries in my dmesg which stated that linux
> (kernel 2.4) killed two postmasters because the system ran out of
> memory.
>
> I think that postmaster should log such instances as FATAL 1.

IIRC, the kernel sends a SIGKILL signal in that case, so the affected application doesn't have a chance to react, it just gets terminated immediately. If you want to monitor these events better you need to ask your kernel for help. If the postmaster gets terminated normally, for various definitions of normal, you will get log entries.

> Does anyone know how I can recover these time stamps? If not is there
> a log level for which time stamps will be made?

You can turn on time stamps in the postgresql.conf file, but that won't help you in this case.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
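The point about SIGKILL can be checked from user space. The following is a minimal Python sketch, not anything PostgreSQL itself does: it tries to install a handler for a given signal and reports whether the operating system allows it. The helper name `can_install_handler` is made up for this illustration.

```python
import signal

def can_install_handler(signum):
    """Return True if this process is allowed to install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda signo, frame: None)
    except (OSError, ValueError):
        # The OS refuses handlers for SIGKILL, so a process cannot react to it.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # put the earlier handler back
    return True

sigkill_catchable = can_install_handler(signal.SIGKILL)
```

On a POSIX system `sigkill_catchable` should come out `False`, which is why a process killed this way cannot write a log entry about its own death.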
On Fri, Aug 10, 2001 at 12:42:33AM +0200, Peter Eisentraut wrote:
> newsreader@mediaone.net writes:
>
> > I think that postmaster should log such instances as FATAL 1.
>
> IIRC, the kernel sends a SIGKILL signal in that case, so the affected
> application doesn't have a chance to react, it just gets terminated
> immediately. If you want to monitor these events better you need to ask

OK, here is what I find in dmesg:

------------
Out of Memory: Killed process 17534 (postmaster).
Out of Memory: Killed process 18228 (postmaster)
-----------

I think backends got killed instead of the postmaster. The fact is that the postmaster did not die; it is still running now and apparently survived the out-of-memory event.
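Since the kernel message gives only a process name, the PID is the useful part when comparing against other records. As a rough sketch, assuming the dmesg lines always look like the two quoted above, the PIDs could be pulled out like this (the names `OOM_LINE` and `parse_oom_kills` are invented for the example):

```python
import re

# Pattern taken from the two dmesg lines quoted in this thread; other kernel
# versions may word the message differently.
OOM_LINE = re.compile(r"Out of Memory: Killed process (\d+) \(([^)]+)\)")

def parse_oom_kills(dmesg_text):
    """Return a list of (pid, process_name) tuples for each OOM kill found."""
    return [(int(pid), name) for pid, name in OOM_LINE.findall(dmesg_text)]

sample = """Out of Memory: Killed process 17534 (postmaster).
Out of Memory: Killed process 18228 (postmaster)"""

kills = parse_oom_kills(sample)
```

The resulting PIDs can then be compared with the PID of the postmaster that is still running to confirm that the killed processes were something else.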
On Thu, Aug 09, 2001 at 11:19:14PM -0400, Tom Lane wrote:
> newsreader@mediaone.net writes:
> > I think backends got killed instead of postmaster
>
> Assuming that it was a backend that got killed, the postmaster should
> certainly have seen and logged that event. What are you doing with
> the postmaster log?

pg 7.1.2, kernel 2.4.4

I have the log, but it doesn't make any sense to me. I deliberately killed a backend on my development box and noticed a FATAL 1 entry in the log. So I ran

    $ grep 'FATAL 1' log

and got a number of entries, but they are not very informative.

I start my postmaster like this:

    $ /usr/local/pg7.1/bin/pg_ctl -o "-F -i -h 192.168.0.1" start -l log

and I did not adjust the debug/log level; it is at the default.

Thanks
kz
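The remark that the postmaster "should certainly have seen" a killed backend rests on the fact that a parent process can read how its child ended, even when the child had no chance to react. The sketch below illustrates that general mechanism in Python with a throwaway child process. It is not the postmaster's actual code, and `describe_exit` and its message format are invented for the example.

```python
import signal
import subprocess
import sys
import time

def describe_exit(pid, returncode):
    """Build a time-stamped message from a child's exit status."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    if returncode < 0:
        # subprocess reports death by signal N as a return code of -N.
        return f"{stamp} child {pid} was terminated by signal {-returncode}"
    return f"{stamp} child {pid} exited with status {returncode}"

# Start a child that would sleep for a minute, then kill it the way the
# kernel's out-of-memory handling is described as doing in this thread.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
child.send_signal(signal.SIGKILL)
returncode = child.wait()
message = describe_exit(child.pid, returncode)
```

The child never gets to log anything, but the parent still learns which signal ended it and can record the time itself.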
After looking carefully at the pg log, I don't think any of the FATAL 1 entries coincide with those in dmesg, but I am not sure.
Tom Lane writes:

> Probably. If your *postmaster* is running out of memory then you have
> some serious problems. (But: we have had memory-leak problems in the
> past in certain authentication paths. What PG version are you running,
> anyway?)

I think the behaviour is that if the system as a whole runs out of memory (physical + swap or whatever) the kernel randomly kills processes to make room. So a disappearing postmaster is not necessarily a sign of a fault on PostgreSQL's part. I'm not sure how to configure the kernel in this area, I just recall the discussion on the kernel mailing list about whether the init process would be allowed to be randomly killed as well. (I kid you not.)

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter