Thread: what could cause postgres to crash?

what could cause postgres to crash?

From
Sandeep Gupta
Date:
Hi,

 My postgres sessions, after being idle for 5-6 hrs, crash on their own.
Sometimes with error messages, sometimes without. The message I get is appended below. I am looking for suggestions to narrow down what could have caused this problem. The system log doesn't show anything.

Thanks.
Sandeep



LOG:  statistics collector process (PID 6631) was terminated by signal 9: Killed
LOG:  server process (PID 15710) was terminated by signal 9: Killed
DETAIL:  Failed process was running: COMMIT PREPARED 'T13199'
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
LOG:  all server processes terminated; reinitializing


Re: what could cause postgres to crash?

From
Tom Lane
Date:
Sandeep Gupta <gupta.sandeep@gmail.com> writes:
>  My postgres sessions, after being idle for 5 --6 hrs, crash on their own.
> Sometimes with error messages sometimes without. The message I get appended
> below. I was looking for suggestion to narrow down as to what could have
> caused this problem. System log doesn't show anything.

> LOG:  statistics collector process (PID 6631) was terminated by signal 9:
> Killed

Signal 9 is a kill.  If you didn't manually kill the process, it's almost
certainly the infamous Linux OOM killer that did it.  And yes, that would
be recorded in the kernel log ...

            regards, tom lane
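
As a rough way to act on the suggestion above, the following sketch scans kernel log text for lines that look like OOM-killer activity. The message patterns are assumptions based on common Linux kernel wording, which varies between kernel versions, and the function name find_oom_events is made up for illustration. It is not a definitive detector.

```python
import re

# Patterns loosely matching Linux OOM-killer messages. The exact wording
# differs between kernel versions, so these are assumptions, not a spec.
OOM_PATTERNS = [
    re.compile(r"invoked oom-killer"),
    re.compile(r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
    re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
]


def find_oom_events(lines):
    """Return (line, pid, name) tuples for lines that look like OOM-killer activity.

    pid and name are None when the matching line does not carry them
    (for example the "invoked oom-killer" header line).
    """
    events = []
    for line in lines:
        for pattern in OOM_PATTERNS:
            match = pattern.search(line)
            if match:
                groups = match.groupdict()
                pid = groups.get("pid")
                events.append(
                    (line.rstrip("\n"), int(pid) if pid else None, groups.get("name"))
                )
                break  # one event per line is enough
    return events
```

The function takes any iterable of lines, so it can be fed saved dmesg output or an open kernel log file. Which file holds the kernel log depends on the distribution. Comparing the reported PIDs with those in the PostgreSQL log (6631 and 15710 above) would help confirm or rule out the OOM killer.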


Re: what could cause postgres to crash?

From
Scott Marlowe
Date:
On Fri, Nov 8, 2013 at 8:16 PM, Sandeep Gupta <gupta.sandeep@gmail.com> wrote:
> Hi,
>
>  My postgres sessions, after being idle for 5 --6 hrs, crash on their own.
> Sometimes with error messages sometimes without. The message I get appended
> below. I was looking for suggestion to narrow down as to what could have
> caused this problem. System log doesn't show anything.

Just in case you don't know: PostgreSQL itself NEVER issues a kill -9
to a backend, and a crashing backend will show as a sig 11, not 9.

While it's possible that, as Tom suggested, it's the OOM killer, it's
also possible somebody got too smart and wrote a script to kill idle
connections and used kill -9 instead of kill or kill -15 (default on
linux for kill is -15).

kill -9, much like a regular backend crash or panic, is bad because it
causes all backends to restart. This can kill server performance if
it happens often.
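
To make the signal distinction above concrete, here is a small sketch that pulls "was terminated by signal N" lines out of a PostgreSQL server log and labels them along the lines discussed in this thread. The regular expression is modelled on the log excerpt in the original post. The function name and the hint texts are illustrative assumptions, not anything PostgreSQL itself provides.

```python
import re
import signal

# Modelled on lines such as:
#   LOG:  server process (PID 15710) was terminated by signal 9: Killed
TERMINATED = re.compile(
    r"LOG:\s+(?P<what>.+?) \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)

# Interpretations taken from the discussion in this thread (hypothetical wording).
HINTS = {
    "SIGKILL": "sent from outside PostgreSQL: OOM killer or a manual/scripted kill -9",
    "SIGSEGV": "the backend itself crashed",
    "SIGTERM": "plain kill (the default signal)",
}


def classify_termination(line):
    """Return a dict describing a 'terminated by signal' log line, or None."""
    match = TERMINATED.search(line)
    if match is None:
        return None
    signum = int(match.group("sig"))
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = "signal %d" % signum  # number unknown on this platform
    return {
        "process": match.group("what"),
        "pid": int(match.group("pid")),
        "signal": name,
        "hint": HINTS.get(name, "no interpretation offered"),
    }
```

Run over the excerpt from the original post, both terminated processes come back as SIGKILL on Linux, which is why the thread points at something external rather than a PostgreSQL crash.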


Re: what could cause postgres to crash?

From
Sandeep Gupta
Date:
Dear Scott,

 Thanks for the input. I doubt that the system has any scripts to kill idle connections, but I will double-check that. There are two difficulties in tracing
and fixing this: 1) the crash is not predictable, and 2) I can't find any "Killed" entries in the kernel log files.

There are a few materials on the net that explain how to fix this problem. But first it would be good to confirm that it is the OOM killer and, if not, what exactly is causing this to happen.

-Sandeep


