Re: [GENERAL] postmaster deadlock while logging after syslogger exited - Mailing list pgsql-general

From: David Pacheco
Subject: Re: [GENERAL] postmaster deadlock while logging after syslogger exited
Msg-id: CACukRjMW2PJ=Lvk1+NOU3Jxgrwe_MB+=X7_+xT0-UZ=OTh_GZA@mail.gmail.com
In response to: Re: [GENERAL] postmaster deadlock while logging after syslogger exited (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-general
On Mon, Nov 6, 2017 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Pacheco <dap@joyent.com> writes:
>> ... that process appears to have exited due to a fatal error
>> (out of memory).  (I know it exited because the process still exists in the
>> kernel -- it hasn't been reaped yet -- and I think it ran out of memory
>> based on a log message I found from around the time when the process
>> exited.)
>
> Could we see the exact log message(s) involved?  It's pretty hard to
> believe that the logger would have consumed much memory.


Thanks for the quick reply!

Based on the kernel state for the dead but unreaped syslogger process, I believe it exited at 2017-10-27T23:46:21.258Z.  Here are all of the entries in the PostgreSQL log from 23:19:12 until the top of the next hour:

There's no log entry at exactly 23:46:21 or even immediately before that, but there are a lot of "out of memory" errors and a FATAL one at 23:47:28.  Unfortunately, we haven't configured logging to include the pid, so I can't be sure which messages came from the syslogger.
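
(For future reference, I believe a log_line_prefix along the following lines would include the pid on each line -- this is just an example setting, not what we're actually running; %m is a millisecond timestamp and %p is the pid:)

    # postgresql.conf -- example only
    log_line_prefix = '%m [%p] '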

There are also many log entries for some very long SQL queries.  I'm sure that contributed to this problem by filling up the pipe.  I was able to extract the contents of the pipe while the system was hung, and it was more of these giant query strings.
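
The blocking behavior itself is easy to reproduce outside of PostgreSQL.  Here's a minimal C sketch (not PostgreSQL code; the 64 KiB figure is just a common default pipe buffer size) showing that once a pipe's buffer fills up and nothing is draining it, the writer blocks in write(2) indefinitely as long as the read end is still open somewhere:

    /*
     * Minimal sketch: nothing ever reads from fds[0], so writes succeed
     * only until the pipe's kernel buffer (commonly 64 KiB) fills, and
     * then the next write() hangs forever.  That's essentially the state
     * the postmaster ends up in once the syslogger has exited but the
     * read end of the logging pipe remains open.
     */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int
    main(void)
    {
        int fds[2];
        char chunk[4096];
        long total = 0;

        if (pipe(fds) != 0) {
            perror("pipe");
            return (1);
        }

        memset(chunk, 'x', sizeof (chunk));

        /* Prints a few times, then blocks inside write(). */
        for (;;) {
            fprintf(stderr, "wrote %ld bytes so far; writing %zu more...\n",
                total, sizeof (chunk));
            total += write(fds[1], chunk, sizeof (chunk));
        }
    }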

I think it's likely that this database instance was running in a container with far too small a memory cap for the number of processes configured.  (This was a zone -- a lightweight container -- allocated 2GB of memory and configured with 512MB of shared_buffers and up to 200 connections.)  I expect that the system got into a persistent state of having basically no memory available, at which point nearly any attempt to allocate memory could fail.  The syslogger itself may not have been using much memory.
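
(For reference, the relevant settings were roughly as below.  The values come from the description above; max_connections is my assumption for how the connection limit was configured:)

    # postgresql.conf (approximate)
    shared_buffers = 512MB
    max_connections = 200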

So I'm not so much worried about the memory usage itself, but it would be nice if this condition were handled better.  Handling out-of-memory is obviously hard, especially when it means being unable to fork, but even crashing would have been better for our use case.  And of course, there are other reasons that the syslogger could exit prematurely besides being low on memory, and those might be more recoverable.

Thanks,
Dave
