Re: BUG #16264: Server closed the connection unexpectedly - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #16264: Server closed the connection unexpectedly
Msg-id 2549.1582157371@sss.pgh.pa.us
In response to BUG #16264: Server closed the connection unexpectedly  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
Robin Duquette <robin.duquette@pyxidr.com> writes:
>    1. I'm running macOS 10.15.3
>    2. Everything was working fine with PostgreSQL version 12.1 and since
>    I have installed 12.2 on two machines (with identical os), many queries
>    induce server process to get terminated by signal 9 (according to log).
>    This is the case on the two machines that I'm running 12.2. Unfortunately,
>    the log doesn't say more than that (see below an extract).

Huh, interesting.  Signal 9 (SIGKILL) is an externally-imposed process
termination, rather than an internal failure.  We've seen one similar
report recently:

https://www.postgresql.org/message-id/flat/CEF2C288-13E6-4727-81D0-0775F40F313B%40arcict.com

and as mentioned there, the most likely theory is that the backend process
is consuming an unreasonable amount of memory and the SIGKILL is coming
from a system-level out-of-memory defense mechanism.  I hadn't thought
that macOS did that, but it looks like I'm finding out differently.

That does not get us a whole lot closer to identifying the cause, though.
It's certainly believable that we introduced some kind of memory leak
between 12.1 and 12.2, but that's not enough info to find it.

First things first though.  Can you watch the system with "top" or
Activity Monitor and confirm or disprove that there's a memory
consumption issue before the SIGKILL?  We ought to be sure about
that before we go spending a lot of time.
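
One way to capture the same information that "top" shows, as a log that can be pasted into a reply, is to sample the resident set size of the suspect backend while the failing query runs. The following is only a sketch: the helper names `parse_rss_kib` and `watch_rss` are made up for illustration, and it assumes a `ps` that accepts `-o rss= -p PID` and reports RSS in KiB. The backend's PID can be obtained in the same session with `SELECT pg_backend_pid();`.

```python
import subprocess
import time


def parse_rss_kib(ps_output):
    """Return the RSS in KiB from `ps -o rss=` output, or None if the process is gone."""
    text = ps_output.strip()
    return int(text) if text.isdigit() else None


def watch_rss(pid, interval_s=1.0, max_samples=600):
    """Print one RSS sample per interval until the process exits; return all samples."""
    samples = []
    for _ in range(max_samples):
        # `rss=` suppresses the header line, so stdout is just the number (or empty).
        out = subprocess.run(
            ["ps", "-o", "rss=", "-p", str(pid)],
            capture_output=True, text=True, check=False,
        ).stdout
        rss = parse_rss_kib(out)
        if rss is None:
            print(f"pid {pid} is gone")
            break
        samples.append(rss)
        print(f"pid {pid}: rss = {rss / 1024:.1f} MiB")
        time.sleep(interval_s)
    return samples
```

Calling `watch_rss(pid)` in one terminal and then running the failing query in the session that owns that PID should show whether memory use climbs steadily before the process disappears.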

If that does seem to be the case, launching the postmaster under a
restrictive ulimit (maybe "ulimit -v 1000000" or so) could be a
second step.  That ought to help reduce the problem from a SIGKILL
to a normal out-of-memory error, which not only would make things
a bit more stable for you, but it should allow the failing query
to dump a memory map to the postmaster's stderr, which would give
us a little more to go on about where the leak is.
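
For anyone who starts the server from a script rather than an interactive shell, a rough equivalent of "ulimit -v 1000000" is sketched below using Python's `resource` module. The function names, the `pg_ctl` invocation in the comment, and the data directory path are illustrative assumptions, not something taken from the report. Note that `ulimit -v` counts KiB while `RLIMIT_AS` counts bytes, and that whether the kernel actually enforces this limit is platform-dependent; that has not been verified on macOS here.

```python
import resource
import subprocess


def vm_limit_bytes(limit_kib):
    """Convert a `ulimit -v` style value (KiB) to the byte count RLIMIT_AS expects."""
    return limit_kib * 1024


def run_with_vm_limit(cmd, limit_kib=1_000_000, **run_kwargs):
    """Run cmd with its address-space soft limit lowered; children inherit the limit."""
    limit = vm_limit_bytes(limit_kib)

    def apply_limit():
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and never above the existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = limit if hard == resource.RLIM_INFINITY else min(limit, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(cmd, preexec_fn=apply_limit, **run_kwargs)


# Hypothetical usage (the data directory path is a placeholder):
# run_with_vm_limit(["pg_ctl", "-D", "/path/to/data", "-l", "logfile", "start"])
```

Because resource limits are inherited across fork and exec, the postmaster started this way, and every backend it forks, would run under the same cap.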

In the end, though, I'm afraid we might have to ask you to produce
a reproducible test case of a query that consumes excessive memory.
These things can be very hard to identify without digging into it
with a debugger.

            regards, tom lane


