Thread: BUG #16264: Server closed the connection unexpectedly
The following bug has been logged on the website: Bug reference: 16264 Logged by: Robin Duquette Email address: research@pyxidr.com PostgreSQL version: 12.2 Operating system: macOS version 10.15.3 Description: I just installed postgresql 12.2 (from 12.1) and I get many "server closed the connection unexpectedly" when executing queries in pgAdmin or using psycopg2 (in Python). I didn't get this issue with 12.1.
PG Bug reporting form <noreply@postgresql.org> writes: > I just installed postgresql 12.2 (from 12.1) and I get many "server closed > the connection unexpectedly" when executing queries in pgAdmin or using > psycopg2 (in Python). I didn't get this issue with 12.1. This might well be a bug, but there's nothing we can do about it with no details. There's some advice about submitting useful bug reports here: https://wiki.postgresql.org/wiki/Guide_to_reporting_problems regards, tom lane
Thanks Tom for your prompt response and my apologies for the lack of details (indeed, I didn't give you much to help you!). These are what I can tell you, let me know if you have any questions.
Best regards,
- I'm running macOS 10.15.3
- Everything was working fine with PostgreSQL version 12.1 and since I have installed 12.2 on two machines (with identical os), many queries induce server process to get terminated by signal 9 (according to log). This is the case on the two machines that I'm running 12.2 Unfortunately, the log doesn't say more than that (see below an extract).
- I have used the installer certified by EnterpriseDB to install PostgreSQL
- I didn't make any changes to the postgresql.conf file
- The issue shows up with pgAdmin 4 and psycopg2 (Python 3.7).
2020-02-19 18:24:57.449 EST [175] LOG: server process (PID 3032) was terminated by signal 9: Killed: 92020-02-19 18:24:57.449 EST [175] DETAIL: Failed process was running: set timezone to 'Europe/Paris';selectextract(year from fd.datetime_starting) as year,rfd.market,rfd.fundamental_subtype as prod_type,sum(fd.value) as installed_capacityfromedb.tbl_ref_fundamental_data rfd,cwe_uk.tbl_hist_datetime_fundamental_data fdwhererfd.id = fd.data_id andrfd.value_type = 'Actual' andrfd.fundamental_type = 'Installed capacity' andrfd.commodity = 'Power' andrfd.market in ('AT', 'BE', 'FR', 'DE', 'NL', 'CH', 'UK') andrfd.term = 'Annual' andfd.datetime_starting::date >= '2000-01-01' andfd.datetime_starting::date <= '2020-02-18'group byextract(year from fd.datetime_starting),rfd.market,rfd.fundamental_subtypeorder byextract(year from fd.datetime_starting),rfd.market,rfd.fundamental_subtype;2020-02-19 18:24:57.449 EST [175] LOG: terminating any other active server processes2020-02-19 18:24:57.449 EST [2986] WARNING: terminating connection because of crash of another server process2020-02-19 18:24:57.449 EST [2986] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Best regards,
ROBIN DUQUETTE |
On Wed, Feb 19, 2020 at 11:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> I just installed postgresql 12.2 (from 12.1) and I get many "server closed
> the connection unexpectedly" when executing queries in pgAdmin or using
> psycopg2 (in Python). I didn't get this issue with 12.1.
This might well be a bug, but there's nothing we can do about it with
no details. There's some advice about submitting useful bug reports
here:
https://wiki.postgresql.org/wiki/Guide_to_reporting_problems
regards, tom lane
Robin Duquette <robin.duquette@pyxidr.com> writes: > 1. I'm running macOS 10.15.3 > 2. Everything was working fine with PostgreSQL version 12.1 and since > I have installed 12.2 on two machines (with identical os), many queries > induce server process to get terminated by signal 9 (according to log). > This is the case on the two machines that I'm running 12.2 Unfortunately, > the log doesn't say more than that (see below an extract). Huh, interesting. Signal 9 (SIGKILL) is an externally-imposed process termination, rather than an internal failure. We've seen one similar report recently: https://www.postgresql.org/message-id/flat/CEF2C288-13E6-4727-81D0-0775F40F313B%40arcict.com and as mentioned there, the most likely theory is that the backend process is consuming an unreasonable amount of memory and the SIGKILL is coming from a system-level out-of-memory defense mechanism. I hadn't thought that macOS did that, but it looks like I'm finding out differently. That does not get us a whole lot closer to identifying the cause, though. It's certainly believable that we introduced some kind of memory leak between 12.1 and 12.2, but that's not enough info to find it. First things first though. Can you watch the system with "top" or Activity Monitor and confirm or disprove that there's a memory consumption issue before the SIGKILL? We ought to be sure about that before we go spending a lot of time. If that does seem to be the case, launching the postmaster under a restrictive ulimit (maybe "ulimit -v 1000000" or so) could be a second step. That ought to help reduce the problem from a SIGKILL to a normal out-of-memory error, which not only would make things a bit more stable for you, but it should allow the failing query to dump a memory map to the postmaster's stderr, which would give us a little more to go on about where the leak is. In the end, though, I'm afraid we might have to ask you to produce a reproducible test case of a query that consumes excessive memory. These things can be very hard to identify without digging into it with a debugger. regards, tom lane