Re: [ADMIN] Postgres will not allow new connections, suspendedprocess, waiting error - Mailing list pgsql-admin

From Magnus Hagander
Subject Re: [ADMIN] Postgres will not allow new connections, suspendedprocess, waiting error
Msg-id CABUevEx_3h4T+9AuzgCHm+78qudwZjYDdw2kLpJiFjWQv9EMqg@mail.gmail.com
In response to Re: [ADMIN] Postgres will not allow new connections, suspendedprocess, waiting error  (Prateek Mahajan <prateekm99@gmail.com>)
List pgsql-admin


On Sat, Jul 1, 2017 at 12:59 AM, Prateek Mahajan <prateekm99@gmail.com> wrote:
More details.

Environment
PostgreSQL 9.5, EnterpriseDB Postgres installer
Windows Server 2012R2 with Active Directory
Symantec End Point Protection

Symptom:

After about 1 week of running, one of the PostgreSQL processes (postgres.exe) showed as "suspended" in Task Manager, and I could not kill it from Task Manager (an "Access Denied" error message appeared). This "suspended" process was not the master PID recorded in the postmaster.pid file.
Existing live connections still work, but no new connections can be established. The only solution I have found is to restart the server.
Other information:

The PostgreSQL service is run under a domain account.
The maximum number of connections was never reached: the limit is set to 1000 and we had only about 10 connections.
There was plenty of available memory. The total memory is 288GB and only 8% was in use.
There was minimal hard drive activity when it occurred. The C drive, where PostgreSQL is installed, had about 86GB of free space.
There are 4 additional tablespaces that are not on the C drive but are spread over 4 hard drives. Each of the 4 hard drives has more than 500GB of space.
We have been using the same configuration files for years, and the same files are also used on a second PostgreSQL server, which does not have the issue at all.
The PostgreSQL log showed the following when this happened, and it continues to produce this warning every minute or so:

2017-06-28 19:40:21 CDT WARNING:  worker took too long to start; canceled
2017-06-28 19:41:21 CDT WARNING:  worker took too long to start; canceled
2017-06-28 19:42:21 CDT WARNING:  worker took too long to start; canceled
2017-06-28 19:43:21 CDT WARNING:  worker took too long to start; canceled


Those warnings come from autovacuum workers that are trying to start and not managing to do so in time. My guess is that's a symptom of the same basic problem, which is that your machine behaves as if it's heavily overloaded.
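
To check how regular those warnings are, a small script along these lines could help. It is only a sketch: it assumes every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp, as in the excerpt above, and the function names are made up for illustration. A steady gap of about 60 seconds would be consistent with the default autovacuum_naptime of one minute, i.e. each scheduled attempt to start a worker timing out.

```python
from datetime import datetime

# Text of the warning quoted in the log excerpt.
WARNING_TEXT = "worker took too long to start"


def warning_times(lines):
    """Return the timestamps of all lines containing the warning.

    Assumes each line begins with a 19-character timestamp such as
    '2017-06-28 19:40:21'; the time zone suffix is ignored.
    """
    times = []
    for line in lines:
        if WARNING_TEXT in line:
            times.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Return the gap in seconds between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# The four lines from the log excerpt in the original report.
sample = [
    "2017-06-28 19:40:21 CDT WARNING:  worker took too long to start; canceled",
    "2017-06-28 19:41:21 CDT WARNING:  worker took too long to start; canceled",
    "2017-06-28 19:42:21 CDT WARNING:  worker took too long to start; canceled",
    "2017-06-28 19:43:21 CDT WARNING:  worker took too long to start; canceled",
]

times = warning_times(sample)
gaps = gaps_in_seconds(times)
print(len(times), gaps)
```

On a real log file, the same functions could be fed the lines read from the file instead of the sample list.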

As a first step, I'd try removing Symantec Endpoint Protection and see if that helps. It's very common for software like that to break the database. And being unable to kill a process in Task Manager clearly indicates that the problem lies outside the control of Postgres.
