Re: connections not getting closed on a replica - Mailing list pgsql-general

From	Kevin Grittner
Subject	Re: connections not getting closed on a replica
Date	December 11, 2015 22:13:27
Msg-id	CACjxUsM5AwszA1ofaVNg_VqzTtwbOjVaNt=PuLHnj8LyDGziLw@mail.gmail.com
In response to	Re: connections not getting closed on a replica (Carlo Cabanilla <carlo@datadoghq.com>)
Responses	Re: connections not getting closed on a replica
List	pgsql-general

On Fri, Dec 11, 2015 at 3:37 PM, Carlo Cabanilla <carlo@datadoghq.com> wrote:

> 16 cores

> a default pool size of 650, steady state of 500-600 server
> connections

With so many more connections than cores to serve them, one
thing that can happen is that, just by happenstance, enough
processes become busy at one time that they start context switching
a lot before they finish, leaving spinlocks held and causing other
contention that slows all query run times.  This causes bloat to
increase because some database transactions stay active for
longer.  If the client software and/or pooler don't queue requests
at that point, even more connections are opened, because the
slowdown keeps existing connections from being freed.  That
exacerbates the contention and leads to a downward spiral, which
can become so bad that there is no recovery until either the client
software is stopped or the database is restarted.
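
To illustrate what "queue requests" means here, the following is a
minimal Python sketch, not the behavior of any particular pooler.
The names `QueueingGate`, `simulate`, and `fake_query` are
hypothetical, and the cap of 16 only mirrors the 16 cores mentioned
above; it is not a tuning recommendation.  Requests beyond the cap
wait for a free slot instead of opening another connection.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class QueueingGate:
    """Let at most max_active requests run at once; the rest wait."""

    def __init__(self, max_active):
        self._slots = threading.BoundedSemaphore(max_active)
        self._lock = threading.Lock()
        self._active = 0
        self.peak_active = 0  # highest concurrency actually observed

    def run(self, work):
        # Blocks (queues the caller) while max_active requests are in flight.
        with self._slots:
            with self._lock:
                self._active += 1
                self.peak_active = max(self.peak_active, self._active)
            try:
                return work()
            finally:
                with self._lock:
                    self._active -= 1


def simulate(n_clients=64, max_active=16):
    """Fire n_clients concurrent requests through one gate."""
    gate = QueueingGate(max_active)

    def fake_query(i):
        time.sleep(0.001)  # stand-in for real database work (assumption)
        return i

    with ThreadPoolExecutor(max_workers=n_clients) as clients:
        results = list(
            clients.map(lambda i: gate.run(lambda: fake_query(i)),
                        range(n_clients))
        )
    return results, gate.peak_active


if __name__ == "__main__":
    results, peak = simulate()
    print(f"completed {len(results)} requests, peak concurrency {peak}")
```

Every request still completes, but the number running at any moment
never exceeds the cap, so a burst of clients cannot pile more active
work onto the server.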

>> I don't suppose you have vmstat 1 output from the incident?  If
>> it happens again, try to capture that.
>
> Are you looking for a stat in particular?

Not really; what I like about `vmstat 1` is how many useful pieces
of information are on each line, allowing me to get a good overview
of what's going on.  For example, if system CPU time is high, it is
very likely to be a problem with transparent huge pages, which is
one thing that can cause these symptoms.  A "write glut" can also
do so, which can be controlled by adjusting checkpoint and
background writer settings, plus the OS vm.dirty_* settings (and
maybe keeping shared_buffers smaller than you otherwise might).
NUMA problems are not at issue, since there is only one memory
node.
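
As a rough sketch of how captured `vmstat 1` output could be scanned
for high system CPU time, the Python below reads the column names
from vmstat's own header line and flags samples whose `sy` value is
at or above a threshold.  The function names, the threshold of 30,
and the sample numbers are all made up for illustration; they are not
data from the incident.

```python
# Fabricated sample in the usual Linux vmstat layout (illustration only).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0      0 812345  10240 904321    0    0    12    40 1500 3000 20  5 74  1  0
40  2      0 801234  10240 904400    0    0     8  9000 9000 80000 30 55 10  5  0
"""


def parse_vmstat(text):
    """Parse vmstat text into a list of dicts keyed by column name."""
    rows = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name header ("r b swpd free ..."); may repeat.
            columns = fields
            continue
        if columns is None or len(fields) != len(columns):
            continue  # group header ("procs ---memory--- ...") or junk
        try:
            values = [int(f) for f in fields]
        except ValueError:
            continue  # not a numeric sample line
        rows.append(dict(zip(columns, values)))
    return rows


def high_system_cpu(rows, threshold=30):
    """Return samples whose system CPU percentage (sy) >= threshold."""
    return [row for row in rows if row["sy"] >= threshold]


if __name__ == "__main__":
    for row in high_system_cpu(parse_vmstat(SAMPLE)):
        print(f"sy={row['sy']}% cs={row['cs']} r={row['r']}")
```

Taking the column names from the header rather than fixed positions
keeps the sketch tolerant of vmstat versions that print extra
columns.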

Without more evidence of what is causing the problem, suggestions
for a solution are shots in the dark.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
