Thread: too many clients already

too many clients already

From
"Abraham, Danny"
Date:
Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200,  shared_buffers is 2GB

But .. select count(*) from pg_stat_activity   is only 66.

Thanks

Danny



Re: too many clients already

From
Rob Sargent
Date:

> On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:
>
> Hi,
>
> Will appreciate a hint here.
>
> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
> and transient difficulty to log in.
>
> Happens on all PG versions (Tested 9.5,10.4,11.5)
>
> Big installation: max_connections is 1200,  shared_buffers is 2GB
>
> But .. select count(*) from pg_stat_activity   is only 66.
>
> Thanks
>
> Danny
>
>
Lots of idle, kept-alive clients?  Do you have a connection pooler (e.g. pg-bouncer)?


Re: too many clients already

From
Adrian Klaver
Date:
On 4/2/20 8:06 AM, Abraham, Danny wrote:
> Hi,
> 
> Will appreciate a hint here.
> 
> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
> and transient difficulty to log in.
> 
> Happens on all PG versions (Tested 9.5,10.4,11.5)
> 
> Big installation: max_connections is 1200,  shared_buffers is 2GB
> 
> But .. select count(*) from pg_stat_activity   is only 66.

On the chance pg_stat_activity is lying to you what does:

ps ax | grep post

show for backends?

For example:

ps ax | grep post
  1217 ?        Ss     0:00 /usr/lib/postfix/bin//master -w
  1233 ?        S      0:00 /usr/local/pgsql12/bin/postmaster -D 
/usr/local/pgsql12/data
  1253 ?        Ss     0:00 postgres: logger
  1277 ?        Ss     0:00 postgres: checkpointer
  1278 ?        Ss     0:00 postgres: background writer
  1279 ?        Ss     0:00 postgres: walwriter
  1280 ?        Ss     0:00 postgres: autovacuum launcher
  1281 ?        Ss     0:00 postgres: stats collector
  1282 ?        Ss     0:00 postgres: logical replication launcher 
*4693 ?        Ss     0:00 postgres: aklaver task_manager [local] idle
*4907 ?        Ss     0:00 postgres: aklaver production [local] idle

> 
> Thanks
> 
> Danny
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



RE: Re: too many clients already

From
"Abraham, Danny"
Date:
No pg-bouncer or connection pooling.
ps -elf | grep postgres | grep idle | wc -l  ==>    61

and BTW: Running, say 500 one command psql in parallel will have the same affect..

-----Original Message-----
From: Rob Sargent <robjsargent@gmail.com>
Sent: Thursday, April 02, 2020 6:10 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already



> On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:
>
> Hi,
>
> Will appreciate a hint here.
>
> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
> and transient difficulty to log in.
>
> Happens on all PG versions (Tested 9.5,10.4,11.5)
>
> Big installation: max_connections is 1200,  shared_buffers is 2GB
>
> But .. select count(*) from pg_stat_activity   is only 66.
>
> Thanks
>
> Danny
>
>
Lots of idle, kept-alive clients?  Do you have a connection pooler (e.g. pg-bouncer)?



Re: too many clients already

From
Adrian Klaver
Date:
On 4/2/20 8:22 AM, Abraham, Danny wrote:
> No pg-bouncer or connection pooling.
> ps -elf | grep postgres | grep idle | wc -l  ==>    61
> 
> and BTW: Running, say 500 one command psql in parallel will have the same affect..

Hmm. In psql on the cluster in question what does below return?:

show max_connections;

> 
> -----Original Message-----
> From: Rob Sargent <robjsargent@gmail.com>
> Sent: Thursday, April 02, 2020 6:10 PM
> To: Abraham, Danny <danny_abraham@bmc.com>
> Cc: pgsql-general@postgresql.org
> Subject: [EXTERNAL] Re: too many clients already
> 
> 
> 
>> On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:
>>
>> Hi,
>>
>> Will appreciate a hint here.
>>
>> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
>> and transient difficulty to log in.
>>
>> Happens on all PG versions (Tested 9.5,10.4,11.5)
>>
>> Big installation: max_connections is 1200,  shared_buffers is 2GB
>>
>> But .. select count(*) from pg_stat_activity   is only 66.
>>
>> Thanks
>>
>> Danny
>>
>>
> Lots of idle, kept-alive clients?  Do you have a connection pooler (e.g. pg-bouncer)?
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



RE: Re: too many clients already

From
"Abraham, Danny"
Date:
Big installation: max_connections is 1200,  shared_buffers is 2GB

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com> 
Sent: Thursday, April 02, 2020 6:30 PM
To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On 4/2/20 8:22 AM, Abraham, Danny wrote:
> No pg-bouncer or connection pooling.
> ps -elf | grep postgres | grep idle | wc -l  ==>    61
> 
> and BTW: Running, say 500 one command psql in parallel will have the same affect..

Hmm. In psql on the cluster in question what does below return?:

show max_connections;

> 
> -----Original Message-----
> From: Rob Sargent <robjsargent@gmail.com>
> Sent: Thursday, April 02, 2020 6:10 PM
> To: Abraham, Danny <danny_abraham@bmc.com>
> Cc: pgsql-general@postgresql.org
> Subject: [EXTERNAL] Re: too many clients already
> 
> 
> 
>> On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:
>>
>> Hi,
>>
>> Will appreciate a hint here.
>>
>> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
>> and transient difficulty to log in.
>>
>> Happens on all PG versions (Tested 9.5,10.4,11.5)
>>
>> Big installation: max_connections is 1200,  shared_buffers is 2GB
>>
>> But .. select count(*) from pg_stat_activity   is only 66.
>>
>> Thanks
>>
>> Danny
>>
>>
> Lots of idle, kept-alive clients?  Do you have a connection pooler (e.g. pg-bouncer)?
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: too many clients already

From
Adrian Klaver
Date:
On 4/2/20 8:35 AM, Abraham, Danny wrote:
> Big installation: max_connections is 1200,  shared_buffers is 2GB

Have you confirmed that the above is actually in effect by doing?:

show max_connections;

> 
> -----Original Message-----
> From: Adrian Klaver <adrian.klaver@aklaver.com>
> Sent: Thursday, April 02, 2020 6:30 PM
> To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
> Subject: [EXTERNAL] Re: too many clients already
> 
> On 4/2/20 8:22 AM, Abraham, Danny wrote:
>> No pg-bouncer or connection pooling.
>> ps -elf | grep postgres | grep idle | wc -l  ==>    61
>>
>> and BTW: Running, say 500 one command psql in parallel will have the same affect..
> 
> Hmm. In psql on the cluster in question what does below return?:
> 
> show max_connections;
> 
>>
>> -----Original Message-----
>> From: Rob Sargent <robjsargent@gmail.com>
>> Sent: Thursday, April 02, 2020 6:10 PM
>> To: Abraham, Danny <danny_abraham@bmc.com>
>> Cc: pgsql-general@postgresql.org
>> Subject: [EXTERNAL] Re: too many clients already
>>
>>
>>
>>> On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:
>>>
>>> Hi,
>>>
>>> Will appreciate a hint here.
>>>
>>> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
>>> and transient difficulty to log in.
>>>
>>> Happens on all PG versions (Tested 9.5,10.4,11.5)
>>>
>>> Big installation: max_connections is 1200,  shared_buffers is 2GB
>>>
>>> But .. select count(*) from pg_stat_activity   is only 66.
>>>
>>> Thanks
>>>
>>> Danny
>>>
>>>
>> Lots of idle, kept-alive clients?  Do you have a connection pooler (e.g. pg-bouncer)?
>>
>>
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



RE: Re: too many clients already

From
"Abraham, Danny"
Date:
va-tlv-ctm-qa22.isr.bmc.com% sql
psql: FATAL:  sorry, too many clients already
va-tlv-ctm-qa22.isr.bmc.com% sql
psql (11.5)
Type "help" for help.

ctrlmdb=> show max_connections;
 max_connections
-----------------
 1200
(1 row)

ctrlmdb=> show shared_buffers;
 shared_buffers
----------------
 2000MB
(1 row)



-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com> 
Sent: Thursday, April 02, 2020 6:37 PM
To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On 4/2/20 8:35 AM, Abraham, Danny wrote:
> Big installation: max_connections is 1200,  shared_buffers is 2GB

Have you confirmed that the above is actually in effect by doing?:

show max_connections;

> 
> -----Original Message-----
> From: Adrian Klaver <adrian.klaver@aklaver.com>
> Sent: Thursday, April 02, 2020 6:30 PM
> To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
> Subject: [EXTERNAL] Re: too many clients already
> 
> On 4/2/20 8:22 AM, Abraham, Danny wrote:
>> No pg-bouncer or connection pooling.
>> ps -elf | grep postgres | grep idle | wc -l  ==>    61
>>
>> and BTW: Running, say 500 one command psql in parallel will have the same affect..
> 
> Hmm. In psql on the cluster in question what does below return?:
> 
> show max_connections;
> 
>>
>> -----Original Message-----
>> From: Rob Sargent <robjsargent@gmail.com>
>> Sent: Thursday, April 02, 2020 6:10 PM
>> To: Abraham, Danny <danny_abraham@bmc.com>
>> Cc: pgsql-general@postgresql.org
>> Subject: [EXTERNAL] Re: too many clients already
>>
>>
>>
>>> On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:
>>>
>>> Hi,
>>>
>>> Will appreciate a hint here.
>>>
>>> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
>>> and transient difficulty to log in.
>>>
>>> Happens on all PG versions (Tested 9.5,10.4,11.5)
>>>
>>> Big installation: max_connections is 1200,  shared_buffers is 2GB
>>>
>>> But .. select count(*) from pg_stat_activity   is only 66.
>>>
>>> Thanks
>>>
>>> Danny
>>>
>>>
>> Lots of idle, kept-alive clients?  Do you have a connection pooler (e.g. pg-bouncer)?
>>
>>
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: too many clients already

From
Tom Lane
Date:
"Abraham, Danny" <danny_abraham@bmc.com> writes:
> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
> and transient difficulty to log in.
> Happens on all PG versions (Tested 9.5,10.4,11.5)
> Big installation: max_connections is 1200,  shared_buffers is 2GB
> But .. select count(*) from pg_stat_activity   is only 66.

I'd be suspicious that there are a lot of clients stuck in connection
startup (likely the authentication phase); those connections aren't going
to show in pg_stat_activity until they finish connecting.  The "ps"
suggestion Adrian gave you would not show them either, because they're
not going to say "idle".

Enabling log_connections and watching the postmaster log would help
prove or disprove that theory.

            regards, tom lane



RE: Re: too many clients already

From
"Abraham, Danny"
Date:
Well, I guess the questions is - how do I optimize PG for a stream of very short life checks...
See below:

2020-04-02 11:05:37.010 CDTLOG:  connection received: host=10.64.72.157 port=45799
2020-04-02 11:05:37.014 CDTLOG:  connection received: host=10.64.72.157 port=45814
2020-04-02 11:05:37.014 CDTLOG:  connection received: host=10.64.72.157 port=45813
2020-04-02 11:05:37.018 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.015 CDTLOG:  connection received: host=10.64.72.157 port=45815
2020-04-02 11:05:37.015 CDTLOG:  connection received: host=10.64.72.157 port=45817
2020-04-02 11:05:37.015 CDTLOG:  connection received: host=10.64.72.157 port=45809
2020-04-02 11:05:37.015 CDTLOG:  connection received: host=10.64.72.157 port=45818
2020-04-02 11:05:37.016 CDTLOG:  connection received: host=10.64.72.157 port=45819
2020-04-02 11:05:37.021 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.022 CDTFATAL:  sorry, too many clients already
2020-04-02 11:05:37.022 CDTFATAL:  sorry, too many clients already
-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Thursday, April 02, 2020 6:52 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

"Abraham, Danny" <danny_abraham@bmc.com> writes:
> Running on a big and stressed AIX platform and receiving lots of "CDTFATAL:  sorry, too many clients already"
> and transient difficulty to log in.
> Happens on all PG versions (Tested 9.5,10.4,11.5) Big installation:
> max_connections is 1200,  shared_buffers is 2GB
> But .. select count(*) from pg_stat_activity   is only 66.

I'd be suspicious that there are a lot of clients stuck in connection startup (likely the authentication phase); those
connectionsaren't going to show in pg_stat_activity until they finish connecting.  The "ps" 
suggestion Adrian gave you would not show them either, because they're not going to say "idle".

Enabling log_connections and watching the postmaster log would help prove or disprove that theory.

            regards, tom lane



Re: too many clients already

From
Tom Lane
Date:
"Abraham, Danny" <danny_abraham@bmc.com> writes:
> Well, I guess the questions is - how do I optimize PG for a stream of very short life checks...

You should be using a connection pooler for a load like that.
PG backends are fairly heavyweight things --- you don't want
to fire one up for just a single query, at least not when
there are many such queries per second.

I think pgbouncer and pgpool are the most widely used options,
but this is a bit outside my expertise.

            regards, tom lane



RE: Re: too many clients already

From
"Abraham, Danny"
Date:
Agree.

I suspect that this is a mal configured pgpool - the developer thinks that the pool is reusing connections,
While it is, in fact, reopening them.

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Thursday, April 02, 2020 7:40 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

"Abraham, Danny" <danny_abraham@bmc.com> writes:
> Well, I guess the questions is - how do I optimize PG for a stream of very short life checks...

You should be using a connection pooler for a load like that.
PG backends are fairly heavyweight things --- you don't want to fire one up for just a single query, at least not when
thereare many such queries per second. 

I think pgbouncer and pgpool are the most widely used options, but this is a bit outside my expertise.

            regards, tom lane