Thread: Determining server load

Determining server load

From
Israel Brewster
Date:
I have a Postgresql (9.4.6) cluster that hosts several databases, used by about half-a-dozen different in-house apps. I have two servers set up as master-slave with streaming replication. Lately I've been running into an issue where one of the apps periodically can't connect to the db. The problem is always extremely short lived (less than a minute), such that by the time I can look into it, there is no issue. My *suspicion* is that I am briefly hitting the max connection limit of my server (currently set at 100). If so, I can certainly *fix* the issue easily by increasing the connection limit, but I have two questions about this:

1) Is there some way I can track concurrent connections to see if my theory is correct? I know I can do a count(*) on pg_stat_activity to get the current number of connections at any point (currently at 45 BTW), but aside from repeatedly querying this at short intervals, which I am afraid would put undue load on the server by the time it is frequent enough to be of use, I don't know how to track concurrent connections.
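For illustration, a minimal sampling sketch (the DSN and one-second interval are placeholders; a single count(*) against pg_stat_activity is cheap, so even frequent polling should add negligible load, though very short spikes can still fall between samples):

import time
import psycopg2

# Placeholder DSN; point it at the server being monitored.
conn = psycopg2.connect("dbname=postgres user=postgres host=localhost")
conn.autocommit = True

peak = 0
while True:
    with conn.cursor() as cur:
        cur.execute("SELECT count(*), current_setting('max_connections')::int"
                    " FROM pg_stat_activity")
        current, limit = cur.fetchone()
    if current > peak:
        peak = current
        print(f"new peak: {peak} of {limit} connections")
    time.sleep(1)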

I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.

2) Is increasing the connection limit even the "proper" fix for this, or am I at a load point where I need to start looking at tools like pgpool or something to distribute some of the load to my hot standby server? I do realize you may not be able to answer that directly, since I haven't given enough information about my server/hardware/load, etc, but answers that tell me how to better look at the load over time and figure out if I am overloaded are appreciated.

For reference, the server is running on the following hardware:

2x 8-core Xeon E5-2630 v3 2.4 GHz processor (so plenty of horsepower there)
32 GB Ram total, currently with 533144k showing as "free" and 370464k of swap used 
371 GB SSD RAID 10 (currently only using 40GB of space)
Dual Gigabit ethernet

Thanks for any advice that can be provided!
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------




Re: Determining server load

From
John R Pierce
Date:
On 9/27/2016 9:54 AM, Israel Brewster wrote:
>
> I did look at pgbadger, which tells me I have gotten as high as 62
> connections/second, but given that most of those connections are
> probably very short lived that doesn't really tell me anything about
> concurrent connections.

Each connection requires a process fork of the database server, which is
very expensive.  You might consider using a connection pool such as
pgbouncer, to maintain a fixed (dynamic) number of real database
connections, and have your apps connect/disconnect to this pool.
Obviously, you need a pool for each database, and your apps need to be
'stateless' and not make or rely on any session changes to the
connection so they don't interfere with each other.   Doing this
correctly can make a huge performance improvement on the sort of apps
that do (connect, transaction, disconnect) a lot.
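
On the application side the only change is where you point the connection - a sketch, assuming pgbouncer listens on its default port 6432 in front of postgres on 5432:

import psycopg2

# Instead of opening a backend directly on 5432, connect to pgbouncer
# (host/port/db/user here are assumptions; match your pgbouncer.ini).
conn = psycopg2.connect(host="127.0.0.1", port=6432,
                        dbname="mydb", user="appuser")
with conn.cursor() as cur:
    cur.execute("SELECT 1")
conn.close()  # hands the real server connection back to the pool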



--
john r pierce, recycling bits in santa cruz



Re: Determining server load

From
Adrian Klaver
Date:
On 09/27/2016 09:54 AM, Israel Brewster wrote:
> I have a Postgresql (9.4.6) cluster that hosts several databases, used
> by about half-a-dozen different in-house apps. I have two servers set up
> as master-slave with streaming replication. Lately I've been running
> into an issue where one of the apps periodically can't connect to the
> db. The problem is always extremely short lived (less than a minute),
> such that by the time I can look into it, there is no issue. My
> *suspicion* is that I am briefly hitting the max connection limit of my
> server (currently set at 100). If so, I can certainly *fix* the issue
> easily by increasing the connection limit, but I have two questions
> about this:

What does your Postgres log show around this time?

>
> 1) Is there some way I can track concurrent connections to see if my
> theory is correct? I know I can do a count(*) on pg_stat_activity to get
> the current number of connections at any point (currently at 45 BTW),
> but aside from repeatedly querying this at short intervals, which I am
> afraid would put undue load on the server by the time it is frequent
> enough to be of use, I don't know how to track concurrent connections.
>
> I did look at pgbadger, which tells me I have gotten as high as 62
> connections/second, but given that most of those connections are
> probably very short lived that doesn't really tell me anything about
> concurrent connections.
>
> 2) Is increasing the connection limit even the "proper" fix for this, or
> am I at a load point where I need to start looking at tools like pgpool
> or something to distribute some of the load to my hot standby server? I
> do realize you may not be able to answer that directly, since I haven't
> given enough information about my server/hardware/load, etc, but answers
> that tell me how to better look at the load over time and figure out if
> I am overloaded are appreciated.
>
> For reference, the server is running on the following hardware:
>
> 2x 8-core Xeon E5-2630 v3 2.4 GHz processor (so plenty of horsepower there)
> 32 GB Ram total, currently with 533144k showing as "free" and 370464k of
> swap used
> 371 GB SSD RAID 10 (currently only using 40GB of space)
> Dual Gigabit ethernet
>
> Thanks for any advice that can be provided!
> -----------------------------------------------
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293
> -----------------------------------------------
>
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Determining server load

From
Israel Brewster
Date:
On Sep 27, 2016, at 10:07 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 09/27/2016 09:54 AM, Israel Brewster wrote:
>> I have a Postgresql (9.4.6) cluster that hosts several databases, used
>> by about half-a-dozen different in-house apps. I have two servers set up
>> as master-slave with streaming replication. Lately I've been running
>> into an issue where one of the apps periodically can't connect to the
>> db. The problem is always extremely short lived (less than a minute),
>> such that by the time I can look into it, there is no issue. My
>> *suspicion* is that I am briefly hitting the max connection limit of my
>> server (currently set at 100). If so, I can certainly *fix* the issue
>> easily by increasing the connection limit, but I have two questions
>> about this:
>
> What does your Postgres log show around this time?

So in looking further, I realized the actual error I was getting was "no route to host", which is obviously a networking issue and not a postgres issue - could not connect was only the end result. The logs then, of course, show normal operation. That said, now that I am thinking about it, I'm still curious as to how I can track concurrent connections, with the revised goal of simply seeing how heavily loaded my server really is, and when tools such as pgpool or the pgbouncer that another user mentioned start making sense for the number of connections I am dealing with. Thanks.
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

>
>>
>> 1) Is there some way I can track concurrent connections to see if my
>> theory is correct? I know I can do a count(*) on pg_stat_activity to get
>> the current number of connections at any point (currently at 45 BTW),
>> but aside from repeatedly querying this at short intervals, which I am
>> afraid would put undue load on the server by the time it is frequent
>> enough to be of use, I don't know how to track concurrent connections.
>>
>> I did look at pgbadger, which tells me I have gotten as high as 62
>> connections/second, but given that most of those connections are
>> probably very short lived that doesn't really tell me anything about
>> concurrent connections.
>>
>> 2) Is increasing the connection limit even the "proper" fix for this, or
>> am I at a load point where I need to start looking at tools like pgpool
>> or something to distribute some of the load to my hot standby server? I
>> do realize you may not be able to answer that directly, since I haven't
>> given enough information about my server/hardware/load, etc, but answers
>> that tell me how to better look at the load over time and figure out if
>> I am overloaded are appreciated.
>>
>> For reference, the server is running on the following hardware:
>>
>> 2x 8-core Xeon E5-2630 v3 2.4 GHz processor (so plenty of horsepower there)
>> 32 GB Ram total, currently with 533144k showing as "free" and 370464k of
>> swap used
>> 371 GB SSD RAID 10 (currently only using 40GB of space)
>> Dual Gigabit ethernet
>>
>> Thanks for any advice that can be provided!
>> -----------------------------------------------
>> Israel Brewster
>> Systems Analyst II
>> Ravn Alaska
>> 5245 Airport Industrial Rd
>> Fairbanks, AK 99709
>> (907) 450-7293
>> -----------------------------------------------
>>
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
>



Re: Determining server load

From
Melvin Davidson
Date:


On Tue, Sep 27, 2016 at 2:25 PM, Israel Brewster <israel@ravnalaska.net> wrote:
On Sep 27, 2016, at 10:07 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 09/27/2016 09:54 AM, Israel Brewster wrote:
>> I have a Postgresql (9.4.6) cluster that hosts several databases, used
>> by about half-a-dozen different in-house apps. I have two servers set up
>> as master-slave with streaming replication. Lately I've been running
>> into an issue where one of the apps periodically can't connect to the
>> db. The problem is always extremely short lived (less than a minute),
>> such that by the time I can look into it, there is no issue. My
>> *suspicion* is that I am briefly hitting the max connection limit of my
>> server (currently set at 100). If so, I can certainly *fix* the issue
>> easily by increasing the connection limit, but I have two questions
>> about this:
>
> What does your Postgres log show around this time?

So in looking further, I realized the actual error I was getting was "no route to host", which is obviously a networking issue and not a postgres issue - could not connect was only the end result. The logs then, of course, show normal operation. That said, now that I am thinking about it, I'm still curious as to how I can track concurrent connections, with the revised goal of simply seeing how heavily loaded my server really is, and when tools such as pgpool or the pgbouncer that another user mentioned start making sense for the number of connections I am dealing with. Thanks.
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

>
>>
>> 1) Is there some way I can track concurrent connections to see if my
>> theory is correct? I know I can do a count(*) on pg_stat_activity to get
>> the current number of connections at any point (currently at 45 BTW),
>> but aside from repeatedly querying this at short intervals, which I am
>> afraid would put undue load on the server by the time it is frequent
>> enough to be of use, I don't know how to track concurrent connections.
>>
>> I did look at pgbadger, which tells me I have gotten as high as 62
>> connections/second, but given that most of those connections are
>> probably very short lived that doesn't really tell me anything about
>> concurrent connections.
>>
>> 2) Is increasing the connection limit even the "proper" fix for this, or
>> am I at a load point where I need to start looking at tools like pgpool
>> or something to distribute some of the load to my hot standby server? I
>> do realize you may not be able to answer that directly, since I haven't
>> given enough information about my server/hardware/load, etc, but answers
>> that tell me how to better look at the load over time and figure out if
>> I am overloaded are appreciated.
>>
>> For reference, the server is running on the following hardware:
>>
>> 2x 8-core Xeon E5-2630 v3 2.4 GHz processor (so plenty of horsepower there)
>> 32 GB Ram total, currently with 533144k showing as "free" and 370464k of
>> swap used
>> 371 GB SSD RAID 10 (currently only using 40GB of space)
>> Dual Gigabit ethernet
>>
>> Thanks for any advice that can be provided!
>> -----------------------------------------------
>> Israel Brewster
>> Systems Analyst II
>> Ravn Alaska
>> 5245 Airport Industrial Rd
>> Fairbanks, AK 99709
>> (907) 450-7293
>> -----------------------------------------------
>>
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
>




>I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you to extract connects/disconnects over a time range. That should allow you to determine concurrent connections during that time.


Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Determining server load

From
Israel Brewster
Date:
On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:
>
> On 9/27/2016 9:54 AM, Israel Brewster wrote:
>>
>> I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.
>
> Each connection requires a process fork of the database server, which is very expensive. You might consider using a connection pool such as pgbouncer, to maintain a fixed (dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make a huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.

Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't matter how long it takes to connect/write the data (within reason).

That said, it seems highly probable, if not a given, that there comes a point where the overhead of handling all those connections starts slowing things down, and not just for the new connection being made. How to figure out where that point is for my system, and how close to it I am at the moment, is a large part of what I am wondering.

Note also that I did realize I was completely wrong about the initial issue - it turned out it was a network issue, not a postgresql one. Still, I think my specific questions still apply, if only in an academic sense now :-)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------


>
>
>
> --
> john r pierce, recycling bits in santa cruz
>
>
>



Re: Determining server load

From
Israel Brewster
Date:
>I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you to extract connects/disconnects over a time range. That should allow you to determine concurrent connections during that time.

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give the number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool already existed - pgbadger, unfortunately, only seems to track connections per second and not concurrent connections - or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)
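
For what it's worth, the counting core of such a parser is small. A sketch, assuming the default text-format log lines that log_connections/log_disconnections produce:

import re
import sys

# Tally opens/closes in one log file (e.g. postgresql-Tue.log) and
# report the highest concurrent count seen. Sessions already open when
# the file started are not counted, so treat the result as a floor.
current = peak = 0
with open(sys.argv[1]) as log:
    for line in log:
        if "connection authorized:" in line:
            current += 1
            peak = max(peak, current)
        elif re.search(r"\bdisconnection:", line):
            current = max(0, current - 1)
print(f"peak concurrent connections seen: {peak}")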

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------



Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: Determining server load

From
Adrian Klaver
Date:
On 09/27/2016 11:40 AM, Israel Brewster wrote:
> On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:
>>
>> On 9/27/2016 9:54 AM, Israel Brewster wrote:
>>>
>>> I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.
>>
>> Each connection requires a process fork of the database server, which is very expensive. You might consider using a connection pool such as pgbouncer, to maintain a fixed (dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make a huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.
>
> Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't matter how long it takes to connect/write the data (within reason).

http://initd.org/psycopg/docs/pool.html

"Note

This pool class is mostly designed to interact with Zope and probably
not useful in generic applications. "

Are you using Zope?

>
> That said, it seems highly probable, if not a given, that there comes a point where the overhead of handling all those connections starts slowing things down, and not just for the new connection being made. How to figure out where that point is for my system, and how close to it I am at the moment, is a large part of what I am wondering.
>
> Note also that I did realize I was completely wrong about the initial issue - it turned out it was a network issue, not a postgresql one. Still, I think my specific questions still apply, if only in an academic sense now :-)
>
> -----------------------------------------------
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293
> -----------------------------------------------
>
>
>>
>>
>>
>> --
>> john r pierce, recycling bits in santa cruz
>>
>>
>>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Determining server load

From
Jonathan Vanasco
Date:

On Sep 27, 2016, at 2:46 PM, Israel Brewster wrote:

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give the number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool already existed - pgbadger, unfortunately, only seems to track connections per second and not concurrent connections - or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)

There are a lot of postgres configs and server specific tools... but on the application side and for general debugging, have you looked at statsd ?  https://github.com/etsy/statsd

it's a lightweight node.js app that runs on your server and listens for UDP signals, which your apps can emit for counting or timing.  We have a ton of Python apps logging to it, including every postgres connection open/close and error.  The overhead of clients and server is negligible.  When combined with the graphite app for browsing data via charts, it becomes really useful at detecting issues with load or errors stemming from a deployment  -- you just look for spikes and cliffs.  We even use it to log the volume of INSERTS vs SELECTS vs UPDATES being sent to postgres.

The more services/apps you run, the more useful it gets, as you can figure out which apps/deployments are screwing up postgres and the exact moment things went wrong.
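
For a sense of the client side, a sketch using the common Python statsd package (server address, metric names, and the table are made up):

import statsd
import psycopg2

# Fire-and-forget UDP; if the statsd daemon is down, nothing blocks.
stats = statsd.StatsClient("localhost", 8125)

conn = psycopg2.connect("dbname=mydb user=appuser")  # placeholder DSN
stats.incr("postgres.connection.open")
try:
    with stats.timer("postgres.query.report"):  # optional timing metric
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM some_table")  # hypothetical
            cur.fetchone()
finally:
    conn.close()
    stats.incr("postgres.connection.close")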

Re: Determining server load

From
Israel Brewster
Date:
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------




> On Sep 27, 2016, at 10:48 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 09/27/2016 11:40 AM, Israel Brewster wrote:
>> On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:
>>>
>>> On 9/27/2016 9:54 AM, Israel Brewster wrote:
>>>>
>>>> I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.
>>>
>>> Each connection requires a process fork of the database server, which is very expensive. You might consider using a connection pool such as pgbouncer, to maintain a fixed (dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make a huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.
>>
>> Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't matter how long it takes to connect/write the data (within reason).
>
> http://initd.org/psycopg/docs/pool.html
>
> "Note
>
> This pool class is mostly designed to interact with Zope and probably not useful in generic applications. "
>
> Are you using Zope?

You'll notice that note only applies to the PersistentConnectionPool, not the ThreadedConnectionPool (which has a note saying that it can be safely used in multi-threaded applications), or the SimpleConnectionPool (which is useful only for single-threaded applications). Since I'm not using Zope, and do have multi-threaded applications, I'm naturally using the ThreadedConnectionPool :-)
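
For reference, the basic pattern (DSN and pool sizes are placeholders; getconn/putconn is the psycopg2 pool API):

from psycopg2.pool import ThreadedConnectionPool

# Placeholder sizes and DSN; tune minconn/maxconn for your workload.
pool = ThreadedConnectionPool(
    minconn=2,    # connections opened up front
    maxconn=10,   # hard cap for this pool
    dsn="dbname=mydb user=appuser host=localhost",
)

def run_query(sql, params=None):
    conn = pool.getconn()        # check a connection out
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()
    finally:
        pool.putconn(conn)       # return it instead of closing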

>
>>
>> That said, it seems highly probable, if not a given, that there comes a point where the overhead of handling all those connections starts slowing things down, and not just for the new connection being made. How to figure out where that point is for my system, and how close to it I am at the moment, is a large part of what I am wondering.
>>
>> Note also that I did realize I was completely wrong about the initial issue - it turned out it was a network issue, not a postgresql one. Still, I think my specific questions still apply, if only in an academic sense now :-)
>>
>> -----------------------------------------------
>> Israel Brewster
>> Systems Analyst II
>> Ravn Alaska
>> 5245 Airport Industrial Rd
>> Fairbanks, AK 99709
>> (907) 450-7293
>> -----------------------------------------------
>>
>>
>>>
>>>
>>>
>>> --
>>> john r pierce, recycling bits in santa cruz
>>>
>>>
>>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
>



Re: Determining server load

From
Israel Brewster
Date:
On Sep 27, 2016, at 10:55 AM, Jonathan Vanasco <postgres@2xlp.com> wrote:


On Sep 27, 2016, at 2:46 PM, Israel Brewster wrote:

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give the number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool already existed - pgbadger, unfortunately, only seems to track connections per second and not concurrent connections - or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)

There are a lot of postgres configs and server specific tools... but on the application side and for general debugging, have you looked at statsd ?  https://github.com/etsy/statsd

it's a lightweight node.js app that runs on your server and listens for UDP signals, which your apps can emit for counting or timing.  We have a ton of Python apps logging to it, including every postgres connection open/close and error.  The overhead of clients and server is negligible.  When combined with the graphite app for browsing data via charts, it becomes really useful at detecting issues with load or errors stemming from a deployment  -- you just look for spikes and cliffs.  We even use it to log the volume of INSERTS vs SELECTS vs UPDATES being sent to postgres.

The more services/apps you run, the more useful it gets, as you can figure out which apps/deployments are screwing up postgres and the exact moment things went wrong.


That sounds quite promising. I'll look into it. Thanks!


-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

Re: Determining server load

From
Melvin Davidson
Date:


On Tue, Sep 27, 2016 at 2:46 PM, Israel Brewster <israel@ravnalaska.net> wrote:
>I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you to extract connects/disconnects over a time range. That should allow you to determine concurrent connections during that time.

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give the number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool already existed - pgbadger, unfortunately, only seems to track connections per second and not concurrent connections - or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
-----------------------------------------------



Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.



Does this help?

--Total concurrent connections
SELECT COUNT(*)
  FROM pg_stat_activity;

--concurrent connections by user
SELECT usename,
       count(*)
 FROM pg_stat_activity
GROUP BY 1
ORDER BY 1;

--concurrent connections by database and user
SELECT datname,
       usename,
       count(*)
 FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;

-- database connections by user
SELECT usename,
       datname,
       count(*)
 FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Determining server load

From
Adrian Klaver
Date:
On 09/27/2016 12:01 PM, Israel Brewster wrote:
>
> -----------------------------------------------
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293
> -----------------------------------------------
>
>
>
>
>
>
>> On Sep 27, 2016, at 10:48 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> On 09/27/2016 11:40 AM, Israel Brewster wrote:
>>> On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:
>>>>
>>>> On 9/27/2016 9:54 AM, Israel Brewster wrote:
>>>>>
>>>>> I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.
>>>>
>>>> Each connection requires a process fork of the database server, which is very expensive. You might consider using a connection pool such as pgbouncer, to maintain a fixed (dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make a huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.
>>>
>>> Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't matter how long it takes to connect/write the data (within reason).
>>
>> http://initd.org/psycopg/docs/pool.html
>>
>> "Note
>>
>> This pool class is mostly designed to interact with Zope and probably not useful in generic applications. "
>>
>> Are you using Zope?
>
> You'll notice that note only applies to the PersistentConnectionPool, not the ThreadedConnectionPool (which has a note saying that it can be safely used in multi-threaded applications), or the SimpleConnectionPool (which is useful only for single-threaded applications). Since I'm not using Zope, and do have multi-threaded applications, I'm naturally using the ThreadedConnectionPool :-)

Oops, did not catch that.


>
>>
>>>
>>> That said, it seems highly probable, if not a given, that there comes a point where the overhead of handling all those connections starts slowing things down, and not just for the new connection being made. How to figure out where that point is for my system, and how close to it I am at the moment, is a large part of what I am wondering.
>>>
>>> Note also that I did realize I was completely wrong about the initial issue - it turned out it was a network issue, not a postgresql one. Still, I think my specific questions still apply, if only in an academic sense now :-)
>>>
>>> -----------------------------------------------
>>> Israel Brewster
>>> Systems Analyst II
>>> Ravn Alaska
>>> 5245 Airport Industrial Rd
>>> Fairbanks, AK 99709
>>> (907) 450-7293
>>> -----------------------------------------------
>>>
>>>
>>>>
>>>>
>>>>
>>>> --
>>>> john r pierce, recycling bits in santa cruz
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Determining server load

From
Israel Brewster
Date:
On Sep 27, 2016, at 10:55 AM, Melvin Davidson <melvin6925@gmail.com> wrote:



On Tue, Sep 27, 2016 at 2:46 PM, Israel Brewster <israel@ravnalaska.net> wrote:
>I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you to extract connects/disconnects over a time range. That should allow you to determine concurrent connections during that time.

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give the number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool already existed - pgbadger, unfortunately, only seems to track connections per second and not concurrent connections - or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
-----------------------------------------------



Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.



Does this help?

--Total concurrent connections
SELECT COUNT(*)
  FROM pg_stat_activity;

--concurrent connections by user
SELECT usename,
       count(*)
 FROM pg_stat_activity
GROUP BY 1
ORDER BY 1;

--concurrent connections by database and user
SELECT datname,
       usename,
       count(*)
 FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;

-- database connections by user
SELECT usename,
       datname,
       count(*)
 FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


That helps for one-time stat collection, but as I mentioned in my original message, since connections may not last long, I could be getting close to, or even hitting, my connection limit while still getting results back that show plenty of connections remaining, depending on how often I checked.

I guess what would be ideal in my mind is that whenever Postgresql logged an opened/closed connection, it also logged the *total* number of open connections at that time. I don't think that's possible, however :-)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

Re: Determining server load

From
John R Pierce
Date:
On 9/27/2016 12:06 PM, Israel Brewster wrote:
> That helps for one-time stat collection, but as I mentioned in my
> original message, since connections may not last long, I could be
> getting close to, or even hitting, my connection limit while still
> getting results back that show plenty of connections remaining,
> depending on how often I checked.
>
> I guess what would be ideal in my mind is that whenever Postgresql
> logged an opened/closed connection, it also logged the *total* number
> of open connections at that time. I don't think that's possible,
> however :-)

If you stick pgbouncer in front of postgres (with a pool for each
user@database), I believe you CAN track the max connections via
pgbouncer's pool stats.
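
A rough sketch of polling those stats via the admin console (assumptions: pgbouncer on its usual port 6432, a user listed in admin_users; exact columns vary by pgbouncer version):

import psycopg2

# Connect to pgbouncer's special "pgbouncer" admin database.
# Host/port/user are assumptions; match your pgbouncer.ini.
conn = psycopg2.connect(host="127.0.0.1", port=6432,
                        dbname="pgbouncer", user="pgbouncer")
conn.autocommit = True  # the admin console does not support transactions

with conn.cursor() as cur:
    cur.execute("SHOW POOLS")
    cols = [d.name for d in cur.description]
    for row in cur.fetchall():
        stats = dict(zip(cols, row))
        # cl_active = clients currently being served; sv_active/sv_idle
        # = real backend connections in use / idle.
        print(stats["database"], stats["user"],
              stats["cl_active"], stats["sv_active"])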


--
john r pierce, recycling bits in santa cruz



Re: Determining server load

From
Israel Brewster
Date:
On Sep 27, 2016, at 11:16 AM, John R Pierce <pierce@hogranch.com> wrote:
>
> On 9/27/2016 12:06 PM, Israel Brewster wrote:
>> That helps for one-time stat collection, but as I mentioned in my original message, since connections may not last long, I could be getting close to, or even hitting, my connection limit while still getting results back that show plenty of connections remaining, depending on how often I checked.
>>
>> I guess what would be ideal in my mind is that whenever Postgresql logged an opened/closed connection, it also logged the *total* number of open connections at that time. I don't think that's possible, however :-)
>
> If you stick pgbouncer in front of postgres (with a pool for each user@database), I believe you CAN track the max connections via pgbouncer's pool stats.

Ahh! If so, that alone would be reason enough for using pgbouncer. Thanks!

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------


>
>
> --
> john r pierce, recycling bits in santa cruz
>
>
>