Thread: Big number of connections
Hello!

We are going to build a system based on a PostgreSQL database for a large number of individual users (a few thousand). Each user will have their own account, and for authorization we will use Kerberos (MIT or Windows). Most users will have low activity, but for various reasons their connections should stay open all the time.

I'd like to know what potential problems and limitations we can expect with such a deployment. During preliminary testing we found that each connection needs ~1MB of RAM. Is there any way to decrease this? Is there any risk that such a number of users will degrade performance?

I'll be happy to hear any remarks and suggestions related to the design, administration and handling of such an installation.

best regards
Jarek
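Since the post mentions Kerberos for per-user accounts, here is a minimal pg_hba.conf sketch of GSSAPI authentication - the network range and realm are hypothetical placeholders:

    # TYPE  DATABASE  USER  ADDRESS      METHOD
    host    all       all   10.0.0.0/8   gss include_realm=0 krb_realm=EXAMPLE.COM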
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Jarek
Sent: Thursday, March 31, 2016 3:08 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Big number of connections

Take a look at PgBouncer.
It should solve your problems.

Regards,
Igor Neyman
On 03/31/2016 03:12 PM, Igor Neyman wrote:
> Take a look at PgBouncer.
> It should solve your problems.

If they are going to keep the client connections open, they would need to run pgbouncer in statement or transaction mode.

cheers

andrew
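To make that concrete, a minimal pgbouncer.ini sketch of transaction pooling - the database name, paths and pool sizes are hypothetical placeholders, and note that pgbouncer authenticates clients itself (it has no GSSAPI support, which matters for the Kerberos plan):

    [databases]
    ; route the logical database to the real server
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; "transaction" (or "statement") mode, as suggested above
    pool_mode = transaction
    ; thousands of idle clients are cheap on the pgbouncer side
    max_client_conn = 4000
    ; actual server connections kept per (user, database) pair
    default_pool_size = 20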
Andrew Dunstan wrote:
> If they are going to keep the client connections open, they would need to
> run pgbouncer in statement or transaction mode.

As I understand it, in pgbouncer you cannot have connections that serve different users. If each individual requires their own database-level user, pgbouncer would not help at all.

I would look seriously into getting rid of the always-open requirement for connections.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Although somewhat academic, since pgbouncer doesn't support it (and might not ever), have a look at this ticket which, if it were ever supported, would give you what you need:
On Mar 31, 2016, at 15:47, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> As I understand it, in pgbouncer you cannot have connections that serve
> different users. If each individual requires their own database-level
> user, pgbouncer would not help at all.
>
> I would look seriously into getting rid of the always-open requirement
> for connections.
— Chris Cogdon
On 3/31/2016 17:47, Alvaro Herrera wrote:
> I would look seriously into getting rid of the always-open requirement
> for connections.

I'm trying to figure out where the "always open" requirement comes from; there are very, very few instances where that's real, when you get down to it.
Hello!

On Thu, 2016-03-31 at 19:12 +0000, Igor Neyman wrote:
> Take a look at PgBouncer.
> It should solve your problems.

Well, we don't have problems yet :), but we are looking out for possible threats. I'll be happy to hear from users of big PostgreSQL installations: how many users do you have, and what kind of problems should we expect? Is there any risk that a huge number of roles will slow down overall performance?

best regards
Jarek
On 4/1/16 2:54 AM, jarek wrote:
> I'll be happy to hear from users of big PostgreSQL installations: how
> many users do you have, and what kind of problems should we expect?
> Is there any risk that a huge number of roles will slow down overall
> performance?

The red flag from your original email was concern over each connection consuming 1MB of memory. If you're so tight on memory that you can't afford 4GB of backend-local data, then I don't think you'll be happy with any method of trying to handle 4000 concurrent connections.

Assuming you're on decent sized hardware though, 3000-4000 open connections shouldn't be much of an issue *as long as very few are active at once*. If you get into a situation where there's a surge of activity and you suddenly have 2x more active connections than cores, you won't be happy. I've seen that push servers into a state where the only way to recover was to disconnect everyone.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
From: Jim Nasby
Sent: Sunday, April 03, 2016 10:19 AM
> Assuming you're on decent sized hardware though, 3000-4000 open connections shouldn't be much of an issue *as long as very few are active at once*. If you get into a situation where there's a surge of activity and you suddenly have 2x more active connections than cores, you won't be happy. I've seen that push servers into a state where the only way to recover was to disconnect everyone.

Jim - I don't quite understand the math here: on a server with 20 cores, it can only support 40 active users?

I come from the SQL Server world, where a single 20-core server could support hundreds/thousands of active users and/or many dozens of background/foreground data processes. Is there something fundamentally different between the two platforms relative to active user loads? How would we be able to use Postgres for larger web apps?

Mike Sofen
Hi
2016-04-04 15:14 GMT+02:00 Mike Sofen <msofen@runbox.com>:
> Jim - I don't quite understand the math here: on a server with 20 cores, it can only support 40 active users?
>
> I come from the SQL Server world, where a single 20-core server could support hundreds/thousands of active users and/or many dozens of background/foreground data processes. Is there something fundamentally different between the two platforms relative to active user loads? How would we be able to use Postgres for larger web apps?
PostgreSQL doesn't have an integrated pooler, so every client connection requires one PostgreSQL process. Performance benchmarks show maximum throughput at roughly 10x the number of cores. With a high number of connections you also have to use a small work_mem, which can have a negative impact on performance too, and too many active PostgreSQL processes increases the risk of performance problems with spinlocks, etc.

Web frameworks usually have their own pooling solution, so just use it. If you need more logical connections than the optimum for your number of cores, then you should use an external pooler like pgpool II or pgbouncer:
http://www.pgpool.net/mediawiki/index.php/Main_Page
http://pgbouncer.github.io/
Pgbouncer is lightweight, with only the necessary functions; pgpool is a little heavier, with lots of functions.
Regards
Pavel
On 04/04/2016 15:33, Pavel Stehule wrote:
> PostgreSQL doesn't have an integrated pooler, so every client connection
> requires one PostgreSQL process. Performance benchmarks show maximum
> throughput at roughly 10x the number of cores.

:-O

I wasn't aware of this at all... is there a way to monitor active connections, or at least to report when they grow too much? (Say I have an 8-core system and want to track down if, and when, active connections grow over 80.)

Thanks
Moreno.-
2016-04-04 17:43 GMT+03:00 Moreno Andreo <moreno.andreo@evolu-s.it>:
> Is there a way to monitor active connections, or at least to report when they grow too much?
> (Say I have an 8-core system and want to track down if, and when, active connections grow over 80.)

You can achieve that just by running a simple query like:
select count(*) from pg_stat_activity where state = 'active'
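As a small extension of that query, here is a variant (just a sketch) that breaks sessions down by state, which also shows how many of those always-open connections are merely sitting idle:

    select state, count(*)
    from pg_stat_activity
    group by state
    order by count(*) desc;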
On 04/04/2016 16:54, Artem Tomyuk wrote:
> You can achieve that just by running a simple query like:
> select count(*) from pg_stat_activity where state = 'active'

Thanks, but this way I get a "sample" of that actual moment: what I'd need is to monitor it over time, or to have something warn me like "Hey, you've got 2000 active connections! Time to grow up!" :-)

Cheers,
Moreno.-
There are two ways:
- write a bash script with a condition "if the number of connections is > 1000, send me an email" and put that script in crontab (a minimal sketch follows below)
- monitor it with an external monitoring system like Zabbix, Nagios, etc.
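To illustrate the first option, a minimal sketch of such a cron script - the threshold, alert address and connection defaults are hypothetical placeholders:

    #!/bin/sh
    # check_pg_connections.sh - mail an alert when active backends exceed a threshold
    # (assumes PGHOST/PGUSER etc. are set in the environment, or a .pgpass file)
    THRESHOLD=1000
    ALERT_EMAIL="dba@example.com"

    ACTIVE=$(psql -At -c "select count(*) from pg_stat_activity where state = 'active'")

    if [ "$ACTIVE" -gt "$THRESHOLD" ]; then
        echo "Active connections: $ACTIVE (threshold: $THRESHOLD)" \
            | mail -s "PostgreSQL connection alert" "$ALERT_EMAIL"
    fi

Run it from crontab, e.g. once a minute: * * * * * /usr/local/bin/check_pg_connections.sh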
2016-04-04 16:43 GMT+02:00 Moreno Andreo <moreno.andreo@evolu-s.it>:
> On 04/04/2016 15:33, Pavel Stehule wrote:
>> PostgreSQL doesn't have an integrated pooler, so every client connection
>> requires one PostgreSQL process.
>
> :-O
>
> I wasn't aware of this at all... is there a way to monitor active connections, or at least to report when they grow too much?
> (Say I have an 8-core system and want to track down if, and when, active connections grow over 80.)
100 connections are probably ok; 200 is over the optimum, though there is some tolerance.

We are speaking about the optimum - I have had the chance to work with systems where max_connections was 300 or 600, and they worked. But then max_connections doesn't act as a safeguard against overloading, and under higher load the system can be pretty slow.
Regards
Pavel
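Since max_connections alone is no safeguard, one related knob is a per-role connection cap, which fits the one-account-per-user design from the original post. A small sketch - the role names are hypothetical:

    -- cap a single individual account at a few simultaneous sessions
    ALTER ROLE some_user CONNECTION LIMIT 5;

    -- or set the cap at creation time
    CREATE ROLE another_user LOGIN CONNECTION LIMIT 5;

Note this only limits logins per role; it does not make idle sessions any cheaper.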