Thread: pgbench unable to scale beyond 100 concurrent connections
Hi,

I am testing pgbench with more than 100 connections. I have also set
max_connections in postgresql.conf to more than 100.

Initially pgbench scales up to about 150 connections, but later it comes
down to 100 connections and stays stable there.

Is this a limitation of pgbench? Or a bug? Or am I doing it the wrong way?

---
I tested it with max_connections = 200 in postgresql.conf
and pgbench with -c 180/190/200.

Please reply.

--
Thanks and Regards,
Sachin Kotwal
Hello Sachin,

Your report is very imprecise so it is hard to tell anything.

What version of client and server are you running?

On what hardware? (200 connections => 200 active postgres processes; how
many processes per core are you expecting to run? The recommended value is
about 2 connections per physical core...)

What precise command is started?

How do you know it "comes down to 100 connections"? Are there error
messages from pgbench or postgresql?

My random guess would be that you start too many connections with only one
thread client-side, and/or on too small hardware client- or server-side for
the expected scale, so given the load and latency some connections just
never get to do anything.

Maybe try with "-j 20" so that there are not too many connections per
pgbench thread?

> I am testing pgbench with more than 100 connections. I have also set
> max_connections in postgresql.conf to more than 100.
>
> Initially pgbench scales up to about 150 connections, but later it comes
> down to 100 connections and stays stable there.
>
> Is this a limitation of pgbench? Or a bug? Or am I doing it the wrong way?
>
> ---
> I tested it with max_connections = 200 in postgresql.conf
> and pgbench with -c 180/190/200.
> Please reply.

Please send precise information instead of expecting people to guess...

--
Fabien
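For illustration, an invocation along the lines Fabien suggests would spread
the clients over more threads. The duration and database name below are
placeholders, with the host taken from the ps filter shown later in the
thread:

  # hypothetical run: 200 client connections spread over 20 pgbench threads
  pgbench -h 192.168.56.101 -c 200 -j 20 -T 60 postgres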
Hi Fabien,

Sorry for the very short report. I feel pgbench is not such a complex tool.

Please see below my answers to your questions.
On Wed, Jun 29, 2016 at 5:07 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Hello Sachin,
>
> Your report is very imprecise so it is hard to tell anything.
>
> What version of client and server are you running?
I am testing it with the 9.6-beta1 binaries; it is the same for both server
and client. I am using pgbench on top of postgres_fdw.
> On what hardware? (200 connections => 200 active postgres processes; how
> many processes per core are you expecting to run?
I am running in a small VirtualBox machine, with 1GB RAM and 2 cores.
I think there should not be a problem with 200 processes on a 2-core
machine. I tested the same number of concurrent connections on the same
machine with sysbench and it works fine.

I am not sure what the difference between pgbench and sysbench is; they
might be process based or thread based. But I can say that if I can create
200+ concurrent connections with sysbench, I should be able to create the
same with pgbench. Thoughts?
> The recommended value is about 2 connections per physical core...)
I think 2 connections per core is a very small value. For 200 connections I
would need at least a 100-core machine, which is not good.
> What precise command is started?
>
> How do you know it "comes down to 100 connections"?
I put a watch on live connections to the database, something like:

watch -n 1 'ps -ef | grep postgres | grep 192.168.56.101 | wc -l'

NOTE: the grep command may change as per the environment.

I also used the below query to see active connections:
# select count(*) from pg_stat_activity;
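For a finer-grained view than a bare count, the state column of
pg_stat_activity (present in 9.6) distinguishes active backends from idle
ones; a variant of the same query:

# select state, count(*) from pg_stat_activity group by state;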
> Are there error messages from pgbench or postgresql?
PostgreSQL does not give any error.
pgbench says:
client 36 aborted in state 2: ERROR: could not connect to server "server_1"
DETAIL: FATAL: sorry, too many clients already
> My random guess would be that you start too many connections with only one
> thread client-side, and/or on too small hardware client- or server-side
> for the expected scale, so given the load and latency some connections
> just never get to do anything?
This may be the reason, but it should be able to maintain idle connections
during that time if they never get to do anything.
> Maybe try with "-j 20" so that there are not too many connections per
> pgbench thread?
I do not have such good hardware for now. I feel pgbench should be able to
perform well on small hardware.

Feel free to ask any question regarding the setup.
>> I am testing pgbench with more than 100 connections. I have also set
>> max_connections in postgresql.conf to more than 100.
>>
>> Initially pgbench scales up to about 150 connections, but later it comes
>> down to 100 connections and stays stable there.
>>
>> Is this a limitation of pgbench? Or a bug? Or am I doing it the wrong way?
>>
>> ---
>> I tested it with max_connections = 200 in postgresql.conf
>> and pgbench with -c 180/190/200.
>> Please reply.
>
> Please send precise information instead of expecting people to guess...
>
> --
> Fabien

--
Thanks and Regards,
Sachin Kotwal
On 29 June 2016 at 18:47, Sachin Kotwal <kotsachin@gmail.com> wrote:
> I am testing pgbench with more than 100 connections. I have also set
> max_connections in postgresql.conf to more than 100. Initially pgbench
> scales up to about 150 connections, but later it comes down to 100
> connections and stays stable there. Is this a limitation of pgbench? Or a
> bug? Or am I doing it the wrong way?
What makes you think this is a pgbench limitation?
It sounds like you're benchmarking the client and server on the same system. Couldn't this be a limitation of the backend PostgreSQL server?
It also sounds like your method of counting concurrent connections is probably flawed. You're not allowing for setup and teardown time; if you want over 200 connections really running at very high rates of connection and disconnection you'll probably need to raise max_connections a bit to allow for the ones that're starting up or tearing down at any given time.
Really, though, why would you want to do this? I can measure my car's speed falling off a cliff, but that's not a very interesting benchmark for a car. I can't imagine any sane use of the database this way, with incredibly rapid setup and teardown of lots of connections. Look into connection pooling, either client side or in a proxy like pgbouncer.
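If a pooler were in place, the benchmark command itself would barely change;
for example, with a pgbouncer listening on its default port 6432 on the
same host (a hypothetical setup, not one described in this thread), pgbench
would simply be pointed at the pooler:

  # hypothetical: route pgbench through a local pgbouncer instead of postgres
  pgbench -h 127.0.0.1 -p 6432 -c 200 -j 2 -T 60 postgres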
Hi,

On Wed, Jun 29, 2016 at 6:29 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 29 June 2016 at 18:47, Sachin Kotwal <kotsachin@gmail.com> wrote:
>> I am testing pgbench with more than 100 connections. I have also set
>> max_connections in postgresql.conf to more than 100. Initially pgbench
>> scales up to about 150 connections, but later it comes down to 100
>> connections and stays stable there. Is this a limitation of pgbench? Or
>> a bug? Or am I doing it the wrong way?
>
> What makes you think this is a pgbench limitation?
As I mentioned, when I tried the same thing with sysbench it could give me
200+ concurrent connections with the same method on the same machine.
> It sounds like you're benchmarking the client and server on the same
> system. Couldn't this be a limitation of the backend PostgreSQL server?
I think having the client and server on the same machine should not be a
problem. Since I can do this with a different benchmarking tool, it should
not be a limitation of the backend PostgreSQL server.
> It also sounds like your method of counting concurrent connections is
> probably flawed. You're not allowing for setup and teardown time; if you
> want over 200 connections really running at very high rates of connection
> and disconnection you'll probably need to raise max_connections a bit to
> allow for the ones that're starting up or tearing down at any given time.
Maybe. Please let me know how I can count concurrent connections in this
case.

There should not be connection and disconnection churn, because I am not
using the -C option of pgbench, which causes a connect and disconnect for
each transaction.

If I set max_connections in postgresql.conf to 200 and test with -c 150,
this should work fine, but it does not.
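One thing worth checking in such a setup (a hedged suggestion, not
something established in this thread) is how many slots are actually
available to ordinary users, since a few are reserved for superusers; with
max_connections = 200 and the default superuser_reserved_connections = 3,
only 197 slots remain for regular clients:

# show max_connections;
# show superuser_reserved_connections;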
> Really, though, why would you want to do this? I can measure my car's
> speed falling off a cliff, but that's not a very interesting benchmark
> for a car. I can't imagine any sane use of the database this way, with
> incredibly rapid setup and teardown of lots of connections. Look into
> connection pooling, either client side or in a proxy like pgbouncer.
I am testing a scenario with multiple coordinators, with the help of
postgres_fdw, to enhance the connection capacity of Postgres without any
connection pooling. The setup might be difficult to explain here, but I
will explain it if required.

Can you simply test a scale-100 database with pgbench and run pgbench with
200+ connections on a small VirtualBox machine, to see the same observation?

Please let me know if I can help you reproduce this problem.
--
Thanks and Regards,
Sachin Kotwal
On Wed, Jun 29, 2016 at 8:36 AM, Sachin Kotwal <kotsachin@gmail.com> wrote:
> postgresql does not give any error.
>
> pgbench says:
> client 36 aborted in state 2: ERROR: could not connect to server "server_1"
> DETAIL: FATAL: sorry, too many clients already

The error message that you are seeing there, "FATAL: sorry, too many
clients already", is coming from the server. pgbench is just reporting the
error which the server has sent to it. Now, it is impossible for me to know
why the server is out of connections, but a likely cause is that you
haven't set max_connections high enough. If there is some other cause, you
haven't provided enough information to diagnose it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
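For reference, the limit can be raised without hand-editing postgresql.conf
by using ALTER SYSTEM (available since 9.4). The value 300 below is purely
illustrative, and a change to max_connections only takes effect after a
server restart, not a reload:

# alter system set max_connections = 300;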
On Wed, Jun 29, 2016 at 5:36 AM, Sachin Kotwal <kotsachin@gmail.com> wrote:
> Hi Fabien,
>
> Sorry for the very short report. I feel pgbench is not such a complex tool.
>
> Please see below my answers to your questions.
>
> On Wed, Jun 29, 2016 at 5:07 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> Hello Sachin,
>>
>> Your report is very imprecise so it is hard to tell anything.
>>
>> What version of client and server are you running?
>
> I am testing it with the 9.6-beta1 binaries; it is the same for both
> server and client. I am using pgbench on top of postgres_fdw.

That seems like a pretty important bit of info, there.

...

> PostgreSQL does not give any error.

Surely it does. Perhaps you are not looking in the correct log file.

> pgbench says:
> client 36 aborted in state 2: ERROR: could not connect to server "server_1"
> DETAIL: FATAL: sorry, too many clients already

The error is probably coming from the remote side of the FDW, and then
getting passed from there to the direct server, and from there to pgbench.

Cheers,

Jeff
Hello,

> I am testing it with the 9.6-beta1 binaries; it is the same for both
> server and client. I am using pgbench on top of postgres_fdw.

Hmmm... So pgbench is connected to some pg instance, and this pg instance
is connected to something else on another host? Or to the same instance, in
which case you would double the number of required connections and
processes.

>> On what hardware? (200 connections => 200 active postgres processes; how
>> many processes per core are you expecting to run?
>
> I am running in a small VirtualBox machine, with 1GB RAM and 2 cores.
> I think there should not be a problem with 200 processes on a 2-core
> machine. I tested the same number of concurrent connections on the same
> machine with sysbench and it works fine.

200 processes which do nothing is fine. 200 processes which are actively
competing for resources to process requests is not so fine on a 2-core host.

> I am not sure what the difference between pgbench and sysbench is; they
> might be process based or thread based.

Pgbench starts threads (option -j N, default N = 1), but with 2 cores
threads are not very interesting. I do not know about sysbench.

> But I can say that if I can create 200+ concurrent connections with
> sysbench, I should be able to create the same with pgbench. Thoughts?

If sysbench creates connections which do nothing, it is fine. However,
pgbench will require the server to be active.

> I think 2 connections per core is a very small value. For 200 connections
> I would need at least a 100-core machine, which is not good.

I'm not sure what you would expect with 200 active connections on 2 cores.

>> What precise command is started?

No answer to this question. I wanted to see what options are passed to
pgbench...

>> How do you know it "comes down to 100 connections"?
>
> I put a watch on live connections to the database, something like:
> watch -n 1 'ps -ef | grep postgres | grep 192.168.56.101 | wc -l'

Ok.

> I also used the below query to see active connections:
> # select count(*) from pg_stat_activity;

Ok.

>> Are there error messages from pgbench or postgresql?
>
> PostgreSQL does not give any error.
>
> pgbench says:
> client 36 aborted in state 2: ERROR: could not connect to server "server_1"
> DETAIL: FATAL: sorry, too many clients already

As noted by Jeff & Robert, you have reached the maximum number of
connections allowed by the server.

>> My random guess would be that you start too many connections with only
>> one thread client-side, and/or on too small hardware client- or
>> server-side for the expected scale, so given the load and latency some
>> connections just never get to do anything?
>
> This may be the reason, but it should be able to maintain idle
> connections during that time if they never get to do anything.

Probably, it depends. I'm not sure how pgbench starts its connections
depending on the options (which you did not provide); if it is on demand,
then maybe. You could add the log_connections option and see the details &
number of the created connections.

> I do not have such good hardware for now.

Well, you try to start 200 active servers connected to 200 active clients,
which requires some resources...

> I feel pgbench should be able to perform well on small hardware.

Sure, with 2 cores I would start it with 2 clients.

Maybe you could start by running smaller numbers of pgbench connections
(1, 2, 3, 4...) and check the number of processes created, to check the
effect of the fdw stuff.

--
Fabien.
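Fabien's two suggestions here, log_connections and ramping up the client
count, could be combined into a small sketch like the following (the
database name and client counts are placeholders; log_connections can be
changed with a reload, no restart needed):

  psql -c "alter system set log_connections = on;"
  psql -c "select pg_reload_conf();"
  for c in 1 2 4 8 16; do pgbench -c $c -j 2 -T 10 postgres; done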
On 29 June 2016 at 21:49, Sachin Kotwal <kotsachin@gmail.com> wrote:
> Hi,
>
> On Wed, Jun 29, 2016 at 6:29 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> On 29 June 2016 at 18:47, Sachin Kotwal <kotsachin@gmail.com> wrote:
>>> I am testing pgbench with more than 100 connections. I have also set
>>> max_connections in postgresql.conf to more than 100. Initially pgbench
>>> scales up to about 150 connections, but later it comes down to 100
>>> connections and stays stable there. Is this a limitation of pgbench?
>>> Or a bug? Or am I doing it the wrong way?
>>
>> What makes you think this is a pgbench limitation?
>
> As I mentioned, when I tried the same thing with sysbench it could give
> me 200+ concurrent connections with the same method on the same machine.
What command lines / configs are you using? Details are necessary, talking about this in general hand-waving terms is not getting anywhere.
>> It sounds like you're benchmarking the client and server on the same
>> system. Couldn't this be a limitation of the backend PostgreSQL server?
>
> I think having the client and server on the same machine should not be a
> problem. Since I can do this with a different benchmarking tool, it
> should not be a limitation of the backend PostgreSQL server.
OK, so your sysbench use is actually talking to PostgreSQL as well. Then yes. Assuming they're testing roughly the same thing, which I somewhat doubt.
> There should not be connection and disconnection churn, because I am not
> using the -C option of pgbench, which causes a connect and disconnect for
> each transaction.
OK, in that case it's hard to explain the behaviour you're seeing.
More details please.
> If I set max_connections in postgresql.conf to 200 and test with -c 150,
> this should work fine, but it does not.
If you're using FDWs to connect to the same server again, you'll need a max_connections slot for each FDW connection as well.
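Put differently, with this topology each pgbench client can consume one
backend slot on the coordinator plus another on the remote server behind
the FDW, so the limits on both sides matter. A quick way to verify (the
hostnames here are hypothetical placeholders) is to count backends on both
sides while the test runs:

  psql -h coordinator -c "select count(*) from pg_stat_activity;"
  psql -h shard1 -c "select count(*) from pg_stat_activity;"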
> I am testing a scenario with multiple coordinators, with the help of
> postgres_fdw, to enhance the connection capacity of Postgres without any
> connection pooling. The setup might be difficult to explain here, but I
> will explain it if required.
Yes, you need to explain it.
> Can you simply test a scale-100 database with pgbench and run pgbench
> with 200+ connections on a small VirtualBox machine, to see the same
> observation?
It works fine - of course. There's more to this story than you've explained so far.
Hi All,

Sorry for troubling you with a small environment setup for testing. I
should test this with a larger machine. What I was testing involved
multiple things at the same time, so it was quite confusing.
Possible reasons for this testing failure are:

1. Small hardware.
2. haproxy not being able to balance connections 100-100 across the servers.
3. The postgres_fdw foreign server being unable to establish a large number
   of connections with the remote server/shard.
I was testing multiple coordinators using postgres_fdw (sharding), with
haproxy on top of it for load balancing, as below:
                            +-> pg_fdw coordinator --+
                            |   (conn=100, separate  |
pgbench --> haproxy --------+    pg instance on a    +--> Shards/Nodes (1..N)
(haproxy    (should accept  |    separate machine)   |    remote pg servers
 port)       200 conn)      |                        |
                            +-> pg_fdw coordinator --+
                                (conn=100, separate
                                 pg instance on a
                                 separate machine)
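For context, a minimal postgres_fdw definition on one such coordinator
would look roughly like the following; the server name server_1 comes from
the error message earlier in the thread, while the host, dbname, and user
options are assumptions:

# create extension if not exists postgres_fdw;
# create server server_1 foreign data wrapper postgres_fdw
    options (host '192.168.56.101', dbname 'postgres');
# create user mapping for current_user server server_1
    options (user 'postgres');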
I hope to test this scenario in detail once I get time and good hardware.
If someone tests this scenario, please let me know.
--
Thanks and Regards,
Sachin Kotwal