Thread: Problems with max_connections parameter
Hi all, I am a master student on database systems and I'm working on a stress testing methodology. The goal is to stress testing PostgreSQL under different workloads. Thus, I would like to discuss with you some results. My methodology is based on the increment of two variables: complexity and workload. The complexity is the setup of the testing environment and the SUT. That is the hardware and software setup. The workload is the number of transactions submitted to the SUT. The test case increases the number of transactions to find the limit of system. I had some problems with the parameter max_connections. By default the value of this parameter is 100. I set its value to 2,000, accordingly to the amount of available memory. I used the following formula to set the operating system parameter SHMMAX: SHMMAX=250kB + 8.2kB * shared_buffers + 14.2 kB * max_connections The database has started correctly. I begun the tests using 5 clients machines and 1 server. In the server side, I just ran Postgres. In the clients side, I used threads to simulate a large number of transactions. I simulated 1000, 10000 and 100000 transactions sequentially. I used the transaction of TPC-B, this benchmark simulates a bank system. It have insert, update and select of customers accounts. At the first test, 1000 transactions were submitted to the database at 2 seconds. The time interval for analysis is divided into seconds. The distribution of transactions per time interval is uneven due to delay client boot. In the first interval 200 transactions were submitted and all were attended and completed successfully. In sub-sequent intevalo 800 transactions were submitted and of these 33 were not started due to connection refusal by the BD. This rate is not expected since the DB was configured to serve 2000 concurrent connections. An important fact that deserves to be emphasized is that there were no cases of aborted transactions, i.e. all transactions submitted and accepted by BD were successfully completed with average execution time equal to or less than 1 second. The worst case happens with the 100000 transactions. In this case the successful rate is near to 800 per second, but the "connection errors" reached 10000 per second. The configured limit to max_connections was not reached at any time. This is a bug? Do I make myself clear? Best regards Jorge Augusto Meira
Jorge Augusto Meira escreveu: > This is a bug? > You don't provide sufficient information. PostgreSQL version? Error message? Talking about your problem, are you sure you're not reaching max_connections? Did you check the logs? -- Euler Taveira de Oliveira http://www.timbira.com/
Euler Taveira de Oliveira <euler@timbira.com> wrote: > Talking about your problem, are you sure you're not reaching > max_connections? It also strikes me that from the minimal information given, it might be possible that pid numbers or port numbers are wrapping around before the OS is ready to allow re-use. I haven't seen that behavior except in machines infected with a worm, but this test might be edging into the same territory if it's using a new connection for each request. Obviously, nobody who cared about performance would use that technique in production, but it rather sounds like this test does. -Kevin
Hi Thanks for the answers. Really, I didn't showed you basic informations. I used the PostgreSQL 8.4. The server configuration was: Processor: Intel Xeon Processor W3570 Quad Core Processor Mem: 20GB Network Interface: Gigabit HD: 12 x 1 TB (RAID1+0) OS: Debian GNU/Linux, kernel 2.6.26-2-64 The client machines has similar configuration. The error message was: "Erro Conex=E3o: A tentativa de conex=E3o falhou." or "Erro Conex=E3o: FATAL: connection limit exceeded for non-superusers" I used the logs of my java application used to simulate the clients. Thanks Jorge On Fri, Dec 3, 2010 at 12:54 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov > wrote: > Euler Taveira de Oliveira <euler@timbira.com> wrote: > > > Talking about your problem, are you sure you're not reaching > > max_connections? > > It also strikes me that from the minimal information given, it might > be possible that pid numbers or port numbers are wrapping around > before the OS is ready to allow re-use. I haven't seen that > behavior except in machines infected with a worm, but this test > might be edging into the same territory if it's using a new > connection for each request. Obviously, nobody who cared about > performance would use that technique in production, but it rather > sounds like this test does. > > -Kevin >
Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes: > The error message was: > "Erro Conexão: A tentativa de conexão falhou." > or > "Erro Conexão: FATAL: connection limit exceeded for non-superusers" Hmm ... I can't find the first of those anywhere in the 8.4 message lists; but the second one definitely says that you *are* hitting the max_connections limit, whether you think you should be or not. I wonder whether you are neglecting to allow for the fact that backends have a nonzero shutdown time? If you disconnect and immediately reconnect, it's possible that your old backend is still around, so that the new connection attempt causes max_connections to be exceeded. This is particularly likely if the test program is on the same machine as the database server, because the test program itself is likely to have a higher scheduling priority than the old backend. regards, tom lane
Hi Tom The test program is running in other 5 client machines. In the logs of my test program, the max_connection parameter limit is never reached. Regards Jorge On Mon, Dec 6, 2010 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes: > > The error message was: > > "Erro Conex=E3o: A tentativa de conex=E3o falhou." > > or > > "Erro Conex=E3o: FATAL: connection limit exceeded for non-superusers" > > Hmm ... I can't find the first of those anywhere in the 8.4 message > lists; but the second one definitely says that you *are* hitting the > max_connections limit, whether you think you should be or not. > > I wonder whether you are neglecting to allow for the fact that backends > have a nonzero shutdown time? If you disconnect and immediately > reconnect, it's possible that your old backend is still around, so that > the new connection attempt causes max_connections to be exceeded. This > is particularly likely if the test program is on the same machine as the > database server, because the test program itself is likely to have a > higher scheduling priority than the old backend. > > regards, tom lane >
Jorge Augusto Meira escreveu: > The test program is running in other 5 client machines. > In the logs of my test program, the max_connection parameter limit is > never reached. > How could the test program know? Indeed it doesn't. Are you using some delay between one test and another one? I would be a good idea, specially if you're restarting the PostgreSQL between tests. -- Euler Taveira de Oliveira http://www.timbira.com/
Hi, Euler - How could the test program know? The test program after any operation (inser, tupdate or select) receive a message of postgres like OK or ERROR (Connection error: FATAL). - Are you using some delay between one test and another one? I would be a good idea, specially if you're restarting the PostgreSQL between tests. Yes, I used a delay. After any test I restarted the postgres and cleared my database. Regards Jorge On Mon, Dec 6, 2010 at 4:26 PM, Euler Taveira de Oliveira <euler@timbira.com > wrote: > Jorge Augusto Meira escreveu: > > The test program is running in other 5 client machines. > > In the logs of my test program, the max_connection parameter limit is > > never reached. > > > How could the test program know? Indeed it doesn't. Are you using some > delay > between one test and another one? I would be a good idea, specially if > you're > restarting the PostgreSQL between tests. > > > -- > Euler Taveira de Oliveira > http://www.timbira.com/ > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs >
Hi again Have something else I can do to reach the limit of the parameter max_connections? This may be a bug? Thanks Jorge On Mon, Dec 6, 2010 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes: >> The error message was: >> "Erro Conex=E3o: A tentativa de conex=E3o falhou." >> or >> "Erro Conex=E3o: FATAL: connection limit exceeded for non-superusers" > > Hmm ... I can't find the first of those anywhere in the 8.4 message > lists; but the second one definitely says that you *are* hitting the > max_connections limit, whether you think you should be or not. > > I wonder whether you are neglecting to allow for the fact that backends > have a nonzero shutdown time? =A0If you disconnect and immediately > reconnect, it's possible that your old backend is still around, so that > the new connection attempt causes max_connections to be exceeded. =A0This > is particularly likely if the test program is on the same machine as the > database server, because the test program itself is likely to have a > higher scheduling priority than the old backend. > > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0regards, tom lane >
On Mon, Dec 13, 2010 at 8:46 PM, Jorge Augusto Meira <jmeira@c3sl.ufpr.br> wrote: > Have something else I can do to reach the limit of the parameter > max_connections? > > This may be a bug? Well, I don't think you've really identified what's happening. Kevin and Tom both suggested possible explanations upthread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company