Thread: Problems with max_connections parameter

Problems with max_connections parameter

From

Jorge Augusto Meira

Date:

02 December 2010, 19:24:52

Hi all,

I am a master student on database systems and I'm working on a stress
testing methodology.
The goal is to stress testing PostgreSQL under different workloads.
Thus, I would like to discuss with you some results.

My methodology is based on the increment of two variables: complexity and
workload. The complexity is the setup of the testing environment and the
SUT. That is the hardware and software setup. The workload is  the number of
transactions submitted to the SUT.

The test case increases the number of transactions to find the limit of
system.
I had some problems with the parameter max_connections.

By default the value of this parameter is 100.

I set its value to 2,000, accordingly to the amount of available memory. I
used the following formula to set the operating system parameter SHMMAX:

SHMMAX=250kB + 8.2kB * shared_buffers + 14.2 kB * max_connections

The database has started correctly.

I begun the tests using 5 clients machines and 1 server. In the server side,
I just ran Postgres. In the clients side, I used threads to simulate a large
number of transactions.

I simulated 1000, 10000 and 100000 transactions sequentially.

I used the transaction of TPC-B, this benchmark simulates a bank system. It
have insert, update and select of customers accounts.

At the first test, 1000 transactions were submitted to the database at 2
seconds. The time interval for analysis is divided into seconds. The
distribution of transactions per time interval is uneven due to delay client
boot. In the first interval 200 transactions were submitted and all were
attended and completed successfully.
In sub-sequent intevalo 800 transactions were submitted and of these 33 were
not started due to connection refusal by the BD.

This rate is not expected since the DB was configured to serve 2000
concurrent connections.

An important fact that deserves to be emphasized is that there were no cases
of aborted transactions, i.e. all transactions submitted and accepted by BD
were successfully completed with average execution time equal to or less
than 1 second.

The worst case happens with the 100000 transactions. In this case the
successful rate is near to 800 per second, but the "connection errors"
reached 10000 per second.

The configured limit to max_connections was not reached at any time.

This is a bug?

Do I make myself clear?

Best regards
Jorge Augusto Meira

Re: Problems with max_connections parameter

From

Euler Taveira de Oliveira

Date:

03 December 2010, 13:41:49

Jorge Augusto Meira escreveu:
> This is a bug?
>
You don't provide sufficient information. PostgreSQL version? Error message?

Talking about your problem, are you sure you're not reaching max_connections?
Did you check the logs?


--
  Euler Taveira de Oliveira
  http://www.timbira.com/

Re: Problems with max_connections parameter

From

"Kevin Grittner"

Date:

03 December 2010, 13:54:24

Euler Taveira de Oliveira <euler@timbira.com> wrote:

> Talking about your problem, are you sure you're not reaching
> max_connections?

It also strikes me that from the minimal information given, it might
be possible that pid numbers or port numbers are wrapping around
before the OS is ready to allow re-use.  I haven't seen that
behavior except in machines infected with a worm, but this test
might be edging into the same territory if it's using a new
connection for each request.  Obviously, nobody who cared about
performance would use that technique in production, but it rather
sounds like this test does.

-Kevin

Re: Problems with max_connections parameter

From

Jorge Augusto Meira

Date:

06 December 2010, 11:48:02

Hi

Thanks for the answers.

Really, I didn't showed you basic informations.

I used the PostgreSQL 8.4. The server configuration was:
       Processor: Intel Xeon Processor W3570 Quad Core Processor
       Mem: 20GB
       Network Interface: Gigabit
       HD: 12 x 1 TB (RAID1+0)
       OS: Debian GNU/Linux, kernel 2.6.26-2-64

The client machines has similar configuration.

The error message was:
"Erro Conex=E3o: A tentativa de conex=E3o falhou."
or
"Erro Conex=E3o: FATAL: connection limit exceeded for non-superusers"

I used the logs of my java application used to simulate the clients.

Thanks
Jorge

On Fri, Dec 3, 2010 at 12:54 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov
> wrote:

> Euler Taveira de Oliveira <euler@timbira.com> wrote:
>
> > Talking about your problem, are you sure you're not reaching
> > max_connections?
>
> It also strikes me that from the minimal information given, it might
> be possible that pid numbers or port numbers are wrapping around
> before the OS is ready to allow re-use.  I haven't seen that
> behavior except in machines infected with a worm, but this test
> might be edging into the same territory if it's using a new
> connection for each request.  Obviously, nobody who cared about
> performance would use that technique in production, but it rather
> sounds like this test does.
>
> -Kevin
>

Re: Problems with max_connections parameter

From

Tom Lane

Date:

06 December 2010, 14:31:35

Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes:
> The error message was:
> "Erro Conexão: A tentativa de conexão falhou."
> or
> "Erro Conexão: FATAL: connection limit exceeded for non-superusers"

Hmm ... I can't find the first of those anywhere in the 8.4 message
lists; but the second one definitely says that you *are* hitting the
max_connections limit, whether you think you should be or not.

I wonder whether you are neglecting to allow for the fact that backends
have a nonzero shutdown time?  If you disconnect and immediately
reconnect, it's possible that your old backend is still around, so that
the new connection attempt causes max_connections to be exceeded.  This
is particularly likely if the test program is on the same machine as the
database server, because the test program itself is likely to have a
higher scheduling priority than the old backend.

            regards, tom lane

Re: Problems with max_connections parameter

From

Jorge Augusto Meira

Date:

06 December 2010, 16:57:53

Hi Tom

The test program is running in other 5 client machines.
In the logs of my test program, the max_connection parameter limit is never
reached.

Regards
Jorge

On Mon, Dec 6, 2010 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes:
> > The error message was:
> > "Erro Conex=E3o: A tentativa de conex=E3o falhou."
> > or
> > "Erro Conex=E3o: FATAL: connection limit exceeded for non-superusers"
>
> Hmm ... I can't find the first of those anywhere in the 8.4 message
> lists; but the second one definitely says that you *are* hitting the
> max_connections limit, whether you think you should be or not.
>
> I wonder whether you are neglecting to allow for the fact that backends
> have a nonzero shutdown time?  If you disconnect and immediately
> reconnect, it's possible that your old backend is still around, so that
> the new connection attempt causes max_connections to be exceeded.  This
> is particularly likely if the test program is on the same machine as the
> database server, because the test program itself is likely to have a
> higher scheduling priority than the old backend.
>
>                        regards, tom lane
>

Re: Problems with max_connections parameter

From

Euler Taveira de Oliveira

Date:

06 December 2010, 17:27:01

Jorge Augusto Meira escreveu:
> The test program is running in other 5 client machines.
> In the logs of my test program, the max_connection parameter limit is
> never reached.
>
How could the test program know? Indeed it doesn't. Are you using some delay
between one test and another one? I would be a good idea, specially if you're
restarting the PostgreSQL between tests.


--
  Euler Taveira de Oliveira
  http://www.timbira.com/

Re: Problems with max_connections parameter

From

Jorge Augusto Meira

Date:

06 December 2010, 19:10:03

Hi, Euler

- How could the test program know?
The test program after any operation (inser, tupdate or select) receive a
message of postgres like OK or ERROR (Connection error: FATAL).

- Are you using  some delay between one test and another one? I would be a
good idea, specially if you're restarting the PostgreSQL  between tests.
Yes, I used a delay. After any test I restarted the postgres and cleared my
database.

Regards
Jorge

On Mon, Dec 6, 2010 at 4:26 PM, Euler Taveira de Oliveira <euler@timbira.com
> wrote:

> Jorge Augusto Meira escreveu:
> > The test program is running in other 5 client machines.
> > In the logs of my test program, the max_connection parameter limit is
> > never reached.
> >
> How could the test program know? Indeed it doesn't. Are you using some
> delay
> between one test and another one? I would be a good idea, specially if
> you're
> restarting the PostgreSQL between tests.
>
>
> --
>  Euler Taveira de Oliveira
>  http://www.timbira.com/
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: Problems with max_connections parameter

From

Jorge Augusto Meira

Date:

14 December 2010, 00:46:46

Hi again

Have something else I can do to reach the limit of the parameter
max_connections?

This may be a bug?

Thanks
Jorge

On Mon, Dec 6, 2010 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes:
>> The error message was:
>> "Erro Conex=E3o: A tentativa de conex=E3o falhou."
>> or
>> "Erro Conex=E3o: FATAL: connection limit exceeded for non-superusers"
>
> Hmm ... I can't find the first of those anywhere in the 8.4 message
> lists; but the second one definitely says that you *are* hitting the
> max_connections limit, whether you think you should be or not.
>
> I wonder whether you are neglecting to allow for the fact that backends
> have a nonzero shutdown time? =A0If you disconnect and immediately
> reconnect, it's possible that your old backend is still around, so that
> the new connection attempt causes max_connections to be exceeded. =A0This
> is particularly likely if the test program is on the same machine as the
> database server, because the test program itself is likely to have a
> higher scheduling priority than the old backend.
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0regards, tom lane
>

Re: Problems with max_connections parameter

From

Robert Haas

Date:

14 December 2010, 19:40:32

On Mon, Dec 13, 2010 at 8:46 PM, Jorge Augusto Meira
<jmeira@c3sl.ufpr.br> wrote:
> Have something else I can do to reach the limit of the parameter
> max_connections?
>
> This may be a bug?

Well, I don't think you've really identified what's happening.  Kevin
and Tom both suggested possible explanations upthread.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company