Thread: multiple apaches against single postgres database
Hi all,

I'm looking for a correct, or at least good enough, solution for using multiple apaches with a single postgres database. (The apaches are 2.0.x, and postgres is 8.1.x.)

At the moment I'm involved in managing a website where we have a large user load on our web servers. The apaches are set up to answer 300 requests at the same time, and we currently have 4 apaches. Each of these apaches handles about 100 requests simultaneously on average. We have no connection pooling set up between the apaches and postgresql. Postgres accepts up to 200 connections, and normally about 20 connections are in use (although there is quite a lot of traffic between postgres and the apaches, the queries are simple enough that postgres handles it nicely).

But sometimes (I don't know exactly for what reason) some queries get stuck (mostly inserts or updates, but really simple ones) and postgres is unable to answer in time. This starts a "wave": queries from the apaches are delayed, which means a bigger number of user requests in process, which means more connections to postgres, until we reach the connection limit. But there is something even worse: I don't know why, but the postmaster process apparently forks itself, ending with lots (some hundreds) of postmaster processes running. When we kill all these postmasters and start postgres again, it ends up the same, because the apaches probably overload the database server with their waiting requests. In this case we first need to stop the apaches, then start postgres, then the apaches, and everything works fine... until the next problem, which can occur in hours, days, or weeks.

And my questions:

1. Does someone have similar experience? Or a clue what to do about it?

2. What is the correct setup of a postgresql backend serving data for many (4+) apaches? I know there are connection pooling solutions (pgpool, pgBouncer, or apache 2.2) and I'm thinking about them, but it seems we have another problem besides not yet having implemented any pooling solution.
3. Is there a way to somehow log what happened on the postgres server before the accident? Do you think that logging all sql statements would help me? If I enable it, what will the performance overhead be?

I might be asking too much, but I appreciate any help, hint, or direction on what to explore.

Thanks, I'm looking forward to your answers.

Honza
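[Editorial note on question 3: statement logging in 8.1 does not have to be all-or-nothing. A sketch of the relevant postgresql.conf settings — the values below are only illustrative, not tuned recommendations:]

```
# postgresql.conf (8.1) -- capture only slow statements; illustrative values
log_min_duration_statement = 200   # log any statement taking longer than 200 ms
log_statement = 'none'             # do NOT log every statement (much cheaper)
log_line_prefix = '%t [%p] '       # timestamp and backend PID on each log line
log_connections = on               # see when the connection flood begins
log_disconnections = on
```

Logging only statements over a duration threshold keeps the overhead far below `log_statement = 'all'` while still showing which queries were running when the slowdown started.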
Honza Novak wrote:
> And my questions:
> 1. Does someone have similar experience? Or a clue what to do about it?

Sure, this is considered "normal" behavior for web applications. The solution is to use connection pooling.

> 2. What is the correct setup of a postgresql backend serving data for many
> (4+) apaches? I know there are connection pooling solutions
> (pgPool, pgBouncer, or apache 2.2) and I'm thinking about them, but it
> seems we have another problem besides not yet having implemented any
> pooling solution.

We use pgpool running on each web server. You can also have pgpool running on the database server, or even on a separate server just for that. You'll have to test to see what's best for you.

> 3. Is there a way to somehow log what happened on the postgres server
> before the accident? Do you think that logging all sql statements would
> help me? If I enable it, what will the performance overhead be?

What you are seeing is called "positive feedback". Once the server reaches a certain performance threshold, it starts to delay the queries, which causes more load, which causes further delay, until everything comes to a halt. Sometimes the system can recover from this, if you have properly set up limits (it will just refuse requests until it can cool off), and sometimes it doesn't. The point is to never get over the threshold.

Also, maybe you need better hardware for that kind of load, but since you didn't provide more detail, we can't tell.

It's quite meaningless to analyze performance once the system is overloaded. You have to analyze before that happens, identify the longest-running queries under normal load, and try to optimize them. Under heavy load, even the simplest query may seem to take a long time, but that doesn't necessarily mean there is something wrong with it.

--
Michal Táborský
chief systems architect
Internet Mall, a.s.
<http://www.MALL.cz>
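[Editorial note: as an illustration only — not Michal's actual configuration — a minimal pgpool.conf for pooling on each web server could look like the sketch below. The parameter names follow pgpool-II's documented set; the hostname and all values are made up, so check your pgpool version's documentation:]

```
# pgpool.conf -- minimal connection-pooling sketch; all values illustrative
listen_addresses = 'localhost'        # apache connects to pgpool on the same box
port = 9999                           # point apache's DB connection here, not 5432
backend_hostname0 = 'db.example.com'  # hypothetical database host
backend_port0 = 5432
num_init_children = 32                # pool processes per web server
max_pool = 4                          # cached backend connections per child
connection_cache = true               # reuse backend connections across requests
```

The key effect: each web server then opens at most `num_init_children * max_pool` backend connections, no matter how many apache workers are waiting.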
"Honza Novak" <kacerr@developers.zlutazimnice.cz> writes:

> Hi all,
> I'm looking for a correct, or at least good enough, solution for using multiple
> apaches with a single postgres database. (The apaches are 2.0.x, and postgres is
> 8.1.x.)
>
> At the moment I'm involved in managing a website where we have a large user
> load on our web servers. The apaches are set up to answer 300 requests
> at the same time, and we currently have 4 apaches.

Do you have 300 processors? Are your requests particularly i/o-bound? Why would running 300 processes simultaneously be faster than running a smaller number sequentially? It doesn't sound like your systems are capable of handling such a large number of requests simultaneously.

The traditional answer is to separate static content such as images, which is more i/o-bound, onto a separate apache configuration which has a larger number of connections; limit the number of connections for the cpu-bound dynamic content server; and have a 1-1 ratio between apache dynamic content connections and postgres backends. The alternative is to use connection pooling. Often a combination of the two is best.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
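[Editorial note: to put numbers on Gregory's point, here is a quick back-of-the-envelope check of connection demand versus the configured limit, using only the figures given in the thread:]

```python
# Back-of-the-envelope: worst-case database connection demand vs. the
# configured limit, using the numbers from the thread.

apache_servers = 4
max_requests_per_apache = 300   # each apache answers up to 300 requests at once
pg_max_connections = 200        # postgresql.conf max_connections
typical_busy_requests = 100     # per apache, on average

# Worst case: every apache worker holds a database connection at once.
worst_case_demand = apache_servers * max_requests_per_apache
print(worst_case_demand)                        # 1200 potential connections
print(worst_case_demand / pg_max_connections)   # 6.0 -- six times the limit

# Even average load leaves little headroom once queries start to queue
# and each in-flight request holds its connection longer:
typical_demand = apache_servers * typical_busy_requests
print(typical_demand)                           # 400 -- already 2x max_connections
```

This is exactly the "wave" from the original post: as soon as queries slow down, in-flight requests pile up toward the 1200-connection worst case, and the 200-connection limit is hit long before that.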
Hi Honza,

as Gregory wrote, let apache do the job. Apache queues a request if all running workers are busy.

1. Split static content.
We have an apache as a frontend which serves all static content and forwards (reverse-proxy) dynamic content to the "backends".

2. Split different types of dynamic content.
We have one apache for all interactive requests, where the user expects quick responses. We have another apache for non-interactive content such as downloads and uploads. Those requests don't do much cpu work at all (and don't fit under 1.).

3. Limit each apache to achieve a good workload.
It is too late once the operating system has to schedule the simultaneous workload because you have more processes in the ready state than free CPUs. Set MaxClients of the apache for interactive requests so that your server(s) don't get overloaded. You can set MaxClients just to limit parallel downloads/uploads. Set MaxClients of the frontend higher. A good setting is when interactive requests queue at the backend apache without reaching the limit of the request queue.

4. Set max_connections so that you never reach that limit.
Maximum number of connections from the interactive backend + maximum number of connections from the non-interactive backend + a reserve for the database admin.

5. Check all limits so that you never hit a memory limit and your box starts to swap.

6. Monitor your application well.
- Count the number of open connections to each apache.
- Check the load of the server.
- Check context switches on the PostgreSQL box.

(I understand an apache process group as one apache.)

Here is an example from our web application:

Two frontends - each MaxClients = 1024.
Interactive backend - MaxClients = 35.
Non-interactive backend - MaxClients = 65.
max_connections = 120 (assuming each backend child process has one connection).

With these settings, even under load, we normally have no more queries running on the PostgreSQL server than cores available.
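[Editorial note: a hypothetical httpd.conf sketch of the frontend/backend split described above. The hostnames, ports, and paths are made up; the prefork MaxClients numbers mirror Sven's example:]

```
# --- frontend httpd.conf: serves static files, proxies dynamic requests ---
<IfModule prefork.c>
    MaxClients 1024            # frontend can hold many slow clients
</IfModule>
DocumentRoot /var/www/static
ProxyPass        /app/ http://backend-interactive:8080/app/
ProxyPassReverse /app/ http://backend-interactive:8080/app/
ProxyPass        /dl/  http://backend-download:8081/dl/
ProxyPassReverse /dl/  http://backend-download:8081/dl/

# --- interactive backend httpd.conf: concurrency kept near CPU/DB capacity ---
Listen 8080
<IfModule prefork.c>
    MaxClients 35              # matches the interactive backend in the example
</IfModule>
```

The frontend absorbs the burst of client connections; only a bounded number of requests ever reach the dynamic backend, and therefore the database.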
Please note that this example should only give you a sense of the scale. It took us a long time to find these values for our environment (application and hardware).

BTW: this can also be set up on a single box. We have customers where the different apaches run on the same server.

There are a number of papers on the web which describe such setups. Check out <http://perl.apache.org/docs/1.0/guide/performance.html> for example.

Sven.

Gregory Stark wrote:
> "Honza Novak" <kacerr@developers.zlutazimnice.cz> writes:
>
>> Hi all,
>> I'm looking for a correct, or at least good enough, solution for using multiple
>> apaches with a single postgres database. (The apaches are 2.0.x, and postgres is
>> 8.1.x.)
>>
>> At the moment I'm involved in managing a website where we have a large user
>> load on our web servers. The apaches are set up to answer 300 requests
>> at the same time, and we currently have 4 apaches.
>
> Do you have 300 processors? Are your requests particularly i/o-bound? Why
> would running 300 processes simultaneously be faster than running a smaller
> number sequentially? It doesn't sound like your systems are capable of
> handling such a large number of requests simultaneously.
>
> The traditional answer is to separate static content such as images, which is
> more i/o-bound, onto a separate apache configuration which has a larger number
> of connections; limit the number of connections for the cpu-bound dynamic
> content server; and have a 1-1 ratio between apache dynamic content
> connections and postgres backends. The alternative is to use connection
> pooling. Often a combination of the two is best.

--
Sven Geisler <sgeisler@aeccom.com>    Tel +49.30.921017.81  Fax .50
Senior Developer, AEC/communications GmbH & Co. KG Berlin, Germany
>>> On Wed, Oct 24, 2007 at 7:15 AM, in message <471F3752.8070205@developers.zlutazimnice.cz>, Honza Novak <kacerr@developers.zlutazimnice.cz> wrote:

> But sometimes (I don't know exactly for what reason) some queries get
> stuck (mostly inserts or updates, but really simple ones) and
> postgres is unable to answer in time

In addition to the points made by others, there is a chance that a contributing factor is the tendency of PostgreSQL (prior to the upcoming 8.3 release) to hold onto dirty pages for as long as possible and throw them all at the disk drives at checkpoint time. In some such cases the advice from the previous emails may not be enough -- you may have to use very aggressive background writer settings, a smaller shared_buffers setting, and/or reduce or eliminate the OS write delays. If you find this to be your problem, you may want to be an early adopter of the 8.3 release, once it is out.

-Kevin
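[Editorial note: a sketch of the 8.1-era background writer knobs Kevin refers to. The values below are illustrative starting points for "aggressive" settings, not recommendations; tune against your own I/O behavior:]

```
# postgresql.conf (8.1) -- spread checkpoint I/O out via the background writer
shared_buffers = 20000          # in 8kB pages (~160MB); smaller pool = fewer
                                # dirty pages to flush at checkpoint
bgwriter_delay = 50             # ms between bgwriter rounds (default 200)
bgwriter_lru_percent = 5.0      # scan more of the LRU list each round
bgwriter_lru_maxpages = 200     # write more dirty pages per round (default 5)
bgwriter_all_percent = 1.0      # also sweep the whole buffer pool steadily
bgwriter_all_maxpages = 200
```

The idea is to trickle dirty pages to disk continuously so that the checkpoint itself has little left to write, avoiding the I/O spike that can stall those "really simple" inserts and updates. (8.3's checkpoint_completion_target later addressed this directly.)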
Sorry for an off topic posting...

Michal,

> Honza Novak wrote:
> > And my questions:
> > 1. Does someone have similar experience? Or a clue what to do about it?
>
> Sure, this is considered "normal" behavior for web applications. The
> solution is to use connection pooling.
>
> > 2. What is the correct setup of a postgresql backend serving data for many
> > (4+) apaches? I know there are connection pooling solutions
> > (pgPool, pgBouncer, or apache 2.2) and I'm thinking about them, but it
> > seems we have another problem besides not yet having implemented any
> > pooling solution.
>
> We use pgpool running on each web server. You can also have pgpool
> running on the database server or even a separate server just for that.
> You'll have to test to see what's best for you.

As a member of the pgpool development team, I am always looking for pgpool examples in the real world which could be open to the public. Can you please tell me more details of your pgpool usage, if possible?

--
Tatsuo Ishii
SRA OSS, Inc. Japan

> > 3. Is there a way to somehow log what happened on the postgres server
> > before the accident? Do you think that logging all sql statements would
> > help me? If I enable it, what will the performance overhead be?
>
> What you are seeing is called "positive feedback". Once the server
> reaches a certain performance threshold, it starts to delay the queries,
> which causes more load, which causes further delay, until everything
> comes to a halt. Sometimes the system can recover from this, if you have
> properly set up limits (it will just refuse requests until it can
> cool off), and sometimes it doesn't. The point is to never get over the threshold.
>
> Also, maybe you need better hardware for that kind of load, but since
> you didn't provide more detail, we can't tell.
>
> It's quite meaningless to analyze performance once the system is
> overloaded.
> You have to analyze before that happens, identify the
> longest-running queries under normal load, and try to optimize them.
> Under heavy load, even the simplest query may seem to take a long
> time, but that doesn't necessarily mean there is something wrong with it.
>
> --
> Michal Táborský
> chief systems architect
> Internet Mall, a.s.
> <http://www.MALL.cz>