Re: Troubles with performances - Mailing list pgsql-general

From Tim Kientzle
Subject Re: Troubles with performances
Date
Msg-id 3A6D0325.148FEC19@acm.org
In response to Troubles with performances  (Guillaume Lémery <glemery@comclick.com>)
List pgsql-general
> > How many http connections per second are you getting?
> 200

Huh?  200 connections per second is very different
from 200 simultaneous connections.  Which is it?
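
The two figures are related by Little's law: the average number of
simultaneous connections equals the arrival rate times the average
time each connection stays open.  A rough sketch, assuming the 200ms
response target quoted below as the service time (the function name
is only for illustration):

```python
def simultaneous_connections(rate_per_second: float, seconds_per_request: float) -> float:
    """Little's law: average number of requests in flight at once."""
    return rate_per_second * seconds_per_request


# 200 new connections per second, each open for 200ms (assumed figure)
print(simultaneous_connections(200, 0.200))

# the same arrival rate if each response takes only 10ms
print(simultaneous_connections(200, 0.010))
```

So 200 connections per second at 200ms each is only about 40
connections open at once, and far fewer if responses are faster.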

> I don't handle dynamic pages, but only HTTP redirects, so I think I
> don't need cache...

> I do not use PHP or CGI because they are too slow.
> I built an Apache module. I'd like to have response in 200ms max,
> because it's an application for banners.

That should be easy if you simplify your code a LOT.
1) Configure Apache to limit to 10 child processes.
   (No, that's not a misprint.)  By limiting the number
   of child processes, you limit CPU load and help ensure
   fast response.  You also limit total memory use.
   (To really optimize, drop Apache and find a good
   single-threaded/single-process web server; modify
   that to build a custom web server just for your app.)
2) When your module is initialized, open the database and
   read _everything_ into memory.  If you're building
   a banner redirect system, then you probably only have at most
   a few megabytes of data, so just store it all in memory
   at startup.  From there, just look things up in memory.
3) Don't write logging information to the database; write
   it to a file.  (Designing a good logging system is
   tricky.  Log _files_ are easier to understand
   and manage and faster to write than trying to send log
   data to a database.  If you need summary information such
   as hit totals in the database, have a separate program
   periodically scan the log files to generate such data.)
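
Points 2 and 3 might look roughly like the sketch below.  It is in
Python rather than as an Apache module, purely to show the shape.
The names (load_banners, handle_request, summarize), the two-column
row layout, and the tab-separated log format are all assumptions for
illustration, not anything from the original application.

```python
import collections
import time


def load_banners(rows):
    """Build the lookup table once at startup.

    `rows` stands in for the result of one SELECT over the banner
    table: an iterable of (banner_id, target_url) pairs (assumed layout).
    """
    return {banner_id: url for banner_id, url in rows}


def handle_request(banners, banner_id, log_file):
    """Answer from memory and append one line to the log file."""
    url = banners.get(banner_id)
    if url is None:
        return None  # unknown banner: nothing to redirect to, nothing logged
    log_file.write(f"{int(time.time())}\t{banner_id}\n")
    return url


def summarize(log_path):
    """Separate pass over the log file, producing hit totals per banner."""
    hits = collections.Counter()
    with open(log_path) as log_file:
        for line in log_file:
            _timestamp, banner_id = line.rstrip("\n").split("\t")
            hits[banner_id] += 1
    return hits
```

The request path never talks to the database; only load_banners (at
startup) and whatever stores the output of summarize (periodically,
in a separate program) would.
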
The net result of 2 and 3 is that you won't ever touch
the database during normal operation.  Logging to files
is extremely fast (one disk write for each transaction)
and keeping your banner data in memory ensures that you
can generate responses very quickly.  You should be
able to consistently generate responses in under 10ms with
this kind of design.  (Under 1ms if you do everything
exactly right.)  The only drawback is that a change in
your database data won't immediately affect what's being
served, because each child keeps serving the copy it loaded
at startup.  You can deal with this within Apache by setting
a limit on the number of hits served per child.  That
encourages Apache to restart its child processes fairly
regularly, so each new child picks up the current data
when it initializes.
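
To illustrate how a per-child hit limit bounds staleness, here is a
toy model (not Apache itself): each "child" takes a snapshot of the
data when it starts, serves at most max_hits requests from it, and
is then replaced by a fresh child that takes a new snapshot.  The
names run_child and run_parent are made up for this sketch.

```python
def run_child(load_data, requests, max_hits):
    """Serve at most max_hits requests from a snapshot taken at startup."""
    snapshot = load_data()
    served = []
    for _ in range(max_hits):
        try:
            key = next(requests)
        except StopIteration:
            break  # no more traffic
        served.append(snapshot.get(key))
    return served


def run_parent(load_data, requests, max_hits):
    """Keep starting fresh children until the request stream is exhausted."""
    requests = iter(requests)
    results = []
    while True:
        batch = run_child(load_data, requests, max_hits)
        if not batch:
            break
        results.extend(batch)
    return results
```

A change to the underlying data is seen no later than max_hits
requests after it is made, at the cost of one reload per child.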

> Oddly, increasing the number of connections pooled doesn't increase
> performance and if I create too much connections (e.g. 15 or 20,
> performance decrease).

Of course.  Remember that Apache is a forking web server; you've
got five pooled connections for _every_process_.  That's way too
many.  Since a single process handles only one request at a time,
you only need to pool one connection in each process.  By pooling
more connections in each process, you're just asking PostgreSQL
to keep a LOT of connections open, which is just a needless
drain on the system.
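
The arithmetic, plus one way to hold exactly one connection per
process, as a sketch.  The `connect` callable is a placeholder for
whatever actually opens a PostgreSQL connection in your module, and
the child counts in the example are hypothetical.

```python
def total_backend_connections(child_processes: int, pooled_per_child: int) -> int:
    """Connections PostgreSQL must keep open across all server children."""
    return child_processes * pooled_per_child


class ProcessConnection:
    """Holds a single lazily opened connection for the current process."""

    def __init__(self, connect):
        self._connect = connect  # placeholder: callable that opens a connection
        self._conn = None

    def get(self):
        # Open on first use, then reuse the same connection for every
        # request, since a child handles only one request at a time.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn


# Hypothetical numbers: 100 children with 5 pooled connections each,
# versus 10 children with 1 connection each.
print(total_backend_connections(100, 5))
print(total_backend_connections(10, 1))
```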
