On 07/03/2012 12:54 AM, John R Pierce wrote:
> On 07/03/12 12:34 AM, Craig Ringer wrote:
>> I'm seriously impressed that your system is working under load at all
>> with 800 concurrent connections fighting to write all at once.
>
> indeed, in my transactional benchmarks on a 12 core, 24 thread dual
> xeon x5600 class system, with a 16 or 20 spindle raid10, I find
> somewhere around 50 to 80 database connection threads gives the highest
> overall throughput (several thousand OLTP transactions/second).
> this hardware has vastly better IO and CPU performance than any AWS
> virtual machine.
>
>
> as craig suggested, your network threads could put the incoming
> requests into queue(s), and run a tunable number of database
> connection threads that take requests out of the queue and send them
> to the database, and if necessary, return results to the network
> thread. doing this will give better CPU utilization, and you can try
> different database worker thread counts until you hit the optimal
> number for your hardware.
>
Just to clear the air on this, this is almost exactly what I'm doing.
The number 800 came out of experimenting with different values (I'm sure
it took you some time to find the optimum of 50-80 for your
configuration). The number of "worker" threads is configurable, and they
do receive their work from a shared queue. By the way, for the operations
I'm doing, postgres is performing very well, averaging less than 10ms
per transaction, with throughput at times over 600 tps.
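
A minimal Python sketch of the shared-queue arrangement described
above, for illustration only. It is not the actual application code:
the names run_workers, handle, num_workers and queue_size are
hypothetical, and handle() stands in for whatever each worker does with
its own database connection.

```python
import queue
import threading


def run_workers(requests, handle, num_workers=4, queue_size=100):
    """Process requests with a fixed pool of workers fed by one shared queue."""
    # Bounded queue: the producer blocks when the workers fall behind.
    work = queue.Queue(maxsize=queue_size)
    results = []
    results_lock = threading.Lock()

    def worker():
        # In a real system each worker would hold one database connection.
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work for this worker
                return
            out = handle(item)
            with results_lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for request in requests:
        work.put(request)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

With this shape, num_workers is the single knob to tune, and the number
of database connections never exceeds it. Results come back in
completion order, not submission order.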
However, writing data to postgres is not the only thing I need to do to
process the data. If the processing time rises for other reasons, a low
number of threads may not be able to keep up with the constant stream of
incoming data, and I have to raise the worker thread count to
compensate. As I was doing this, I ran into the problem described in the
original email, and it puzzled me. However, just because I opened 800
connections doesn't mean that all of them are being actively used
concurrently (so there is not that much fighting). I should indeed
switch to a connection pool model in such a case, just to avoid
over-forking postgres. That said, I don't see the forked server
processes consuming any significant amount of system resources.
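
As a rough sanity check on how many connections are busy at once,
Little's law (average concurrency = arrival rate x average time in the
system) can be applied to the figures quoted earlier. This is only a
back-of-the-envelope sketch: the function name is hypothetical, 600 tps
and 10ms are taken as round upper-end values, and it counts only time
spent inside a transaction, not the other processing each worker does.

```python
def avg_busy_connections(throughput_tps, latency_seconds):
    """Little's law: mean number of transactions in flight at any moment."""
    return throughput_tps * latency_seconds


# About 600 transactions per second at about 10 ms each.
busy = avg_busy_connections(600, 0.010)
print(f"about {busy:.0f} connections busy on average")
```

Under those assumptions only a handful of the 800 connections are
inside a transaction at any given moment, which is consistent with the
point that most of them sit idle.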
Thank you,
Pawel.