Re: Troubles with performances - Mailing list pgsql-general

From Alexander Jerusalem
Subject Re: Troubles with performances
Date
Msg-id 4.3.2.7.0.20010122134804.00b13d90@pop.gmx.net
In response to Re: Troubles with performances  (Guillaume Lémery <glemery@comclick.com>)
List pgsql-general
The really big sites scale by employing two strategies:

* Parallelize everything. That is, design the application so that each
important part can run on multiple machines at the same time. For
low-budget projects this is easy to achieve at the web and application logic
tiers but hard for databases. Databases with built-in parallelism, like
Oracle, cost a fortune.

* Cache everything. For example, if your data changes infrequently, or
changes need not be visible immediately, you can send back a cached web page
instead of querying the database every time. Servlet engines like Resin
(www.caucho.com) do that out of the box.
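
To make the caching idea concrete, here is a minimal sketch of a time-based
page cache. It is not how Resin does it; the names TTLCache and
get_or_compute are made up for illustration, and it assumes that serving a
page up to ttl_seconds stale is acceptable.

```python
import time


class TTLCache:
    """Tiny time-based cache: an entry is reused until it is ttl_seconds old.

    Hypothetical helper for illustration only; not part of any real library.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the cache can be tested without sleeping
        self._store = {}        # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, or call compute() and cache the result."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]       # still fresh: no database query needed
        value = compute()       # expired or missing: do the expensive work once
        self._store[key] = (now + self.ttl, value)
        return value
```

In a web application, compute would be the function that queries the
database and renders the page, and key would be something like the request
URL. Within the ttl window the database is not touched at all.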

If, as you said, the database really is the bottleneck (which is rare, by
the way), I would go for caching. If caching is not possible because you
have very frequent updates and all queries must include the most recent
data, you can simulate database parallelism by writing all changes to two
database servers in a distributed transaction, so that reads can be spread
across both servers. You will need an XA-capable transaction manager for
this (in Java this is cheap and easy because you can use an open source EJB
server like JBoss or a low-cost application server like Orion).
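
For what it's worth, the idea behind such a distributed write is two-phase
commit: every server is first asked to prepare the change, and only if all
of them agree is the change committed everywhere. The sketch below is a toy,
in-memory illustration of that protocol, not an XA implementation; the
Participant class and write_to_all function are hypothetical stand-ins for
real database connections and a real transaction manager.

```python
class Participant:
    """Hypothetical stand-in for one database server (in-memory only)."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # simulate a server that cannot prepare
        self.committed = {}                     # data visible to readers
        self._pending = None                    # prepared but not yet committed

    def prepare(self, changes):
        """Phase 1: stage the changes and vote yes (True) or no (False)."""
        if self.fail_on_prepare:
            return False
        self._pending = dict(changes)
        return True

    def commit(self):
        """Phase 2: make the staged changes visible."""
        self.committed.update(self._pending)
        self._pending = None

    def rollback(self):
        """Discard the staged changes."""
        self._pending = None


def write_to_all(participants, changes):
    """Apply changes to every participant, or to none of them."""
    prepared = []
    for p in participants:
        if p.prepare(changes):
            prepared.append(p)
        else:
            # One server voted no: undo the ones that had already prepared.
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True
```

A real transaction manager also has to cope with crashes between the two
phases (logging and recovery), which this sketch leaves out entirely.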

Alexander Jerusalem
ajeru@gmx.net

At 09:38 22.01.01, Guillaume Lémery wrote:
>>A few quick thoughts:
>>   200 simultaneous queries sounds like a lot
>>to me, and suggests you might be CPU-bound.
>Yes, in 'top' CPU usage is 99%...
>
>>Here are the things you should try:
>>1) Put PostgreSQL on a separate server from your
>>    Web server/application logic.  If you are CPU-bound,
>>    then this can greatly improve your throughput.
>I don't really think my Web Server uses a lot of CPU, because my
>application is written as a .so module and it only runs queries against my database.
>
>>2) After doing #1, watch the CPU load on both database
>>    and web server machines.  If your web server is the
>>    bottleneck, try running multiple web servers.
>I was thinking of having 2 web servers per database server, but if you say
>that 200 simultaneous queries is a lot... what about 400? :-(
>
>>3) After trying #1, if your database is the bottleneck
>>    and your data will all fit into RAM, try redesigning
>>    your web application to run entirely from memory.
>>    With this design, you would only do UPDATEs on each
>>    page; SELECTs would be done directly from data
>>    you've stored in memory.  This dramatically reduces
>>    database traffic.  However, it makes it hard to
>>    run multiple web servers and also creates problems if
>>    you have other applications updating the database.
>I don't know how to put the data in memory to increase performance...
>What should I do?
>
>>This assumes, of course, that you've carefully studied
>>and optimized the performance of your code.  In my experience,
>>application performance is usually the bottleneck, not
>>the database.  Each of the above strategies is appropriate
>>in different circumstances.
>Well, with all the performance problems, I tried to optimize my code
>everywhere I could.
>All I can do now is to change the way I pass the queries to the database...
>
>Thank you for all the advice.
>
>Guillaume

