Thread: Memory requirements for web-project
Hello pgsql-general,

I need to calculate the memory requirement if I am using apache+pgsql.

Let's assume that I want 160,000 hits a day and pgsql takes 3 seconds to work for each client; how much RAM is required, especially for pgsql? I think I have to calculate the memory requirement for a specific time period.

160,000 hits are wanted, so I calculate 160,000 / 24 = approx. 6666 hits an hour, / 60 = 111 hits a minute = 1.85 hits per second.

Let's round it up to 2 hits per second. So I calculate now: 2 hits per second * 3 seconds for a client = 6 processes at the same time. Is that correct?

If so, how much is the memory requirement? I have seen with "top" that pgsql requires 5000 KB for a request, but is that used per client (a new process for each new client request)?

-- Boris
www.x-itec.de
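The arithmetic in the question can be sanity-checked with a few lines; the figures are the ones given above (Little's law: concurrency = arrival rate x service time), and the peak multiplier is only a guess, not a measurement:

```python
# Back-of-the-envelope concurrency estimate for 160,000 hits/day,
# 3 seconds of work per request.

hits_per_day = 160_000
seconds_per_request = 3

hits_per_second = hits_per_day / (24 * 60 * 60)       # ~1.85
concurrent_avg = hits_per_second * seconds_per_request

# Traffic bunches up; a 5x peak factor is an assumed figure for illustration.
peak_factor = 5
concurrent_peak = concurrent_avg * peak_factor

print(round(hits_per_second, 2))   # 1.85
print(round(concurrent_avg))       # 6 simultaneous backends on average
print(round(concurrent_peak))      # 28 at an assumed 5x peak
```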
At 05:10 PM 2/4/01 +0100, Boris wrote:
>160,000 hits are wanted, so I calculate 160,000 / 24 = approx. 6666
>hits an hour, / 60 = 111 hits a minute = 1.85 hits per second.
>
>Let's round it up to 2 hits per second. So I calculate now
>2 hits per second * 3 seconds for a client
>= 6 processes at the same time.
>
>Is that correct?

My guess is that your hits are likely to bunch up at certain times rather than be spread out evenly (90% of the people behave similarly 90% of the time). So you might wish to multiply by 5 or 10 (or whatever you pluck from the air ;) ) to cater for peak load.

Why do you say it takes 3 seconds for a client? If it's because the client is slow, you can do something about that, but if it's because your query takes 3 seconds, then maybe some people here could help with your query.

>If so, how much is the memory requirement? I have seen with "top" that
>pgsql requires 5000 KB for a request, but it is used per client (new processes

The minimum size varies depending on what platform it's running on. The max seems to depend vaguely on the size of the result set, but usually most of the memory is reclaimed after the query is finished, at least on my system. Just do some tests with your bigger queries.

I'd say go for 512MB RAM. Usually you can't add RAM during peak load, and if you can, it usually means you have hardware where 1GB or more is the minimum anyway ;). And you might be able to stick half on another server (web+app) if desperate :).

Cheerio,
Link.
On Sun, Feb 04, 2001 at 05:10:21PM +0100, some SMTP stream spewed forth:
> I need to calculate the memory requirement if I am using apache+pgsql.
>
> Let's assume that I want 160,000 hits a day and pgsql takes 3 seconds
> to work for each client; how much RAM is required, especially for pgsql?
>
> 160,000 hits are wanted, so I calculate 160,000 / 24 = approx. 6666
> hits an hour, / 60 = 111 hits a minute = 1.85 hits per second.

Is it safe to presume that most of those 160,000 hits will come during some relatively short peak period? You could end up with 5-10 hits per second depending on the usage trends.

> Let's round it up to 2 hits per second. So I calculate now
> 2 hits per second * 3 seconds for a client
> = 6 processes at the same time.
>
> Is that correct?

It seems correct to me.

> If so, how much is the memory requirement? I have seen with "top" that
> pgsql requires 5000 KB for a request, but it is used per client (a new
> process for each new client request?)

Required memory would include Apache plus any modules (PHP, mod_perl, CGIs, etc.), the postgres executable, *and* whatever memory postgres requires for processing the query (this may depend on what the user is doing - the "type" of hit). Don't forget about caches and shared memory also.

gh
Hello Lincoln,

Sunday, February 04, 2001, 5:51:37 PM, you wrote:

LY> My guess is that your hits are likely to bunch up at certain times, rather
LY> than be spread out evenly (90% of the people behave similarly 90% of the
LY> time). So you might wish to multiply by 5 or 10 (or whatever you pluck from
LY> the air ;) ) to cater for peak load.

Ah, very interesting, yes - I forgot!

LY> Why do you say it takes 3 seconds for a client? If it's because the

No, it's a server problem; it has nothing to do with the database :-)

LY> The minimum size varies depending on what platform it's running on.

Aha, interesting.

LY> The max seems to depend vaguely on the size of the result set. But usually
LY> most of the mem is reclaimed after the query is finished, at least on my
LY> system.

Interesting to know.

LY> Just do some tests with your bigger queries.

Ok, good idea.

LY> I'd say go for 512MB RAM.

That sounds good. The only question left is the memory requirement of Apache per client; I do not completely understand the process spawning with Apache. On high load there are always a minimum of 10 processes left, but where is the limit? Interesting thing.

LY> Usually you can't add RAM during peak load, and if you can, it usually
LY> means you have hardware where 1GB or more is the minimum anyway ;).
LY> And you might be able to stick half on another server (web+app) if
LY> desperate :).

Ok.

LY> Cheerio,
LY> Link.

Thanks, that helped a lot!

-- Boris [MCSE, CNA]
www.x-itec.de
...................................................................
X-ITEC : Consulting * Programming * Net-Security * Crypto-Research
........: Boris Köster, eMail koester@x-itec.de, http://www.x-itec.de
: Grüne 33, 57368 Lennestadt, Germany, Tel: +49 (0)2721 989400
........:..........................................................
Everything I am writing is (c) by Boris Köster and may not be rewritten or distributed in any way without my permission.
Hello GH,

Sunday, February 04, 2001, 5:46:54 PM, you wrote:

G> Is it safe to presume that most of those 160,000 hits will come during
G> some relatively short peak period? You could end up with 5-10 hits per
G> second depending on the usage trends.

Yes, I understand.

G> "type" of hit). Don't forget about caches and shared memory also.

What do you mean by the cache and the shared memory?

-- Boris
On Sun, Feb 04, 2001 at 06:15:46PM +0100, some SMTP stream spewed forth:
> G> "type" of hit). Don't forget about caches and shared memory also.
>
> What do you mean by the cache and the shared memory?

PostgreSQL caches some of the data from the database. If, for example, you wanted most of your database to remain in memory for speed, then you would want more memory than the minimum needed to handle the queries.

I meant that you should make sure PostgreSQL is configured with regard to the expected load and to the number and memory requirements of your queries. Otherwise you might see out-of-memory errors under load.

gh
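To put that in concrete terms, here is a sketch of the knobs involved. The option names are from postgresql.conf in newer PostgreSQL releases (older postmasters take equivalent command-line flags such as -B and -N); the values are purely illustrative, not recommendations:

```
# postgresql.conf (names and values are illustrative)
max_connections = 64     # upper bound on simultaneous backends
shared_buffers = 1024    # shared disk cache, in 8KB pages (= 8MB here)
sort_mem = 2048          # KB per sort, allocated *per backend*
```

The worst case is roughly the shared memory plus max_connections times the per-backend memory, which is why raising per-backend settings multiplies across every connection under load.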
It depends on what you're doing per "hit"... I could allocate a hell of a lot of memory in 3 seconds :-)

Buy twice as much RAM as you think you'll need; that way you're safe! I'd get as much RAM as possible anyway, especially now that it's as cheap as it is... 1 gig will only set you back several hundred dollars.

-Mitch

----- Original Message -----
From: "Boris" <koester@x-itec.de>
To: <pgsql-general@postgresql.org>
Sent: Sunday, February 04, 2001 11:10 AM
Subject: Memory requirements for web-project

> Let's assume that I want 160,000 hits a day and pgsql takes 3 seconds
> to work for each client; how much RAM is required, especially for pgsql?
At 06:13 PM 04-02-2001 +0100, Boris wrote:
>LY> My guess is that your hits are likely to bunch up at certain times, rather
>LY> than be spread out evenly (90% of the people behave similarly 90% of the
>LY> time). So you might wish to multiply by 5 or 10 (or whatever you pluck from
>LY> the air ;) ) to cater for peak load.
>
>Ah, very interesting, yes - I forgot!

The peak-vs-average ratio depends on the sort of application. It may not vary that much for corporate data-entry stuff.

>LY> Why do you say it takes 3 seconds for a client? If it's because the
>
>No, it's a server problem; it has nothing to do with the database :-)

That's interesting. Any reason why 3 secs, which you can tell us?

>That sounds good. The only question left is the memory requirement of
>apache per client; I do not completely understand the process spawning
>with Apache. On high load there are always a minimum of 10 processes
>left, but where is the limit?

Are you running Apache and your web application on the same server as your postgresql server? This would mean the total memory taken would probably be (apache + webapp + postgresql) x max concurrent connections.

Apache 1.x starts up a number of processes to handle requests. Check out what is said in httpd.conf about MinSpareServers and MaxSpareServers. The max limit is controlled by MaxClients in httpd.conf.

If it's a dedicated server, for performance reasons you could set your MaxSpareServers and StartServers reasonably high; that way Apache doesn't have to keep killing and restarting processes. Your MaxClients should be kept to a value X where X concurrent connections won't cause your server to swap.

e.g. example for peak load if everything is running on one server, assuming 512MB RAM:

apache = 3MB with 2MB shared = 1MB per additional connection
webapp = 5MB with 2MB shared = 3MB per additional connection
postgresql = 6MB with 4MB shared (peak) = 2MB per additional connection

Total = 6MB per additional connection.
So with MaxClients 60, you're limited to 60 http connections = 360MB, leaving you 150MB to run various stuff, cope with strangeness, cache your DB etc and let you sleep soundly at night. 512MB is ok for a DB only server doing your proposed load, but may be a bit of a squeeze if you chuck your webserver and apps on it as well. Cheerio, Link.
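The budget above can be written out as a tiny calculation; the per-connection figures are the example numbers from the message, not measurements:

```python
# RAM budget for Apache's MaxClients, using the example figures above.

ram_mb = 512
per_conn_mb = 1 + 3 + 2     # apache + webapp + postgresql, per extra connection
max_clients = 60

used_mb = max_clients * per_conn_mb
headroom_mb = ram_mb - used_mb   # the shared copies and the OS come out of this

print(used_mb)       # 360 MB for 60 concurrent connections
print(headroom_mb)   # 152 MB left for the OS, DB cache, and strangeness
```

Inverting it gives a safe MaxClients for a given machine: (RAM minus everything else) divided by the per-connection cost.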
On Sun, 4 Feb 2001, Boris wrote:
> That sounds good. The only question left is the memory requirement of
> apache per client; I do not completely understand the process spawning
> with Apache. On high load there are always a minimum of 10 processes
> left, but where is the limit?

I would like to recommend you use two machines: one for Apache/PHP and another for the database. This would not only give you better performance, but you would be able to scale where you need it without having to account for both sides of the equation - one of those two sets of operations (web vs. DB) may be a much bigger bottleneck than the other.

Depending on your type of site and how heavily you use the DB, you may even find that the DB keeps up fine on a moderate setup while the Apache/PHP side needs more hardware... or it could be totally the other way around. It really has everything to do with your site and what it does. How about telling us about the type of site and how the DB will be involved in it?
Other people have said a lot on this, but IMHO missed some key points.

Boris wrote:
> Let's assume that I want 160,000 hits a day and pgsql takes 3 seconds
> to work for each client; how much RAM is required, especially for pgsql?

As said in another post - it depends upon the size of the query's result set.

> Let's round it up to 2 hits per second. So I calculate now
> 2 hits per second * 3 seconds for a client
> = 6 processes at the same time.
> Is that correct?

No - as stated elsewhere, you need to allow for traffic peaks, so if you have traffic logs, examine them; otherwise pick your favourite number and double it.

The other thing is that the time it takes to complete a query will change the more clients you have running at the same time. Running six simultaneous queries, you might get *one* result back in 3 secs, an average of 6 secs, and a maximum of 20 seconds for the slowest client. Test the system with the "ab" Apache benchmark tool (it comes with Apache) or similar - this will give you a clearer idea of what to expect.

Then, of course, the result set will be stored in your perl/php/whatever script before being sent to the client. You need to allow for that. Also allow for the time to send the data to the client: sending 100KB to a client on a 28.8 modem can take a few seconds.

> If so, how much is the memory requirement? I have seen with "top" that
> pgsql requires 5000 KB for a request, but it is used per client (a new
> process for each new client request?)

Don't forget to subtract the shared memory from each of these clients - there will only be one copy of the code in memory for all the clients.

Bottom line - if money is tight, do as much testing as you can; otherwise buy as much RAM as your system can take - too much tends not to do a lot of damage.

- Richard Huxton
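The "subtract the shared memory" advice matters for the arithmetic: summing top's per-process numbers counts shared pages once per process. A toy illustration (the 5000 KB figure comes from the thread; the shared fraction is an assumption for illustration):

```python
# Naive vs shared-aware memory estimate for N backend processes.
# top's per-process size includes shared pages, so summing it
# overestimates real usage.

rss_kb = 5000        # roughly what top showed per backend (from the thread)
shared_kb = 3500     # assumed shared portion: code + shared buffers
n_backends = 10

naive_kb = n_backends * rss_kb                             # counts shared 10x
actual_kb = shared_kb + n_backends * (rss_kb - shared_kb)  # counts shared once

print(naive_kb)    # 50000 KB if you just multiply top's number
print(actual_kb)   # 18500 KB once shared pages are counted only once
```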