Thread: Just to give an idea ...
Top ten processes on mars (where pgfoundry and archives are located):

USER      PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
www     41463 17.3  0.1 12364  5808  ??  SJ   11:00AM   0:05.75 /usr/local/sbin/httpd
www     20583 13.5  0.1 12688  6172  ??  RJ    2:00PM 195:45.65 /usr/local/sbin/httpd
www     41053 12.2  0.1 12372  5840  ??  SJ   10:56AM   0:28.14 /usr/local/sbin/httpd
www     56570 12.3  0.3 21336 12012  ??  RJ   Wed11PM 264:32.06 /usr/local/sbin/httpd
www     56572 12.5  0.3 21388 12072  ??  RJ   Wed11PM 258:55.67 /usr/local/sbin/httpd
www     41792 11.9  0.1 12368  5800  ??  SJ   11:00AM   0:01.19 /usr/local/sbin/httpd
www     41778 10.5  0.1 12376  5824  ??  SJ   11:00AM   0:04.47 /usr/local/sbin/httpd
www     20582  9.0  0.2 13192  6752  ??  SJ    2:00PM 195:51.63 /usr/local/sbin/httpd
60      41798  6.6  0.1 21012  2296  ??  SJ   11:00AM   0:00.38 lmtpd
scrappy 41795  6.4  0.0  2220  1408  ??  SJ   11:00AM   0:00.47 cleanup -z -t unix -u

and, process map'ng to the VMs:

# cat /proc/{41463,20583,41053,56570,56572,41792,41778,20582,41798,41795}/status
cat: /proc/41463/status: No such file or directory
httpd 11651,690738 nochan 80 80 80,80,80 svr5.postgresql.org
httpd 41,778660 select 80 80 80,80,80 svr5.postgresql.org
httpd 15453,319902 nochan 80 80 80,80,80 pgfoundry.org
httpd 15130,154111 nochan 80 80 80,80,80 pgfoundry.org
cat: /proc/41792/status: No such file or directory
cat: /proc/41778/status: No such file or directory
httpd 11656,301638 nochan 80 80 80,80,80 svr5.postgresql.org
lmtpd 0,365833 select 60 60 60,60,60 up4.com
cleanup 0,462244 select 1001 1001 1001,1001,1001,6 up4.com

and all the high %CPU ones are postgresql.org related ...

also, postgresql.org generates ~298GB of traffic per month, which accounts
for ~56% of all traffic out of the servers (and that doesn't include what I
put onto our offsite server for ftp/bittorrent, it used to be ~75% of the
traffic), while hub.org (non client) generates ~175GB (or ~33%) ...

Most (if not 90%) of the paying clients are static pages that are lucky to
generate 100MB of traffic (and note that traffic is all traffic,
mail/web/ssh/ftp/etc) ...

The point: before accusing "hub clients" of loading the servers, realize
that ~50% of the resources are used by postgresql.org, not by clients ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
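For anyone who wants to repeat the process-to-VM mapping from the message above without reading the status lines by eye, here is a minimal Python sketch. It assumes only what the pasted output shows: the first field of each status line is the command name and the last field is the hostname of the VM the process runs in. The status lines as pasted look abbreviated, so a real procfs status line may carry more fields in between; the function names are made up for illustration.

```python
from collections import Counter

# Status lines copied verbatim from the message above.
STATUS_OUTPUT = """\
cat: /proc/41463/status: No such file or directory
httpd 11651,690738 nochan 80 80 80,80,80 svr5.postgresql.org
httpd 41,778660 select 80 80 80,80,80 svr5.postgresql.org
httpd 15453,319902 nochan 80 80 80,80,80 pgfoundry.org
httpd 15130,154111 nochan 80 80 80,80,80 pgfoundry.org
cat: /proc/41792/status: No such file or directory
cat: /proc/41778/status: No such file or directory
httpd 11656,301638 nochan 80 80 80,80,80 svr5.postgresql.org
lmtpd 0,365833 select 60 60 60,60,60 up4.com
cleanup 0,462244 select 1001 1001 1001,1001,1001,6 up4.com
"""


def host_of(line):
    """Return (command, hostname) for a status line, or None for a cat error."""
    if line.startswith("cat: "):
        # The process exited between the ps run and the cat.
        return None
    fields = line.split()
    if len(fields) < 2:
        return None
    # Assumption: first field is the command, last field is the VM hostname.
    return fields[0], fields[-1]


def processes_per_host(text):
    """Count how many of the listed processes belong to each VM hostname."""
    counts = Counter()
    for line in text.splitlines():
        parsed = host_of(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts


if __name__ == "__main__":
    for host, n in processes_per_host(STATUS_OUTPUT).most_common():
        print(n, host)
```

Three of the ten PIDs had already exited by the time the status files were read, so the count covers only the seven processes that were still alive.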
If we want to offset the load by moving pgfoundry to one of my servers
temporarily until you guys get your new servers and what not in place,
I'd be happy to host it, if it helps.

Gavin

On Sat, 5 Jun 2004 11:19:04 -0300 (ADT), Marc G. Fournier
<scrappy@postgresql.org> wrote:
> Top ten processes on mars (where pgfoundry and archives are located):
> [...]
> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Marc G. Fournier
> Sent: 05 June 2004 15:19
> To: pgsql-www@postgresql.org
> Subject: [pgsql-www] Just to give an idea ...
>
> Top ten processes on mars (where pgfoundry and archives are located):
>
> USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
> www 41463 17.3 0.1 12364 5808 ?? SJ 11:00AM
> 0:05.75 /usr/local/sbin/httpd

Ignoring mars and considering jupiter which is suffering from similar
problems (which you told me is not the database as I had suspected),
what do you think is actually causing this excess load? None of the
backend web code should be sufficiently complex to see the sort of loads
we seem to be seeing, even under the load the site gets. I've
apache-benched far more complex stuff on far less hardware and not
suffered like this. Geez, the vast majority of our PHP code simply does
include()'s and echo()'s, and even that only normally gets executed once
per hour - the users read static html for the most part!

I can only imagine that somewhere in the code there is a serious error
that is causing this...

Or jupiter is actually a Sinclair ZX Spectrum :-)

/D
On Sat, 5 Jun 2004, Dave Page wrote:

>> -----Original Message-----
>> From: pgsql-www-owner@postgresql.org
>> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Marc G. Fournier
>> Sent: 05 June 2004 15:19
>> To: pgsql-www@postgresql.org
>> Subject: [pgsql-www] Just to give an idea ...
>>
>> Top ten processes on mars (where pgfoundry and archives are located):
>>
>> USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
>> www 41463 17.3 0.1 12364 5808 ?? SJ 11:00AM
>> 0:05.75 /usr/local/sbin/httpd
>
> Ignoring mars and considering jupiter which is suffering from similar
> problems (which you told me is not the database as I had suspected),
> what do you think is actually causing this excess load? None of the
> backend web code should be sufficiently complex to see the sort of loads
> we seem to be seeing, even under the load the site gets. I've
> apache-benched far more complex stuff on far less hardware and not
> suffered like this. Geez, the vast majority of our PHP code simply does
> include()'s and echo()'s, and even that only normally gets executed once
> per hour - the users read static html for the most part!

k, www.* is on pluto, not on mars or jupiter, and tends to stay pretty
constant (occasional spikes) ...

jupiter isn't web related load, but mail ...

mars is a combination of things ...

note that the point of my point was that Josh's opinion appears to be that
"hub clients" are causing load issues on the servers, which is not
accurate ... most of our "big clients" are running on neptune, where no
postgresql.org VM is running, and the load on that machine *rarely* goes
above 5, and when it does, I just need to kill off one of the aspseek
processes and it drops back down again :)

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
> -----Original Message-----
> From: Marc G. Fournier [mailto:scrappy@postgresql.org]
> Sent: 05 June 2004 22:35
> To: Dave Page
> Cc: Marc G. Fournier; pgsql-www@postgresql.org
> Subject: RE: [pgsql-www] Just to give an idea ...
>
> On Sat, 5 Jun 2004, Dave Page wrote:
>
> k, www.* is on pluto, not on mars or jupiter, and tends to
> stay pretty constant (occasional spikes) ...

OK, wrong planet - never was much good at astronomy...

> jupiter isn't web related load, but mail ...
>
> mars is a combination of things ...
>
> note that the point of my point was that Josh's opinion
> appears to be that "hub clients" are causing load issues on
> the servers, which is not accurate ...

Yeah, I got that and don't disagree. I'm having a *very* hard job
understanding what we're doing that is maxing out such a server. I could
understand it if it was neptune (where the DBs are for those that don't
know).

Remember a few days back I pointed you to
http://www.postgresql.org/index.php which just timed out after 5 minutes
or so whilst http://www.postgresql.org/index.html loaded in a couple of
seconds? Well, at the same time, the admin pages which access the DB and
are written in PHP were also working just fine. The major differences
between index.php and the admin pages are the styling, the number of db
accesses (1 for an admin page, maybe 4 for index.php), and the banner ads
(which aren't affecting it as the page will load even if they are
broken). None of that *should* cause such a massive performance problem
on a dual xeon or even PIII server, especially when the vast majority of
load is users accessing static HTML (which works just fine).

So it seems to me that there is something broken in PHP causing these
load problems that we are only hitting in certain circumstances.

BTW, index.php is working just fine right now :-(

> most of our "big
> clients" are running on neptune, where no postgresql.org VM
> is running, and the load on that machine *rarely* goes above
> 5, and when it does, I just need to kill off one of the
> aspseek processes and it drops back down again :)

Don't kill the indexer processes unless absolutely necessary - that can
screw the database up. Use '/usr/local/aspseek/bin/indexer -E' to safely
terminate the running indexers. It can take a little while to shut them
down...

Regards, Dave.
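A small wrapper makes the safe route the easy one. The sketch below is only an illustration in Python: the binary path and the `-E` flag are quoted from the message above, while the function names, the dry-run default and the timeout value are assumptions made up for this example.

```python
import subprocess

# Path and flag as quoted in the message above.
INDEXER = "/usr/local/aspseek/bin/indexer"


def stop_command(indexer=INDEXER):
    """Build the argv that asks the running indexers to terminate safely."""
    return [indexer, "-E"]


def shutdown_aspseek(dry_run=True, timeout=600):
    """Ask the indexers to stop instead of killing them.

    With dry_run=True (the default here) the command is only returned,
    not executed. The timeout is an arbitrary illustrative value, since
    the message only says shutdown "can take a little while".
    """
    cmd = stop_command()
    if dry_run:
        return cmd
    # check=True raises CalledProcessError if the indexer exits non-zero.
    subprocess.run(cmd, check=True, timeout=timeout)
    return cmd


if __name__ == "__main__":
    print(" ".join(shutdown_aspseek(dry_run=True)))
```

Defaulting to a dry run is a deliberate choice for a sketch like this, so that pasting it onto a machine does not stop anything by accident.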
On Sat, 5 Jun 2004, Dave Page wrote:

> So it seems to me that there is something broken in PHP causing these
> load problems that we are only hitting in certain circumstances.
>
> BTW, index.php is working just fine right now :-(

'k, keep an eye on that and see how it runs ... I just upgraded PHP on the
template to 4.3.7, which:

  Fixed a number of crashes inside pgsql, cpdf and gd extensions.

now, granted, the pages have been working, but am curious as to whether or
not some weren't manifested as crashes, but slow downs ...

> Don't kill the indexer processes unless absolutely necessary - that can
> screw the database up. Use '/usr/local/aspseek/bin/indexer -E' to safely
> terminate the running indexers. It can take a little while to shut them
> down...

k, just added a 'shutdown-aspseek' alias to the VM, so that it's documented
*somewhere* ;)

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
> -----Original Message-----
> From: Marc G. Fournier [mailto:scrappy@postgresql.org]
> Sent: 06 June 2004 02:26
> To: Dave Page
> Cc: pgsql-www@postgresql.org
> Subject: RE: [pgsql-www] Just to give an idea ...
>
> On Sat, 5 Jun 2004, Dave Page wrote:
>
> > So it seems to me that there is something broken in PHP causing these
> > load problems that we are only hitting in certain circumstances.
> >
> > BTW, index.php is working just fine right now :-(
>
> 'k, keep an eye on that and see how it runs ... I just
> upgraded PHP on the template to 4.3.7, which:
>
> Fixed a number of crashes inside pgsql, cpdf and gd extensions.
>
> now, granted, the pages have been working, but am curious as
> to whether or not some weren't manifested as crashes, but
> slow downs ...

OK, that's a possibility. Often my Squid proxy reported failed pages as
'zero sized replies' which look like timeouts as it waits for data. I
guess these could easily have been crashes.

> > Don't kill the indexer processes unless absolutely necessary - that
> > can screw the database up. Use '/usr/local/aspseek/bin/indexer -E' to
> > safely terminate the running indexers. It can take a little while to
> > shut them down...
>
> k, just added a 'shutdown-aspseek' alias to the VM, so that
> its documented *somewhere* ;)

Fyi: the rest of the docs are at http://www.aspseek.org/.

/D
Marc,

> note that the point of my point was that Josh's opinion appears to be that
> "hub clients" are causing load issues on the servers, which is not
> accurate ...

So, if it's not Hub.org clients, how about doing a little sleuthing on
what *is* causing the resource drain?

Nobody but you has root access to the real server, Marc, so nobody but
you can diagnose why so many of the PostgreSQL.org sites ... including
pgFoundry.org ... are behaving like they're under constant DDOS attack.
If nothing else, I'd think that you'd be getting complaints from your
paying clients about this!

--
-Josh Berkus
Aglio Database Solutions
San Francisco
On Sun, 6 Jun 2004, Josh Berkus wrote:

> Marc,
>
>> note that the point of my point was that Josh's opinion appears to be that
>> "hub clients" are causing load issues on the servers, which is not
>> accurate ...
>
> So, if it's not Hub.org clients, how about doing a little sleuthing on
> what *is* causing the resource drain?
>
> Nobody but you has root access to the real server, Marc, so nobody but
> you can diagnose why so many of the PostgreSQL.org sites ... including
> pgFoundry.org ... are behaving like they're under constant DDOS attack.
> If nothing else, I'd think that you'd be getting complaints from your
> paying clients about this!

Truth be told ... only clients that notice anything are those running
webmail, and even on a 'dedicated machine' I find webmail to be dog slow
...

But, as far as pgfoundry.org is concerned, as was mentioned on the
gforge-admins list, Andrew got Jan to take a look through the tables, and
apparently there are few, if any, indices on them, which could account for
how slow things look ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
Marc,

> Truth be told ... only clients that notice anything are those running
> webmail, and even on a 'dedicated machine' I find webmail to be dog slow

Well ... that's Hordemail for you. They packed it full of features, but
didn't ever troubleshoot its performance issues. A lot of the slowness of
Hordemail is the truly vast number of tiny images on the screen to make it
look "snazzy".

Communigate Pro webmail is a *lot* faster. They achieved this by
providing only the essential features, and a bare-bones interface. And,
of course, it's a commercial product (although very OSS-friendly).

> But, as far as pgfoundry.org is concerned, as was mentioned on the
> gforge-admins list, Andrew got Jan to take a look through the tables, and
> apparently there are few, if any, indices on them, which could account for
> how slow things look ...

Oh, I thought that had been fixed. Will fix as soon as I find my login
info again.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
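To illustrate why tables with few or no indices could account for slow pages, here is a minimal, self-contained Python sketch. It uses SQLite from the standard library as a stand-in for PostgreSQL, and the table `demo_task`, its columns and the index name are all made up for the example; none of them come from the pgfoundry schema. It only shows the general effect: the same lookup changes from a full table scan to an index search once an index exists on the filtered column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Hypothetical table, filled with throwaway rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_task (task_id INTEGER, group_id INTEGER, summary TEXT)"
)
conn.executemany(
    "INSERT INTO demo_task VALUES (?, ?, ?)",
    [(i, i % 100, "task %d" % i) for i in range(10000)],
)

LOOKUP = "SELECT * FROM demo_task WHERE group_id = ?"

# Without an index, every row has to be examined.
before = query_plan(conn, LOOKUP, (42,))

# With an index on the filtered column, only matching rows are visited.
conn.execute("CREATE INDEX idx_demo_task_group ON demo_task (group_id)")
after = query_plan(conn, LOOKUP, (42,))

if __name__ == "__main__":
    print("before:", before)
    print("after: ", after)
```

On PostgreSQL itself the equivalent check would be to run EXPLAIN on the slow queries before and after adding indexes; the exact plan text differs between the two databases.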
On Sat, 5 Jun 2004, Gavin M. Roy wrote:

> If we want to offset the load by moving pgfoundry to one of my servers
> temporarily until you guys get your new servers and what not in place,
> I'd be happy to host it, if it helps.

altho that is an option, I made some changes to the servers this weekend
that seem to be holding load down ... we'll see how the 'weekday traffic'
changes that ... but, also, Jan Wieck took a scan through the schema for
pgfoundry at Andrew's request, and it looks like there are several areas
for improvement there ;(

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664