Thread: Just to give an idea ...

Just to give an idea ...

From
"Marc G. Fournier"
Date:
Top ten processes on mars (where pgfoundry and archives are located):

USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
www     41463 17.3  0.1 12364 5808  ??  SJ   11:00AM   0:05.75 /usr/local/sbin/httpd
www     20583 13.5  0.1 12688 6172  ??  RJ    2:00PM 195:45.65 /usr/local/sbin/httpd
www     41053 12.2  0.1 12372 5840  ??  SJ   10:56AM   0:28.14 /usr/local/sbin/httpd
www     56570 12.3  0.3 21336 12012  ??  RJ   Wed11PM 264:32.06 /usr/local/sbin/httpd
www     56572 12.5  0.3 21388 12072  ??  RJ   Wed11PM 258:55.67 /usr/local/sbin/httpd
www     41792 11.9  0.1 12368 5800  ??  SJ   11:00AM   0:01.19 /usr/local/sbin/httpd
www     41778 10.5  0.1 12376 5824  ??  SJ   11:00AM   0:04.47 /usr/local/sbin/httpd
www     20582  9.0  0.2 13192 6752  ??  SJ    2:00PM 195:51.63 /usr/local/sbin/httpd
60      41798  6.6  0.1 21012 2296  ??  SJ   11:00AM   0:00.38 lmtpd
scrappy 41795  6.4  0.0  2220 1408  ??  SJ   11:00AM   0:00.47 cleanup -z -t unix -u

and, process map'ng to the VMs:

# cat /proc/{41463,20583,41053,56570,56572,41792,41778,20582,41798,41795}/status
cat: /proc/41463/status: No such file or directory
httpd 11651,690738 nochan 80 80 80,80,80 svr5.postgresql.org
httpd 41,778660 select 80 80 80,80,80 svr5.postgresql.org
httpd 15453,319902 nochan 80 80 80,80,80 pgfoundry.org
httpd 15130,154111 nochan 80 80 80,80,80 pgfoundry.org
cat: /proc/41792/status: No such file or directory
cat: /proc/41778/status: No such file or directory
httpd 11656,301638 nochan 80 80 80,80,80 svr5.postgresql.org
lmtpd 0,365833 select 60 60 60,60,60 up4.com
cleanup 0,462244 select 1001 1001 1001,1001,1001,6 up4.com

and all the high %CPU ones are postgresql.org related ...
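
For anyone wanting to reproduce the tally, here is a minimal Python sketch that sums %CPU per VM from the two listings above. It assumes the last field of each /proc status line is the VM hostname, and that cat printed its output in the same order as the PIDs on the command line. The function name cpu_by_vm and the "(exited)" label are illustrative only, not anything that exists on the servers.

```python
from collections import defaultdict

# (pid, %CPU) pairs copied from the ps listing above.
PS_ROWS = [
    (41463, 17.3), (20583, 13.5), (41053, 12.2), (56570, 12.3), (56572, 12.5),
    (41792, 11.9), (41778, 10.5), (20582, 9.0), (41798, 6.6), (41795, 6.4),
]

# Output of the cat command above, one entry per PID, in the same order.
# None marks a PID whose /proc entry was already gone ("No such file").
STATUS_LINES = [
    None,
    "httpd 11651,690738 nochan 80 80 80,80,80 svr5.postgresql.org",
    "httpd 41,778660 select 80 80 80,80,80 svr5.postgresql.org",
    "httpd 15453,319902 nochan 80 80 80,80,80 pgfoundry.org",
    "httpd 15130,154111 nochan 80 80 80,80,80 pgfoundry.org",
    None,
    None,
    "httpd 11656,301638 nochan 80 80 80,80,80 svr5.postgresql.org",
    "lmtpd 0,365833 select 60 60 60,60,60 up4.com",
    "cleanup 0,462244 select 1001 1001 1001,1001,1001,6 up4.com",
]


def cpu_by_vm(ps_rows, status_lines):
    """Sum %CPU per VM hostname (assumed to be the last status field)."""
    totals = defaultdict(float)
    for (pid, cpu), status in zip(ps_rows, status_lines):
        vm = status.split()[-1] if status else "(exited)"
        totals[vm] += cpu
    # Round to one decimal to match the precision ps reports.
    return {vm: round(total, 1) for vm, total in totals.items()}
```

On the numbers posted, this gives roughly 34.7 for svr5.postgresql.org, 24.8 for pgfoundry.org, 13.0 for up4.com, and 39.7 for the three httpd processes that had already exited by the time cat ran.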

also, postgresql.org generates ~298GB of traffic per month, which accounts
for ~56% of all traffic out of the servers (and that isn't including what I
put onto our offsite server for ftp/bittorrent, it used to be ~75% of the
traffic), while hub.org (non client) generates ~175GB (or ~33%) ...
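
As a rough cross-check of those percentages, here is a small Python sketch. It only assumes that the ~56% share and the 175GB figure refer to the same monthly total; the function name is made up for illustration.

```python
def traffic_breakdown(pg_gb=298, pg_share=0.56, hub_gb=175):
    """Derive the implied monthly total and remaining share from the quoted figures."""
    total_gb = pg_gb / pg_share           # total implied by 298GB being ~56%
    hub_share = hub_gb / total_gb         # hub.org (non-client) share of that total
    other_gb = total_gb - pg_gb - hub_gb  # whatever is left over
    return {
        "total_gb": total_gb,
        "hub_share": hub_share,
        "other_gb": other_gb,
        "other_share": other_gb / total_gb,
    }
```

This puts the implied total at about 532GB per month, hub.org at about 33%, and everything else at about 59GB, or roughly 11%.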

Most (if not 90%) of the paying clients are static pages that are lucky to
generate 100MB of traffic (and note that traffic is all traffic,
mail/web/ssh/ftp/etc) ...

The point: before accusing "hub clients" of loading the servers, realize
that ~50% of the resources are used by postgresql.org, not by clients ...


----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Just to give an idea ...

From
"Gavin M. Roy"
Date:
If we want to offset the load by moving pgfoundry to one of my servers
temporarily until you guys get your new servers and what not in place,
I'd be happy to host it, if it helps.

Gavin

On Sat, 5 Jun 2004 11:19:04 -0300 (ADT), Marc G. Fournier
<scrappy@postgresql.org> wrote:
>
>
> Top ten processes on mars (where pgfoundry and archives are located):
>
> USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
> www     41463 17.3  0.1 12364 5808  ??  SJ   11:00AM   0:05.75 /usr/local/sbin/httpd
> www     20583 13.5  0.1 12688 6172  ??  RJ    2:00PM 195:45.65 /usr/local/sbin/httpd
> www     41053 12.2  0.1 12372 5840  ??  SJ   10:56AM   0:28.14 /usr/local/sbin/httpd
> www     56570 12.3  0.3 21336 12012  ??  RJ   Wed11PM 264:32.06 /usr/local/sbin/httpd
> www     56572 12.5  0.3 21388 12072  ??  RJ   Wed11PM 258:55.67 /usr/local/sbin/httpd
> www     41792 11.9  0.1 12368 5800  ??  SJ   11:00AM   0:01.19 /usr/local/sbin/httpd
> www     41778 10.5  0.1 12376 5824  ??  SJ   11:00AM   0:04.47 /usr/local/sbin/httpd
> www     20582  9.0  0.2 13192 6752  ??  SJ    2:00PM 195:51.63 /usr/local/sbin/httpd
> 60      41798  6.6  0.1 21012 2296  ??  SJ   11:00AM   0:00.38 lmtpd
> scrappy 41795  6.4  0.0  2220 1408  ??  SJ   11:00AM   0:00.47 cleanup -z -t unix -u
>
> and, process map'ng to the VMs:
>
> # cat /proc/{41463,20583,41053,56570,56572,41792,41778,20582,41798,41795}/status
> cat: /proc/41463/status: No such file or directory
> httpd 11651,690738 nochan 80 80 80,80,80 svr5.postgresql.org
> httpd 41,778660 select 80 80 80,80,80 svr5.postgresql.org
> httpd 15453,319902 nochan 80 80 80,80,80 pgfoundry.org
> httpd 15130,154111 nochan 80 80 80,80,80 pgfoundry.org
> cat: /proc/41792/status: No such file or directory
> cat: /proc/41778/status: No such file or directory
> httpd 11656,301638 nochan 80 80 80,80,80 svr5.postgresql.org
> lmtpd 0,365833 select 60 60 60,60,60 up4.com
> cleanup 0,462244 select 1001 1001 1001,1001,1001,6 up4.com
>
> and all the high %CPU ones are postgresql.org related ...
>
> also, postgresql.org generates ~298GB of traffic per month, which accounts
> for ~56% of all traffic out of the servers (and that isn't including what I
> put onto our offsite server for ftp/bittorrent, it used to be ~75% of the
> traffic), while hub.org (non client) generates ~175GB (or ~33%) ...
>
> Most (if not 90%) of the paying clients are static pages that are lucky to
> generate 100MB of traffic (and note that traffic is all traffic,
> mail/web/ssh/ftp/etc) ...
>
> The point: before accusing "hub clients" of loading the servers, realize
> that ~50% of the resources are used by postgresql.org, not by clients ...
>
> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>

Re: Just to give an idea ...

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Marc G. Fournier
> Sent: 05 June 2004 15:19
> To: pgsql-www@postgresql.org
> Subject: [pgsql-www] Just to give an idea ...
>
>
> Top ten processes on mars (where pgfoundry and archives are located):
>
> USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
> www     41463 17.3  0.1 12364 5808  ??  SJ   11:00AM   0:05.75 /usr/local/sbin/httpd

Ignoring mars and considering jupiter which is suffering from similar
problems (which you told me is not the database as I had suspected),
what do you think is actually causing this excess load? None of the
backend web code should be sufficiently complex to see the sort of loads
we seem to be seeing, even under the load the site gets. I've
apache-benched far more complex stuff on far less hardware and not
suffered like this. Geez, the vast majority of our PHP code simply does
include()'s and echo()'s, and even that only normally gets executed once
per hour - the users read static html for the most part!

I can only imagine that somewhere in the code there is a serious error
that is causing this... Or jupiter is actually a Sinclair ZX Spectrum
:-)

/D

Re: Just to give an idea ...

From
"Marc G. Fournier"
Date:
On Sat, 5 Jun 2004, Dave Page wrote:

>
>
>> -----Original Message-----
>> From: pgsql-www-owner@postgresql.org
>> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Marc G. Fournier
>> Sent: 05 June 2004 15:19
>> To: pgsql-www@postgresql.org
>> Subject: [pgsql-www] Just to give an idea ...
>>
>>
>> Top ten processes on mars (where pgfoundry and archives are located):
>>
>> USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
>> www     41463 17.3  0.1 12364 5808  ??  SJ   11:00AM   0:05.75 /usr/local/sbin/httpd
>
> Ignoring mars and considering jupiter which is suffering from similar
> problems (which you told me is not the database as I had suspected),
> what do you think is actually causing this excess load? None of the
> backend web code should be sufficiently complex to see the sort of loads
> we seem to be seeing, even under the load the site gets. I've
> apache-benched far more complex stuff on far less hardware and not
> suffered like this. Geez, the vast majority of our PHP code simply does
> include()'s and echo()'s, and even that only normally gets executed once
> per hour - the users read static html for the most part!

k, www.* is on pluto, not on mars or jupiter, and tends to stay pretty
constant (occasional spikes) ...

jupiter isn't web related load, but mail ...

mars is a combination of things ...

note that the point of my post was that Josh's opinion appears to be that
"hub clients" are causing load issues on the servers, which is not
accurate ... most of our "big clients" are running on neptune, where no
postgresql.org VM is running, and the load on that machine *rarely* goes
above 5, and when it does, I just need to kill off one of the aspseek
processes and it drops back down again :)

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Just to give an idea ...

From
"Dave Page"
Date:

> -----Original Message-----
> From: Marc G. Fournier [mailto:scrappy@postgresql.org]
> Sent: 05 June 2004 22:35
> To: Dave Page
> Cc: Marc G. Fournier; pgsql-www@postgresql.org
> Subject: RE: [pgsql-www] Just to give an idea ...
>
> On Sat, 5 Jun 2004, Dave Page wrote:
>
> k, www.* is on pluto, not on mars or jupiter, and tends to
> stay pretty constant (occasional spikes) ...

OK, wrong planet - never was much good at astronomy...

> jupiter isn't web related load, but mail ...
>
> mars is a combination of things ...
>
> note that the point of my post was that Josh's opinion
> appears to be that "hub clients" are causing load issues on
> the servers, which is not accurate ...

Yeah, I got that and don't disagree. I'm having a *very* hard job
understanding what we're doing that is maxing out such a server. I could
understand it if it was neptune (where the DBs are for those that don't
know). Remember a few days back I pointed you to
http://www.postgresql.org/index.php which just timed out after 5 minutes
or so whilst http://www.postgresql.org/index.html loaded in a couple of
seconds? Well, at the same time, the admin pages which access the DB and
are written in PHP were also working just fine. The major differences
between index.php and the admin pages are the styling, the number of db
accesses (1 for an admin page, maybe 4 for index.php), and the banner
ads (which aren't affecting it as the page will load even if they are
broken). None of that *should* cause such a massive performance problem
on a dual xeon or even PIII server, especially when the vast majority of
load is users accessing static HTML (which works just fine).
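
One way to make that comparison repeatable is to time each URL the same way. The following is only a sketch under assumptions: the helper names, the 30 second timeout, and the injectable fetch function are all illustrative and are not part of the site code.

```python
import time
import urllib.request


def http_get(url, timeout):
    """Fetch a URL and return the number of body bytes read."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def time_fetch(url, fetch=http_get, timeout=30.0):
    """Time one request; network errors and timeouts are recorded, not raised."""
    start = time.perf_counter()
    try:
        size = fetch(url, timeout)
        ok = True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        size, ok = 0, False
    return {
        "url": url,
        "ok": ok,
        "bytes": size,
        "seconds": time.perf_counter() - start,
    }
```

Calling time_fetch on index.php, index.html, and one of the admin pages in turn, a few times each, would show whether the slowdown is specific to one page or to PHP in general.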

So it seems to me that there is something broken in PHP causing these
load problems that we are only hitting in certain circumstances.

BTW, index.php is working just fine right now :-(

> most of our "big
> clients" are running on neptune, where no postgresql.org VM
> is running, and the load on that machine *rarely* goes above
> 5, and when it does, I just need to kill off one of the
> aspseek processes and it drops back down again :)

Don't kill the indexer processes unless absolutely necessary - that can
screw the database up. Use '/usr/local/aspseek/bin/indexer -E' to safely
terminate the running indexers. It can take a little while to shut them
down...
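
If it helps to script that, here is a minimal Python sketch that wraps the command above. The indexer path and the -E flag are the ones quoted in this message; the function names, the injectable runner, and the 600 second timeout are assumptions for illustration only.

```python
import subprocess

# Path quoted in the message above.
INDEXER = "/usr/local/aspseek/bin/indexer"


def shutdown_command(indexer=INDEXER):
    """Build the argument list for a safe indexer shutdown."""
    return [indexer, "-E"]


def shutdown_indexers(run=subprocess.run, timeout=600):
    """Ask the running indexers to terminate and wait for the command to finish.

    The timeout is a guess; shutting the indexers down can take a while.
    """
    return run(shutdown_command(), check=True, timeout=timeout)
```

Passing check=True means a non-zero exit status raises an exception, so a failed shutdown is not silently ignored.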

Regards, Dave.

Re: Just to give an idea ...

From
"Marc G. Fournier"
Date:
On Sat, 5 Jun 2004, Dave Page wrote:

> So it seems to me that there is something broken in PHP causing these
> load problems that we are only hitting in certain circumstances.
>
> BTW, index.php is working just fine right now :-(

'k, keep an eye on that and see how it runs ... I just upgraded PHP on
the template to 4.3.7, which:

     Fixed a number of crashes inside pgsql, cpdf and gd extensions.

now, granted, the pages have been working, but I'm curious whether some of
those bugs manifested not as crashes, but as slowdowns ...

> Don't kill the indexer processes unless absolutely necessary - that can
> screw the database up. Use '/usr/local/aspseek/bin/indexer -E' to safely
> terminate the running indexers. It can take a little while to shut them
> down...

k, just added a 'shutdown-aspseek' alias to the VM, so that it's documented
*somewhere* ;)

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Just to give an idea ...

From
"Dave Page"
Date:

> -----Original Message-----
> From: Marc G. Fournier [mailto:scrappy@postgresql.org]
> Sent: 06 June 2004 02:26
> To: Dave Page
> Cc: pgsql-www@postgresql.org
> Subject: RE: [pgsql-www] Just to give an idea ...
>
> On Sat, 5 Jun 2004, Dave Page wrote:
>
> > So it seems to me that there is something broken in PHP causing these
> > load problems that we are only hitting in certain circumstances.
> >
> > BTW, index.php is working just fine right now :-(
>
> 'k, keep an eye on that and see how it runs ... I just
> upgraded PHP on the template to 4.3.7, which:
>
>      Fixed a number of crashes inside pgsql, cpdf and gd extensions.
>
> now, granted, the pages have been working, but am curious as
> to whether or not some weren't manifested as crashes, but
> slow downs ...

OK, that's a possibility. Often my Squid proxy reported failed pages as
'zero sized replies' which look like timeouts as it waits for data. I
guess these could easily have been crashes.

> > Don't kill the indexer processes unless absolutely necessary - that
> > can screw the database up. Use '/usr/local/aspseek/bin/indexer -E' to
> > safely terminate the running indexers. It can take a little while to
> > shut them down...
>
> k, just added a 'shutdown-aspseek' alias to the VM, so that
> it's documented *somewhere* ;)

Fyi: the rest of the docs are at http://www.aspseek.org/.

/D

Re: Just to give an idea ...

From
Josh Berkus
Date:
Marc,

> note that the point of my post was that Josh's opinion appears to be that
> "hub clients" are causing load issues on the servers, which is not
> accurate ...

So, if it's not Hub.org clients, how about doing a little sleuthing on what
*is* causing the resource drain?

Nobody but you has root access to the real server, Marc, so nobody but you can
diagnose why so many of the PostgreSQL.org sites ... including
pgFoundry.org ... are behaving like they're under constant DDOS attack.   If
nothing else, I'd think that you'd be getting complaints from your paying
clients about this!

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Just to give an idea ...

From
"Marc G. Fournier"
Date:
On Sun, 6 Jun 2004, Josh Berkus wrote:

> Marc,
>
>> note that the point of my post was that Josh's opinion appears to be that
>> "hub clients" are causing load issues on the servers, which is not
>> accurate ...
>
> So, if it's not Hub.org clients, how about doing a little sleuthing on
> what *is* causing the resource drain?
>
> Nobody but you has root access to the real server, Marc, so nobody but
> you can diagnose why so many of the PostgreSQL.org sites ... including
> pgFoundry.org ... are behaving like they're under constant DDOS attack.
> If nothing else, I'd think that you'd be getting complaints from your
> paying clients about this!

Truth be told ... only clients that notice anything are those running
webmail, and even on a 'dedicated machine' I find webmail to be dog slow
...

But, as far as pgfoundry.org is concerned, as was mentioned on the
gforge-admins list, Andrew got Jan to take a look through the tables, and
apparently there are few, if any, indices on them, which could account for
how slow things look ...
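
To illustrate why missing indices matter, here is a small self-contained Python sketch. It uses SQLite from the standard library as a stand-in for PostgreSQL, and the table, column, and index names are hypothetical rather than taken from the pgfoundry schema. It only shows how the query plan changes once an index exists.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE artifact ("
        "artifact_id INTEGER PRIMARY KEY, group_id INTEGER, summary TEXT)"
    )
    conn.executemany(
        "INSERT INTO artifact (group_id, summary) VALUES (?, ?)",
        [(i % 100, f"item {i}") for i in range(10_000)],
    )
    sql = "SELECT * FROM artifact WHERE group_id = 42"

    before = query_plan(conn, sql)  # no index yet: full table scan
    conn.execute("CREATE INDEX artifact_group_idx ON artifact (group_id)")
    after = query_plan(conn, sql)   # index available: lookup by group_id
    conn.close()
    return before, after
```

Before the index is created the plan is a scan of the whole table; afterwards it is a search using artifact_group_idx. In PostgreSQL the equivalent check would be done with EXPLAIN on the real queries.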

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Just to give an idea ...

From
Josh Berkus
Date:
Marc,

> Truth be told ... only clients that notice anything are those running
> webmail, and even on a 'dedicated machine' I find webmail to be dog slow

Well ... that's Hordemail for you.   They packed it full of features, but
didn't ever troubleshoot its performance issues.    A lot of the slowness of
Hordemail is the truly vast number of tiny images on the screen to make it
look "snazzy".

Communigate Pro webmail is a *lot* faster.   They achieved this by providing
only the essential features, and a bare-bones interface.  And, of course,
it's a commercial product (although very OSS-friendly).

> But, as far as pgfoundry.org is concerned, as was mentioned on the
> gforge-admins list, Andrew got Jan to take a look through the tables, and
> apparently there are few, if any, indices on them, which could account for
> how slow things look ...

Oh, I thought that had been fixed.   Will fix as soon as I find my login info
again.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Just to give an idea ...

From
"Marc G. Fournier"
Date:
On Sat, 5 Jun 2004, Gavin M. Roy wrote:

> If we want to offset the load by moving pgfoundry to one of my servers
> temporarily until you guys get your new servers and what not in place,
> I'd be happy to host it, if it helps.

altho that is an option, I made some changes to the servers this weekend
that seem to be holding load down ... we'll see how the 'weekday traffic'
changes that ...

but, also, Jan Wieck took a scan through the schema for pgfoundry at
Andrew's request, and it looks like there are several areas for
improvement there ;(

>
> Gavin
>
> On Sat, 5 Jun 2004 11:19:04 -0300 (ADT), Marc G. Fournier
> <scrappy@postgresql.org> wrote:
>>
>>
>> Top ten processes on mars (where pgfoundry and archives are located):
>>
>> USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
>> www     41463 17.3  0.1 12364 5808  ??  SJ   11:00AM   0:05.75 /usr/local/sbin/httpd
>> www     20583 13.5  0.1 12688 6172  ??  RJ    2:00PM 195:45.65 /usr/local/sbin/httpd
>> www     41053 12.2  0.1 12372 5840  ??  SJ   10:56AM   0:28.14 /usr/local/sbin/httpd
>> www     56570 12.3  0.3 21336 12012  ??  RJ   Wed11PM 264:32.06 /usr/local/sbin/httpd
>> www     56572 12.5  0.3 21388 12072  ??  RJ   Wed11PM 258:55.67 /usr/local/sbin/httpd
>> www     41792 11.9  0.1 12368 5800  ??  SJ   11:00AM   0:01.19 /usr/local/sbin/httpd
>> www     41778 10.5  0.1 12376 5824  ??  SJ   11:00AM   0:04.47 /usr/local/sbin/httpd
>> www     20582  9.0  0.2 13192 6752  ??  SJ    2:00PM 195:51.63 /usr/local/sbin/httpd
>> 60      41798  6.6  0.1 21012 2296  ??  SJ   11:00AM   0:00.38 lmtpd
>> scrappy 41795  6.4  0.0  2220 1408  ??  SJ   11:00AM   0:00.47 cleanup -z -t unix -u
>>
>> and, process map'ng to the VMs:
>>
>> # cat /proc/{41463,20583,41053,56570,56572,41792,41778,20582,41798,41795}/status
>> cat: /proc/41463/status: No such file or directory
>> httpd 11651,690738 nochan 80 80 80,80,80 svr5.postgresql.org
>> httpd 41,778660 select 80 80 80,80,80 svr5.postgresql.org
>> httpd 15453,319902 nochan 80 80 80,80,80 pgfoundry.org
>> httpd 15130,154111 nochan 80 80 80,80,80 pgfoundry.org
>> cat: /proc/41792/status: No such file or directory
>> cat: /proc/41778/status: No such file or directory
>> httpd 11656,301638 nochan 80 80 80,80,80 svr5.postgresql.org
>> lmtpd 0,365833 select 60 60 60,60,60 up4.com
>> cleanup 0,462244 select 1001 1001 1001,1001,1001,6 up4.com
>>
>> and all the high %CPU ones are postgresql.org related ...
>>
>> also, postgresql.org generates ~298GB of traffic per month, which accounts
>> for ~56% of all traffic out of the servers (and that isn't including what I
>> put onto our offsite server for ftp/bittorrent, it used to be ~75% of the
>> traffic), while hub.org (non client) generates ~175GB (or ~33%) ...
>>
>> Most (if not 90%) of the paying clients are static pages that are lucky to
>> generate 100MB of traffic (and note that traffic is all traffic,
>> mail/web/ssh/ftp/etc) ...
>>
>> The point: before accusing "hub clients" of loading the servers, realize
>> that ~50% of the resources are used by postgresql.org, not by clients ...
>>
>> ----
>> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
>> Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
>>
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664