Thread: processor running queue - general rule of thumb?

processor running queue - general rule of thumb?

From
Alan McKay
Date:
Hey folks,

I'm new to all this stuff, and am sitting here with kSar looking at
some graphed results of some load tests we did, trying to figure
things out :-)

We got some unsatisfactory results in stressing our system, and now I
have to divine where the bottleneck is.

We did 4 tests, upping the load each time.   The 3rd and 4th ones have
all 8 cores pegged at about 95%.  Yikes!

In the first test the processor running queue spikes at 7 and maybe
averages 4 or 5

In the last test it spikes at 33 with an average maybe 25.

Looks to me like it could be a CPU bottleneck.  But I'm new at this :-)

Is there a general rule of thumb "if queue is longer than X, it is
likely a bottleneck?"

In reading an IBM Redbook on Linux performance, I also see this :
"High numbers of context switches in connection with a large number of
interrupts can signal driver or application issues."

On my first test where the CPU is not pegged, context switching goes
from about 3700 to about 4900, maybe averaging 4100

On the pegged test, the values are maybe 10% higher than that, maybe 15%.

It is an IBM 3550 with 8 cores, 2660.134 MHz (from dmesg), 32Gigs RAM

thanks,
-Alan

--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: processor running queue - general rule of thumb?

From
justin
Date:
Alan McKay wrote:
> Hey folks,
> We did 4 tests, upping the load each time.   The 3rd and 4th ones have
> all 8 cores pegged at about 95%.  Yikes!
>
> In the first test the processor running queue spikes at 7 and maybe
> averages 4 or 5
>
> In the last test it spikes at 33 with an average maybe 25.
>
> Looks to me like it could be a CPU bottleneck.  But I'm new at this :-)
>
> Is there a general rule of thumb "if queue is longer than X, it is
> likely a bottleneck?"
>
> In reading an IBM Redbook on Linux performance, I also see this :
> "High numbers of context switches in connection with a large number of
> interrupts can signal driver or application issues."
>
> On my first test where the CPU is not pegged, context switching goes
> from about 3700 to about 4900, maybe averaging 4100
>
>
>

Well, the people here will need a lot more information to figure out what
is going on.

What kind of stress test did you run? Is a specific query causing the
problem in the test?
What kind of load?
How many simulated clients?
How big is the database?

We need to see your postgresql.conf.

What kind of I/O subsystem do you have?
What does vmstat show?

Have you looked at the wiki yet?
http://wiki.postgresql.org/wiki/Performance_Optimization
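
For anyone gathering the numbers asked for above by hand, the sketch below samples two Linux kernel counters from /proc/stat: `ctxt` (total context switches since boot) and `procs_running` (tasks currently runnable). It is only one possible way to collect these values, not a description of what vmstat or sar do internally. The helper names are hypothetical, and the live sampling part only works on Linux.

```python
import os
import time


def parse_proc_stat(text):
    """Extract the ctxt and procs_running counters from /proc/stat contents."""
    wanted = {"ctxt", "procs_running"}
    values = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in wanted:
            values[fields[0]] = int(fields[1])
    return values


def context_switch_rate(before, after, seconds):
    """Context switches per second between two parsed snapshots."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (after["ctxt"] - before["ctxt"]) / seconds


def sample(path="/proc/stat"):
    """Read and parse one snapshot (Linux only)."""
    with open(path) as f:
        return parse_proc_stat(f.read())


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    interval = 1  # seconds between snapshots; an arbitrary choice
    first = sample()
    time.sleep(interval)
    second = sample()
    rate = context_switch_rate(first, second, interval)
    print(f"context switches/s: {rate:.0f}")
    print(f"runnable tasks now: {second['procs_running']}")
```

Note that `procs_running` is an instantaneous value and includes the sampling process itself, so a single reading says little; it is the trend over a test run that matters.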



Re: processor running queue - general rule of thumb?

From
Scott Marlowe
Date:
On Fri, Jun 19, 2009 at 9:59 AM, Alan McKay<alan.mckay@gmail.com> wrote:
> Hey folks,
>
> I'm new to all this stuff, and am sitting here with kSar looking at
> some graphed results of some load tests we did, trying to figure
> things out :-)
>
> We got some unsatisfactory results in stressing our system, and now I
> have to divine where the bottleneck is.
>
> We did 4 tests, upping the load each time.   The 3rd and 4th ones have
> all 8 cores pegged at about 95%.  Yikes!
>
> In the first test the processor running queue spikes at 7 and maybe
> averages 4 or 5
>
> In the last test it spikes at 33 with an average maybe 25.
>
> Looks to me like it could be a CPU bottleneck.  But I'm new at this :-)
>
> Is there a general rule of thumb "if queue is longer than X, it is
> likely a bottleneck?"
>
> In reading an IBM Redbook on Linux performance, I also see this :
> "High numbers of context switches in connection with a large number of
> interrupts can signal driver or application issues."
>
> On my first test where the CPU is not pegged, context switching goes
> from about 3700 to about 4900, maybe averaging 4100

That's not too bad.  If you see them in the 30k to 150k range, then
worry about it.

> On the pegged test, the values are maybe 10% higher than that, maybe 15%.

That's especially good news.  Normally when you've got a problem, it
will increase in a geometric (or worse) way.
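
As a rough check of the numbers quoted above, the sketch below compares the context-switch rates from the two tests. The 4100/s average and the "maybe 15%" increase come from this thread. The function name and the use of 30,000/s as the point to start worrying (the low end of the range mentioned above) are assumptions made for illustration.

```python
# Low end of the "30k to 150k" range; using it as a cutoff is an assumption.
WORRY_THRESHOLD = 30_000


def assess_context_switches(baseline_per_sec, loaded_per_sec,
                            threshold=WORRY_THRESHOLD):
    """Return (growth_ratio, exceeds_threshold) for two context-switch rates."""
    if baseline_per_sec <= 0:
        raise ValueError("baseline must be positive")
    growth = loaded_per_sec / baseline_per_sec
    return growth, loaded_per_sec >= threshold


baseline = 4100           # approximate average from the first test
loaded = baseline * 1.15  # "maybe 15%" higher on the pegged test
growth, worrying = assess_context_switches(baseline, loaded)
print(f"growth: {growth:.2f}x, above threshold: {worrying}")
```

A growth ratio close to 1 while the load went up several times is the opposite of the geometric blow-up described above.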

> It is an IBM 3550 with 8 cores, 2660.134 MHz (from dmesg), 32Gigs RAM

Like the other poster said, we likely don't have enough to tell you
what's going on, but from what you've said here it sounds like you're
mostly just CPU bound.  Assuming you're reading the output of vmstat
and top and other tools like that.
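
One way to put the run-queue figures in context is to divide them by the number of cores. This is a minimal sketch, not a definitive rule. The 8 cores and the averages come from the original post (4.5 is taken as the midpoint of "4 or 5"). The function name is hypothetical. The reading that a sustained value above roughly 1 runnable task per core means tasks are waiting for CPU is a commonly used convention assumed here, not something stated in this thread.

```python
def run_queue_per_core(run_queue_length, cores):
    """Normalize a run-queue length by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return run_queue_length / cores


CORES = 8  # from the original post
averages = {
    "first test": 4.5,  # midpoint of "averages 4 or 5"
    "last test": 25,    # "average maybe 25"
}

for label, queue_length in averages.items():
    ratio = run_queue_per_core(queue_length, CORES)
    # Assumed convention: more than ~1 runnable task per core means queuing.
    status = "tasks likely waiting for CPU" if ratio > 1.0 else "cores keeping up"
    print(f"{label}: {ratio:.2f} per core ({status})")
```

Whether the reported queue length counts tasks already running on a CPU depends on the tool, so the cutoff should be treated as approximate.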

Re: processor running queue - general rule of thumb?

From
Alan McKay
Date:
> Like the other poster said, we likely don't have enough to tell you
> what's going on, but from what you've said here it sounds like you're
> mostly just CPU bound.  Assuming you're reading the output of vmstat
> and top and other tools like that.

Thanks.  I used 'sadc' from the sysstat RPM (part of the sar suite) to
collect data, and it does collect VM and other data like what you get
from top and vmstat.

I did not see any irregular activity in those areas.

I realise I did not give you all enough details, which is why I worded
my question the way I did: "is there a general rule of thumb for
running queue?"





Re: processor running queue - general rule of thumb?

From
Alan McKay
Date:
BTW, our designer ran the NYTProf profiler for Perl (Devel::NYTProf)
and found that the problem was the POE library that was being used as
a state machine to drive the whole load suite.   It was taking
something like 95% of the CPU time!

On Fri, Jun 19, 2009 at 11:59 AM, Alan McKay<alan.mckay@gmail.com> wrote:
> Hey folks,
>
> I'm new to all this stuff, and am sitting here with kSar looking at
> some graphed results of some load tests we did, trying to figure
> things out :-)
>
> We got some unsatisfactory results in stressing our system, and now I
> have to divine where the bottleneck is.
>
> We did 4 tests, upping the load each time.   The 3rd and 4th ones have
> all 8 cores pegged at about 95%.  Yikes!
>
> In the first test the processor running queue spikes at 7 and maybe
> averages 4 or 5
>
> In the last test it spikes at 33 with an average maybe 25.
>
> Looks to me like it could be a CPU bottleneck.  But I'm new at this :-)
>
> Is there a general rule of thumb "if queue is longer than X, it is
> likely a bottleneck?"
>
> In reading an IBM Redbook on Linux performance, I also see this :
> "High numbers of context switches in connection with a large number of
> interrupts can signal driver or application issues."
>
> On my first test where the CPU is not pegged, context switching goes
> from about 3700 to about 4900, maybe averaging 4100
>
> On the pegged test, the values are maybe 10% higher than that, maybe 15%.
>
> It is an IBM 3550 with 8 cores, 2660.134 MHz (from dmesg), 32Gigs RAM
>
> thanks,
> -Alan
>
> --
> “Don't eat anything you've ever seen advertised on TV”
>         - Michael Pollan, author of "In Defense of Food"
>


