Thread: Linux TOP

Linux TOP

From
Waldomiro
Date:
Hi,

I ran the "top" command on one of my database servers:

top - 16:16:30 up 42 days, 13:23,  4 users,  load average: 3.13, 3.52, 3.36
Tasks: 624 total,   1 running, 623 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.4%us,  1.1%sy,  0.0%ni, 84.4%id, 12.9%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached

I'm afraid of two things: one is the "load average", I think 3
is too much; the other is the "swap", almost 4GB of swap, I think that is
too much swap.

Am I right?

Can I use those indicators to know if my database is ok?

Thanks

Waldomiro

Re: Linux TOP

From
Rich Shepard
Date:
On Wed, 21 Oct 2009, Waldomiro wrote:

> I'm afraid of two things, one is the "load average", I think 3 is too
> much, another is the "swap", almost 4GB of swap, I think that is too much
> swap.
>
> Am I right?

   Not necessarily.

> Can I use those indicators to know if my database is ok?

   Perhaps.

   Google is your friend. For example, enter the search term "linux load
averages" and one of the first hits is:

<http://www.lifeaftercoffee.com/2006/03/13/unix-load-averages-explained/>

   Do the same thing to understand swap.

Rich

Re: Linux TOP

From
Greg Smith
Date:
On Wed, 21 Oct 2009, Waldomiro wrote:

> top - 16:16:30 up 42 days, 13:23,  4 users,  load average: 3.13, 3.52, 3.36
> Cpu(s):  1.4%us,  1.1%sy,  0.0%ni, 84.4%id, 12.9%wa,  0.0%hi,  0.2%si,  0.0%st
> Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
> Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached
>
> I'm afraid of two things, one is the "load average", I think 3 is too much

You're at 12.9% waiting for I/O and 84.4% idle.  That means your average
load consists of three processes who are stuck waiting for I/O at any
given time.  The I/O is what you should be worried about, not the load
average.
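A rough way to check this, sketched in Python: on Linux the load average counts processes that are runnable (state R) plus those in uninterruptible sleep (state D, usually blocked on disk I/O). The helpers below tally the state letters that `ps -eo state=` prints. The function names and the sample input are made up for illustration, not taken from this server.

```python
from collections import Counter

def count_states(ps_output: str) -> Counter:
    """Tally the one-letter process states printed by `ps -eo state=`."""
    return Counter(
        line.strip()[0] for line in ps_output.splitlines() if line.strip()
    )

def load_contributors(states: Counter) -> int:
    """Processes Linux counts toward the load average: R (runnable) plus D."""
    return states["R"] + states["D"]

# Hypothetical sample: one running process, three blocked on I/O, two asleep.
SAMPLE_STATES = "S\nD\nD\nR\nS\nD\n"

states = count_states(SAMPLE_STATES)
print(dict(states), "contributing to load:", load_contributors(states))
```

If most of the contributors are in state D rather than R, the load is coming from I/O waits and not from CPU demand.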

> another is the "swap", almost 4GB of swap, I think that is too much
> swap.

It does look like your server is using much more RAM than it actually has,
which is the likely reason for all the disk I/O.  If you sort the top
output by memory, you might see why that is.

The information provided by top on Linux isn't very good, though; take a
look at /proc/meminfo for more details.  Rather than rely on top's math,
I usually capture the output from:

ps -e -o pid,rss,vsz,size,user,cmd

and add things up myself, taking into account the shared memory that each
of the PostgreSQL processes includes.
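The adding up could look something like the Python sketch below, which sums the RSS column per user from captured ps output. The function name and the sample rows are invented for illustration. Note that the raw sum counts shared memory once per process, so it overstates real usage for PostgreSQL backends; the sketch does not attempt that correction.

```python
from collections import defaultdict

def rss_by_user(ps_output: str) -> dict:
    """Sum RSS (in kB) per user from `ps -e -o pid,rss,vsz,size,user,cmd`."""
    totals = defaultdict(int)
    for line in ps_output.splitlines():
        fields = line.split(None, 5)
        # Skip blank lines and the header row (its first field is "PID").
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        totals[fields[4]] += int(fields[1])
    return dict(totals)

# Hypothetical sample output, not from the server discussed in this thread.
SAMPLE_PS = """\
  PID   RSS    VSZ  SIZE USER     CMD
    1   800  10368   300 root     /sbin/init
 2301 51200 420000  9000 postgres postgres: writer process
 2302 40960 421000  9500 postgres postgres: app db 10.0.0.5 idle
"""

for user, kb in sorted(rss_by_user(SAMPLE_PS).items()):
    print(f"{user:10s} {kb:>10d} kB")
```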

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux TOP

From
Scott Marlowe
Date:
On Wed, Oct 21, 2009 at 4:01 PM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Wed, 21 Oct 2009, Waldomiro wrote:
>
>> top - 16:16:30 up 42 days, 13:23,  4 users,  load average: 3.13, 3.52,
>> 3.36
>> Cpu(s):  1.4%us,  1.1%sy,  0.0%ni, 84.4%id, 12.9%wa,  0.0%hi,  0.2%si,
>> 0.0%st
>> Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
>> Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached
>>
>> I'm afraid of two things, one is the "load average", I think 3 is too much
>
> You're at 12.9% waiting for I/O and 84.4% idle.  That means your average
> load consists of three processes who are stuck waiting for I/O at any given
> time.  The I/O is what you should be worried about, not the load average.
>
>> another is the "swap", almost 4GB of swap, I think that is too much swap.
>
> It does look like your server is using much more RAM than it actually has,
> which is the likely reason for all the disk I/O.  If you sort the top output
> by memory, you might see why that is.

This is a common misunderstanding.

In this:

 Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
 Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached

The 6.2G cached is considered part of the 16G used.

So it's not using more memory than it has.  It's just that the accounting
isn't obvious.

Re: Linux TOP

From
Tom Lane
Date:
Scott Marlowe <scott.marlowe@gmail.com> writes:
> In this:

>  Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
>  Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached

> The 6.2G cached is considered part of the 16G used

> So it's not using more memory than it has.  It's just the accounting
> is inobvious.

Right, but it still appears that there's something close to 14G of
actual memory use (exclusive of kernel disk buffers).  If that's
the true requirement of the set of processes being run, 16G of RAM
is pretty darn marginal, and he should go buy more.  But first it
would be prudent to find out where the memory is going.  Also, one
thing I'd do immediately is to watch "vmstat 1" for a while to see if
there's a lot of swap activity.  If that's where the I/O is going,
it'd be another signal that memory pressure is the real issue.
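One reading of the "close to 14G" estimate can be reproduced from the two quoted top lines: take "used", subtract buffers and cache, then add the swap in use. The Python sketch below does that arithmetic; the function names are made up for illustration, and whether this is exactly how the figure was derived is an assumption.

```python
import re

TOP_LINES = """\
Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached
"""

def parse_top_memory(text: str) -> dict:
    """Return {'Mem': {...}, 'Swap': {...}} with values in kB."""
    result = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        result[label.strip()] = {
            name: int(kb) for kb, name in re.findall(r"(\d+)k (\w+)", rest)
        }
    return result

def memory_summary(text: str) -> dict:
    """Memory held by processes, excluding kernel buffers and page cache."""
    parsed = parse_top_memory(text)
    mem, swap = parsed["Mem"], parsed["Swap"]
    # top prints "cached" on the Swap line, but it is page cache held in RAM.
    in_ram = mem["used"] - mem["buffers"] - swap["cached"]
    return {"in_ram_kb": in_ram, "with_swap_kb": in_ram + swap["used"]}

print(memory_summary(TOP_LINES))
```

For the figures in this thread that gives 10086672 kB resident outside buffers and cache, and 13767532 kB once the used swap is added, which is roughly 14 million kB against 16432240 kB of RAM.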

            regards, tom lane

Re: Linux TOP

From
Greg Smith
Date:
On Wed, 21 Oct 2009, Scott Marlowe wrote:

> In this:
>
> Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
> Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached
>
> The 6.2G cached is considered part of the 16G used
>
> So it's not using more memory than it has.  It's just the accounting
> is inobvious.

This is a snapshot.  The fact that 3.7GB of swap is used here suggests
there may have been more memory used at some point in the past than we're
seeing now; that's more what I was commenting on.  A look at the si/so
figures in vmstat should nail down whether that's still going on, as Tom
already suggested.
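For anyone who wants to pull those two columns out of a captured "vmstat 1" run, here is a small Python sketch. si is memory swapped in from disk and so is memory swapped out, both per second. The function names and the sample rows are invented for illustration; the numbers are not from this server.

```python
def swap_activity(vmstat_output: str) -> list:
    """Return (si, so) pairs, one per data row of vmstat output."""
    columns = None
    rows = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            columns = fields  # column-name row; vmstat repeats it periodically
        elif columns and len(fields) == len(columns) and fields[0].isdigit():
            row = dict(zip(columns, map(int, fields)))
            rows.append((row["si"], row["so"]))
    return rows

def is_swapping(rows: list) -> bool:
    """True if any sample shows pages moving to or from swap."""
    return any(si or so for si, so in rows)

# Hypothetical capture of two samples.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  3 3680860  87644  27548 6230376    0    0   900   300 1200 2500  1  1 84 13  0
 1  2 3680900  86000  27548 6230000  120  340  1500   800 1300 2600  2  1 80 17  0
"""

rows = swap_activity(SAMPLE_VMSTAT)
print(rows, "swapping now:", is_swapping(rows))
```

Sustained nonzero si/so values would point at ongoing swapping; all zeros would suggest the used swap is left over from an earlier spike.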

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux TOP

From
Scott Marlowe
Date:
On Wed, Oct 21, 2009 at 4:25 PM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Wed, 21 Oct 2009, Scott Marlowe wrote:
>
>> In this:
>>
>> Mem:  16432240k total, 16344596k used,    87644k free,    27548k buffers
>> Swap: 10241428k total,  3680860k used,  6560568k free,  6230376k cached
>>
>> The 6.2G cached is considered part of the 16G used
>>
>> So it's not using more memory than it has.  It's just the accounting
>> is inobvious.
>
> This is a snapshot.  The fact that 3.7GB of swap is used here suggests there
> may have been more memory used at some point in the past than we're seeing
> now; that's more what I was commenting on.  A look at the si/so figures in
> vmstat should nail down whether that's still going on or not now, as Tom
> already suggested.

Definitely.  I'm not arguing that the guy doesn't have problems, just that
the way top accounts for memory is rather misleading for most folks.