Thread: Linux mis-reporting memory

Linux mis-reporting memory

From
Decibel!
Date:
Sorry, I know this is probably more a linux question, but I'm guessing
that others have run into this...

I'm finding this rather interesting report from top on a Debian box...

Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12492 postgres  15   0 8469m 8.0g 8.0g S    0 25.6   3:52.03 postmaster
 7820 postgres  16   0 8474m 4.7g 4.7g S    0 15.1   1:23.72 postmaster
21863 postgres  15   0 8472m 3.9g 3.9g S    0 12.4   0:30.61 postmaster
19893 postgres  15   0 8471m 2.4g 2.4g S    0  7.6   0:07.54 postmaster
20423 postgres  17   0 8472m 1.4g 1.4g S    0  4.4   0:04.61 postmaster
26395 postgres  15   0 8474m 1.1g 1.0g S    1  3.4   0:02.12 postmaster
12985 postgres  15   0 8472m 937m 930m S    0  2.9   0:05.50 postmaster
26806 postgres  15   0 8474m 787m 779m D    4  2.4   0:01.56 postmaster

This is a machine that's been up some time and the database is 400G, so
I'm pretty confident that shared_buffers (set to 8G) should be
completely full, and that's what that top process is indicating.

So how is it that linux thinks that 30G is cached?
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Re: Linux mis-reporting memory

From
Tom Lane
Date:
Decibel! <decibel@decibel.org> writes:
> I'm finding this rather interesting report from top on a Debian box...

> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached

> So how is it that linux thinks that 30G is cached?

Why would you think that a number reported by the operating system has
something to do with Postgres' shared memory?

I might be mistaken, but I think that in this report "cached" indicates
the amount of memory in use for kernel disk cache.  (No idea what the
separate "buffers" entry means, but it's obviously not all of the disk
buffers the kernel has got.)  It appears that the kernel is doing
exactly what it's supposed to do and using any not-currently-called-for
memory for disk cache ...

            regards, tom lane

Re: Linux mis-reporting memory

From
Adam Tauno Williams
Date:
> Sorry, I know this is probably more a linux question, but I'm guessing
> that others have run into this...
> I'm finding this rather interesting report from top on a Debian box...
> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached
> This is a machine that's been up some time and the database is 400G, so
> I'm pretty confident that shared_buffers (set to 8G) should be
> completely full, and that's what that top process is indicating.

Nope, use "ipcs" to show the allocated shared memory segments.

One of the better articles on Linux memory management:
http://virtualthreads.blogspot.com/2006/02/understanding-memory-usage-on-linux.html

--
Adam Tauno Williams, Network & Systems Administrator
Consultant - http://www.whitemiceconsulting.com
Developer - http://www.opengroupware.org


Re: Linux mis-reporting memory

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Decibel! <decibel@decibel.org> writes:
>> I'm finding this rather interesting report from top on a Debian box...
>
>> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
>> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached
>
>> So how is it that linux thinks that 30G is cached?
>
> Why would you think that a number reported by the operating system has
> something to do with Postgres' shared memory?

I think his question is how can the kernel be using 30G for kernel buffers if
it only has 32G total and 8G of that is taken up by Postgres's shared buffers.

It seems to imply Linux is paging out sysV shared memory. In fact some of
Heikki's tests here showed that Linux would do precisely that.

If your working set really is smaller than shared buffers then that's not so
bad. Those buffers really would be completely idle anyways.

But if your working set is larger than shared buffers and you're just not
thrashing it hard enough to keep it in RAM then it's really bad. The buffers
Linux will choose to page out are precisely those that Postgres will likely
choose shortly as victim buffers, forcing Linux to page them back in just so
Postgres can overwrite them.


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Linux mis-reporting memory

From
Csaba Nagy
Date:
On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote:
> >> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
> >> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached
> >
> It seems to imply Linux is paging out sysV shared memory. In fact some of
> Heikki's tests here showed that Linux would do precisely that.

But then why is it not reporting that in the "Swap: used" section ? It
only reports 42308k used swap.
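
A rough sketch of the arithmetic behind this objection, using the figures
from the first top report. Treating shared_buffers as exactly 8 GiB and
top's "k" as KiB are assumptions, and the function name is made up for
illustration.

```python
# Figures copied from the first top report in this thread (top's "k" units).
MEM_TOTAL_K = 32945280
CACHED_K = 30294300
SWAP_USED_K = 42308

# Assumption: shared_buffers = 8G means exactly 8 GiB, and the extra shared
# memory Postgres allocates besides the buffer pool is ignored.
SHARED_BUFFERS_K = 8 * 1024 * 1024


def overlap_k(total_k: int, cached_k: int, shared_k: int) -> int:
    """Amount by which cached + shared memory exceeds physical RAM.

    If both figures are taken at face value, at least this much must be
    either counted twice or not resident in RAM.
    """
    return max(0, cached_k + shared_k - total_k)


if __name__ == "__main__":
    excess = overlap_k(MEM_TOTAL_K, CACHED_K, SHARED_BUFFERS_K)
    print(f"cached + shared_buffers exceeds RAM by {excess}k")
    print(f"swap in use: {SWAP_USED_K}k")
    print("could swap alone explain it?", SWAP_USED_K >= excess)
```

The excess comes out at 5737628k (roughly 5.5 GiB), far more than the
42308k of swap in use, so paging to swap alone cannot explain the numbers.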

I have a box where I just ran select count(*) three times on a table
that is ~5.5 GB on disk, and the count executed in under 4 seconds,
which I take to mean it is all cached (shared_buffers is set to 12GB - I use
the box for testing for now, otherwise I would set it far lower because
I have had bad experiences setting it to more than 1/3 of the available
memory). Top reported at the end of the process:

Mem:  16510724k total, 16425252k used,    85472k free,    10144k buffers
Swap:  7815580k total,   157804k used,  7657776k free, 15980664k cached

I also watched it during the selects, but it was not significantly
different. So my only conclusion is that the reported "cached" value is
either including the shared memory or is simply wrong... or I just don't
get how linux handles memory.

Cheers,
Csaba.



Re: Linux mis-reporting memory

From
Dimitri Fontaine
Date:
Hi,

On Friday 21 September 2007 01:04:01, Decibel! wrote:
> I'm finding this rather interesting report from top on a Debian box...

I've read from people in other free software development groups that top/ps
memory usage outputs are neither useful nor trustworthy after all. A more
usable (or precise or trustworthy) tool seems to be exmap:
  http://www.berthels.co.uk/exmap/

Hope this helps,
--
dim

Re: Linux mis-reporting memory

From
Gregory Stark
Date:
"Csaba Nagy" <nagy@ecircle-ag.com> writes:

> On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote:
>> >> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
>> >> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached
>> >
>> It seems to imply Linux is paging out sysV shared memory. In fact some of
>> Heikki's tests here showed that Linux would do precisely that.
>
> But then why is it not reporting that in the "Swap: used" section ? It
> only reports 42308k used swap.

Hm, good point.

The other possibility is that Postgres just hasn't even touched a large part
of its shared buffers.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Linux mis-reporting memory

From
Csaba Nagy
Date:
On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote:
> The other possibility is that Postgres just hasn't even touched a large part
> of its shared buffers.
>

But then how do you explain the example I gave, with a 5.5GB table
seq-scanned 3 times, shared buffers set to 12 GB, and top still showing
almost 100% memory as cached and no SWAP "used"? In this case you can't
say postgres didn't touch its shared buffers - or does a sequential scan
not use the shared buffers?

Cheers,
Csaba.



Re: Linux mis-reporting memory

From
"Heikki Linnakangas"
Date:
Csaba Nagy wrote:
> On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote:
>> The other possibility is that Postgres just hasn't even touched a large part
>> of its shared buffers.
>
> But then how do you explain the example I gave, with a 5.5GB table
> seq-scanned 3 times, shared buffers set to 12 GB, and top still showing
> almost 100% memory as cached and no SWAP "used" ? In this case you can't
> say postgres didn't touch its shared buffers - or a sequential scan
> won't use the shared buffers ?

Which version of Postgres is this? In 8.3, a scan like that really won't
suck it all into the shared buffer cache. For seq scans on tables larger
than shared_buffers/4, it switches to the bulk read strategy, using only
a few buffers, and choosing the starting point with the scan
synchronization facility.
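
As a quick illustration of that cutoff applied to the numbers quoted
above (a simplification of the rule as stated here, not code from the
Postgres source; the function name is made up):

```python
GIB = 1024 ** 3


def exceeds_bulk_read_threshold(table_bytes: int, shared_buffers_bytes: int) -> bool:
    """True if the table is larger than shared_buffers/4.

    This only encodes the size comparison described for 8.3; it says
    nothing about how the small ring of buffers is then managed.
    """
    return table_bytes > shared_buffers_bytes // 4


if __name__ == "__main__":
    # The example from the quoted message: ~5.5 GB table, 12 GB shared_buffers.
    table = int(5.5 * GIB)
    shared_buffers = 12 * GIB
    print("threshold (GiB):", shared_buffers // 4 / GIB)
    print("bulk read strategy expected:",
          exceeds_bulk_read_threshold(table, shared_buffers))
```

With 12 GB of shared_buffers the cutoff is 3 GB, so on 8.3 a 5.5 GB table
would be scanned with the bulk read strategy rather than filling the
buffer cache.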

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Linux mis-reporting memory

From
Simon Riggs
Date:
On Fri, 2007-09-21 at 12:08 +0200, Csaba Nagy wrote:
> On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote:
> > The other possibility is that Postgres just hasn't even touched a large part
> > of its shared buffers.
> >
>
> But then how do you explain the example I gave, with a 5.5GB table
> seq-scanned 3 times, shared buffers set to 12 GB, and top still showing
> almost 100% memory as cached and no SWAP "used" ? In this case you can't
> say postgres didn't touch its shared buffers - or a sequential scan
> won't use the shared buffers ?

Well, 6.5GB of shared_buffers could be swapped out and need not be
swapped back in to perform those 3 queries.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: Linux mis-reporting memory

From
Csaba Nagy
Date:
On Fri, 2007-09-21 at 11:34 +0100, Heikki Linnakangas wrote:
> Which version of Postgres is this? In 8.3, a scan like that really won't
> suck it all into the shared buffer cache. For seq scans on tables larger
> than shared_buffers/4, it switches to the bulk read strategy, using only
>  a few buffers, and choosing the starting point with the scan
> synchronization facility.
>
This was on 8.1.9 installed via apt-get on Debian 4.1.1-21. In any case
I'm pretty sure Linux swaps shared buffers, as I always got worse
performance with shared buffers set to more than about 1/3 of the memory.
But in that case the output of top is misleading.

Cheers,
Csaba.



Re: Linux mis-reporting memory

From
Greg Smith
Date:
On Thu, 20 Sep 2007, Decibel! wrote:

> I'm finding this rather interesting report from top on a Debian box...
> how is it that linux thinks that 30G is cached?

top on Linux gives weird results when faced with situations where there's
shared memory involved.  I look at /proc/meminfo and run ipcs when I want
a better idea of what's going on.  The best article on this topic that
I've found is http://gentoo-wiki.com/FAQ_Linux_Memory_Management which
recommends using free to clarify how big the disk cache really is.
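
A minimal sketch of reading /proc/meminfo directly. As far as I know,
Linux backs SysV shared memory with page-cache pages, so it is counted in
"Cached". The sketch assumes a kernel that also reports a separate
"Shmem" line (newer kernels do; older ones may not, in which case the
subtraction is skipped). The function names are made up for illustration.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def file_cache_kb(fields: dict) -> int:
    """Cached minus Shmem, in kB.

    Shmem is treated as 0 when the kernel does not report it, so on such
    kernels the result is just the raw Cached figure.
    """
    return fields["Cached"] - fields.get("Shmem", 0)


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        info = parse_meminfo(meminfo.read_text())
        print("Cached:            ", info["Cached"], "kB")
        print("Shmem:             ", info.get("Shmem", "not reported"))
        print("Cached minus Shmem:", file_cache_kb(info), "kB")
```

Comparing the last figure with the segment sizes from ipcs -m gives a
rough idea of how much of "cached" is really file data.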

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux mis-reporting memory

From
Decibel!
Date:
On Sep 21, 2007, at 4:43 AM, Gregory Stark wrote:
> "Csaba Nagy" <nagy@ecircle-ag.com> writes:
>
>> On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote:
>>>>> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
>>>>> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached
>>>>
>>> It seems to imply Linux is paging out sysV shared memory. In fact some of
>>> Heikki's tests here showed that Linux would do precisely that.
>>
>> But then why is it not reporting that in the "Swap: used" section ? It
>> only reports 42308k used swap.
>
> Hm, good point.
>
> The other possibility is that Postgres just hasn't even touched a large part
> of its shared buffers.

Sorry for the late reply...

No, this is on a very active database server; the working set is
almost certainly larger than memory (probably by a fair margin :( ),
and all of the shared buffers should be in use.

I'm leaning towards "top on linux == dumb".
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Re: Linux mis-reporting memory

From
"Scott Marlowe"
Date:
On 10/2/07, Decibel! <decibel@decibel.org> wrote:
> On Sep 21, 2007, at 4:43 AM, Gregory Stark wrote:
> > "Csaba Nagy" <nagy@ecircle-ag.com> writes:
> >
> >> On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote:
> >>>>> Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
> >>>>> Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached
> >>>>
> >>> It seems to imply Linux is paging out sysV shared memory. In fact some of
> >>> Heikki's tests here showed that Linux would do precisely that.
> >>
> >> But then why is it not reporting that in the "Swap: used" section ? It
> >> only reports 42308k used swap.
> >
> > Hm, good point.
> >
> > The other possibility is that Postgres just hasn't even touched a large part
> > of its shared buffers.
>
> Sorry for the late reply...
>
> No, this is on a very active database server; the working set is
> almost certainly larger than memory (probably by a fair margin :( ),
> and all of the shared buffers should be in use.
>
> I'm leaning towards "top on linux == dumb".

Yeah, that pretty much describes it.  It's gotten better than it once
was.  But it still doesn't seem to be able to tell shared memory from
cache/buffer.

Re: Linux mis-reporting memory

From
Adam Tauno Williams
Date:
> >> But then why is it not reporting that in the "Swap: used" section ? It
> >> only reports 42308k used swap.
> > Hm, good point.
> > The other possibility is that Postgres just hasn't even touched a large part
> > of its shared buffers.
> Sorry for the late reply...
> No, this is on a very active database server; the working set is
> almost certainly larger than memory (probably by a fair margin :( ),

"almost certainly"

> and all of the shared buffers should be in use.

"should be"

It would be better to just check! :)  The catalogs and informational
views will give you definitive answers to these questions.

> I'm leaning towards "top on linux == dumb".

I disagree, it just isn't the appropriate tool for the job.  What top
tells you is lots of correct information,  it just isn't the right
information.

For starters try -
-- Per table: block requests and the share served from shared buffers.
SELECT
 'HEAP:' || relname AS table_name,
 (heap_blks_read + heap_blks_hit) AS block_requests,
 ROUND((heap_blks_hit::NUMERIC / (heap_blks_read + heap_blks_hit) * 100), 2)
   AS buffer_hit_percentage
FROM pg_statio_user_tables
WHERE (heap_blks_read + heap_blks_hit) > 0
UNION
SELECT
 'TOAST:' || relname,
 (toast_blks_read + toast_blks_hit),
 ROUND((toast_blks_hit::NUMERIC / (toast_blks_read + toast_blks_hit) * 100), 2)
FROM pg_statio_user_tables
WHERE (toast_blks_read + toast_blks_hit) > 0
UNION
-- All indexes of the table combined.
SELECT
 'INDEX:' || relname,
 (idx_blks_read + idx_blks_hit),
 ROUND((idx_blks_hit::NUMERIC / (idx_blks_read + idx_blks_hit) * 100), 2)
FROM pg_statio_user_tables
WHERE (idx_blks_read + idx_blks_hit) > 0;

--
Adam Tauno Williams, Network & Systems Administrator
Consultant - http://www.whitemiceconsulting.com
Developer - http://www.opengroupware.org


Re: Linux mis-reporting memory

From
Decibel!
Date:
On Oct 2, 2007, at 1:37 PM, Adam Tauno Williams wrote:
>> I'm leaning towards "top on linux == dumb".
>
> I disagree, it just isn't the appropriate tool for the job.  What top
> tells you is lots of correct information,  it just isn't the right
> information.

If it is in fact including shared memory as 'cached', then no, the
information it's providing is not correct.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828