Thread: Linux mis-reporting memory
Sorry, I know this is probably more a linux question, but I'm guessing that others have run into this...

I'm finding this rather interesting report from top on a Debian box...

Mem:  32945280k total, 32871832k used,    73448k free,   247432k buffers
Swap:  1951888k total,    42308k used,  1909580k free, 30294300k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12492 postgres  15   0 8469m 8.0g 8.0g S    0 25.6   3:52.03 postmaster
 7820 postgres  16   0 8474m 4.7g 4.7g S    0 15.1   1:23.72 postmaster
21863 postgres  15   0 8472m 3.9g 3.9g S    0 12.4   0:30.61 postmaster
19893 postgres  15   0 8471m 2.4g 2.4g S    0  7.6   0:07.54 postmaster
20423 postgres  17   0 8472m 1.4g 1.4g S    0  4.4   0:04.61 postmaster
26395 postgres  15   0 8474m 1.1g 1.0g S    1  3.4   0:02.12 postmaster
12985 postgres  15   0 8472m 937m 930m S    0  2.9   0:05.50 postmaster
26806 postgres  15   0 8474m 787m 779m D    4  2.4   0:01.56 postmaster

This is a machine that's been up some time and the database is 400G, so I'm pretty confident that shared_buffers (set to 8G) should be completely full, and that's what that top process is indicating.

So how is it that linux thinks that 30G is cached?

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Decibel! <decibel@decibel.org> writes: > I'm finding this rather interesting report from top on a Debian box... > Mem: 32945280k total, 32871832k used, 73448k free, 247432k buffers > Swap: 1951888k total, 42308k used, 1909580k free, 30294300k cached > So how is it that linux thinks that 30G is cached? Why would you think that a number reported by the operating system has something to do with Postgres' shared memory? I might be mistaken, but I think that in this report "cached" indicates the amount of memory in use for kernel disk cache. (No idea what the separate "buffers" entry means, but it's obviously not all of the disk buffers the kernel has got.) It appears that the kernel is doing exactly what it's supposed to do and using any not-currently-called-for memory for disk cache ... regards, tom lane
> Sorry, I know this is probably more a linux question, but I'm guessing > that others have run into this... > I'm finding this rather interesting report from top on a Debian box... > Mem: 32945280k total, 32871832k used, 73448k free, 247432k buffers > Swap: 1951888k total, 42308k used, 1909580k free, 30294300k cached > This is a machine that's been up some time and the database is 400G, so > I'm pretty confident that shared_buffers (set to 8G) should be > completely full, and that's what that top process is indicating. Nope, use "ipcs" to show allocated shared memory segments. One of the better articles on LINUX & memory management - http://virtualthreads.blogspot.com/2006/02/understanding-memory-usage-on-linux.html -- Adam Tauno Williams, Network & Systems Administrator Consultant - http://www.whitemiceconsulting.com Developer - http://www.opengroupware.org
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Decibel! <decibel@decibel.org> writes:
>> I'm finding this rather interesting report from top on a Debian box...
>
>> Mem: 32945280k total, 32871832k used, 73448k free, 247432k buffers
>> Swap: 1951888k total, 42308k used, 1909580k free, 30294300k cached
>
>> So how is it that linux thinks that 30G is cached?
>
> Why would you think that a number reported by the operating system has
> something to do with Postgres' shared memory?

I think his question is how the kernel can be using 30G for kernel buffers if it only has 32G total and 8G of that is taken up by Postgres's shared buffers.

It seems to imply Linux is paging out sysV shared memory. In fact some of Heikki's tests here showed that Linux would do precisely that.

If your working set really is smaller than shared buffers then that's not so bad. Those buffers really would be completely idle anyways. But if your working set is larger than shared buffers and you're just not thrashing it hard enough to keep it in RAM then it's really bad. The buffers Linux will choose to page out are precisely those that Postgres will likely choose shortly as victim buffers, forcing Linux to page them back in just so Postgres can overwrite them.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote: > >> Mem: 32945280k total, 32871832k used, 73448k free, 247432k buffers > >> Swap: 1951888k total, 42308k used, 1909580k free, 30294300k cached > > > It seems to imply Linux is paging out sysV shared memory. In fact some of > Heikki's tests here showed that Linux would do precisely that. But then why is it not reporting that in the "Swap: used" section ? It only reports 42308k used swap. I have a box where I just executed 3x a select count(*) from a table which has ~5.5 GB size on disk, and the count executed in <4 seconds, which I take as it is all cached (shared memory is set to 12GB - I use the box for testing for now, otherwise I would set it far lower because I have bad experience with setting it more than 1/3 of the available memory). Top reported at the end of the process: Mem: 16510724k total, 16425252k used, 85472k free, 10144k buffers Swap: 7815580k total, 157804k used, 7657776k free, 15980664k cached I also watched it during the selects, but it was not significantly different. So my only conclusion is that the reported "cached" value is either including the shared memory or is simply wrong... or I just don't get how linux handles memory. Cheers, Csaba.
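The arithmetic behind this puzzle can be checked in a few lines. The sketch below is not from any poster in the thread; it only takes the figures from the two top headers quoted above and asks by how much "cached" plus shared_buffers would exceed physical RAM if the two were counted separately. The function name is made up for illustration, and top's "k" suffix is assumed to mean KiB.

```python
KIB_PER_GIB = 1024 * 1024


def overshoot_kib(total_kib, cached_kib, shared_buffers_gib):
    """KiB by which 'cached' plus shared_buffers exceeds physical RAM.

    A positive result means the two cannot both be fully resident
    if they are accounted separately.
    """
    return cached_kib + shared_buffers_gib * KIB_PER_GIB - total_kib


# Figures copied from the top headers quoted in this thread:
# (total, cached, shared_buffers in GiB, swap used), sizes in KiB (assumed).
boxes = {
    "Decibel (shared_buffers 8G)": (32945280, 30294300, 8, 42308),
    "Csaba (shared_buffers 12G)": (16510724, 15980664, 12, 157804),
}

for name, (total, cached, shared_gib, swap_used) in boxes.items():
    over = overshoot_kib(total, cached, shared_gib)
    print(f"{name}: overshoot {over} KiB, swap used {swap_used} KiB")
```

On both boxes the overshoot is far larger than the swap actually in use, so "the shared buffers were paged out" cannot account for the gap by itself. That is consistent with the reading that the "cached" figure already includes resident shared memory pages, although the numbers alone do not prove it.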
Hi,

On Friday 21 September 2007 01:04:01, Decibel! wrote:
> I'm finding this rather interesting report from top on a Debian box...

I've read from people in other free software development groups that top/ps memory usage outputs are neither useful nor trustworthy after all. A more usable (or precise or trustworthy) tool seems to be exmap: http://www.berthels.co.uk/exmap/

Hope this helps,
--
dim
"Csaba Nagy" <nagy@ecircle-ag.com> writes: > On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote: >> >> Mem: 32945280k total, 32871832k used, 73448k free, 247432k buffers >> >> Swap: 1951888k total, 42308k used, 1909580k free, 30294300k cached >> > >> It seems to imply Linux is paging out sysV shared memory. In fact some of >> Heikki's tests here showed that Linux would do precisely that. > > But then why is it not reporting that in the "Swap: used" section ? It > only reports 42308k used swap. Hm, good point. The other possibility is that Postgres just hasn't even touched a large part of its shared buffers. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote:
> The other possibility is that Postgres just hasn't even touched a large part
> of its shared buffers.
>

But then how do you explain the example I gave, with a 5.5GB table seq-scanned 3 times, shared buffers set to 12 GB, and top still showing almost 100% memory as cached and no swap "used"? In this case you can't say postgres didn't touch its shared buffers. Or does a sequential scan not use the shared buffers?

Cheers,
Csaba.
Csaba Nagy wrote: > On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote: >> The other possibility is that Postgres just hasn't even touched a large part >> of its shared buffers. > > But then how do you explain the example I gave, with a 5.5GB table > seq-scanned 3 times, shared buffers set to 12 GB, and top still showing > almost 100% memory as cached and no SWAP "used" ? In this case you can't > say postgres didn't touch it's shared buffers - or a sequential scan > won't use the shared buffers ? Which version of Postgres is this? In 8.3, a scan like that really won't suck it all into the shared buffer cache. For seq scans on tables larger than shared_buffers/4, it switches to the bulk read strategy, using only a few buffers, and choosing the starting point with the scan synchronization facility. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
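The rule described in this message can be written down as a one-line check. This is only an illustration of the threshold as stated here (tables larger than shared_buffers/4), not code taken from PostgreSQL; the function name is hypothetical, and GB is treated as GiB for simplicity.

```python
GIB = 1024 ** 3


def uses_bulk_read(table_bytes, shared_buffers_bytes):
    """Apply the rule as described in the message above.

    A sequential scan switches to the bulk read strategy when the table
    is larger than one quarter of shared_buffers. Hypothetical helper,
    not a PostgreSQL API.
    """
    return table_bytes > shared_buffers_bytes / 4


# The example from the thread: a ~5.5 GB table with shared_buffers = 12 GB.
# The threshold is 12 / 4 = 3 GB, so the scan would use only a few buffers.
print(uses_bulk_read(5.5 * GIB, 12 * GIB))
```

Under that rule, a 5.5 GB table scanned on an 8.3 server with 12 GB of shared buffers would never fill the buffer cache, however many times the scan is repeated.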
On Fri, 2007-09-21 at 12:08 +0200, Csaba Nagy wrote: > On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote: > > The other possibility is that Postgres just hasn't even touched a large part > > of its shared buffers. > > > > But then how do you explain the example I gave, with a 5.5GB table > seq-scanned 3 times, shared buffers set to 12 GB, and top still showing > almost 100% memory as cached and no SWAP "used" ? In this case you can't > say postgres didn't touch it's shared buffers - or a sequential scan > won't use the shared buffers ? Well, 6.5GB of shared_buffers could be swapped out and need not be swapped back in to perform those 3 queries. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Fri, 2007-09-21 at 11:34 +0100, Heikki Linnakangas wrote: > Which version of Postgres is this? In 8.3, a scan like that really won't > suck it all into the shared buffer cache. For seq scans on tables larger > than shared_buffers/4, it switches to the bulk read strategy, using only > a few buffers, and choosing the starting point with the scan > synchronization facility. > This was on 8.1.9 installed via apt-get on Debian 4.1.1-21. In any case I'm pretty sure linux swaps shared buffers, as I always got worse performance for shared buffers more than about 1/3 of the memory. But in that case the output of top is misleading. Cheers, Csaba.
On Thu, 20 Sep 2007, Decibel! wrote: > I'm finding this rather interesting report from top on a Debian box... > how is it that linux thinks that 30G is cached? top on Linux gives weird results when faced with situations where there's shared memory involved. I look at /proc/meminfo and run ipcs when I want a better idea what's going on. As good of an article on this topic as I've found is http://gentoo-wiki.com/FAQ_Linux_Memory_Management which recommends using free to clarify how big the disk cache really is. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
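For anyone who wants to look at /proc/meminfo programmatically, here is a minimal sketch. It assumes the usual "Name: value kB" line format. The Shmem field is an assumption about the running kernel: it exists only on kernels newer than the ones discussed in this thread, so the code treats it as optional. The function names are made up for illustration.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Pick out the values relevant to the cache vs. shared memory question."""
    cached = fields.get("Cached")
    shmem = fields.get("Shmem")  # None on kernels that do not report it
    summary = {"Cached": cached, "Buffers": fields.get("Buffers"), "Shmem": shmem}
    if cached is not None and shmem is not None:
        # Assumption: Shmem pages are counted inside Cached on such kernels.
        summary["cached_excluding_shmem"] = cached - shmem
    return summary


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        for key, value in summarize(parse_meminfo(f.read())).items():
            print(f"{key}: {value}")
```

Where Shmem is reported, subtracting it from Cached gives a rough idea of how much of the cache is really disk cache. On older kernels, ipcs remains the way to see the size of the shared memory segments.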
On Sep 21, 2007, at 4:43 AM, Gregory Stark wrote: > "Csaba Nagy" <nagy@ecircle-ag.com> writes: > >> On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote: >>>>> Mem: 32945280k total, 32871832k used, 73448k free, >>>>> 247432k buffers >>>>> Swap: 1951888k total, 42308k used, 1909580k free, >>>>> 30294300k cached >>>> >>> It seems to imply Linux is paging out sysV shared memory. In fact >>> some of >>> Heikki's tests here showed that Linux would do precisely that. >> >> But then why is it not reporting that in the "Swap: used" >> section ? It >> only reports 42308k used swap. > > Hm, good point. > > The other possibility is that Postgres just hasn't even touched a > large part > of its shared buffers. Sorry for the late reply... No, this is on a very active database server; the working set is almost certainly larger than memory (probably by a fair margin :( ), and all of the shared buffers should be in use. I'm leaning towards "top on linux == dumb". -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
On 10/2/07, Decibel! <decibel@decibel.org> wrote: > On Sep 21, 2007, at 4:43 AM, Gregory Stark wrote: > > "Csaba Nagy" <nagy@ecircle-ag.com> writes: > > > >> On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote: > >>>>> Mem: 32945280k total, 32871832k used, 73448k free, > >>>>> 247432k buffers > >>>>> Swap: 1951888k total, 42308k used, 1909580k free, > >>>>> 30294300k cached > >>>> > >>> It seems to imply Linux is paging out sysV shared memory. In fact > >>> some of > >>> Heikki's tests here showed that Linux would do precisely that. > >> > >> But then why is it not reporting that in the "Swap: used" > >> section ? It > >> only reports 42308k used swap. > > > > Hm, good point. > > > > The other possibility is that Postgres just hasn't even touched a > > large part > > of its shared buffers. > > Sorry for the late reply... > > No, this is on a very active database server; the working set is > almost certainly larger than memory (probably by a fair margin :( ), > and all of the shared buffers should be in use. > > I'm leaning towards "top on linux == dumb". Yeah, that pretty much describes it. It's gotten better than it once was. But it still doesn't seem to be able to tell shared memory from cache/buffer.
> >> But then why is it not reporting that in the "Swap: used"
> >> section ? It
> >> only reports 42308k used swap.
> > Hm, good point.
> > The other possibility is that Postgres just hasn't even touched a
> > large part
> > of its shared buffers.
> Sorry for the late reply...
> No, this is on a very active database server; the working set is
> almost certainly larger than memory (probably by a fair margin :( ),

"almost certainly"

> and all of the shared buffers should be in use.

"should be"

It would be better to just check! :) The catalogs and informational views will give you definitive answers to these questions.

> I'm leaning towards "top on linux == dumb".

I disagree, it just isn't the appropriate tool for the job. What top tells you is lots of correct information, it just isn't the right information.

For starters try -

-- Buffer hit percentage per table, split into heap, TOAST and index blocks
SELECT 'HEAP:' || relname AS table_name,
       (heap_blks_read + heap_blks_hit) AS heap_hits,
       ROUND(((heap_blks_hit)::NUMERIC / (heap_blks_read + heap_blks_hit) * 100), 2) AS heap_buffer_percentage
  FROM pg_statio_user_tables
 WHERE (heap_blks_read + heap_blks_hit) > 0
UNION
SELECT 'TOAST:' || relname,
       (toast_blks_read + toast_blks_hit),
       ROUND(((toast_blks_hit)::NUMERIC / (toast_blks_read + toast_blks_hit) * 100), 2)
  FROM pg_statio_user_tables
 WHERE (toast_blks_read + toast_blks_hit) > 0
UNION
SELECT 'INDEX:' || relname,
       (idx_blks_read + idx_blks_hit),
       ROUND(((idx_blks_hit)::NUMERIC / (idx_blks_read + idx_blks_hit) * 100), 2)
  FROM pg_statio_user_tables
 WHERE (idx_blks_read + idx_blks_hit) > 0

--
Adam Tauno Williams, Network & Systems Administrator
Consultant - http://www.whitemiceconsulting.com
Developer - http://www.opengroupware.org
On Oct 2, 2007, at 1:37 PM, Adam Tauno Williams wrote: >> I'm leaning towards "top on linux == dumb". > > I disagree, it just isn't the appropriate tool for the job. What top > tells you is lots of correct information, it just isn't the right > information. If it is in fact including shared memory as 'cached', then no, the information it's providing is not correct. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828