Thread: postgres memory management issues?
Hi All,

I've recently run into problems with my kernel complaining that I ran out of memory, thus killing off postgres and bringing my app to a grinding halt.

I'm on a 32-bit architecture with 16GB of RAM, under Gentoo Linux. Naturally, I have to set my shmmax to 2GB because the kernel can't support more (well, I could set it to 3GB, but I use 2GB for safety).

shared_buffers is 200000 and max_connections is 600.

Here is a snippet of my log output (I can give more if necessary):

Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com Out of Memory: Kill process 11696 (postgres) score 1181671 and children.
Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com Out of Memory: Kill process 11696 (postgres) score 1181671 and children.
Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com Out of memory: Killed process 11704 (postgres).
Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com Out of memory: Killed process 11704 (postgres).
[...]
Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com postgres[11696]: [6-1] 2007-09-05 18:38:57.626 PDT [user=,db= PID:11696 XID:]LOG: background writer process (PID 11704) was terminated by signal 9
Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com postgres[11696]: [7-1] 2007-09-05 18:38:57.626 PDT [user=,db= PID:11696 XID:]LOG: terminating any other active server processes

My understanding is that if any one postgres process's memory usage, plus the shared memory, exceeds the kernel limit of 4GB, then the kernel will kill the process off. Is this true? If so, would postgres have some prevention mechanism that would keep a particular process from getting too big? (Maybe I'm being too idealistic, or I just simply don't understand how postgres works under the hood)

--Richard
Richard Yen wrote:
> Hi All,
>
> I've recently run into problems with my kernel complaining that I ran
> out of memory, thus killing off postgres and bringing my app to a
> grinding halt.
>
> I'm on a 32-bit architecture with 16GB of RAM, under Gentoo Linux.
> Naturally, I have to set my shmmax to 2GB because the kernel can't
> support more (well, I could set it to 3GB, but I use 2GB for safety).
>
> shared_buffers is 200000 and max_connections is 600.

OK, that's ~1.6GB of shared memory.

> Here is a snippet of my log output (I can give more if necessary):
> Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com Out of Memory: Kill
> process 11696 (postgres) score 1181671 and children.

OK, you've run out of memory at some point.

> My understanding is that if any one postgres process's memory usage,
> plus the shared memory, exceeds the kernel limit of 4GB, then the kernel
> will kill the process off. Is this true? If so, would postgres have
> some prevention mechanism that would keep a particular process from
> getting too big? (Maybe I'm being too idealistic, or I just simply
> don't understand how postgres works under the hood)

You've got max_connections of 600 and you think individual backends are using more than 2.4GB RAM each? Long before that you'll run out of actual RAM+swap. If you actually had 600 backends you'd be able to allocate ~24MB to each. You'd actually want much less, to allow for disk cache in the OS.

The important missing information is:
1. How much memory is in use, and by what (vmstat/top output)
2. What memory settings you have in your postgresql.conf (work_mem, maintenance_work_mem)
3. What was happening at the time (how many connections etc.)

--
Richard Huxton
Archonet Ltd
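[Editor's note: the ~1.6GB figure above comes from multiplying shared_buffers by PostgreSQL's default 8 kB block size; a minimal sketch of the arithmetic:]

```shell
# shared_buffers is counted in buffer pages of BLCKSZ bytes (8 kB by
# default), so 200000 buffers is roughly 1.6 GB of shared memory.
shared_buffers=200000
block_size=8192                      # bytes per buffer page (default BLCKSZ)
bytes=$((shared_buffers * block_size))
echo "$((bytes / 1024 / 1024)) MB"   # prints "1562 MB", i.e. ~1.6 GB
```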
> I've recently run into problems with my kernel complaining that I ran
> out of memory, thus killing off postgres and bringing my app to a
> grinding halt.
>
> I'm on a 32-bit architecture with 16GB of RAM, under Gentoo Linux.
> Naturally, I have to set my shmmax to 2GB because the kernel can't
> support more (well, I could set it to 3GB, but I use 2GB for safety).

Wouldn't it make sense to install an amd64 version with so much RAM?

--
regards
Claus

When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner.

Shakespeare
"Richard Yen" <dba@richyen.com> writes:
> My understanding is that if any one postgres process's memory usage, plus the
> shared memory, exceeds the kernel limit of 4GB, then the kernel will kill the
> process off. Is this true? If so, would postgres have some prevention
> mechanism that would keep a particular process from getting too big? (Maybe
> I'm being too idealistic, or I just simply don't understand how postgres works
> under the hood)

I don't think you have an individual process going over 4G. I think what you have is 600 processes which in aggregate are using more memory than you have available. Do you really need 600 processes, by the way?

You could try lowering work_mem, though your value actually seems fairly reasonable.

Perhaps your kernel isn't actually able to use 16GB? What does cat /proc/meminfo say? What does it say when this is happening?

You might also tweak /proc/sys/vm/overcommit_memory, but I don't remember what the values are; you can search to find them.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
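[Editor's note: a quick sketch of the /proc/meminfo check suggested above. The field names are the standard Linux ones; LowTotal/LowFree only appear on 32-bit kernels, where "lowmem" can be exhausted well before all 16GB is used:]

```shell
# Inspect what the kernel reports: total/free RAM, and how much memory
# it has committed versus its commit limit. On a 32-bit kernel, also
# watch LowFree -- lowmem can run out long before total RAM does.
grep -E '^(MemTotal|MemFree|LowTotal|LowFree|CommitLimit|Committed_AS)' /proc/meminfo
```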
* Gregory Stark:

> You might also tweak /proc/sys/vm/overcommit_memory but I don't remember what
> the values are, you can search to find them.

"2" is the interesting value: it turns off overcommit.

However, if you're tight on memory, this will only increase your problems, because the system fails sooner. The main difference is that it's much more deterministic: malloc fails in a predictable manner, and this situation can be handled gracefully (at least by some processes); no processes are killed.

We use this setting on all of our database servers, just in case someone performs a huge SELECT locally, resulting in a client process sucking up all available memory. With vm.overcommit_memory=2, memory allocation in the client process will fail. Without it, the postgres process feeding it is typically killed by the kernel.

--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99
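[Editor's note: for anyone wanting to try this, a minimal sketch of setting strict overcommit on Linux (requires root; the value meanings are the documented ones for 2.6-era kernels):]

```shell
# vm.overcommit_memory values:
#   0 = heuristic overcommit (the default)
#   1 = always overcommit, never refuse an allocation
#   2 = strict accounting: malloc() fails instead of the OOM killer firing
sysctl -w vm.overcommit_memory=2

# Persist the setting across reboots:
echo 'vm.overcommit_memory = 2' >> /etc/sysctl.conf
```

Note that under mode 2 the commit limit is swap plus a fraction of RAM (tunable via vm.overcommit_ratio), so allocations can start failing with memory apparently still free.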
On Thu, Sep 06, 2007 at 09:06:53AM -0700, Richard Yen wrote:
> My understanding is that if any one postgres process's memory usage,
> plus the shared memory, exceeds the kernel limit of 4GB,

On a 32-bit system the per-process memory limit is a lot lower than 4G. If you want to use 16G effectively, it's going to be a lot easier to simply use a 64-bit system.

That said, it's more likely that you've got a number of processes using an aggregate of more than 16G than that you're exceeding the limit per process. (Hitting the per-process limit should result in a memory allocation failure rather than an out-of-memory condition.)

Mike Stone
Richard Yen <dba@richyen.com> writes:
> Here is a snippet of my log output (I can give more if necessary):
> Sep 5 18:38:57 tii-db2.oaktown.iparadigms.com Out of Memory: Kill
> process 11696 (postgres) score 1181671 and children.

> My understanding is that if any one postgres process's memory usage,
> plus the shared memory, exceeds the kernel limit of 4GB, then the
> kernel will kill the process off. Is this true?

No. The OOM killer is not about individual process size. It's about getting the kernel out of the corner it's backed itself into when it's promised more memory for the total collection of processes than it can actually deliver. As already noted, fooling with the overcommit parameter might help, and migrating to a 64-bit kernel might help. (32-bit kernels can run out of space for "lowmem" long before all of your 16G is used up.)

ObDigression: The reason the kernel would do such a silly-sounding thing as promise more memory than it has is that in a lot of cases pages are shared by more than one process --- in fact, immediately after a fork() the child process shares *all* pages of its parent --- and it would be really restrictive to insist on having sufficient RAM+swap for each process to have an independent copy of shared pages. The problem is that it's hard to guess whether currently-shared pages will need multiple copies in the future. After a fork() the child's pages are supposed to be independent of the parent, so if either one scribbles on a shared page, the kernel has to instantiate separate copies at that moment (google for "copy on write" for more about this). And if there is not enough memory for another copy, there is no clean API for the kernel to return "out of memory": it cannot just fail the write instruction, so the only recourse is to nuke some process or other to release memory. The whole thing makes considerable sense until you are trying to run critical applications, and then you just wanna turn it off.

regards, tom lane