Thread: Dangerous hint in the PostgreSQL manual
Hello,

I have been trapped by the advice in the manual to use "sysctl -w
vm.overcommit_memory=2" on Linux (see 16.4.3. Linux Memory Overcommit).
This value should only be used when PostgreSQL is the only application
running on the machine in question, and it should be checked against
the values "CommitLimit" and "Committed_AS" in /proc/meminfo on a
longer-running system. If "Committed_AS" reaches or comes close to
"CommitLimit", one should not set overcommit_memory=2 (see
http://www.redhat.com/archives/rhl-devel-list/2005-February/msg00738.html).

I think a warning should be included in the manual, because with this
setting the machine in question may fail with "fork failed" even while
the standard system tools report a lot of free memory, which confuses
the admins.

Regards,
Andreas
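The /proc/meminfo check described above can be sketched as a short script. This is only a sketch: the field names are those found in Linux 2.6-era /proc/meminfo, the helper names are made up here, and the sample values are invented for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:   value kB' lines into a dict of kB ints."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key] = int(fields[0])  # first field is the value in kB
    return info

def commit_headroom_kb(info):
    """CommitLimit - Committed_AS: how much more the kernel will promise
    before allocations (and fork()) start failing under
    vm.overcommit_memory=2."""
    return info["CommitLimit"] - info["Committed_AS"]

# Invented sample; on a real system read the text from /proc/meminfo.
sample = """\
CommitLimit:     2097152 kB
Committed_AS:    1966080 kB"""

print(commit_headroom_kb(parse_meminfo(sample)), "kB of headroom left")
```

On a live machine one would feed `open("/proc/meminfo").read()` into `parse_meminfo` and watch whether the headroom trends toward zero over the system's uptime.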
On Mon, Dec 10, 2007 at 04:26:12PM +0100, Listaccount wrote:
> Hello
>
> I have been trapped by the advice from the manual to use "sysctl -w
> vm.overcommit_memory=2" when using Linux (see 16.4.3. Linux Memory
> Overcommit). This value should only be used when PostgreSQL is the

I think you need to read the documentation more carefully, because it
clearly suggests you (1) look at the kernel source and (2) consult a
kernel expert as part of your evaluation. In any case,

> /proc/meminfo on a longer running system. If "Committed_AS" reaches or
> comes close to "CommitLimit" one should not set overcommit_memory=2 (see
> http://www.redhat.com/archives/rhl-devel-list/2005-February/msg00738.html).

my own reading of that message leads me to the opposite conclusion from
yours. You should _for sure_ set overcommit_memory=2 in that case, and
this is why:

> this setting the machine in question may get trouble with "fork
> failed" even if the standard system tools report a lot of free memory
> causing confusion to the admins.

You _want_ the fork to fail when the kernel can't (over)commit the
memory, because otherwise the stupid genius kernel will come along and
maybe blip your postmaster on the head, causing it to die by surprise.
Don't like that? Use more memory. Or get an operating system that
doesn't do stupid things like promise more memory than it has. Except,
of course, those are getting rarer and rarer all the time.

Please note that memory overcommit is sort of like a high-risk
mortgage: the chances that the OS will recover enough memory in any
given round start out as high. Eventually, however, the
[technical|financial] economy is such that only high-risk commitments
are available, and at that point, _someone_ isn't going to pay back
enough [memory|money] to the thing demanding it. At that point, it's
anyone's guess what will happen next.

A
Quoting Andrew Sullivan <ajs@crankycanuck.ca>:

> On Mon, Dec 10, 2007 at 04:26:12PM +0100, Listaccount wrote:
>> Hello
>>
>> I have been trapped by the advice from the manual to use "sysctl -w
>> vm.overcommit_memory=2" when using Linux (see 16.4.3. Linux Memory
>> Overcommit). This value should only be used when PostgreSQL is the
>
> I think you need to read the documentation more carefully, because it
> clearly suggests you (1) look at the kernel source and (2) consult a
> kernel expert as part of your evaluation.
>
> In any case,

Consulting the kernel source is a bit of overkill for setting up a
database.

>> /proc/meminfo on a longer running system. If "Committed_AS" reaches or
>> comes close to "CommitLimit" one should not set overcommit_memory=2 (see
>> http://www.redhat.com/archives/rhl-devel-list/2005-February/msg00738.html).
>
> my own reading of that message leads me to the opposite conclusion
> from yours. You should _for sure_ set overcommit_memory=2 in that
> case, and this is why:

I don't want to start a discussion about what is the right thing to do;
both the default setting "0" and "2" have advantages and drawbacks.
What I would like to see in the documentation is a simple hint for
checking whether you will get into trouble with this setting, so one
can prepare. Something like: "See if CommitLimit - Committed_AS from
/proc/meminfo comes close to 0 after some uptime, and if so, don't use
it."

>> this setting the machine in question may get trouble with "fork
>> failed" even if the standard system tools report a lot of free memory
>> causing confusion to the admins.
>
> You _want_ the fork to fail when the kernel can't (over)commit the
> memory, because otherwise the stupid genius kernel will come along and
> maybe blip your postmaster on the head, causing it to die by surprise.
> Don't like that? Use more memory. Or get an operating system that
> doesn't do stupid things like promise more memory than it has.
>
> Except, of course, those are getting rarer and rarer all the time.
>
> Please note that memory overcommit is sort of like a high-risk
> mortgage: the chances that the OS will recover enough memory in any
> given round start out as high. Eventually, however, the
> [technical|financial] economy is such that only high-risk commitments
> are available, and at that point, _someone_ isn't going to pay back
> enough [memory|money] to the thing demanding it. At that point, it's
> anyone's guess what will happen next.

As I said, the pro-and-con discussion has happened many times (for
example,
http://developers.sun.com/solaris/articles/subprocess/subprocess.html).
I would only like to see a hint on how to check *before* you get into
trouble.

Regards,
Andreas
* Andrew Sullivan:

> You _want_ the fork to fail when the kernel can't (over)commit the
> memory, because otherwise the stupid genius kernel will come along
> and maybe blip your postmaster on the head, causing it to die by
> surprise.

The other side of the story is that with overcommit, the machine
continues to work flawlessly in some loads, when it would fail without
overcommit.

It's also not clear that trading a segfault for malloc returning a null
pointer leads to more deterministic failures (because the malloc
failure does not necessarily occur in the memory hog).

My personal experience is that vm.overcommit_memory=2 (together with
tons of swap space) leads to more deterministic failure behavior, but
we don't use much software that aggressively allocates address space
without actually using it (Sun's JVM does in some cases, and SBCL is
particularly obnoxious in this regard).

--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH         http://www.bfk.de/
Kriegsstraße 100                tel: +49-721-96201-1
D-76133 Karlsruhe               fax: +49-721-96201-99
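The behavior Florian describes, allocating address space aggressively without actually using it, can be illustrated with a short sketch. This is illustrative only: the size is arbitrary, and how the reservation is treated depends on the overcommit mode in effect.

```python
import mmap

# Reserve 256 MiB of anonymous address space without touching it.
# Under the default vm.overcommit_memory=0 this normally succeeds
# cheaply; under strict mode (=2) the full size is charged against
# CommitLimit at mmap() time, even though no physical page is used
# until the memory is actually written.
SIZE = 256 * 1024 * 1024
region = mmap.mmap(-1, SIZE)

print(len(region))       # the amount of address space promised
region[0:4] = b"used"    # only now does the first page become resident
region.close()
```

This is why a process can look harmless in terms of resident memory while still pushing Committed_AS toward CommitLimit for everyone else on the machine.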
On Tue, Dec 11, 2007 at 09:23:38AM +0100, Listaccount wrote:
> I don't want to start a discussion about what is the right thing to do;

Then you shouldn't ask here. The manual was changed to say what it does
after considerable community discussion. In my view, the Linux kernel's
behaviour is completely unacceptable, and exactly the sort of amateur
design foolishness that people are complaining about when they say
Linux is a toy.

> What I would like to see in the documentation is a simple hint for
> checking whether you will get into trouble with this setting, so one
> can prepare.

From the point of view of Postgres, "getting in trouble" means
"postmaster shot in head by surprise." If you feel otherwise, then you
have to learn how to tune your operating system correctly. The
PostgreSQL manual is not the place for general wisdom about how to tune
various kernels. I think the advice is correctly worded as it is.

> Something like: "See if CommitLimit - Committed_AS from /proc/meminfo
> comes close to 0 after some uptime, and if so, don't use it."

That's not good enough, because the case where you really get into
trouble might be an unusual one. It is in fact exactly the condition
where your machine is facing surprising loads that memory overcommit
will bite you. So following your advice will still lead people to be
surprised when their postmaster goes away because they were Slashdotted
or something.

> I would only like to see a hint on how to check *before* you get into
> trouble.

"Am I using Linux with overcommit?" would be one such check. The only
reliable one. (Also, "Am I using AIX?" just in case anyone thinks this
is some sort of anti-Linux bias I have. Malloc lying ranks among system
sins right up there with fsync returning before the bits are on the
platter.)

A
Quoting Andrew Sullivan <ajs@crankycanuck.ca>:

> On Tue, Dec 11, 2007 at 09:23:38AM +0100, Listaccount wrote:
>> I don't want to start a discussion about what is the right thing to do;
>
> Then you shouldn't ask here. The manual was changed to say what it
> does after considerable community discussion. In my view, the Linux
> kernel's behaviour is completely unacceptable, and exactly the sort
> of amateur design foolishness that people are complaining about when
> they say Linux is a toy.

It was not a question but a hint on how to improve the documentation.
It is nice that you have a strong opinion on the technical details
behind it, but one should not include advice without pointing out the
possible downsides. If that is not possible, then the advice should be
removed completely, because it has nothing to do with PostgreSQL and
everything to do with shortcomings of the Linux (or some other) kernel.

I would not have been surprised if the OOM killer had gone around in a
low-memory situation, but I was surprised to see fork fail on a system
with 1 GB of memory available.

That's all my own opinion, and nothing more needs to be said. If the
maintainers of the documentation agree, they can take the advice out or
improve it; if not, I don't care, because I have learned my lesson. But
others may suffer from the same hidden problem.

Regards,
Andreas
Listaccount wrote:

> I would not have been surprised if the OOM killer had gone around in
> a low-memory situation, but I was surprised to see fork fail on a
> system with 1 GB of memory available.

We've had *a lot* of problem reports due to the OOM killer.

--
Alvaro Herrera                        http://www.PlanetPostgreSQL.org/
"Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)
On Tue, Dec 11, 2007 at 03:08:36PM +0100, Listaccount wrote:
> I would not have been surprised if the OOM killer had gone around in
> a low-memory situation, but I was surprised to see fork fail on a
> system with 1 GB of memory available.

You don't understand: the system _did not_ have 1 GB of memory
available. It was all committed to applications that had asked for it.
Just because they asked for it even though they were never going to use
it doesn't mean that it isn't gone. It's used, as far as the kernel is
concerned. The overcommit trick some OSes have implemented is a filthy
hack to get around poor memory allocation discipline in applications.

The point of the PostgreSQL documentation is to tell you how best to
run Postgres, safely and reliably. The only safe and reliable way to
run on Linux is not to use overcommit. Turning it off ensures that the
system can't run out of memory in this way.

What I _would_ support in the docs is the following addition in 17.4.3,
where this is discussed:

    . . .it will lower the chances significantly and will therefore
    lead to more robust system behavior. It may also cause fork() to
    fail when the machine appears to have available memory. This is
    done by selecting. . .

Or something like that. This would warn potential users that they
really do need to read their kernel docs.

A
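The limit that strict mode enforces follows a simple formula, per the kernel's Documentation/vm/overcommit-accounting. The sketch below glosses over hugepage reservations and per-kernel-version details, and the helper name is invented:

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Approximate CommitLimit under vm.overcommit_memory=2:
        CommitLimit = swap + ram * overcommit_ratio / 100
    vm.overcommit_ratio defaults to 50 on Linux."""
    return swap_kb + ram_kb * overcommit_ratio // 100

# A 1 GiB RAM / 2 GiB swap box with the default ratio can only have
# about 2.5 GiB of address space committed across *all* processes --
# every byte beyond that makes malloc() or fork() fail, regardless of
# how much memory "free" appears to show.
print(commit_limit_kb(1024 * 1024, 2 * 1024 * 1024), "kB CommitLimit")
```

This is why Andreas saw fork fail while tools reported free memory: the accounting is against promised address space, not against resident pages.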
Quoting Andrew Sullivan <ajs@crankycanuck.ca>:

> On Tue, Dec 11, 2007 at 03:08:36PM +0100, Listaccount wrote:
>> I would not have been surprised if the OOM killer had gone around in
>> a low-memory situation, but I was surprised to see fork fail on a
>> system with 1 GB of memory available.
>
> You don't understand: the system _did not_ have 1 GB of memory
> available. It was all committed to applications that had asked for
> it. Just because they asked for it even though they were never going
> to use it doesn't mean that it isn't gone. It's used, as far as the
> kernel is concerned. The overcommit trick some OSes have implemented
> is a filthy hack to get around poor memory allocation discipline in
> applications.

For sure I understand the problem; the key is how you define
"available". But I agree with you that overcommit obfuscates careless
application design.

> The point of the PostgreSQL documentation is to tell you how best to
> run Postgres, safely and reliably. The only safe and reliable way to
> run on Linux is not to use overcommit. Turning it off ensures that
> the system can't run out of memory in this way.

Yes, but the documentation should at least warn if some setting *could*
lead to trouble you would not have otherwise.

> What I _would_ support in the docs is the following addition in 17.4.3,
> where this is discussed:
>
>     . . .it will lower the chances significantly and will therefore
>     lead to more robust system behavior. It may also cause fork() to
>     fail when the machine appears to have available memory. This is
>     done by selecting. . .
>
> Or something like that. This would warn potential users that they
> really do need to read their kernel docs.

On this one we can agree. Maybe we should also mention the root cause:

    "It may also cause fork() to fail when the machine appears to have
    available memory because of other applications doing careless
    memory allocation."

It would be nice to save others from learning about this the hard way.

Regards,
Andreas
Dear docs mavens:

Please see below for a possible adjustment to the docs. Is it
agreeable? If so, I'll see about putting together a patch.

On Wed, Dec 12, 2007 at 05:19:24PM +0100, Listaccount wrote:
> > What I _would_ support in the docs is the following addition in 17.4.3,
> > where this is discussed:
> >
> >     . . .it will lower the chances significantly and will therefore
> >     lead to more robust system behavior. It may also cause fork() to
> >     fail when the machine appears to have available memory. This is
> >     done by selecting. . .
> >
> > Or something like that. This would warn potential users that they
> > really do need to read their kernel docs.
>
> On this one we can agree. Maybe we should also mention the root cause:
>
>     "It may also cause fork() to fail when the machine appears to have
>     available memory because of other applications doing careless
>     memory allocation."
>
> It would be nice to save others from learning about this the hard way.
>
> Regards,
>
> Andreas
Quoting Andrew Sullivan <ajs@crankycanuck.ca>:

> Dear docs mavens:
>
> Please see below for a possible adjustment to the docs. Is it
> agreeable? If so, I'll see about putting together a patch.

Thanks!

BTW: How can one find out which application is doing the unused
allocations? What value in the "ps" output should one watch for?

Regards,
Andreas
On Wed, Dec 12, 2007 at 06:01:52PM +0100, Listaccount wrote:
> BTW: How can one find out which application is doing the unused
> allocations? What value in the "ps" output should one watch for?

As far as I know, the only way to learn that is to use a debugger. If
the OS knew this, it'd be able to shoot the misbehaving process instead
of whatever it guesses on.

A
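That said, a large gap between VSZ and RSS in "ps" output is at least a rough (and only rough) heuristic for spotting processes that have asked for far more address space than they touch. A sketch, parsing sample `ps -eo pid,vsz,rss,comm` output; the helper name and the sample numbers below are invented for illustration:

```python
def vsz_rss_gap(ps_lines):
    """Given lines of `ps -eo pid,vsz,rss,comm` output (sizes in kB),
    return (gap, pid, command) tuples sorted by VSZ - RSS: roughly how
    much address space each process has asked for but never touched."""
    rows = []
    for line in ps_lines:
        fields = line.split(None, 3)
        if len(fields) == 4 and fields[0].isdigit():
            pid, vsz, rss, comm = fields
            rows.append((int(vsz) - int(rss), int(pid), comm))
    return sorted(rows, reverse=True)

# Invented sample output; on a real system capture `ps -eo pid,vsz,rss,comm`.
sample = [
    "  PID    VSZ   RSS COMMAND",
    " 1234 524288 16384 java",
    " 2345  81920 65536 postgres",
]
for gap, pid, comm in vsz_rss_gap(sample):
    print(pid, comm, gap, "kB of untouched address space")
```

The gap includes shared libraries and memory-mapped files as well, so as Andrew says it is not authoritative; it only suggests where to point a debugger first.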
Listaccount wrote:

> Yes, but the documentation should at least warn if some setting
> *could* lead to trouble you would not have otherwise.
>
> > What I _would_ support in the docs is the following addition in 17.4.3,
> > where this is discussed:
> >
> >     . . .it will lower the chances significantly and will therefore
> >     lead to more robust system behavior. It may also cause fork() to
> >     fail when the machine appears to have available memory. This is
> >     done by selecting. . .
> >
> > Or something like that. This would warn potential users that they
> > really do need to read their kernel docs.
>
> On this one we can agree. Maybe we should also mention the root cause:
>
>     "It may also cause fork() to fail when the machine appears to have
>     available memory because of other applications doing careless
>     memory allocation."
>
> It would be nice to save others from learning about this the hard way.

Good. Text added, in parentheses:

    On Linux 2.6 and later, an additional measure is to modify the
    kernel's behavior so that it will not <quote>overcommit</> memory.
    Although this setting will not prevent the OOM killer from being
    invoked altogether, it will lower the chances significantly and
    will therefore lead to more robust system behavior. (It might also
    cause fork() to fail when the machine appears to have available
    memory because of other applications with careless memory
    allocation.) This is done by selecting strict overcommit mode via

--
Bruce Momjian <bruce@momjian.us>        http://momjian.us
EnterpriseDB                            http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +