Thread: Pre-allocation of shared memory ...
There is a problem which occurs from time to time and which is a bit nasty in business environments. When the shared memory is eaten up by some application such as Apache, PostgreSQL will refuse to do what it should do because there is no memory around. To many people this looks like a problem related to stability. Also, it influences the availability of the database itself. I was thinking of a solution which might help to get around this problem: if we had a flag to tell PostgreSQL that XXX megs of shared memory should be preallocated by PostgreSQL, the database could then be sure that there is always enough memory around. The problem is that PostgreSQL would have to care more about memory consumption. Of course, the best solution is to put PostgreSQL on a separate machine, but many people don't do that, so we have to live with memory leaks caused by other software (we have just seen a nasty one in mod_perl). Does it make sense? Regards, Hans -- Cybertec Geschwinde u Schoenig Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/2952/30706; +43/664/233 90 75 www.cybertec.at, www.postgresql.at, kernel.cybertec.at
We already pre-allocate all shared memory and resources on postmaster start. --------------------------------------------------------------------------- Hans-Jürgen Schönig wrote: > There is a problem which occurs from time to time and which is a bit > nasty in business environments. > When the shared memory is eaten up by some application such as Apache, > PostgreSQL will refuse to do what it should do because there is no > memory around. To many people this looks like a problem related to > stability. Also, it influences the availability of the database itself. > > I was thinking of a solution which might help to get around this problem: > if we had a flag to tell PostgreSQL that XXX megs of shared memory > should be preallocated by PostgreSQL, the database could then be sure > that there is always enough memory around. The problem is that PostgreSQL > would have to care more about memory consumption. > > Of course, the best solution is to put PostgreSQL on a separate machine, > but many people don't do that, so we have to live with memory leaks caused > by other software (we have just seen a nasty one in mod_perl). > > Does it make sense? > > Regards, > > Hans -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian wrote: > We already pre-allocate all shared memory and resources on postmaster > start. I guess we allocate memory when a backend starts, don't we? Or do we allocate when the instance starts? I have two explanations for the following behaviour: a. a bug b. not enough shared memory WARNING: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your database system connection and exit. Please reconnect to the database system and repeat your query. server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. connection to server was lost The problem is that this only happens with mod_perl and Apache on the same machine, so I thought it had to do with a known memory leak in mod_perl/Apache. It happens after about two weeks (it seems to occur regularly). > Are you suggesting pre-acquiring resources like oracle does? Like you start a > database instance, 350MB memory is gone types? > > One thing I love about postgresql is that it does not do any such silly thing. > I agree in the case you suggest, it makes sense. > > If at all postgresql goes that way, I would like to see it configurable. I > would rather remove an app from a machine than let it stamp on > other apps' feet. Shridhar. Yes, when preallocating some memory it has to be configurable (default = off). -- Cybertec Geschwinde u Schoenig Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/2952/30706; +43/664/233 90 75 www.cybertec.at, www.postgresql.at, kernel.cybertec.at
Hans-Jürgen Schönig <hs@cybertec.at> writes: > I have two explanations for the following behaviour: > a. a bug > b. not enough shared memory > WARNING: Message from PostgreSQL backend: > The Postmaster has informed me that some other backend > died abnormally and possibly corrupted shared memory. Is this a Linux machine? If so, the true explanation is probably (c): the kernel is kill 9'ing randomly-chosen database processes whenever it starts to feel low on memory. I would suggest checking the postmaster log to determine the signal number the failed backends are dying with. The client-side message does not give nearly enough info to debug such problems. There is also possibility (d): you have some bad RAM that is located in an address range that doesn't get used until the machine is under full load. But if the backends are dying with signal 9 then I'll take the kernel-kill theory. AFAIK the only good way around this problem is to use another OS with a more rational design for handling low-memory situations. No other Unix does anything remotely as brain-dead as what Linux does. Or bug your favorite Linux kernel hacker to fix the kernel. regards, tom lane
Tom Lane wrote: > Hans-Jürgen Schönig <hs@cybertec.at> writes: > > I have two explanations for the following behaviour: > > a. a bug > > b. not enough shared memory > > > WARNING: Message from PostgreSQL backend: > > The Postmaster has informed me that some other backend > > died abnormally and possibly corrupted shared memory. > > Is this a Linux machine? If so, the true explanation is probably (c): > the kernel is kill 9'ing randomly-chosen database processes whenever > it starts to feel low on memory. I would suggest checking the > postmaster log to determine the signal number the failed backends are > dying with. The client-side message does not give nearly enough info > to debug such problems. > > There is also possibility (d): you have some bad RAM that is located in > an address range that doesn't get used until the machine is under full > load. But if the backends are dying with signal 9 then I'll take the > kernel-kill theory. > > AFAIK the only good way around this problem is to use another OS with a > more rational design for handling low-memory situations. No other Unix > does anything remotely as brain-dead as what Linux does. Or bug your > favorite Linux kernel hacker to fix the kernel. Is there no sysctl way to disable such kills? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Tom Lane wrote: > > AFAIK the only good way around this problem is to use another OS with a > > more rational design for handling low-memory situations. No other Unix > > does anything remotely as brain-dead as what Linux does. Or bug your > > favorite Linux kernel hacker to fix the kernel. > > Is there no sysctl way to disable such kills? The -ac kernel patches from Alan Cox have a sysctl to control memory overcommit--you can set it to track memory usage and fail allocations when memory runs out, rather than the random kill behavior. I'm not sure whether those have made it into the stock kernel yet, but the vendor kernels (such as Red Hat's) might have it too. -Doug
On Wed, Jun 11, 2003 at 07:35:20PM -0400, Doug McNaught wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > Is there no sysctl way to disable such kills? > > The -ac kernel patches from Alan Cox have a sysctl to control memory > overcommit--you can set it to track memory usage and fail allocations > when memory runs out, rather than the random kill behavior. I'm not > sure whether those have made it into the stock kernel yet, but the > vendor kernels (such as Red Hat's) might have it too. Yeah, I see it in the Mandrake kernel. But it's not in stock 2.4.19, so you can't assume everybody has it. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "What do the years matter? What really matters is finding out that, when all is said and done, the best age of one's life is being alive" (Mafalda)
> Yeah, I see it in the Mandrake kernel. But it's not in stock 2.4.19, so > you can't assume everybody has it. We had this problem on a recent version of good old Slackware. I think we also had it on RedHat 8 or so. Doing this kind of killing is definitely a bad habit. I thought it had to do with something else, so my proposal for pre-allocation seems to be pretty obsolete ;). Thanks a lot. Hans -- Cybertec Geschwinde u Schoenig Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/2952/30706; +43/664/233 90 75 www.cybertec.at, www.postgresql.at, kernel.cybertec.at
On this machine (RH9, kernel 2.4.20-18.9) the docs say (in /usr/src/linux-2.4/Documentation/vm/overcommit-accounting ):
-----------------
The Linux kernel supports four overcommit handling modes

0 - Heuristic overcommit handling. Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage

1 - No overcommit handling. Appropriate for some scientific applications

2 - (NEW) strict overcommit. The total address space commit for the system is not permitted to exceed swap + half ram. In almost all situations this means a process will not be killed while accessing pages but only by malloc failures that are reported back by the kernel mmap/brk code.

3 - (NEW) paranoid overcommit. The total address space commit for the system is not permitted to exceed swap. The machine will never kill a process accessing pages it has mapped except due to a bug (ie report it!)
----------------------
So maybe sysctl -w vm.overcommit_memory=3 is what's needed? I guess you might pay a performance hit for doing that, though. andrew > > Yeah, I see it in the Mandrake kernel. But it's not in stock 2.4.19, > > so you can't assume everybody has it. > > We had this problem on a recent version of good old Slackware. > I think we also had it on RedHat 8 or so. > > Doing this kind of killing is definitely a bad habit. I thought it had > to do with something else, so my proposal for pre-allocation seems to > be pretty obsolete ;). > > Thanks a lot. > > Hans
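As a minimal sketch of the accounting rule those docs describe for modes 2 and 3 -- a restatement of the documented policy only, not actual kernel source, with page-granular bookkeeping assumed for simplicity:

    /* Simplified sketch of the strict-overcommit rule described in
     * Documentation/vm/overcommit-accounting -- not real kernel code.
     * All quantities are in pages. */
    #include <stdbool.h>

    extern unsigned long committed_as;   /* address space committed so far */

    bool commit_ok(unsigned long request, unsigned long swap_pages,
                   unsigned long ram_pages, int overcommit_mode)
    {
        unsigned long limit;

        switch (overcommit_mode) {
        case 2:                          /* strict: swap + half of RAM */
            limit = swap_pages + ram_pages / 2;
            break;
        case 3:                          /* paranoid: swap only */
            limit = swap_pages;
            break;
        default:                         /* modes 0 and 1: heuristic / none */
            return true;
        }
        return committed_as + request <= limit;
    }

Under modes 2 and 3, a fork() or mmap()/brk() that would push the commit total past the limit fails with ENOMEM up front, which is exactly the failure mode the rest of this thread argues for.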
Tom Lane wrote: > Is this a Linux machine? If so, the true explanation is probably (c): > the kernel is kill 9'ing randomly-chosen database processes whenever > it starts to feel low on memory. I would suggest checking the > postmaster log to determine the signal number the failed backends are > dying with. The client-side message does not give nearly enough info > to debug such problems. > > AFAIK the only good way around this problem is to use another OS with a > more rational design for handling low-memory situations. No other Unix > does anything remotely as brain-dead as what Linux does. Or bug your > favorite Linux kernel hacker to fix the kernel. Tom- Just curious. What would a rationally designed OS do in an out-of-memory situation? It seems like from the discussions I've read about the subject there really is no rational solution to this irrational problem. Some solutions such as "suspend process, write image to file" and "increase swap space" assume disk space is available, which is obviously not guaranteed. -- -**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*--- Jon Lapham <lapham@extracta.com.br> Rio de Janeiro, Brasil Work: Extracta Moléculas Naturais SA http://www.extracta.com.br/ Web: http://www.jandr.org/ ***-*--*----*-------*------------*--------------------*---------------
Jon Lapham <lapham@extracta.com.br> writes: > Just curious. What would a rationally designed OS do in an out of > memory situation? Fail malloc() requests. The sysctl docs that Andrew Dunstan just provided give some insight into the problem: the default behavior of Linux is to promise more virtual memory than it can actually deliver. That is, it allows malloc to succeed even when it's not going to be able to actually provide the address space when push comes to shove. When called to stand and deliver, the kernel has no way to report failure (other than perhaps a software-induced SIGSEGV, which would hardly be an improvement). So it kills the process instead. Unfortunately, the process that happens to be in the line of fire at this point could be any process, not only the one that made unreasonable memory demands. This is perhaps an okay behavior for desktop systems being run by people who are accustomed to Microsoft-like reliability. But to make it the default is brain-dead, and to make it the only available behavior (as seems to have been true until very recently) defies belief. The setting now called "paranoid overcommit" is IMHO the *only* acceptable one for any sort of server system. With anything else, you risk having critical userspace daemons killed through no fault of their own. regards, tom lane
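Tom's description is easy to reproduce with a toy program. The following sketch (an illustration, not something posted in the thread -- and not one to run on a machine doing real work) keeps allocating and touching memory: under the default policy the malloc() calls succeed far past RAM plus swap and the process eventually dies with an untrappable SIGKILL while touching pages, whereas under strict accounting malloc() returns NULL and the failure can be handled:

    /* Overcommit demo: under Linux's default policy the allocations
     * succeed and the process dies with SIGKILL only when the pages
     * are actually touched.  With strict accounting, malloc() returns
     * NULL here and we exit gracefully instead. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        size_t chunk = 64UL * 1024 * 1024;      /* 64 MB per step */
        size_t total = 0;

        for (;;) {
            char *p = malloc(chunk);
            if (p == NULL) {                    /* the sane failure mode */
                fprintf(stderr, "malloc failed after %zu MB\n", total >> 20);
                return 1;
            }
            memset(p, 0xAA, chunk);             /* force pages to be backed */
            total += chunk;
            printf("touched %zu MB\n", total >> 20);
        }
    }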
Tom Lane wrote: > [snip] > The setting now called "paranoid overcommit" is IMHO the *only* acceptable > one for any sort of server system. With anything else, you risk having > critical userspace daemons killed through no fault of their own. Wow. Thanks for the info. I found the documentation you are referring to in Documentation/vm/overcommit-accounting (on a stock RH9 machine). It seems that the overcommit policy is set via the sysctl `vm.overcommit_memory'. So... [root@bilbo src]# sysctl -a | grep -i overcommit vm.overcommit_memory = 0 ...the default seems to be "Heuristic overcommit handling". It seems that what we want is "vm.overcommit_memory = 3" for paranoid overcommit. Thanks for getting to the bottom of this Tom. It *is* insane that the default isn't "paranoid overcommit". -- -**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*--- Jon Lapham <lapham@extracta.com.br> Rio de Janeiro, Brasil Work: Extracta Moléculas Naturais SA http://www.extracta.com.br/ Web: http://www.jandr.org/ ***-*--*----*-------*------------*--------------------*---------------
What really kills [:-)] me is that they allocate memory assuming I will not be using it all, then terminate the executable in an unrecoverable way when I go to use the memory. And, they make a judgement on users who don't want this by calling them "paranoid". I will add something to the docs about this. --------------------------------------------------------------------------- Tom Lane wrote: > Jon Lapham <lapham@extracta.com.br> writes: > > Just curious. What would a rationally designed OS do in an out of > > memory situation? > > Fail malloc() requests. > > The sysctl docs that Andrew Dunstan just provided give some insight into > the problem: the default behavior of Linux is to promise more virtual > memory than it can actually deliver. That is, it allows malloc to > succeed even when it's not going to be able to actually provide the > address space when push comes to shove. When called to stand and > deliver, the kernel has no way to report failure (other than perhaps a > software-induced SIGSEGV, which would hardly be an improvement). So it > kills the process instead. Unfortunately, the process that happens to > be in the line of fire at this point could be any process, not only the > one that made unreasonable memory demands. > > This is perhaps an okay behavior for desktop systems being run by > people who are accustomed to Microsoft-like reliability. But to make it > the default is brain-dead, and to make it the only available behavior > (as seems to have been true until very recently) defies belief. The > setting now called "paranoid overcommit" is IMHO the *only* acceptable > one for any sort of server system. With anything else, you risk having > critical userspace daemons killed through no fault of their own. > > regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > What really kills [:-)] me is that they allocate memory assuming I will > not be using it all, then terminate the executable in an unrecoverable > way when I go to use the memory. To be fair, I'm probably misstating things by referring to malloc(). The big problem probably comes from fork() with copy-on-write --- the kernel has no good way to estimate how much of the shared address space will eventually become private modified copies, but it can be forgiven for wanting to make less than the worst-case assumption. Still, if you are wanting to run a reliable server, I think worst-case assumption is exactly what you want. Swap space is cheap, and there's no reason you shouldn't have enough swap to support the worst-case situation. If the swap area goes largely unused, that's fine. The policy they're calling "paranoid overcommit" (don't allocate more virtual memory than you have swap) is as far as I know the standard on all Unixen other than Linux; certainly it's the traditional behavior. regards, tom lane
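A minimal sketch of the fork() scenario Tom describes (again an illustration, not from the thread): after the fork, every writable page the parent owns may become a private copy in the child, so a conservative kernel has to reserve the full amount at fork time or be left with nothing to do but kill somebody later:

    /* Why fork() must reserve the whole writable address space under a
     * conservative policy: the child below dirties every page it shares
     * with the parent, so copy-on-write really does end up needing a
     * complete second copy. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        size_t size = 256UL * 1024 * 1024;   /* 256 MB of writable heap */
        char *buf = malloc(size);
        if (buf == NULL)
            return 1;
        memset(buf, 1, size);                /* parent's copy is resident */

        pid_t pid = fork();                  /* strict accounting charges
                                                another 256 MB right here */
        if (pid < 0) {
            perror("fork");                  /* ENOMEM: the honest answer */
            return 1;
        }
        if (pid == 0) {
            memset(buf, 2, size);            /* child forces private copies */
            _exit(0);
        }
        wait(NULL);
        return 0;
    }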
OK, doc patch attached and applied. Improvements? --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > What really kills [:-)] me is that they allocate memory assuming I will > > not be using it all, then terminate the executable in an unrecoverable > > way when I go to use the memory. > > To be fair, I'm probably misstating things by referring to malloc(). > The big problem probably comes from fork() with copy-on-write --- the > kernel has no good way to estimate how much of the shared address space > will eventually become private modified copies, but it can be forgiven > for wanting to make less than the worst-case assumption. > > Still, if you are wanting to run a reliable server, I think worst-case > assumption is exactly what you want. Swap space is cheap, and there's > no reason you shouldn't have enough swap to support the worst-case > situation. If the swap area goes largely unused, that's fine. > > The policy they're calling "paranoid overcommit" (don't allocate more > virtual memory than you have swap) is as far as I know the standard on > all Unixen other than Linux; certainly it's the traditional behavior. > > regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073

Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/runtime.sgml,v
retrieving revision 1.184
diff -c -c -r1.184 runtime.sgml
*** doc/src/sgml/runtime.sgml   11 Jun 2003 22:13:21 -0000   1.184
--- doc/src/sgml/runtime.sgml   12 Jun 2003 15:29:45 -0000
***************
*** 2780,2785 ****
--- 2780,2795 ----
      <filename>/usr/src/linux/include/asm-<replaceable>xxx</>/shmparam.h</>
      and <filename>/usr/src/linux/include/linux/sem.h</>.
     </para>
+
+    <para>
+     Linux has poor default memory overcommit behavior. Rather than
+     failing if it can not reserve enough memory, it returns success,
+     but later fails when the memory can't be mapped and terminates
+     the application. To prevent unpredictable process termination, use:
+ <programlisting>
+ sysctl -w vm.overcommit_memory=3
+ </programlisting>
+    </para>
Bruce Momjian <pgman@candle.pha.pa.us> writes: > OK, doc patch attached and applied. Improvements? I think it would be worth spending another sentence to tell people exactly what the symptom looks like, ie, backends dying with signal 9. regards, tom lane
I have added the following sentence to the docs too: Note, you will need enough swap space to cover all your memory needs. I still wish Linux would just fail the fork/malloc when memory is low, rather than requiring swap for everything _or_ overcommitting. I wonder if making a unified buffer cache just made that too hard to do. --------------------------------------------------------------------------- Andrew Dunstan wrote: > > On this machine (RH9, kernel 2.4.20-18.9) the docs say (in > /usr/src/linux-2.4/Documentation/vm/overcommit-accounting ): > > ----------------- > The Linux kernel supports four overcommit handling modes > > 0 - Heuristic overcommit handling. Obvious overcommits of > address space are refused. Used for a typical system. It > ensures a seriously wild allocation fails while allowing > overcommit to reduce swap usage > > 1 - No overcommit handling. Appropriate for some scientific > applications > > 2 - (NEW) strict overcommit. The total address space commit > for the system is not permitted to exceed swap + half ram. > In almost all situations this means a process will not be > killed while accessing pages but only by malloc failures > that are reported back by the kernel mmap/brk code. > > 3 - (NEW) paranoid overcommit The total address space commit > for the system is not permitted to exceed swap. The machine > will never kill a process accessing pages it has mapped > except due to a bug (ie report it!) > ---------------------- > > So maybe > > sysctl -w vm.overcommit_memory=3 > > is what's needed? I guess you might pay a performance hit for doing that, > though. > > andrew > > > > Yeah, I see it in the Mandrake kernel. But it's not in stock 2.4.19, > > > so you can't assume everybody has it. > > > > We had this problem on a recent version of good old Slackware. > > I think we also had it on RedHat 8 or so. > > > > Doing this kind of killing is definitely a bad habit. I thought it had > > to do with something else, so my proposal for pre-allocation seems > > to be pretty obsolete ;). > > > > Thanks a lot. > > > > Hans -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
OK, new text is: <para> Linux has poor default memory overcommit behavior. Rather than failing if it can not reserve enough memory, it returns success, but later fails when the memory can't be mapped and terminates the application with <literal>kill -9</>. To prevent unpredictable process termination, use: <programlisting> sysctl -w vm.overcommit_memory=3 </programlisting> Note, you will need enough swap space to cover all your memory needs. </para> </listitem> </varlistentry> --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > OK, doc patch attached and applied. Improvements? > > I think it would be worth spending another sentence to tell people > exactly what the symptom looks like, ie, backends dying with signal 9. > > regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
A couple of points: . It is probably a good idea to do this via /etc/sysctl.conf, which will be called earlyish by init scripts (on RH9 it is in the network startup file, for some reason). . The setting is not available on all kernel versions AFAIK. The admin needs to check the docs. I have no idea when this went into the kernel, and no time to spend finding out. Even if we knew, it might have gone into vendor kernels at other odd times - there are often times when the vendors are in advance of the officially released kernels. Andrew Bruce wrote: > > OK, new text is: > > <para> > Linux has poor default memory overcommit behavior. Rather than > failing if it can not reserve enough memory, it returns success, > but later fails when the memory can't be mapped and terminates > the application with <literal>kill -9</>. To prevent > unpredictable process termination, use: > <programlisting> > sysctl -w vm.overcommit_memory=3 > </programlisting> > Note, you will need enough swap space to cover all your memory > needs. > </para> > </listitem> > </varlistentry> > > --------------------------------------------------------------------------- > > Tom Lane wrote: >> Bruce Momjian <pgman@candle.pha.pa.us> writes: >> > OK, doc patch attached and applied. Improvements? >> >> I think it would be worth spending another sentence to tell people >> exactly what the symptom looks like, ie, backends dying with signal 9. >> >> regards, tom lane
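For reference, the persistent form Andrew suggests would look something like this in /etc/sysctl.conf, assuming a kernel that supports the setting at all (check Documentation/vm/overcommit-accounting in your kernel source tree first):

    # /etc/sysctl.conf -- applied early at boot by the init scripts
    # Only effective on kernels with strict-overcommit support
    # (e.g. the -ac series and some vendor kernels)
    vm.overcommit_memory = 3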
Well, let's see what feedback we get. --------------------------------------------------------------------------- Andrew Dunstan wrote: > > A couple of points: > > . It is probably a good idea to do this via /etc/sysctl.conf, which will > be called earlyish by init scripts (on RH9 it is in the network startup > file, for some reason). > > . The setting is not available on all kernel versions AFAIK. The admin needs > to check the docs. I have no idea when this went into the kernel, and no > time to spend finding out. Even if we knew, it might have gone into vendor > kernels at other odd times - there are often times when the vendors are in > advance of the officially released kernels. > > Andrew > > > Bruce wrote: > > > > OK, new text is: > > > > <para> > > Linux has poor default memory overcommit behavior. Rather than > > failing if it can not reserve enough memory, it returns success, > > but later fails when the memory can't be mapped and terminates > > the application with <literal>kill -9</>. To prevent > > unpredictable process termination, use: > > <programlisting> > > sysctl -w vm.overcommit_memory=3 > > </programlisting> > > Note, you will need enough swap space to cover all your memory > > needs. > > </para> > > </listitem> > > </varlistentry> -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Tom Lane <tgl@sss.pgh.pa.us> writes: > The policy they're calling "paranoid overcommit" (don't allocate more > virtual memory than you have swap) is as far as I know the standard on > all Unixen other than Linux; certainly it's the traditional behavior. Uhm, it's traditional for Unixen without extensive shared memory usage like SunOS 4. But it's not nearly as standard as you say. In fact Linux wasn't the first major Unix to behave this way at all. As far as I know, that honour belongs to AIX. Not coincidentally, one of the first Unixen to have shared libraries. Hence the AIX invention of SIGDANGER which told a process its death was imminent. On AIX the heuristic was to kill the largest process in order to clear up the most memory -- which had a nasty habit of picking the X server to kill, which of course, well, it cleared up lots of memory... I think they "fixed" that by changing the heuristic to kill the *second* biggest process. I think you'll find this overcommit issue affects many if not most Unixen. There's a bit of a vicious circle here, a lot of software now have the habit of starting off by mallocing huge chunks of memory that they never need because "well the machine has virtual memory so it doesn't cost anything". -- greg
Greg Stark <gsstark@mit.edu> writes: > I think you'll find this overcommit issue affects many if not most Unixen. I'm unconvinced, because I've only ever heard of the problem affecting Postgres on Linux. regards, tom lane
Tom Lane wrote: > Greg Stark <gsstark@mit.edu> writes: > > I think you'll find this overcommit issue affects many if not most Unixen. > > I'm unconvinced, because I've only ever heard of the problem affecting > Postgres on Linux. What I don't understand is why they just don't start failing on fork/malloc rather than killing things. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Thu, Jun 12, 2003 at 08:08:28PM -0400, Bruce Momjian wrote: > > > > I'm unconvinced, because I've only ever heard of the problem affecting > > Postgres on Linux. > > What I don't understand is why they just don't start failing on > fork/malloc rather than killing things. I may be way off the mark here, falling into the middle of this as I am, but it may be because the kernel overcommits the memory (which is sort of logical in a way given the way fork() works). That may mean that malloc() thinks it gets more memory and returns a pointer, but the kernel hasn't actually committed that address space yet and waits to see if it's ever going to be needed. Given the right allocation proportions, this may mean that in the end the kernel has no way to handle a shortage gracefully by causing fork() or allocations to fail. I would assume it then goes through its alternatives like scaling back its file cache--which it'd probably start to do before a lot of swapping was needed, so not much to scrape out of that barrel. After that, where do you go? Try to find a reasonable process to shoot in the head. From what I heard, although I haven't kept current, a lot of work went into selecting a "reasonable" process, so there will be some determinism. And if you have occasion to find out in the first place, "some determinism" usually means "suspiciously bad luck." Jeroen
I'm not saying you're wrong, but I also think it's true that typical Linux usage patterns are rather different from those of other *nixen. Linux started out being able to do a lot with a little, and is still often used that way - with more functions crammed into boxes with less resources. When I last worked in a data centre (a few years ago now, for one of the world's largest companies) they had hundreds of AIX and HP-UX boxes, each well resourced and each dedicated to exactly one function. I rarely see Linux being used that way, and I often see it configured with lowish memory and not nearly enough swap. In any case, it seems to me we need to have someone check that setting the vm.overcommit_memory to paranoid will actually stop the postmaster being killed. I'd love to help but I'm up to my ears in stuff right now. If we know that we can save the philosophical stuff for another day :-) cheers andrew ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "Greg Stark" <gsstark@mit.edu> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, June 12, 2003 6:19 PM Subject: Re: [HACKERS] Pre-allocation of shared memory ... > Greg Stark <gsstark@mit.edu> writes: > > I think you'll find this overcommit issue affects many if not most Unixen. > > I'm unconvinced, because I've only ever heard of the problem affecting > Postgres on Linux. > > regards, tom lane
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > Given the right allocation proportions, this may mean that in the end the > kernel has no way to handle a shortage gracefully by causing fork() or > allocations to fail. Sure it does. All you need is a conservative allocation policy: fork() fails if it cannot reserve enough swap space to guarantee that the new process could write over its entire address space. Copy-on-write is an optimization that reduces physical RAM usage, not virtual address space or swap-space requirements. Given that swap space is cheap, and that killing random processes is obviously bad, it's not apparent to me why people think this is not a good approach --- at least for high-reliability servers. And Linux would definitely like to think of itself as a server-grade OS. > After that, where do you go? Try to find a reasonable process to shoot > in the head. From what I heard, although I haven't kept current, a lot > of work went into selecting a "reasonable" process, so there will be some > determinism. Considering the frequency with which we hear of database backends getting shot in the head, I'd say those heuristics need lots of work yet. I'll take a non-heuristic solution for any system I have to administer, thanks. regards, tom lane
On Thu, Jun 12, 2003 at 09:18:33PM -0400, Tom Lane wrote: > Given that swap space is cheap, and that killing random processes is > obviously bad, it's not apparent to me why people think this is not > a good approach --- at least for high-reliability servers. And Linux > would definitely like to think of itself as a server-grade OS. Well, it was a toy OS when conceived, that's for sure. But it's getting better. > Considering the frequency with which we hear of database backends > getting shot in the head, I'd say those heuristics need lots of work > yet. Previous versions were said to attempt to kill init. You have to admit there has been some progress. But then there's the problem of people running database servers on misconfigured machines. They should know better than not setting enough swap space, IMHO anyway. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "And a voice from the chaos spoke to me and said, 'Smile and be happy; it could be worse.' And I smiled. And I was happy. And it was worse."
Tom Lane wrote: > "Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > > Given the right allocation proportions, this may mean that in the end the > > kernel has no way to handle a shortage gracefully by causing fork() or > > allocations to fail. > > Sure it does. All you need is a conservative allocation policy: fork() > fails if it cannot reserve enough swap space to guarantee that the new > process could write over its entire address space. Copy-on-write is > an optimization that reduces physical RAM usage, not virtual address > space or swap-space requirements. > > Given that swap space is cheap, and that killing random processes is > obviously bad, it's not apparent to me why people think this is not > a good approach --- at least for high-reliability servers. And Linux > would definitely like to think of itself as a server-grade OS. BSD used to require full swap behind all RAM. I am not sure if that was changed in BSD 4.4 or in later BSD/OS releases, but it is no longer true. I think now it can use RAM or swap as reserved backing store for fork page modifications. However, when the system runs out of swap, it hangs! > > After that, where do you go? Try to find a reasonable process to shoot > > in the head. From what I heard, although I haven't kept current, a lot > > of work went into selecting a "reasonable" process, so there will be some > > determinism. > > Considering the frequency with which we hear of database backends > getting shot in the head, I'd say those heuristics need lots of work > yet. I'll take a non-heuristic solution for any system I have to > administer, thanks. You have to love that swap + 1/2 ram option --- when you need four possible options, there is something wrong with your approach. :-) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > You have to love that swap + 1/2 ram option --- when you need four > possible options, there is something wrong with your approach. :-) I'm still wondering what the "no overcommit handling" option does, exactly. regards, tom lane
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > On Thu, Jun 12, 2003 at 09:18:33PM -0400, Tom Lane wrote: > > > Given that swap space is cheap, and that killing random processes is > > obviously bad, it's not apparent to me why people think this is not > > a good approach --- at least for high-reliability servers. And Linux > > would definitely like to think of itself as a server-grade OS. Consider the case of huge processes trying to fork/exec to run ls. It might seem kind of strange to be getting "Out of memory" errors from your java or database engine when there are hundreds of megs free on the machine... I suspect this was less of an issue in the days before copy on write because vfork was more widely used/implemented. I'm not sure linux even implements vfork other than just as a wrapper around fork. Even BSD ditched it a while back though I think I saw that NetBSD reimplemented it since then. > But then there's the problem of people running database servers on > misconfigured machines. They should know better than not setting enough > swap space, IMHO anyway. Well, I've seen DBAs say "Since I don't want the database swapping anyways, I'll make really sure it doesn't swap by just not giving it any swap space -- that's why we bought so much RAM in the first place". It's not obvious that you need swap to back memory the machine doesn't even report as being in use... -- greg
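Greg's fork/exec case, as a sketch (illustrative numbers, not from the thread): under strict accounting a process with a large heap must momentarily reserve its entire address space a second time just to spawn /bin/ls, even though the child replaces itself immediately; vfork() traditionally avoided that reservation by borrowing the parent's address space until the exec:

    /* A large process spawning /bin/ls.  Under strict accounting the
     * fork() must reserve a second copy of the big heap, so it can fail
     * with ENOMEM even though the child would exec immediately --
     * Greg's "Out of memory with hundreds of megs free" case. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        size_t size = 512UL * 1024 * 1024;   /* the big java/database heap */
        char *big = malloc(size);
        if (big == NULL)
            return 1;
        memset(big, 0, size);

        pid_t pid = fork();                  /* may fail under paranoid mode */
        if (pid < 0)
            perror("fork");
        else if (pid == 0) {
            execl("/bin/ls", "ls", (char *) NULL);
            _exit(127);                      /* exec failed */
        } else
            wait(NULL);

        free(big);
        return 0;
    }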
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > You have to love that swap + 1/2 ram option --- when you need four > > possible options, there is something wrong with your approach. :-) > > I'm still wondering what the "no overcommit handling" option does, > exactly. I assume it does no kills, and allows you to commit until you run out of swap and hang. This might be the BSD 4.4 behavior, actually. It is bad to hang the system, but if it reports swap failure, at least the admin knows why it failed, rather than killing random processes. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Greg Stark wrote: > I suspect this was less of an issue in the days before copy on write because > vfork was more widely used/implemented. I'm not sure linux even implements > vfork other than just as a wrapper around fork. Even BSD ditched it a while > back though I think I saw that NetBSD reimplemented it since then. > > > But then there's the problem of people running database servers on > > misconfigured machines. They should know better than not setting enough > > swap space, IMHO anyway. > > Well, I've seen DBAs say "Since I don't want the database swapping anyways, > I'll make really sure it doesn't swap by just not giving it any swap space -- > that's why we bought so much RAM in the first place". It's not obvious that > you need swap to back memory the machine doesn't even report as being in > use... I see no reason RAM can't be used as backing store for possible copy-on-write use. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I see no reason RAM can't be used as backing store for possible > copy-on-write use. Depends on the scenario. For a database like postgres it would work fairly well since that RAM is still available for filesystem buffers. For Oracle it would suck because it's not available for Oracle to allocate to use for its own buffers. And for a web server with an architecture like Apache it would suck because it would mean being restricted to a much lower number of processes than the machine could really handle. > > I'm still wondering what the "no overcommit handling" option does, > > exactly. > > I assume it does no kills, and allows you to commit until you run of of > swap and hang. This might be the BSD 4.4 behavior, actually. I think it just makes fork/mmap/sbrk return an error if you run out of swap. That makes the error appear most likely as malloc() returning null which most applications don't handle anyways and the user sees the same behaviour: programs crashing randomly. Of course that's not what high availability server software does but since most users' big memory consumers these days seem to be their window manager and its 3d animated window decorations... -- greg
Jeroen T. Vermeulen wrote: > After that, where do you go? Try to find a reasonable process to shoot > in the head. From what I heard, although I haven't kept current, a lot > of work went into selecting a "reasonable" process, so there will be some > determinism. FWIW, you can browse the logic linux uses to choose which process to kill here: http://lxr.linux.no/source/mm/oom_kill.c

If I read that right, this calculates "points" for each process, where:

points = vm_size_of_process / sqrt(cpu_time_it_ran) / sqrt(sqrt(clock_time_it_had))
         * 2 if the process was niced
         / 4 if the process ran as root
         / 4 if the process had hardware access

and whichever process has the most points dies. I'm guessing any database backend (postgres, oracle) that wasn't part of a long-lived connection seems like an especially attractive target to this algorithm. (Though hopefully it's all moot now that Andrew / Tom found/recommended the paranoid overcommit option, which sure seems like the most sane thing for a server to me.) Ron PS: Oracle DBAs suffer from the same pain. http://www.cs.helsinki.fi/linux/linux-kernel/2001-12/0098.html http://www.ussg.iu.edu/hypermail/linux/kernel/0103.3/0094.html
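Transcribed into C, Ron's reading of the heuristic comes out roughly as follows -- a simplified paraphrase of the 2.4 badness() function linked above, not the verbatim kernel source:

    /* Simplified paraphrase of the 2.4 oom_kill.c badness() heuristic;
     * see http://lxr.linux.no/source/mm/oom_kill.c for the real code. */
    #include <math.h>

    long badness(long vm_size,   /* total VM of the process */
                 long cpu_time,  /* CPU time it has consumed */
                 long run_time,  /* wall-clock time it has been alive */
                 int niced, int is_root, int has_hw_access)
    {
        double points = vm_size;

        /* long-running, busy processes are spared...
         * (+1 avoids dividing by zero in this sketch) */
        points /= sqrt((double) (cpu_time + 1));
        points /= sqrt(sqrt((double) (run_time + 1)));

        /* ...niced ones are punished, privileged ones are spared */
        if (niced)
            points *= 2;
        if (is_root)
            points /= 4;
        if (has_hw_access)
            points /= 4;

        return (long) points;    /* the highest score gets killed */
    }

A freshly started, memory-hungry backend has a large VM size, little accumulated CPU time, and a short lifetime, which is exactly why it scores so high under this formula.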
On Thu, Jun 12, 2003 at 07:22:14PM -0700, Ron Mayer wrote: > FWIW, you can browse the logic linux uses to choose > which process to kill here: > http://lxr.linux.no/source/mm/oom_kill.c Hey, this LXR thing is cool. It'd be nice to have one of those for Postgres. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Nature, so fragile, so exposed to death... and so alive"
On 12 Jun 2003 at 11:31, Bruce Momjian wrote: > > OK, doc patch attached and applied. Improvements? Can we point people to /usr/src/linux/doc...place where they can find more documentation and if their kernel supports it or not. Bye Shridhar -- Zall's Laws: (1) Any time you get a mouthful of hot soup, the next thing you do will be wrong. (2) How long a minute is depends on which side of the bathroom door you're on.
On Thu, Jun 12, 2003 at 07:22:14PM -0700, Ron Mayer wrote: > I'm guessing any database backend (postgres, oracle) > that wasn't part of a long-lived connection seems like > an especially attractive target to this algorithm. Yeah, IIRC it tries to pick daemons that can be restarted, or will be restarted automatically, but may need a lot less memory after that. Jeroen
Shridhar Daithankar wrote: > On 12 Jun 2003 at 11:31, Bruce Momjian wrote: > > > > OK, doc patch attached and applied. Improvements? > > Can we point people to /usr/src/linux/doc...place where they can find more > documentation and if their kernel supports it or not. Yes, we could, but the name of the parameter seems enough. They certainly can look that up. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Thu, Jun 12, 2003 at 10:10:02PM -0400, Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > You have to love that swap + 1/2 ram option --- when you need four > > > possible options, there is something wrong with your approach. :-) > > > > I'm still wondering what the "no overcommit handling" option does, > > exactly. > > I assume it does no kills, and allows you to commit until you run out of > swap and hang. This might be the BSD 4.4 behavior, actually. I thought the idea of no overcommit was that your malloc fails ENOMEM if there isn't enough memory free for your whole request, rather than gambling that other processes aren't actually using all of theirs right now and have pages swapped out. I don't see where the hang comes in.. > It is bad to hang the system, but if it reports swap failure, at least > the admin knows why it failed, rather than killing random processes. Yes! Patrick
Patrick Welche wrote: > On Thu, Jun 12, 2003 at 10:10:02PM -0400, Bruce Momjian wrote: > > Tom Lane wrote: > > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > > You have to love that swap + 1/2 ram option --- when you need four > > > > possible options, there is something wrong with your approach. :-) > > > > > > I'm still wondering what the "no overcommit handling" option does, > > > exactly. > > > > I assume it does no kills, and allows you to commit until you run out of > > swap and hang. This might be the BSD 4.4 behavior, actually. > > I thought the idea of no overcommit was that your malloc fails ENOMEM > if there isn't enough memory free for your whole request, rather than > gambling that other processes aren't actually using all of theirs right now > and have pages swapped out. I don't see where the hang comes in.. I think there are two important memory cases: malloc() - should fail right away if it can't reserve the requested memory; assuming applications request memory they don't use just seems dumb --- fix the bad apps. fork() - this is the tricky one because you don't know at fork time who is going to be sharing the data pages as read-only or doing an exec to overlay a new process, and who is going to be modifying them and need a private copy. I think only the fork case is tricky. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Fri, Jun 13, 2003 at 09:25:49AM -0400, Bruce Momjian wrote: > > malloc() - should fail right away if it can't reserve the requested > memory; assuming application request memory they don't use just seems > dumb --- fix the bad apps. > > fork() - this is the tricky one because you don't know at fork time who > is going to be sharing the data pages as read-only or doing an exec to > overlay a new process, and who is going to be modifying them and need a > private copy. > > I think only the fork case is tricky. But how do you tell that a malloc() can't get enough memory, once you've had to overcommit on fork()s? If a really large program did a regular fork()/exec() and there wasn't enough free virtual memory to support the full fork() "just in case the program isn't going to exec()," then *any* malloc() occurring between the two calls would have to fail. That may be better than random killing in theory, but the practical effect would be close to that. There's other complications as well, I'm sure. If this were easy, we probably wouldn't be discussing this problem now. Jeroen
Tom, et al, > > Given that swap space is cheap, and that killing random processes is > > obviously bad, it's not apparent to me why people think this is not > > a good approach --- at least for high-reliability servers. And Linux > > would definitely like to think of itself as a server-grade OS. Regrettably, few of the GUI installers for Linux (SuSE or Red Hat, for example), include adequate swap space in their "suggested" disk formatting. Some versions of some distributions do not create a swap partition at all; others allocate only 130mb to this partition regardless of actual RAM. So regardless of what they *should* be doing, there's thousands of Linux users out there with too little or no swap on disk ... -- Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus wrote: > Tom, et al, > > > > Given that swap space is cheap, and that killing random processes is > > > obviously bad, it's not apparent to me why people think this is not > > > a good approach --- at least for high-reliability servers. And Linux > > > would definitely like to think of itself as a server-grade OS. > > Regrettably, few of the GUI installers for Linux (SuSE or Red Hat, for > example), include adequate swap space in their "suggested" disk formatting. > Some versions of some distributions do not create a swap partition at all; > others allocate only 130mb to this partition regardless of actual RAM. > > So regardless of what they *should* be doing, there's thousands of Linux users > out there with too little or no swap on disk ... Yes, I have seen that on BSD's too. I am unsure if we need actual swap backing store, or just sufficient RAM to allow fork expansion for dirty pages. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Friday 13 June 2003 11:55, Josh Berkus wrote: > Regrettably, few of the GUI installers for Linux (SuSE or Red Hat, for > example), include adequate swap space in their "suggested" disk formatting. > Some versions of some distributions do not create a swap partition at all; > others allocate only 130mb to this partition regardless of actual RAM. Incidentally, Red Hat as of about 7.0 began insisting on swap space at least as large as twice RAM size. In my case on my 512MB RAM notebook, that meant it wanted 1GB swap. If you upgrade your RAM you could get into trouble. In that case, you create a swap file on one of your other partitions that the kernel can use. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
Lamar Owen wrote: > On Friday 13 June 2003 11:55, Josh Berkus wrote: > > Regrettably, few of the GUI installers for Linux (SuSE or Red Hat, for > > example), include adequate swap space in their "suggested" disk formatting. > > Some versions of some distributions do not create a swap partition at all; > > others allocate only 130mb to this partition regardless of actual RAM. > > Incidentally, Red Hat as of about 7.0 began insisting on swap space at least > as large as twice RAM size. In my case on my 512MB RAM notebook, that meant > it wanted 1GB swap. If you upgrade your RAM you could get into trouble. In > that case, you create a swap file on one of your other partitions that the > kernel can use. Oh, that's interesting. I know the newer BSD releases got rid of the large swap requirement, on the understanding that you usually aren't going to be using it anyway. What old BSD releases used to do was to allocate swap space as backing _all_ RAM, even when it wasn't going to need it, while later releases allocated swap only when it was needed, so it was only for cases _exceeding_ RAM, so your virtual memory was now RAM _plus_ swap. Of course, if you exceed swap, your system hangs. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Fri, 13 Jun 2003, Lamar Owen wrote: > On Friday 13 June 2003 11:55, Josh Berkus wrote: > > Regrettably, few of the GUI installers for Linux (SuSE or Red Hat, for > > example), include adequate swap space in their "suggested" disk formatting. > > Some versions of some distributions do not create a swap partition at all; > > others allocate only 130mb to this partition regardless of actual RAM. > > Incidentally, Red Hat as of about 7.0 began insisting on swap space at least > as large as twice RAM size. In my case on my 512MB RAM notebook, that meant > it wanted 1GB swap. If you upgrade your RAM you could get into trouble. In > that case, you create a swap file on one of your other partitions that the > kernel can use. I'm not sure I agree with this. To a large extent, in these days of cheap memory, swap space is there to give you time to notice the excessive use of it and repair the system, since you'd normally be running everything in RAM. Using the old measure of twice physical memory for swap is excessive on a decent system imo. I certainly would not allocate 1GB of swap! Well, okay, I might if I've got a 16GB machine with the potential for an excessive but transitory workload, or say a 4-8GB machine with a few very large memory usage processes that can be started as part of the normal workload. In short, imo these days swap is there to prevent valid processes dying for lack of system memory and not to provide normal workspace for them. Having said all that, I haven't read the start of this thread so I've probably missed the reason for the complaint about lack of swap space, like a problem on a small memory system. -- Nigel J. Andrews
I will say I do use swap sometimes when I am editing a huge image or something --- there are peak times when it is required. --------------------------------------------------------------------------- Nigel J. Andrews wrote: > On Fri, 13 Jun 2003, Lamar Owen wrote: > > > On Friday 13 June 2003 11:55, Josh Berkus wrote: > > > Regrettably, few of the GUI installers for Linux (SuSE or Red Hat, for > > > example), include adequate swap space in their "suggested" disk formatting. > > > Some versions of some distributions do not create a swap partition at all; > > > others allocate only 130mb to this partition regardless of actual RAM. > > > > Incidentally, Red Hat as of about 7.0 began insisting on swap space at least > > as large as twice RAM size. In my case on my 512MB RAM notebook, that meant > > it wanted 1GB swap. If you upgrade your RAM you could get into trouble. In > > that case, you create a swap file on one of your other partitions that the > > kernel can use. > > I'm not sure I agree with this. To a large extent, in these days of cheap > memory, swap space is there to give you time to notice the excessive use of it > and repair the system, since you'd normally be running everything in RAM. > > Using the old measure of twice physical memory for swap is excessive on a > decent system imo. I certainly would not allocate 1GB of swap! Well, okay, I > might if I've got a 16GB machine with the potential for an excessive > but transitory workload, or say a 4-8GB machine with a few very large memory > usage processes that can be started as part of the normal workload. > > In short, imo these days swap is there to prevent valid processes dying for > lack of system memory and not to provide normal workspace for them. > > Having said all that, I haven't read the start of this thread so I've probably > missed the reason for the complaint about lack of swap space, like a problem on > a small memory system. > > -- > Nigel J. Andrews -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Fri, Jun 13, 2003 at 12:32:24PM -0400, Lamar Owen wrote: > > Incidentally, Red Hat as of about 7.0 began insisting on swap space at least > as large as twice RAM size. In my case on my 512MB RAM notebook, that meant > it wanted 1GB swap. If you upgrade your RAM you could get into trouble. In > that case, you create a swap file on one of your other partitions that the > kernel can use. RedHat's position may be influenced by the fact that, AFAIR, they use the Rik van Riel virtual memory system which is inclusive--i.e., you need at least as much swap as you have physical memory before you really have any virtual memory at all. This was fixed by the competing Andrea Arcangeli system, which became standard for the Linux kernel around 2.4.10 or so. Jeroen
On Friday 13 June 2003 12:46, Nigel J. Andrews wrote:
> On Fri, 13 Jun 2003, Lamar Owen wrote:
> > Incidentally, Red Hat as of about 7.0 began insisting on swap space at
> > least as large as twice RAM size. In my case, on my 512MB RAM notebook,
> > that meant it wanted 1GB of swap. If you upgrade your RAM you could get
> > into trouble. In that case, you create a swap file on one of your other
> > partitions that the kernel can use.
>
> I'm not sure I agree with this. To a large extent, in these days of cheap
> memory, swap space is there to give you time to notice the excessive use
> of it and repair the system, since you'd normally be running everything
> in RAM.

It is or was a Linux kernel problem. The 2.2 kernel required double swap space, even though that wasn't well documented. Early 2.4 kernels also required double swap space, and it was better documented. As for current Red Hat 2.4 kernels, I'm not sure which VM system is in use. The old VM certainly DID require swap space of double physical memory.

From a message I wrote in January of 2002:

"On Tuesday 22 January 2002 03:48 pm, Jim Wilcoxson wrote:
> I should have said, we're running this way on 2.2.19, not 2.4 -J
> > Is this Linux requirement documented anywhere? We're running 256MB
> > of swap on 1GB machines and have not had any problems. But we don't
> > swap much either.

2.2 actually needs 2x swap, but the problems are worse with 2.4. 2.2 won't die a horrible screaming death -- but 2.4 WILL DIE if you run out of swap in the wrong way.

As to documentation, I can't tell you how I found out about it, as I'm under NDA from that source. However, it is public information; see
http://lwn.net/2001/0607/kernel.php3
for some pointers. Also see
http://www.geocrawler.com/archives/3/84/2001/5/0/5867356/
http://www.tuxedo.org/~esr/writings/ultimate-linux-box/configuration.html
and
http://www.ultraviolet.org/mail-archives/linux-kernel.2001/28831.html

And note that Red Hat Linux 7.1 and 7.2 will complain vociferously if you create a swap partition smaller than 2x RAM during installation (anaconda). What it doesn't do is complain when you upgrade RAM but don't upgrade your swap."

Now, as to whether this is _still_ a requirement or not, I don't know. Search the lkml (Linux Kernel Mailing List) archives for it. However, understand that the Red Hat kernel is closer to an Alan Cox kernel than to a Linus kernel. At least that was true up to 2.4.18; the Red Hat 2.4.20 is very different, with NPTL and its ilk thrown in.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
On Friday 13 June 2003 15:29, Lamar Owen wrote:
> It is or was a Linux kernel problem. The 2.2 kernel required double swap
> space, even though that wasn't well documented. Early 2.4 kernels also
> required double swap space, and it was better documented. As for current
> Red Hat 2.4 kernels, I'm not sure which VM system is in use. The old VM
> certainly DID require swap space of double physical memory.

After consulting with some kernel gurus: you can upgrade to a straight Alan Cox (-ac) kernel and turn off overcommit, so that the kernel fails the allocation instead of blowing processes away at random when the overcommit bites.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
The trouble with this advice is that if I am an SA wanting to run a DBMS server, I will want to run a kernel supplied by a vendor, not an arbitrary kernel released by a developer, even one as respected as Alan Cox.

andrew

----- Original Message -----
From: "Lamar Owen" <lamar.owen@wgcr.org>
To: "Nigel J. Andrews" <nandrews@investsystems.co.uk>
Cc: "Josh Berkus" <josh@agliodbs.com>; <pgsql-hackers@postgresql.org>
Sent: Saturday, June 14, 2003 11:52 AM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> After consulting with some kernel gurus: you can upgrade to a straight
> Alan Cox (-ac) kernel and turn off overcommit, so that the kernel fails
> the allocation instead of blowing processes away at random when the
> overcommit bites.
http://lwn.net/Articles/4628/ has this possibly useful info:

--------------
So what is strict VM overcommit? We introduce new overcommit policies that attempt to never succeed an allocation that cannot be fulfilled by the backing store, and consequently never OOM. This is achieved through strict accounting of the committed address space and a policy to allow/refuse allocations based on that accounting. In the strictest of modes, it should be impossible to allocate more memory than is available, and impossible to OOM. All memory failures should be pushed down to the allocation routines -- malloc, mmap, etc.
--------------

But see also the discussion from July last year:
http://www.ussg.iu.edu/hypermail/linux/kernel/0207.2/index.html

A quick investigation of 2.4 releases on kernel.org appears to show this still hasn't made it into mainline kernels. Apparently Alan did this work originally because RH had customers using Oracle who were running into OOM ... surprise!

I don't keep copies of old kernel sources around on my Linux machine, so I don't know when it went into the RH kernel series - that at least would be nice to know.

andrew

----- Original Message -----
From: "Andrew Dunstan" <andrew@dunslane.net>
To: <pgsql-hackers@postgresql.org>
Sent: Saturday, June 14, 2003 12:30 PM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> The trouble with this advice is that if I am an SA wanting to run a DBMS
> server, I will want to run a kernel supplied by a vendor, not an arbitrary
> kernel released by a developer, even one as respected as Alan Cox.
> [...]
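To make the quoted description concrete: under strict accounting the failure is supposed to surface as a NULL return from malloc() (or MAP_FAILED from mmap()) that the caller can handle, instead of a later OOM kill when the pages are first touched. Here is a minimal sketch of the difference (my illustration, not code from the thread; the 64MB chunk size is arbitrary):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	const size_t chunk = 64UL * 1024 * 1024;	/* 64MB per request */
	size_t total = 0;
	void *p;

	for (;;)
	{
		p = malloc(chunk);
		if (p == NULL)
		{
			/* Under strict overcommit accounting, failure shows up
			 * here, where it can be handled gracefully.  Under
			 * heuristic overcommit this branch may never be taken:
			 * malloc "succeeds" and the OOM killer strikes later,
			 * when the pages are actually touched. */
			fprintf(stderr, "allocation refused after %lu MB\n",
					(unsigned long) (total / (1024 * 1024)));
			return 1;
		}
		memset(p, 0xaa, chunk);		/* touch the pages for real */
		total += chunk;
	}
}

Run this under a resource limit (e.g. ulimit -v) unless you actually want to exercise the OOM killer on a live box.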
On Sat, 14 Jun 2003, Andrew Dunstan wrote:

> The trouble with this advice is that if I am an SA wanting to run a
> DBMS server, I will want to run a kernel supplied by a vendor, not an
> arbitrary kernel released by a developer, even one as respected as
> Alan Cox.

Like, say, Red Hat:

$ ls -l /proc/sys/vm/overcommit_memory
-rw-r--r--    1 root     root            0 Jun 14 18:58 /proc/sys/vm/overcommit_memory
$ uname -a
Linux stinky.hoopy.net 2.4.20-20.1.1995.2.2.nptl #1 Fri May 23 12:18:31 EDT 2003 i686 i686 i386 GNU/Linux

(This is a Rawhide kernel, but I think that control has been in stock RH kernels for some time now.)

Matthew.
On Sat, Jun 14, 2003 at 08:32:40PM +0100, Matthew Kirkwood wrote:
> Like, say, Red Hat:
>
> $ ls -l /proc/sys/vm/overcommit_memory
> -rw-r--r--    1 root     root            0 Jun 14 18:58 /proc/sys/vm/overcommit_memory
> $ uname -a
> Linux stinky.hoopy.net 2.4.20-20.1.1995.2.2.nptl #1 Fri May 23 12:18:31 EDT 2003 i686 i686 i386 GNU/Linux

I also got that /proc/sys/vm/overcommit_memory on a plain 2.4.21.

Kurt
On Sat, 14 Jun 2003, Kurt Roeckx wrote:

> I also got that /proc/sys/vm/overcommit_memory on a plain 2.4.21.

This might also be interesting:

http://www.cs.helsinki.fi/linux/linux-kernel/2002-33/0826.html

I couldn't say how much of it is in the stock RH kernels, or how successful the heuristic is.

Matthew.
Yes, but it's only a binary flag. Non-zero says "cheerfully overcommit" and 0 says "try not to overcommit", but there isn't a value that says "make sure not to overcommit". Have a look in mm/mmap.c in the plain 2.4.21 sources for evidence: there's nothing like the Alan Cox patch.

IOW, simply the presence of /proc/sys/vm/overcommit_memory with a value set to 0 doesn't guarantee you won't get an OOM kill, AFAICS.

I *know* the latest RH kernel docs *say* they have a paranoid mode that supposedly guarantees against OOM - it was me that pointed that out originally :-). I just checked the latest sources (today it's RH8, kernel 2.4.20-18.8) to be doubly sure, and can't see the patches. (That would be really bad of RH, btw, if I'm correct - saying in your docs that you support something you don't.)

The proof, if any is needed, that the mainline kernel still does not have this is that it is still in Alan's patch set against 2.4.21, at
http://www.kernel.org/pub/linux/kernel/people/alan/linux-2.4/2.4.21/patch-2.4.21-ac1.gz

Summary: don't take shortcuts looking for this - Read the Source, Luke. It's important not to give people false expectations. For now, I'm leaning in Tom's direction of advising people to avoid Linux for mission-critical situations that could run into an OOM.

cheers

andrew

----- Original Message -----
From: "Kurt Roeckx" <Q@ping.be>
To: "Matthew Kirkwood" <matthew@hairy.beasts.org>
Cc: "Andrew Dunstan" <andrew@dunslane.net>; <pgsql-hackers@postgresql.org>
Sent: Saturday, June 14, 2003 3:44 PM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> I also got that /proc/sys/vm/overcommit_memory on a plain 2.4.21.
"Andrew Dunstan" <andrew@dunslane.net> writes: > I *know* the latest RH kernel docs *say* they have paranoid mode that > supposedly guarantees against OOM - it was me that pointed that out > originally :-). I just checked on the latest sources (today it's RH8, kernel > 2.4.20-18.8) to be doubly sure, and can't see the patches. I think you must be looking in the wrong place. Red Hat's kernels have included the mode 2/3 overcommit logic since RHL 7.3, according to what I can find. (Don't forget Alan Cox works for Red Hat ;-).) But it is true that it's not in Linus' tree yet. This may be because there are still some loose ends. The copy of the overcommit document in my RHL 8.0 system lists some ToDo items down at the bottom: To Do ----- o Account ptrace pages (this is hard) o Disable MAP_NORESERVE in mode 2/3 o Account for shared anonymous mappings properly - right now we account them per instance I have not installed RHL 9 yet --- is the ToDo list any shorter there? regards, tom lane
"Andrew Dunstan" <andrew@dunslane.net> writes: > I *know* the latest RH kernel docs *say* they have paranoid mode that > supposedly guarantees against OOM - it was me that pointed that out > originally :-). I just checked on the latest sources (today it's RH8, kernel > 2.4.20-18.8) to be doubly sure, and can't see the patches. (That would be > really bad of RH, btw, if I'm correct - saying in your docs you support > something that you don't) I tried a direct test on my RHL 8.0 box, and was able to prove that indeed the overcommit 2/3 modes do something, though whether they work exactly as documented is another question. I wrote this silly little test program to get an approximate answer about the largest amount a program could malloc: #include <stdio.h> #include <stdlib.h> int main (int argc, char **argv) { size_t min = 1024; /* assume this'd work */ size_t max = -1; /* = max unsigned */ size_t sz; void *ptr; while ((max - min) >= 1024ul) { sz = (((unsigned long long) max) + ((unsigned long long) min)) / 2; ptr = malloc(sz); if (ptr) { free(ptr); // printf("malloc(%lu) succeeded\n", sz); min = sz; } else { // printf("malloc(%lu) failed\n", sz); max = sz; } } printf("Max malloc is %lu Kb\n", min / 1024); return 0; } and got these results: [root@rh1 tmp]# echo 0 > /proc/sys/vm/overcommit_memory [root@rh1 tmp]# ./alloc Max malloc is 1489075 Kb [root@rh1 tmp]# echo 1 > /proc/sys/vm/overcommit_memory [root@rh1 tmp]# ./alloc Max malloc is 2063159 Kb [root@rh1 tmp]# echo 2 > /proc/sys/vm/overcommit_memory [root@rh1 tmp]# ./alloc Max malloc is 1101639 Kb [root@rh1 tmp]# echo 3 > /proc/sys/vm/overcommit_memory [root@rh1 tmp]# ./alloc Max malloc is 974179 Kb So it's definitely doing something. /proc/meminfo shows total: used: free: shared: buffers: cached: Mem: 261042176 160456704 100585472 0 72015872 63344640 Swap: 1077501952 44974080 1032527872 MemTotal: 254924 kB MemFree: 98228 kB MemShared: 0 kB Buffers: 70328 kB Cached: 59244 kB SwapCached: 2616 kB Active: 102532 kB Inact_dirty: 11644 kB Inact_clean: 21840 kB Inact_target: 27200 kB HighTotal: 0 kB HighFree: 0 kB LowTotal: 254924 kB LowFree: 98228 kB SwapTotal: 1052248 kB SwapFree: 1008328 kB Committed_AS: 77164 kB It does appear that the limit in mode 3 is not too far from where you'd expect (SwapTotal - Committed_AS), and mode 2 allows about 128M more, which is correct since there's 256 M of RAM. regards, tom lane
I know he does - *but* I think it has probably been wiped out by accident somewhere along the line (like when they went to 2.4.20?)

Here's what's in RH sources - tell me after you look that I am looking in the wrong place. (Or did RH get cute and decide to do this only for the AS product?)

First, RH 7.3 / kernel 2.4.18-3 (patch present):

----------------
int vm_enough_memory(long pages, int charge)
{
	/* Stupid algorithm to decide if we have enough memory: while
	 * simple, it hopefully works in most obvious cases.. Easy to
	 * fool it, but this should catch most mistakes.
	 *
	 * 23/11/98 NJC: Somewhat less stupid version of algorithm,
	 * which tries to do "TheRightThing".  Instead of using half of
	 * (buffers+cache), use the minimum values.  Allow an extra 2%
	 * of num_physpages for safety margin.
	 *
	 * 2002/02/26 Alan Cox: Added two new modes that do real accounting
	 */
	unsigned long free, allowed;
	struct sysinfo i;

	if (charge)
		atomic_add(pages, &vm_committed_space);

	/* Sometimes we want to use more memory than we have. */
	if (sysctl_overcommit_memory == 1)
		return 1;

	if (sysctl_overcommit_memory == 0) {
		/* The page cache contains buffer pages these days.. */
		free = atomic_read(&page_cache_size);
		free += nr_free_pages();
		free += nr_swap_pages;

		/*
		 * This double-counts: the nrpages are both in the page-cache
		 * and in the swapper space. At the same time, this compensates
		 * for the swap-space over-allocation (ie "nr_swap_pages" being
		 * too small.
		 */
		free += swapper_space.nrpages;

		/*
		 * The code below doesn't account for free space in the inode
		 * and dentry slab cache, slab cache fragmentation, inodes and
		 * dentries which will become freeable under VM load, etc.
		 * Lets just hope all these (complex) factors balance out...
		 */
		free += (dentry_stat.nr_unused * sizeof(struct dentry)) >> PAGE_SHIFT;
		free += (inodes_stat.nr_unused * sizeof(struct inode)) >> PAGE_SHIFT;

		if (free > pages)
			return 1;
		atomic_sub(pages, &vm_committed_space);
		return 0;
	}

	allowed = total_swap_pages;
	if (sysctl_overcommit_memory == 2) {
		/* FIXME - need to add arch hooks to get the bits we need
		   without the higher overhead crap */
		si_meminfo(&i);
		allowed += i.totalram >> 1;
	}

	if (atomic_read(&vm_committed_space) < allowed)
		return 1;
	if (charge)
		atomic_sub(pages, &vm_committed_space);
	return 0;
}
----------------

and here's what's in RH9 / 2.4.20-18 (patch absent):

--------------
int vm_enough_memory(long pages)
{
	/* Stupid algorithm to decide if we have enough memory: while
	 * simple, it hopefully works in most obvious cases.. Easy to
	 * fool it, but this should catch most mistakes.
	 */
	/* 23/11/98 NJC: Somewhat less stupid version of algorithm,
	 * which tries to do "TheRightThing".  Instead of using half of
	 * (buffers+cache), use the minimum values.  Allow an extra 2%
	 * of num_physpages for safety margin.
	 */
	unsigned long free;

	/* Sometimes we want to use more memory than we have. */
	if (sysctl_overcommit_memory)
		return 1;

	/* The page cache contains buffer pages these days.. */
	free = atomic_read(&page_cache_size);
	free += nr_free_pages();
	free += nr_swap_pages;

	/*
	 * This double-counts: the nrpages are both in the page-cache
	 * and in the swapper space. At the same time, this compensates
	 * for the swap-space over-allocation (ie "nr_swap_pages" being
	 * too small.
	 */
	free += swapper_space.nrpages;

	/*
	 * The code below doesn't account for free space in the inode
	 * and dentry slab cache, slab cache fragmentation, inodes and
	 * dentries which will become freeable under VM load, etc.
	 * Lets just hope all these (complex) factors balance out...
	 */
	free += (dentry_stat.nr_unused * sizeof(struct dentry)) >> PAGE_SHIFT;
	free += (inodes_stat.nr_unused * sizeof(struct inode)) >> PAGE_SHIFT;

	return free > pages;
}
--------------

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Andrew Dunstan" <andrew@dunslane.net>
Cc: "Kurt Roeckx" <Q@ping.be>; "Matthew Kirkwood" <matthew@hairy.beasts.org>; <pgsql-hackers@postgresql.org>
Sent: Saturday, June 14, 2003 5:16 PM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> I think you must be looking in the wrong place. Red Hat's kernels have
> included the mode 2/3 overcommit logic since RHL 7.3, according to
> what I can find. (Don't forget Alan Cox works for Red Hat ;-).)
> [...]
On Saturday 14 June 2003 16:38, Andrew Dunstan wrote:
> IOW, simply the presence of /proc/sys/vm/overcommit_memory with a value
> set to 0 doesn't guarantee you won't get an OOM kill, AFAICS.

Right. You need the value to be 2 or 3, which means you need Alan's patch to get that.

> I *know* the latest RH kernel docs *say* they have a paranoid mode that
> supposedly guarantees against OOM - it was me that pointed that out
> originally :-). I just checked the latest sources (today it's RH8,
> kernel 2.4.20-18.8) to be doubly sure, and can't see the patches. (That
> would be really bad of RH, btw, if I'm correct - saying in your docs
> that you support something you don't.)

But note these two lines in the docs with 2.4.20-13.9 (the RHL 9 errata kernel):

* This describes the overcommit management facility in the latest kernel tree
(FIXME: actually it also describes the stuff that isn't yet done)

Pay double attention to the line that says FIXME. IOW, they've documented stuff that might not be done!

You can try Red Hat's enterprise kernel, but you'll have to build it from source; RHEL AS is available online as source RPMs. Also understand that the official Red Hat kernel is very close to an Alan Cox kernel.

Also, if you really want to get down and dirty testing the kernel, a test suite known as Cerberus is available to help with that, with configs tuned specifically to stress-test kernels. I think Cerberus is on SourceForge.

So, make sure you have a kernel that allows overcommit-accounting mode 2 to prevent kills on OOM. Theoretically, mode 2 will prevent the possibility of OOM completely. If I read things right, with double swap space mode 0 will not OOM nearly as quickly.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
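For anyone wanting to act on Lamar's advice programmatically, here is a minimal sketch (my illustration, not from the thread, and not part of PostgreSQL) of a startup check that reads /proc/sys/vm/overcommit_memory and warns when the kernel is not in a strict-accounting mode. It assumes the -ac/Red Hat numbering discussed above, where modes 2 and 3 mean strict accounting:

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
	int mode;

	if (f == NULL)
	{
		/* Not a Linux box, or /proc isn't mounted. */
		fprintf(stderr, "cannot read overcommit setting\n");
		return 1;
	}
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);

	if (mode == 2 || mode == 3)
		printf("overcommit mode %d: allocations are strictly accounted\n",
			   mode);
	else
		printf("overcommit mode %d: the OOM killer can still strike; "
			   "this kernel may lack the strict-accounting patch\n", mode);
	return 0;
}

Note Andrew's caveat above, though: on a stock kernel the file accepts any value, so reading back a 2 does not by itself prove the strict-accounting code is actually present. Read the Source, Luke.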
On 14 Jun 2003 at 16:38, Andrew Dunstan wrote:
> Summary: don't take shortcuts looking for this - Read the Source, Luke.
> It's important not to give people false expectations. For now, I'm
> leaning in Tom's direction of advising people to avoid Linux for
> mission-critical situations that could run into an OOM.

While I agree that vanilla Linux does not handle the situation gracefully enough, anybody running a mission-critical application should spec the machine, and the demands on it, carefully enough. For certain, Linux won't start doing OOM kills just because it is going low on buffer memory. (At least I hope so.) If one expects to throw an uncalculated amount of load at a mission-critical box, until it is hitting swap for every malloc and strcpy, there are things that need to be checked before worrying about which kernel/OS you are running.

And BTW, was that original comment about vanilla Linux, or Linux in general? :-)

Bye
 Shridhar

--
Adore, v.: To venerate expectantly.
		-- Ambrose Bierce, "The Devil's Dictionary"
Alan Cox has written to me thus:

> It got dropped for RH9 and some errata kernels because of clashes between
> the old stuff and the rmap vm and other weird RH patches

andrew

----- Original Message -----
From: "Andrew Dunstan" <andrew@dunslane.net>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Kurt Roeckx" <Q@ping.be>; "Matthew Kirkwood" <matthew@hairy.beasts.org>; <pgsql-hackers@postgresql.org>
Sent: Saturday, June 14, 2003 5:39 PM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> I know he does - *but* I think it has probably been wiped out by accident
> somewhere along the line (like when they went to 2.4.20?)
>
> Here's what's in RH sources - tell me after you look that I am looking in
> the wrong place. (Or did RH get cute and decide to do this only for the
> AS product?)
On Thu, Jun 12, 2003 at 10:10:02PM -0400, Bruce Momjian wrote:
> Tom Lane wrote:
> It is bad to hang the system, but if it reports swap failure, at least
> the admin knows why it failed, rather than killing random processes.

I wonder if it might be better to suspend whatever process is trying to allocate (or write to) too much memory. At least then you have some chance of keeping the system up (obviously you'd need to leave some amount free so you could log in to the box to fix things).

--
Jim C. Nasby (aka Decibel!)  jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
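Jim's suspend-instead-of-kill idea could be prototyped entirely in user space. Below is a hedged sketch (my illustration, nothing proposed on the list): a tiny watchdog that polls /proc/meminfo and sends SIGSTOP to a given process when free swap drops below a threshold, so an admin can log in and sort things out before the OOM killer does. The pid argument and the 64MB threshold are made-up parameters; a real tool would need to pick its victim far more carefully, which is the hard part Jim's parenthetical hints at.

#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical threshold for illustration only. */
#define SWAP_FREE_MIN_KB  (64 * 1024)	/* act below 64MB of free swap */

static long swap_free_kb(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];
	long kb = -1;

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL)
	{
		if (sscanf(line, "SwapFree: %ld kB", &kb) == 1)
			break;
	}
	fclose(f);
	return kb;
}

int main(int argc, char **argv)
{
	pid_t victim;

	if (argc != 2)
	{
		fprintf(stderr, "usage: %s <pid-to-suspend>\n", argv[0]);
		return 1;
	}
	victim = (pid_t) atoi(argv[1]);

	for (;;)
	{
		long kb = swap_free_kb();

		if (kb >= 0 && kb < SWAP_FREE_MIN_KB)
		{
			/* Suspend rather than kill: the admin can resume the
			 * process with SIGCONT once the situation is understood. */
			fprintf(stderr, "SwapFree down to %ld kB; stopping pid %d\n",
					kb, (int) victim);
			kill(victim, SIGSTOP);
			return 0;
		}
		sleep(5);
	}
}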
On Fri, Jun 13, 2003 at 12:41:28PM -0400, Bruce Momjian wrote:
> Of course, if you exceed swap, your system hangs.

Are you sure? I ran out of swap once, or came damn close, due to a cron job gone amuck. My clue was starting to see lots of memory-allocation errors. After I fixed what was blocking all the backed-up cron jobs, the machine ground to a crawl (mmm... system load of 400+ on a dual PII-375), and X did crash (though I think that's because I tried switching to a different virtual console), but the machine stayed up and eventually worked through everything.

--
Jim C. Nasby (aka Decibel!)  jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"