Re: Pre-allocation of shared memory ... - Mailing list pgsql-hackers
From | Andrew Dunstan |
Subject | Re: Pre-allocation of shared memory ... |
Date | |
Msg-id | 001201c332bd$7fee5780$6401a8c0@DUNSLANE |
In response to | Re: Pre-allocation of shared memory ... ("Andrew Dunstan" <andrew@dunslane.net>) |
List | pgsql-hackers |
I know he does - *but* I think it has probably been wiped out by accident
somewhere along the line (like when they went to 2.4.20?). Here's what's in
RH sources - tell me after you look that I am looking in the wrong place.
(Or did RH get cute and decide to do this only for the AS product?)

First, RH7.3/kernel 2.4.18-3 (patch present):

----------------
int vm_enough_memory(long pages, int charge)
{
    /* Stupid algorithm to decide if we have enough memory: while
     * simple, it hopefully works in most obvious cases.. Easy to
     * fool it, but this should catch most mistakes.
     *
     * 23/11/98 NJC: Somewhat less stupid version of algorithm,
     * which tries to do "TheRightThing". Instead of using half of
     * (buffers+cache), use the minimum values. Allow an extra 2%
     * of num_physpages for safety margin.
     *
     * 2002/02/26 Alan Cox: Added two new modes that do real accounting
     */
    unsigned long free, allowed;
    struct sysinfo i;

    if (charge)
        atomic_add(pages, &vm_committed_space);

    /* Sometimes we want to use more memory than we have. */
    if (sysctl_overcommit_memory == 1)
        return 1;

    if (sysctl_overcommit_memory == 0) {
        /* The page cache contains buffer pages these days.. */
        free = atomic_read(&page_cache_size);
        free += nr_free_pages();
        free += nr_swap_pages;

        /*
         * This double-counts: the nrpages are both in the page-cache
         * and in the swapper space. At the same time, this compensates
         * for the swap-space over-allocation (ie "nr_swap_pages" being
         * too small.
         */
        free += swapper_space.nrpages;

        /*
         * The code below doesn't account for free space in the inode
         * and dentry slab cache, slab cache fragmentation, inodes and
         * dentries which will become freeable under VM load, etc.
         * Lets just hope all these (complex) factors balance out...
         */
        free += (dentry_stat.nr_unused * sizeof(struct dentry)) >> PAGE_SHIFT;
        free += (inodes_stat.nr_unused * sizeof(struct inode)) >> PAGE_SHIFT;

        if (free > pages)
            return 1;
        atomic_sub(pages, &vm_committed_space);
        return 0;
    }

    allowed = total_swap_pages;
    if (sysctl_overcommit_memory == 2) {
        /* FIXME - need to add arch hooks to get the bits we need
           without the higher overhead crap */
        si_meminfo(&i);
        allowed += i.totalram >> 1;
    }
    if (atomic_read(&vm_committed_space) < allowed)
        return 1;
    if (charge)
        atomic_sub(pages, &vm_committed_space);
    return 0;
}
---------
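(Aside: if you want to see which of those paths your running kernel takes
without reading the source, here's a trivial userland sketch of my own - not
from either kernel tree - that just reads the sysctl back through /proc. On a
kernel carrying the patch above, 0/1/2 mean heuristic / always-allow / strict
accounting; a stock 2.4 kernel only distinguishes 0 from nonzero.)

----------------
/* My own sketch, not kernel code: report the current overcommit mode. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int   mode;

    if (f == NULL)
    {
        perror("/proc/sys/vm/overcommit_memory");
        return 1;
    }
    if (fscanf(f, "%d", &mode) != 1)
    {
        fclose(f);
        fprintf(stderr, "could not parse overcommit_memory\n");
        return 1;
    }
    fclose(f);
    printf("vm.overcommit_memory = %d\n", mode);
    return 0;
}
----------------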
and here's what's in RH9/2.4.20-18 (patch absent):

--------------
int vm_enough_memory(long pages)
{
    /* Stupid algorithm to decide if we have enough memory: while
     * simple, it hopefully works in most obvious cases.. Easy to
     * fool it, but this should catch most mistakes.
     */
    /* 23/11/98 NJC: Somewhat less stupid version of algorithm,
     * which tries to do "TheRightThing". Instead of using half of
     * (buffers+cache), use the minimum values. Allow an extra 2%
     * of num_physpages for safety margin.
     */
    unsigned long free;

    /* Sometimes we want to use more memory than we have. */
    if (sysctl_overcommit_memory)
        return 1;

    /* The page cache contains buffer pages these days.. */
    free = atomic_read(&page_cache_size);
    free += nr_free_pages();
    free += nr_swap_pages;

    /*
     * This double-counts: the nrpages are both in the page-cache
     * and in the swapper space. At the same time, this compensates
     * for the swap-space over-allocation (ie "nr_swap_pages" being
     * too small.
     */
    free += swapper_space.nrpages;

    /*
     * The code below doesn't account for free space in the inode
     * and dentry slab cache, slab cache fragmentation, inodes and
     * dentries which will become freeable under VM load, etc.
     * Lets just hope all these (complex) factors balance out...
     */
    free += (dentry_stat.nr_unused * sizeof(struct dentry)) >> PAGE_SHIFT;
    free += (inodes_stat.nr_unused * sizeof(struct inode)) >> PAGE_SHIFT;

    return free > pages;
}
-----

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Andrew Dunstan" <andrew@dunslane.net>
Cc: "Kurt Roeckx" <Q@ping.be>; "Matthew Kirkwood" <matthew@hairy.beasts.org>; <pgsql-hackers@postgresql.org>
Sent: Saturday, June 14, 2003 5:16 PM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> "Andrew Dunstan" <andrew@dunslane.net> writes:
> > I *know* the latest RH kernel docs *say* they have paranoid mode that
> > supposedly guarantees against OOM - it was me that pointed that out
> > originally :-). I just checked on the latest sources (today it's RH8, kernel
> > 2.4.20-18.8) to be doubly sure, and can't see the patches.
>
> I think you must be looking in the wrong place.  Red Hat's kernels have
> included the mode 2/3 overcommit logic since RHL 7.3, according to
> what I can find.  (Don't forget Alan Cox works for Red Hat ;-).)
>
> But it is true that it's not in Linus' tree yet.  This may be because
> there are still some loose ends.  The copy of the overcommit document
> in my RHL 8.0 system lists some ToDo items down at the bottom:
>
> To Do
> -----
> o Account ptrace pages (this is hard)
> o Disable MAP_NORESERVE in mode 2/3
> o Account for shared anonymous mappings properly
>   - right now we account them per instance
>
> I have not installed RHL 9 yet --- is the ToDo list any shorter there?
>
> 			regards, tom lane
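PS: a quick way to see the practical difference from userland - my own
illustration, not anything from the kernel or PostgreSQL sources - is to ask
for an anonymous mapping bigger than swap + half of RAM. With the strict
accounting in the patched code (overcommit_memory == 2) the mmap() itself
should come back ENOMEM; in the heuristic/always modes it will normally be
granted and the shortfall only surfaces later, when the pages get touched and
the OOM killer wakes up - which is exactly the failure mode we're worried
about here.

----------------
#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    /* Pick something bigger than swap + RAM/2 on the box being tested;
     * ~1.9GB here just so it still fits in a 32-bit size_t. */
    size_t len = 1900UL * 1024 * 1024;
    void  *p;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        printf("mmap refused up front: %s\n", strerror(errno));
    else
        printf("mmap granted %lu bytes - overcommitted\n",
               (unsigned long) len);
    return 0;
}
----------------

Run it once with the default mode and once after (as root)
"echo 2 > /proc/sys/vm/overcommit_memory" on a patched kernel and you should
see the two behaviours.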