On Thu, Jun 12, 2003 at 08:08:28PM -0400, Bruce Momjian wrote:
> >
> > I'm unconvinced, because I've only ever heard of the problem affecting
> > Postgres on Linux.
>
> What I don't understand is why they don't just start failing on
> fork/malloc rather than killing things.

I may be way off the mark here, falling into the middle of this as I am,
but it may be because the kernel overcommits the memory (which is sort of
logical in a way given the way fork() works). That may mean that malloc()
thinks it got more memory and returns a pointer, but the kernel hasn't
actually backed that address space with real memory yet and waits to see
if it's ever going to be needed.
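To make that concrete, here is a minimal sketch in C, assuming a Linux-style kernel with overcommit enabled. The helper names reserve_untouched() and touch_pages() are made up for illustration, and the 64 MB size is arbitrary. The point is only that the allocation can succeed up front, while the kernel does not have to find real pages until each one is written to.

```c
/* Sketch only: a large malloc() can succeed before any of the memory
 * is backed by physical pages. Helper names are hypothetical. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Ask for `bytes` of memory without touching it. On an overcommitting
 * kernel this typically succeeds even though no pages are backed yet. */
char *reserve_untouched(size_t bytes)
{
    return malloc(bytes);
}

/* Write one byte per page so the kernel has to back each page for real.
 * Returns the number of pages touched. */
size_t touch_pages(char *p, size_t bytes, size_t page_size)
{
    volatile char *vp = p;   /* volatile keeps the writes from being elided */
    size_t touched = 0;

    for (size_t off = 0; off < bytes; off += page_size) {
        vp[off] = 1;
        touched++;
    }
    return touched;
}

/* Build with -DOVERCOMMIT_DEMO_MAIN to get a standalone program. */
#ifdef OVERCOMMIT_DEMO_MAIN
int main(void)
{
    size_t bytes = (size_t)64 << 20;      /* 64 MB, an arbitrary size */
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                      /* assumed fallback page size */

    char *p = reserve_untouched(bytes);
    if (p == NULL) {
        perror("malloc");
        return 1;
    }
    printf("malloc returned %p before any page was touched\n", (void *)p);
    printf("touched %zu pages\n", touch_pages(p, bytes, (size_t)page));
    free(p);
    return 0;
}
#endif
```

If memory really is short, the failure would show up somewhere inside touch_pages(), long after malloc() had its chance to return NULL, which is why there is no clean error path left at that point.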
Given the right allocation proportions, this may mean that in the end the
kernel has no way to handle a shortage gracefully by causing fork() or
allocations to fail. I would assume it then goes through its alternatives
like scaling back its file cache--which it'd probably start to do before
a lot of swapping was needed, so not much to scrape out of that barrel.

After that, where do you go? Try to find a reasonable process to shoot
in the head. From what I heard, although I haven't kept current, a lot
of work went into selecting a "reasonable" process, so there will be some
determinism. And if you have occasion to find out in the first place,
"some determinism" usually means "suspiciously bad luck."

Jeroen