Re: Threads - Mailing list pgsql-hackers
| From | Tom Lane |
|---|---|
| Subject | Re: Threads |
| Date | |
| Msg-id | 19749.1041966113@sss.pgh.pa.us |
| In response to | Re: Threads (Greg Stark <gsstark@mit.edu>) |
| List | pgsql-hackers |
Greg Stark <gsstark@mit.edu> writes:
> You missed the point of his post. If one process in your database does
> something nasty you damn well should worry about the state of and validity of
> the entire database, not just that one backend.

Right. And in fact we do blow away all the processes when any one of them crashes or panics. Nonetheless, memory isolation between processes is a Good Thing, because it reduces the chances that a process gone wrong will cause damage via other processes before they can be shut down.

Here is a simple example of a scenario where that isolation buys us something: suppose that we have a bug that tromps on memory starting at some point X until it falls off the sbrk boundary and dumps core. (There are plenty of ways to make that happen, such as miscalculating the length of a memcpy or memset operation as -1.) Such a bug causes no serious damage in isolation, because the process suffering the failure will be in a tight data-copying or data-zeroing loop until it gets the SIGSEGV exception. It won't do anything bad based on all the data structures it has clobbered during its march to the end of memory.

However, put that same bug in a multithreading context, and it becomes entirely possible that some other thread will be dispatched and will try to make use of already-clobbered data structures before the ultimate SIGSEGV exception happens. Now you have the potential for unlimited trouble. In general, isolation buys you some safety anytime there is a delay between the occurrence of a failure and its detection.

> Processes by default have complete memory isolation. However postgres
> actually weakens that by doing a lot of work in a shared memory
> pool. That memory gets exactly the same protection as it would get in
> a threaded model, which is to say none.

Yes. We try to minimize the risk by keeping the shared memory pool relatively small and not doing more than we have to in it.
(For example, this was one of the arguments against creating a shared plan cache.) It's also very helpful that on most platforms, shared memory is not address-wise contiguous with normal memory; thus, for example, a process caught in a memset death march will hit a SIGSEGV before it gets to the shared memory block.

It's interesting to note that this can be made into an argument for not making shared_buffers very large: the larger the fraction of your address space that the shared buffers occupy, the larger the chance that a wild store will overwrite something you'd wish it didn't. I can't recall anyone having made that point during our many discussions of appropriate shared_buffers sizing.

> So the reality is that if you have a bug most likely you've only corrupted the
> local data which can be easily cleaned up either way. In the thread model
> there's also the unlikely but scary risk that you've damaged other threads'
> memory. And in either case there's the possibility that you've damaged the
> shared pool which is unrecoverable.

In a thread model, *most* of the accessible memory space would be shared with other threads, at least potentially. So I think you're wrong to categorize the second case as unlikely.

			regards, tom lane