Thread: unchecked out of memory in postmaster.c
Hi, Some time ago I noticed that in postmaster.c there's a corner case which probably causes postmaster to exit in out-of-memory condition. See BackendStartup, near the bottom, there's a call to DLNewElem(). The problem is that this function calls palloc() and thus can elog(ERROR) on OOM, but postmaster has no way to defend itself from this and would die. I haven't ever seen postmaster die from this, but I don't think it's a good idea to let it be like this, given the strict promises we make about its reliability. Probably a simple PG_TRY block around the DLNewElem call suffices ...? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Some time ago I noticed that in postmaster.c there's a corner case which > probably causes postmaster to exit in out-of-memory condition. See > BackendStartup, near the bottom, there's a call to DLNewElem(). The > problem is that this function calls palloc() and thus can elog(ERROR) on > OOM, but postmaster has no way to defend itself from this and would die. So? There are probably hundreds of palloc calls that are reachable from the postmaster main loop. If this were allocating more than a few bytes of memory, it might be worth worrying about. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Some time ago I noticed that in postmaster.c there's a corner case which > > probably causes postmaster to exit in out-of-memory condition. See > > BackendStartup, near the bottom, there's a call to DLNewElem(). The > > problem is that this function calls palloc() and thus can elog(ERROR) on > > OOM, but postmaster has no way to defend itself from this and would die. > > So? There are probably hundreds of palloc calls that are reachable from > the postmaster main loop. If this were allocating more than a few bytes > of memory, it might be worth worrying about. Hundreds? I think you'd be hard pressed to find as much as a dozen :-) I mean stuff that's called inside ServerLoop, of course. There are a few places calling alloc-type functions, but as far as I see they use calloc or malloc, and behave "sanely" (i.e. not elog(ERROR)) in OOM. Note that BackendStartup itself is very careful about allocating a Backend struct, even when it's just ~10 bytes on 32 bits machines. I think a patch to solve this is as simple as the attached. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > I think a patch to solve this is as simple as the attached. I guess I need to point out that those ereport calls already pose a far more substantial risk of palloc failure than the DLNewElem represents. You seem to have forgotten about releasing the DLElem if the fork fails, too. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > I think a patch to solve this is as simple as the attached. > > I guess I need to point out that those ereport calls already pose a far > more substantial risk of palloc failure than the DLNewElem represents. Hmm, do they? I mean, don't they use ErrorContext, which is supposed to be preallocated? > You seem to have forgotten about releasing the DLElem if the fork fails, > too. Doh, sorry about that. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> I guess I need to point out that those ereport calls already pose a far >> more substantial risk of palloc failure than the DLNewElem represents. > Hmm, do they? I mean, don't they use ErrorContext, which is supposed to > be preallocated? Well, we'd like to think that they pose an insignificant risk, but it's hard to credit that DLNewElem isn't insignificant as well. If you're really intent on doing something about this, my inclination would be to get rid of the dependence on DLNewElem altogether. Add a Dlelem field to the Backend struct and use DLInitElem (compare the way catcache uses that module). regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane wrote: > >> I guess I need to point out that those ereport calls already pose a far > >> more substantial risk of palloc failure than the DLNewElem represents. > > > Hmm, do they? I mean, don't they use ErrorContext, which is supposed to > > be preallocated? > > Well, we'd like to think that they pose an insignificant risk, but it's > hard to credit that DLNewElem isn't insignificant as well. Yeah, actually as I said earlier, I haven't ever seen this. > If you're really intent on doing something about this, my inclination > would be to get rid of the dependence on DLNewElem altogether. Add > a Dlelem field to the Backend struct and use DLInitElem (compare > the way catcache uses that module). Hmm, yeah, I had seen that code. So it looks like this instead. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> If you're really intent on doing something about this, my inclination >> would be to get rid of the dependence on DLNewElem altogether. Add >> a Dlelem field to the Backend struct and use DLInitElem (compare >> the way catcache uses that module). > Hmm, yeah, I had seen that code. So it looks like this instead. Looks a lot nicer to me ... regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> If you're really intent on doing something about this, my inclination >> would be to get rid of the dependence on DLNewElem altogether. Add >> a Dlelem field to the Backend struct and use DLInitElem (compare >> the way catcache uses that module). > Hmm, yeah, I had seen that code. So it looks like this instead. Huh, didn't you commit this yet? regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane wrote: > >> If you're really intent on doing something about this, my inclination > >> would be to get rid of the dependence on DLNewElem altogether. Add > >> a Dlelem field to the Backend struct and use DLInitElem (compare > >> the way catcache uses that module). > > > Hmm, yeah, I had seen that code. So it looks like this instead. > > Huh, didn't you commit this yet? Sorry, I just did. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support