Thread: BUG #18845: DEREF_OF_NULL.RET guc_malloc possibly returns NULL
The following bug has been logged on the website: Bug reference: 18845 Logged by: Nikita Email address: pm91.arapov@gmail.com PostgreSQL version: 16.6 Operating system: ubuntu 20.04 Description: guc_malloc possibly returns NULL, if no memory I suggest the following patch fixing this issue diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c --- a/src/backend/commands/user.c (revision a49ac80219c6f28c3cf3973f797de637329952da) +++ b/src/backend/commands/user.c (date 1740386879158) @@ -2553,7 +2553,7 @@ pfree(rawstring); list_free(elemlist); - result = (unsigned *) guc_malloc(LOG, sizeof(unsigned)); + result = (unsigned *) guc_malloc(FATAL, sizeof(unsigned)); *result = options; *extra = result;
> On 14 Mar 2025, at 08:58, PG Bug reporting form <noreply@postgresql.org> wrote: > > The following bug has been logged on the website: > > Bug reference: 18845 > Logged by: Nikita > Email address: pm91.arapov@gmail.com > PostgreSQL version: 16.6 > Operating system: ubuntu 20.04 > Description: > > guc_malloc possibly returns NULL, if no memory > I suggest the following patch fixing this issue > > diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c > --- a/src/backend/commands/user.c (revision > a49ac80219c6f28c3cf3973f797de637329952da) > +++ b/src/backend/commands/user.c (date 1740386879158) > @@ -2553,7 +2553,7 @@ > pfree(rawstring); > list_free(elemlist); > > - result = (unsigned *) guc_malloc(LOG, sizeof(unsigned)); > + result = (unsigned *) guc_malloc(FATAL, sizeof(unsigned)); Why would we want FATAL here? Wouldn't it be better to return false like how other check_ functions already do? --- a/src/backend/commands/user.c +++ b/src/backend/commands/user.c @@ -2566,6 +2566,8 @@ check_createrole_self_grant(char **newval, void **extra, GucSource source) list_free(elemlist); result = (unsigned *) guc_malloc(LOG, sizeof(unsigned)); + if (!result) + return false; *result = options; *extra = result; -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: > Why would we want FATAL here? Wouldn't it be better to return false like how > other check_ functions already do? Indeed. Also, a quick survey shows a lot of inconsistency in guc_malloc callers --- some are lazy and just use ERROR rather than LOG-and-return. That's probably all right for PGC_POSTMASTER variables (since there's no chance of continuing anyway) but perhaps it's worth improving elsewhere. regards, tom lane
> On 14 Mar 2025, at 15:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> Why would we want FATAL here? Wouldn't it be better to return false like how >> other check_ functions already do? > > Indeed. Also, a quick survey shows a lot of inconsistency in > guc_malloc callers --- some are lazy and just use ERROR rather > than LOG-and-return. That's probably all right for PGC_POSTMASTER > variables (since there's no chance of continuing anyway) but > perhaps it's worth improving elsewhere. I'll take a look at all the other callers when preparing a patch for this to see if there are more cases which need updating. -- Daniel Gustafsson
> On 14 Mar 2025, at 15:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> Why would we want FATAL here? Wouldn't it be better to return false like how >> other check_ functions already do? > > Indeed. Also, a quick survey shows a lot of inconsistency in > guc_malloc callers --- some are lazy and just use ERROR rather > than LOG-and-return. That's probably all right for PGC_POSTMASTER > variables (since there's no chance of continuing anyway) but > perhaps it's worth improving elsewhere. Turns out there was one more guc_malloc(LOG.. which didn't inspect the returned allocation in check_synchronized_standby_slots. On top of that there were a few non PGC_POSTMASTER check functions that could return false and let the GUC machinery handle it if we want to be consistent. The fix for check_createrole_self_grant should go down to v16 and the fix for check_synchronized_standby_slots down to 17, the other ones aren't bugs today so that would be a changed behaviour in backbranches. -- Daniel Gustafsson
Attachment
Daniel Gustafsson <daniel@yesql.se> writes: > On 14 Mar 2025, at 15:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Indeed. Also, a quick survey shows a lot of inconsistency in >> guc_malloc callers --- some are lazy and just use ERROR rather >> than LOG-and-return. That's probably all right for PGC_POSTMASTER >> variables (since there's no chance of continuing anyway) but >> perhaps it's worth improving elsewhere. > Turns out there was one more guc_malloc(LOG.. which didn't inspect the > returned allocation in check_synchronized_standby_slots. On top of that there > were a few non PGC_POSTMASTER check functions that could return false and let > the GUC machinery handle it if we want to be consistent. Patch looks good as far as it goes, but I nosed around and found a few other things: * After sleeping on it, I think we ought to change the guc_malloc(ERROR,...) calls in the PGC_POSTMASTER cases too. While this won't have any functional effect, what it does do is remove examples of bad practice that are likely to get copied-and-pasted to somewhere where it matters. The affected functions seem to be check_recovery_target_lsn check_recovery_target_timeline check_recovery_target_xid check_debug_io_direct * check_application_name and check_cluster_name are using WARNING not LOG in their guc_strdup calls. That seems to have been decided by flipping a coin; let's sync them with everyplace else. * init_custom_variable uses ERROR in a guc_malloc and a guc_strdup call. These should probably be changed to FATAL, on the same grounds that the error levels earlier in that function are FATAL: we are partway through initializing a newly-loaded extension, so it's unlikely that trying to continue the session is going to end well. I agree that in the back branches it's sufficient to fix check_createrole_self_grant and check_synchronized_standby_slots. The rest of this is mostly about setting good examples. regards, tom lane
> On 15 Mar 2025, at 18:22, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> On 14 Mar 2025, at 15:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Indeed. Also, a quick survey shows a lot of inconsistency in >>> guc_malloc callers --- some are lazy and just use ERROR rather >>> than LOG-and-return. That's probably all right for PGC_POSTMASTER >>> variables (since there's no chance of continuing anyway) but >>> perhaps it's worth improving elsewhere. > >> Turns out there was one more guc_malloc(LOG.. which didn't inspect the >> returned allocation in check_synchronized_standby_slots. On top of that there >> were a few non PGC_POSTMASTER check functions that could return false and let >> the GUC machinery handle it if we want to be consistent. > > Patch looks good as far as it goes, but I nosed around and found > a few other things: > > * After sleeping on it, I think we ought to change the > guc_malloc(ERROR,...) calls in the PGC_POSTMASTER cases too. > While this won't have any functional effect, what it does do > is remove examples of bad practice that are likely to get > copied-and-pasted to somewhere where it matters. The affected > functions seem to be > check_recovery_target_lsn > check_recovery_target_timeline > check_recovery_target_xid > check_debug_io_direct > > * check_application_name and check_cluster_name are using WARNING > not LOG in their guc_strdup calls. That seems to have been > decided by flipping a coin; let's sync them with everyplace else. > > * init_custom_variable uses ERROR in a guc_malloc and a guc_strdup > call. These should probably be changed to FATAL, on the same grounds > that the error levels earlier in that function are FATAL: we are > partway through initializing a newly-loaded extension, so it's > unlikely that trying to continue the session is going to end well. Those are all good points, I initially didn't think we should touch the PGC_POSTMASTER cases but you are correct that avoiding back copy pastes to happen is a Good Thing. The attached has all these fixes added. -- Daniel Gustafsson
Attachment
Daniel Gustafsson <daniel@yesql.se> writes: > Those are all good points, I initially didn't think we should touch the > PGC_POSTMASTER cases but you are correct that avoiding back copy pastes to > happen is a Good Thing. The attached has all these fixes added. I think your fix in check_debug_io_direct is wrong: - *extra = guc_malloc(ERROR, sizeof(int)); + *extra = guc_malloc(LOG, sizeof(int)); + if (!*extra) + { + pfree(rawstring); + list_free(elemlist); + return false; + } It looks to me like rawstring and elemlist were already freed, so "return false" ought to be sufficient. Also, in init_custom_variable maybe it'd be worth a comment, along the lines of - gen = (struct config_generic *) guc_malloc(ERROR, sz); + /* As above, OOM is fatal */ + gen = (struct config_generic *) guc_malloc(FATAL, sz); Otherwise LGTM. regards, tom lane
> On 16 Mar 2025, at 21:49, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> Those are all good points, I initially didn't think we should touch the >> PGC_POSTMASTER cases but you are correct that avoiding back copy pastes to >> happen is a Good Thing. The attached has all these fixes added. > > I think your fix in check_debug_io_direct is wrong: > > - *extra = guc_malloc(ERROR, sizeof(int)); > + *extra = guc_malloc(LOG, sizeof(int)); > + if (!*extra) > + { > + pfree(rawstring); > + list_free(elemlist); > + return false; > + } > > It looks to me like rawstring and elemlist were already freed, > so "return false" ought to be sufficient. AFAICT it depends on the value of PG_O_DIRECT, but leaving them is the safe option as they will be cleaned anyways. > Also, in init_custom_variable maybe it'd be worth a comment, > along the lines of > > - gen = (struct config_generic *) guc_malloc(ERROR, sz); > + /* As above, OOM is fatal */ > + gen = (struct config_generic *) guc_malloc(FATAL, sz); I was contemplating that but skipped it due to the really good comments in the same function, but you are right that we might as well direct the reader. -- Daniel Gustafsson
Attachment
Took another look at this and pushed the v3 version to master with backpatches to 16 and 17 for the access-after-alloc-failure fixes. -- Daniel Gustafsson