Thread: Support json_errdetail in FRONTEND builds
Hello, Both the incremental JSON [1] and OAuth [2] patchsets would be improved by json_errdetail(), which was removed from FRONTEND builds in b44669b2ca: > The routine providing a detailed error message based on the error code > is made backend-only, the existing code being unsafe to use in the > frontend as the error message may finish by being palloc'd or point to a > static string, so there is no way to know if the memory of the message > should be pfree'd or not. Attached is a patch to undo this, by attaching any necessary allocations to the JsonLexContext so the caller doesn't have to keep track of them. This is based on the first patch of the OAuth patchset, which additionally needs json_errdetail() to be safe to use from libpq itself. Alvaro pointed out offlist that we don't have to go that far to re-enable this function for the utilities, so this patch is a sort of middle ground between what we have now and what OAuth implements. (There is some additional minimization that could be done to this patch, but I'm hoping to keep the code structure consistent between the two, if the result is acceptable.) Two notes that I wanted to point out explicitly: - On the other thread, Daniel contributed a destroyStringInfo() counterpart for makeStringInfo(), which is optional but seemed useful to include here. - After this patch, if a frontend client calls json_errdetail() without calling freeJsonLexContext(), it will leak memory. Thanks, --Jacob [1] https://www.postgresql.org/message-id/682c8fff-355c-a04f-57ac-81055c4ccda8%40dunslane.net [2] https://www.postgresql.org/message-id/CAOYmi%2BkKNZCL7uz-LHyBggM%2BfEcf4285pFWwm7spkUb8irY7mQ%40mail.gmail.com
Attachment
On 2024-03-12 Tu 14:43, Jacob Champion wrote: > Hello, > > Both the incremental JSON [1] and OAuth [2] patchsets would be > improved by json_errdetail(), which was removed from FRONTEND builds > in b44669b2ca: > >> The routine providing a detailed error message based on the error code >> is made backend-only, the existing code being unsafe to use in the >> frontend as the error message may finish by being palloc'd or point to a >> static string, so there is no way to know if the memory of the message >> should be pfree'd or not. > Attached is a patch to undo this, by attaching any necessary > allocations to the JsonLexContext so the caller doesn't have to keep > track of them. > > This is based on the first patch of the OAuth patchset, which > additionally needs json_errdetail() to be safe to use from libpq > itself. Alvaro pointed out offlist that we don't have to go that far > to re-enable this function for the utilities, so this patch is a sort > of middle ground between what we have now and what OAuth implements. > (There is some additional minimization that could be done to this > patch, but I'm hoping to keep the code structure consistent between > the two, if the result is acceptable.) Seems reasonable. > > Two notes that I wanted to point out explicitly: > - On the other thread, Daniel contributed a destroyStringInfo() > counterpart for makeStringInfo(), which is optional but seemed useful > to include here. yeah, although maybe worth a different patch. > - After this patch, if a frontend client calls json_errdetail() > without calling freeJsonLexContext(), it will leak memory. Not too concerned about this: 1. we tend to be a bit more relaxed about that in frontend programs, especially those not expected to run for long times and especially on error paths, where in many cases the program will just exit anyway. 2. the fix is simple where it's needed. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Tue, Mar 12, 2024 at 08:38:43PM -0400, Andrew Dunstan wrote: > On 2024-03-12 Tu 14:43, Jacob Champion wrote: >> Two notes that I wanted to point out explicitly: >> - On the other thread, Daniel contributed a destroyStringInfo() >> counterpart for makeStringInfo(), which is optional but seemed useful >> to include here. > > yeah, although maybe worth a different patch. - { - pfree(lex->strval->data); - pfree(lex->strval); - } + destroyStringInfo(lex->strval); I've wanted that a few times, FWIW. I would do a split, mainly for clarity. >> - After this patch, if a frontend client calls json_errdetail() >> without calling freeJsonLexContext(), it will leak memory. > > Not too concerned about this: > > 1. we tend to be a bit more relaxed about that in frontend programs, > especially those not expected to run for long times and especially on error > paths, where in many cases the program will just exit anyway. > > 2. the fix is simple where it's needed. This does not stress me much, either. I can see that OAuth introduces at least two calls of json_errdetail() in libpq, and that would matter there. case JSON_EXPECTED_STRING: - return psprintf(_("Expected string, but found \"%s\"."), - extract_token(lex)); + appendStringInfo(lex->errormsg, + _("Expected string, but found \"%.*s\"."), + toklen, lex->token_start); Maybe this could be wrapped in a macro to avoid copying around token_start and toklen for all the error cases. -- Michael
Attachment
On Tue, Mar 12, 2024 at 11:38 PM Michael Paquier <michael@paquier.xyz> wrote: > On Tue, Mar 12, 2024 at 08:38:43PM -0400, Andrew Dunstan wrote: > > yeah, although maybe worth a different patch. > > I've wanted that a few times, FWIW. I would do a split, mainly for > clarity. Sounds good, split into v2-0002. (That gives me room to switch other call sites to the new API, too.) Thanks both! > This does not stress me much, either. I can see that OAuth introduces > at least two calls of json_errdetail() in libpq, and that would matter > there. Yep. > case JSON_EXPECTED_STRING: > - return psprintf(_("Expected string, but found \"%s\"."), > - extract_token(lex)); > + appendStringInfo(lex->errormsg, > + _("Expected string, but found \"%.*s\"."), > + toklen, lex->token_start); > > Maybe this could be wrapped in a macro to avoid copying around > token_start and toklen for all the error cases. I gave it a shot in 0001; see what you think. Thanks, --Jacob
Attachment
On Wed, Mar 13, 2024 at 11:20:16AM -0700, Jacob Champion wrote: > Sounds good, split into v2-0002. (That gives me room to switch other > call sites to the new API, too.) Thanks both! That looks OK to me. I can see 7~8 remaining sites where StringInfo data is freed, like in the syslogger, but we should not touch the StringInfo. >> case JSON_EXPECTED_STRING: >> - return psprintf(_("Expected string, but found \"%s\"."), >> - extract_token(lex)); >> + appendStringInfo(lex->errormsg, >> + _("Expected string, but found \"%.*s\"."), >> + toklen, lex->token_start); >> >> Maybe this could be wrapped in a macro to avoid copying around >> token_start and toklen for all the error cases. > > I gave it a shot in 0001; see what you think. That's cleaner, thanks. Hmm, I think that it is cleaner to create the new API first, and then rely on it, reversing the order of both patches (perhaps the extra destroyStringInfo in freeJsonLexContext() should have been moved in 0001). See the attached with few tweaks to 0001, previously 0002 @-@. I'd still need to do a more serious lookup of 0002, previously 0001. -- Michael
Attachment
> On 14 Mar 2024, at 09:06, Michael Paquier <michael@paquier.xyz> wrote: > I think that it is cleaner to create the new API first, and then > rely on it, reversing the order of both patches I agree with this ordering. > (perhaps the extra destroyStringInfo in freeJsonLexContext() should > have been moved in 0001). I wouldn't worry about that, seems fine as is to me. > See the attached with few tweaks to 0001, previously 0002 @-@. > I'd still need to do a more serious lookup of 0002, previously 0001. A few small comments: - * +* Whitespace + /* don't allow destroys of read-only StringInfos */ + Assert(str->maxlen != 0); Considering that StringInfo.c don't own the memory here I think it's warranted to turn this assert into an elog() to avoid the risk of use-after-free bugs. + * The returned allocation is either static or owned by the JsonLexContext and + * should not be freed. The most important part of that comment is at the very end, to help readers I would reword this to just "The returned pointer should not be freed.", or at least put that part first. +#define token_error(lex, format) \ I'm not sure this buys much more than reduced LoC, the expansion isn't unreadable to the point that the format constraint encoded in the macro is worth it IMO. -- Daniel Gustafsson
On Thu, Mar 14, 2024 at 10:56:46AM +0100, Daniel Gustafsson wrote: > + /* don't allow destroys of read-only StringInfos */ > + Assert(str->maxlen != 0); > Considering that StringInfo.c don't own the memory here I think it's warranted > to turn this assert into an elog() to avoid the risk of use-after-free bugs. Hmm. I am not sure how much protection this would offer, TBH. One thing that I find annoying with common/stringinfo.c as it is currently is that we have two exit() calls in the enlarge path, and it does not seem wise to me to spread that even more. My last argument sounds like a nit for HEAD knowing that this does not impact libpq that has its own pqexpbuffer.c to avoid issues with palloc, elog and exit, but that could be a problem if OAuth relies more on these code paths in libpq. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > Hmm. I am not sure how much protection this would offer, TBH. One > thing that I find annoying with common/stringinfo.c as it is currently > is that we have two exit() calls in the enlarge path, and it does not > seem wise to me to spread that even more. > My last argument sounds like a nit for HEAD knowing that this does not > impact libpq that has its own pqexpbuffer.c to avoid issues with > palloc, elog and exit, but that could be a problem if OAuth relies > more on these code paths in libpq. I hope nobody is expecting that such code will get accepted. We have a policy (and an enforcement mechanism) that libpq.so must not call exit(). OAuth code in libpq will need to cope with using pqexpbuffer. regards, tom lane
On 2024-Mar-14, Tom Lane wrote: > Michael Paquier <michael@paquier.xyz> writes: > > Hmm. I am not sure how much protection this would offer, TBH. One > > thing that I find annoying with common/stringinfo.c as it is currently > > is that we have two exit() calls in the enlarge path, and it does not > > seem wise to me to spread that even more. > I hope nobody is expecting that such code will get accepted. We have > a policy (and an enforcement mechanism) that libpq.so must not call > exit(). OAuth code in libpq will need to cope with using pqexpbuffer. FWIW that's exactly what Jacob's OAUTH patch does -- it teaches the relevant JSON parser code to use pqexpbuffer when in frontend environment, and continues to use StringInfo in backend. I find that a bit cumbersome, but if the idea of changing the StringInfo behavior (avoiding exit()) is too radical, then perhaps it's better that we go with Jacob's proposal in the other thread: +/* + * In backend, we will use palloc/pfree along with StringInfo. In frontend, use + * malloc and PQExpBuffer, and return JSON_OUT_OF_MEMORY on out-of-memory. + */ +#ifdef FRONTEND + +#define STRDUP(s) strdup(s) +#define ALLOC(size) malloc(size) + +#define appendStrVal appendPQExpBuffer +#define appendBinaryStrVal appendBinaryPQExpBuffer +#define appendStrValChar appendPQExpBufferChar +#define createStrVal createPQExpBuffer +#define resetStrVal resetPQExpBuffer +#define destroyStrVal destroyPQExpBuffer + +#else /* !FRONTEND */ + +#define STRDUP(s) pstrdup(s) +#define ALLOC(size) palloc(size) + +#define appendStrVal appendStringInfo +#define appendBinaryStrVal appendBinaryStringInfo +#define appendStrValChar appendStringInfoChar +#define createStrVal makeStringInfo +#define resetStrVal resetStringInfo +#define destroyStrVal destroyStringInfo + +#endif -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Ni aún el genio muy grande llegaría muy lejos si tuviera que sacarlo todo de su propio interior" (Goethe)
> On 15 Mar 2024, at 09:38, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Mar-14, Tom Lane wrote: > >> Michael Paquier <michael@paquier.xyz> writes: >>> Hmm. I am not sure how much protection this would offer, TBH. One >>> thing that I find annoying with common/stringinfo.c as it is currently >>> is that we have two exit() calls in the enlarge path, and it does not >>> seem wise to me to spread that even more. > >> I hope nobody is expecting that such code will get accepted. We have >> a policy (and an enforcement mechanism) that libpq.so must not call >> exit(). OAuth code in libpq will need to cope with using pqexpbuffer. > > FWIW that's exactly what Jacob's OAUTH patch does -- it teaches the > relevant JSON parser code to use pqexpbuffer when in frontend > environment, and continues to use StringInfo in backend. I find that a > bit cumbersome, but if the idea of changing the StringInfo behavior > (avoiding exit()) is too radical, then perhaps it's better that we go > with Jacob's proposal in the other thread: Correct, the OAuth work does not make any claims to use StringInfo in libpq. My understanding of this thread was to make it use StringInfo for now to get this available for frontend binaries now, and reduce scope here, and later change it up if/when the OAuth patch lands. -- Daniel Gustafsson
> On 15 Mar 2024, at 01:10, Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Mar 14, 2024 at 10:56:46AM +0100, Daniel Gustafsson wrote: >> + /* don't allow destroys of read-only StringInfos */ >> + Assert(str->maxlen != 0); >> Considering that StringInfo.c don't own the memory here I think it's warranted >> to turn this assert into an elog() to avoid the risk of use-after-free bugs. > > Hmm. I am not sure how much protection this would offer, TBH. I can't see how refusing to free memory owned and controlled by someone else, and throwing an error if attempted, wouldn't be a sound defensive programming measure. -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: > I can't see how refusing to free memory owned and controlled by someone else, > and throwing an error if attempted, wouldn't be a sound defensive programming > measure. I think the argument is about what "refusal" looks like. An Assert seems OK to me, but anything based on elog is likely to be problematic, because it'll involve exit() somewhere. regards, tom lane
On Fri, Mar 15, 2024 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
> I can't see how refusing to free memory owned and controlled by someone else,
> and throwing an error if attempted, wouldn't be a sound defensive programming
> measure.
I think the argument is about what "refusal" looks like.
An Assert seems OK to me, but anything based on elog is
likely to be problematic, because it'll involve exit()
somewhere.
Yeah, I agree an Assert seems safest here.
I'd like to get this done ASAP so I can rebase my incremental parse patchset. Daniel, do you want to commit it? If not, I can.
cheers
andrew
> On 15 Mar 2024, at 21:56, Andrew Dunstan <andrew@dunslane.net> wrote: > On Fri, Mar 15, 2024 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote: > Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>> writes: > > I can't see how refusing to free memory owned and controlled by someone else, > > and throwing an error if attempted, wouldn't be a sound defensive programming > > measure. > > I think the argument is about what "refusal" looks like. > An Assert seems OK to me, but anything based on elog is > likely to be problematic, because it'll involve exit() > somewhere. > > Yeah, I agree an Assert seems safest here. > > I'd like to get this done ASAP so I can rebase my incremental parse patchset. Daniel, do you want to commit it? If not,I can. Sure, I can commit these patches. It won't be until tomorrow though since I prefer to have ample time to monitor the buildfarm, if you are in a bigger rush than that then feel free to go ahead. -- Daniel Gustafsson
On Fri, Mar 15, 2024 at 11:23:07PM +0100, Daniel Gustafsson wrote: > On 15 Mar 2024, at 21:56, Andrew Dunstan <andrew@dunslane.net> wrote: >> On Fri, Mar 15, 2024 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote: >> Yeah, I agree an Assert seems safest here. Cool. >> I'd like to get this done ASAP so I can rebase my incremental parse >> patchset. Daniel, do you want to commit it? If not, I can. > > Sure, I can commit these patches. It won't be until tomorrow though since I > prefer to have ample time to monitor the buildfarm, if you are in a bigger rush > than that then feel free to go ahead. +1. I've not looked much at 0002, but feel free to do so if you think both are good for shipping. -- Michael
Attachment
> On Mar 16, 2024, at 8:53 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > > >> >> On 15 Mar 2024, at 21:56, Andrew Dunstan <andrew@dunslane.net> wrote: >> On Fri, Mar 15, 2024 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote: >> Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>> writes: >>> I can't see how refusing to free memory owned and controlled by someone else, >>> and throwing an error if attempted, wouldn't be a sound defensive programming >>> measure. >> >> I think the argument is about what "refusal" looks like. >> An Assert seems OK to me, but anything based on elog is >> likely to be problematic, because it'll involve exit() >> somewhere. >> >> Yeah, I agree an Assert seems safest here. >> >> I'd like to get this done ASAP so I can rebase my incremental parse patchset. Daniel, do you want to commit it? If not,I can. > > Sure, I can commit these patches. It won't be until tomorrow though since I > prefer to have ample time to monitor the buildfarm, if you are in a bigger rush > than that then feel free to go ahead. > tomorrow will be fine, thanks Cheers Andrew
> On 16 Mar 2024, at 00:59, Andrew Dunstan <andrew@dunslane.net> wrote: >> On Mar 16, 2024, at 8:53 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> On 15 Mar 2024, at 21:56, Andrew Dunstan <andrew@dunslane.net> wrote: >>> I'd like to get this done ASAP so I can rebase my incremental parse patchset. Daniel, do you want to commit it? If not,I can. >> >> Sure, I can commit these patches. It won't be until tomorrow though since I >> prefer to have ample time to monitor the buildfarm, if you are in a bigger rush >> than that then feel free to go ahead. > > tomorrow will be fine, thanks Sorry, I ran into some unforeseen scheduling issues and had less time available than planned. I have pushed the 0001 StringInfo patch to reduce the work for tomorrow when I will work on 0002 unless beaten to it. -- Daniel Gustafsson
> On 17 Mar 2024, at 00:00, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 16 Mar 2024, at 00:59, Andrew Dunstan <andrew@dunslane.net> wrote: >>> On Mar 16, 2024, at 8:53 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >>>> On 15 Mar 2024, at 21:56, Andrew Dunstan <andrew@dunslane.net> wrote: > >>>> I'd like to get this done ASAP so I can rebase my incremental parse patchset. Daniel, do you want to commit it? If not,I can. >>> >>> Sure, I can commit these patches. It won't be until tomorrow though since I >>> prefer to have ample time to monitor the buildfarm, if you are in a bigger rush >>> than that then feel free to go ahead. >> >> tomorrow will be fine, thanks > > Sorry, I ran into some unforeseen scheduling issues and had less time available > than planned. I have pushed the 0001 StringInfo patch to reduce the work for > tomorrow when I will work on 0002 unless beaten to it. I took another look at this tonight and committed it with some mostly cosmetic changes. Since then, tamandua failed the SSL test on this commit but I am unable to reproduce any error both on older OpenSSL and matching the 3.1 version on tamandua, so not sure if its related. Other animals have cleared sslcheck after this, but looking at this highlights just how rare it is for a buildfarm animal to run sslcheck =/ -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: > I took another look at this tonight and committed it with some mostly cosmetic > changes. Since then, tamandua failed the SSL test on this commit but I am > unable to reproduce any error both on older OpenSSL and matching the 3.1 > version on tamandua, so not sure if its related. That failure looks like just a random buildfarm burp: 2024-03-17 23:11:30.989 UTC [106988][postmaster][:0] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc-12.3.0,64-bit 2024-03-17 23:11:30.989 UTC [106988][postmaster][:0] LOG: could not bind IPv4 address "127.0.0.1": Address already in use 2024-03-17 23:11:30.989 UTC [106988][postmaster][:0] HINT: Is another postmaster already running on port 54588? If not,wait a few seconds and retry. 2024-03-17 23:11:30.990 UTC [106988][postmaster][:0] WARNING: could not create listen socket for "127.0.0.1" 2024-03-17 23:11:30.990 UTC [106988][postmaster][:0] FATAL: could not create any TCP/IP sockets 2024-03-17 23:11:30.993 UTC [106988][postmaster][:0] LOG: database system is shut down regards, tom lane
> On 18 Mar 2024, at 01:13, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> I took another look at this tonight and committed it with some mostly cosmetic >> changes. Since then, tamandua failed the SSL test on this commit but I am >> unable to reproduce any error both on older OpenSSL and matching the 3.1 >> version on tamandua, so not sure if its related. > > That failure looks like just a random buildfarm burp: Indeed, and it corrected itself a few hours later. -- Daniel Gustafsson
On Sun, Mar 17, 2024 at 4:49 PM Daniel Gustafsson <daniel@yesql.se> wrote: > I took another look at this tonight and committed it with some mostly cosmetic > changes. Great! Thanks everyone. --Jacob
On Mon, Mar 18, 2024 at 06:17:18AM -0700, Jacob Champion wrote: > Great! Thanks everyone. Cool. Thanks for the commit. -- Michael