Re: JIT causes core dump during error recovery - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: JIT causes core dump during error recovery |
Date | |
Msg-id | 1749264.1719505111@sss.pgh.pa.us Whole thread Raw |
In response to | Re: JIT causes core dump during error recovery (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: JIT causes core dump during error recovery
|
List | pgsql-hackers |
I wrote: > So delaying removal of the jit-created code segment until transaction > cleanup wouldn't be enough to prevent this crash, if I'm reading > things right. The extra-pstrdup solution may be the only viable one. > I could use confirmation from someone who knows the JIT code about > when jit-created code is unloaded. It also remains very unclear > why there is no crash if we don't force both jit_optimize_above_cost > and jit_inline_above_cost to small values. I found where the unload happens: ResOwnerReleaseJitContext, which is executed during the resource owner BEFORE_LOCKS phase. (Which seems like a pretty dubious choice from here; why can't we leave it till the less time-critical phase after we've let go of locks?) But anyway, we definitely do drop this stuff during xact cleanup. Also, it seems that the reason that both jit_optimize_above_cost and jit_inline_above_cost must be small is that otherwise int4div is simply called from the JIT-generated code, not inlined into it. This gives me very considerable fear about how well that behavior has been tested: if throwing an error from inlined code doesn't work, and we hadn't noticed that, how much can it really have been exercised? I have also got an itchy feeling that we have code that will be broken by this behavior of "if it happens to get inlined then string constants aren't so constant". In any case, I found that adding some copying logic to CopyErrorData() is enough to solve this problem, since the SPI infrastructure applies that before executing xact cleanup. I had feared that we'd have to add copying to every single elog/ereport sequence, which would have been an annoying amount of overhead; but the attached seems acceptable. We do get through check-world with this patch and the JIT parameters all set to small values. regards, tom lane diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index d91a85cb2d..d1d1632bdd 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -1743,7 +1743,21 @@ CopyErrorData(void) newedata = (ErrorData *) palloc(sizeof(ErrorData)); memcpy(newedata, edata, sizeof(ErrorData)); - /* Make copies of separately-allocated fields */ + /* + * Make copies of separately-allocated strings. Note that we copy even + * theoretically-constant strings such as filename. This is because those + * could point into JIT-created code segments that might get unloaded at + * transaction cleanup. In some cases we need the copied ErrorData to + * survive transaction boundaries, so we'd better copy those strings too. + */ + if (newedata->filename) + newedata->filename = pstrdup(newedata->filename); + if (newedata->funcname) + newedata->funcname = pstrdup(newedata->funcname); + if (newedata->domain) + newedata->domain = pstrdup(newedata->domain); + if (newedata->context_domain) + newedata->context_domain = pstrdup(newedata->context_domain); if (newedata->message) newedata->message = pstrdup(newedata->message); if (newedata->detail) @@ -1756,6 +1770,8 @@ CopyErrorData(void) newedata->context = pstrdup(newedata->context); if (newedata->backtrace) newedata->backtrace = pstrdup(newedata->backtrace); + if (newedata->message_id) + newedata->message_id = pstrdup(newedata->message_id); if (newedata->schema_name) newedata->schema_name = pstrdup(newedata->schema_name); if (newedata->table_name)
pgsql-hackers by date: