Thread: [HACKERS] Odd behavior with PG_TRY
In the attached patch (snippet below), I'm seeing something strange with args->in.r.atts[]. Prior to entering the PG_TRY block, I can inspect things in lldb just fine: (lldb) call args->in.r.atts[0].func (PLyDatumToObFunc) $49 = 0x000000010fc4dc70 (plpython3.so`PLyString_FromDatum at plpy_typeio.c:621) (lldb) call &args->in.r.atts[0] (PLyDatumToOb *) $50 = 0x00007fd2b302f6d0 (lldb) call args->in.r.atts[0] (PLyDatumToOb) $51 = { func = 0x000000010fc4dc70 (plpython3.so`PLyString_FromDatum at plpy_typeio.c:621) typfunc = { fn_addr = 0x000000010f478b50 (postgres`textout at varlena.c:521) fn_oid = 47 ... But I'm getting a EXC_BAD_ACCESS (code=1, address=0xb302f6d0) on the last if in the snippet below. Looking at the variables again, I see: (lldb) call args->in.r.atts[i].func error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory (lldb) call args->in.r.atts[i] error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory I saw the comment on PG_TRY about marking things as volatile, but my understanding from the comment is I shouldn't even need to do that, since these variables aren't being modified. > static bool > PLy_CSreceive(TupleTableSlot *slot, DestReceiver *self) > { > volatile TupleDesc desc = slot->tts_tupleDescriptor; > volatile CallbackState *myState = (CallbackState *) self; > volatile PLyTypeInfo *args = myState->args; > > PLyExecutionContext *old_exec_ctx = PLy_switch_execution_context(myState->exec_ctx); > MemoryContext scratch_context = PLy_get_scratch_context(myState->exec_ctx); > MemoryContext oldcontext = CurrentMemoryContext; > > /* Verify saved state matches incoming slot */ > Assert(myState->desc == desc); > Assert(args->in.r.natts == desc->natts); > > /* Make sure the tuple is fully deconstructed */ > slot_getallattrs(slot); > > MemoryContextSwitchTo(scratch_context); > > PG_TRY(); > { > int i, rv; > > /* > * Do the work in the scratch context to avoid leaking memory from the > * datatype output function calls. > */ > for (i = 0; i < desc->natts; i++) > { > PyObject *value = NULL; > > if (desc->attrs[i]->attisdropped) > continue; > > if (myState->lists[i] == NULL) > ereport(ERROR, > (errmsg("missing list for attribute %d", i))); > /* XXX If the function can't be null, ditch that check */ > if (slot->tts_isnull[i] || args->in.r.atts[i].func == NULL) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Mon, Jan 2, 2017 at 11:14 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > In the attached patch (snippet below), I'm seeing something strange with > args->in.r.atts[]. Prior to entering the PG_TRY block, I can inspect things > in lldb just fine: > > (lldb) call args->in.r.atts[0].func > (PLyDatumToObFunc) $49 = 0x000000010fc4dc70 > (plpython3.so`PLyString_FromDatum at plpy_typeio.c:621) > (lldb) call &args->in.r.atts[0] > (PLyDatumToOb *) $50 = 0x00007fd2b302f6d0 > (lldb) call args->in.r.atts[0] > (PLyDatumToOb) $51 = { > func = 0x000000010fc4dc70 (plpython3.so`PLyString_FromDatum at > plpy_typeio.c:621) > typfunc = { > fn_addr = 0x000000010f478b50 (postgres`textout at varlena.c:521) > fn_oid = 47 > ... > > But I'm getting a EXC_BAD_ACCESS (code=1, address=0xb302f6d0) on the last if > in the snippet below. Looking at the variables again, I see: > > (lldb) call args->in.r.atts[i].func > error: Couldn't apply expression side effects : Couldn't dematerialize a > result variable: couldn't read its memory > (lldb) call args->in.r.atts[i] > error: Couldn't apply expression side effects : Couldn't dematerialize a > result variable: couldn't read its memory > Looks strange, what is the value of 'i'? Did you get the same result if you try to print args->in.r.atts[0] inside PG_TRY? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On 1/2/17 1:31 AM, Amit Kapila wrote: > On Mon, Jan 2, 2017 at 11:14 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: >> In the attached patch (snippet below), I'm seeing something strange with >> args->in.r.atts[]. Prior to entering the PG_TRY block, I can inspect things >> in lldb just fine: >> >> (lldb) call args->in.r.atts[0].func >> (PLyDatumToObFunc) $49 = 0x000000010fc4dc70 >> (plpython3.so`PLyString_FromDatum at plpy_typeio.c:621) >> (lldb) call &args->in.r.atts[0] >> (PLyDatumToOb *) $50 = 0x00007fd2b302f6d0 >> (lldb) call args->in.r.atts[0] >> (PLyDatumToOb) $51 = { >> func = 0x000000010fc4dc70 (plpython3.so`PLyString_FromDatum at >> plpy_typeio.c:621) >> typfunc = { >> fn_addr = 0x000000010f478b50 (postgres`textout at varlena.c:521) >> fn_oid = 47 >> ... >> >> But I'm getting a EXC_BAD_ACCESS (code=1, address=0xb302f6d0) on the last if >> in the snippet below. Looking at the variables again, I see: >> >> (lldb) call args->in.r.atts[i].func >> error: Couldn't apply expression side effects : Couldn't dematerialize a >> result variable: couldn't read its memory >> (lldb) call args->in.r.atts[i] >> error: Couldn't apply expression side effects : Couldn't dematerialize a >> result variable: couldn't read its memory >> > > Looks strange, what is the value of 'i'? Did you get the same result > if you try to print args->in.r.atts[0] inside PG_TRY? i is 0, makes no difference: (lldb) call i (int) $56 = 0 (lldb) call args->in.r.atts[0] error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory (lldb) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On Mon, Jan 2, 2017 at 10:43 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 1/2/17 1:31 AM, Amit Kapila wrote: >> >> On Mon, Jan 2, 2017 at 11:14 AM, Jim Nasby <Jim.Nasby@bluetreble.com> >> wrote: >> >> Looks strange, what is the value of 'i'? Did you get the same result >> if you try to print args->in.r.atts[0] inside PG_TRY? > > > i is 0, makes no difference: > > (lldb) call i > (int) $56 = 0 > (lldb) call args->in.r.atts[0] > error: Couldn't apply expression side effects : Couldn't dematerialize a > result variable: couldn't read its memory > (lldb) > To localize the problem you might want to try by just having the problematic statement in PG_TRY(); PG_TRY(); { if (args->in.r.atts[0].func == NULL) { } } PG_CATCH(); { PG_RE_THROW(); } PG_END_TRY(); -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > In the attached patch (snippet below), I'm seeing something strange with > args->in.r.atts[]. Did you try comparing the apparent values of "args" before and after entering PG_TRY? > I saw the comment on PG_TRY about marking things as volatile, but my > understanding from the comment is I shouldn't even need to do that, > since these variables aren't being modified. Not sure, but if you do need to mark them volatile to prevent misoptimization in the PG_TRY, this is not how to do it: > volatile TupleDesc desc = slot->tts_tupleDescriptor; > volatile CallbackState *myState = (CallbackState *) self; > volatile PLyTypeInfo *args = myState->args; Correct coding would be volatile TupleDesc desc = slot->tts_tupleDescriptor; CallbackState * volatile myState = (CallbackState *) self; PLyTypeInfo * volatile args = myState->args; because what needs to be marked volatile is the pointer variable, not what it points at. I'm a bit surprised you're not getting "cast away volatile" warnings from the code as you have it. regards, tom lane
On 1/2/17 9:47 PM, Tom Lane wrote: > Jim Nasby <Jim.Nasby@BlueTreble.com> writes: >> In the attached patch (snippet below), I'm seeing something strange with >> args->in.r.atts[]. > > Did you try comparing the apparent values of "args" before and after > entering PG_TRY? Yeah, see below. FWIW, when I did that just now I stepped through at the instruction level, and things went belly up when I stepped into the first instruction of the for (first instruction after PG_TRY was done). >> I saw the comment on PG_TRY about marking things as volatile, but my >> understanding from the comment is I shouldn't even need to do that, >> since these variables aren't being modified. > > Not sure, but if you do need to mark them volatile to prevent > misoptimization in the PG_TRY, this is not how to do it: > >> volatile TupleDesc desc = slot->tts_tupleDescriptor; >> volatile CallbackState *myState = (CallbackState *) self; >> volatile PLyTypeInfo *args = myState->args; > > Correct coding would be > > volatile TupleDesc desc = slot->tts_tupleDescriptor; > CallbackState * volatile myState = (CallbackState *) self; > PLyTypeInfo * volatile args = myState->args; > > because what needs to be marked volatile is the pointer variable, > not what it points at. I'm a bit surprised you're not getting > "cast away volatile" warnings from the code as you have it. Unfortunately, that didn't make a difference. Amit's suggestion of isolating the single statement in a PG_TRY() didn't work either, but assigning args->in.r.atts[i] to a pointer did. The volatile didn't make a difference in this case, which if the PG_TRY() comments are to be believed is what I'd expect. Though, it was also OK with value not being volatile, which AFAIK isn't safe, so... for (i = 0; i < desc->natts; i++){ PyObject *value = NULL; PLyDatumToOb * volatile atts = &args->in.r.atts[i]; ... if (slot->tts_isnull[i] || atts->func == NULL) { Py_INCREF(Py_None); value = Py_None; } else { PG_TRY(); { value = (atts->func) (atts, slot->tts_values[i]); } PG_CATCH(); { Py_XDECREF(value); MemoryContextSwitchTo(oldcontext); PLy_switch_execution_context(old_exec_ctx); PG_RE_THROW(); } PG_END_TRY(); } lldb output from inspecting variables: (lldb) call args (volatile PLyTypeInfo *) $60 = 0x00007fff5912fbe8 (lldb) call @args error: expected expression (lldb) call *args (PLyTypeInfo) $61 = { in = { d = { func = 0x00007fca218f8ed0 typfunc = { fn_addr = 0x00007fff00000001 fn_oid = 114610686 fn_nargs = 1 fn_strict = '\0' fn_retset = '\0' fn_stats= '0' fn_extra = 0x00007fca2206d060 fn_mcxt = 0x00007fca22800960 fn_expr = 0x00007fff5912fbe8 } typtransform = { fn_addr = 0x00007fca218f8e38 fn_oid = 570849696 fn_nargs= 32714 fn_strict = '\0' fn_retset = '\0' fn_stats = '8' fn_extra = 0x00007fca22066f60 fn_mcxt = 0x00007fff5912fc90 fn_expr = 0x0000000106d7df6f } typoid = 570830824 typmod = 32714 typioparam = 570830864 typbyval = '\0' typlen = 0 typalign = 'h' elm= 0x0000000000000000 } r = { atts = 0x00007fca218f8ed0 natts = 1 } } out = { d = { func =0x00007fca22066f60 typfunc = { fn_addr = 0x00007fca00000000 fn_oid = 570846544 fn_nargs = 32714 fn_strict = '\0' fn_retset = '\0' fn_stats = '`' fn_extra = 0x00007fff5912fce0 fn_mcxt= 0x0000000106d4d437 fn_expr = 0x00007fca2206ce30 } typtransform = { fn_addr = 0x00007fca22062f90 fn_oid = 0 fn_nargs = 0 fn_strict = '\0' fn_retset = '\0' fn_stats ='\0' fn_extra = 0xffffffdc22066c20 fn_mcxt = 0x0000000000000000 fn_expr = 0x00007fca22066f60 } typoid = 3756919206 typmod = -754934605 typioparam = 1494416704 typbyval = '\xff' typlen = 0 typalign = '^' elm = 0x00007fff5912fd10 } r = { atts = 0x00007fca22066f60 natts = 0 } } is_rowtype= 0 typ_relid = 0 typrel_xmin = 570847072 typrel_tid = { ip_blkid = (bi_hi = 32714, bi_lo = 0) ip_posid= 36408 } mcxt = 0x000000012206c6f0 } (lldb) gu (lldb) call args (volatile PLyTypeInfo *) $62 = 0x00007fff5912fbe8 (lldb) call *args (PLyTypeInfo) $63 = { in = { d = { func = 0x00000000218f8ed0 typfunc = { fn_addr = 0x00007fff5912ff08 fn_oid = 1494417744 fn_nargs = 32767 fn_strict = '\0' fn_retset = '\0' fn_stats = '0' fn_extra = 0x00007fca2206d060 fn_mcxt = 0x00007fca22800960 fn_expr = 0x00007fff5912fbe8 } typtransform = { fn_addr = 0x00007fca218f8e38 fn_oid = 570849696 fn_nargs= 32714 fn_strict = '\0' fn_retset = '\0' fn_stats = '8' fn_extra = 0x00007fca22066f60 fn_mcxt = 0x0000000000000000 fn_expr = 0x00007fff5912fce0 } typoid = 1494416304 typmod = 32767 typioparam = 114900880 typbyval = '\x01' typlen = 0 typalign = '\xa6' elm = 0x000000010735c8a0 } r = { atts = 0x00000000218f8ed0 natts = 1494417160 } } out = { d ={ func = 0x00007fff5912ff08 typfunc = { fn_addr = 0x000000010786ac18 (plpython3.so`PLy_CSreceive + 360 at plpy_spi.c:455) fn_oid = 570846544 fn_nargs = 32714 fn_strict = '\0' fn_retset = '\0' fn_stats = '\x80' fn_extra = 0x000000005912fce0 fn_mcxt = 0x0000000106d4d437 fn_expr = 0x00007fca2206ce30 } typtransform = { fn_addr = 0x00007fca22062f90 fn_oid = 0 fn_nargs = 0 fn_strict = '\0' fn_retset = '\0' fn_stats = '\0' fn_extra = 0xffffffdc22066c20 fn_mcxt= 0x0000000000000000 fn_expr = 0x00007fca22066f60 } typoid = 3756919206 typmod = -754934605 typioparam = 1494416704 typbyval = '\xff' typlen = 0 typalign = '^' elm = 0x00007fff5912fd10 } r = { atts = 0x00007fff5912ff08 natts = 126266392 } } is_rowtype = 0 typ_relid= 0 typrel_xmin = 570847072 typrel_tid = { ip_blkid = (bi_hi = 32714, bi_lo = 0) ip_posid = 36408 } mcxt= 0x000000012206c6f0 } (lldb) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On Wed, Jan 4, 2017 at 3:47 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 1/2/17 9:47 PM, Tom Lane wrote: >> Correct coding would be >> >> volatile TupleDesc desc = slot->tts_tupleDescriptor; >> CallbackState * volatile myState = (CallbackState *) self; >> PLyTypeInfo * volatile args = myState->args; >> >> because what needs to be marked volatile is the pointer variable, >> not what it points at. I'm a bit surprised you're not getting >> "cast away volatile" warnings from the code as you have it. > > > Unfortunately, that didn't make a difference. Amit's suggestion of isolating > the single statement in a PG_TRY() didn't work either, but assigning > args->in.r.atts[i] to a pointer did. > Good to know that it worked, but what is the theory? From your experiment, it appears that in some cases accessing local pointer variables is okay and in other cases, it is not okay. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On 1/3/17 9:20 PM, Amit Kapila wrote: > On Wed, Jan 4, 2017 at 3:47 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: >> On 1/2/17 9:47 PM, Tom Lane wrote: >>> Correct coding would be >>> >>> volatile TupleDesc desc = slot->tts_tupleDescriptor; >>> CallbackState * volatile myState = (CallbackState *) self; >>> PLyTypeInfo * volatile args = myState->args; >>> >>> because what needs to be marked volatile is the pointer variable, >>> not what it points at. I'm a bit surprised you're not getting >>> "cast away volatile" warnings from the code as you have it. >> >> >> Unfortunately, that didn't make a difference. Amit's suggestion of isolating >> the single statement in a PG_TRY() didn't work either, but assigning >> args->in.r.atts[i] to a pointer did. >> > > Good to know that it worked, but what is the theory? From your > experiment, it appears that in some cases accessing local pointer > variables is okay and in other cases, it is not okay. I can run some other experiments if you have any to suggest. I do think it's interesting that the data appeared to be completely fine until I actually ran whatever the first assembly instruction of the for loop is, so presumably it was fine after the sigsetjmp() call (which I'm assuming is what causes all the fuss to begin with...) From my understanding of what volatile does, I can understand why it might be necessary for something in the CATCH block to need it, but not in the TRY. Two other things of note that might possibly make a difference here: - This is happening inside a function used as a DestReceiver receiver - The original call is a plpython function, calling a plpython function, calling a plpython function (specifically, nested_call_one() in the plpython regression test). That does mean that the call stack looks something like this: plpython SPI_execute_callback (my customer DestReceiverer Setup function (PLy_CSSetup) is called somewhere in here, which is what populates myState) plpython SPI_execute_callback (PLy_CSSetup gets called again) plpython (this just returns a value) After that plpython call, the executor is going to call PLy_CSreceive, which is the function with this problematic code. So by the time this error happens, there are two nested levels of plpython+SPI going on. I originally thought the re-entrant calls were causing the problem, but after monitoring what PLy_CSSetup was doing and what PLy_CSreceive was getting that's not the case, or at least not the only reason for this. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > I do think it's interesting that the data appeared to be completely fine > until I actually ran whatever the first assembly instruction of the for > loop is, so presumably it was fine after the sigsetjmp() call (which I'm > assuming is what causes all the fuss to begin with...) From my > understanding of what volatile does, I can understand why it might be > necessary for something in the CATCH block to need it, but not in the TRY. While I was at Salesforce we saw some *extremely* odd behaviors in PG_TRY blocks that we eventually tracked down to the compiler effectively trying to store two different local variables in the same register. I don't recall all the details anymore, but it had something to do with register spills to stack locations and subsequent reloads happening at unfortunate places with respect to the sigsetjmp call. I figured it was a bug in the specific compiler we were using, because nobody in the PG community was reporting any problems with identical code. Anyway, "volatile" was an effective way of suppressing the issue. I wonder whether you're seeing something similar. You'd have to grovel through the assembly code for the function with and without "volatile" to be sure. regards, tom lane
On Thu, Jan 5, 2017 at 4:18 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 1/3/17 9:20 PM, Amit Kapila wrote: >> >> >> Good to know that it worked, but what is the theory? From your >> experiment, it appears that in some cases accessing local pointer >> variables is okay and in other cases, it is not okay. > > > I can run some other experiments if you have any to suggest. > > I do think it's interesting that the data appeared to be completely fine > until I actually ran whatever the first assembly instruction of the for loop > is, so presumably it was fine after the sigsetjmp() call (which I'm assuming > is what causes all the fuss to begin with...) From my understanding of what > volatile does, I can understand why it might be necessary for something in > the CATCH block to need it, but not in the TRY. > Memory contexts used in catch block also doesn't seem to be marked as volatile, you might want to try by marking them as volatile. Also, it might worth trying it on some other system to see if you are by any chance hitting the problem similar to what Tom has faced. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Jan 5, 2017 at 7:10 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > Memory contexts used in catch block also doesn't seem to be marked as > volatile, you might want to try by marking them as volatile. Also, it > might worth trying it on some other system to see if you are by any > chance hitting the problem similar to what Tom has faced. If a variable is modified within PG_TRY and then referenced in PG_CATCH it needs to be marked as volatile to be strictly in conformance with POSIX. This also ensures that any compiler does not do any stupid optimizations with those variables in the way they are referenced and used. -- Michael
On Fri, Jan 6, 2017 at 5:43 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > If a variable is modified within PG_TRY and then referenced in > PG_CATCH it needs to be marked as volatile to be strictly in > conformance with POSIX. This also ensures that any compiler does not > do any stupid optimizations with those variables in the way they are > referenced and used. That sort of begs the question of why PG_exception_stack is not marked as volatile, since the macros themselves modify it within the PG_TRY block and reference it within the PG_CATCH block. Is there some reason this variable is immune to the problem? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@gmail.com> writes: > On Fri, Jan 6, 2017 at 5:43 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> If a variable is modified within PG_TRY and then referenced in >> PG_CATCH it needs to be marked as volatile to be strictly in >> conformance with POSIX. This also ensures that any compiler does not >> do any stupid optimizations with those variables in the way they are >> referenced and used. > That sort of begs the question of why PG_exception_stack is not > marked as volatile, since the macros themselves modify it within > the PG_TRY block and reference it within the PG_CATCH block. Is > there some reason this variable is immune to the problem? It's not a local variable. regards, tom lane