Thread: Increase pltcl test coverage
This patch increases test coverage for pltcl, from 70% to 83%. Aside from that, the work on this uncovered 2 new bugs (the trigger return one I just submitted, as well as a bug in the SRF/composite patch). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Attachment
On 10/31/16 3:24 PM, Jim Nasby wrote: > This patch increases test coverage for pltcl, from 70% to 83%. Aside > from that, the work on this uncovered 2 new bugs (the trigger return one > I just submitted, as well as a bug in the SRF/composite patch). Rebased patch attached. Test coverage is now at 85% (by line count). Original patch by Karl Lehenbauer. Work sponsored by FlightAware. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > On 10/31/16 3:24 PM, Jim Nasby wrote: >> This patch increases test coverage for pltcl, from 70% to 83%. Aside >> from that, the work on this uncovered 2 new bugs (the trigger return one >> I just submitted, as well as a bug in the SRF/composite patch). > Rebased patch attached. Test coverage is now at 85% (by line count). This is in a format that neither patch(1) nor "git apply" recognize. Please resubmit in a more usual format, diff -c or diff -u perhaps. regards, tom lane
On 1/6/17 2:17 PM, Tom Lane wrote: > Jim Nasby <Jim.Nasby@BlueTreble.com> writes: >> On 10/31/16 3:24 PM, Jim Nasby wrote: >>> This patch increases test coverage for pltcl, from 70% to 83%. Aside >>> from that, the work on this uncovered 2 new bugs (the trigger return one >>> I just submitted, as well as a bug in the SRF/composite patch). > >> Rebased patch attached. Test coverage is now at 85% (by line count). > > This is in a format that neither patch(1) nor "git apply" recognize. > Please resubmit in a more usual format, diff -c or diff -u perhaps. Odd, dunno what happened there. New patch attached. BTW, this also includes some case changes back to lowercase. My muscle memory would prefer to go the other direction but I figured consistency was worth something. Feel free not to include that bit if you want. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Jim Nasby <Jim.Nasby@bluetreble.com> writes: > On 1/6/17 2:17 PM, Tom Lane wrote: >> This is in a format that neither patch(1) nor "git apply" recognize. >> Please resubmit in a more usual format, diff -c or diff -u perhaps. > Odd, dunno what happened there. New patch attached. This applies, but I fail to get the expected regression test output: *** /home/postgres/pgsql/src/pl/tcl/expected/pltcl_queries.out Sun Jan 8 11:54:19 2017 --- /home/postgres/pgsql/src/pl/tcl/results/pltcl_queries.out Sun Jan 8 12:18:27 2017 *************** *** 515,529 **** select tcl_eval('spi_prepare a "b {"'); ERROR: unmatched open brace in list select tcl_error_handling_test($tcl${spi_prepare "moo" }$tcl$); ! tcl_error_handling_test ! ------------------------------ ! COMMAND: spi_prepare "moo" + ! POSTGRES: 'POSTGRES' + ! TCL: LOOKUP + ! funcname: 'funcname' + ! lineno: 'lineno' ! (1 row) ! -- test full error text select tcl_error_handling_test($tcl$ spi_exec "DO $$ --- 515,521 ---- select tcl_eval('spi_prepare a "b {"'); ERROR: unmatched open brace in list select tcl_error_handling_test($tcl${spi_prepare "moo" }$tcl$); ! ERROR: list must have an even number of elements -- test full error text select tcl_error_handling_test($tcl$ spi_exec"DO $$ Investigation shows that $::errorCode contains just "NONE", which is why the "array set myArray $::errorCode" is blowing up. And that's sort of what I'd expect, because $err isinvalid command name " spi_prepare "moo" " which is not a Postgres-detected error. So probably the example tcl_error_handling_test function should not dispense with theif {[lindex $::errorCode 0] == "POSTGRES"} guard that we recommend in the manual. But I don't understand how you got the sample output shown in the patch. Is this based on some unsubmitted changes in pltcl's error handling? regards, tom lane
On 1/8/17 11:25 AM, Tom Lane wrote: > But I don't understand > how you got the sample output shown in the patch. Is this based > on some unsubmitted changes in pltcl's error handling? AFAICT you've got everything. What I had on my end is: create function public.tcl_error_handling_test(text) returns text language pltcl as $function$ if {[catch $1 err]} { # Set keys that will change over time to fixed values array set myArray $::errorCode set myArray(funcname) "'funcname'" set myArray(lineno) 'lineno' set myArray(POSTGRES) 'POSTGRES' # Format into something nicer set vals [] foreach {key} [lsort [array names myArray]] { set value [string map {"\n" "\n\t"} $myArray($key)] lappend vals "$key: $value" } return [join $vals "\n"] } else { return "no error" } $function$ ; Maybe it's a version difference? echo 'puts [info patchlevel];exit 0' | tclsh 8.6.6 Anyway, attached is a complete new patch that fixes that issue (and a couple test diffs I missed :/), as well as the utf_e2u issue you discovered. I've applied this patch to master via git apply and run it through make check-world, so hopefully this puts the horse out to pasture. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Jim Nasby <Jim.Nasby@bluetreble.com> writes: > On 1/8/17 11:25 AM, Tom Lane wrote: >> But I don't understand >> how you got the sample output shown in the patch. Is this based >> on some unsubmitted changes in pltcl's error handling? > Maybe it's a version difference? > echo 'puts [info patchlevel];exit 0' | tclsh > 8.6.6 Mmm, yeah, I'm on 8.5.13. Evidently what we're looking at here is a change in what Tcl puts into $::errorCode for this error. That being the case, we can't use $::errorCode for the regression test output, or it'll fail depending on Tcl version. I changed it to just return "$err", ie the basic error message. It might turn out that that's version-dependent too, but the buildfarm should tell us. Pushed with that and some other, mostly-cosmetic changes. regards, tom lane
I wrote: > Pushed with that and some other, mostly-cosmetic changes. Hmm, looks like the new test cases have turned up a pre-existing bug. Some of the buildfarm is showing crashes :-(. It looks like all the unhappy critters are running Tcl 8.4.something, which might be a coincidence but I bet not. I get the impression it's a stack clobber or memory clobber associated with reporting Tcl errors back to PG, but haven't been able to isolate it yet. regards, tom lane
On 1/9/17 1:22 PM, Tom Lane wrote: > I wrote: >> Pushed with that and some other, mostly-cosmetic changes. > > Hmm, looks like the new test cases have turned up a pre-existing bug. > Some of the buildfarm is showing crashes :-(. It looks like all the > unhappy critters are running Tcl 8.4.something, which might be a > coincidence but I bet not. > > I get the impression it's a stack clobber or memory clobber associated > with reporting Tcl errors back to PG, but haven't been able to isolate > it yet. I notice the failures are also on "unusual" platforms, with only 1 being x86. Though, that's very possibly the only animals running 8.4... I'm compiling 8.4 now, will see if I can duplicate. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On 1/9/17 3:12 PM, Jim Nasby wrote: > I'm compiling 8.4 now, will see if I can duplicate. Got a stack trace. The abort happens in TclObjLookupVar: if (nsPtr->varResProc != NULL || iPtr->resolverPtr != NULL) { nsPtr itself is NULL. > * thread #1: tid = 0x0000, 0x000000010949bca8 libtcl8.4g.dylib`TclObjLookupVar(interp=0x00007fdc7b508ba0, part1Ptr=0x00007fdc7d009090,part2=0x0000000000000000, flags=5, msg="set", createPart1=1, createPart2=1, arrayPtrPtr=0x00007fff5712e3c0)+ 440 at tclVar.c:400, stop reason = signal SIGSTOP > * frame #0: 0x000000010949bca8 libtcl8.4g.dylib`TclObjLookupVar(interp=0x00007fdc7b508ba0, part1Ptr=0x00007fdc7d009090,part2=0x0000000000000000, flags=5, msg="set", createPart1=1, createPart2=1, arrayPtrPtr=0x00007fff5712e3c0)+ 440 at tclVar.c:400 > frame #1: 0x000000010949d22e libtcl8.4g.dylib`Tcl_ObjSetVar2(interp=0x00007fdc7b508ba0, part1Ptr=0x00007fdc7d009090,part2Ptr=0x0000000000000000, newValuePtr=0x00007fdc7c00e800, flags=5) + 170 at tclVar.c:1517 > frame #2: 0x000000010941407a libtcl8.4g.dylib`Tcl_AddObjErrorInfo(interp=0x00007fdc7b508ba0, message="\n invokedfrom within\n\"__PLTcl_proc_16466 {quote foo bar}\"", length=-1) + 460 at tclBasic.c:6530 > frame #3: 0x0000000109411514 libtcl8.4g.dylib`Tcl_LogCommandInfo(interp=0x00007fdc7b508ba0, script="__PLTcl_proc_16466{quote foo bar}", command="__PLTcl_proc_16466 {quote foo bar}", length=34) + 488 at tclBasic.c:3521 > frame #4: 0x00000001094112fa libtcl8.4g.dylib`Tcl_EvalObjv(interp=0x00007fdc7b508ba0, objc=2, objv=0x00007fff5712e6d0,flags=393216) + 740 at tclBasic.c:3428 > frame #5: 0x000000010941247d libtcl8.4g.dylib`Tcl_EvalObjEx(interp=0x00007fdc7b508ba0, objPtr=0x00007fdc7c001260, flags=393216)+ 376 at tclBasic.c:5109 > frame #6: 0x00000001093efe24 pltcl.so`pltcl_func_handler + 660 > frame #7: 0x00000001093ef186 pltcl.so`pltcl_handler + 246 > frame #8: 0x0000000108c3ff91 postgres`ExecMakeFunctionResultNoSets + 209 > frame #9: 0x0000000108c3f42e postgres`ExecProject + 590 > frame #10: 0x0000000108c56153 postgres`ExecResult + 179 > frame #11: 0x0000000108c38bae postgres`ExecProcNode + 94 > frame #12: 0x0000000108c35010 postgres`standard_ExecutorRun + 288 > frame #13: 0x0000000108d5ab3f postgres`PortalRunSelect + 239 > frame #14: 0x0000000108d5a744 postgres`PortalRun + 500 > frame #15: 0x0000000108d58919 postgres`PostgresMain + 8745 > frame #16: 0x0000000108cec483 postgres`PostmasterMain + 7443 > frame #17: 0x0000000108c717ba postgres`main + 1562 > frame #18: 0x00007fff8dd765ad libdyld.dylib`start + 1 -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Jim Nasby <Jim.Nasby@bluetreble.com> writes: > Got a stack trace. The abort happens in TclObjLookupVar: Yeah, I found the problem: pltcl_returnnext calls pltcl_build_tuple_result which may throw elog(ERROR), leaving the Tcl interpreter's state all screwed up, so that later functions go south. It seems quite accidental that this is only failing with 8.4. We need pltcl_returnnext to use a subtransction, like the other pltcl statements that can call into generic PG code. Working on a fix now. regards, tom lane
On 1/9/17 4:23 PM, Tom Lane wrote: > Jim Nasby <Jim.Nasby@bluetreble.com> writes: >> Got a stack trace. The abort happens in TclObjLookupVar: > > Yeah, I found the problem: pltcl_returnnext calls pltcl_build_tuple_result > which may throw elog(ERROR), leaving the Tcl interpreter's state all > screwed up, so that later functions go south. It seems quite accidental > that this is only failing with 8.4. We need pltcl_returnnext to use a > subtransction, like the other pltcl statements that can call into generic > PG code. Working on a fix now. Hmm... I suspect there's more places where this could be a problem. For example, pltcl_quote calls palloc, which could ereport as well. Perhaps all of the internal commands should be wrapped in the pltcl_subtrans_begin() construct. The subtrans overhead would be unfortunate though. But AFAIK the only reason we need the subtrans is if we're underneath a TCL catch command, and there's probably a way to determine if that's the case. BTW, now that I've seen this pattern in pltcl and plpythonu, I'm wondering if there might be some easier way to handle it through our error callbacks. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Jim Nasby <Jim.Nasby@bluetreble.com> writes: > Hmm... I suspect there's more places where this could be a problem. For > example, pltcl_quote calls palloc, which could ereport as well. Yeah. I looked at that but couldn't get terribly excited about it, because AFAICS, Tcl in general is apt to fall over under sufficient memory pressure. There are too many API functions with no failure return provision, even though they clearly must do memory allocation under the hood. (The fact that they've even got ckalloc() tells you what their philosophy is here: they're content to PANIC on OOM.) I think pltcl should attempt to cover any error conditions that aren't purely out-of-memory ones, but in a quick scan I didn't see any other hazards. regards, tom lane
On 1/9/17 5:38 PM, Tom Lane wrote: > Jim Nasby <Jim.Nasby@bluetreble.com> writes: >> Hmm... I suspect there's more places where this could be a problem. For >> example, pltcl_quote calls palloc, which could ereport as well. > > Yeah. I looked at that but couldn't get terribly excited about it, > because AFAICS, Tcl in general is apt to fall over under sufficient > memory pressure. There are too many API functions with no failure > return provision, even though they clearly must do memory allocation > under the hood. (The fact that they've even got ckalloc() tells you > what their philosophy is here: they're content to PANIC on OOM.) Uhm... wow. According to [1] that's going to result in TCL calling abort(). I'm guessing there's no reasonable way for us to recognize a TCL abort as something that we didn't need to panic on... In any case, AFAICT there's still a bug here: if PG hits a memory error, we'll ERROR, which is going to leave the interpreter right back in a bad state for the rest of that session. That doesn't seem so good. It also means a pltcl proc wouldn't get the chance to trap the error. Though, since a memory error could just as likely come out of tcl, which is going to panic us anyway, I guess it doesn't matter. > I think pltcl should attempt to cover any error conditions that aren't > purely out-of-memory ones, but in a quick scan I didn't see any other > hazards. Yeah, I think we're OK on that front. I was hoping to establish a precedent for all the new TCL functions that pltcl provides so it would be extremely unlikely that the returnnext bug could be repeated. Putting them in a separate file with a nice comment block would be another alternative. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Jim Nasby <Jim.Nasby@bluetreble.com> writes: > On 1/9/17 5:38 PM, Tom Lane wrote: >> Yeah. I looked at that but couldn't get terribly excited about it, >> because AFAICS, Tcl in general is apt to fall over under sufficient >> memory pressure. > Though, since a memory error could just as likely come out of tcl, which > is going to panic us anyway, I guess it doesn't matter. Exactly. I can't get excited about making our code slower and less readable if there's only a fifty-fifty chance that doing so avoids a crash. Tcl users just need to stay far away from OOM conditions. (If it were a more popular language, maybe there would be reason to try to push to improve this, but ...) regards, tom lane