Thread: try/catch macros for Postgres backend
In service of the refactoring of error handling that I was talking about a few days ago, I'm finding that there are several places that really ought to catch longjmps and clean up after themselves, instead of expecting that whatever mess they've made will be cleaned up for them when control gets back to PostgresMain(). If we have functions that can catch errors, control might *not* go right back to PostgresMain(), and so throwing random cleanup actions into the sigsetjmp branch there is No Good.

This is no big deal since pltcl and plpython already do much the same thing, but I'm starting to think that instead of directly hacking on Warn_restart, it would be good to have some macros hiding the details. The syntax I'm toying with is

    PG_TRY();
    {
        ... code that might elog ...
    }
    PG_CATCH();
    {
        ... recovery code here ...
        PG_RE_THROW();      // optional
    }
    PG_END_CATCH();

The braces in this are not actually necessary, but will be good style since they help visually set off the controlled code. (You can't just indent the controlled code without adding braces, because pg_indent will helpfully undo it.) This would expand to something on the close order of

    do {
        sigjmp_buf local_save_restart;

        memcpy(local_save_restart, Warn_restart, sizeof(sigjmp_buf));
        if (sigsetjmp(Warn_restart, 1) == 0)
        {
            ... code that might elog ...
            memcpy(Warn_restart, local_save_restart, sizeof(sigjmp_buf));
        }
        else
        {
            memcpy(Warn_restart, local_save_restart, sizeof(sigjmp_buf));
            ... recovery code here ...
        }
    } while(0)

and of course PG_RE_THROW is just a siglongjmp call.

Does anyone have a problem with this macro syntax? The try/catch names are stolen from Java, so I'm figuring they won't terribly surprise any modern programmer, but I'm open to different names if anyone has a better idea.

Also, the memcpy technique for saving/restoring Warn_restart is what pltcl and plpython currently use, and it works, but it seems unnecessarily inefficient. A further improvement would be to replace Warn_restart by a pointer defined like

    extern sigjmp_buf *exception_stack_top;

and then the macro expansion would be something more like

    do {
        sigjmp_buf *save_exception_stack = exception_stack_top;
        sigjmp_buf local_sigjmp_buf;

        if (sigsetjmp(local_sigjmp_buf, 1) == 0)
        {
            exception_stack_top = &local_sigjmp_buf;
            ... code that might elog ...
            exception_stack_top = save_exception_stack;
        }
        else
        {
            exception_stack_top = save_exception_stack;
            ... recovery code here ...
        }
    } while(0)

while elog.c and PG_RE_THROW would need to do

    siglongjmp(*exception_stack_top, 1);

I think that this should work but does anyone know of any machines where it would have portability issues?

regards, tom lane

On Wed, 28 Jul 2004, Tom Lane wrote:

> In service of the refactoring of error handling that I was talking about
> a few days ago, I'm finding that there are several places that really
> ought to catch longjmps and clean up after themselves, instead of
> expecting that whatever mess they've made will be cleaned up for them
> when control gets back to PostgresMain(). If we have functions that can
> catch errors, control might *not* go right back to PostgresMain(), and
> so throwing random cleanup actions into the sigsetjmp branch there is
> No Good.
>
> This is no big deal since pltcl and plpython already do much the same
> thing, but I'm starting to think that instead of directly hacking on
> Warn_restart, it would be good to have some macros hiding the details.
> The syntax I'm toying with is
>
>     PG_TRY();
>     {
>         ... code that might elog ...
>     }
>     PG_CATCH();
>     {
>         ... recovery code here ...
>         PG_RE_THROW();      // optional
>     }
>     PG_END_CATCH();

Cool.

[snip]

> Also, the memcpy technique for saving/restoring Warn_restart is what
> pltcl and plpython currently use, and it works, but it seems
> unnecessarily inefficient. A further improvement would be to replace
> Warn_restart by a pointer defined like
>     extern sigjmp_buf *exception_stack_top;
>
> and then the macro expansion would be something more like
>
>     do {
>         sigjmp_buf *save_exception_stack = exception_stack_top;
>         sigjmp_buf local_sigjmp_buf;
>
>         if (sigsetjmp(local_sigjmp_buf, 1) == 0)
>         {
>             exception_stack_top = &local_sigjmp_buf;
>             ... code that might elog ...
>             exception_stack_top = save_exception_stack;
>         }
>         else
>         {
>             exception_stack_top = save_exception_stack;
>             ... recovery code here ...
>         }
>
>     } while(0)

Something I've been thinking about is allowing users to trigger named exceptions in PL/PgSQL. This would work something like this:

    CREATE FUNCTION ....
    DECLARE
        invalid EXCEPTION;
        count INT;
    BEGIN
        SELECT INTO count COUNT(*) FROM ...
        IF count < 10 THEN
            -- we shouldn't have been called
            RAISE EXCEPTION invalid;
        END IF;
        ...
    EXCEPTION
        WHEN invalid THEN
            ....
        ELSE
            RAISE NOTICE 'Unknown exception raised';
    END...

Another thing I've been thinking about: if an error is generated in PL/PgSQL, what state is the system in when control is handed to the exception handler? That is, should we roll back to the state at the start of the function? To the most recent savepoint?

Another thing I like about exceptions in some languages is the ability for a subroutine to generate an exception, hand control back to the caller, and have the caller raise the exception. I'm wondering how hard that would be in PL/PgSQL?

These are specific to PL/PgSQL and we may need something specific to that code, but I thought I'd raise these thoughts now as I intend to work on them for 7.6.

Thanks, Gavin

On Wed, Jul 28, 2004 at 08:19:17PM -0400, Tom Lane wrote:

Very cool and interesting idea.

> Does anyone have a problem with this macro syntax? The try/catch names
> are stolen from Java, so I'm figuring they won't terribly surprise any
> modern programmer, but I'm open to different names if anyone has a
> better idea.

The only comment I have so far is that Java and Python appear to have settled on try/catch/finally blocks. Maybe we need three blocks too, for handling more complex scenarios.

(The "finally" block, AFAIU, is executed whether an exception was raised or not, so it serves as cleanup for try and catch blocks. Somebody more knowledgeable in this OO matters may enlighten us better?)

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"No es bueno caminar con un hombre muerto" (It is not good to walk with a dead man)

> Does anyone have a problem with this macro syntax? The try/catch names
> are stolen from Java, so I'm figuring they won't terribly surprise any
> modern programmer, but I'm open to different names if anyone has a
> better idea.

I have done this kind of macro hiding of setjmp/longjmp for a math library, and it has been reused in the polylib library. Exceptions are important in an integer linear algebra library because on overflow the computations are wrong. You may have a look at the stuff if you want by googling for polylib.

--
Fabien Coelho - coelho@cri.ensmp.fr

Tom Lane wrote:
> In service of the refactoring of error handling that I was talking about
> a few days ago, I'm finding that there are several places that really
> ought to catch longjmps and clean up after themselves, instead of
> expecting that whatever mess they've made will be cleaned up for them
> when control gets back to PostgresMain().

This is especially a problem when the cleanup needs to be done inside the embedded interpreter. I found that with R, I had to throw an error in the R interpreter in order to allow the interpreter to clean up its own state. That left me with code like this:

8<--------------------
    /*
     * trap elog/ereport so we can let R finish up gracefully
     * and generate the error once we exit the interpreter
     */
    memcpy(&save_restart, &Warn_restart, sizeof(save_restart));
    if (sigsetjmp(Warn_restart, 1) != 0)
    {
        InError = false;
        memcpy(&Warn_restart, &save_restart, sizeof(Warn_restart));
        error("%s", "error executing SQL statement");
    }

    [...(execute query via SPI)...]
8<--------------------

The error() call throws the R interpreter error directly, so that on exit from the R function call I do this:

8<--------------------
    ans = R_tryEval(call, R_GlobalEnv, &errorOccurred);
    if(errorOccurred)
    {
        ereport(ERROR,...
8<--------------------

> Does anyone have a problem with this macro syntax? The try/catch names
> are stolen from Java, so I'm figuring they won't terribly surprise any
> modern programmer, but I'm open to different names if anyone has a
> better idea.

Looks good to me, but I worry about being able to do what I've described above. Basically I found that if I don't allow R to clean up after itself by propagating the SPI call generated error into R, before throwing a Postgres ERROR, I wind up with core dumps. Or am I just doing something wrong here (it seems to work for all of the test cases I could think of)?

Joe

On Thu, Jul 29, 2004 at 12:10:12AM -0400, Alvaro Herrera Munoz wrote:

> (The "finally" block, AFAIU, is executed whether an exception was raised
> or not, so it serves as cleanup for try and catch blocks. Somebody more
> knowledgeable in this OO matters may enlighten us better?)

...Or I could try. Yes, the "finally" block is executed after executing the "catch" block if an exception was caught, or when leaving the "try" block if there wasn't. That includes both normal completion and uncaught exceptions.

This is useful for cleanup stuff, as you say--mostly because Java doesn't have C++'s destructors to take the cleanup out of your hands.

Jeroen

Joe Conway <mail@joeconway.com> writes:
> This is especially a problem when the cleanup needs to be done inside
> the embedded interpreter. I found that with R, I had to throw an error
> in the R interpreter in order to allow the interpreter to clean up its
> own state. That left me with code like this:
> [ snip ]
> Looks good to me, but I worry about being able to do what I've described
> above. Basically I found that if I don't allow R to clean up after
> itself by propagating the SPI call generated error into R, before
> throwing a Postgres ERROR, I wind up with core dumps.

You could still do that, and perhaps even a bit more cleanly:

    sqlErrorOccurred = false;
    PG_TRY();
    {
        ans = R_tryEval(call, R_GlobalEnv, &errorOccurred);
    }
    PG_CATCH();
    {
        sqlErrorOccurred = true;
        /* push PG error into R machinery */
        error("%s", "error executing SQL statement");
    }
    PG_END_TRY();

    if (sqlErrorOccurred)
        PG_RE_THROW();
    if (errorOccurred)
        ereport(ERROR, "report R error here");

(The ereport will trigger only for errors originating in R, not for PG errors propagated out, which exit via the RE_THROW.)

However I wonder whether either of these really works. What happens inside R's "error()" routine, exactly? A longjmp? It seems like this structure is relying on the stack not to get clobbered between elog.c's longjmp and R's. Which would usually work, except when you happened to get a signal during those few instructions...

It seems like what you really need is a TRY inside each of the functions you offer as callbacks from R to PG. These would catch errors, return them as failures to the R level, which would in turn fail out to the tryEval call, and from there you could RE_THROW the original error (which will still be patiently waiting in elog.c).

regards, tom lane

"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes:
> ...Or I could try. Yes, the "finally" block is executed after executing
> the "catch" block if an exception was caught, or when leaving the "try"
> block if there wasn't. That includes both normal completion and uncaught
> exceptions.

Right. The last bit (FINALLY executes whether or not a CATCH block re-throws) seemed too messy to handle in my little macros, so I'm planning on leaving it out. But I'm open to the idea if anyone has a clever implementation thought.

What I have turning over at the moment is

    /*----------
     * API for catching ereport(ERROR) exits.  Use these macros like so:
     *
     *      PG_TRY();
     *      {
     *          ... code that might throw ereport(ERROR) ...
     *      }
     *      PG_CATCH();
     *      {
     *          ... error recovery code ...
     *      }
     *      PG_END_TRY();
     *
     * (The braces are not actually necessary, but are recommended so that
     * pg_indent will indent the construct nicely.)  The error recovery code
     * can optionally do PG_RE_THROW() to propagate the same error outwards.
     *
     * Note: while the system will correctly propagate any new ereport(ERROR)
     * occurring in the recovery section, there is a small limit on the number
     * of levels this will work for.  It's best to keep the error recovery
     * section simple enough that it can't generate any new errors, at least
     * not before popping the error stack.
     *----------
     */
    #define PG_TRY()  \
        do { \
            sigjmp_buf *save_exception_stack = PG_exception_stack; \
            ErrorContextCallback *save_context_stack = error_context_stack; \
            sigjmp_buf local_sigjmp_buf; \
            if (sigsetjmp(local_sigjmp_buf, 1) == 0) \
            { \
                PG_exception_stack = &local_sigjmp_buf

    #define PG_CATCH()  \
            } \
            else \
            { \
                PG_exception_stack = save_exception_stack; \
                error_context_stack = save_context_stack

    #define PG_END_TRY()  \
            } \
            PG_exception_stack = save_exception_stack; \
            error_context_stack = save_context_stack; \
        } while (0)

    #define PG_RE_THROW()  \
        siglongjmp(*PG_exception_stack, 1)

    extern DLLIMPORT sigjmp_buf *PG_exception_stack;

It's passing regression tests but I have some loose ends to fix before committing.

regards, tom lane

On Thu, Jul 29, 2004 at 09:58:54AM -0400, Tom Lane wrote:

> Right. The last bit (FINALLY executes whether or not a CATCH block
> re-throws) seemed too messy to handle in my little macros, so I'm
> planning on leaving it out. But I'm open to the idea if anyone has
> a clever implementation thought.

There's also the alternative of going to C++, of course, which would give you full native exception handling. Most of this "finally" stuff will go away when you have destructors, IMHO, and resource cleanups are a whole lot easier. The main drawback is that stricter rules apply to gotos and longjumps--but most of those will be "a poor man's exception handling" anyway.

Jeroen

Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>
>>This is especially a problem when the cleanup needs to be done inside
>>the embedded interpreter. I found that with R, I had to throw an error
>>in the R interpreter in order to allow the interpreter to clean up its
>>own state. That left me with code like this:
>>[ snip ]
>>Looks good to me, but I worry about being able to do what I've described
>>above. Basically I found that if I don't allow R to clean up after
>>itself by propagating the SPI call generated error into R, before
>>throwing a Postgres ERROR, I wind up with core dumps.
>
>
> You could still do that, and perhaps even a bit more cleanly:
>
>     sqlErrorOccurred = false;
>     PG_TRY();
>     {
>         ans = R_tryEval(call, R_GlobalEnv, &errorOccurred);
>     }
>     PG_CATCH();
>     {
>         sqlErrorOccurred = true;
>         /* push PG error into R machinery */
>         error("%s", "error executing SQL statement");
>     }
>     PG_END_TRY();
>
>     if (sqlErrorOccurred)
>         PG_RE_THROW();
>     if (errorOccurred)
>         ereport(ERROR, "report R error here");
>
> (The ereport will trigger only for errors originating in R, not for
> PG errors propagated out, which exit via the RE_THROW.)
>
> However I wonder whether either of these really work. What happens
> inside R's "error()" routine, exactly? A longjmp? It seems like this
> structure is relying on the stack not to get clobbered between elog.c's
> longjmp and R's. Which would usually work, except when you happened to
> get a signal during those few instructions...
>
> It seems like what you really need is a TRY inside each of the functions
> you offer as callbacks from R to PG. These would catch errors, return
> them as failures to the R level, which would in turn fail out to the
> tryEval call, and from there you could RE_THROW the original error
> (which will still be patiently waiting in elog.c).
>

For what it's worth, I think this looks really good. Especially when combined with the proposal discussed in the "Sketch of extending error handling for subtransactions in functions" thread. PL/Java makes heavy use (almost all calls) of TRY/CATCH macros today, so any performance increase, even a small one, might be significant. And the ability to catch an error and actually handle it, hear, hear!

Kind regards, Thomas Hallgren

On 07/28/04:30/3, Tom Lane wrote:
> In service of the refactoring of error handling that I was talking about
> a few days ago, I'm finding that there are several places that really
> ought to catch longjmps and clean up after themselves, instead of
> expecting that whatever mess they've made will be cleaned up for them
> when control gets back to PostgresMain(). If we have functions that can
> catch errors, control might *not* go right back to PostgresMain(), and
> so throwing random cleanup actions into the sigsetjmp branch there is
> No Good.

This is wonderful news. plpy for 7.5 will be very nice. :)

> This is no big deal since pltcl and plpython already do much the same
> thing, but I'm starting to think that instead of directly hacking on
> Warn_restart, it would be good to have some macros hiding the details.
> The syntax I'm toying with is
>
> ...
>
> Does anyone have a problem with this macro syntax? The try/catch names
> are stolen from Java, so I'm figuring they won't terribly surprise any
> modern programmer, but I'm open to different names if anyone has a
> better idea.

Sounds good, but perhaps it would be useful for some developers to have the macro syntax broken up into smaller pieces. plpythonu does/did this in plpython.h (gone now), and I rolled my own based on plpython's in plpy.

for example:

    #define PG_EXC_DECLARE() sigjmp_buf local_sigjmp_buf
    #define PG_EXC_SAVE() \
        sigjmp_buf *save_exception_stack = PG_exception_stack; \
        ErrorContextCallback *save_context_stack = error_context_stack
    #define PG_EXC_TRAP() (sigsetjmp(local_sigjmp_buf, 1) == 0)
    ...

You could then use those to make up PG_TRY, PG_CATCH, PG_END_TRY.

Although, I'm not too concerned about this, as if someone wants the smaller pieces they could probably just write their own without much difficulty.

> Also, the memcpy technique for saving/restoring Warn_restart is what
> pltcl and plpython currently use, and it works, but it seems
> unnecessarily inefficient. A further improvement would be to replace
> Warn_restart by a pointer defined like
>     extern sigjmp_buf *exception_stack_top;

Aye, good idea.

--
Regards, James William Pye

James William Pye <flaw@rhid.com> writes:
> On 07/28/04:30/3, Tom Lane wrote:
>> Does anyone have a problem with this macro syntax? The try/catch names
>> are stolen from Java, so I'm figuring they won't terribly surprise any
>> modern programmer, but I'm open to different names if anyone has a
>> better idea.

> Sounds good, but perhaps it would be useful for some developers to have
> the macro syntax broken up into smaller pieces, plpythonu does/did this in
> plpython.h(gone now), and I rolled my own based on plpython's in plpy.

Is there any actual functional usefulness to that, or is it just to avoid having to reorder existing code to fit into the try/catch paradigm?

I would actually prefer to force people to use the try/catch macros, in the name of code readability and consistent coding style. I had never felt that I understood the way the plpython error-trapping code was structured, until I had to go in and examine it in detail to rewrite it into the try/catch style. I think it's now a lot more legible to the casual reader, and that's an important goal for Postgres-related code.

> for example:

> #define PG_EXC_DECLARE() sigjmp_buf local_sigjmp_buf
> #define PG_EXC_SAVE() \
>     sigjmp_buf *save_exception_stack = PG_exception_stack; \
>     ErrorContextCallback *save_context_stack = error_context_stack
> #define PG_EXC_TRAP() (sigsetjmp(local_sigjmp_buf, 1) == 0)

If you're really intent on doing that, you probably can do it no matter what I say about it ;-). But I find it hardly any improvement over direct use of the setjmp API.

regards, tom lane

On 07/31/04:30/6, Tom Lane wrote:
> Is there any actual functional usefulness to that, or is it just to
> avoid having to reorder existing code to fit into the try/catch paradigm?

Both, I imagine. In the case of the former, it *may be* useful for someone to create their own paradigm, which seems like it would tie back into the latter.

> I would actually prefer to force people to use the try/catch macros, in
> the name of code readability and consistent coding style.

Ah, you must be a Python programmer at heart! ;)

> I had never
> felt that I understood the way the plpython error-trapping code was
> structured, until I had to go in and examine it in detail to rewrite it
> into the try/catch style.

Yeah, it wasn't pretty. When I first started on plpy, I hadn't even heard of sigjmp_buf before. Perhaps you can imagine the clumps of hair I had to pull out to finally get a grasp on it.

> I think it's now a lot more legible to the
> casual reader, and that's an important goal for Postgres-related code.

Definitely. It is a vast improvement over plpython's more demanding style.

> If you're really intent on doing that, you probably can do it no matter
> what I say about it ;-).

I have yet to decide whether to adopt the new syntax, as I just saw it yesterday, but it is likely that I will. I do depend on PG, so if it is convenient, I might as well use the tools that it gives me.

> But I find it hardly any improvement over
> direct use of the setjmp API.

Well, I find it more aesthetically appealing, and it can be quite nice to have a macro interface that allows the underlying implementation to change willy-nilly (not that it should, but that it can). I'll bet that's the "hardly any improvement" that you mentioned.

--
Regards, James William Pye

tgl@sss.pgh.pa.us (Tom Lane) wrote:
> Does anyone have a problem with this macro syntax? The try/catch
> names are stolen from Java, so I'm figuring they won't terribly
> surprise any modern programmer, but I'm open to different names if
> anyone has a better idea.

Mitch Bradley, once of Sun, once founder of "Bradley ForthWorks," and creator of the form of Forth used for the OpenBOOT standard, got it introduced to ANSI Forth. I remember this being a pretty neat addition to Forth back in the late '80s...

<http://dec.bournemouth.ac.uk/forth/euro/ef98/milendorf98.pdf>

That paper attributes it as having come from Lisp.

Java almost certainly "stole" it from Common Lisp, which took it from MacLisp (albeit with changes in semantics). Guy Steele, now a Sun Fellow, documents this in _Common Lisp, The Language_ (2nd edition), section 7.11.

Python has it; Perl has an add-in; Ruby has it; Scheme has it (possibly as an SRFI rather than in R?RS); Modula-3 has it (sans the "catch" keyword), and the canonical reference, there, is by Harbison, known to work with the very same Steele :-). Haskell hasn't a "throw," but does have a "catch." OCAML has "raise" and "try," where "catch" is implicit in the trying. ISO Prolog has catch/3 and throw/1. Ada has "raise" and exception handlers...

I would therefore think that anything _other_ than this naming convention would appear remarkable and surprising :-).

--
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/languages.html
If you're sending someone some Styrofoam, what do you pack it in?