Thread: Error handling in plperl and pltcl
plperl's error handling is not completely broken, but it's close :-(
Consider for example the following sequence on a machine with a relatively old Perl installation:

regression=# create or replace function foo(int) returns int as $$
regression$# return $_[0] + 1 $$ language plperl;
CREATE FUNCTION
regression=# select foo(10);
ERROR: trusted perl functions disabled - please upgrade perl Safe module to at least 2.09
regression=# create or replace function foo(int) returns int as $$
regression$# return $_[0] + 1 $$ language plperlu;
CREATE FUNCTION
regression=# select foo(10);
ERROR: creation of function failed: (in cleanup) Undefined subroutine &main::mkunsafefunc called at (eval 6) line 1.

What is happening here is that the elog() call that produced the "trusted perl functions disabled" message longjmp'd straight out of the Perl interpreter, without giving Perl any chance to clean up. Perl therefore still thinks it's executing inside the "Safe" module, wherein the mkunsafefunc() function can't be seen. You could probably devise much more spectacular failures than this one, given the fact that Perl's internal state will be left in a mess.

We can deal with this in a localized fashion for plperl's elog() subroutine, by PG_CATCH'ing the longjmp and converting it into a Perl croak() call. However it would be unsafe to do that for the spi_exec_query() subroutine, because then the writer of the Perl function might think he could trap the error with eval(). Which he mustn't do because any breakage in Postgres' state won't get cleaned up. We have to go through a transaction or subtransaction abort to be sure we have cleaned up whatever mess the elog was complaining about.

Similar problems have plagued pltcl for a long time. pltcl's solution is to save whatever Postgres error was reported from a SPI operation, and to forcibly re-throw that error after we get control back from Tcl, even if the Tcl code tried to catch the error. Needless to say, this is gross, and anybody who runs into it is going to think it's a bug.

What I think we ought to do is change both PL languages so that every SPI call is executed as a subtransaction. If the call elogs, we can clean up by aborting the subtransaction, and then we can report the error message as a Perl or Tcl error condition, which the function author can trap if he chooses. If he doesn't choose to, then the language interpreter will return an error condition to plperl.c or pltcl.c, and we can re-throw the error.

This will slow down the PL SPI call operations in both languages, but AFAICS it's the only way to provide error handling semantics that aren't too broken for words.

The same observations apply to plpython, of course, but I'm not volunteering to fix that language because I'm not at all familiar with it. Perhaps someone who is can make the needed changes there.

Comments?

			regards, tom lane
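For concreteness, here is a minimal sketch of what that per-SPI-call subtransaction could look like on the C side, using plperl as the example. The function and variable names are illustrative only (this is not existing plperl.c code), and a real implementation would also need to restore SPI connection state; the point is just the shape: run the SPI call inside a subtransaction, and on error roll the subtransaction back and hand the message to Perl as a trappable croak() instead of letting the longjmp escape through the interpreter:

    #include "postgres.h"
    #include "access/xact.h"
    #include "executor/spi.h"
    #include "utils/memutils.h"
    #include "utils/resowner.h"
    /* Perl headers provide croak() */
    #include "EXTERN.h"
    #include "perl.h"

    static int
    plperl_spi_exec_guarded(const char *query)   /* illustrative name */
    {
        volatile int    rc = 0;
        MemoryContext   oldcontext = CurrentMemoryContext;
        ResourceOwner   oldowner = CurrentResourceOwner;

        BeginInternalSubTransaction(NULL);
        /* starting the subtransaction switched contexts; get back to ours */
        MemoryContextSwitchTo(oldcontext);

        PG_TRY();
        {
            rc = SPI_exec(query, 0);

            /* success: commit the subtransaction and restore state */
            ReleaseCurrentSubTransaction();
            MemoryContextSwitchTo(oldcontext);
            CurrentResourceOwner = oldowner;
        }
        PG_CATCH();
        {
            ErrorData  *edata;

            /* save the error data, then clean up the subtransaction */
            MemoryContextSwitchTo(oldcontext);
            edata = CopyErrorData();
            FlushErrorState();

            RollbackAndReleaseCurrentSubTransaction();
            MemoryContextSwitchTo(oldcontext);
            CurrentResourceOwner = oldowner;

            /* surface it as a Perl error that eval{} can trap */
            croak("%s", edata->message);
        }
        PG_END_TRY();

        return rc;
    }

The pltcl equivalent would do the same dance but report the error by setting the Tcl interpreter result and returning TCL_ERROR rather than calling croak().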
Tom Lane wrote:
> What I think we ought to do is change both PL languages so that every
> SPI call is executed as a subtransaction. If the call elogs, we can
> clean up by aborting the subtransaction, and then we can report the
> error message as a Perl or Tcl error condition, which the function
> author can trap if he chooses. If he doesn't choose to, then the
> language interpreter will return an error condition to plperl.c or
> pltcl.c, and we can re-throw the error.
>
> This will slow down the PL SPI call operations in both languages, but
> AFAICS it's the only way to provide error handling semantics that aren't
> too broken for words.
>
> The same observations apply to plpython, of course, but I'm not
> volunteering to fix that language because I'm not at all familiar with
> it. Perhaps someone who is can make the needed changes there.
>
> Comments?
>
My approach with PL/Java is a bit different. While each SPI call is using a try/catch they are not using a subtransaction. The catch will however set a flag that will ensure two things:

1. No more calls can be made from PL/Java to the postgres backend.
2. Once PL/Java returns, the error will be re-thrown.

This allows PL/Java to catch the error, clean up (within the Java domain), and return, nothing more. The solution is IMO safe and could be used for all PL languages. It introduces no overhead with subtransactions, and developers writing functions are provided with a cleanup mechanism where resources not related to SPI can be handled (files closed, etc.).

Something that would be great for the future is if the errors could be divided into recoverable and unrecoverable.

Regards,
Thomas Hallgren
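Expressed in backend C terms, a minimal sketch of this flag-and-rethrow scheme might look like the following (the names are purely illustrative, not PL/Java's actual JNI code): the first error that escapes an SPI call is saved, further SPI use is refused, and the saved error is re-thrown when the function returns.

    #include "postgres.h"
    #include "executor/spi.h"
    #include "utils/elog.h"
    #include "utils/memutils.h"

    static ErrorData *pending_error = NULL;   /* set once an elog(ERROR) escapes */

    static int
    pl_spi_exec_flagged(const char *query, MemoryContext oldcontext)
    {
        volatile int rc = -1;

        /* point 1: no more backend calls once an error has occurred */
        if (pending_error != NULL)
            return -1;

        PG_TRY();
        {
            rc = SPI_exec(query, 0);
        }
        PG_CATCH();
        {
            MemoryContextSwitchTo(oldcontext);
            pending_error = CopyErrorData();
            FlushErrorState();
            /* the PL runtime turns this into a language-level exception so
             * the function can clean up its own (non-SPI) resources */
            rc = -1;
        }
        PG_END_TRY();

        return rc;
    }

    /* called by the language handler when the function returns */
    static void
    pl_function_exit(void)
    {
        if (pending_error != NULL)
        {
            ErrorData  *edata = pending_error;

            pending_error = NULL;
            ReThrowError(edata);    /* point 2: the error is re-thrown */
        }
    }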
Tom Lane wrote: >plperl's error handling is not completely broken, but it's close :-( >Consider for example the following sequence on a machine with a >relatively old Perl installation: > > > > You just picked an easy way to trigger this. As you rightly observe, there are others. >We can deal with this in a localized fashion for plperl's elog() >subroutine, by PG_CATCH'ing the longjmp and converting it into a Perl >croak() call. > > [...] >What I think we ought to do is change both PL languages so that every >SPI call is executed as a subtransaction. If the call elogs, we can >clean up by aborting the subtransaction, and then we can report the >error message as a Perl or Tcl error condition, which the function >author can trap if he chooses. If he doesn't choose to, then the >language interpreter will return an error condition to plperl.c or >pltcl.c, and we can re-throw the error. > > We can do both of these, no? >This will slow down the PL SPI call operations in both languages, but >AFAICS it's the only way to provide error handling semantics that aren't >too broken for words. > > > > Can you estimate the extent of the slowdown? cheers andrew
Thomas Hallgren <thhal@mailblocks.com> writes: > My approach with PL/Java is a bit different. While each SPI call is > using a try/catch they are not using a subtransaction. The catch will > however set a flag that will ensure two things: > 1. No more calls can be made from PL/Java to the postgres backend. > 2. Once PL/Java returns, the error will be re-thrown. That's what pltcl has always done, and IMHO it pretty well sucks :-( it's neither intuitive nor useful. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> This will slow down the PL SPI call operations in both languages, but
>> AFAICS it's the only way to provide error handling semantics that aren't
>> too broken for words.

> Can you estimate the extent of the slowdown?

Without actually doing the work, the closest comparison I can make is between plpgsql functions with and without exception blocks. I tried

create or replace function foo(int) returns int as '
declare x int;
begin
  select into x unique1 from tenk1 where unique2 = $1;
  return x;
end' language plpgsql;

create or replace function foo(int) returns int as '
declare x int;
begin
  begin
    select into x unique1 from tenk1 where unique2 = $1;
  exception when others then null;
  end;
  return x;
end' language plpgsql;

and used

explain analyze select foo(unique2) from tenk1;

to execute each one 10000 times without too much overhead. I get about 6900 vs 12800 msec, so for a simple pre-planned query it's not quite a 50% overhead.

This is probably about the worst case you'd see in practice --- unlike plpgsql, plperl and pltcl functions wouldn't be calling the SQL engine to do simple arithmetic, so they're not going to have SPI calls that do much less work than this example does.

			regards, tom lane
Tom Lane wrote:
> Thomas Hallgren <thhal@mailblocks.com> writes:
>> My approach with PL/Java is a bit different. While each SPI call is
>> using a try/catch they are not using a subtransaction. The catch will
>> however set a flag that will ensure two things:
>>
>> 1. No more calls can be made from PL/Java to the postgres backend.
>> 2. Once PL/Java returns, the error will be re-thrown.
>
> That's what pltcl has always done, and IMHO it pretty well sucks :-(
> it's neither intuitive nor useful.
>
Given that most SPI actions that you do don't elog (most of them are typically read-only), it's far more useful than imposing the overhead of a subtransaction on all calls. That, IMHO, would really suck :-(

Ideally, the behavior should be managed so that if a subtransaction is started intentionally, crash recovery would be possible and the function should be able to continue after it has issued a rollback of that subtransaction.

I'm surprised you say that this is not useful. I've found that in most cases when you encounter an elog, this is the most intuitive behavior. Either you don't do any cleanup, i.e. just return and let the elog be re-thrown, or you close some files, free up some resources or whatever, then you return. Not many functions would continue executing after an elog, unless of course, you *intentionally* started a subtransaction.

I'll investigate what's entailed in handling SPI calls performed in a subtransaction differently, so that calls are blocked only until the subtransaction is rolled back. Since I have my own JDBC driver, that doesn't sound too hard. I guess PL/Perl and PL/Tcl have something similar where they could track this. Such handling, in combination with a "recoverable" status in the elog's error structure, would create a really nice (and efficient) subsystem.

Regards,
Thomas Hallgren
Thomas Hallgren <thhal@mailblocks.com> writes: > Tom Lane wrote: >> That's what pltcl has always done, and IMHO it pretty well sucks :-( >> it's neither intuitive nor useful. >> > Given that most SPI actions that you do doesn't elog (most of them are > typically read-only), it's far more useful than imposing the overhead of > a subtransaction on all calls. That IMHO, would really suck :-( I don't think we really have any alternative --- certainly not if you want to continue to regard plperl as a trusted language. I haven't bothered to develop a test case, but I'm sure it's possible to crash the backend by exploiting the lack of reasonable error handling in spi_exec_query. There's an ancient saying "I can make this code arbitrarily fast ... if it doesn't have to give the right answer". I think that applies here. Fast and unsafe is not how the Postgres project customarily designs things. I'd rather get the semantics right the first time and then look to optimize later. (I'm sure we can do more to speed up subtransaction entry/exit than we have so far.) regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > (I'm sure we can do more to speed up subtransaction entry/exit than we have > so far.) Is there anything that can be done to short circuit the _first_ layer of subtransaction? I'm thinking there will be many cases like this where there's one implicit subtransaction that users don't even know is there. in particular I'm thinking of psql introducing a subtransaction on every query to allow recovery from typos and other errors. Drivers may do something similar to allow the application to catch errors using language constructs like exceptions and recover. In many environments there will be one layer of subtransaction on every query. -- greg
I wrote:
> I get about 6900 vs 12800 msec, so for a simple pre-planned query
> it's not quite a 50% overhead.

However, that was yesterday ;-). I did some profiling and found some easy-to-knock-off hotspots. Today I'm measuring about 25% overhead for a simple SELECT, which I think is entirely acceptable considering the cleanliness of definition that we're buying.

I changed my test cases to be

create or replace function foo(int,int) returns int as '
declare x int;
begin
  for i in 1 .. $1 loop
    select into x unique1 from tenk1 where unique2 = $2;
  end loop;
  return x;
end' language plpgsql;

create or replace function foos(int,int) returns int as '
declare x int;
begin
  for i in 1 .. $1 loop
    begin
      select into x unique1 from tenk1 where unique2 = $2;
    exception when others then null;
    end;
  end loop;
  return x;
end' language plpgsql;

so as to minimize the extraneous overhead --- I think this is a harder test (gives a higher number) than what I was doing yesterday.

			regards, tom lane
On Fri, 2004-11-19 at 16:58 -0500, Tom Lane wrote:
> What I think we ought to do is change both PL languages so that every
> SPI call is executed as a subtransaction. If the call elogs, we can
> clean up by aborting the subtransaction, and then we can report the
> error message as a Perl or Tcl error condition, which the function
> author can trap if he chooses. If he doesn't choose to, then the
> language interpreter will return an error condition to plperl.c or
> pltcl.c, and we can re-throw the error.

I do this already in my plpy, save the subtransaction handling "feature". In plpy, all Postgres ERRORs are caught and transformed into Python exceptions; then, when the interpreter exits with a Python exception, it is transformed back into a Postgres ERROR and raised. I even created a class of Python exceptions for Postgres ERRORs (e.g. raise Postgres.ERROR('msg', code=someErrCode, hint='foo')). (And more specific classes as well, putting errcodes to good use.)

I plan (well, I'm already working on it) to create Python interfaces to Postgres transaction facilities so that the author can start, rollback, and commit subxacts as needed for use/cleanup. Of course, I feel that this is the best way to go as far as subxacts are concerned; leaving the details to the author.

I have been playing with RollbackToSavepoint and ReleaseSavepoint, but per Neil's comments on IRC, and the fact that I have to annoyingly construct a List containing the savepoint name, I get the feeling that I am not meant to use them. If they are provided for possible use, shouldn't they take a string instead of a List? (Is a List used here to discourage use?)

--
Regards, James William Pye
James William Pye <flaw@rhid.com> writes: > I have been playing with RollbackToSavepoint and ReleaseSavepoint, but > per Neil's comments on IRC and the fact that I have to annoyingly > construct a List containing the savepoint name. I get the feeling that I > am not meant to use them. You're right. You can *not* expose those as user-callable operations in a PL language. Consider for example what will happen if the user tries to roll back to a savepoint that was established outside your function call, or tries to exit the function while still inside a local savepoint. You have to enforce strict nesting of functions and subtransactions; therefore it's a lot easier to present an API that looks like an exception-block construct (per plpgsql), or that just hides the whole deal in the SPI calling interface (as I'm proposing for plperl/pltcl). There's been some discussion of creating a "stored procedure" language that would execute outside the database engine, but still on the server side of the network connection. In that sort of context it would be reasonable to let the user do SAVEPOINT/ROLLBACK (or any other SQL command). But our existing PLs most definitely execute inside the engine, and therefore they can't expose facilities that imply arbitrary changes in the subtransaction state stack. regards, tom lane
Tom Lane wrote:
> There's an ancient saying "I can make this code arbitrarily fast ...
> if it doesn't have to give the right answer". I think that applies
> here. Fast and unsafe is not how the Postgres project customarily
> designs things.

I'm missing something, that's clear. Because I can't see why the PL/Java way of doing it is anything but both fast and 100% safe. I agree 100% that unsafe is not an option. I'm arguing that since my design is totally safe, intuitive, and covers 90% of the use-cases, it is the best one.

Regards,
Thomas Hallgren

PS. The current design that prevents non-volatile functions from doing things with side effects is not very safe ;-) I persist in claiming that there's a better (and safe) way to handle that.
Tom Lane wrote:
> James William Pye <flaw@rhid.com> writes:
>
>> I have been playing with RollbackToSavepoint and ReleaseSavepoint, but
>> per Neil's comments on IRC and the fact that I have to annoyingly
>> construct a List containing the savepoint name. I get the feeling that I
>> am not meant to use them.
>
> You're right. You can *not* expose those as user-callable operations in
> a PL language. Consider for example what will happen if the user tries
> to roll back to a savepoint that was established outside your function
> call, or tries to exit the function while still inside a local
> savepoint. You have to enforce strict nesting of functions and
> subtransactions; therefore it's a lot easier to present an API that
> looks like an exception-block construct (per plpgsql), or that just
> hides the whole deal in the SPI calling interface (as I'm proposing for
> plperl/pltcl).
>
> There's been some discussion of creating a "stored procedure" language
> that would execute outside the database engine, but still on the server
> side of the network connection. In that sort of context it would be
> reasonable to let the user do SAVEPOINT/ROLLBACK (or any other SQL
> command). But our existing PLs most definitely execute inside the
> engine, and therefore they can't expose facilities that imply arbitrary
> changes in the subtransaction state stack.
>
I'm planning to add subtransactions too, but my approach will be to use the savepoint functionality already present in the java.sql.Connection interface. Perhaps the plpy implementation could do something similar. This is what I'm planning to implement:

In Java, safepoints are identified by an interface rather than just by a name. I will (invisibly) include both the name of the safepoint and the call level in my implementation of that interface. I will also have a nested "call context" where I manage safepoints created by the executing function. All of this will be completely hidden from the function developer. This will make it possible to enforce the following rules:

1. A Safepoint lifecycle must be confined to a function call.
2. Safepoints must be rolled back or released by the same function that sets them.

Failure to comply with those rules will result in an exception (elog ERROR) that will be propagated all the way up.

Would you consider this as safe?

Regards,
Thomas Hallgren
Thomas Hallgren wrote >> I'm planning to add subtransactions too, but my approach will be to >> use the savepoint functionality already present in the >> java.sql.Connection interface. Perhaps the plpy implementation could >> do something similar. This is what I'm planning to implement: > > In Java, safepoints are identified by an interface rather then just by > a name. I will (invisibly) include both the name of the safepoint and > the call level in my implementation of that interface. I will also > have a nested "call context" where I manage safepoints created by the > executing function. All of this will be completely hidden from the > function developer. This will make it possible to enforce the > following rules: > > 1. A Safepoint lifecycle must be confined to a function call. > 2. Safepoints must be rolled back or released by the same function > that sets them. > > Failure to comply with those rules will result in an exception (elog > ERROR) that will be propagated all the way up. > > Would you consider this as safe? > > Regards, > Thomas Hallgren s/safepoint/savepoint/g
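As a rough illustration of how rules like these could be enforced, here is a hypothetical C sketch (the names and structures are invented for the example and are not actual PL/Java code): each savepoint records the invocation level that created it, and the language handler checks that level when the savepoint is ended and again when the invocation exits.

    #include "postgres.h"
    #include "access/xact.h"

    /* illustrative handle: a savepoint tagged with its creating invocation */
    typedef struct PLSavepoint
    {
        char       *name;
        int         call_level;
    } PLSavepoint;

    static int  current_call_level = 0;     /* bumped on PL function entry */

    static void
    pl_set_savepoint(PLSavepoint *sp)
    {
        BeginInternalSubTransaction(sp->name);
        sp->call_level = current_call_level;
    }

    static void
    pl_end_savepoint(PLSavepoint *sp, bool rollback)
    {
        /* rule 2: only the invocation that set the savepoint may end it */
        if (sp->call_level != current_call_level)
            ereport(ERROR,
                    (errmsg("savepoint \"%s\" was set by another function invocation",
                            sp->name)));

        if (rollback)
            RollbackAndReleaseCurrentSubTransaction();
        else
            ReleaseCurrentSubTransaction();
    }

    /* rule 1: on invocation exit, any savepoint left open is closed here
     * (released or rolled back per configuration) with a warning */
    static void
    pl_invocation_exit(PLSavepoint *leftover)
    {
        if (leftover != NULL && leftover->call_level == current_call_level)
        {
            ereport(WARNING,
                    (errmsg("savepoint \"%s\" was not closed by the function that set it",
                            leftover->name)));
            RollbackAndReleaseCurrentSubTransaction();
        }
        current_call_level--;
    }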
On Sat, 2004-11-20 at 16:39 -0500, Tom Lane wrote: > You're right. You can *not* expose those as user-callable operations in > a PL language. Consider for example what will happen if the user tries > to roll back to a savepoint that was established outside your function > call, or tries to exit the function while still inside a local > savepoint. You have to enforce strict nesting of functions and > subtransactions; therefore it's a lot easier to present an API that > looks like an exception-block construct (per plpgsql), or that just > hides the whole deal in the SPI calling interface (as I'm proposing for > plperl/pltcl). Hrm, what about a savepoint scoping facility that would be wrapped around calls to [volatile?] functions to explicitly enforce these regulations? [...Poking around the archives a bit...] [Or do I mean savepoint levels?]: http://archives.postgresql.org/pgsql-hackers/2004-07/msg00505.php http://archives.postgresql.org/pgsql-hackers/2004-09/msg00569.php -- Regards, James William Pye
On 11/19/2004 7:54 PM, Tom Lane wrote: > Thomas Hallgren <thhal@mailblocks.com> writes: >> My approach with PL/Java is a bit different. While each SPI call is >> using a try/catch they are not using a subtransaction. The catch will >> however set a flag that will ensure two things: > >> 1. No more calls can be made from PL/Java to the postgres backend. >> 2. Once PL/Java returns, the error will be re-thrown. > > That's what pltcl has always done, and IMHO it pretty well sucks :-( > it's neither intuitive nor useful. At the time that code was written it simply acted as a stopgap to prevent subsequent SPI calls after elog while still unwinding the Tcl call stack properly to avoid resource leaking inside of Tcl. I don't agree that the right cure is to execute each and every statement itself as a subtransaction. What we ought to do is to define a wrapper for the catch Tcl command, that creates a subtransaction and executes the code within during that. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes: > I don't agree that the right cure is to execute each and every statement > itself as a subtransaction. What we ought to do is to define a wrapper > for the catch Tcl command, that creates a subtransaction and executes > the code within during that. What I would like to do is provide a catch-like Tcl command that defines a subtransaction, and then optimize the SPI commands so that they don't create their own sub-subtransaction if they can see they are directly within the subtransaction command. But when they aren't, they need to define their own subtransactions so that the error semantics are reasonable. I think what you're saying is that a catch command should be exactly equivalent to a subtransaction, but I'm unconvinced --- a catch might be used around some Tcl operations that don't touch the database, in which case the subtransaction overhead would be a serious waste. The real point here is that omitting the per-command subtransaction ought to be a hidden optimization, not something that intrudes to the point of having unclean semantics when we can't do it. regards, tom lane
On 11/29/2004 10:43 PM, Tom Lane wrote: > Jan Wieck <JanWieck@Yahoo.com> writes: >> I don't agree that the right cure is to execute each and every statement >> itself as a subtransaction. What we ought to do is to define a wrapper >> for the catch Tcl command, that creates a subtransaction and executes >> the code within during that. > > What I would like to do is provide a catch-like Tcl command that defines > a subtransaction, and then optimize the SPI commands so that they don't > create their own sub-subtransaction if they can see they are directly > within the subtransaction command. But when they aren't, they need to > define their own subtransactions so that the error semantics are > reasonable. I think what you're saying is that a catch command should > be exactly equivalent to a subtransaction, but I'm unconvinced --- a > catch might be used around some Tcl operations that don't touch the > database, in which case the subtransaction overhead would be a serious > waste. That is right. What the catch replacement command should do is to establish some sort of "catch-level", run the script inside the catch block. The first spi operation inside of that block causes a subtransaction to be created and remembered in that catch-level. At the end - i.e. when that block of commands finishes, the subtransaction is committed or rolled back and nothing done if the command block didn't hit any spi statement. > > The real point here is that omitting the per-command subtransaction > ought to be a hidden optimization, not something that intrudes to the > point of having unclean semantics when we can't do it. We could treat the entire function call as one subtransaction in the first place. Then create more sub-subtransactions as catch blocks appear. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Tom Lane wrote:
> The real point here is that omitting the per-command subtransaction
> ought to be a hidden optimization, not something that intrudes to the
> point of having unclean semantics when we can't do it.

Sorry to be stupid here, but I didn't understand this when it was discussed originally either. Why a subtransaction per command rather than one per function? If I've got this right, this is so the PL can tidy up behind itself and report/log an appropriate error?

--
Richard Huxton
Archonet Ltd
Richard Huxton wrote:
> Tom Lane wrote:
>
>> The real point here is that omitting the per-command subtransaction
>> ought to be a hidden optimization, not something that intrudes to the
>> point of having unclean semantics when we can't do it.
>
> Sorry to be stupid here, but I didn't understand this when it was
> discussed originally either. Why a subtransaction per command rather
> than one per function? If I've got this right, this is so the PL can
> tidy up behind itself and report/log an appropriate error?

I don't understand this either. Why a subtransaction at all?

Don't get me wrong. I fully understand that a subtransaction would make error recovery possible. What I try to say is that the kind of error recovery that needs a subtransaction is fairly, or perhaps even very, rare.

We all agree that further calls to SPI must be prohibited if an error occurs when no subtransaction is active. Such an error can only result in one thing: the function must terminate and the error must be propagated. The way most functions that I've seen are written, this is the most common behavior anyway. It's very uncommon that you want to do further database accesses after something has gone wrong. I admit that some special cases indeed do exist, but I cannot for my life understand why those cases must incur a 25% overhead on everything else. Especially if there is an alternate way of handling them without making any sacrifice whatsoever on safety.

A function in PL/Java that calls the backend and encounters an error is in one of two situations:

1. If no subtransaction is active, the function will be completely and utterly blocked from doing further calls to the backend. When it returns, the error will be re-thrown.

2. When a subtransaction is active, the function will be blocked the same way as for #1, with one exception: a subtransaction rollback will go through, and it will remove the block.

So, in Java I have the choice of writing:

try
{
    // do something
}
catch(SQLException e)
{
    // Clean up (but no backend calls) and terminate
}

or I can write:

Savepoint sp = myConn.setSavepoint("foo");
try
{
    // do something
    sp.commit();
}
catch(SQLException e)
{
    sp.rollback();
    // Handle error and continue execution.
}

All cases are covered, there's no subtransaction overhead (unless you really want it), the semantics are clean, and it's 100% safe. What's wrong with this approach?

Regards,
Thomas Hallgren
Richard Huxton <dev@archonet.com> writes: > Tom Lane wrote: >> The real point here is that omitting the per-command subtransaction >> ought to be a hidden optimization, not something that intrudes to the >> point of having unclean semantics when we can't do it. > Sorry to be stupid here, but I didn't understand this when it was > disussed originally either. Why a subtransaction per command rather than > one per function? So that when the Tcl programmer writes "catch" around a SPI command, or the Perl programmer writes "eval" around a SPI command, they see sensible behavior. A reasonable person would expect to be able to use the language's standard error-trapping constructs to trap any error thrown by a SPI call and then continue processing (a la plpgsql exception blocks). Before 8.0 it was impossible to support this behavior, and what we actually did was, in effect, to seal off the Tcl or Perl function so that it couldn't touch the database state --- after the first SPI error all subsequent SPI operations would fail immediately until control exited the Tcl or Perl function, whereupon the error would be re-thrown. So you could try to trap an error but you couldn't do anything useful after having done so, and you couldn't prevent it from aborting the surrounding transaction. I feel that behavior was obviously bogus and cannot be justified simply on grounds of efficiency. A wise man once said "I can make this program arbitrarily fast ... if it doesn't have to give the right answer"; I think that applies here. The semantics I want to see are that catch/eval can trap errors and continue processing, and given the tools we have at the moment that requires a subtransaction per SPI call. We can think about ways to optimize this later, but I'm not putting up with the broken semantics any longer than I have to. In the case of Perl I suspect it is reasonably possible to determine whether there is an "eval" surrounding the call or not, although we might have to get more friendly with Perl's internal data structures than a purist would like. In the case of Tcl I'm not sure this is really going to be feasible :-(, because AFAICS the interpreter state is encoded as a series of return addresses buried on the stack; and even if you could detect the standard "catch" function you couldn't be sure what other custom-built Tcl statements might have catch-like functionality. But perhaps for Tcl we could think in terms of optimizations like continuing one subtransaction across multiple SPI commands as long as there's no failure. Jan also suggested the possibility of replacing the standard "catch" command, which might be good enough (though the prospect of nonstandard catch-like commands worries me). regards, tom lane
Thomas Hallgren <thhal@mailblocks.com> writes: > I don't understand this either. Why a subtransaction at all? > Don't get me wrong. I fully understand that a subtransaction would make > error recovery possible. What I try to say is that the kind of error > recovery that needs a subtransaction is fairly, or perhaps even very, rare. On what evidence do you base that claim? It's true there are no existing Tcl or Perl functions that do error recovery from SPI operations, because it doesn't work in existing releases. That does not mean the demand is not there. We certainly got beat up on often enough about the lack of error trapping in plpgsql. > or I can write: > Savepoint sp = myConn->setSavepoint("foo"); > try > { > // do something > sp.commit(); > } > catch(SQLException e) > { > sp.rollback(); > // Handle error and continue execution. > } [ shrug... ] If you intend to design pljava that way I can't stop you. But I think it's a bogus design, because (a) it puts extra burden on the function author who's already got enough things to worry about, and (b) since you can't support arbitrary rollback patterns, you have to contort the semantics of Savepoint objects with restrictions that are both hard to design correctly and complicated to enforce. I don't believe you should do language design on the basis of avoiding a 25% overhead, especially not when there's every reason to think that number can be reduced in future releases. I got it down from 50% to 25% in one afternoon, doing nothing that seemed too risky for late beta. I think there's plenty more that can be done there when we have more time to work on it. regards, tom lane
Tom Lane wrote: > >In the case of Perl I suspect it is reasonably possible to determine >whether there is an "eval" surrounding the call or not, although we >might have to get more friendly with Perl's internal data structures >than a purist would like. > > Not really very hard. (caller(0))[3] should have the value "(eval)" if you are in an eval. There might also be some ways of getting this via the perlguts API although I'm not aware of it. Of course, if you're in a subroutine which is in turn called from an eval things get trickier, so we might have to walk the stack frames a bit. cheers andrew
Tom Lane wrote:
> On what evidence do you base that claim? It's true there are no
> existing Tcl or Perl functions that do error recovery from SPI
> operations, because it doesn't work in existing releases. That does
> not mean the demand is not there. We certainly got beat up on often
> enough about the lack of error trapping in plpgsql.

Lack of error trapping is one thing. To state that all error trapping will do further accesses to the database is another thing altogether. I don't have evidence for my claim, since subtransactions haven't been available for that long, but it's a pretty strong hunch. And the fact that all current PostgreSQL functions out there work this way today should count for something. Your suggestion will make the current code base significantly slower, IMO for no reason.

> [ shrug... ] If you intend to design pljava that way I can't stop you.
> But I think it's a bogus design, because (a) it puts extra burden on the
> function author who's already got enough things to worry about

So it's an extra burden to create a savepoint, and commit/rollback depending on the outcome? I'm sorry, but I have to disagree with that. I think it's a powerful concept that developers will want to exploit. Confusing try/catch with subtransactions is bogus and not an option for me, as I don't have the liberty of changing the language.

A strong argument for my design is that if I were to write similar code in the client using the JDBC driver, this is exactly what I'd have to do. Why should code look any different just because I move it to the backend? So, I can't see the extra burden at all. This approach brings clarity, no magic, and it enables ports of languages where SQL access has been standardized to actually conform to that standard. That's most certainly not bogus!

> (b) since you can't support arbitrary rollback patterns, you have to
> contort the semantics of Savepoint objects with restrictions that are
> both hard to design correctly and complicated to enforce.

On the contrary. It's very easy to enforce, and PL/Java already does this. The design is simple and clean. Savepoints are prohibited from living beyond the invocation where they were created. If a savepoint is still active when an invocation exits, the savepoint is released or rolled back (depending on a GUC setting) and a warning is printed.

Here I have a couple of questions for you: From your statement it sounds like you want to use the subtransactions solely in a hidden mechanism and completely remove the ability to use them from the function developer. Is that a correct interpretation?

Another question relating to a statement you made earlier. You claim that an SPI call should check to see if it is in a subtransaction and only enter a new one if that's not the case. How do you in that case intend to keep track of where the subtransaction started? I.e. how far up in nesting levels do you need to jump before you reach the right place? My argument is that whenever possible, you must let the creator of a subtransaction have the responsibility to commit or roll it back.

> I don't believe you should do language design on the basis of avoiding
> a 25% overhead

I don't do language design. I'm adhering to the JDBC standard and I have no way of enforcing magic code to be executed during try/catch. Meanwhile, I really want PL/Java developers to have the ability to make full use of savepoints.

> I got it down from 50% to 25%
> in one afternoon, doing nothing that seemed too risky for late beta.
> I think there's plenty more that can be done there when we have more
> time to work on it.

That's great. But even if you come down to 10% overhead it doesn't really change anything.

Regards,
Thomas Hallgren
Thomas Hallgren <thhal@mailblocks.com> writes: > From your statement it sounds like you want to use the subtransactions > solely in a hidden mechanism and completely remove the ability to use > them from the function developer. Is that a correct interpretation? No; I would like to develop the ability to specify savepoints in pltcl and plperl, so that already-executed SPI commands can be rolled back at need. But that is a feature for later --- it's way too late to think about it for 8.0. Moreover, having that will not remove the requirement for the state after catching a SPI error to be sane. The fundamental point you are missing, IMHO, is that a savepoint is a mechanism for rolling back *already executed* SPI commands when the function author wishes that to happen. A failure in an individual command should not leave the function in a broken state. regards, tom lane
While your message was directed at Thomas, I think I share Thomas' position; well, for the most part.

On Tue, 2004-11-30 at 11:21 -0500, Tom Lane wrote:
> But I think it's a bogus design, because (a) it puts extra burden on the
> function author who's already got enough things to worry about, and

Simply put, IMO, a subtransaction is != an exception, and shouldn't be treated as one. If the author wishes to worry about transaction management, that is his worry. I don't feel the extra "burden" is significant enough to justify hacking around in the Python interpreter (assuming that it's possible in the first place). Personally, I think the decision is fine for plpgsql, but not for Python, or just about any other language. plpgsql is a special case, IMO.

> (b) since you can't support arbitrary rollback patterns, you have to
> contort the semantics of Savepoint objects with restrictions that are
> both hard to design correctly and complicated to enforce.

Hrm, isn't this what savepoint levels are supposed to do? Impose those restrictions? I'm guessing Postgres doesn't have savepoint levels yet, per lack of response to my message inquiring about them (well, a "savepoint scoping facility"), and poking around xact.h not revealing anything either.

I think I may hold more of a hold-your-nose stance here than Thomas. I am not sure if I want to implement savepoint/rollback restrictions, as I can't help but feel this is something Postgres should handle; not me or any other PL or C Function author.

plpy being an untrusted language, I *ultimately* do not have control over this. I can only specify things within my code. I *cannot* stop a user from making an extension module that draws interfaces to those routines that may rollback to a savepoint defined by the caller. (Not a great point, as a user could also try to dereference a NULL pointer from an extension module as well. ;)

I feel if I were to implement such restrictions/regulations it would be analogous to a security guard trying to enforce the law, whereas a real police officer is needed... ;-)

--
Regards, James William Pye
James William Pye wrote:
> I think I may hold more of a hold-your-nose stance here than Thomas. I am
> not sure if I want to implement savepoint/rollback restrictions, as I
> can't help but feel this is something Postgres should handle; not me or
> any other PL or C Function author.

I agree with this, but it was simple enough to implement. I'll of course remove my own implementation should PostgreSQL handle this in the future.

Regards,
Thomas Hallgren
James William Pye <flaw@rhid.com> writes: > plpy being an untrusted language, I *ultimately* do not have control > over this. I can only specify things within my code. I *cannot* stop a > user from making an extension module that draws interfaces to those > routines that may rollback to a savepoint defined by the caller. In which case, whether it works or not is his problem not yours ;-) This is a straw-man argument, as is the entire discussion IMHO. Wrapping each individual SPI command in a subtransaction IN NO WAY prevents us from adding programmer-controllable savepoint features to the PL languages later. It simply ensures that we have somewhat sane error recovery behavior in the meantime. The only valid argument against doing it is the one of added overhead, and I already gave my responses to that one. regards, tom lane
Tom Lane wrote:
> The fundamental point you are missing, IMHO, is that a savepoint is a
> mechanism for rolling back *already executed* SPI commands when the
> function author wishes that to happen.

Of course. That's why it's imperative that it is the developer that defines the boundaries. I foresee that it will be very common that the author wishes this to happen due to a failure of some kind. But sure, there might be other reasons too.

> A failure in an individual
> command should not leave the function in a broken state.

Well, if the function doesn't continue, there's not much point in doing repair work, is there? And that's the essence of the whole discussion.

You say: Let's always take the overhead of adding a subtransaction so that the caller will be able to return to a known state, regardless of whether he wants to do so.

I say: Let the caller decide when to add this overhead, since he is the one who knows a) when it's indeed needed at all and b) where to best define the boundaries.

Regards,
Thomas Hallgren
Tom Lane wrote:
> Wrapping each individual SPI command in a subtransaction IN NO WAY
> prevents us from adding programmer-controllable savepoint features
> to the PL languages later.

Ah good - I was coming to the conclusion savepoints/exception handling were both separately necessary.

> It simply ensures that we have somewhat
> sane error recovery behavior in the meantime. The only valid argument
> against doing it is the one of added overhead, and I already gave my
> responses to that one.

The bit I still don't get is how the subtrans-per-spi gets us try/catch functionality.

INSERT 1
INSERT 2
try {
  INSERT 3
  INSERT 4
}
catch WHATEVER {
  INSERT 5
  INSERT 6
}

So - here we (well I) would expect to see 1,2,3,4 or 1,2,5,6. That means if #4 fails we need to rollback to a savepoint before #3. But the problem is that we don't know whether we are in the try block, otherwise we'd just start a savepoint there and sidestep the whole issue. That means the only safe action is to rollback the transaction. We can't even just write to a log table and raise our own exception, since the calling function then won't know what to do.

I'm worried that non-intuitive behaviour here is strapping the gun to our foot. It's going to introduce peculiarities in code-paths that are likely to go untested until it's too late.

Can I make some counter-proposals?

1. Wrap each function body/call (same thing here afaict) in a sub-transaction. An exception can be caught within that function, and all the spi in that function is then rolled back. This is rubbish, but at least it's predictable and allows you to write to a log table and throw another exception.

2. For pl/tcl introduce a pgtry { } catch { } which just starts a sub-transaction and does standard try/catch. I don't use TCL, but from the little I know this should be straightforward.

3. We can do something similar with a pgeval() in plperl. Don't know enough to say about Python.

Basically, if exception handling doesn't work the way it should intuitively work (IMHO plpgsql's model) then I'd rather wait until 8.1

--
Richard Huxton
Archonet Ltd
Richard Huxton wrote:
> Can I make some counter-proposals?
>
> 1. Wrap each function body/call (same thing here afaict) in a
> sub-transaction. An exception can be caught within that function, and
> all the spi in that function is then rolled back. This is rubbish, but
> at least it's predictable and allows you to write to a log table and
> throw another exception.

This will be even worse since it will impose the subtransaction overhead on everything, even functions that never do any database access. Perhaps this approach would be feasible if imposed on volatile functions only, but then again, the volatility of a function cannot be trusted since we have no way of defining a "stable but with side effects" type.

> 2. For pl/tcl introduce a pgtry { } catch { } which just starts a
> sub-transaction and does standard try/catch. I don't use TCL, but from
> the little I know this should be straightforward.

If you know how to use special constructs like this, what's wrong with actually using savepoints verbatim? I.e.

INSERT 1
INSERT 2
SAVEPOINT foo
try {
  INSERT 3
  INSERT 4
  RELEASE foo
}
catch WHATEVER {
  ROLLBACK TO foo
  INSERT 5
  INSERT 6
}

IMHO a very clean, sensible, and easily understood approach that doesn't clobber the language.

Regards,
Thomas Hallgren
Thomas Hallgren wrote: > Richard Huxton wrote: > >> Can I make some counter-proposals? >> >> 1. Wrap each function body/call (same thing here afaict) in a >> sub-transaction. An exception can be caught within that function, and >> all the spi in that function is then rolled back. This is rubbish, but >> at least it's predictable and allows you to write to a log table and >> throw another exception. > > > This will be even worse since it will impose the subtransaction overhead > on everything, even functions that never do any database access. Perhaps > this approach would be feasible if imposed on volatile functions only, > but then again, the volatility of a function cannot be trusted since we > have no way of defining a "stable but with side effects" type. Actually, I was thinking of setting a flag and then on the first SPI call start the subtrans. >> 2. For pl/tcl introduce a pgtry { } catch { } which just starts a >> sub-transaction and does standard try/catch. I don't use TCL, but from >> the little I know this should be straightforward. > > > If you know how to use special constructs like this, what's wrong with > actually using savepoints verbatim? I.e. > > INSERT 1 > INSERT 2 > SAVEPOINT foo > try { > INSERT 3 > INSERT 4 > RELEASE foo > } > catch WHATEVER { > ROLLBACK TO foo > INSERT 5 > INSERT 6 > } > > IMHO a very clean, sensible, and easily understood approach that doesn't > clobber the language. But is the problem not that forgetting to use SAVEPOINT can get us in trouble with clearing up after an exception? That's the main thrust of Tom's per-statement stuff AFAICT. And again, you're not going to see the problem until an exception is thrown. -- Richard Huxton Archonet Ltd
Richard Huxton wrote: > But is the problem not that forgetting to use SAVEPOINT can get us in > trouble with clearing up after an exception? I fail to see how that's different from forgetting to use pgtry instead of try. Regards, Thomas Hallgren
Thomas Hallgren wrote: > Richard Huxton wrote: > >> But is the problem not that forgetting to use SAVEPOINT can get us in >> trouble with clearing up after an exception? > > I fail to see how that's different from forgetting to use pgtry instead > of try. It feels more distinct to me. I'll grant you I'm only a sample size of 1 though. -- Richard Huxton Archonet Ltd
On Wed, 01 Dec 2004 10:29:17 +0100, Thomas Hallgren <thhal@mailblocks.com> wrote:
> Richard Huxton wrote:
>
> > But is the problem not that forgetting to use SAVEPOINT can get us in
> > trouble with clearing up after an exception?
>
> I fail to see how that's different from forgetting to use pgtry instead
> of try.

I see that as a non-starter. At least in the case of perl, we can actually hide pgeval behind the standard eval. If pgeval were equivalent to, say, 'savepoint("foo"); CORE::eval @_;' then the onus is still on the developer to use 'eval', but that is a familiar concept to defensive developers.

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
On 12/1/2004 4:27 AM, Richard Huxton wrote:
> Thomas Hallgren wrote:
>> Richard Huxton wrote:
>>
>>> Can I make some counter-proposals?
>>>
>>> 1. Wrap each function body/call (same thing here afaict) in a
>>> sub-transaction. An exception can be caught within that function, and
>>> all the spi in that function is then rolled back. This is rubbish, but
>>> at least it's predictable and allows you to write to a log table and
>>> throw another exception.
>>
>> This will be even worse since it will impose the subtransaction overhead
>> on everything, even functions that never do any database access. Perhaps
>> this approach would be feasible if imposed on volatile functions only,
>> but then again, the volatility of a function cannot be trusted since we
>> have no way of defining a "stable but with side effects" type.
>
> Actually, I was thinking of setting a flag and then on the first SPI
> call start the subtrans.
>
>>> 2. For pl/tcl introduce a pgtry { } catch { } which just starts a
>>> sub-transaction and does standard try/catch. I don't use TCL, but from
>>> the little I know this should be straightforward.
>>
>> If you know how to use special constructs like this, what's wrong with
>> actually using savepoints verbatim? I.e.
>>
>> INSERT 1
>> INSERT 2
>> SAVEPOINT foo
>> try {
>>   INSERT 3
>>   INSERT 4
>>   RELEASE foo
>> }
>> catch WHATEVER {
>>   ROLLBACK TO foo
>>   INSERT 5
>>   INSERT 6
>> }
>>
>> IMHO a very clean, sensible, and easily understood approach that doesn't
>> clobber the language.
>
> But is the problem not that forgetting to use SAVEPOINT can get us in
> trouble with clearing up after an exception? That's the main thrust of
> Tom's per-statement stuff AFAICT. And again, you're not going to see the
> problem until an exception is thrown.

I think the following would a) be a drop-in replacement without any side effects or performance impact for PL/Tcl functions not using "catch" and b) give "catch" a sensible and correct behaviour.

One can _replace_ the Tcl catch command with his own C function. This can be done during the interpreter initialization when loading the PL/Tcl module. The new catch would

    push a status NEED_SUBTRANS onto a stack
    call Tcl_Eval() for the first command argument
    if TCL_ERROR {
        pop status from stack
        if popped status == HAVE_SUBTRANS {
            rollback subtransaction
        }
        if a second argument exists {
            store interpreter result in variable
        }
        return TCL_ERROR
    }
    pop status from stack
    if popped status == HAVE_SUBTRANS {
        commit subtransaction
    }
    return result code from Tcl_Eval()

The spi functions check if the top stack entry (if there is one) is NEED_SUBTRANS. If so, they start a subtrans and change the status to HAVE_SUBTRANS.

This all would mean that however deeply nested a function call tree, it would unwind and rollback everything up to the outermost catch. If there is no catch used, no subtransactions are created and the unwinding goes all the way up to the statement. If catch is used but no spi access performed inside, no subtransaction overhead either.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan Wieck wrote: > This all would mean that however deeply nested a function call tree, > it would unwind and rollback everything up to the outermost catch. If > there is no catch used, no subtransactions are created and the > unwinding goes all the way up to the statement. If catch is used but > no spi access performed inside, no subtransaction overhead either. Yes, this makes a lot of sense. No overhead unless you want to. Way to go. I wish I could do the same in PL/Java. Regards, Thomas Hallgren
On 12/1/2004 9:23 AM, Jan Wieck wrote:
> On 12/1/2004 4:27 AM, Richard Huxton wrote:
>> Thomas Hallgren wrote:
>>> Richard Huxton wrote:
>>>
>>>> Can I make some counter-proposals?
>>>>
>>>> 1. Wrap each function body/call (same thing here afaict) in a
>>>> sub-transaction. An exception can be caught within that function, and
>>>> all the spi in that function is then rolled back. This is rubbish, but
>>>> at least it's predictable and allows you to write to a log table and
>>>> throw another exception.
>>>
>>> This will be even worse since it will impose the subtransaction overhead
>>> on everything, even functions that never do any database access. Perhaps
>>> this approach would be feasible if imposed on volatile functions only,
>>> but then again, the volatility of a function cannot be trusted since we
>>> have no way of defining a "stable but with side effects" type.
>>
>> Actually, I was thinking of setting a flag and then on the first SPI
>> call start the subtrans.
>>
>>>> 2. For pl/tcl introduce a pgtry { } catch { } which just starts a
>>>> sub-transaction and does standard try/catch. I don't use TCL, but from
>>>> the little I know this should be straightforward.
>>>
>>> If you know how to use special constructs like this, what's wrong with
>>> actually using savepoints verbatim? I.e.
>>>
>>> INSERT 1
>>> INSERT 2
>>> SAVEPOINT foo
>>> try {
>>>   INSERT 3
>>>   INSERT 4
>>>   RELEASE foo
>>> }
>>> catch WHATEVER {
>>>   ROLLBACK TO foo
>>>   INSERT 5
>>>   INSERT 6
>>> }
>>>
>>> IMHO a very clean, sensible, and easily understood approach that doesn't
>>> clobber the language.
>>
>> But is the problem not that forgetting to use SAVEPOINT can get us in
>> trouble with clearing up after an exception? That's the main thrust of
>> Tom's per-statement stuff AFAICT. And again, you're not going to see the
>> problem until an exception is thrown.
>
> I think the following would a) be a drop-in replacement without any side
> effects or performance impact for PL/Tcl functions not using "catch" and
> b) give "catch" a sensible and correct behaviour.
>
> One can _replace_ the Tcl catch command with his own C function. This
> can be done during the interpreter initialization when loading the
> PL/Tcl module. The new catch would
>
>     push a status NEED_SUBTRANS onto a stack
>     call Tcl_Eval() for the first command argument
>     if TCL_ERROR {
>         pop status from stack
>         if popped status == HAVE_SUBTRANS {
>             rollback subtransaction
>         }
>         if a second argument exists {
>             store interpreter result in variable
>         }
>         return TCL_ERROR

er ... no ... must return a true boolean with TCL_OK here

>     }
>     pop status from stack
>     if popped status == HAVE_SUBTRANS {
>         commit subtransaction
>     }
>
>     return result code from Tcl_Eval()

and here it must put a false boolean into the Tcl result ... not 100% sure about the result code. Must check if it's possible to return or break from inside a catch block ... if not, then catch always turns the internal result code into TCL_OK. Anyhow, you get the idea.

Jan

> The spi functions check if the top stack entry (if there is one) is
> NEED_SUBTRANS. If so, they start a subtrans and change the status to
> HAVE_SUBTRANS.
>
> This all would mean that however deeply nested a function call tree, it
> would unwind and rollback everything up to the outermost catch. If there
> is no catch used, no subtransactions are created and the unwinding goes
> all the way up to the statement. If catch is used but no spi access
> performed inside, no subtransaction overhead either.
>
> Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
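For what it's worth, a hypothetical C rendering of that replacement catch command could look roughly like this (error capture, memory-context and resource-owner bookkeeping are omitted for brevity, and the names are illustrative, not actual pltcl.c code). The SPI wrappers would call pltcl_maybe_begin_subtrans() before touching SPI and would have to convert any caught backend error into a TCL_ERROR result:

    #include "postgres.h"
    #include "access/xact.h"
    #include <tcl.h>

    typedef enum { NEED_SUBTRANS, HAVE_SUBTRANS } CatchState;

    static CatchState catch_stack[64];      /* illustrative fixed-depth stack */
    static int  catch_depth = 0;

    /* called by the spi_exec/spi_prepare wrappers before using SPI */
    static void
    pltcl_maybe_begin_subtrans(void)
    {
        if (catch_depth > 0 && catch_stack[catch_depth - 1] == NEED_SUBTRANS)
        {
            BeginInternalSubTransaction(NULL);
            catch_stack[catch_depth - 1] = HAVE_SUBTRANS;
        }
    }

    /* replacement "catch", installed with Tcl_CreateObjCommand() at
     * interpreter initialization */
    static int
    pltcl_catch(ClientData cdata, Tcl_Interp *interp,
                int objc, Tcl_Obj *const objv[])
    {
        int     rc;

        if (objc < 2 || objc > 3)
        {
            Tcl_WrongNumArgs(interp, 1, objv, "script ?varName?");
            return TCL_ERROR;
        }

        catch_stack[catch_depth++] = NEED_SUBTRANS;
        rc = Tcl_EvalObjEx(interp, objv[1], 0);

        if (catch_stack[--catch_depth] == HAVE_SUBTRANS)
        {
            if (rc == TCL_ERROR)
                RollbackAndReleaseCurrentSubTransaction();
            else
                ReleaseCurrentSubTransaction();
        }

        if (objc == 3)
            Tcl_ObjSetVar2(interp, objv[2], NULL, Tcl_GetObjResult(interp), 0);

        /* per the correction above: catch itself reports whether the script failed */
        Tcl_SetObjResult(interp, Tcl_NewBooleanObj(rc == TCL_ERROR));
        return TCL_OK;
    }

A real implementation would also have to handle the non-error result codes (TCL_BREAK, TCL_RETURN) properly, which this sketch glosses over.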
On Wednesday 01 December 2004 04:12, Thomas Hallgren wrote: > Richard Huxton wrote: > > Can I make some counter-proposals? > > > > 1. Wrap each function body/call (same thing here afaict) in a > > sub-transaction. An exception can be caught within that function, and > > all the spi in that function is then rolled back. This is rubbish, but > > at least it's predictable and allows you to write to a log table and > > throw another exception. > > This will be even worse since it will impose the subtransaction overhead > on everything, even functions that never do any database access. Perhaps > this approach would be feasible if imposed on volatile functions only, > but then again, the volatility of a function cannot be trusted since we > have no way of defining a "stable but with side effects" type. > Agreed. > > 2. For pl/tcl introduce a pgtry { } catch { } which just starts a > > sub-transaction and does standard try/catch. I don't use TCL, but from > > the little I know this should be straightforward. > > If you know how to use special constructs like this, what's wrong with > actually using savepoints verbatim? I.e. > > INSERT 1 > INSERT 2 > SAVEPOINT foo > try { > INSERT 3 > INSERT 4 > RELEASE foo > } > catch WHATEVER { > ROLLBACK TO foo > INSERT 5 > INSERT 6 > } > > IMHO a very clean, sensible, and easily understood approach that doesn't > clobber the language. > Agreed. The fewer special constructs the better imho. -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
--- Jan Wieck <JanWieck@Yahoo.com> wrote:
> [snip]
>
> and here it must put a false boolean into the Tcl result ... not 100% sure about the result code. Must check if it's possible to return or break from inside a catch block ... if not, then catch always turns the internal result code into TCL_OK. Anyhow, you get the idea.

Yes, you can have break, return in a catch statement...it would return the exception code for that statement (i.e. TCL_BREAK, TCL_RETURN).

I like this proposal, just as long as it behaves exactly like Tcl's catch when there is no SPI function call.
--brett
Richard Huxton,
> It feels more distinct to me. I'll grant you I'm only a sample size of 1 though.

Perhaps more distinct, but:

- Using savepoints together with try/catch is not exactly an unknown concept. Try Google and you'll see a fair number of examples advocating the approach that I suggest.
- If I have to learn yet another new thing, I'd like to learn how to use savepoints, since that knowledge can be used everywhere.
- There's no such thing as a pgtry in the Tcl language (nor in any other language), thus you change the language as such.
- Tcl code will look different depending on whether it's client code or code residing in the backend. I.e. the construct is not portable. Then again, perhaps the Tcl bindings are very different anyway, so that argument may be less important. For PL/Java it makes a lot of sense since the client and server implementations use a common set of interfaces.

Regards,
Thomas Hallgren
On 12/1/2004 1:35 PM, Brett Schwarz wrote:
> --- Jan Wieck <JanWieck@Yahoo.com> wrote:
>
>> [snip]
> Yes, you can have break, return in a catch statement...it would return the exception code for that statement (i.e. TCL_BREAK, TCL_RETURN).

Yeah ... little tests are nice :-) catch always returns the numeric Tcl result status, with TCL_OK being 0, TCL_ERROR being 1 and so on.

> I like this proposal, just as long as it behaves exactly like Tcl's catch when there is no SPI function call.

That's what I intended, plus that the catch-nesting automatically represents the subtransaction nesting. I can't really see any reason why those two should not be bound together. Does anybody?

Jan
-- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
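For reference, the numeric codes being discussed come straight from tcl.h, and only one of them should make the replacement catch roll the subtransaction back. The tiny helper below is a hypothetical illustration of that distinction, not code from PL/Tcl.

    #include <stdbool.h>
    #include <tcl.h>

    /*
     * Only a genuine error aborts the subtransaction.  The other codes
     * ("return", "break", "continue") are ordinary control flow and are
     * simply reported as catch's numeric result -- tcl.h defines TCL_OK
     * as 0, TCL_ERROR as 1, TCL_RETURN as 2, TCL_BREAK as 3 and
     * TCL_CONTINUE as 4.
     */
    static bool
    pltcl_code_is_error(int code)
    {
        return code == TCL_ERROR;
    }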
Jan,
> ... plus that the catch-nesting automatically represents the subtransaction nesting. I can't really see any reason why those two should not be bound together. Does anybody?

That depends on what you mean. As a stop-gap solution, certainly. But in the long run, I still think that savepoints and exception handling should be kept separate. Consider the following two examples:

    savepoint a
    spi calls
    savepoint b
    spi calls
    savepoint c
    spi calls

    switch(some test)
    {
    case 1:
        rollback b;
        commit a;
        break;
    case 2:
        rollback c;
        commit a;
        break;
    case 3:
        rollback a;
        break;
    default:
        commit a;
    }

or nested try/catch where the catch doesn't access the database:

    foo()
    {
        try
        {
            spi calls;
        }
        catch
        {
            set some status;
            re-throw;
        }
    }

and some other place in the code:

    savepoint a
    try
    {
        spi calls;
        for(i = 0; i < 100; ++i)
            foo();
        commit a;
    }
    catch
    {
        rollback a;
    }

If "normal" savepoint handling is disabled here in favor of your suggestion, you will get 101 subtransactions although only 1 is relevant.

I still think that the concept of savepoints is fairly easy to understand. Using it together with exception handling is a common and well known concept, and we can make it even more so by providing good documentation and examples.

Regards,
Thomas Hallgren
On 12/2/2004 3:18 AM, Thomas Hallgren wrote:
> Jan,
>
>> ... plus that the catch-nesting automatically represents the subtransaction nesting. I can't really see any reason why those two should not be bound together. Does anybody?
>
> That depends on what you mean. As a stop-gap solution, certainly. But in the long run, I still think that savepoints and exception handling should be kept separate. Consider the following two examples:
>
>     savepoint a
>     spi calls
>     savepoint b
>     spi calls
>     savepoint c
>     spi calls
>
>     switch(some test)
>     {
>     case 1:
>         rollback b;
>         commit a;
>         break;
>     case 2:
>         rollback c;
>         commit a;
>         break;
>     case 3:
>         rollback a;
>         break;
>     default:
>         commit a;
>     }

I don't know, but doing a lot of work only to later decide to throw it away doesn't strike me as a good programming style. Some test should be done before performing the work.

> or nested try/catch where the catch doesn't access the database:

There is no "try" in Tcl.

The syntax is

    catch { block-of-commands } [variable-name]

Catch returns a numeric result, which is 0 if there was no exception thrown inside of the block-of-commands. The interpreter result, which would be the exception's error message in cleartext, is assigned to the optional variable specified. Thus, your code usually looks like this:

    if {[catch {statements-that-might-fail} err]} {
        on-error-action
    } else {
        on-success-action
    }

> foo()
> {
>     try
>     {
>         spi calls;
>     }
>     catch
>     {
>         set some status;
>         re-throw;
>     }
> }
>
> and some other place in the code:
>
> savepoint a
> try
> {
>     spi calls;
>     for(i = 0; i < 100; ++i)
>         foo();
>     commit a;
> }
> catch
> {
>     rollback a;
> }
>
> If "normal" savepoint handling is disabled here in favor of your suggestion, you will get 101 subtransactions although only 1 is relevant.

Your example shows where leaving the burden on the programmer can improve performance. But change it to this:

    proc foo {} {
        spi-calls;

        if {[catch {spi-call} err]} {
            return "boo: $err"
        }
        return "hooray"
    }

This function never throws any exception. And any normal Tcl programmer would expect that the spi-calls done before the catch will either abort the function on exception, or if they succeed, they get committed. What you mean with "normal" savepoint handling in fact means that we don't change catch at all but just expose the savepoint feature on the Tcl level.

Jan
-- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck wrote:
> There is no "try" in Tcl.
>
> The syntax is
>
>     catch { block-of-commands } [variable-name]
>
> Catch returns a numeric result, which is 0 if there was no exception thrown inside of the block-of-commands. The interpreter result, which would be the exception's error message in cleartext, is assigned to the optional variable specified. Thus, your code usually looks like this:
>
>     if {[catch {statements-that-might-fail} err]} {
>         on-error-action
>     } else {
>         on-success-action
>     }

Ok, I wasn't trying to write tcl ;-) just pseudo code proving a point. This particular point is only valid until you expose the savepoint APIs (as you now suggest) though, so no disagreement there.

> Your example shows where leaving the burden on the programmer can improve performance. But change it to this:
>
>     proc foo {} {
>         spi-calls;
>
>         if {[catch {spi-call} err]} {
>             return "boo: $err"
>         }
>         return "hooray"
>     }
>
> This function never throws any exception. And any normal Tcl programmer would expect that the spi-calls done before the catch will either abort the function on exception, or if they succeed, they get committed. What you mean with "normal" savepoint handling in fact means that we don't change catch at all but just expose the savepoint feature on the Tcl level.

Maybe Tcl programmers use catch very differently from what I'm used to with try/catch in C++, C#, and Java. There, it's very common that you use a catch to make sure that resources that you've utilized are freed up, to do error logging, and to deal with errors that are recoverable.

If a catch containing an spi-function automatically implies a subtransaction, then it might affect how people design their code, since the subtransaction is much more expensive than a mere catch.

Ideally, in a scenario where the caller of foo also calls other functions and wants to treat the whole call chain as atomic, he would start a subtransaction and do all of those calls within one catch where an error condition would yield a rollback. Within each function he still might want to catch code that eventually contains spi-calls, but not for the purpose of rolling back. The error condition is perhaps not even caused by the spi-call but by something else that happened within the same block of code. If it's unrecoverable, then he re-throws the error of course.

The catch functionality is likely to be lean and mean. Implied subtransactions will make it slower and thus not as suitable for control flow as it normally would be. Where I come from, frequent use of try/catch is encouraged since it results in good program design. I'm concerned that what you are suggesting will make developers think twice before they use a catch since they know what's implied.

I still believe that both catch (with try or no try) and savepoints are simple and well known concepts that will benefit from being kept separate.

Regards,
Thomas Hallgren
On 12/3/2004 12:23 PM, Thomas Hallgren wrote:
> Jan Wieck wrote:
>
>> There is no "try" in Tcl.
>>
>> The syntax is
>>
>>     catch { block-of-commands } [variable-name]
>>
>> Catch returns a numeric result, which is 0 if there was no exception thrown inside of the block-of-commands. The interpreter result, which would be the exception's error message in cleartext, is assigned to the optional variable specified. Thus, your code usually looks like this:
>>
>>     if {[catch {statements-that-might-fail} err]} {
>>         on-error-action
>>     } else {
>>         on-success-action
>>     }
>
> Ok, I wasn't trying to write tcl ;-) just pseudo code proving a point. This particular point is only valid until you expose the savepoint APIs (as you now suggest) though, so no disagreement there.

"as you now suggest"? I don't remember suggesting that. I concluded from your statements that _you_ are against changing Tcl's catch but instead want the savepoint functionality exposed to plain Tcl. So _you_ are against _my_ suggestion because these two are mutually exclusive.

> Maybe Tcl programmers use catch very differently from what I'm used to with try/catch in C++, C#, and Java. There, it's very common that you use a catch to make sure that resources that you've utilized are freed up, to do error logging, and to deal with errors that are recoverable.
>
> If a catch containing an spi-function automatically implies a subtransaction, then it might affect how people design their code, since the subtransaction is much more expensive than a mere catch.
>
> Ideally, in a scenario where the caller of foo also calls other functions and wants to treat the whole call chain as atomic, he would start a subtransaction and do all of those calls within one catch where an error condition would yield a rollback. Within each function he still might want to catch code that eventually contains spi-calls, but not for the purpose of rolling back. The error condition is perhaps not even caused by the spi-call but by something else that happened within the same block of code. If it's unrecoverable, then he re-throws the error of course.

You want the capabilities of C or Assembler (including all possible failures that lead to corruptions) in a trusted procedural language. I call that far from "ideal".

> The catch functionality is likely to be lean and mean. Implied subtransactions will make it slower and thus not as suitable for control flow as it normally would be. Where I come from, frequent use of try/catch is encouraged since it results in good program design. I'm concerned that what you are suggesting will make developers think twice before they use a catch since they know what's implied.

The point we were coming from was Tom's proposal to wrap each and every single SPI call into its own subtransaction for semantic reasons. My proposal was an improvement to that with respect to performance and IMHO also better matching the semantics.

Your suggestion to expose a plain savepoint interface to the programmer leads directly to the possibility to commit a savepoint made by a sub-function in the caller and vice versa - which if I understood Tom correctly is what we need to avoid.

Jan
-- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes: > Your suggestion to expose a plain savepoint interface to the programmer > leads directly to the possiblity to commit a savepoint made by a > sub-function in the caller and vice versa - which if I understood Tom > correctly is what we need to avoid. If we expose a savepoint-style interface in the PLs, it'll need to be restricted to the cases we can actually support. I don't have a problem with the idea in the abstract, but there was no time to do it for 8.0. I think we can add that on in 8.1, or later, without creating any backwards-compatibility issues compared to where we are now --- at least not for pltcl and plperl. (We might regret having tied subtransactions to exceptions in plpgsql, not sure.) The real issue is whether the required restrictions would be ugly enough that savepoint syntax doesn't seem like a nice API. I thought so when I did the coding for plpgsql, but I'm less sure at the moment. You'd probably have to prototype an implementation to find out for certain. It might be that the only real restriction is to make savepoint names local to functions (a/k/a savepoint levels), which wouldn't be bad at all. regards, tom lane
On Dec 3, 2004, at 2:04 PM, Jan Wieck wrote:

[snip]
> The point we were coming from was Tom's proposal to wrap each and every single SPI call into its own subtransaction for semantic reasons. My proposal was an improvement to that with respect to performance and IMHO also better matching the semantics.
>
> Your suggestion to expose a plain savepoint interface to the programmer leads directly to the possibility to commit a savepoint made by a sub-function in the caller and vice versa - which if I understood Tom correctly is what we need to avoid.

The JDBC interface exposes the savepoint interface, via setSavepoint(), releaseSavepoint(), and rollback(Savepoint sp) methods on the Connection, and Thomas's design of PL/Java offers the SPI via mapping it onto JDBC. Would client-side JDBC also suffer from the same potential issue of 'commit a savepoint made by a sub-function'? Or is this something SPI-specific? Or, finally, is this an issue of interacting with other PL languages who won't expose savepoint-ish functionality?

IMO, if it smells like JDBC, it oughta smell as close to 100% like JDBC, allowing folks to possibly relocate some of their code to run inside PG. Ugly savepoint handling and all.

----
James Robinson
Socialserve.com
James Robinson <jlrobins@socialserve.com> writes: > The JDBC interface exposes the savepoint interface, via setSavepoint(), > releaseSavepoint(), and rollback(Savepoint sp) methods on the > Connection, and Thomas's design of PL/Java offers the SPI via mapping > it onto JDBC. Would client-side JDBC also suffer from the same > potential issue of 'commit a savepoint made by a sub-function'? No, it's not a problem for client-side JDBC, because that's executing in a client thread that's not going to have its state affected by telling the server to roll back some work. The fundamental problem on the server side is keeping rollback from wiping your execution stack and local variables out from under you :-(. > Or is this something SPI-specific? AFAICS the same problem would occur whether the PL used SPI or not; certainly bypassing SPI to use the database engine more directly wouldn't solve it. regards, tom lane
Jan Wieck wrote:
> "as you now suggest"? I don't remember suggesting that. I concluded from your statements that _you_ are against changing Tcl's catch but instead want the savepoint functionality exposed to plain Tcl. So _you_ are against _my_ suggestion because these two are mutually exclusive.

I probably misinterpreted what you wrote in your last post where you wrote "What you mean with "normal" savepoint handling in fact means that we don't change catch at all but just expose the savepoint feature on the Tcl level.". I thought you meant that you actually would expose the savepoints in Tcl.

> You want the capabilities of C or Assembler (including all possible failures that lead to corruptions) in a trusted procedural language. I call that far from "ideal".

No I don't. I'm not sure how you came to that conclusion. I'm all for a good, 100% safe design and clean interfaces.

> The point we were coming from was Tom's proposal to wrap each and every single SPI call into its own subtransaction for semantic reasons. My proposal was an improvement to that with respect to performance and IMHO also better matching the semantics.

As I said earlier, I think your proposal is great as a stop-gap solution for 8.0. But when full savepoint support is enabled using SPI calls, the implementation should change IMHO.

> Your suggestion to expose a plain savepoint interface to the programmer leads directly to the possibility to commit a savepoint made by a sub-function in the caller and vice versa - which if I understood Tom correctly is what we need to avoid.

That particular scenario is very easy to prevent. You just maintain a savepoint structure that keeps track of function call level. The lifecycle of a savepoint cannot exceed the lifecycle of the invocation where it was created, and it cannot be released or rolled back at a higher level. An attempt to do so would yield an unrecoverable error.

Regards,
Thomas Hallgren
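A minimal sketch of the per-invocation bookkeeping Thomas describes, assuming the PL keeps a counter of how deeply nested the current function call is. The names pl_call_depth, PLSavepoint and pl_check_savepoint_scope are invented for illustration, and error handling is reduced to a single elog(); this is not code from any existing PL.

    #include "postgres.h"

    /* Depth of the current PL function invocation; maintained by the call handler. */
    static int pl_call_depth = 0;

    typedef struct PLSavepoint
    {
        char   *name;           /* savepoint name as given by the function author */
        int     owner_depth;    /* pl_call_depth at the time the savepoint was set */
    } PLSavepoint;

    /*
     * Called before releasing or rolling back to a savepoint on behalf of a
     * PL function.  A savepoint may only be used from the invocation that
     * created it; anything else is reported as an unrecoverable error.
     */
    static void
    pl_check_savepoint_scope(const PLSavepoint *sp)
    {
        if (sp->owner_depth != pl_call_depth)
            elog(ERROR,
                 "savepoint \"%s\" was established in another function invocation "
                 "and cannot be released or rolled back here",
                 sp->name);
    }

    /*
     * The call handler would bracket every function invocation roughly like:
     *
     *     pl_call_depth++;
     *     ... run the function ...
     *     ... clean up any savepoints the function left behind ...
     *     pl_call_depth--;
     */

This is essentially the "savepoint level" restriction Tom mentions above: names are scoped to the function invocation that created them, so a function cannot reach into its caller's or callee's transaction state.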