Thread: transaction processing after error in statement
Dear PostgreSQL Gurus,

I have been using PostgreSQL for quite a while and have always relied on its handling of transactions and concurrency. But recently I discovered a behaviour that deviates quite a lot from how I understand transactions and from how things are handled by other databases: HP Allbase, Oracle 8 and .mdb files (Access). Here's the description:

Whenever an error occurs within a transaction, PostgreSQL puts the whole transaction into an *ABORT* state, so that there is no difference at all between COMMITing or ROLLBACKing it. Even commands successfully carried out before the error occurred are rolled back, even if I COMMIT the transaction, and no error message whatsoever is shown. Example:

begin;
insert into table1 values (1, 'hello');  --> success!
select no from table1;
ERROR:  Attribute 'no' not found
commit;  --> success!

Why should the insert statement fail, just because there was a typo in the following select statement? It was already carried out successfully, albeit only visible to the current transaction. I found this behaviour to be the same across all 7.x versions of PostgreSQL.

Unfortunately, I haven't been able to find an explanation of why PostgreSQL behaves like this and why all the other RDBMS I tried behave differently. In this case the others make more sense to me.

Additionally, I have discovered that phantom reads occur in PostgreSQL even if isolation mode serializable is used. Also not so nice!

Sincerely,

Holger
-- 
Holger Jakobs * D-51469 Bergisch Gladbach
Telefon +49-2202-59991 * Mobilfon +49-177-792-2466
On Fri, 7 Nov 2003 holger@jakobs.com wrote:

> Whenever an error occurs within the transaction, PostgreSQL puts the
> whole transaction in an *ABORT* state, so that there is no difference at
> all between COMMITing or ROLLBACKing it. Even commands successfully
> carried out before the error occurred are rolled back, even if I COMMIT
> the transaction, where no error message whatsoever is shown.

In PostgreSQL all errors are currently considered unrecoverable, and all statements in a transaction must commit or roll back together as a single unit. In the future an implementation of nested transactions or savepoints would presumably relax this limitation to only the successfully committed subtransactions, or to the statements that were not separately rolled back to a previous savepoint.

> Additionally, I have discovered that phantom reads occur in PostgreSQL
> even if isolation mode serializable is used. Also not so nice!

You're going to have to give more information for anyone to even begin to understand the case you're seeing.
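[Editor's note: the savepoint mechanism described above did later land in PostgreSQL (8.0). As a rough sketch of the idea, here is the same recovery pattern using SQLite's savepoints via Python's standard-library sqlite3 module; the table and column names are made up for illustration (the thread's column "no" is renamed "num" to avoid a keyword):]

```python
import sqlite3

# In-memory database; isolation_level=None gives manual transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE table1 (num INTEGER PRIMARY KEY, txt TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO table1 VALUES (1, 'hello')")     # success

conn.execute("SAVEPOINT before_typo")
try:
    conn.execute("SELECT nonexistent_column FROM table1")  # fails: no such column
except sqlite3.OperationalError:
    # Undo only the failed part; the earlier INSERT survives.
    conn.execute("ROLLBACK TO SAVEPOINT before_typo")

conn.execute("COMMIT")
print(conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0])  # 1
```

With the savepoint, the typo in the SELECT no longer dooms the transaction: the COMMIT makes the successful INSERT permanent.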
On Mon, 10 Nov 2003, Holger Jakobs wrote:

> Hi Stephan,
>
> On 9 Nov, Stephan Szabo wrote:
> > On Fri, 7 Nov 2003 holger@jakobs.com wrote:
> >
> >> Whenever an error occurs within the transaction, PostgreSQL puts the
> >> whole transaction in an *ABORT* state, so that there is no difference
> >> at all between COMMITing or ROLLBACKing it. Even commands
> >> successfully carried out before the error occurred are rolled back,
> >> even if I COMMIT the transaction, where no error message whatsoever
> >> is shown.
> >
> > In PostgreSQL all errors are currently considered unrecoverable, and
> > all statements in a transaction must commit or rollback together as a
> > single unit. In the future an implementation of nested transactions or
> > savepoints would presumably relax this limitation to only the
> > successfully committed subtransactions or statements that were not
> > separately rolled back to a previous savepoint.
>
> What I meant was not subtransactions or savepoints, but the funny
> behaviour that operations already successfully carried out never will be
> committed, just because some other operation later within the same
> transaction fails. This is far different from the behaviour of all other
> DBMS I know. Why not:
>
> begin work;
> insert into x values (1, 'hi');
> --> success
> insert into x values (1, 'there');
> --> failure due to primary key violation
> insert into x values (2, 'foo');
> --> success
> commit work;
>
> and have two new tuples in the table? Why do _all_ of these operations
> have to be rolled back? I just don't get it that this has anything to do

Right now there's a technical reason (the system won't handle partially completed statements reasonably - but nested transactions will presumably give a way for that to work, hence their mention) and a logical reason (many of the developers seem to believe that partial commit isn't a good behavior and that transactions should be all commit or all fail).
Personally, I think it'd be nice to have some way to deal with errors other than retrying the whole sequence, but in the meantime we have a system which basically meets the words, if not the intent, of the spec.

> Hopefully this can be cleared up and perhaps improved within PostgreSQL.
> After all, PostgreSQL claims to be close to standards.

This is actually fairly standards-compliant. Note my message about errors being unrecoverable. The SQL spec allows an entire transaction to be rolled back upon unrecoverable errors. Our failed-state behavior may not be allowed (if we sent an error on commit, I believe it would be), but forcing the entire transaction to roll back is. That may not be what the spec intended, but it does seem to be allowed.
Hi Stephan,

On 9 Nov, Stephan Szabo wrote:
> On Fri, 7 Nov 2003 holger@jakobs.com wrote:
>
>> Whenever an error occurs within the transaction, PostgreSQL puts the
>> whole transaction in an *ABORT* state, so that there is no difference
>> at all between COMMITing or ROLLBACKing it. Even commands
>> successfully carried out before the error occurred are rolled back,
>> even if I COMMIT the transaction, where no error message whatsoever
>> is shown.
>
> In PostgreSQL all errors are currently considered unrecoverable, and
> all statements in a transaction must commit or rollback together as a
> single unit. In the future an implementation of nested transactions or
> savepoints would presumably relax this limitation to only the
> successfully committed subtransactions or statements that were not
> separately rolled back to a previous savepoint.

What I meant was not subtransactions or savepoints, but the funny behaviour that operations already successfully carried out will never be committed, just because some other operation later within the same transaction fails. This is far different from the behaviour of all other DBMS I know. Why not:

begin work;
insert into x values (1, 'hi');
--> success
insert into x values (1, 'there');
--> failure due to primary key violation
insert into x values (2, 'foo');
--> success
commit work;

and have two new tuples in the table? Why do _all_ of these operations have to be rolled back? I just don't get it that this has anything to do with savepoints or so. I don't see any problem with an error being recoverable, because the second insert failed and does not have to be recovered, while the first and the third worked fine and do not have to be recovered either. When committing a transaction, the effects of all operations that did not fail will be made permanent. This is how transaction processing is described in the literature.
If a programmer wants the whole transaction to fail because one part failed, (s)he can always program a rollback in case of at least one error. But there should always be a difference between a rollback and a commit after at least one statement changing data has reported success.

Hopefully this can be cleared up and perhaps improved within PostgreSQL. After all, PostgreSQL claims to be close to standards.

Sincerely,

Holger
-- 
Holger@Jakobs.com, Bergisch Gladbach, Germany
Telefon (0 22 02) 5 99 91 oder (01 77) 7 92 24 66
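[Editor's note: the behaviour Holger expects can be demonstrated with SQLite via Python's standard-library sqlite3 module, which, like the systems he names, fails only the offending statement and leaves the transaction usable. A minimal sketch of his three-insert example:]

```python
import sqlite3

# Autocommit mode so BEGIN/COMMIT are under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE x (id INTEGER PRIMARY KEY, txt TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO x VALUES (1, 'hi')")         # success
try:
    conn.execute("INSERT INTO x VALUES (1, 'there')")  # primary key violation
except sqlite3.IntegrityError as e:
    print("failed:", e)                                # transaction is still usable
conn.execute("INSERT INTO x VALUES (2, 'foo')")        # success
conn.execute("COMMIT")

print(conn.execute("SELECT COUNT(*) FROM x").fetchone()[0])  # 2
```

Only the duplicate-key INSERT fails; the COMMIT makes the other two rows permanent, which is the statement-level error handling this thread is asking for.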
Holger Jakobs wrote:
> Hi Stephan,
>
> On 9 Nov, Stephan Szabo wrote:
>> On Fri, 7 Nov 2003 holger@jakobs.com wrote:
>>
>>> Whenever an error occurs within the transaction, PostgreSQL puts the
>>> whole transaction in an *ABORT* state, so that there is no difference
>>> at all between COMMITing or ROLLBACKing it. Even commands
>>> successfully carried out before the error occurred are rolled back,
>>> even if I COMMIT the transaction, where no error message whatsoever
>>> is shown.
>>
>> In PostgreSQL all errors are currently considered unrecoverable, and
>> all statements in a transaction must commit or rollback together as a
>> single unit. In the future an implementation of nested transactions or
>> savepoints would presumably relax this limitation to only the
>> successfully committed subtransactions or statements that were not
>> separately rolled back to a previous savepoint.
>
> What I meant was not subtransactions or savepoints, but the funny
> behaviour that operations already successfully carried out will never be
> committed, just because some other operation later within the same
> transaction fails. This is far different from the behaviour of all other
> DBMS I know. Why not:

Why is that "funny behaviour" for you? By putting the statements into a transaction block you told the data management system "I want this group of statements to be atomic". Atomic means all or nothing. It might not be exactly what you intended to say, and you have a point if you conclude that PostgreSQL is limited because it doesn't let you say anything in between all or nothing. But after all, thus far it is only doing what you asked for.

Jan

> begin work;
> insert into x values (1, 'hi');
> --> success
> insert into x values (1, 'there');
> --> failure due to primary key violation
> insert into x values (2, 'foo');
> --> success
> commit work;
>
> and have two new tuples in the table? Why do _all_ of these operations
> have to be rolled back?
> I just don't get it that this has anything to do
> with savepoints or so. I don't see any problem with an error being
> recoverable, because the second insert failed and does not have to be
> recovered, while the first and the third worked fine and do not have to
> be recovered either. When committing a transaction, the effects of all
> operations that did not fail will be made permanent. This is how
> transaction processing is described in the literature.
>
> If a programmer wants the whole transaction to fail because one part
> failed, (s)he can always program a rollback in case of at least one
> error. But there should always be a difference between a rollback and a
> commit after at least one statement changing data has reported
> success.
>
> Hopefully this can be cleared up and perhaps improved within PostgreSQL.
> After all, PostgreSQL claims to be close to standards.
>
> Sincerely,
>
> Holger

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
> be recovered either. When committing a transaction the effects of all
> operations that did not fail will be made permanent. This is how
> transaction processing is described in the literature.

I would be interested in reading that (URLs please) as I didn't see anything in the spec that was interesting on this topic.

4.8.5 from Framework (part 01):

  An SQL-transaction (transaction) is a sequence of executions of
  SQL-statements that is atomic with respect to recovery. That is to
  say: either the execution result is completely successful, or it has
  no effect on any SQL-schemas or SQL-data.

The "execution result is completely successful" could certainly be used to back up PostgreSQL's choice to force a rollback. However, it doesn't differentiate between execution of what the user requested and execution of recovery procedures on the successful user elements.

Regardless, I wish a commit on a failed transaction would throw an error -- END is good enough for Rollback or Commit.

For PostgreSQL to implement this we need savepoints or nested transactions internally, since in many cases data is physically written in order to perform things like foreign key constraint checks.
Rod Taylor wrote:
>> be recovered either. When committing a transaction the effects of all
>> operations that did not fail will be made permanent. This is how
>> transaction processing is described in the literature.
>
> I would be interested in reading that (URLs please) as I didn't see
> anything in the spec that was interesting on this topic.
>
> 4.8.5 from Framework (part 01)
>
>   An SQL-transaction (transaction) is a sequence of executions of
>   SQL-statements that is atomic with respect to recovery. That is to
>   say: either the execution result is completely successful, or it has
>   no effect on any SQL-schemas or SQL-data.

Although I am not aware of the roots of this discussion, I would like to comment at this point.

When we work with sequences, an aborted transaction does have a permanent effect on the last value of the sequence. Is this behaviour not a violation of the above definition of a transaction?

Regds
Mallah.

> The "execution result is completely successful" could certainly be used
> to back up PostgreSQL's choice to force a rollback. However, it doesn't
> differentiate between execution of what the user requested and
> execution of recovery procedures on the successful user elements.
>
> Regardless, I wish a commit on a failed transaction would throw an
> error -- END is good enough for Rollback or Commit.
>
> For PostgreSQL to implement this we need savepoints or nested
> transactions internally, since in many cases data is physically written
> in order to perform things like foreign key constraint checks.
> Although I am not aware of the roots of this discussion, I would like
> to comment at this point.
>
> When we work with sequences, an aborted transaction does have
> a permanent effect on the last value of the sequence. Is this behaviour
> not a violation of the above definition of a transaction?

I believe you are correct, which is probably why Oracle offers serialized sequences for those who want them to be. Sequences have been explicitly documented as functioning the way they do as a concession to performance.
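[Editor's note: a toy sketch, not PostgreSQL's actual implementation, of why sequence values survive a rollback: the counter lives outside transactional control, so values handed out during an aborted transaction are simply never reused, leaving gaps rather than duplicates.]

```python
class Sequence:
    """Toy non-transactional sequence: increments are never rolled back."""

    def __init__(self):
        self.last = 0

    def nextval(self):
        self.last += 1
        return self.last


seq = Sequence()

# Simulate a transaction that consumes a value and then aborts.
v1 = seq.nextval()   # value taken inside the "transaction"
# ... the transaction aborts; the sequence does NOT rewind ...

# The next transaction sees a gap, never a duplicate.
v2 = seq.nextval()
print(v1, v2)  # 1 2
```

Never reusing a value means concurrent transactions can draw from the sequence without locking each other out, which is exactly the performance concession mentioned above.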
Holger Jakobs wrote:
>> Why is that "funny behaviour" for you? By putting the statements into
>> a transaction block you told the data management system "I want this
>> group of statements to be atomic". Atomic means all or nothing. It
>> might not be exactly what you intended to say, and you have a point
>> if you conclude that PostgreSQL is limited because it doesn't let you
>> say anything in between all or nothing. But after all, thus far it is
>> only doing what you asked for.
>
> It is "funny behaviour", because I expect those operations of the
> transaction which executed successfully to be performed in an atomic
> way. It is obvious that I cannot expect an operation which reported an
> error to have any effect.
>
> "Atomic" means that all operations (whether successful or not) will be
> carried out all together or none of them - but only the successful ones
> will have had an effect.

As long as we talk in an SQL context, can you please stick to SQL terms? I don't know exactly what you mean by "operation". If for example the statement

    DELETE FROM order_line WHERE ol_ordernum = 4711;

has 12 matching rows in order_line, is an operation the removal of one single order line, or do the actions performed by the triggers fired due to their removal count as separate operations for you? And if there is one that cannot be deleted because a row in another table with a foreign key references it, do you delete none of them or the remaining 11? And if you decide to delete none, how do you magically undo the work of the BEFORE triggers if you hit the foreign key after successfully processing 5 rows? Is there an SQL return code for "partial success"?

The question about "partial success" is the important part here. Imagine a stored procedure in PL/pgSQL consisting of two INSERTs. One fails with a duplicate key error, the other one succeeds.
For technical reasons the language lacks an exception handling mechanism, so you have to define that either the other statement or nothing of the procedure succeeds, because you have no chance to report "partial success"; there is no return code defined for that.

> Again: Why not make a difference between "commit" and "rollback" in a
> transaction in this case? Why not let the user decide which parts should
> be committed? The practical reason is that programming would become a lot
> more convenient. (If there is a practical reason, it does not necessarily
> need a technical reason, I believe.)

Nobody said that it should not be possible. But you have to dig a little deeper and make a bit more of a complete proposal for this, covering the different possible failure reasons, with definitions of how exactly to react in the case of statements affecting multiple rows, related triggers, and so on and so forth. "Make a difference between commit and rollback" is way too fuzzy here.

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
> Why is that "funny behaviour" for you? By putting the statements into
> a transaction block you told the data management system "I want this
> group of statements to be atomic". Atomic means all or nothing. It
> might not be exactly what you intended to say, and you have a point
> if you conclude that PostgreSQL is limited because it doesn't let you
> say anything in between all or nothing. But after all, thus far it is
> only doing what you asked for.

It is "funny behaviour", because I expect those operations of the transaction which executed successfully to be performed in an atomic way. It is obvious that I cannot expect an operation which reported an error to have any effect.

"Atomic" means that all operations (whether successful or not) will be carried out all together or none of them - but only the successful ones will have had an effect.

Again: Why not make a difference between "commit" and "rollback" in a transaction in this case? Why not let the user decide which parts should be committed? The practical reason is that programming would become a lot more convenient. (If there is a practical reason, it does not necessarily need a technical reason, I believe.)

Sincerely,

Holger
-- 
Holger@Jakobs.com, Bergisch Gladbach, Germany
Telefon (0 22 02) 5 99 91 oder (01 77) 7 92 24 66
On 11 Nov, Jan Wieck wrote:
> As long as we talk in an SQL context, can you please stick to SQL
> terms? I don't know exactly what you mean by "operation". If for
> example the statement
>
>     DELETE FROM order_line WHERE ol_ordernum = 4711;
>
> has 12 matching rows in order_line, is an operation the removal of one
> single order line, or do the actions performed by the triggers fired
> due to their removal count as separate operations for you? And if
> there is one that cannot be deleted because a row in another table
> with a foreign key references it, do you delete none of them or the
> remaining 11? And if you decide to delete none, how do you magically
> undo the work of the BEFORE triggers if you hit the foreign key after
> successfully processing 5 rows? Is there an SQL return code for
> "partial success"?

OK, let's say "statement" instead of "operation". No, there is no partial success. Either a statement delivers an "OK" or it doesn't. Actually, you will have to undo anything the statement did before the first error occurs. This may mean that you need some kind of savepoint. If so, the necessity to implement this shows, and it should be given a high priority. I don't know how the other RDBMS do it, but they do. I am talking from the perspective of an RDBMS user, not as an implementor.

Calling a procedure is a statement as well, and it includes all other procedures called from this procedure. So the statement level is always the statements that were carried out directly in the transaction. If anything within one statement fails, the statement was not carried out and must not have any effect. It is not important whether the procedure was fired by a trigger or called by another procedure.

Are there any Open Source RDBMS which behave like Oracle, Access and Allbase? If so, one might look into their code to find out how they have implemented it.
Coming back to standards, here is a quote from the ANSI document:

  4.28 SQL-transactions

  The execution of a <rollback statement> may be initiated implicitly by
  an implementation when it detects unrecoverable errors. When such an
  error occurs, an exception condition is raised: transaction rollback
  with an implementation-defined subclass code.

This means that a rollback does not have to be initiated if an unrecoverable error occurs; it only _may_ happen. Since it is impractical, it should not.

AN EXAMPLE: Let's have two tables, employees and doctors, in a hospital.

create table emp (
  empno integer primary key,
  name  varchar(40)
);

create table doctor (
  empno        integer primary key references emp,
  beepernumber integer unique
);

Now let a user enter the data of a doctor. First the employee part of the data is sent to the database:

insert into emp values (1, 'Fred');
--> success

Second, the doctor-specific data are sent to the database:

insert into doctor values (1, 34);
--> error, beeper number already present, unique key violation

Since there was an error, we let the user key in a different beeper number for the doctor and send the data to the database:

insert into doctor values (1, 45);
--> should be successful (if 45 is not already there), and it _is_ in Allbase, Oracle, Access
--> Postgres tells you something about an *ABORT* state

We commit the transaction:

commit work;

Effect in all other databases: a successfully entered doctor.
Effect in PostgreSQL: nothing!

To get the same effect in PostgreSQL, we would have to roll back (or commit, no difference) the transaction after the attempt to insert the non-unique beeper number and then re-insert all data into all tables. WHY? It only makes programming more complex, thus error-prone. I would appreciate it if most of the difficult tasks could be done within the database system - that's what it's there for. The first reason is that the implementors usually are far more advanced programmers than the programmers of applications.
The second reason is that this isolates the tricky programming from the already complex logic of the application.

Additionally, whether one might personally think that this behaviour is important or not, it is what most commercial RDBMS do. So it would make porting of applications from e.g. Oracle to PostgreSQL dramatically easier. Until now, this has been one of the pitfalls. That's why I brought this up in the first place.

Sincerely,

Holger
-- 
Holger@Jakobs.com, Bergisch Gladbach, Germany
Telefon (0 22 02) 5 99 91 oder (01 77) 7 92 24 66
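[Editor's note: the hospital example can be reproduced with SQLite via Python's standard-library sqlite3 module, which behaves like the systems Holger cites: the failed beeper insert can simply be retried within the same transaction. The pre-existing row holding beeper 34 is made up to trigger the conflict; SQLite does not enforce the foreign key by default, so no emp row is needed for it.]

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, name VARCHAR(40));
    CREATE TABLE doctor (empno INTEGER PRIMARY KEY REFERENCES emp,
                         beepernumber INTEGER UNIQUE);
    INSERT INTO doctor VALUES (0, 34);  -- beeper 34 is already taken
""")

conn.execute("BEGIN")
conn.execute("INSERT INTO emp VALUES (1, 'Fred')")     # success
try:
    conn.execute("INSERT INTO doctor VALUES (1, 34)")  # unique key violation
except sqlite3.IntegrityError:
    pass                                               # ask the user for another number
conn.execute("INSERT INTO doctor VALUES (1, 45)")      # retry succeeds
conn.execute("COMMIT")

print(conn.execute(
    "SELECT beepernumber FROM doctor WHERE empno = 1").fetchone()[0])  # 45
```

After the commit, doctor Fred exists with beeper 45; nothing had to be re-entered, which is the workflow the example argues for.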
Holger Jakobs wrote:
> Calling a procedure is a statement as well, and it includes all other
> procedures called from this procedure. So the statement level is always
> the statements that were carried out directly in the transaction. If
> anything within one statement fails, the statement was not carried out
> and must not have any effect. It is not important whether the procedure
> was fired by a trigger or called by another procedure.

So you define the smallest unit as being one single statement, as issued by the client application and receiving one single return code over the frontend/backend protocol. That's almost what people think of as subtransactions. I think if we ever implement them, we will have some session setting that lets the backend behave like that, and your needs will be satisfied.

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
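[Editor's note: this prediction came true, though on the client side rather than as a backend session setting: once the server had savepoints, psql gained the ON_ERROR_ROLLBACK variable, which wraps each statement in an implicit savepoint so a failed statement no longer dooms the transaction. A sketch of how it is enabled:]

```shell
# In ~/.psqlrc, or interactively inside a psql session:
#   \set ON_ERROR_ROLLBACK interactive
#
# Or when launching psql (here "mydb" is a placeholder database name):
psql -v ON_ERROR_ROLLBACK=interactive mydb
```

With the setting active, psql issues a SAVEPOINT before each statement and rolls back to it on error, giving interactive sessions exactly the per-statement recovery discussed in this thread.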