Thread: Nested transaction workaround?
Is it possible to use the dblink and dblink_exec features from inside pl/pgsql functions to mimic the behaviour of nested transactions by calling another function or executing some SQL via the dblink (into the same database)? Does the SQL statement executed within the dblink sit in its own isolated transaction/session? And would an error thrown in the dblink statement abort the enclosing session that initiated the call? What is the overhead/cost in creating the dblink connection? I'm asking these questions before trying this out, just in case someone else has already tried, and either failed or succeeded. John Sidney-Woollett
"John Sidney-Woollett" <johnsw@wardbrook.com> writes: > Is it possible to use the dblink and dblink_exec features from inside > pl/pgsql functions to mimic the behaviour of nested transactions by > calling another function or executing some SQL via the dblink (into the > same database)? I wouldn't call it "nested" transactions: the remote transactions would be committed, and would stay committed even if you roll back the caller. regards, tom lane
Tom Lane said: > I wouldn't call it "nested" transactions: the remote transactions would be committed, and would stay committed even if you roll back the caller. Ah, that's true. This means the dblink could be used to provide functionality equivalent to Oracle's #PRAGMA AUTONOMOUS directive. Thanks for clearing that up. John
I was thinking about the nested transaction problem, and I came across an interesting insight. While it is true you could use dblink to perform db operations outside the transaction (which could be useful for logging, etc.) what is lacking is a way to roll back the internal transactions when the parent rolls back. It also occured to me that the main problem with nested transactions is that the hard part is this inherited rollback/commit, esp. with MVCC which places constraints on how one could look at managing these commit/rollbacks without paying huge performance costs even where, as in the majority of cases, this feature is not used. I am assuming that part of the problem is how the visibility/transaction information is handled via MVCC. Is my understanding correct? My final, albeit half-baked, conclusion is that one of the things that would make nested transactions MUCH easier would be a two-phase commit (2PC) framework which would be stored on the transaction level. Something like a transaction status storage which contains the following information: Status (in progress, committed, rolled back, pending commit as in 2PC), and "depends on xid" where you can then have the pending commit become 'committed' when transaction xid is commited. Again this is just off the top of my head. Also an 2PC framework if added into the protocol would allow for true nested transactions via DBLink. Best Wishes, Chris Travers
On Wed, Jan 14, 2004 at 02:45:38PM +0700, Chris Travers wrote: > I was thinking about the nested transaction problem, and I came across an > interesting insight. While it is true you could use dblink to perform db > operations outside the transaction (which could be useful for logging, etc.) > what is lacking is a way to roll back the internal transactions when the > parent rolls back. Well, you could not commit the dblink'ed transaction until the parent commits. The problem is that the dblink'ed statement cannot see the parent transaction's uncommitted changes. > It also occured to me that the main problem with nested transactions is that > the hard part is this inherited rollback/commit, esp. with MVCC which places > constraints on how one could look at managing these commit/rollbacks without > paying huge performance costs even where, as in the majority of cases, this > feature is not used. I am assuming that part of the problem is how the > visibility/transaction information is handled via MVCC. Is my understanding > correct? Well, actually, the problem appears to be that people want to be able to roll back each individual statement without killing the parent transaction, and they want to make this the default behaviour. This takes it out of the "never used" category to "everybody does it" category. So you have to make something that works even if you use it all the time. > My final, albeit half-baked, conclusion is that one of the things that would > make nested transactions MUCH easier would be a two-phase commit (2PC) > framework which would be stored on the transaction level. Something like a > transaction status storage which contains the following information: Status > (in progress, committed, rolled back, pending commit as in 2PC), and > "depends on xid" where you can then have the pending commit become > 'committed' when transaction xid is commited. Again this is just off the > top of my head. I think something like that has been suggested, check the archives. There is a bit of discussion on how to actually store that info in a way that can be checked efficiently because remember, visibility needs to be checked for every tuple on every sequential scan in every process that runs subsequently, so it needs to be *fast*. > Also an 2PC framework if added into the protocol would allow for true nested > transactions via DBLink. Dt doesn't solve the visibility problem though. Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > (... have gone from d-i being barely usable even by its developers > anywhere, to being about 20% done. Sweet. And the last 80% usually takes > 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce
Attachment
From: "Martijn van Oosterhout" <kleptog@svana.org> > Well, actually, the problem appears to be that people want to be able to > roll back each individual statement without killing the parent transaction, > and they want to make this the default behaviour. This takes it out of the > "never used" category to "everybody does it" category. Ok. Now I am confused. I thought that a nested transaction would involve two features: 1: The ability to incrimentally commit/rollback changes, i.e. at certain points in the transaction have a sub-commit. 2: The ability to have a transaction within another transaction with transactional visibility rules applying within the transaction tree. What exactly do you mean by roll back individual statements? What exactly would be the default behavior? > There is > a bit of discussion on how to actually store that info in a way that can be > checked efficiently because remember, visibility needs to be checked for > every tuple on every sequential scan in every process that runs subsequently, > so it needs to be *fast*. Then you might have to have an array of "related transactions" which are also visible for each thransaction, sort of like a tree with bidirectional links. Unfortunately I can imagine this being a source of subtle, hard to troubleshoot bugs. Something like _x_id, _x_status, related_x_id[], child_x_id[], so that a rollback can rollback all child_x_id's without touching the other transactions which are parents, cousins, etc. but visible. Best Wishes, Chris Travers
On Wednesday 14 January 2004 09:45, Chris Travers wrote: > From: "Martijn van Oosterhout" <kleptog@svana.org> > > > Well, actually, the problem appears to be that people want to be able to > > roll back each individual statement without killing the parent > > transaction, > > > and they want to make this the default behaviour. This takes it out of > > the "never used" category to "everybody does it" category. > > Ok. Now I am confused. I thought that a nested transaction would involve > two features: > 1: The ability to incrimentally commit/rollback changes, i.e. at certain > points in the transaction have a sub-commit. > 2: The ability to have a transaction within another transaction with > transactional visibility rules applying within the transaction tree. Of course you can do #1 with #2. > What exactly do you mean by roll back individual statements? What exactly > would be the default behavior? I think we're talking about the "insert and if that fails update" sequence that seems to be a common approach. -- Richard Huxton Archonet Ltd
Re: What are nested transactions then? was Nested transaction workaround?
From
Martijn van Oosterhout
Date:
On Wed, Jan 14, 2004 at 04:45:46PM +0700, Chris Travers wrote: > Ok. Now I am confused. I thought that a nested transaction would involve > two features: > 1: The ability to incrimentally commit/rollback changes, i.e. at certain > points in the transaction have a sub-commit. I think they're referred to as savepoints, not sure. I think they become trivial once nested transactions are done. > 2: The ability to have a transaction within another transaction with > transactional visibility rules applying within the transaction tree. > > What exactly do you mean by roll back individual statements? What exactly > would be the default behavior? Well, one of the problems people would like to solve is that if a trigger or constraint fails then it doesn't roll back the whole transaction. The way this would be acheived is by making every statement within the parent transaction it's own implicit subtransaction so any errors can be caught and undone cleanly. This would mean that even normal transaction blocks that people do now could involve dozens of subtransactions, especially if you start involving triggers and function calls. In any case, I don't quite understand the intended semantics of "duplicate key shouldn't rollback transaction". If I call a function that inserts a duplicate key, should the other effects of the function be rolled back or not? Anyone know how MSSQL deals with this? > Then you might have to have an array of "related transactions" which are > also visible for each thransaction, sort of like a tree with bidirectional > links. Unfortunately I can imagine this being a source of subtle, hard to > troubleshoot bugs. Something like > _x_id, _x_status, related_x_id[], child_x_id[], so that a rollback can > rollback all child_x_id's without touching the other transactions which are > parents, cousins, etc. but visible. The issue is atomicity. You need to be able to make sure that even if the power fails halfway through a write that the whole transaction is either committed or it's not. The mrore you have to update the harder it gets. Anyway, far more intelligent people than I have been over this, check the archives of this list and -hackers. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > (... have gone from d-i being barely usable even by its developers > anywhere, to being about 20% done. Sweet. And the last 80% usually takes > 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce
Attachment
Martijn van Oosterhout said: > In any case, I don't quite understand the intended semantics of "duplicate > key shouldn't rollback transaction". If I call a function that inserts a > duplicate key, should the other effects of the function be rolled back or > not? Anyone know how MSSQL deals with this? In Oracle (I believe) that the exception is propogated up to abort the initiating transaction unless you explictly place the SQL call which failed in a begin..exception..end block (like below) begin -- do some SQL stuff here exception when NO_DATA_FOUND then -- handle the exception end; You also have the option to RAISE an exception to abort the transaction after you handled the error and decided that there is nothing useful you can do. Maybe, (don't shoot me) better procedural (pl/pgsql) error/exception handling ought to be a dealt with before nested transaction support? John Sidney-Woollett
"John Sidney-Woollett" <johnsw@wardbrook.com> writes: > Maybe, (don't shoot me) better procedural (pl/pgsql) error/exception > handling ought to be a dealt with before nested transaction support? They are largely the same problem. The representational issues for keeping track of nested-transaction commit states are solved, at least on paper --- see the pghackers archives. The big nut to crack is that all of the error-handling logic in the backend is based on the assumption that any error kills the transaction and allows all transient state to be thrown away. Partial cleanup of transient state is simply not there, and it's not clear how to add it without (a) a lot of new code, (b) a big performance hit, and (c) tons of new bugs. regards, tom lane