Thread: Access to transaction status
Hi all,

I am currently implementing an experimental middleware-based replicator
for a set of fully replicated databases. To be able to handle all sorts
of failures I needed two functions:

- A function to get the current XID
- A function which I can use later to tell if a given XID
  committed/aborted/whatever

I did a small implementation of this (see attachment). Could one of you
hackers tell me if you think this is

- an ugly way of accessing the clog?
- totally wrong because I missed some point?
- or a good and correct idea :)

It would be very nice if someone had the time to have a short look into
this.

Greetings,
Christian
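For readers who don't have the attachment handy, here is a minimal
sketch of what such a pair of backend functions could look like. This is
my own reconstruction against the 7.3/7.4-era sources, untested, and not
Christian's actual code; the function names and the integer status codes
are invented for illustration.

#include "postgres.h"

#include "access/transam.h"		/* TransactionIdDidCommit/DidAbort */
#include "access/xact.h"		/* GetCurrentTransactionId */
#include "fmgr.h"

PG_FUNCTION_INFO_V1(get_current_xid);
PG_FUNCTION_INFO_V1(get_xid_status);

/* Return the XID of the transaction executing this call. */
Datum
get_current_xid(PG_FUNCTION_ARGS)
{
	PG_RETURN_TRANSACTIONID(GetCurrentTransactionId());
}

/*
 * Report what the clog currently says about a given XID:
 * 1 = committed, 0 = aborted, 2 = neither (e.g. still in progress).
 */
Datum
get_xid_status(PG_FUNCTION_ARGS)
{
	TransactionId xid = PG_GETARG_TRANSACTIONID(0);

	if (TransactionIdDidCommit(xid))
		PG_RETURN_INT32(1);
	if (TransactionIdDidAbort(xid))
		PG_RETURN_INT32(0);
	PG_RETURN_INT32(2);
}

Both would be compiled into a loadable module and declared to SQL with
the usual CREATE FUNCTION ... LANGUAGE C incantation, taking xid and
returning integer. Note that as written, get_xid_status trusts its
caller to pass a plausible XID - the guard discussed later in the thread
is missing here.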
----- Original Message -----
From: "Jeroen T. Vermeulen" <jtv@xs4all.nl>

> I ran into the same need (Bruce, we discussed this at FOSDEM in
> Brussels this February) for libpqxx. My code tries to compensate for
> the possibility that the backend connection is lost while waiting for
> a reply to a COMMIT. The way I worked around it was to create a
> special record at the beginning of the transaction, in a dedicated
> table that's effectively a custom transaction log. If the record is
> still there when I reconnect, the transaction committed. If not, it
> didn't.
>
> Obviously this leaves some garbage collection issues, so I'd be really
> happy with a way to go back to the server after my connection is lost
> and find out whether my transaction committed or not.

I see a race condition in this approach: if you reconnect too fast, and
the backend which actually should commit is still in progress (assume
it takes a while to commit, for whatever reason), you get the impression
that it did not commit - and a short time later the backend will
commit... (before noticing that the client connection was lost).

A safe method would be to shut down all backends (or is there another
method to safely abort all transactions?), then start the backends
again, and then read the table with the special records. In this way you
would be sure that your transaction is not in progress while you're
inspecting the table. Of course, this approach is not very fast and may
abort a lot of transactions - but if consistency is more important to
you than anything else...

- Christian
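For concreteness, the marker-record scheme quoted above might look
roughly like this in plain libpq. This is a sketch, not Jeroen's actual
libpqxx code; the txn_log table, its client_txn_id column, and the
connection string are all invented, and error handling is mostly
omitted.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=test");
	PGresult   *res;

	/* Write a marker row inside the transaction, keyed by an id that
	 * the client chooses and remembers across reconnects. */
	PQclear(PQexec(conn, "BEGIN"));
	PQclear(PQexec(conn, "INSERT INTO txn_log (client_txn_id) VALUES (42)"));
	/* ... the actual work of the transaction ... */
	res = PQexec(conn, "COMMIT");

	if (res == NULL || PQresultStatus(res) != PGRES_COMMAND_OK)
	{
		/* No usable reply to COMMIT: reconnect and check whether the
		 * marker row made it. */
		PQclear(res);
		PQfinish(conn);
		conn = PQconnectdb("dbname=test");
		res = PQexec(conn, "SELECT 1 FROM txn_log WHERE client_txn_id = 42");
		if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
			printf("transaction committed\n");
		else
			/* Subject to the race Christian describes above: the
			 * backend may still be in the middle of committing. */
			printf("no marker row found\n");
	}
	PQclear(res);
	PQfinish(conn);
	return 0;
}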
On Fri, Jun 20, 2003 at 09:35:08AM +0200, Christian Plattner wrote:
>
> I see a race condition in this approach: if you reconnect too fast, and
> the backend which actually should commit is still in progress (assume
> it takes a while to commit, for whatever reason), you get the
> impression that it did not commit - and a short time later the backend
> will commit... (before noticing that the client connection was lost).

Good point. So far I assumed that a broken connection would take a while
to repair. OTOH, by the time TCP gives up due to a bad network
connection, wouldn't the server reach the same conclusion?

Jeroen
----- Original Message -----
From: "Jeroen T. Vermeulen" <jtv@xs4all.nl>

> > I see a race condition in this approach: if you reconnect too fast,
> > and the backend which actually should commit is still in progress
> > (assume it takes a while to commit, for whatever reason), you get
> > the impression that it did not commit - and a short time later the
> > backend will commit... (before noticing that the client connection
> > was lost).
>
> Good point. So far I assumed that a broken connection would take a
> while to repair. OTOH, by the time TCP gives up due to a bad network
> connection, wouldn't the server reach the same conclusion?

Well, I wouldn't rely solely on TCP when assuring consistency. Also, I
don't think that the backend will ever inspect its TCP socket while
committing.

btw: There could also be other reasons for the client to lose the
connection (e.g. the client process crashes).

- Christian
On Fri, Jun 20, 2003 at 10:20:14AM +0200, Christian Plattner wrote:
>
> Well, I wouldn't rely solely on TCP when assuring consistency. Also, I
> don't think that the backend will ever inspect its TCP socket while
> committing.

No, but its underlying IP stack would.

> btw: There could also be other reasons for the client to lose the
> connection (e.g. the client process crashes).

In that case the client would lose all its state as well, so not really
a problem that can be handled client-side.

Jeroen
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>

> How much later? clog is not kept forever.

I took a deeper look into the source. Forget my last posting.

As far as I understand your code, there is only one chance that
information in the clog gets lost: if XIDs are reused, then ExtendCLOG
will overwrite existing entries. Also, it seems to me that VACUUM has no
effect on the clog.

Now let's assume that there is a GET_XID_STATUS(xid) function. If, at
the time GET_XID_STATUS(xid) gets executed, 'xid' has not been reused
(which should only occur after about 4 billion transactions following
xid), then the mechanism should work. If one uses TransactionIdPrecedes
to check that xid is in the past (as in my sample code), then the window
is restricted to 2 billion transactions, which seems enough for me. I
implemented this check so that the clog lookup code does not try to
fetch pages that do not yet exist (which crashes the backend) if one
supplies a wrong xid.

What do you think?

Thanks,
Christian
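Reconstructed, the guard Christian describes would look something like
the following. Again a sketch against the backend sources of that era
(where ReadNewTransactionId() returns the next XID that will be
assigned), not the actual attachment, and untested.

#include "postgres.h"

#include "access/transam.h"

/*
 * 1 = committed, 0 = aborted, 2 = in progress, -1 = xid does not lie in
 * the past, so the clog must not be consulted at all.
 */
static int
safe_xid_status(TransactionId xid)
{
	/*
	 * TransactionIdPrecedes compares modulo 2^32, so only XIDs in the
	 * 2-billion-transaction window behind the next-to-be-assigned XID
	 * count as "in the past".  Everything else is rejected before it
	 * can make the clog code fetch a page that does not exist yet.
	 */
	if (!TransactionIdPrecedes(xid, ReadNewTransactionId()))
		return -1;

	if (TransactionIdDidCommit(xid))
		return 1;
	if (TransactionIdDidAbort(xid))
		return 0;
	return 2;
}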
----- Original Message -----
From: "Jeroen T. Vermeulen" <jtv@xs4all.nl>

> > btw: There could also be other reasons for the client to lose the
> > connection (e.g. the client process crashes).
>
> In that case the client would lose all its state as well, so not really
> a problem that can be handled client-side.

Well, my client (actually it is a middleware layer which routes
transactions to a set of replicas) keeps its own log, because it must be
able to handle arbitrary failures. So it never loses its state.

- Christian
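To make that concrete: combined with a GET_XID_STATUS-style lookup, one
hypothetical shape for such a middleware log record (purely
illustrative, not Christian's replicator code) would be:

/*
 * Written durably by the middleware before COMMIT is sent to a replica.
 * After a crash, every entry whose outcome is unknown can be resolved
 * by reconnecting and asking that replica for the status of backend_xid.
 */
typedef struct MwLogEntry
{
	unsigned long client_txn_id;	/* the middleware's own transaction id */
	int			replica_no;			/* which replica ran the transaction */
	unsigned int backend_xid;		/* XID that replica reported for it */
} MwLogEntry;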
On Fri, Jun 20, 2003 at 02:41:29PM +0200, Christian Plattner wrote:
>
> Well, my client (actually it is a middleware layer which routes
> transactions to a set of replicas) keeps its own log, because it must
> be able to handle arbitrary failures. So it never loses its state.

In that case perhaps we should see if there's anything we can do for
each other. At the current rate, libpqxx is growing towards a sort of
middleware product, but obviously it's not the right place to tackle
many of the typical middleware problems.

Jeroen
"Christian Plattner" <postgresql@sioux.ch> writes: > From: "Tom Lane" <tgl@sss.pgh.pa.us> >> How much later? clog is not kept forever. > As far as I understand your code there is only one chance that information > in clog gets lost: > If XIDs are reused then ExtendCLOG will overwrite existing entries. > Also, it seems to me that VACCUM has not effect on the clog. You're quite mistaken. clog is truncated during vacuum, once we are confident that there are no unvacuumed rows in the database with XIDs older than a certain point. This is to keep clog space requirements within reason. regards, tom lane