Re: Exposing the Xact commit order to the user - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: Exposing the Xact commit order to the user |
Date | |
Msg-id | 4BFA53C702000025000319B3@gw.wicourts.gov Whole thread Raw |
In response to | Exposing the Xact commit order to the user (Jan Wieck <JanWieck@Yahoo.com>) |
Responses |
Re: Exposing the Xact commit order to the user
Re: Exposing the Xact commit order to the user |
List | pgsql-hackers |
Jan Wieck wrote: > In some systems (data warehousing, replication), the order of > commits is important, since that is the order in which changes > have become visible. This issue intersects with the serializable work I've been doing. While in database transactions using S2PL the above is true, in snapshot isolation and the SSI implementation of serializable transactions, it's not. In particular, the snapshot anomalies which can cause non-serializable behavior happen precisely because the apparent order of execution doesn't match anything so linear as order of commit. I'll raise that receipting example again. You have transactions which grab the current deposit data and insert it into receipts, as payments are received. At some point in the afternoon, the deposit date in a control table is changed to the next day, so that the receipts up to that point can be deposited during banking hours with the current date as their deposit date. A report is printed (and likely a transfer transaction recorded to move "cash in drawer" to "cash in checking", but I'll ignore that aspect for this example). Some receipts may not be committed when the update to the date in the control table is committed. This is "eventually consistent" -- once all the receipts with the old date commit or roll back the database is OK, but until then you might be able to select the new date in the control table and the set of receipts matching the old date without the database telling you that you're missing data. The new serializable implementation fixes this, but there are open R&D items (due to the need to discuss the issues) on the related Wiki page related to hot standby and other replication. Will we be able to support transactional integrity on slave machines? What if the update to the control table and the insert of receipts all happen on the master, but someone decides to move the (now happily working correctly with serializable transactions) reporting to a slave machine? (And by the way, don't get too hung up on this particular example, I could generate dozens more on demand -- the point is that order of commit doesn't always correspond to apparent order of execution; in this case the receipts *appear* to have executed first, because they are using a value "later" updated to something else by a different transaction, even though that other transaction *committed* first.) Replicating or recreating the whole predicate locking and conflict detection on slaves is not feasible for performance reasons. (I won't elaborate unless someone feels that's not intuitively obvious.) The only sane way I can see to have a slave database allow serializable behavior is to WAL-log the acquisition of a snapshot by a serializable transaction, and the rollback or commit, on the master, and to have the serializable snapshot build on a slave exclude any serializable transactions for which there are still concurrent serializable transactions. Yes, that does mean WAL- logging the snapshot acquisition even if the transaction doesn't yet have an xid, and WAL-logging the commit or rollback even if it never acquires an xid. I think this solve the issue Jan raises as long as serializable transactions are used; if they aren't there are no guarantees of transactional integrity no matter how you track commit sequence, unless it can be based on S2PL-type blocking locks. I'll have to leave that to someone else to sort out. -Kevin
pgsql-hackers by date: