Nested transactions RFC - Mailing list pgsql-hackers
From | Manfred Koizar |
---|---|
Subject | Nested transactions RFC |
Date | |
Msg-id | riandu8d681u68h1grteqo7nvh6hmtilis@4ax.com Whole thread Raw |
Responses |
Re: Nested transactions RFC
|
List | pgsql-hackers |
Hi, if it is acceptable for subtransactions to use up transaction numbers, then here is a half baked RFC for a possible implementation. If not, forget the rest of this message. The proposed implementation works much like the current transaction handling. It needs an additional system table pg_subtrans (child XactId PRIMARY KEY, parent XactId). BEGIN; -- starts a new (top level) transaction, say 100 INSERT row1; -- row1.xmin = 100 DELETE row2; -- row2.xmax = 100 BEGIN; -- starts a subtransaction, let's call it 200, -- stores 100 on the parent transaction stack -- (a localmemory structure), -- inserts (200, 100) into pg_subtrans INSERT row3; -- row3.xmin = 200, row3.XMIN_IS_SUB = true DELETE row4; -- row4.xmax = 200, row4.XMAX_IS_SUB = true COMMIT; -- resets CurrentTransaction to 100 (pop from xact stack), -- does *NOT* mark T200 as committed BEGIN; -- starts a subtransaction, let's call it 300, -- pushes 100 on the parent transaction stack, -- inserts(300, 100) into pg_subtrans BEGIN; -- starts a 3rd level subtransaction (400), -- pushes 300 on the parent transaction stack, -- inserts(400, 300) into pg_subtrans ... COMMIT; -- resets CurrentTransaction to 300 (transaction stack), -- does NOT mark T400 as committed INSERT row5; -- row5.xmin = 300, row5.XMIN_IS_SUB = true DELETE row6; -- row6.xmax = 300, row6.XMAX_IS_SUB = true ROLLBACK; -- resets CurrentTransaction to 100 (transaction stack), -- optionally removes (300, 100) from pg_subtrans, -- marks T300 as aborted COMMIT; -- marks T100 as committed or ROLLBACK; -- marks T100 as aborted Visibility: ----------- The checks for xmin and xmax are very similar. We look at xmin here: Traditionally a tuple is visible, if xmin has committed before the current snapshot was taken, or if xmin == CurrentTransaction(). A subtransaction is considered aborted, if it is marked aborted. Else it is considered to be in the same state as its parent transaction (which again can be a subtransaction). The effects of tup.xmin are considered visible, if ... (This is not a formal specification. It shall only illustrate the difference to the existing implementation of HeapTupleSatisfiesXxx() in tqual.c) if (tup.XMIN_ABORTED) // flag set by prior visitor return false; if (tup.XMIN_COMMITTED) // flag set by prior visitor return true; // xmin neither known aborted nor known committed, // could be active // or finished and tup not yet visited for(xmin = tup.xmin; IsValid(xmin); xmin = GetParentXact(xmin)) { if (TransactionIdDidAbort(xmin)) { tup.XMIN_ABORTED= true; return false; }/*if*/ if (IsCurrentTransaction(xmin)) { // tup.xmin is one of my own subtransactions, // it is alreadycommitted. So tup can be // considered belonging to the current transaction. tup.xmin = xmin; tup.XMIN_IS_SUB = CurrentTransactionIsSub(); return true; // or rather check cmin ... }/*if*/ if (TransactionIdDidCommit(xmin)) { // xmin is a top level transaction tup.xmin =xmin; tup.XMIN_IS_SUB = false; tup.XMIN_COMMITTED = true; return true; }/*if*/ if (!tup.XMIN_IS_SUB) { // Don't try expensive GetParentXact() break; }/*if*/ }/*for*/ // tup.xmin still active return false; TransactionId GetParentXact(TransactionId xnum) uses pg_subtrans to find the parent transaction of xnum. It returns InvalidTransaction, if it doesn't find one. Performance: ------------ . Zero overhead, if nested transactions are not used. . BEGIN SUB has to insert a pair of TransactionIds into pg_subtrans. Apart from that it is not slower than BEGIN top level transaction. . COMMIT SUB is faster than COMMIT. . ROLLBACK SUB is much like ROLLBACK, plus (optionally) deletes one entry from pg_subtrans. . COMMIT and ROLLBACK of top level transactions don't care about subtransactions. . Access a tuple inserted/deleted by a subtransaction: Zero overhead, if the subtransaction has been rolled back, otherwise the parent transaction has to be looked up in pg_subtrans (possibly recursive). This price has to be paid only once per tuple (well, once for xmin and once for xmax). More accurate: "once after the inserting/deleting top level transaction has finished". Problems: --------- . pg_subtrans grows by 8 bytes per subtransaction. . Other pitfalls??? Administration: --------------- As soon as a top level transaction has finished, its subtransaction ids are replaced by the top level transaction id on the next access to each tuple. VACUUM (*not* VACUUM tablename) removes old entries from pg_subtrans. An entry is old, if the parent transaction has finished, before VACUUM started. Challenge: ---------- For heavy use of subtransactions there has to be a really fast implementation of pg_subtrans, maybe something like a b-tree. AFAICS small WAL changes: pg_subtrans inserts (and deletes?) have to be logged. Everything else is straightforward. Comments? ServusManfred
pgsql-hackers by date: