Re: nested transactions - Mailing list pgsql-hackers
From | Manfred Koizar |
---|---|
Subject | Re: nested transactions |
Date | |
Msg-id | 1hffuuop10le25uiqmispp5n596kr3gerk@4ax.com |
In response to | Re: nested transactions (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: nested transactions
Re: nested transactions |
List | pgsql-hackers |
On Fri, 29 Nov 2002 13:33:28 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Unfortunately this discussion is wrong. User-level visibility checks
>will usually have to fetch the parentxid in case 01 as well, because
>even if the parent is committed, it might not be visible in our
>snapshot.

Or we don't allow a subtransaction's status to be updated from 11 to 01
until we know that the main transaction is visible to all active
transactions. Didn't check whether this is expensive to find out. At
least it should be doable by VACUUM.

>Snapshots will record only topmost-parent XIDs (because
>that's what we can find in the PG_PROC array, and anything else would
>create atomicity problems anyway). So we must chase to the topmost
>parent before testing visibility.

BTW, I think this *forces* us to replace the sub xid with the respective
main xid in a tuple header when we set XMIN/MAX_IS_COMMITTED. Otherwise
we'd have to look for the main xid whenever a tuple is touched.

>Also, using a 11 state doubles the amount of pg_clog I/O needed to
>commit a collection of subtransactions.

Is a pg_clog page written out to disk each time a bit is changed? I'd
expect some locality.

>I think it would be preferable to use only three states: active,
>aborted, committed. The parent commit protocol is (1) write 10 as state
>of each aborted subtransaction (this should be done as soon as the
>subtransaction is known aborted, rather than delaying to parent commit);
>(2) write 01 as state of parent (this is the atomic commit); (3) write
>01 as state of each committed subtransaction. Readers who see 00 must
>check the parent state; if the parent is committed then they have to go
>back and recheck the child state (to see if it became "aborted" after
>they looked).

Nice idea! This saves the fourth status for future uses (for example,
Firebird uses it for two phase commit). OTOH for reasons you mentioned
above there's no chance to save parent xid lookups if we go this way.
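For readers following the thread, the three-state protocol quoted above can be sketched as a toy model. This is only an illustration under stated assumptions, not PostgreSQL code: `ToyClog`, its method names, and the in-memory dictionaries standing in for pg_clog and the parent-xid storage are all hypothetical. It models commit status only and ignores snapshot visibility, which the thread treats as a separate check.

```python
from enum import IntEnum


class XactStatus(IntEnum):
    """Two-bit status values as used in the thread (11 stays unused)."""
    IN_PROGRESS = 0b00
    COMMITTED = 0b01
    ABORTED = 0b10


class ToyClog:
    """Hypothetical in-memory stand-in for pg_clog plus a parent-xid map."""

    def __init__(self):
        self.status = {}  # xid -> XactStatus
        self.parent = {}  # subtransaction xid -> parent xid

    def begin(self, xid, parent=None):
        self.status[xid] = XactStatus.IN_PROGRESS
        if parent is not None:
            self.parent[xid] = parent

    def abort_sub(self, xid):
        # Step 1: mark an aborted subtransaction as soon as it is known aborted.
        self.status[xid] = XactStatus.ABORTED

    def commit_parent(self, xid):
        # Step 2: the single write that atomically commits the whole tree.
        self.status[xid] = XactStatus.COMMITTED

    def mark_committed_subs(self, subxids):
        # Step 3: stamp the surviving subtransactions as committed afterwards.
        for sub in subxids:
            if self.status[sub] == XactStatus.IN_PROGRESS:
                self.status[sub] = XactStatus.COMMITTED

    def resolve(self, xid):
        """Reader side: return the effective status of xid."""
        first_look = self.status[xid]
        if first_look != XactStatus.IN_PROGRESS or xid not in self.parent:
            return first_look
        parent_status = self.resolve(self.parent[xid])
        if parent_status != XactStatus.COMMITTED:
            return parent_status  # parent still active, or aborted
        # Parent committed: recheck the child, which may have been marked
        # aborted between our first look and the parent's commit.
        if self.status[xid] == XactStatus.ABORTED:
            return XactStatus.ABORTED
        return XactStatus.COMMITTED
```

In this sketch, a subtransaction whose stored state is still 00 resolves as committed once its parent's state is 01, even before step 3 has run. That is the narrow window in which a reader pays for the extra parent lookup and the recheck.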
>This halves the write traffic during a commit, at the
>cost of additional read traffic when subtransaction state is checked in
>a narrow window after the time of parent transaction commit. I believe
>it nets out to be faster.

Maybe. The whole point of my approach is: If we can limit the active
range of transactions requiring parent xid lookups to a small fraction
of the range needing pg_clog lookups, then it makes sense to store
status bits and parent xids in different files. Otherwise keeping them
together in one file clearly is faster.

Servus
 Manfred