Re: nested transactions - Mailing list pgsql-hackers

From Tom Lane
Subject Re: nested transactions
Msg-id 27778.1038594808@sss.pgh.pa.us
In response to Re: nested transactions  (Manfred Koizar <mkoi-pg@aon.at>)
Responses Re: nested transactions  (Manfred Koizar <mkoi-pg@aon.at>)
List pgsql-hackers
Manfred Koizar <mkoi-pg@aon.at> writes:
> Visibility check by other transactions:  If a tuple is visited and its
> XMIN/XMAX_IS_COMMITTED/ABORTED flags are not yet set, pg_clog has to
> be consulted to find out the status of the inserting/deleting
> transaction xid.  If pg_clog[xid] is ...
>
>     00:  transaction still active
>     10:  aborted
>     01:  committed
>     11:  committed subtransaction, have to check parent
>
> Only in this last case do we have to get parentxid from pg_subtrans.

Unfortunately this discussion is wrong.  User-level visibility checks
will usually have to fetch the parentxid in case 01 as well, because
even if the parent is committed, it might not be visible in our
snapshot.  Snapshots will record only topmost-parent XIDs (because
that's what we can find in the PG_PROC array, and anything else would
create atomicity problems anyway).  So we must chase to the topmost
parent before testing visibility.
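
To make the point concrete, here is a minimal sketch of the check I have in
mind. It is a toy model, not backend code: the function name, the parent_of
map (standing in for the parentxid lookup), and the two sets (standing in for
pg_clog and for the snapshot's list of running top-level XIDs) are all
assumptions made for illustration.

```python
# Toy model only; none of these names exist in the PostgreSQL sources.

def visible_in_snapshot(xid, parent_of, committed, snapshot_running):
    """Return True if changes made by xid are visible to the snapshot.

    parent_of        -- dict mapping a subtransaction xid to its parent xid
    committed        -- set of xids whose recorded state is "committed"
    snapshot_running -- set of topmost xids still running at snapshot time
    """
    top = xid
    while True:
        if top not in committed:
            return False            # some level is not (yet) committed
        if top not in parent_of:
            break                   # reached the topmost parent
        top = parent_of[top]        # chase one level up
    # The snapshot knows only topmost XIDs, so test that one.
    return top not in snapshot_running
```

With parent_of = {101: 100} and both 100 and 101 marked committed, xid 101 is
still invisible to a snapshot that lists 100 as running, which is why the
parent has to be fetched even in case 01.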

This means that the parentxid will need to be fetched in enough cases
that it's quite dubious that pushing it out to a separate file
(pg_subtrans) saves any I/O.

Also, using a 11 state doubles the amount of pg_clog I/O needed to
commit a collection of subtransactions.  You have to write 11 as the
state of each committable subtransaction, then commit the parent (write
01 as its state), then go back and change the state of each
subtransaction to 01.  (Whether this last bit is done as part of parent
transaction commit, or during later inspections of the state of the
subtransaction, doesn't change the argument.)
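
A rough count of the state writes, as a sketch. The dict standing in for
pg_clog and the function name are hypothetical; the bit values are the ones
from the quoted proposal.

```python
# Toy model only: clog is a plain dict, not the real pg_clog.
ACTIVE, COMMITTED, ABORTED, SUBCOMMITTED = 0b00, 0b01, 0b10, 0b11

def commit_with_11_state(clog, parent, subxids):
    """Commit parent and its subtransactions; return the number of writes."""
    writes = 0
    for xid in subxids:            # mark each committable subtransaction 11
        clog[xid] = SUBCOMMITTED
        writes += 1
    clog[parent] = COMMITTED       # commit the parent (write 01)
    writes += 1
    for xid in subxids:            # go back and change each 11 to 01
        clog[xid] = COMMITTED
        writes += 1
    return writes
```

For a parent with n committable subtransactions this performs 2n + 1 state
writes, regardless of when the second pass happens.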

I think it would be preferable to use only three states: active,
aborted, committed.  The parent commit protocol is (1) write 10 as state
of each aborted subtransaction (this should be done as soon as the
subtransaction is known aborted, rather than delaying to parent commit);
(2) write 01 as state of parent (this is the atomic commit); (3) write
01 as state of each committed subtransaction.  Readers who see 00 must
check the parent state; if the parent is committed then they have to go
back and recheck the child state (to see if it became "aborted" after
they looked).  This halves the write traffic during a commit, at the
cost of additional read traffic when subtransaction state is checked in
a narrow window after the time of parent transaction commit.  I believe
it nets out to be faster.
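
Here is a sketch of the three-state protocol and the reader's side of it.
Again a toy model: the dict standing in for pg_clog, the parent_of map, and
all function names are assumptions for illustration. Treating a child as
aborted when its parent is aborted is also my assumption; the text above does
not spell that case out.

```python
# Toy model only; none of these names exist in the PostgreSQL sources.
ACTIVE, COMMITTED, ABORTED = 0b00, 0b01, 0b10

def abort_subxact(clog, xid):
    """Step 1: record the abort as soon as it is known."""
    clog[xid] = ABORTED

def commit_three_state(clog, parent, committed_subxids):
    """Steps 2 and 3; return the number of state writes performed."""
    clog[parent] = COMMITTED          # step 2: the atomic commit
    writes = 1
    for xid in committed_subxids:     # step 3: mark committed children
        clog[xid] = COMMITTED
        writes += 1
    return writes

def reader_status(clog, xid, parent_of):
    """Effective state of xid as seen by a reader."""
    state = clog.get(xid, ACTIVE)
    if state != ACTIVE or xid not in parent_of:
        return state
    # Child shows 00: the parent's state decides.
    parent_state = reader_status(clog, parent_of[xid], parent_of)
    if parent_state != COMMITTED:
        return parent_state           # still active, or aborted (assumed)
    # Parent committed: recheck the child in case it was marked aborted
    # after our first look (only matters with concurrent writers).
    if clog.get(xid, ACTIVE) == ABORTED:
        return ABORTED
    return COMMITTED
```

With n committed subtransactions the commit itself costs n + 1 writes. A
reader that finds a child still at 00 after the parent's commit pays the
extra parent lookup and recheck instead.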
        regards, tom lane

