Re: [PATCHES] Static snapshot data - Mailing list pgsql-hackers
From | Manfred Koizar |
---|---|
Subject | Re: [PATCHES] Static snapshot data |
Date | |
Msg-id | f2e2cvc2vrk82i76tsanqhcjskdc1v9ls3@4ax.com |
In response to | Re: [PATCHES] Static snapshot data (Alvaro Herrera <alvherre@dcc.uchile.cl>) |
Responses | Re: [PATCHES] Static snapshot data |
List | pgsql-hackers |
On Mon, 12 May 2003 23:55:31 -0400, Alvaro Herrera
<alvherre@dcc.uchile.cl> wrote:
>On Mon, May 12, 2003 at 09:40:37AM -0400, Tom Lane wrote:
>> > Our (Alvaro's and my) current understanding is that snapshots are not
>> > influenced by nested transactions.
>>
>> What was that long article Alvaro posted yesterday, then?

Unfortunately I sent my reply to Tom before I read my inbox :-(

>[...] if the reasoning below is
>correct, we can get away with static Serializable- and QuerySnapshots.
>
>I don't think it makes sense to change the isolation level for a
>non-toplevel transaction.  That is, if the topmost transaction is
>ISOLATION LEVEL SERIALIZABLE, all its child transactions will be.  And
>if it's not, then there's no way to make its child transactions be so.

I agree.

Tom replied:
|I have a feeling that there might be some value in running a
|SERIALIZABLE subtransaction inside a READ COMMITTED parent.

Never thought of that, most probably due to my notion of nested
transactions (which may be weird).  Let's try to sort it out.  Here is
my view, it's not much more than three simple rules:

Rule 1)  Subtransactions can be nested to arbitrary levels.  During
execution of a subtransaction there is no change to the state of the
enclosing transaction.

Rule 2)  On subtransaction ROLLBACK, changes done by the subtransaction
are effectively undone.

Rule 3)  After subtransaction COMMIT, changes done by the
subtransaction are effectively treated as if done by the enclosing
transaction.

So this sequence of commands

    BEGIN;          -- main transaction T1
    query 1;
    BEGIN;          -- subtransaction T1.1
    query 2;
    BEGIN;          -- subtransaction T1.1.1
    query 3;
    ROLLBACK;       -- subtransaction T1.1.1
    COMMIT;         -- subtransaction T1.1
    query 4;
    BEGIN;          -- subtransaction T1.2
    query 5;
    ROLLBACK;       -- subtransaction T1.2
    query 6;
    COMMIT;         -- main transaction T1

effectively behaves like

    BEGIN;          -- main transaction T1
    query 1;
    query 2;
    query 4;
    query 6;
    COMMIT;         -- main transaction T1

I.e. it does not matter whether query 2 has been issued inside the
(later committed) subtransaction T1.1 or directly in the main
transaction T1.  Query 3 and query 5 which are part of aborted
subtransactions (T1.1.1 and T1.2) look as if they had never been issued.

I'm inclined to include "SET TRANSACTION ISOLATION LEVEL" in the kind of
changes that rules 2 and 3 deal with.  Perhaps the rules should say
"commands executed" instead of "changes done".  This would forbid
running parts of the same main transaction with different isolation
levels.

>With "constant isolation" in mind, it's clear that the
>SerializableSnapshot is going to be constant for all transactions.  We
>don't need to calculate different SerializableSnapshots for child
>transactions; thus going with a static variable for SerializableSnapshot
>isn't wrong.
|
|But ... your definition of the snapshot includes the list of successful
|previous subtransactions of the parent.

Apart from possible (future) performance hacks, I see no need to include
a list of completed subtransactions in any local data structure, neither
Snapshots nor TransactionStates.  We do not enumerate the children of a
transaction.  We look into the other direction:  When we check
visibility we find a transaction id in a tuple header and want to know
its parent transaction id.  This question can be answered by pg_subtrans
which will be built on top of the SimpleLRU patch submitted a few days
ago.

| How's that static?

>And about QuerySnapshots: given some running transaction with a given
>QuerySnapshot, a newly created child transaction's first QuerySnapshot
>can be calculated easily as:
>
>- Xmin, Xmax and xip are the same as in the current implementation
>  (i.e. the values from GetSnapshotData)

Yes, and this overwrites the current QuerySnapshot.

>- childxact is my parent's childxact
>- parentxact is created by adding my parent XID to my parent's
>  parentxact

No need for childxact and parentxact (see below).
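
As an aside, the three rules and the T1 example further up can be checked with a small model.  This is only a sketch in Python, not backend code: the function name effective_commands and the representation of a session as a list of strings are assumptions made for illustration.

```python
def effective_commands(script):
    """Flatten nested BEGIN/COMMIT/ROLLBACK blocks per rules 2 and 3.

    `script` is a list of strings; anything other than BEGIN, COMMIT or
    ROLLBACK is treated as an opaque query.  (Hypothetical helper, not
    part of PostgreSQL.)
    """
    # stack[0] collects what survives the outermost transaction;
    # every BEGIN pushes a fresh list for the new (sub)transaction.
    stack = [[]]
    for cmd in script:
        if cmd == "BEGIN":
            stack.append([])
        elif cmd in ("COMMIT", "ROLLBACK"):
            if len(stack) < 2:
                raise ValueError(cmd + " without matching BEGIN")
            finished = stack.pop()
            if cmd == "COMMIT":
                # Rule 3: treated as if done by the enclosing transaction.
                stack[-1].extend(finished)
            # Rule 2: on ROLLBACK the commands are simply dropped.
        else:
            # Rule 1: a query only touches the innermost open level.
            stack[-1].append(cmd)
    if len(stack) != 1:
        raise ValueError("unterminated transaction")
    return stack[0]


session = [
    "BEGIN", "query 1",
    "BEGIN", "query 2",
    "BEGIN", "query 3", "ROLLBACK",
    "COMMIT",
    "query 4",
    "BEGIN", "query 5", "ROLLBACK",
    "query 6",
    "COMMIT",
]
print(effective_commands(session))
```

Fed the T1 sequence, the model keeps queries 1, 2, 4 and 6 and drops queries 3 and 5, matching the flattened form given with the rules.
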
|Not entirely sure about that in READ COMMITTED mode.  Should a child
|xact be able to see commits from other backends that happened before it
|started, but after its parent started?

Why not?  Even if there is no subtransaction, a new *query* sees commits
from other backends.  I don't see why query 2 in the right case should
see commits that are invisible to query 2 in the left case.

    BEGIN;          BEGIN;
    query 1;        query 1;
    BEGIN;          --
    query 2;        query 2;

|I can think of arguments on both sides ...

??

>And given some non-topmost ending transaction, its parent transaction
>next QuerySnapshot can be calculated as:

A QuerySnapshot is always taken at the start of a query.  It does not
depend on the transaction nesting level.

>Thus we don't need to keep track of multiple QuerySnapshots either --
>the new one can always be calculated from the last one.

Not only "from the last one" but "independently from the last one".
GetSnapshotData does not care about subtransactions.

>We need to know all the XIDs that were completed within the same topmost
>transaction, because all of them have to be taken into consideration for
>the visibility rules.

pg_subtrans keeps track of that (sort of, because it can navigate from
child to parent but not vice versa).

> IOW, we have to consider all of them like they
>were only one transaction, discarding the changes made by the ones that
>were aborted.

Okay, cf. rules 2 and 3.

[While we are at it, I continue with some comments to Alvaro's other
message.]

On Sun, 11 May 2003 19:29:27 -0400, Alvaro Herrera
<alvherre@dcc.uchile.cl> wrote:
:In the current implementation, it's sufficient to know
:a) what transactions come before me (Xmin),
:b) what transactions come after me (Xmax),
:c) what transactions are in progress (xip), and
:d) what commands come before me in the current transaction
:   (curcid)

I propose that we don't change this, except that d) should say "... in
the current transaction tree"

:In the nested transactions case, we also need to know
:
:e) what subtransactions of my own parent transactions come before me,
:   and
:f) what commands of my parent transactions come before me.

Yes, we need to have this information.  But that doesn't mean we have to
store it in a snapshot.

ad e)  I can't see a need to directly answer this question.  What we
need is
e') Does a given xid belong to the current xact tree?
This can be answered using pg_subtrans and the transaction information
stack (see below).

ad f)  I'd write this as:
f') What commands of my transaction tree come before me?

:Consider the following scenario:
:
:BEGIN;                                          xid=1
:CREATE TABLE a (p int UNIQUE, q int);           xid=1 cid=1
:INSERT INTO a (p) VALUES (1);                   xid=1 cid=2
:BEGIN;                                          xid=2
:  -- should fail due to unique constraint
:  INSERT INTO a (p) VALUES (1);                 xid=2 cid=1
:ROLLBACK;
:BEGIN;                                          xid=3
:  INSERT INTO a (p) VALUES (2);                 cid=1
:  DELETE FROM a WHERE p=1;                      cid=2
:  -- "a" should have 1 tuple
:COMMIT;
:-- should work, because the old tuple doesn't exist anymore
:INSERT INTO a (p) VALUES (1);                   xid=1 cid=3
:COMMIT;

It might help if we continue to increment cid across subtransaction
boundaries.

BEGIN;                                          xid=1
CREATE TABLE a (p int UNIQUE, q int);           xid=1 cid=1
INSERT INTO a (p) VALUES (1);                   xid=1 cid=2
BEGIN;                                          xid=2
  -- should fail due to unique constraint
  INSERT INTO a (p) VALUES (1);                 xid=2 cid=1 -> 3
ROLLBACK;
BEGIN;                                          xid=3
  INSERT INTO a (p) VALUES (2);                 cid=1 -> 4
  DELETE FROM a WHERE p=1;                      cid=2 -> 5
  -- "a" should have 1 tuple
COMMIT;
-- should work, because the old tuple doesn't exist anymore
INSERT INTO a (p) VALUES (1);                   xid=1 cid=3 -> 6
COMMIT;

:Here, the QuerySnapshot of xid 1, at the time of cid=3 [6] should see the
:results of execution from xid 3, but it is not before Xmin, and it's
:after Xmax, and is not in the xip array.

This will be handled by HeapTupleSatisfiesXxxx using pg_subtrans:

.  We find a tuple (having p=2) with xmin=3.
.  In pg_clog we find that xact 3 is a committed subtransaction.
.  We look up xact 3's parent transaction in pg_subtrans and get
   parent xact = 1.
.  Consulting the transaction information stack we find out that
   xact 1 is one of our own currently active transactions (in this
   case the only one).
.  Because the tuple's cmin (4) is less than CurrentCommandId (6)
   the tuple is visible.

The snapshot is only consulted for transactions outside our own
transaction tree.  This is a natural extension to the current logic,
where we check for IsCurrentTransactionId before we look at the
snapshot.

:Also, the QuerySnapshot of xid 3, should see the results of commands
:from xid 1 just like they'd be seen if they where in the same xact but
:with a lesser CommandId.

Yes, because if we find a tuple with cmin still active, we look for this
xid on our transaction information stack.

:Both cases are not implementable with the current notion of a Snapshot.

I think they are.  What we need is not an extension to the snapshot
structure, but a transaction information stack holding transaction
specific information: TransactionId, TransState, TBlockState, ...  This
looks almost like struct TransactionStateData, except that commandId,
startTime, and startTimeUsec belong only to the main transaction.

:I'm not sure what the SerializableSnapshot should be.  It does need to
:take into account the changes made by previous committed
:subtransactions, right?

Per rule 3 previous committed subtransactions are equivalent to previous
queries.  So whether their effects are visible depends on the current
query snapshot.  Consider this

    UPDATE a SET q = 1 WHERE ...;     -- xid=1 cid=7

when there is a TRIGGER BEFORE UPDATE FOR EACH ROW containing:

    ...
    BEGIN;      -- subtransaction
    UPDATE ...;
    COMMIT;
    ...

While the second tuple is processed, visibility rules for the effects of
the trigger executed for the first tuple are the same as if the trigger
had executed its UPDATE without wrapping it into a subtransaction.
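
Coming back to the HeapTupleSatisfiesXxxx walk-through for the tuple with xmin=3: the sequence of lookups can be modelled roughly as follows.  This is a sketch under loud assumptions, not the proposed implementation.  pg_clog and pg_subtrans are stood in for by plain Python dicts, the transaction information stack by a list of xids, and check_tuple and its three return strings are made-up names.  A tuple whose xid chain never reaches our own stack is handed to the snapshot, which is not modelled here.

```python
ABORTED, COMMITTED, IN_PROGRESS = "aborted", "committed", "in progress"


def check_tuple(xmin, cmin, clog, subtrans, xact_stack, current_cid):
    """Decide how a tuple inserted by (xmin, cmin) is to be treated.

    clog       -- dict xid -> status        (stand-in for pg_clog)
    subtrans   -- dict child xid -> parent  (stand-in for pg_subtrans)
    xact_stack -- xids of our own currently active transactions
    Returns "visible", "invisible" or "ask snapshot".
    All names are hypothetical; this is not PostgreSQL's API.
    """
    xid = xmin
    while True:
        if clog.get(xid) == ABORTED:
            # Rule 2: work below an aborted (sub)transaction is undone.
            return "invisible"
        if xid in xact_stack:
            # Own transaction tree: only the command id decides.
            return "visible" if cmin < current_cid else "invisible"
        if xid not in subtrans:
            # Reached a top-level xact that is not ours.
            return "ask snapshot"
        xid = subtrans[xid]  # navigate from child to parent


# State at the final INSERT of the scenario (cid=6 in xid 1).
clog = {1: IN_PROGRESS, 2: ABORTED, 3: COMMITTED}
subtrans = {2: 1, 3: 1}
xact_stack = [1]

print(check_tuple(3, 4, clog, subtrans, xact_stack, 6))   # tuple with p=2
print(check_tuple(99, 1, clog, subtrans, xact_stack, 6))  # foreign xact
```

The point of the model is that the snapshot structure needs no extra fields: the child-to-parent direction of pg_subtrans plus the stack of our own xids is enough to decide whether a tuple belongs to the current transaction tree.
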
:It's also clear that we need to differentiate a parent's QuerySnapshot
:from their child's.

No, a QuerySnapshot is taken at the start of a query ...

:                     It's not clear to me what should be done in the
:case of a SerializableSnapshot.

A SerializableSnapshot is the first snapshot taken during a *transaction
tree*.

Servus
 Manfred