READ COMMITTED isolevel is implemented ... - Mailing list pgsql-hackers

and this is now the DEFAULT isolevel.

I ran some tests to check how it works, but not many.
Unfortunately, it's currently not possible to add
such tests to the regression suite because they require
concurrent transactions. We could write a simple script to
run a few psql sessions simultaneously and then just feed queries
to them (through pipes) in the required order. I have no time
for this now...
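
The script idea above could look roughly like the Python sketch below. It is
only an illustration: the function name run_interleaved and its arguments are
made up for this example, and it only controls the order in which lines are
written to the pipes. It does not wait for one statement to finish before
sending the next, which a real concurrency test would need.

```python
import subprocess


def run_interleaved(command, session_count, steps):
    """Start `session_count` copies of `command` and feed them lines in order.

    `steps` is a list of (session_index, line) pairs. Lines are written to
    the sessions' stdin pipes in list order. Returns the captured stdout of
    each session, in session order.

    Assumption: the sessions produce little output, so the stdout pipes
    never fill up before they are read at the end.
    """
    sessions = [
        subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(session_count)
    ]
    for index, line in steps:
        sessions[index].stdin.write(line + "\n")
        sessions[index].stdin.flush()
    # communicate() closes stdin (which ends the session) and reads stdout.
    return [session.communicate()[0] for session in sessions]
```

For psql the command might be something like ["psql", "testdb"], where testdb
is a hypothetical database name, and steps would list the statements of T1 and
T2 in the order they should be sent.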

Processing updates in READ COMMITTED isolevel is much more
complex than in SERIALIZABLE, because if transaction T1
notices that the tuple to be updated/deleted/selected_for_update
was changed by concurrent transaction T2, then T1 has to check
whether the new version of the tuple satisfies T1's plan qual.
For simple cases like UPDATE t ... WHERE x = 0 or x = 1
it would be possible to just ExecQual the new tuple, but
for joins & subqueries it's required to re-execute the entire
plan with this tuple stuck in the Index/Seq Scan over the result
relation (i.e. the scan over the result relation will return
only this new tuple, but all other scans will work as usual).
To achieve this, a copy of the plan is created and executed. If
a tuple is returned by this child plan then T1 tries to update
the new version of the tuple, and if it has already been updated
(during child plan execution) by transaction T3 then T1 will
re-execute the child plan for T3's version of the tuple, etc.
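
The re-check loop described above can be modelled with a toy Python sketch.
This is not executor code: TupleVersion and recheck_chain are names invented
for the illustration, and it covers only the simple single-table case where
the qual can be evaluated on the tuple alone (joins and subqueries need the
child plan). The chain of `newer` links stands for the versions T1 discovers
one after another (T2's, then T3's, and so on).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TupleVersion:
    """One version of a row; `newer` is the version that replaced it."""
    values: dict
    newer: Optional["TupleVersion"] = None


def recheck_chain(start: TupleVersion,
                  qual: Callable[[dict], bool]) -> Optional[TupleVersion]:
    """Follow concurrent updates from `start`, re-checking `qual` each time.

    `start` is the version T1 found through its normal scan, so it is
    assumed to satisfy `qual` already. Each newer version is re-checked;
    if one fails the qual the row is skipped (None is returned). Otherwise
    the newest version, which T1 may now update, is returned.
    """
    current = start
    while current.newer is not None:
        current = current.newer          # version written by T2, T3, ...
        if not qual(current.values):
            return None                  # row no longer matches the query
    return current
```

For example, with the qual x = 0 or x = 1, a row updated from x = 0 to x = 1
is still updated by T1, while a further concurrent update to x = 5 makes T1
skip the row.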

Handling of SELECT FOR UPDATE OF > 1 relations is even more
complex. While processing tuples (more than 1 tuple may be
returned by a join) from child plan P1, created for a tuple of table
A, and trying to mark a tuple of table B that was updated by T3, T1
will have to suspend P1 execution and create a new child plan
P2 with two tuples stuck in the scans of A & B. Execution of P1
will be continued after execution of P2 (P3, P4 ... -:)).
Fortunately, the max # of possible child plans is equal to
the number of relations in the FOR UPDATE clause: if, while
processing the first tuple from Pn, T1 sees that the tuple stuck in
Pm, m < n, was changed, then T1 stops execution of
Pn, ..., Pm+1 and re-starts Pm execution for the new version
of the tuple. Note that n - m may be more than 1 because
tuples are always marked in the order specified in the FOR UPDATE
clause and only after the transaction has ensured that the new tuple
version satisfies the plan qual.
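
The bound on the number of child plans can be illustrated with a small Python
model. It is a sketch under assumptions, not the executor's data structure:
ChildPlanStack is a hypothetical name, plan Pk is represented only by the k
tuples stuck in its scans, and the plans stopped on a restart are taken to be
those numbered above m.

```python
class ChildPlanStack:
    """Toy model: plan Pk has tuples stuck for the first k FOR UPDATE relations."""

    def __init__(self, relations):
        self.relations = list(relations)  # order given in the FOR UPDATE clause
        self.stuck = []                   # stuck[k-1] is the tuple added by Pk

    def push(self, tuple_version):
        """Suspend the current plan and start the next child plan."""
        if len(self.stuck) >= len(self.relations):
            raise RuntimeError("more child plans than FOR UPDATE relations")
        self.stuck.append(tuple_version)
        return len(self.stuck)            # number n of the new plan Pn

    def restart(self, m, new_version):
        """The tuple stuck in Pm changed: stop the plans above Pm, restart Pm."""
        if not 1 <= m <= len(self.stuck):
            raise IndexError("no such child plan")
        del self.stuck[m:]                # stop Pn, ..., P(m+1)
        self.stuck[m - 1] = new_version   # Pm now runs for the new version
        return m
```

Because a restart only ever truncates the stack and a push adds one level per
relation, the depth never exceeds the number of relations in the clause.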

The trigger manager is also able to use child plans for
before row update/delete triggers (the tuple must be
marked for update - i.e. locked - before trigger
execution), but this is not tested at all yet.

The executor never frees child plans explicitly; when a new
child plan is needed it re-uses an unused one if there is any.
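
The re-use policy amounts to a simple free list. A minimal Python sketch,
with ChildPlanPool and make_plan as invented names standing in for whatever
the executor actually does to copy a plan:

```python
class ChildPlanPool:
    """Keeps finished child plans around and hands them out again."""

    def __init__(self, make_plan):
        self._make_plan = make_plan  # hypothetical factory that copies the plan
        self._free = []              # child plans not currently in use
        self.created = 0             # how many plans were ever created

    def acquire(self):
        """Return an unused child plan, creating one only if none is free."""
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._make_plan()

    def release(self, plan):
        """Mark a child plan as unused; it is kept, never freed."""
        self._free.append(plan)
```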

Well, MVCC todo list:

-- big items

1. vacuum
2. btree
   2.1 still uses page locking
   2.2 ROOT page may be changed by concurrent insertion but
       btinsert doesn't check this
 

-- small ones

3. refint - selects don't block concurrent transactions:
   FOR UPDATE must be used in some cases
4. user_lock contrib code: lmgr structures changed

Vadim

