Re: atomic commit;begin for long running transactions , in combination with savepoint. - Mailing list pgsql-general

From Trevor Talbot
Subject Re: atomic commit;begin for long running transactions , in combination with savepoint.
Date
Msg-id 90bce5730710150528k153956c3had9cd05c5e296a12@mail.gmail.com
Whole thread Raw
In response to Re: atomic commit;begin for long running transactions , in combination with savepoint.  (Syan Tan <kittylitter@people.net.au>)
List pgsql-general
On 10/15/07, Syan Tan <kittylitter@people.net.au> wrote:

> >In order to detect a change occurred, what you want is a SERIALIZABLE
> >transaction: you want the update to fail if the row it matches is no
> >longer the same version as your snapshot.  However, in order to read
> >the new value to decide if you want to update it anyway, you need to
> >leave your current snapshot.  As soon as you do that, ALL previously
> >read values lose the update checks that snapshot provided you.
>
> you read the old value at the beginning of the transaction, and
> you don't re-read it , assuming it is infrequently changed, so
> if someone updates it concurrently, when you try to write, then
> you detect the conflict, and rollback to the savepoint.

Transactions don't operate based on what you've read.  "UPDATE ...
WHERE ..." finds the row(s) to update using the WHERE clause right
now, not based on any previously-read values.  It does not know what
you've read before.

The only difference is in the data you _can_ read.  For the
SERIALIZABLE isolation level, that data was decided at the beginning
of the transaction.  A row that was updated by another transaction
will make the version that you can see effectively "read only", so
when the UPDATE tries to change it, there will be a transactional
conflict due to the isolation level.

> You DONT want a serializable transaction, because then you can't read
> the other committed value after rolling back to the savepoint.

Correct.  But the READ COMMITTED isolation level does not limit what
data you can see at the beginning of the transaction, so an UPDATE
will always find the latest version of a row.  There is no conflict as
far as the transaction is concerned.

> >A way to do this using PostgreSQL's own row version data came up
> >recently on this list.  Have a look at this post and the one following
> >it:
> >http://archives.postgresql.org/pgsql-general/2007-10/msg00503.php
>
> this is the same as using your own version ids and incrementing them
> within the application, which would leave the xmin within postgresql's
> domain , and would also mean the application's sql is not tied
> to postgresql.

Yes.  You were asking for a feature in PostgreSQL that doesn't match
standard transaction semantics though, so I figured you wouldn't mind
a PostgreSQL-specific option :)


Assuming READ COMMITTED isolation level:

> begin;
> select x from t1 where id=2;
> (store in variable x0 , display to user)
> ...(much later)
> (user changes stored x, at client, now x1)
> savepoint a;
>
> answ = n;
>
> do:
> try:
>     update t1 set x=x1 where id = 2;
>     commit-and-continue;
> catch:
>      rollback to savepoint a;
>      select x from t1 where id=2 ( store as x2)
>      if (x2 <> x0) notify user "x has changed from x0 to x2, continue to write x1?"
>      input answ;
>
> while answ ==y;
>
>
> In the first pass of the loop, the transaction hasn't read x a second
> time so the transaction state for x is at x0,

Transaction state is not based on what you've read; it doesn't know.

> if x has been changed by another transaction's commit, then the catch will
> execute ,

The UPDATE will find the latest version of the row.  The change made
by the other transaction is not a problem at this isolation level (you
can see it), so the UPDATE will simply proceed and change it anyway.
The catch block will never execute.

> If the transaction was serializable , and another transaction has committed,
> this would never work, because a commit would nullify the other
> transactions write without this transaction ever having seen the other
> transactions write, so this transaction would always be forced to rollback.

Exactly.  But without SERIALIZABLE isolation, any operations you
perform always see the other transaction's write, so there is never
any conflict.

Also keep in mind that MVCC is not the only way to implement
transactions; pure locking is more common in other databases.  In the
locking model, most transactions prevent others from writing until
after they are finished.  Rows simply can't have different versions
(and of course concurrent performance is awful).

pgsql-general by date:

Previous
From: Joao Miguel Ferreira
Date:
Subject: Re: replicating to a stopped server
Next
From: "Gauthier, Dave"
Date:
Subject: Re: Guideline on use of temporary tables