Re: Table Lock issue - Mailing list pgsql-general

From Tom Lane
Subject Re: Table Lock issue
Date
Msg-id 24436.1082089052@sss.pgh.pa.us
Whole thread Raw
In response to Re: Table Lock issue  ("Uwe C. Schroeder" <uwe@oss4u.com>)
Responses Re: Table Lock issue
List pgsql-general
"Uwe C. Schroeder" <uwe@oss4u.com> writes:
> On Thursday 15 April 2004 04:20 pm, Tom Lane wrote:
>> ... The problem is that the SELECT
>> doesn't get the right result.  The SELECT actually sees two row versions
>> as being valid: the one you just created by UPDATE, and whichever one
>> was current when the outer transaction started.

> One question to "was current when the outer transaction started".

I knew I wasn't going to get away with that explanation ;-).  Okay,
here's the long version.  Let's imagine this situation: initially
there's one row with last_value 1.  Transaction A comes and does

    begin;
    select ib_nextval('mycounter');

but doesn't commit yet.  Now there are two rows: one with last_value 1,
which is marked as created by some past committed transaction, and as
deleted by transaction A which is as yet uncommitted.  There is also
a row with last_value 2, which is marked as created by transaction A and
deleted by nobody.

Now transaction B comes and does

    select ib_nextval('mycounter');

It's going to block until A commits --- in your version, it blocks at
the LOCK-table-exclusively command, in mine at the UPDATE because the
UPDATE sees the row lock on the updated row.  But in either case it
waits.  Once xact A commits, B proceeds to mark the row with last_value
2 as deleted by itself, and creates a row with last_value 3, created by
itself and deleted by nobody.  (This row will have last_value 3, not
something less, because the UPDATE will use the latest available
committed row as the starting point for its "last_value+1" computation.)

Now we come to the tricky part: transaction B does its SELECT.  Which of
the three available rows will it consider valid?  Because this SELECT is
inside a function, and we don't advance the QuerySnapshot inside a
function, the SELECT will be applying MVCC rules with respect to a
snapshot that was taken when the outer "select ib_nextval()" began ---
in other words, before transaction A committed.  So the validity checks
stack up like this:

* original row with last_value 1: created by a long-since-committed
transaction, and deleted by a transaction (xact A) that had not
committed at the time of the snapshot.  Ergo, good.

* second row with last_value 2: created by a not-yet-committed xact (A)
and deleted by my own transaction.  Loses on either count; not good.

* third row with last_value 3: created by my own transaction and deleted
by nobody.  Ergo, good.

So both last_value 1 and last_value 3 are visible to the SELECT, and
it's a crapshoot which will come up first in SELECT INTO.

If we were to advance the QuerySnaphot between statements of a plpgsql
function, the problem would go away because the SELECT would see
transaction A as already committed, making the original row not-good.

Now in this situation it is good to recognize the effects of other
transactions between statements of a plpgsql function, but it's not hard
to think up cases in which plpgsql functions would break if the visible
database state changes between statements.  So it's a bit of a tough
choice what to do.  I'm personally starting to think that we *should*
advance the QuerySnapshot, but as I said there's not yet a consensus
about it.

Oh, one other point: SELECT FOR UPDATE fixes this because it has
different visibility rules.  Like UPDATE, it will *never* consider good
a row version that is marked as deleted by any committed transaction.

            regards, tom lane

pgsql-general by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: [HACKERS] Remove MySQL Tools from Source?
Next
From: "Thomas Chille"
Date:
Subject: Re: PLpgSQL-Problem