Thread: possible row locking bug in 7.0.3 & 7.1

possible row locking bug in 7.0.3 & 7.1

From
Forest Wilkinson
Date:
I'm having a problem with functions written in SQL.  Specifically, they
don't seem to be adhering to Postgres locking rules.  For the record, I'm 
using postgres 7.0.3, installed from RPMs, on Red Hat 6.2.  I got the same
results with postgres 7.1 beta 6, installed from sources.

Here's what I'm seeing:
(psql input represented by '<<'; output represented by '>>'.)


session1<< create table idseq
session1<<   (
session1<<   name varchar(32) not null,
session1<<   id int8 not null default 0
session1<<   );
session1>> CREATE

session1<< insert into idseq values ('myid');
session1>> INSERT 18734 1


Each row in the table is supposed to represent a named numeric sequence,
much like the sequences built into postgres.  (Mine use an int8 though,
so their values can be much higher.)


session1<< create function nextid( varchar(32)) returns int8 as '
session1<<   select * from idseq where name = $1::text for update;
session1<<   update idseq set id = id + 1 where name = $1::text;
session1<<   select id from idseq where name = $1::text;
session1<<   ' language 'sql';
session1>> CREATE


The idea here is that the select...for update within the nextid() function
will establish a row level lock, preventing two concurrent function calls
from overlapping.

Next, I test with two sessions as follows:


session1<< begin;
session1>> BEGIN

session2<< begin;
session2>> BEGIN

session1<< select nextid('myid');
session1>>  nextid
session1>> --------
session1>>       1
session1>> (1 row)

session2<< select nextid('myid');

(session2 blocks until session1 completes its transaction)

session1<< commit;
session1>> COMMIT

(session2 resumes)

session2>>  nextid
session2>> --------
session2>>       0
session2>> (1 row)

What gives???  I expected the second call to nextid() to return 2!

session2<< commit;
session2>> COMMIT

session2<< select * from idseq;
session2>>  name | id
session2>> ------+----
session2>>  myid |  2
session2>> (1 row)

session1<< select * from idseq;
session1>>  name | id
session1>> ------+----
session1>>  myid |  2
session1>> (1 row)

As you can see, my nextid() function is not synchronized the way I hoped.
I don't know why, though.  Can someone help?  I'm going to try out some of my 
SPI functions with 7.1 beta 6, to see if they exhibit a locking problem as 
well.

Thanks,

Forest Wilkinson



Re: possible row locking bug in 7.0.3 & 7.1

From
Tom Lane
Date:
Forest Wilkinson <fspam@home.com> writes:
> session1<< create function nextid( varchar(32)) returns int8 as '
> session1<<   select * from idseq where name = $1::text for update;
> session1<<   update idseq set id = id + 1 where name = $1::text;
> session1<<   select id from idseq where name = $1::text;
> session1<<   ' language 'sql';
> [ doesn't work as expected in parallel transactions ]

This is a fairly interesting example.  What I find is that at the final
SELECT, the function can see both the tuple outdated by the other
transaction AND the new tuple it has inserted.  (You can demonstrate
that by doing select count(id) instead of select id.)  Whichever one
happens to be visited first is the one that gets returned by the
function, and that's generally the older one in this example.

MVCC seems to be operating as designed here, more or less.  The outdated
tuple is inserted by a known-committed transaction, and deleted by a
transaction that's also committed, but one that committed *since the
start of the current transaction*.  So its effects should not be visible
to the SELECT, and therefore the tuple should be visible.  The anomalous
behavior is not really in the final SELECT, but in the earlier commands
that were able to see the effects of a transaction committed later than
the start of the second session's transaction.

The workaround for Forest is to make the final SELECT be a SELECT FOR
UPDATE, so that it's playing by the same rules as the earlier commands.
But I wonder whether we ought to rethink the MVCC rules so that that's
not necessary.  I have no idea how we might change the rules though.
If nothing else, we should document this issue better: SELECT and SELECT
FOR UPDATE have different visibility rules, so you probably don't want
to intermix them.
        regards, tom lane


Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

From
Philip Warner
Date:
At 18:14 27/03/01 -0500, Tom Lane wrote:
>Forest Wilkinson <fspam@home.com> writes:
>> session1<< create function nextid( varchar(32)) returns int8 as '
>> session1<<   select * from idseq where name = $1::text for update;
>> session1<<   update idseq set id = id + 1 where name = $1::text;
>> session1<<   select id from idseq where name = $1::text;
>> session1<<   ' language 'sql';
>> [ doesn't work as expected in parallel transactions ]
>
>What I find is that at the final
>SELECT, the function can see both the tuple outdated by the other
>transaction AND the new tuple it has inserted.

Surely we should distinguish between real new tuples, and new tuple
versions? I don't think it's ever reasonable behaviour to see two versions
of the same row.


>(You can demonstrate
>that by doing select count(id) instead of select id.)  Whichever one
>happens to be visited first is the one that gets returned by the
>function, and that's generally the older one in this example.
>
>MVCC seems to be operating as designed here, more or less.  The outdated
>tuple is inserted by a known-committed transaction, and deleted by a
>transaction that's also committed, but one that committed *since the
>start of the current transaction*.  So its effects should not be visible
>to the SELECT, and therefore the tuple should be visible.  The anomalous
>behavior is not really in the final SELECT, but in the earlier commands
>that were able to see the effects of a transaction committed later than
>the start of the second session's transaction.

Looking at the docs, I see that 'SERIALIZABLE' has the same visibility
rules as 'READ COMMITTED', which is very confusing. I expect that a Read
Committed TX should see committed changes for a TX that commits during the
first TX (although this may need to be limited to TXs started before the
first TX, but I'm not sure). If this is not the case, then we never get
non-repeatable reads, AFAICT:
   P2 (��Non-repeatable read��): SQL-transaction T1 reads a row.    SQL-transaction T2 then modifies or deletes that
rowand performs    a COMMIT. If T1 then attempts to reread the row, it may   receive the modified value or discover
thatthe row has been deleted.
 

which is one of the differences between SERIALIZABLE and READ-COMMITTED.


>The workaround for Forest is to make the final SELECT be a SELECT FOR
>UPDATE, so that it's playing by the same rules as the earlier commands.

Eek. Does this seem good to you? I would expect that SELECT and
SELECT...FOR UPDATE should return the same result set.


>But I wonder whether we ought to rethink the MVCC rules so that that's
>not necessary.  I have no idea how we might change the rules though.

Disallowing visibility of two versions of the same row would help.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
>> The workaround for Forest is to make the final SELECT be a SELECT FOR
>> UPDATE, so that it's playing by the same rules as the earlier commands.

> Eek. Does this seem good to you?

I did call it a workaround ;-)

I don't think that we dare try to make any basic changes in MVCC for 7.1
at this late hour, so Forest is going to have to live with that answer
for awhile.  But I would like to see a cleaner answer in future
releases.  As I've opined before, the whole EvalPlanQual mechanism
strikes me as essentially bogus in any case...
        regards, tom lane


Re: possible row locking bug in 7.0.3 & 7.1

From
Forest Wilkinson
Date:
On Tuesday 27 March 2001 15:14, Tom Lane wrote:
> Forest Wilkinson <fspam@home.com> writes:
> > session1<< create function nextid( varchar(32)) returns int8 as '
> > session1<<   select * from idseq where name = $1::text for update;
> > session1<<   update idseq set id = id + 1 where name = $1::text;
> > session1<<   select id from idseq where name = $1::text;
> > session1<<   ' language 'sql';
> > [ doesn't work as expected in parallel transactions ]
[snip]
> The workaround for Forest is to make the final SELECT be a SELECT FOR
> UPDATE, so that it's playing by the same rules as the earlier commands.
> But I wonder whether we ought to rethink the MVCC rules so that that's
> not necessary.  I have no idea how we might change the rules though.
> If nothing else, we should document this issue better: SELECT and SELECT
> FOR UPDATE have different visibility rules, so you probably don't want
> to intermix them.

My, that's ugly.  (But thanks for the workaround.)

If I remember correctly, UPDATE establishes a lock on the affected rows, 
which will block another UPDATE on the same rows for the duration of the 
transaction.  If that's true, shouldn't I be able to achieve my desired 
behavior by removing the initial as follows:

create function nextid( varchar(32)) returns int8 as ' update idseq set id = id + 1 where name = $1::text; select id
fromidseq where name = $1::text; ' language 'sql';
 

Or, would I still have to add FOR UPDATE to that final SELECT?


Forest


Re: possible row locking bug in 7.0.3 & 7.1

From
Tom Lane
Date:
Forest Wilkinson <fspam@home.com> writes:
> If I remember correctly, UPDATE establishes a lock on the affected rows, 
> which will block another UPDATE on the same rows for the duration of the 
> transaction.  If that's true, shouldn't I be able to achieve my desired 
> behavior by removing the initial as follows:

> create function nextid( varchar(32)) returns int8 as '
>   update idseq set id = id + 1 where name = $1::text;
>   select id from idseq where name = $1::text;
>   ' language 'sql';

> Or, would I still have to add FOR UPDATE to that final SELECT?

You're right, the initial SELECT FOR UPDATE is a waste of cycles
considering that you're not using the value it returns.  But you'll
still need the last select to be FOR UPDATE so that it plays by the
same rules as the UPDATE does.
        regards, tom lane


Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Philip Warner <pjw@rhyme.com.au> writes:
> >> The workaround for Forest is to make the final SELECT be a SELECT FOR
> >> UPDATE, so that it's playing by the same rules as the earlier commands.
> 
> > Eek. Does this seem good to you?
> 
> I did call it a workaround ;-)
> 
> I don't think that we dare try to make any basic changes in MVCC for 7.1
> at this late hour, so Forest is going to have to live with that answer
> for awhile.  But I would like to see a cleaner answer in future
> releases.

Is it the MVCC's restriction that each query inside a function
must use the same snapshot ?

> As I've opined before, the whole EvalPlanQual mechanism
> strikes me as essentially bogus in any case...
> 

How would you change it ? UPDATE/SELECT FOR UPDATE have to
SELECT/UPDATE the latest tuples. I don't think of any simple
way for 'SELECT FOR UPDATE' to have the same visibility as
simple SELECT.

regards,
Hiroshi Inoue