BUG #4925: "select ... for update" doesn't affect rows from sub-query - Mailing list pgsql-bugs

From Steve Caligo
Subject BUG #4925: "select ... for update" doesn't affect rows from sub-query
Date
Msg-id 200907161634.n6GGYDQJ049925@wwwmaster.postgresql.org
Whole thread Raw
Responses Re: BUG #4925: "select ... for update" doesn't affect rows from sub-query  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-bugs
The following bug has been logged online:

Bug reference:      4925
Logged by:          Steve Caligo
Email address:      steve.caligo@ctie.etat.lu
PostgreSQL version: 8.3.7 and 8.4.0
Operating system:   Archlinux and Gentoo 8.3.7, Gentoo 8.4.0
Description:        "select ... for update" doesn't affect rows from
sub-query
Details:

While trying to guarantee data consistency when doing concurrent processing,
I stumbled upon your cautions mentionned in your documentation (especially
"limit" ... "for update"):
http://www.postgresql.org/docs/8.4/static/sql-select.html

I tried working around this limitation and the statement on the same page
seemed promising to me:
"If FOR UPDATE or FOR SHARE is applied to a view or sub-query, it affects
all tables used in the view or sub-query."

But unfortunately the latter statement doesn't seem to be true and the
subquery isn't protected by row locks, as the following examples show.


1) INITIAL SITUATION

Create a simple table with some data. No constraints, no indexes, just the
bare minimum:

create table test (
    id integer,
    name varchar(10),
    c integer
);

insert into test values
    (1, 'test1', 0),
    (2, 'test2', 0),
    (3, 'test3', 0),
    (4, 'test4', 0),
    (5, 'test5', 0)
;


2) FIRST TRY, USING "UPDATE WHERE ID = ( SELECT )"
The goal is to have two clients set their unique ID to a single/different
row from the table. First, using "limit" in a slightly different way:

1=> begin transaction;
2=> begin transaction;

1=> update test set c = 1 where id = (
    select id from test where c = 0 order by id limit 1
)
; -- updates row id=1
2=> update test set c = 2 where id = (
    select id from test where c = 0 order by id limit 1
)
; -- forced to wait on lock

1=> commit; -- client #2 continues
1=> select * from test where id = 1;
 id | name  | c
----+-------+---
  1 | test1 | 1
(1 row)

2=> commit;
2=> select * from test where id = 1;
 id | name  | c
----+-------+---
  1 | test1 | 2
(1 row)

1=> select * from test where id = 1;
 id | name  | c
----+-------+---
  1 | test1 | 2
(1 row)


Conclusion: didn't work.
Probably because the subquery is executed before the update and not affected
by row locking.

Expected behaviour: one client update one row to c=1 and the other client
updates a different row to c=2.


3) SECOND TRY, PROTECTING THE UPDATE BY AN ADDITIONAL "SELECT ... FOR
UPDATE" AND AVOIDING THE "LIMIT"

1=> begin transaction;
2=> begin transaction;
1=> select id from test where id = (
    select min(id) from test where c = 0
) for update;
 id
----
  2
(1 row)

2=> select id from test where id = (
    select min(id) from test where c = 0
) for update; -- forced to wait on lock

1=> update test set c = 1 where id = 2;
1=> commit; -- client #2 continues:

2=> -- client #2 outputs:
 id
----
  2
(1 row)

2=> select * from test where id = 2;
 id | name  | c
----+-------+---
  2 | test2 | 1
(1 row)
2=> -- now this isn't what we initially asked for, let's just repeat the
query once more:

2=> select id from test where id = ( select min(id) from test where c = 0 )
for update;
 id
----
  3
(1 row)


Conclusion: didn't work.
The situation one ends up in is one that contradicts your above statement,
but also seemingly violates the "I" in ACID. In 3), client #2 is clearly
affected by the actions of client #1. While serialized transactions or full
table locks would avoid this race condition, it either requires large
changes in the application or impacts performance during contention.

Adding an additional "c = 0" to the main query of 3) of course suppresses
the row from the concurrent update, but it shouldn't have been returned with
a value of "c = 1" because the transaction #2 started prior to the update
statement of #1.

pgsql-bugs by date:

Previous
From: Frank van Vugt
Date:
Subject: Re: bug or simply not enough stack space?
Next
From: "Juan C. Aragon"
Date:
Subject: FATAL: could not reattach to shared memory (key=268, addr=01E30000): 487