Thread: odd behavior: function not atomic/not seeing its own updates/not acting serializable nor read committed

fwiw this was tested on 7.2 and 7.4 devel.

test case:
DROP TABLE locktest;
CREATE TABLE locktest (a INTEGER, c TIMESTAMPTZ DEFAULT now() NOT NULL);
INSERT INTO locktest VALUES (0);

CREATE OR REPLACE FUNCTION locktest1(integer) RETURNS INTEGER AS '
DECLARE
    x INTEGER := 1;
    y ALIAS FOR $1;
    foo INTEGER;
BEGIN
    LOCK TABLE locktest IN EXCLUSIVE MODE;
    INSERT INTO locktest(a) VALUES (y);
    -- busy loop to widen the window for concurrent calls
    LOOP
        x := x + 1;
        EXIT WHEN x >= 1000000;
    END LOOP;
    INSERT INTO locktest(a) VALUES (y+1);
    SELECT a INTO foo FROM locktest ORDER BY c ASC LIMIT 1;
    RAISE NOTICE \'original entry in column a is %\', foo;
    UPDATE locktest SET a = a + y;
    SELECT a INTO foo FROM locktest ORDER BY c ASC LIMIT 1;
    RAISE NOTICE \'original entry in column a is now %\', foo;
    RETURN y;
END;
' LANGUAGE 'plpgsql';


In two separate windows, I call the function locktest1 at (roughly) the
same time:

window#1
foo=# select locktest1(10);
NOTICE:  original entry in column a is 0
NOTICE:  original entry in column a is now 10
 locktest1
-----------
        10
(1 row)

foo=#

window#2
foo=# select locktest1(20);
NOTICE:  original entry in column a is 0
NOTICE:  original entry in column a is now 0
 locktest1
-----------
        20
(1 row)

foo=#
foo=# select * from locktest order by c;
 a  |               c
----+-------------------------------
 30 | 2002-12-16 16:37:15.728062-05
 20 | 2002-12-16 16:37:23.067412-05
 21 | 2002-12-16 16:37:23.067412-05
 40 | 2002-12-16 16:37:25.802827-05
 41 | 2002-12-16 16:37:25.802827-05
(5 rows)

foo=#

Internally the second function call gives conflicting results: as far as
its SELECT statements are concerned, it never sees the updated
information for the first entry (the one at 15 seconds into the minute),
as evidenced by both raise notices reporting 0. In fact, it does not
even see the changes made by its own UPDATE command, though the first
function call at least saw that. Yet on some level it does see the
actual updated information from the first function call, because the
initial entry ends up as 30 (0 + 10 + 20).  Furthermore, concurrency
between the two is broken: the second function call sees the update
made by the first call and updates that row accordingly, but it doesn't
see the rows inserted by the first call, and so those rows are not
updated.

"this is broken on so many levels..."

If functions ran as if they were in read-committed mode, then my final
resultset should look like this:

 a  |               c
----+-------------------------------
 30 | 2002-12-16 16:43:02.374575-05
 40 | 2002-12-16 16:43:11.640649-05
 41 | 2002-12-16 16:43:11.640649-05
 40 | 2002-12-16 16:43:16.705811-05
 41 | 2002-12-16 16:43:16.705811-05
(5 rows)

since the second function call would wait for the first call to finish
before attempting to make changes, would see all three records in the
resultset of the first function, and would update all of those records
accordingly.


OTOH if functions ran as if they were in serializable mode, the second
function would, upon attempt to update the first record, see that the
record was already updated, and throw a "ERROR:  Can't serialize access
due to concurrent update", which could then be dealt with accordingly.

As it stands now we have a bastardization of the two isolation levels.

Robert Treat

Robert Treat <xzilla@users.sourceforge.net> writes:
> "this is broken on so many levels..."

The issue here is that the second transaction has already set its
snapshot when it enters the function body; so even though its LOCK
command faithfully waits for the first transaction to commit, the
second transaction can't see (and so can't update) the two rows
inserted by the first transaction.  It doesn't really see the update
of the original row either, except for purposes of its own UPDATE.

It would work more like you're expecting if you'd issued the LOCK
before calling the function.  I realize that's not convenient in
many cases.
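
For reference, the calling pattern I mean is roughly this, using the
locktest1 function from your test case:

```sql
-- Grab the lock in a separate statement *before* the SELECT that
-- calls the function sets its snapshot.  The second session blocks
-- on the LOCK until the first commits, and the SELECT's snapshot,
-- taken afterwards, then sees the first session's inserts and update.
BEGIN;
LOCK TABLE locktest IN EXCLUSIVE MODE;
SELECT locktest1(10);
COMMIT;
```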

> OTOH if functions ran as if they were in serializable mode, the second
> function would, upon attempt to update the first record, see that the
> record was already updated, and throw a "ERROR:  Can't serialize access
> due to concurrent update", which could then be dealt with accordingly.

It would do that, if you had run it in serializable mode.  I get:

regression=# begin;
BEGIN
regression=# set TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
SET
regression=# select locktest1(20);
NOTICE:  original entry in column a is 0
WARNING:  Error occurred while executing PL/pgSQL function locktest1
WARNING:  line 15 at SQL statement
ERROR:  Can't serialize access due to concurrent update
regression=#


In the read-committed case, there has been some talk of executing
SetQuerySnapshot between statements of a function; search the archives
for "SetQuerySnapshot" to find past threads.  I'm leaning in favor
of that myself, but no one's yet given a convincing analysis of what
this would change and what the downside might be.

(A compromise less likely to break existing code might be to do
SetQuerySnapshot only after a function issues LOCK, but still the real
effects of this would need to be clearly understood before such a
proposal is likely to pass.)

            regards, tom lane

Re: odd behavior: function not atomic/not seeing its own

From: Robert Treat
On Mon, 2002-12-16 at 18:53, Tom Lane wrote:
> Robert Treat <xzilla@users.sourceforge.net> writes:
> > "this is broken on so many levels..."
>
> The issue here is that the second transaction has already set its
> snapshot when it enters the function body; so even though its LOCK
> command faithfully waits for the first transaction to commit, the
> second transaction can't see (and so can't update) the two rows
> inserted by the first transaction.  It doesn't really see the update
> of the original row either, except for purposes of its own UPDATE.
>

I understand that, but what I don't understand is why the first
function call sees its own changes (as evidenced by the different values
in its raise notices -- 0, 10) while the second call is unable to see
either the first function's changes or its own (its raise notices
return 0, 0). The fact that functions behave differently when run
concurrently seems dangerous to me. Consider rewriting the test case
function as this:

CREATE OR REPLACE FUNCTION locktest1(integer) RETURNS INTEGER AS '
DECLARE
    x INTEGER := 1;
    y ALIAS FOR $1;
    foo INTEGER;
BEGIN
    LOCK TABLE locktest IN EXCLUSIVE MODE;
    INSERT INTO locktest(a) VALUES (y);
    -- busy loop to widen the window for concurrent calls
    LOOP
        x := x + 1;
        EXIT WHEN x >= 1000000;
    END LOOP;
    INSERT INTO locktest(a) VALUES (y+1);
    SELECT a INTO foo FROM locktest ORDER BY c ASC LIMIT 1;
    RAISE NOTICE \'original entry in column a is %\', foo;
    UPDATE locktest SET a = a + y;
    SELECT a INTO foo FROM locktest ORDER BY c ASC LIMIT 1;

    IF foo > 0 THEN
        UPDATE locktest SET a = a + 100;
    END IF;

    RAISE NOTICE \'original entry in column a is now %\', foo;
    RETURN y;
END;
' LANGUAGE 'plpgsql';

The check on foo returns true in the first function call but false in
the second, which causes the two calls to behave differently given the
same logic. This seems very dangerous to me.


> It would work more like you're expecting if you'd issued the LOCK
> before calling the function.  I realize that's not convenient in
> many cases.

Inconvenient to say the least. One of the major reasons to use functions
is the ability to capture business logic within the database so that it
can be used by multiple apps.  Given the need to issue locks in the
application code, this functionality takes a major hit. (You're left
hoping the application programmers put locks in all the right places;
otherwise your data may get corrupted.)

>
> > OTOH if functions ran as if they were in serializable mode, the second
> > function would, upon attempt to update the first record, see that the
> > record was already updated, and throw a "ERROR:  Can't serialize access
> > due to concurrent update", which could then be dealt with accordingly.
>
> It would do that, if you had run it in serializable mode.  I get:
>
> regression=# begin;
> BEGIN
> regression=# set TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
> SET
> regression=# select locktest1(20);
> NOTICE:  original entry in column a is 0
> WARNING:  Error occurred while executing PL/pgSQL function locktest1
> WARNING:  line 15 at SQL statement
> ERROR:  Can't serialize access due to concurrent update
> regression=#
>

True. However, again, this workaround requires code in the application,
since there is no way to set the transaction isolation level within a
function. What we need is a self-contained solution for functions.
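
That is, every calling application would have to wrap each function call
along these lines:

```sql
-- What every calling application has to do today: set the isolation
-- level outside the function, and retry the whole transaction when it
-- fails with a serialization error.
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT locktest1(20);
COMMIT;
-- on "ERROR:  Can't serialize access due to concurrent update":
-- issue ROLLBACK and retry from BEGIN
```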

>
> In the read-committed case, there has been some talk of executing
> SetQuerySnapshot between statements of a function; search the archives
> for "SetQuerySnapshot" to find past threads.  I'm leaning in favor
> of that myself, but no one's yet given a convincing analysis of what
> this would change and what the downside might be.

I've done a bit of searching on this and I don't see any cases where the
current behavior of SetQuerySnapshot improves the situation. My feeling
is that there might be cases where functions would no longer behave
atomically if they always saw other sessions' committed changes
mid-function.

Imagine a function that selects a value, does some computation, then
selects again.  If that first value changes in the middle of the
function, it might throw off what is supposed to happen. This is a
consequence of the fact that some functions ought to execute as if they
were instantaneous, but can't always do so.
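
As a made-up illustration (the accounts table, its balance column, and
the audit_balance function are all hypothetical):

```sql
-- Hypothetical example of a function whose logic assumes both SELECTs
-- see the same snapshot.  If SetQuerySnapshot ran between statements,
-- a concurrent committed UPDATE could make the check fail even though
-- each individual read was correct.
CREATE OR REPLACE FUNCTION audit_balance(integer) RETURNS BOOLEAN AS '
DECLARE
    acct ALIAS FOR $1;
    before NUMERIC;
    after NUMERIC;
BEGIN
    SELECT balance INTO before FROM accounts WHERE id = acct;
    -- ... some computation over related tables ...
    SELECT balance INTO after FROM accounts WHERE id = acct;
    RETURN before = after;
END;
' LANGUAGE 'plpgsql';
```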

>
> (A compromise less likely to break existing code might be to do
> SetQuerySnapshot only after a function issues LOCK, but still the real
> effects of this would need to be clearly understood before such a
> proposal is likely to pass.)
>

I've thought about this too, and I think it is a decent compromise. If
there is one case where you're really likely to want an updated
QuerySnapshot, it's when you grab a lock on a table; after all, why
would you want to lock data that you can't see?  I can't think of a case
where this would break things, given that right now we are blindly
acquiring locks; how can actually knowing what you've locked cause you
problems? Furthermore, this would help ensure that further queries
inside the function are looking at a more consistent snapshot. I would
lobby for this to go into 7.3.1, as it at least gives people a viable,
self-contained workaround.

I think ideally what would be good would be the ability to create
functions as inherently serializable or read committed (perhaps through
a parameter passed in at function creation, like ISCACHABLE). This would
force the function to behave in one of the two modes as documented for
transaction isolation levels, and update the query snapshots
accordingly.  I think the default should be serializable in order to
preserve atomicity.
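
To be clear, no such syntax exists today; I'm imagining something along
these lines:

```sql
-- Purely hypothetical syntax, sketched by analogy with ISCACHABLE;
-- nothing like this exists in any released version:
CREATE FUNCTION locktest1(integer) RETURNS INTEGER AS '
    ...
' LANGUAGE 'plpgsql' WITH (ISOLATION SERIALIZABLE);
```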

One last thing, and correct me if I'm wrong, but any fix would need to
happen for all of the function languages, not just plpgsql, right?

Robert Treat