Thread: BUG #17050: cursor with for update + commit in loop

BUG #17050: cursor with for update + commit in loop

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17050
Logged by:          Алексей Булгаков
Email address:      bulgakovalexey1980@gmail.com
PostgreSQL version: 12.7
Operating system:   Red Hat 4.4.7-23
Description:

create table public.test_tuple_stream (
       id serial,
       nm text,
       dt timestamptz,
       num bigint
       );

CREATE OR REPLACE PROCEDURE public.test_tuple_stream()
 LANGUAGE plpgsql
AS $procedure$
  declare
      l_cur cursor for
        select id
        from public.test_tuple_stream
        order by id
        for update;
    begin
      for rec in l_cur loop
        update public.test_tuple_stream
        set num = num + 1
        where id = rec.id;
       
        commit;       
      end loop;
     
      commit;
    END;
$proc      
      
--    truncate table public.test_tuple_stream;  
      
    insert into public.test_tuple_stream(nm, dt, num)
    values ('A', now(), 1);
    insert into public.test_tuple_stream(nm, dt, num)
    values ('B', now(), 1);
    insert into public.test_tuple_stream(nm, dt, num)
    values ('C', now(), 1);

  call public.test_tuple_stream()

    select *
    from public.test_tuple_stream
    order by id

If run procedure test_tuple_stream then in result updated 2 rows of 3.
Why?
if remove in procedure "for update" or "commit in loop" then updated 3 rows
of 3


Re: BUG #17050: cursor with for update + commit in loop

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> If run procedure test_tuple_stream then in result updated 2 rows of 3.
> Why?

So what's happening here is that at the point of the first COMMIT,
we need to save the output of the cursor query, because we have to
release locks and so forth.  The expectation is that subsequent
fetches will read from a tuplestore rather than the live query.

To do that, PersistHoldablePortal supposes that it can rewind and
re-read the query's executor run.  But that only works if the
query is guaranteed stable, which anything involving FOR UPDATE
is not.  In the example at hand, when we re-read the row with ID 1,
nodeLockRows sees that its state is now TM_SelfModified thanks to the
UPDATE we just did.  This causes it to not return that row, so that
what ends up in the tuplestore is only the rows with IDs 2 and 3.
But since the cursor "knows" that one row has already been read out,
the next fetch will return the row with ID 3, and the procedure's
loop never visits the row with ID 2.

I imagine that similar misbehavior could be constructed around
queries containing volatile functions (e.g. random()), rather than
FOR UPDATE.

The only obvious way to fix this is to always save aside the output
of a cursor query in case we need to persist it later, so that
PersistHoldablePortal doesn't have to assume that rewinding is safe.
That would be pretty catastrophic for performance, though, so I doubt
anybody will be happy with that answer.

For cursors that aren't marked scrollable, we might be able to say
that we only save the *rest* of the query output, and then adjust
the cursor state appropriately for that choice.  Seems possibly
nontrivial though, and there's still the question of what to do
for scrollable ones.

Anyway, many thanks for the report!  But don't hold your breath
for a fix; this is going to take some thought.

            regards, tom lane



Re: BUG #17050: cursor with for update + commit in loop

From
Tom Lane
Date:
I wrote:
> The only obvious way to fix this is to always save aside the output
> of a cursor query in case we need to persist it later, so that
> PersistHoldablePortal doesn't have to assume that rewinding is safe.
> That would be pretty catastrophic for performance, though, so I doubt
> anybody will be happy with that answer.
> For cursors that aren't marked scrollable, we might be able to say
> that we only save the *rest* of the query output, and then adjust
> the cursor state appropriately for that choice.  Seems possibly
> nontrivial though, and there's still the question of what to do
> for scrollable ones.

Actually ... maybe it's not that bad.  The nonscrollable case seems
to be quite simple to fix, and as for the scrollable case, maybe
we can just say it's on the user's head that the query produce
stable results.  There's already a prohibition on using FOR UPDATE
with SCROLL, and the DECLARE CURSOR reference page has some warnings
about volatile queries with WITH HOLD, which is basically the same
case as we're worried about here.

I think the DECLARE CURSOR page needs some modernization to mention
that cursors in procedures are basically the same as WITH HOLD.
But as far as code changes go, the attached seems sufficient.

            regards, tom lane

diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 6f2397bd36..d34cc39fde 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -374,10 +374,23 @@ PersistHoldablePortal(Portal portal)
         PushActiveSnapshot(queryDesc->snapshot);

         /*
-         * Rewind the executor: we need to store the entire result set in the
-         * tuplestore, so that subsequent backward FETCHs can be processed.
+         * If the portal is marked scrollable, we need to store the entire
+         * result set in the tuplestore, so that subsequent backward FETCHs
+         * can be processed.  Otherwise, store only the not-yet-fetched rows.
+         * (The latter is not only more efficient, but avoids semantic
+         * problems if the query's output isn't stable.)
          */
-        ExecutorRewind(queryDesc);
+        if (portal->cursorOptions & CURSOR_OPT_SCROLL)
+        {
+            ExecutorRewind(queryDesc);
+        }
+        else
+        {
+            /* We must reset the cursor state as though at start of query */
+            portal->atStart = true;
+            portal->atEnd = false;
+            portal->portalPos = 0;
+        }

         /*
          * Change the destination to output to the tuplestore.  Note we tell
diff --git a/src/pl/plpgsql/src/expected/plpgsql_transaction.out b/src/pl/plpgsql/src/expected/plpgsql_transaction.out
index 8fceb88c9b..4565107073 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_transaction.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_transaction.out
@@ -335,6 +335,32 @@ SELECT * FROM pg_cursors;
 ------+-----------+-------------+-----------+---------------+---------------
 (0 rows)

+-- interaction of FOR UPDATE cursor with subsequent updates (bug #17050)
+TRUNCATE test1;
+INSERT INTO test1 VALUES (1,'one'), (2,'two'), (3,'three');
+DO LANGUAGE plpgsql $$
+DECLARE
+    l_cur CURSOR FOR SELECT a FROM test1 ORDER BY 1 FOR UPDATE;
+BEGIN
+    FOR r IN l_cur LOOP
+      UPDATE test1 SET b = b || ' ' || b WHERE a = r.a;
+      COMMIT;
+    END LOOP;
+END;
+$$;
+SELECT * FROM test1;
+ a |      b
+---+-------------
+ 1 | one one
+ 2 | two two
+ 3 | three three
+(3 rows)
+
+SELECT * FROM pg_cursors;
+ name | statement | is_holdable | is_binary | is_scrollable | creation_time
+------+-----------+-------------+-----------+---------------+---------------
+(0 rows)
+
 -- commit inside block with exception handler
 TRUNCATE test1;
 DO LANGUAGE plpgsql $$
diff --git a/src/pl/plpgsql/src/sql/plpgsql_transaction.sql b/src/pl/plpgsql/src/sql/plpgsql_transaction.sql
index 94fd406b7a..e65f840a0e 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_transaction.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_transaction.sql
@@ -273,6 +273,27 @@ SELECT * FROM test2;
 SELECT * FROM pg_cursors;


+-- interaction of FOR UPDATE cursor with subsequent updates (bug #17050)
+TRUNCATE test1;
+
+INSERT INTO test1 VALUES (1,'one'), (2,'two'), (3,'three');
+
+DO LANGUAGE plpgsql $$
+DECLARE
+    l_cur CURSOR FOR SELECT a FROM test1 ORDER BY 1 FOR UPDATE;
+BEGIN
+    FOR r IN l_cur LOOP
+      UPDATE test1 SET b = b || ' ' || b WHERE a = r.a;
+      COMMIT;
+    END LOOP;
+END;
+$$;
+
+SELECT * FROM test1;
+
+SELECT * FROM pg_cursors;
+
+
 -- commit inside block with exception handler
 TRUNCATE test1;