Thread: SELECT FOR UPDATE differs inside and outside a pl/pgsql function (7.4)

SELECT FOR UPDATE differs inside and outside a pl/pgsql function (7.4)

From

Mark Shewmaker

Date:

17 December 2003, 13:00:16

Should there be a difference between the behavior of a "select for
update" typed into psql directly versus "select into variable_name for
update" done within a function?

In other words:  Is this a bug or a user misunderstanding:

1.  Run the following commands to set up a table called
    mytable and a function called myfunction:

    ------------------------------

    create table mytable (a int);
    insert into mytable values (1);
    insert into mytable values (2);
    insert into mytable values (3);

    create or replace function myfunction() returns integer as '
    DECLARE
    myrow     mytable%ROWTYPE;
    BEGIN
    LOOP
      select * into myrow from mytable limit 1 for update;
      if found then exit;
      end if;
    END LOOP;
    return myrow.a;
    end; ' language 'plpgsql';
    ------------------------------

2.  Then open up two psql sessions and run the following commands:

    +-----------------------------------------+----------------------+
    | psql Session A                          | psql Session B       |
    +-----------------------------------------+----------------------+
    |begin transaction;                       |                      |
    |                                         | begin transaction;   |
    |select * from mytable limit 1 for update;|                      |
    |                                         | select myfunction(); |
    |delete from mytable where a=1;           |                      |
    |commit;                                  |                      |
    +-----------------------------------------+----------------------+

Session B's "select myfunction();" will hang, and it will continue to
hang even after session A commits.  Is this expected behavior?

There are two ways to have session B not hang after the commit:

1.  Don't do the "delete from mytable where a=1;".

    Session B's "select myfunction();" will then return after
    Session A commits, and with a value of 1.

    Or,

2.  Instead of running "select myfunction();" in Session B,
    run two manual "select * from mytable limit 1 for update;"s.

    The first manual select-for-update will hang until Session A's
    transaction commits, after which the second manual select-for-update
    in session A will succeeds.

    This one really confuses me--should a function not be able
    to find a row when a manual command can?

So I guess I'm curious as to:

1.  Whether this is a bug or not.  I'm guessing yes.

    (I expected the multiple select-for-update attempts in the
    function to the same behavior as multiple select-for-update's
    done manually.  That is, I expected both types of B's
    select-for-updates selecting locked rows to hang until session A's
    commit, immediately fail to find any row, and then succeed on the
    next try.  It would be nice if session B's first select were to
    have transparently succeeded on the a=2 row, something which I think
    would be a legal thing to happen, but as it's not what the
    documentation implies would happen I didn't expect that.  In any
    event I did not expect the select-for-update within pgsql to
    continually fail to find a row.)

2.  If there's a better way to have multiple transactions lock rows
    with select-for-update without any chance of the transaction
    erroring out, (such as would occur with serializable--and that
    would mean I'd have to have to move more of the logic to the
    application.)

    (I don't know of a way to avoid even the busy-looping--though I
    could call sleep functions from plperl or something to lower the
    cpu load.  BTW, it would be nice to have some plpgsql-native sleep
    function just to more easily test for problems like this.)

3.  If there's some really elegant solution out there, such as a
    way to do a "select for update where not locked" to search for
    rows no one has a conflicting lock on.  (To me this would seem
    to be the best of all possible solutions.)

I'm running version() "PostgreSQL 7.4 on i686-pc-linux-gnu, compiled
by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)".

--
Mark Shewmaker
mark at primefactor dot com

Re: SELECT FOR UPDATE differs inside and outside a pl/pgsql function (7.4)

From

Tom Lane

Date:

17 December 2003, 15:03:21

Mark Shewmaker <mark@primefactor.com> writes:
> In other words:  Is this a bug or a user misunderstanding:

You've got the function doing

>     LOOP
>       select * into myrow from mytable limit 1 for update;
>       if found then exit;
>       end if;
>     END LOOP;

which means it will loop infinitely if the SELECT finds no row.

Because "LIMIT 1" executes before "FOR UPDATE" does, it is possible that
the first row returned by the query is rejected by FOR UPDATE (in this
case because it was deleted by an already-committed transaction), and
then the query will not examine any more rows because of the LIMIT.

I think you could get the behavior you want with

      select * into myrow from mytable for update;

that is, just rely on the implicit LIMIT 1 associated with SELECT INTO
rather than writing one in the query.  Then you will get the first row
that passes the FOR UPDATE restriction, which I think is what you're
after.

There has been some argument about whether FOR UPDATE shouldn't execute
before LIMIT, but IIRC there are nasty corner cases in that case too.
So it's hard to say if this should be considered a bug or not.

            regards, tom lane

Re: SELECT FOR UPDATE differs inside and outside a pl/pgsql

From

Mark Shewmaker

Date:

17 December 2003, 19:20:42

On Wed, 2003-12-17 at 14:02, Tom Lane wrote:
> Mark Shewmaker <mark@primefactor.com> writes:
> > In other words:  Is this a bug or a user misunderstanding:
>
> You've got the function doing
>
> >     LOOP
> >       select * into myrow from mytable limit 1 for update;
> >       if found then exit;
> >       end if;
> >     END LOOP;
>
> which means it will loop infinitely if the SELECT finds no row.
>
> Because "LIMIT 1" executes before "FOR UPDATE" does, it is possible that
> the first row returned by the query is rejected by FOR UPDATE (in this
> case because it was deleted by an already-committed transaction), and
> then the query will not examine any more rows because of the LIMIT.
>
> I think you could get the behavior you want with
>
>       select * into myrow from mytable for update;
>
> that is, just rely on the implicit LIMIT 1 associated with SELECT INTO
> rather than writing one in the query.  Then you will get the first row
> that passes the FOR UPDATE restriction, which I think is what you're
> after.

You're right:  Changing the loop in myfunction from the above to use
"select * into myrow from mytable for update;" results in the behavior I
was after.

What's more, the the whole loop can be replaced with that single "select
* into myrow from mytable for update;", and it works just perfectly!

(I think you may have been implying that.)

If the first transaction holds the first available row locked, a second
transaction running this function just waits.  When the first
transaction deletes the row and then commits, the second transaction
suddenly selects the next available row and goes on its merry way.

All very excellent!

> There has been some argument about whether FOR UPDATE shouldn't execute
> before LIMIT, but IIRC there are nasty corner cases in that case too.
> So it's hard to say if this should be considered a bug or not.

Okay.  Well, I wouldn't want to suggest anything that would cause more
problems, especially given that there's a simple and effective
solution/workaround.  ("Workaround" being a bit strong if this isn't
even clearly a bug.)

However, your answer does leave some other questions open as well as
opening up more:

1.  There's still the original outstanding oddity that there was a
    *difference* in the first place between the results from a "select
    limit 1 for update" command run directly within pgsql and a
    "select into variable limit 1 for update" run within a plpgsql
    function that's run directly within pgsql.

    The original function never could lock a row, whereas the command
    run directly could obtain a lock on the second try:

        Infinite loop in function:  Never locks a row.
        First attempt run directly:  No row locked.
        Second attempt run directly:  One row successfully locked.

    If a "FOR UPDATE executes before LIMIT" rule stopped the function
    from ever locking a row, it's still curious why didn't it stop the
    direct command from ever locking a row as well.  That still looks
    troublingly inconsistent.  (Even though it's easy to get the desired
    result in either case admittedly.)

2.  There's now a difference between the suggested "select * into
    myrow from mytable for update;" run within a function, with its
    JUST GREAT behavior, and the original "select * from mytable
    limit 1 for update;":

       New command run from function, first try:  One row successfully
                                                  locked.
       Command run directly, first try:  No row locked.
       Command run directly, second try:  One row locked successfully.

    I suppose my main question here is:  Can I rely on this unexpected
    (by me) and delightful feature of the suggested "select * into
    myrow from mytable for update;" to not require multiple runs in
    this sort of situation?

3.  If there are nasty deadlocks in FOR UPDATE running before LIMIT,
    and there are these slight inconsistencies in how select-for-update
    works depending on where the command is run, I suppose I wonder
    if some of the inconsistencies can trigger some of these problems.

    (Very vague, I know.  I don't have any problems to report or ask
    about related to this, but.. figured I should bring it up in case
    it causes someone to suddenly realize or think of where a real
    deadlock might be.)

4.  As an aside, since another way to directly solve the problem would
    be a way to only select rows that aren't locked, are there any
    thoughts on having that sort of functionality in a future revision?

    (From looking things up online, Oracle 8.1 and onward has something
    like this called "Skip Locked" (with "NOWAIT" being something close
    to the opposite), and microsoft sql-server has something like it
    referred to as "read past".)

Thanks,

 -Mark Shewmaker
  mark at primefactor dot com

Re: SELECT FOR UPDATE differs inside and outside a pl/pgsql

From

Tom Lane

Date:

17 December 2003, 20:57:40

Mark Shewmaker <mark@primefactor.com> writes:
>     If a "FOR UPDATE executes before LIMIT" rule stopped the function
>     from ever locking a row, it's still curious why didn't it stop the
>     direct command from ever locking a row as well.

I think it would.  Did you try the test the other way around (with the
direct command being blocked behind someone who deletes the first row)?

> 2.  There's now a difference between the suggested "select * into
>     myrow from mytable for update;" run within a function, with its
>     JUST GREAT behavior, and the original "select * from mytable
>     limit 1 for update;":

Yeah, because plpgsql is implicitly letting you apply a LIMIT 1 after
the FOR UPDATE, whereas there is no way to get the equivalent behavior
in plain SQL.

> 4.  As an aside, since another way to directly solve the problem would
>     be a way to only select rows that aren't locked, are there any
>     thoughts on having that sort of functionality in a future revision?

There have been previous discussions about how to do this sort of
select-the-first-available-job logic, and IIRC people have found
suitable implementations.  Search the archives.  It's not on-topic
for pgsql-bugs, in any case.

            regards, tom lane

Re: SELECT FOR UPDATE differs inside and outside a pl/pgsql

From

Mark Shewmaker

Date:

18 December 2003, 05:29:02

On Wed, 2003-12-17 at 19:57, Tom Lane wrote:
> Mark Shewmaker <mark@primefactor.com> writes:
> >     If a "FOR UPDATE executes before LIMIT" rule stopped the function
> >     from ever locking a row, it's still curious why didn't it stop the
> >     direct command from ever locking a row as well.
>
> I think it would.  Did you try the test the other way around (with the
> direct command being blocked behind someone who deletes the first row)?

Yes, or at least I've done the test that I think you're asking about.
(See Session_A-with-early-delete occurring with Session B_1 below.)

To be as clear as possible as to the actual tests run:

+-----------------+---------------------------------------------------+
|Run this in psql:| Along with one of these in a second psql session: |
+-----------------+-----------------+-----------------+---------------+
| Session A       | Session B_1     | Session B_2     | Session B_3   |
+-----------------+-----------------+-----------------+---------------+
|                 |                 |                 |               |
| # begin         |                 |                 |               |
|   transaction   |                 |                 |               |
|                 |                 |                 |               |
|                 | # begin         | # begin         | # begin       |
|                 |   transaction;  |   transaction   |   transaction |
|                 |                 |                 |               |
| # select * from |                 |                 |               |
|   mytable limit |                 |                 |               |
|   1 for update; |                 |                 |               |
| (returns a=1)   |                 |                 |               |
|                 |                 |                 |               |
|(NOTE: if the    |                 |                 |               |
|following delete |                 |                 |               |
|statement is     |                 |                 |               |
|moved here,before|                 |                 |               |
|session B_?'s    |                 |                 |               |
|selects, it has  |                 |                 |               |
|no effect on     |                 |                 |               |
|session B_?'s    |                 |                 |               |
|hangs & results) |                 |                 |               |
|                 |                 |                 |               |
|                 | # select * from | # select        | # select      |
|                 |   mytable limit |   myfunction(); |   myfunction2();
|                 |   1 for update; | (hangs)         | (hangs)       |
|                 | (hangs)         | (hangs)         | (hangs)       |
| # delete from   | (hangs)         | (hangs)         | (hangs)       |
|   mytable where | (hangs)         | (hangs)         | (hangs)       |
|   a=1;          | (hangs)         | (hangs)         | (hangs)       |
| (succeeds with  | (hangs)         | (hangs)         | (hangs)       |
| "DELETE 1")     | (hangs)         | (hangs)         | (hangs)       |
|                 | (hangs)         | (hangs)         | (hangs)       |
| # commit;       | (hangs)         | (hangs)         | (hangs)       |
| (succeeds with  | (hangs)         | (hangs)         | (hangs)       |
| "COMMIT")       | (hangs)         | (hangs)         | (hangs)       |
|                 | (returns with   | (hangs)         | (returns with |
|                 | no rows)        | (hangs)         |  a=2)         |
|                 |                 | (hangs)         |               |
|                 | # select * from | (hangs)         | # delete from |
|                 |   mytable limit | (hangs)         |   mytable     |
|                 |   1 for update; | (hangs)         |   where a=2;  |
|                 | (returns with   | (hangs)         | (succeeds with|
|                 | a=2)            | (hangs)         | "DELETE 1")   |
|                 |                 | (hangs)         |               |
|                 | # delete from   | (hangs)         | # commit;     |
|                 |   mytable where | (hangs)         | (succeeds with|
|                 |   a=2;          | (hangs)         | "COMMIT")     |
|                 | (succeeds with  | (hangs)         |               |
|                 | "DELETE 1")     | (hangs)         |               |
|                 |                 | (hangs)         |               |
|                 | # commit;       | (hangs)         |               |
|                 | (succeeds with  | (hangs)         |               |
|                 | "COMMIT")       | (hangs)         |               |
|                 |                 | (hangs)         |               |
+-----------------+-----------------+-----------------+---------------+
| Before each test:  # delete from mytable;                           |
|                    # insert into mytable (a) values (1);            |
|                    # insert into mytable (a) values (2);            |
|                    # insert into mytable (a) values (3);            |
+---------------------------------------------------------------------+
| myfunction() mostly consists of:                                    |
|                                                                     |
|      LOOP                                                           |
|        select * into myrow from mytable limit 1 for update;         |
|        if found then exit;                                          |
|        end if;                                                      |
|      END LOOP;                                                      |
|      return myrow.a;                                                |
+---------------------------------------------------------------------+
| myfunction2() mostly consists of:                                   |
|                                                                     |
|      select * into myrow from mytable for update;                   |
|      return myrow.a;                                                |
+---------------------------------------------------------------------+

Does this contain the test case you were asking about?

("select * into myrow from mytable for update;" as you suggested, with
it's implicit limit done within the function is still a great solution
for me.)

> > 2.  There's now a difference between the suggested "select * into
> >     myrow from mytable for update;" run within a function, with its
> >     JUST GREAT behavior, and the original "select * from mytable
> >     limit 1 for update;":
>
> Yeah, because plpgsql is implicitly letting you apply a LIMIT 1 after
> the FOR UPDATE, whereas there is no way to get the equivalent behavior
> in plain SQL.

Ahhh.  Okay.

Now that you've explained that one a second time, it's finally begun to
sink in.  :-)

(That doesn't explain the original difference discussed above, of
course.)

> > 4.  As an aside, since another way to directly solve the problem would
> >     be a way to only select rows that aren't locked, are there any
> >     thoughts on having that sort of functionality in a future revision?
>
> There have been previous discussions about how to do this sort of
> select-the-first-available-job logic, and IIRC people have found
> suitable implementations.  Search the archives.  It's not on-topic
> for pgsql-bugs, in any case.

Thanks.

I'll search the archives further, and bring up it up in a more suitable
list if need be and where it will be more on topic.

--
Mark Shewmaker
mark@primefactor.com

Re: SELECT FOR UPDATE differs inside and outside a pl/pgsql

From

Tom Lane

Date:

18 December 2003, 11:12:20

Mark Shewmaker <mark@primefactor.com> writes:
> On Wed, 2003-12-17 at 19:57, Tom Lane wrote:
>> Mark Shewmaker <mark@primefactor.com> writes:
>>> If a "FOR UPDATE executes before LIMIT" rule stopped the function
>>> from ever locking a row, it's still curious why didn't it stop the
>>> direct command from ever locking a row as well.
>>
>> I think it would.  Did you try the test the other way around (with the
>> direct command being blocked behind someone who deletes the first row)?

> Yes, or at least I've done the test that I think you're asking about.

So you have.  Your session B_1 (second column) shows exactly the
behavior I expected: the first invocation of SELECT FOR UPDATE
fails to lock any row.  You manually did the equivalent of looping
as in myfunction().  So it looks the same to me.

            regards, tom lane