Re: BUG #18472: SELECT FOR UPDATE locking more rows than expected

From Tom Lane
Subject Re: BUG #18472: SELECT FOR UPDATE locking more rows than expected
Msg-id 2760373.1716044766@sss.pgh.pa.us
In response to BUG #18472: SELECT FOR UPDATE locking more rows than expected  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> And then retry I can see that I get the expected 5 rows back in the query
> from tab 2 leading me to believe that when both the LIMIT and ORDER BY are
> present postgres is locking more rows than I believe should be necessary
> based on the documentation that I've read and experiments that I've done.

This is not a bug, because there is no guarantee that SELECT FOR
UPDATE will lock only what you think is the minimum number of rows.

I realize that this is probably an oversimplified example of your real
problem, but in the given case the issue is use of the sub-select;
do you really need that?  EXPLAIN shows the plan as

 Limit
   ->  LockRows
         ->  Sort
               Sort Key: t1.b
               ->  Nested Loop
                     ->  Seq Scan on tab1 t1
                     ->  Subquery Scan on t2
                           ->  Limit
                                 ->  LockRows
                                       ->  Seq Scan on tab2
                                             Filter: (t1.a = a_ref)

The upper LockRows node should indeed not lock any rows that
aren't returned, but the trouble is the lower one, which will
probably end up locking every tab2 row that has a match in tab1.
(Remember the Sort will not be able to return any row until
it's run the Nested Loop to completion.)

The reason that's there is that the outer query interprets
application of FOR UPDATE to a sub-SELECT-in-FROM as being
a request to push down a FOR UPDATE into the sub-SELECT.
You could get rid of that by specifying just "FOR UPDATE OF t1"
in the outer query, but of course then you'd end up with no lock
applied to the joined t2 row, so I don't know if that works
for you.
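For illustration, that variant might look like the following (a
sketch only, reconstructing the original query's shape from the
plan above, inner LIMIT included):

SELECT *
FROM public.tab1 t1,
LATERAL (SELECT *
FROM public.tab2
WHERE "a"="a_ref"
LIMIT 1
) t2
ORDER BY b ASC
LIMIT 1
FOR UPDATE OF t1;

Here only the tab1 row returned through the outer LIMIT is
locked; the joined tab2 row is read without any row lock.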

I think the answer could be to get rid of the inner LIMIT,
which is what's preventing optimization of the subquery.
I dislike that on semantic grounds anyway, because as this
is written it's totally undefined which tab2 rows get joined
to which tab1 rows.  Without that I get

explain (costs off)
SELECT *
FROM public.tab1 t1,
LATERAL (SELECT * 
FROM public.tab2
WHERE "a"="a_ref" 
) t2
ORDER BY b ASC
LIMIT 1
FOR UPDATE;
                     QUERY PLAN                      
-----------------------------------------------------
 Limit
   ->  LockRows
         ->  Sort
               Sort Key: t1.b
               ->  Merge Join
                     Merge Cond: (t1.a = tab2.a_ref)
                     ->  Sort
                           Sort Key: t1.a
                           ->  Seq Scan on tab1 t1
                     ->  Sort
                           Sort Key: tab2.a_ref
                           ->  Seq Scan on tab2
(12 rows)

which would behave a lot better in terms of locking
only the returned row(s).

            regards, tom lane
