Thread: About row locking ordering
Hi All.
When multiple sessions try to acquire a row lock on the same tuple sequentially, I expected
that the first session waiting for the lock would acquire it first (FIFO). However, when we
actually tested this, there were cases where a session that arrived after the first session
acquired the row lock ahead of it.
Is this behavior an expected specification, or is it a bug?
The following conditions describe the cases where a later session gets the lock first:
・Multiple sessions request row locks on the same tuple
・An update occurs on the target tuple
The above behavior can cause the following problems:
・If Postgres is supposed to grant row locks in FIFO order, that guarantee is not satisfied.
・Sessions that cannot acquire the lock for a long time turn into long-running transactions.
If the above behavior is by specification, I think design considerations for dealing with row locks
should be documented.
[Condition]
・Version
- PostgreSQL 14.0 (compiled with LOCK_DEBUG)
・postgresql.conf
- log_lock_waits = on
- deadlock_timeout = 10
- log_line_prefix = '%m [%p:%l:%x] '
- trace_locks = on
- trace_lock_oidmin = 16000
[Table Definition]
create table t1 (col1 int, col2 int, col3 char (10));
insert into t1 values (1, 0, ''); insert into t1 values (100, 9999, '');
[Application]
<a.sql> ---------------------------
BEGIN;
SELECT * FROM t1 WHERE col1 = 1 FOR UPDATE;
SELECT current_timestamp AS TRANSTART, clock_timestamp() AS NOW;
UPDATE t1 set col2 = :val, col3 = :valb WHERE col1 = 1;
SELECT pg_sleep(1);
COMMIT;
------------------------------------
[Test]
The test set is attached.
1. Run the following with 100 parallel sessions
for i in `seq 100`; do psql postgres -f ./a.sql -v val=${i} -v valb="'a.sh'" & done
2. After 10 seconds (once step 1 has finished), run the following with 10 parallel sessions
for i in `seq 10`; do ii=`expr ${i} + 900`; psql postgres -f ./a.sql -v val=${ii} -v valb="'b.sh'" & done
All of the a.sh sessions were expected to finish first (FIFO). However, some b.sh sessions finished
before some of the a.sh sessions.
[Log]
We are checking two types of logs.
1. One process waits on a ShareLock; the rest wait on an AccessExclusiveLock
1-1. Only 1369555 waits on a ShareLock, for the transaction ID held by 1369547; the rest wait on an
AccessExclusiveLock, for the tuple lock held by 1369555.
This looks like the pattern seen when no update has occurred on the tuple.
--------------------------------------------------------------
2022-10-26 01:20:08.881 EDT [1369555:19:0] LOG: process 1369555 still waiting for ShareLock on transaction 2501 after 10.072 ms
2022-10-26 01:20:08.881 EDT [1369555:20:0] DETAIL: Process holding the lock: 1369547. Wait queue: 1369555.
~
2022-10-26 01:21:58.918 EDT [1369898:17:0] LOG: process 1369898 acquired AccessExclusiveLock on tuple (1, 0) of relation 16546 of database 13779 after 10.321 ms
2022-10-26 01:21:58.918 EDT [1369898:18:0] DETAIL: Process holding the lock: 1369555. Wait queue: 1369558, 1369561, 1369564, 1369567, 1369570, 1369573, 1369576, ...
--------------------------------------------------------------
2. All processes wait on a ShareLock
2-1. With 1369558 holding the lock on t1 tuple (0, 4), the head of the queue is 1369561.
--------------------------------------------------------------
2022-10-26 01:22:27.230 EDT [1369623:46:2525] LOG: process 1369623 still waiting for ShareLock on transaction 2504 after 10.133 ms
2022-10-26 01:22:27.242 EDT [1369877:47:2604] DETAIL: Process holding the lock: 1369558. Wait queue: 1369561, 1369623, 1369626, ...
--------------------------------------------------------------
2-2. When 1369558 released its locks, 1369561, at the head of the wait queue, was expected to acquire the lock, but process 1369787 actually acquired it.
--------------------------------------------------------------
2022-10-26 01:22:28.237 EDT [1369623:63:2525] LOG: process 1369623 still waiting for ShareLock on transaction 2577 after 10.028 ms
2022-10-26 01:22:28.237 EDT [1369623:64:2525] DETAIL: Process holding the lock: 1369787. Wait queue: 1369623, 1369610, 1369614, 1369617, 1369620.
--------------------------------------------------------------
2-3. It can be confirmed that 1369561 has been moved back in the queue.
--------------------------------------------------------------
2022-10-26 01:22:28.237 EDT [1369629:64:2527] DETAIL: Process holding the lock: 1369623. Wait queue: 1369629, 1369821, 1369644, ... 1369561, ...
--------------------------------------------------------------
Regards, Ryo
Attachment
My first question is why are you not using "WHERE CURRENT OF" cursor_name?
The link to the page follows.
But effectively, you are locking the row and that is the row you want to update (the current row of the cursor).
I wonder if that addresses the problem...
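To illustrate what I mean, here is a minimal sketch based on your t1 table (the cursor name c1 and the values 42 and 'a.sh' are only placeholders for this example):
---------------------------
BEGIN;
-- Open a cursor that locks the target row; FOR UPDATE takes the row lock
-- and lets WHERE CURRENT OF address exactly that row.
DECLARE c1 CURSOR FOR
    SELECT * FROM t1 WHERE col1 = 1 FOR UPDATE;
-- Position the cursor on the locked row.
FETCH 1 FROM c1;
-- Update the row the cursor is positioned on, without a second scan.
UPDATE t1 SET col2 = 42, col3 = 'a.sh' WHERE CURRENT OF c1;
COMMIT;
---------------------------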
On Fri, Nov 25, 2022 at 2:58 AM Ryo Yamaji (Fujitsu) <yamaji.ryo@fujitsu.com> wrote:
Hi All.
When multiple sessions try to acquire a row lock on the same tuple sequentially, I expected
that the first session waiting for the lock would acquire it first (FIFO). However, when we
actually tested this, there were cases where a session that arrived after the first session
acquired the row lock ahead of it.
Is this behavior an expected specification, or is it a bug?
The following conditions describe the cases where a later session gets the lock first:
・Multiple sessions request row locks on the same tuple
・An update occurs on the target tuple
The above behavior can cause the following problems:
・If Postgres is supposed to grant row locks in FIFO order, that guarantee is not satisfied.
・Sessions that cannot acquire the lock for a long time turn into long-running transactions.
If the above behavior is by specification, I think design considerations for dealing with row locks
should be documented.
[Condition]
・Version
- PostgreSQL 14.0 (compiled with LOCK_DEBUG)
・postgresql.conf
- log_lock_waits = on
- deadlock_timeout = 10
- log_line_prefix = '%m [%p:%l:%x] '
- trace_locks = on
- trace_lock_oidmin = 16000
[Table Definition]
create table t1 (col1 int, col2 int, col3 char (10));
insert into t1 values (1,0, ''); insert into t1 values (100, 9999, '');
[Application]
<a.sql> ---------------------------
BEGIN;
SELECT * FROM t1 WHERE col1 = 1 FOR UPDATE;
SELECT current_timestamp AS TRANSTART, clock_timestamp() AS NOW;
UPDATE t1 set col2 = :val, col3 = :valb WHERE col1 = 1;
SELECT pg_sleep(1);
COMMIT;
------------------------------------
[Test]
The test set is attached.
1. Run the following with 100 parallel sessions
for i in `seq 100`; do psql postgres -f ./a.sql -v val=${i} -v valb="'a.sh'" & done
2. After 10 seconds (once step 1 has finished), run the following with 10 parallel sessions
for i in `seq 10`; do ii=`expr ${i} + 900`; psql postgres -f ./a.sql -v val=${ii} -v valb="'b.sh'" & done
All of the a.sh sessions were expected to finish first (FIFO). However, some b.sh sessions finished
before some of the a.sh sessions.
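In addition to the server log, the waiting sessions and the PIDs blocking them can be cross-checked from a separate session with pg_locks and pg_blocking_pids(). This is only a hypothetical monitoring query, not part of the attached test set, and it shows who waits on whom rather than the internal wait-queue order, which only the log_lock_waits output below exposes:
---------------------------
-- Run from another session while the test is in flight.
SELECT l.pid,
       l.locktype,
       l.mode,
       pg_blocking_pids(l.pid) AS blocked_by
FROM   pg_locks l
WHERE  NOT l.granted
ORDER  BY l.pid;
---------------------------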
[Log]
We are checking two types of logs.
1. One process waits on a ShareLock; the rest wait on an AccessExclusiveLock
1-1. Only 1369555 waits on a ShareLock, for the transaction ID held by 1369547; the rest wait on an
AccessExclusiveLock, for the tuple lock held by 1369555.
This looks like the pattern seen when no update has occurred on the tuple.
--------------------------------------------------------------
2022-10-26 01:20:08.881 EDT [1369555:19:0] LOG: process 1369555 still waiting for ShareLock on transaction 2501 after 10.072 ms
2022-10-26 01:20:08.881 EDT [1369555:20:0] DETAIL: Process holding the lock: 1369547. Wait queue: 1369555.
~
2022-10-26 01:21:58.918 EDT [1369898:17:0] LOG: process 1369898 acquired AccessExclusiveLock on tuple (1, 0) of relation 16546 of database 13779 after 10.321 ms
2022-10-26 01:21:58.918 EDT [1369898:18:0] DETAIL: Process holding the lock: 1369555. Wait queue: 1369558, 1369561, 1369564, 1369567, 1369570, 1369573, 1369576, ...
--------------------------------------------------------------
2. All processes wait on a ShareLock
2-1. With 1369558 holding the lock on t1 tuple (0, 4), the head of the queue is 1369561.
--------------------------------------------------------------
2022-10-26 01:22:27.230 EDT [1369623:46:2525] LOG: process 1369623 still waiting for ShareLock on transaction 2504 after 10.133 ms
2022-10-26 01:22:27.242 EDT [1369877:47:2604] DETAIL: Process holding the lock: 1369558. Wait queue: 1369561, 1369623, 1369626, ...
--------------------------------------------------------------
2-2. When 1369558 released its locks, 1369561, at the head of the wait queue, was expected to acquire the lock,
but process 1369787 actually acquired it.
--------------------------------------------------------------
2022-10-26 01:22:28.237 EDT [1369623:63:2525] LOG: process 1369623 still waiting for ShareLock on transaction 2577 after 10.028 ms
2022-10-26 01:22:28.237 EDT [1369623:64:2525] DETAIL: Process holding the lock: 1369787. Wait queue: 1369623, 1369610, 1369614, 1369617, 1369620.
--------------------------------------------------------------
2-3. It can be confirmed that 1369561 has been moved back in the queue.
--------------------------------------------------------------
2022-10-26 01:22:28.237 EDT [1369629:64:2527] DETAIL: Process holding the lock: 1369623. Wait queue: 1369629, 1369821, 1369644, ... 1369561, ...
--------------------------------------------------------------
Regards, Ryo
Thank you for your comments. Sorry for the late reply.

From: Kirk Wolak Sent: Friday, November 25, 2022 6:12 PM
> My first question is why are you not using "WHERE CURRENT OF" cursor_name?

I thought that cursors were mainly useful for manipulating large numbers of rows, so I did not
consider using a cursor in this test case, because each process manipulates only one row.
However, for general usage I think it would be preferable to use a cursor, as you pointed out,
because it might result in fewer table accesses. I was not aware of that.

> But effectively, you are locking the row and that is the row you want to update (the current row of the cursor).
> I wonder if that addresses the problem...

I tested it with a cursor, but the problem still occurred. I will attach the script used for the test, though it is simple.
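For reference, a rough sketch of what such a cursor-based variant of a.sql could look like (the actual attached script may differ; :val and :valb are passed with psql -v as before):
---------------------------
BEGIN;
-- Take the row lock through a cursor instead of a plain SELECT ... FOR UPDATE.
DECLARE c1 CURSOR FOR
    SELECT * FROM t1 WHERE col1 = 1 FOR UPDATE;
FETCH 1 FROM c1;
SELECT current_timestamp AS TRANSTART, clock_timestamp() AS NOW;
-- Update the row the cursor is currently positioned on.
UPDATE t1 SET col2 = :val, col3 = :valb WHERE CURRENT OF c1;
SELECT pg_sleep(1);
COMMIT;
---------------------------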