Thread: Row locks, SKIP LOCKED, and transactions

Row locks, SKIP LOCKED, and transactions

From
Steven Winfield
Date:
Hi all,

I'm seeing some unexpected behaviour with SELECT ... FOR UPDATE SKIP LOCKED, and I'm finding it tricky to boil it
down to a simple repro case as there's almost certainly a race condition somewhere (more later). So I thought I would
ask whether what I'm doing is unsupported (or just plain wrong!) before expending more effort in reproducing it.

I'm running v11.5, RHEL 7.7.

I have two tables, job and result:
CREATE TABLE job (id integer PRIMARY KEY);
CREATE TABLE result (id integer PRIMARY KEY);
(obviously the real tables have more columns, but that's not too important here)

Something populates the job table with IDs.
A job is done if its id appears in the result table.
I would like to have multiple worker processes working on jobs.


I thought I could achieve this with each worker doing the following:

BEGIN;

SELECT id
FROM job
WHERE NOT EXISTS (SELECT 1 FROM result WHERE result.id = job.id)
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- worker process does some work for the selected ID here

INSERT INTO result (id) VALUES (the_id_from_above);

COMMIT;


However, even with just two worker processes, I quickly found that one worker process would be assigned a job id that
had *very* recently been completed by the other worker.

Some more potentially useful information:
* The LockRows node of the plan for the SELECT query above doesn't receive any tuples until about a second after the
query begins executing
* If worker 2 begins querying for a new job id half a second before worker 1 commits then worker 2 will pick up the
job id that worker 1 has just finished with.
* I observe this even if I crank up the transaction isolation level to repeatable read and serializable.


I'm wondering if row locks are not obeying the same transactional semantics as row data, as a potential explanation for
the above behaviour is as follows (W1/2 = Worker 1/2):

W1: BEGIN;
W1: SELECT ...
W1: (SELECT returns id=1. W1 now has job(id=1) locked.)
W1: INSERT INTO result (id) VALUES (1)

W2: BEGIN;
W2: SELECT ...

W1: COMMIT; job(id=1) is now unlocked.

W2: (SELECT returns id=1: W1 had not committed when the SELECT started, so result(id=1) is not visible, but LockRows
found that job(id=1) was not locked. W2 now has job(id=1) locked.)


...i.e. W2's SELECT could not see the row that W1 INSERTed (because W2's BEGIN occurs and W2's SELECT begins before
W1's commit), but W2's SELECT *could* see the removal of W1's row lock.


Perhaps this is a misuse of the locking system, since I'm locking a row "FOR UPDATE" but not actually updating it; but
as row locks are released at the end of a transaction (according to the docs), my expectation was for the unlocking
and the visibility of newly committed rows to be atomic.
I've tried FOR NO KEY UPDATE too, without luck.

If I'm doing something forbidden (and the docs say so) then I'd be grateful if someone could point that out!

Best,
Steven.




Re: Row locks, SKIP LOCKED, and transactions

From
Adrian Klaver
Date:
On 12/17/19 8:12 AM, Steven Winfield wrote:
> Hi all,
>
> I'm seeing some unexpected behaviour with SELECT ... FOR UPDATE SKIP LOCKED, and I'm finding it tricky to boil it
> down to a simple repro case as there's almost certainly a race condition somewhere.

[snip]

> ...i.e. W2's SELECT could not see the row that W1 INSERTed (because W2's BEGIN occurs and W2's SELECT begins before
> W1's commit), but W2's SELECT *could* see the removal of W1's row lock.

Gotta believe it is this:

https://www.postgresql.org/docs/11/transaction-iso.html#XACT-READ-COMMITTED

"UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands  ..."

If I read correctly, transactions can see the effects of other 
transactions that commit during their lifetime.



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Row locks, SKIP LOCKED, and transactions

From
Thomas Munro
Date:
On Wed, Dec 18, 2019 at 5:12 AM Steven Winfield
<Steven.Winfield@cantabcapital.com> wrote:
> * I observe this even if I crank up the transaction isolation level to repeatable read and serializable.

Huh.  SERIALIZABLE shouldn't allow two transactions to see no result
row for a given ID and then insert a result row for that ID.  One of
those transactions should have to roll back, because otherwise it'd be
incompatible with both serial orderings of the two transactions.

> I'm wondering if row locks are not obeying the same transactional semantics as row data,

They are indeed a bit weird.  They sometimes check whether the
conditions still apply (since the row might have changed between the
scan and the LockRows node), which leads to some interesting effects,
but only if the row being locked was concurrently updated, and here
that isn't the case.  This is a source of a fair amount of confusion
about FOR UPDATE and joins/subselects.

> Perhaps this is a misuse of the locking system, since I'm locking a row "FOR UPDATE" but not actually updating it,
butas row locks are released at the end of a transaction (according to the docs) then my expectation was for the
unlockingand the visibility of newly committed rows to be atomic. 
> I've tried FOR NO KEY UPDATE too, without luck.
>
> If I'm doing something forbidden (and the docs say so) then I'd be grateful if someone could point that out!

Conceptually, the thing you really need to lock for this to work is
the result row that isn't there yet, so that some overlapping
transaction doesn't try to lock the same absent thing.  Unfortunately,
our system for locking things that aren't there isn't there either.
Some articles on serializability talk about "materialising the
conflict", which means locking some other surrogate thing that
"covers" a gap you are interested in.  You might think the job row
would do the trick, but since we don't recheck the condition (that is,
recheck that there is no corresponding result because you don't update
the job row), no cigar.  You could also use plain old
pg_try_advisory_xact_lock(id), because it just locks integers, and
they're always there.
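
A sketch of that advisory-lock variant (untested; `:claimed_id` is a placeholder for the id returned by the first
SELECT). Note that the lock alone doesn't fix the snapshot-visibility issue described above: at READ COMMITTED each
statement takes a fresh snapshot, so it's the re-check in a second statement, issued after the lock is held, that
actually closes the race:

```sql
BEGIN;

-- Claim a candidate id by locking the integer itself; the xact-scoped
-- advisory lock is released automatically at COMMIT/ROLLBACK.
SELECT id
FROM job
WHERE NOT EXISTS (SELECT 1 FROM result WHERE result.id = job.id)
  AND pg_try_advisory_xact_lock(job.id)
LIMIT 1;

-- Re-check under a fresh snapshot now that we hold the lock; if a row
-- comes back here, another worker finished this job and we must retry.
SELECT 1 FROM result WHERE id = :claimed_id;

-- ... do the work ...

INSERT INTO result (id) VALUES (:claimed_id);
COMMIT;
```

(One gotcha: the executor may evaluate pg_try_advisory_xact_lock() for more rows than the LIMIT returns, leaving
stray advisory locks held until commit.)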

SERIALIZABLE deals with that type of magic internally (it locks gaps
in key ranges by predicate-locking a physical btree or hash page that
you'd need to write on to insert a row with a matching key, which is
how it discovers a conflict between one transaction that went looking
for key=42 but didn't find it and another that later writes key=42),
but, as mentioned, SERIALIZABLE doesn't really allow concurrency with
this workload, and you specified that you wanted concurrency with SKIP
LOCKED (but I think you'd have the same problem without it; SKIP
LOCKED just gets you the wrong answer faster).

There are various ways you could deal with this, but I'd probably go
for a simple scheme where you only have to consult a single row to
know if you can claim it.  You could still put the results into a
separate table, but use job.state to find work, and set it to DONE
when you insert the result.  It may also be possible to add no new
columns but do a dummy update to the job row to get the join qual
rechecked, but I'm not sure if that'd work.  Another reason to add a
state column to the job table is so that you can put a conditional
index on it so you can find jobs to be done very quickly, if you're
not planning to remove the ones that are done.
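
The job.state scheme described above might look like the following sketch (the state values, index name, and
`:claimed_id` placeholder are illustrative):

```sql
ALTER TABLE job ADD COLUMN state text NOT NULL DEFAULT 'PENDING';

-- Partial index so finding runnable jobs stays fast as done jobs accumulate.
CREATE INDEX job_pending_idx ON job (id) WHERE state = 'PENDING';

BEGIN;

-- Claim a job: the UPDATE writes a new row version, so a concurrent
-- worker that locks the same row afterwards rechecks its quals against
-- that version and no longer sees state = 'PENDING'.
UPDATE job
SET state = 'DONE'
WHERE id = (SELECT id
            FROM job
            WHERE state = 'PENDING'
            LIMIT 1
            FOR UPDATE SKIP LOCKED)
RETURNING id;

-- ... do the work ...

INSERT INTO result (id) VALUES (:claimed_id);
COMMIT;
```

The state change and the result INSERT commit atomically, so a crash mid-work rolls both back and the job becomes
claimable again.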



RE: Row locks, SKIP LOCKED, and transactions

From
Steven Winfield
Date:
>> * I observe this even if I crank up the transaction isolation level to repeatable read and serializable.
>>
>>
>> I'm wondering if row locks are not obeying the same transactional semantics as row data,


>Gotta believe it is this:
>
>https://www.postgresql.org/docs/11/transaction-iso.html#XACT-READ-COMMITTED
>
>"UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands ..."
>
>If I read correctly, transactions can see the effects of other
>transactions that commit during their lifetime.

Thanks. I had a look at those docs when I first encountered the issue (if it can be called that), which prompted me to
try repeatable read and serializable isolation levels, but to no avail. I couldn't find anything specifically mentioning
the visibility of row locks at different isolation levels.




RE: Row locks, SKIP LOCKED, and transactions

From
Steven Winfield
Date:
> Huh.  SERIALIZABLE shouldn't allow two transactions to see no result row
> for a given ID and then insert a result row for that ID.  One of those
> transactions should have to roll back, because otherwise it'd be
> incompatible with both serial orderings of the two transactions.

Sorry for the misunderstanding - I wasn't suggesting that.
Even at the serializable level, W2 can see a row that is unlocked by W1's commit despite W2's snapshot being taken
before W1 commits.
 
Carrying on my example, W2 would indeed fail to insert a result(id=1) row.

> Conceptually, the thing you really need to lock for this to work is the
> result row that isn't there yet, so that some overlapping transaction
> doesn't try to lock the same absent thing.  Unfortunately, our system for
> locking things that aren't there isn't there either.
> Some articles on serializability talk about "materialising the conflict",
> which means locking some other surrogate thing that "covers" a gap you are
> interested in.  You might think the job row would do the trick, but since
> we don't recheck the condition (that is, recheck that there is no
> corresponding result because you don't update the job row), no cigar. 

I like the concept of "materialising the conflict", that’s a useful way of thinking about it - thanks.

> You could also use plain old pg_try_advisory_xact_lock(id), because it just
> locks integers, and they're always there.

Yeah, I tried this, and might give it another go. A naïve attempt failed for a similar reason.

> 
> SERIALIZABLE deals with that type of magic internally (it locks gaps in
> key ranges by predicate-locking a physical btree or hash page that you'd
> need to write on to insert a row with a matching key, which is how it
> discovers a conflict between one transaction that went looking for key=42
> but didn't find it and another that later writes key=42), but, as
> mentioned, SERIALIZABLE doesn't really allow concurrency with this
> workload, and you specified that you wanted concurrency with SKIP LOCKED
> (but I think you'd have the same problem without it; SKIP LOCKED just gets
> you the wrong answer faster).
> 
> There are various ways you could deal with this, but I'd probably go for a
> simple scheme where you only have to consult a single row to know if you
> can claim it.  You could still put the results into a separate table, but
> use job.state to find work, and set it to DONE when you insert the result.
> It may also be possible to add no new columns but do a dummy update to the
> job row to get the join qual rechecked, but I'm not sure if that'd work.
> Another reason to add a state column to the job table is so that you can
> put a conditional index on it so you can find jobs to be done very
> quickly, if you're not planning to remove the ones that are done.

Thanks. I rejected the idea of doing a dummy update to the locked row as I wanted to avoid too much extra WAL - the
real table originally had quite a few more columns than the toy example, but it's much slimmer now so this could be a
viable option.
 



Re: Row locks, SKIP LOCKED, and transactions

From
Tom Lane
Date:
Steven Winfield <Steven.Winfield@cantabcapital.com> writes:
>> There are various ways you could deal with this, but I'd probably go for a
>> simple scheme where you only have to consult a single row to know if you
>> can claim it.  You could still put the results into a separate table, but
>> use job.state to find work, and set it to DONE when you insert the result.
>> It may also be possible to add no new columns but do a dummy update to the
>> job row to get the join qual rechecked, but I'm not sure if that'd work.
>> Another reason to add a state column to the job table is so that you can
>> put a conditional index on it so you can find jobs to be done very
>> quickly, if you're not planning to remove the ones that are done.

> Thanks. I rejected the idea of doing a dummy update to the locked row as I wanted to avoid too much extra WAL - the
> real table originally had quite a few more columns than the toy example, but it's much slimmer now so this could be a
> viable option.

Yeah ... the fundamental reason why this isn't working for you is that
the FOR UPDATE will only lock/check conflicts in the "job" table.
You could add a FOR UPDATE in the sub-select to lock the "result" table,
but that will still only lock rows it read, not rows it didn't read
because they weren't there yet :-(.  Updating the state of the job row
to show that it's claimed is much the most reliable way to fix this.

(Or you could use serializable mode, but that feels like using a hammer
to swat a fly.)

            regards, tom lane



RE: Row locks, SKIP LOCKED, and transactions

From
Steven Winfield
Date:
> (Or you could use serializable mode, but that feels like using a hammer to swat a fly.)

Do you mean the serializable transaction isolation level? Because that doesn't work either. Here (finally) is a tiny
repro case. You'll need 2 psql sessions (S1, S2):

S1: CREATE TABLE t (id integer);
S1: INSERT INTO t VALUES (1);
S1: BEGIN;
S1: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
S1: SELECT id FROM t FOR UPDATE;

(So now there is a single, globally visible row that S1 has a lock on)

S2: BEGIN;
S2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
S2: SELECT id FROM t;  -- returns 1 row, as expected
S2: SELECT id FROM t FOR UPDATE SKIP LOCKED; -- returns 0 rows, as expected

S1: ROLLBACK;  -- S1's row lock is released

S2: SELECT id FROM t FOR UPDATE SKIP LOCKED; -- returns 1 row

...i.e. the row appears unlocked to S2 despite its transaction's snapshot being taken before the lock was released.


I'm going to use the suggestions made by you and others previously in this thread, so (for me at least) this is now
just academic, but I'm still interested to know if the above behaviour is expected, and if I should have been able to
deduce it from the docs. The best I could find is:

https://www.postgresql.org/docs/11/sql-select.html
"With SKIP LOCKED, any selected rows that cannot be immediately locked are skipped. Skipping locked rows provides an
inconsistent view of the data, so this is not suitable for general purpose work, but can be used to avoid lock
contention with multiple consumers accessing a queue-like table."

Thanks for your (and everyone else's) help,

Steve.