Thread: Delete from locking ordering differences

Delete from locking ordering differences

From
Peter Hendriks
Date:
We are wondering if anyone can explain the difference we are having in production with the following queries:

DELETE FROM store
WHERE id IN (
  SELECT id FROM store 
  FOR UPDATE SKIP LOCKED 
  ORDER BY ID
  LIMIT 1000
)
RETURNING id, payload

This query is sometimes executed with high concurrency, and then can hang indefinitely, we assume because of a locking problem that postgresql is not detecting as a deadlock.

This alternative query does not have the hanging problem:

WITH store_ids AS (
  SELECT id FROM store 
  FOR UPDATE SKIP LOCKED 
  ORDER BY ID
  LIMIT 1000
)
DELETE FROM store s
USING store_ids si 
WHERE s.id = si.id
RETURNING s.id, s.payload

Can anyone explain why the first query is expected to fail (hang), and the second query does not have this problem? We would be interested in more understanding on this. Thanks! 

Re: Delete from locking ordering differences

From
Rob Sargent
Date:
On 9/23/22 01:31, Peter Hendriks wrote:
We are wondering if anyone can explain the difference we are having in production with the following queries:

DELETE FROM store
WHERE id IN (
  SELECT id FROM store 
  FOR UPDATE SKIP LOCKED 
  ORDER BY ID
  LIMIT 1000
)
RETURNING id, payload

This query is sometimes executed with high concurrency, and then can hang indefinitely, we assume because of a locking problem that postgresql is not detecting as a deadlock.

This alternative query does not have the hanging problem:

WITH store_ids AS (
  SELECT id FROM store 
  FOR UPDATE SKIP LOCKED 
  ORDER BY ID
  LIMIT 1000
)
DELETE FROM store s
USING store_ids si 
WHERE s.id = si.id
RETURNING s.id, s.payload

Can anyone explain why the first query is expected to fail (hang), and the second query does not have this problem? We would be interested in more understanding on this. Thanks! 

The "optimizatin fence" nature of CTEs appears to be a win in this case.  Why the "order by"?  I assume these are down within a transaction?

Re: Delete from locking ordering differences

From
Peter Hendriks
Date:
Yes, we do not really care about order, but to prevent locking issues. Multiple transactions may run this query at the same time. It should never contend because of the skip locked, so maybe we should try this without an order by too. We can not get this query to fail in test so far, just in production, wo we are bit hesitant to change now that it is finally working... Our guess is to why this makes such a difference is that the delete statement in postgres does not guarantee ordering, so maybe the optimizer makes different choices than expected. Asking the question here, so maybe someone with more understanding can explain why we need the CTE.

Op vr 23 sep. 2022 om 09:31 schreef Peter Hendriks <peter@mindloops.nl>:
We are wondering if anyone can explain the difference we are having in production with the following queries:

DELETE FROM store
WHERE id IN (
  SELECT id FROM store 
  FOR UPDATE SKIP LOCKED 
  ORDER BY ID
  LIMIT 1000
)
RETURNING id, payload

This query is sometimes executed with high concurrency, and then can hang indefinitely, we assume because of a locking problem that postgresql is not detecting as a deadlock.

This alternative query does not have the hanging problem:

WITH store_ids AS (
  SELECT id FROM store 
  FOR UPDATE SKIP LOCKED 
  ORDER BY ID
  LIMIT 1000
)
DELETE FROM store s
USING store_ids si 
WHERE s.id = si.id
RETURNING s.id, s.payload

Can anyone explain why the first query is expected to fail (hang), and the second query does not have this problem? We would be interested in more understanding on this. Thanks! 


--
Met vriendelijke groet,

Peter Hendriks
Mindloops B.V.

T. +31 (0)6 37 23 26 73
E. peter@mindloops.nl
KvK: 85061921