Thread: BUG #17686: SELECT pg_advisory_lock(...) with low lock_timeout sometimes times out but the lock is acquired
BUG #17686: SELECT pg_advisory_lock(...) with low lock_timeout sometimes times out but the lock is acquired
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17686 Logged by: Mike Adelson Email address: mike.adelson314@gmail.com PostgreSQL version: 12.2 Operating system: Windows 10 Description: I am trying to use the pg_advisory_lock function in combination with setting lock_timeout to wait on the lock with a timeout. Here is my query: ``` SET LOCAL statement_timeout = 0; SET LOCAL lock_timeout = 200; SELECT pg_catalog.pg_advisory_lock(12345) ``` I'm finding that with relatively small values of lock_timeout and when the system is under load (e.g. 8 connections acquiring concurrently), I will encounter a case where the query exits with state 55P03 (lock_not_available) and yet the lock was actually acquired (I can tell it has been acquired by querying pg_locks and because other connections' calls to pg_advisory_lock block). Here's a standalone repro application which reliably recreates the behavior for me: https://github.com/Tzachi009/distributed-locks-issue More discussion here: https://github.com/madelson/DistributedLock/issues/147
Re: BUG #17686: SELECT pg_advisory_lock(...) with low lock_timeout sometimes times out but the lock is acquired
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > I'm finding that with relatively small values of lock_timeout and when the > system is under load (e.g. 8 connections acquiring concurrently), I will > encounter a case where the query exits with state 55P03 (lock_not_available) > and yet the lock was actually acquired (I can tell it has been acquired by > querying pg_locks and because other connections' calls to pg_advisory_lock > block). Yeah, there's a fairly basic race condition there, which is that the lock might get granted to us immediately after the timeout fires. In ordinary usage this isn't very problematic because the lock would get released anyway during the transaction abort resulting from the timeout error. However, when you're asking for a session-level advisory lock, that doesn't happen. I spent a little time studying whether there's a way to close the race, but if it's possible at all it'd take major restructuring of what's already quite complex and delicate code. I can't get excited about putting such effort into it, because related problems will exist no matter what: if you get an error from the pg_advisory_lock command, was the lock granted before that error occurred? I don't think there's any bulletproof way to deal with that except to check the lock status afterwards (e.g., via pg_locks). It's very much like the inherent issue with errors late in COMMIT --- there's an atomic point where the commit is logged, but the error report isn't going to be very clear about whether we got past that, if indeed the error report gets to the client at all. regards, tom lane
Re: BUG #17686: SELECT pg_advisory_lock(...) with low lock_timeout sometimes times out but the lock is acquired
From
Tom Lane
Date:
Mike Adelson <mike.adelson314@gmail.com> writes: > Glad to have the race condition confirmed. In my view the “real” fix > here would be to offer an advisory locking function that accepts a > timeout. How would that remove the low-level race condition? regards, tom lane