Hi,
On Sep 29, 2022, 00:56 +0800, Glen Mailer <glen@geckoboard.com>, wrote:
Hello everyone
I believe I've run into a bug in the behaviour of SKIP LOCKED, where I have a program that implements a queue with concurrent workers SELECTing work from some shared tables.
The code in question does a LEFT JOIN across two tables with a FOR UPDATE on the left table and a SKIP LOCKED clause, and then UPDATEs or INSERTs rows into the table on the right side of the JOIN in a way that causes subsequent executions of the same query to no longer match those rows. However, when run concurrently I'm seeing the same row being selected by multiple workers - which shouldn't be possible based on my understanding of the relevant semantics of these operations. Perhaps I'm just holding it wrong, but I would have expected the FOR UPDATE lock on the left table to be sufficient to avoid overlapping results.
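For readers following along without the repository, the query shape described above might look roughly like the sketch below. The table names (jobs, job_results), the columns, and the literal id are hypothetical placeholders chosen for illustration. They are not taken from the linked reproduction, which may differ in its details.

```sql
-- Hypothetical schema: all names here are illustrative assumptions,
-- not the tables used in the linked reproduction.
CREATE TABLE jobs (
    id bigint PRIMARY KEY
);
CREATE TABLE job_results (
    job_id bigint PRIMARY KEY REFERENCES jobs (id),
    done   boolean NOT NULL
);
INSERT INTO jobs (id) VALUES (1);

-- Each worker runs the following in its own transaction.
BEGIN;

-- Pick one job that has no completed result yet. Only the row from the
-- left table (jobs) is locked; rows locked by other workers are skipped.
SELECT j.id
FROM jobs AS j
LEFT JOIN job_results AS r ON r.job_id = j.id
WHERE r.job_id IS NULL OR NOT r.done
LIMIT 1
FOR UPDATE OF j SKIP LOCKED;

-- Record the outcome in the right table so that the job no longer
-- matches the SELECT above. The literal 1 stands in for the id
-- returned by the SELECT.
INSERT INTO job_results (job_id, done)
VALUES (1, true)
ON CONFLICT (job_id) DO UPDATE SET done = EXCLUDED.done;

COMMIT;
```

Running the transaction from two or more sessions at the same time is the kind of concurrent execution under which the overlapping results are reported.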
I have extracted a fairly minimal reproducing case from our production code, which includes some Go code as a test harness to run the queries concurrently enough to demonstrate the problem - this can be found at https://github.com/glenjamin/postgres-skip-locked-surprise
I wasn't sure how much detail from that reproducing case to repeat in this email, so I've only gone with an outline of the observed and expected behaviour - but I can try and add more detail to this thread if desired
Cheers
Glen
According to the documentation:
With SKIP LOCKED, any selected rows that cannot be immediately locked are skipped. Skipping locked rows provides an inconsistent view of the data, so this is not suitable for general purpose work, but can be used to avoid lock contention with multiple consumers accessing a queue-like table.
this can be found at https://github.com/glenjamin/postgres-skip-locked-surprise
Also, a Go script is not a convenient way for hackers to reproduce the problem. Could you provide steps that reproduce the bug reliably, if it really is one?
Regards,
Zhang Mingli