Re: SELECT FOR UPDATE on rows that don't exist - Mailing list pgsql-novice

From David G. Johnston
Subject Re: SELECT FOR UPDATE on rows that don't exist
Date
Msg-id CAKFQuwZ4=c7J-g5=t2veA2-w5tTi3oER34HJvYQ3QsaqSNqEig@mail.gmail.com
Whole thread Raw
In response to SELECT FOR UPDATE on rows that don't exist  (Joe <symphony.red+pg@gmail.com>)
Responses Re: SELECT FOR UPDATE on rows that don't exist  (Joe <symphony.red+pg@gmail.com>)
List pgsql-novice
On Wed, Mar 23, 2022 at 12:52 PM Joe <symphony.red+pg@gmail.com> wrote:
1) What are the semantics of SELECT FOR UPDATE when the row doesn't exist yet?

You've informed the system you are going to be updating rows on the table but as yet have not given it specific rows to protect.

Without #*1, a simple experiment shows that two processes can be in the critical section at the same time. Add #*1 seems to achieve the desired behavior, but is it really?

The index will not allow duplicates to be inserted and the first one to try forces all other potential insertions to wait until the first one commits;

"Short-term share/exclusive page-level locks are used for read/write access. Locks are released immediately after each index row is fetched or inserted. These index types provide the highest concurrency without deadlock conditions."

But, once the insertion happens the second transaction sees the potentially conflicting record and so continues waiting to see whether it commits with the conflict in place (at which point it returns the duplicate key error) or not (in which case one of them then gets to proceed with their insertion).  This manifests as a lock on the transactionid of the first session as is most easily seen in in pg_stat_activity wait_event/wait_event_type fields.

I picked up the last point through my own experimentation, if there is documentation someone else will hopefully point it out.


2) The DELETE @ #2 is so that the row is never present when not executing in the critical section mainly so that #1 can be a simple insert rather than an upsert. Is there a more standard pattern for this?

Not sure.  But while definitely not standard, and somewhat limited, there is the advisory locks feature.



3)  Using the DB as a distributed mutex seems like a common application but nothing came up in various DB and PostgreSQL books I consulted or on the web. Is this a bad idea, or are there gotchas I'm missing?


For a job queue typically you'd fire off an update to claim the work - ensuring only one session gets the assignment - and another to indicate completion.  The "critical section" is usually so long as to be harmful to the database to be performed while holding a transaction open specifically for locking purposes.  I have to imagine there are better tools, especially in-memory ones, designed to handle distributed process coordination. That said, the goodness or badness of doing it in the database heavily depends on knowing more than just "distributed mutex".  The critical section details, not just existence, factors into such a subjective evaluation.  The fact you have a functioning database does make things appealing.

David J.

pgsql-novice by date:

Previous
From: Joe
Date:
Subject: SELECT FOR UPDATE on rows that don't exist
Next
From: Joe
Date:
Subject: Re: SELECT FOR UPDATE on rows that don't exist