Re: Logical locking beyond pg_advisory - Mailing list pgsql-general

From Chris Travers
Subject Re: Logical locking beyond pg_advisory
Date
Msg-id CAKt_ZfuTOqv3daC9k-Pymtg+WqsDJEfN9ZwYdCh4UDz=2jFO7Q@mail.gmail.com
Whole thread Raw
In response to Re: Logical locking beyond pg_advisory  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Logical locking beyond pg_advisory  (marcelo <marcelo.nicolet@gmail.com>)
List pgsql-general


On Mon, Sep 17, 2018 at 5:09 PM Merlin Moncure <mmoncure@gmail.com> wrote:
On Sun, Sep 16, 2018 at 3:53 PM marcelo <marcelo.nicolet@gmail.com> wrote:
>
> I need a mechanism of "logical locking" more ductile than the pg_advisory family.
> I'm thinking of a table ("lock_table") that would be part of the database, with columns
> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks after a certain time
>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to lock over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock the table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask if the table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool - similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) bool - similar to pg_advisory_unlock
> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if the lock is taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be after an update of the acquired column.
> Successful locks insert a new row, except the rare case of a timeout, which becomes an update (ownerid and acquired)
> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?
> b) What are the consequences of the large number of inserts and deletions
> c) Performance. In fact, pg_advisory* implies a network roundtrip, but (I think) no table operations.

Why can't you use the advisory lock functions?  The challenge with
manually managed locks are they they are slow and you will lose the
coordination the database provides you.  For example, if your
application crashes you will have to clean up all held locks yourself.
Building out that infrastructure will be difficult.

First, I think in an ideal world, you wouldn't handle this problem with either approach but sometimes you have to.

I have done both approaches actually.  LedgerSMB uses its own lock table because locks have to persist across multiple HTTP requests and we have various automatic cleanup processes.

When I was working on  the queue management stuff at Novozymes we used advisory locks extensively.

These two approaches have serious downsides:
1.  Lock tables are *slow* and require careful thinking through cleanup scenarios.  In LedgerSMB we tied to the application session with an ON DELETE event that would unlock the row.  We estimated that for every 2 seconds that the db spent doing useful work, it spent 42 seconds managing the locks.....  Additionally the fact that locks take effect on snapshot advance is a problem here.

2.  In my talk, "PostgreSQL at 10TB and  Beyond" I talk about a problem we had using advisory locks for managing rows that were being processed for deletion.  Since the deletion was the scan for items at the head of an index, under heavy load we could spend long enough checking dead rows that the locks could go away with our snapshot failing to advance.  This would result in duplicate processing.  So the fact that advisory locks don't really follow snapshot semantics is a really big problem here since it means you can have race conditions in advisory locks that can't happen with other locking issues.  I still love advisory locks but they are not a good tool for this.

The real solution most of the time is actually to lock the rows by selecting FOR UPDATE and possibly SKIP LOCKED.  The way update/delete row locking in PostgreSQL works is usually good enough except in a few rare edge cases.  Only in *very rare* cases do lock tables or advisory locks make sense for actual row processing.

merlin



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

pgsql-general by date:

Previous
From: Dmitri Maziuk
Date:
Subject: Re: Code of Conduct plan
Next
From: "Joshua D. Drake"
Date:
Subject: Re: Code of Conduct plan