Thread: Is this a reasonable use for advisory locks?

Is this a reasonable use for advisory locks?

From
Perryn Fowler
Date:
Hi there,

We have identified a problem that we think advisory locks could help with, but we wanted to get some advice on whether its a good idea to use them this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
       - For a customer:
             - sum a ledger of transactions
             - if the result shows that money is owed:
                    - charge a credit card (via a call to an external api)
                    - if the charge is successful, insert a transaction into the ledger

We would like to serialise execution of this code on a per customer basis, so that
we do not double charge their credit card if execution happens concurrently.

We are considering taking an advisory lock using the customer id to accomplish this.

OUR CONCERNS
      - The fact that the key for an advisory lock is an integer makes us wonder if this is designed for taking locks per process type, rather than per record (like a customer)
      - Is it a bad idea to hold an advisory lock while an external api call happens? Should the locks be shorter lived?
      - The documentation notes that these locks live in a memory pool and that 'care should be taken not to exhaust this memory'. What are the implications if it is exhausted? (Eg will the situation recover once locks are released?). Are there established patterns for detecting and preventing this situation?
      - anything else we should know?


Thanks in advance for any advice!

Cheers
Perryn

Re: Is this a reasonable use for advisory locks?

From
Steve Baldwin
Date:
Hi Perryn,

I don't know why you think advisory locks are the solution. It seems regular row locks would ensure you have exclusive access to the customer.

Maybe something like this:

begin;
select * from customer where id = $1 for update skip locked;
if the query returns no rows it means something else already has a lock on the customer so rollback and exit
otherwise call the external api (assume synchronous)
if successful insert a row into the ledger table and commit else rollback

There are some tricky aspects to this but nothing that can be helped by advisory locks over row locks. For example, if the external call takes too long and you time out, or your network connection drops, how do you know whether or not it was successful? You also need to work out what happens if the insert into the ledger fails. If you haven't already, maybe check out the 'saga' pattern.

Cheers,

Steve

On Thu, Apr 14, 2022 at 5:11 PM Perryn Fowler <perryn@fresho.com> wrote:
Hi there,

We have identified a problem that we think advisory locks could help with, but we wanted to get some advice on whether its a good idea to use them this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
       - For a customer:
             - sum a ledger of transactions
             - if the result shows that money is owed:
                    - charge a credit card (via a call to an external api)
                    - if the charge is successful, insert a transaction into the ledger

We would like to serialise execution of this code on a per customer basis, so that
we do not double charge their credit card if execution happens concurrently.

We are considering taking an advisory lock using the customer id to accomplish this.

OUR CONCERNS
      - The fact that the key for an advisory lock is an integer makes us wonder if this is designed for taking locks per process type, rather than per record (like a customer)
      - Is it a bad idea to hold an advisory lock while an external api call happens? Should the locks be shorter lived?
      - The documentation notes that these locks live in a memory pool and that 'care should be taken not to exhaust this memory'. What are the implications if it is exhausted? (Eg will the situation recover once locks are released?). Are there established patterns for detecting and preventing this situation?
      - anything else we should know?


Thanks in advance for any advice!

Cheers
Perryn

Re: Is this a reasonable use for advisory locks?

From
Perryn Fowler
Date:
Hi Steve,

Thanks for your thoughts!

I was thinking to avoid using locks on the customer rows because there is a lot of other unrelated access to that table. In particular I don’t want writes to that table queueing up behind this process.

However, does the fact that you are suggesting  row locks mean you think advisory locks are a unsuitable?

(Thanks for the mention of network issues, but I am confident that we have appropriate mechanisms in place to ensure fault tolerant and idempotent processing - I’m specifically wanting to address  the race condition)

Cheers
Perryn

On Thu, 14 Apr 2022 at 6:38 pm, Steve Baldwin <steve.baldwin@gmail.com> wrote:
Hi Perryn,

I don't know why you think advisory locks are the solution. It seems regular row locks would ensure you have exclusive access to the customer.

Maybe something like this:

begin;
select * from customer where id = $1 for update skip locked;
if the query returns no rows it means something else already has a lock on the customer so rollback and exit
otherwise call the external api (assume synchronous)
if successful insert a row into the ledger table and commit else rollback

There are some tricky aspects to this but nothing that can be helped by advisory locks over row locks. For example, if the external call takes too long and you time out, or your network connection drops, how do you know whether or not it was successful? You also need to work out what happens if the insert into the ledger fails. If you haven't already, maybe check out the 'saga' pattern.

Cheers,

Steve

On Thu, Apr 14, 2022 at 5:11 PM Perryn Fowler <perryn@fresho.com> wrote:
Hi there,

We have identified a problem that we think advisory locks could help with, but we wanted to get some advice on whether its a good idea to use them this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
       - For a customer:
             - sum a ledger of transactions
             - if the result shows that money is owed:
                    - charge a credit card (via a call to an external api)
                    - if the charge is successful, insert a transaction into the ledger

We would like to serialise execution of this code on a per customer basis, so that
we do not double charge their credit card if execution happens concurrently.

We are considering taking an advisory lock using the customer id to accomplish this.

OUR CONCERNS
      - The fact that the key for an advisory lock is an integer makes us wonder if this is designed for taking locks per process type, rather than per record (like a customer)
      - Is it a bad idea to hold an advisory lock while an external api call happens? Should the locks be shorter lived?
      - The documentation notes that these locks live in a memory pool and that 'care should be taken not to exhaust this memory'. What are the implications if it is exhausted? (Eg will the situation recover once locks are released?). Are there established patterns for detecting and preventing this situation?
      - anything else we should know?


Thanks in advance for any advice!

Cheers
Perryn

Re: Is this a reasonable use for advisory locks?

From
Steve Baldwin
Date:
Ok, so you want to allow _other_ updates to a customer while this process is happening? In that case, advisory locks will probably work. The only consideration is that the 'id' is a bigint. If your customer id maps to that, great. If not (for example we use UUID's), you will need some way to convert that id to a bigint.

Cheers,

Steve

On Thu, Apr 14, 2022 at 7:06 PM Perryn Fowler <perryn@fresho.com> wrote:
Hi Steve,

Thanks for your thoughts!

I was thinking to avoid using locks on the customer rows because there is a lot of other unrelated access to that table. In particular I don’t want writes to that table queueing up behind this process.

However, does the fact that you are suggesting  row locks mean you think advisory locks are a unsuitable?

(Thanks for the mention of network issues, but I am confident that we have appropriate mechanisms in place to ensure fault tolerant and idempotent processing - I’m specifically wanting to address  the race condition)

Cheers
Perryn

On Thu, 14 Apr 2022 at 6:38 pm, Steve Baldwin <steve.baldwin@gmail.com> wrote:
Hi Perryn,

I don't know why you think advisory locks are the solution. It seems regular row locks would ensure you have exclusive access to the customer.

Maybe something like this:

begin;
select * from customer where id = $1 for update skip locked;
if the query returns no rows it means something else already has a lock on the customer so rollback and exit
otherwise call the external api (assume synchronous)
if successful insert a row into the ledger table and commit else rollback

There are some tricky aspects to this but nothing that can be helped by advisory locks over row locks. For example, if the external call takes too long and you time out, or your network connection drops, how do you know whether or not it was successful? You also need to work out what happens if the insert into the ledger fails. If you haven't already, maybe check out the 'saga' pattern.

Cheers,

Steve

On Thu, Apr 14, 2022 at 5:11 PM Perryn Fowler <perryn@fresho.com> wrote:
Hi there,

We have identified a problem that we think advisory locks could help with, but we wanted to get some advice on whether its a good idea to use them this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
       - For a customer:
             - sum a ledger of transactions
             - if the result shows that money is owed:
                    - charge a credit card (via a call to an external api)
                    - if the charge is successful, insert a transaction into the ledger

We would like to serialise execution of this code on a per customer basis, so that
we do not double charge their credit card if execution happens concurrently.

We are considering taking an advisory lock using the customer id to accomplish this.

OUR CONCERNS
      - The fact that the key for an advisory lock is an integer makes us wonder if this is designed for taking locks per process type, rather than per record (like a customer)
      - Is it a bad idea to hold an advisory lock while an external api call happens? Should the locks be shorter lived?
      - The documentation notes that these locks live in a memory pool and that 'care should be taken not to exhaust this memory'. What are the implications if it is exhausted? (Eg will the situation recover once locks are released?). Are there established patterns for detecting and preventing this situation?
      - anything else we should know?


Thanks in advance for any advice!

Cheers
Perryn

Re: Is this a reasonable use for advisory locks?

From
Nick Cleaton
Date:
On Thu, 14 Apr 2022 at 10:47, Steve Baldwin <steve.baldwin@gmail.com> wrote:
Ok, so you want to allow _other_ updates to a customer while this process is happening? In that case, advisory locks will probably work. The only consideration is that the 'id' is a bigint. If your customer id maps to that, great. If not (for example we use UUID's), you will need some way to convert that id to a bigint.

Alternatively, create a new table that records the start timestamp of the most recent run of your code block for each customer, and update that as the first action in your transaction. Then row locks on that table will protect you from concurrent runs.

Re: Is this a reasonable use for advisory locks?

From
Perryn Fowler
Date:

Hey Nick, 

Thanks! Yep that’s an alternative (together with a uniqueness constraint and retry mechanism)

I guess what I’m trying to get my head around is whether and why this would be better than using advisory locks…

Cheers
Perryn 

On Thu, 14 Apr 2022 at 10:32 pm, Nick Cleaton <nick@cleaton.net> wrote:
On Thu, 14 Apr 2022 at 10:47, Steve Baldwin <steve.baldwin@gmail.com> wrote:
Ok, so you want to allow _other_ updates to a customer while this process is happening? In that case, advisory locks will probably work. The only consideration is that the 'id' is a bigint. If your customer id maps to that, great. If not (for example we use UUID's), you will need some way to convert that id to a bigint.

Alternatively, create a new table that records the start timestamp of the most recent run of your code block for each customer, and update that as the first action in your transaction. Then row locks on that table will protect you from concurrent runs.

Re: Is this a reasonable use for advisory locks?

From
Michael Lewis
Date:
How many of these processes do you expect to have running concurrently? How long does that API call take? Might it be better to update the customer (or in a separate table as suggested) as "catch up charge process started at" and then clear that or set completed time in another column to serialize? That way, no need to hold that db connection while doing external work via api.