Is this a reasonable use for advisory locks?

Started by Perryn Fowleralmost 4 years ago7 messagesgeneral

perryn@fresho.com

almost 4 years ago

Hi there,

We have identified a problem that we think advisory locks could help with,
but we wanted to get some advice on whether its a good idea to use them
this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
- For a customer:
- sum a ledger of transactions
- if the result shows that money is owed:
- charge a credit card (via a call to an external api)
- if the charge is successful, insert a
transaction into the ledger

We would like to serialise execution of this code on a per customer basis,
so that
we do not double charge their credit card if execution happens concurrently.

We are considering taking an advisory lock using the customer id to
accomplish this.

OUR CONCERNS
- The fact that the key for an advisory lock is an integer makes us
wonder if this is designed for taking locks per process type, rather than
per record (like a customer)
- Is it a bad idea to hold an advisory lock while an external api
call happens? Should the locks be shorter lived?
- The documentation notes that these locks live in a memory pool and
that 'care should be taken not to exhaust this memory'. What are the
implications if it is exhausted? (Eg will the situation recover once locks
are released?). Are there established patterns for detecting and preventing
this situation?
- anything else we should know?

Thanks in advance for any advice!

Cheers
Perryn

steve.baldwin@gmail.com

almost 4 years ago

In reply to: Perryn Fowler (#1)

Re: Is this a reasonable use for advisory locks?

Hi Perryn,

I don't know why you think advisory locks are the solution. It seems
regular row locks would ensure you have exclusive access to the customer.

Maybe something like this:

begin;
select * from customer where id = $1 for update skip locked;
if the query returns no rows it means something else already has a lock on
the customer so rollback and exit
otherwise call the external api (assume synchronous)
if successful insert a row into the ledger table and commit else rollback

There are some tricky aspects to this but nothing that can be helped by
advisory locks over row locks. For example, if the external call takes too
long and you time out, or your network connection drops, how do you know
whether or not it was successful? You also need to work out what happens if
the insert into the ledger fails. If you haven't already, maybe check out
the 'saga' pattern.

Cheers,

Steve

On Thu, Apr 14, 2022 at 5:11 PM Perryn Fowler <perryn@fresho.com> wrote:

Show quoted text

Hi there,

We have identified a problem that we think advisory locks could help with,
but we wanted to get some advice on whether its a good idea to use them
this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
- For a customer:
- sum a ledger of transactions
- if the result shows that money is owed:
- charge a credit card (via a call to an external api)
- if the charge is successful, insert a
transaction into the ledger

We would like to serialise execution of this code on a per customer basis,
so that
we do not double charge their credit card if execution happens
concurrently.

We are considering taking an advisory lock using the customer id to
accomplish this.

OUR CONCERNS
- The fact that the key for an advisory lock is an integer makes us
wonder if this is designed for taking locks per process type, rather than
per record (like a customer)
- Is it a bad idea to hold an advisory lock while an external api
call happens? Should the locks be shorter lived?
- The documentation notes that these locks live in a memory pool and
that 'care should be taken not to exhaust this memory'. What are the
implications if it is exhausted? (Eg will the situation recover once locks
are released?). Are there established patterns for detecting and preventing
this situation?
- anything else we should know?

Thanks in advance for any advice!

Cheers
Perryn

perryn@fresho.com

almost 4 years ago

In reply to: Steve Baldwin (#2)

Re: Is this a reasonable use for advisory locks?

Hi Steve,

Thanks for your thoughts!

I was thinking to avoid using locks on the customer rows because there is a
lot of other unrelated access to that table. In particular I don’t want
writes to that table queueing up behind this process.

However, does the fact that you are suggesting row locks mean you think
advisory locks are a unsuitable?

(Thanks for the mention of network issues, but I am confident that we have
appropriate mechanisms in place to ensure fault tolerant and idempotent
processing - I’m specifically wanting to address the race condition)

Cheers
Perryn

On Thu, 14 Apr 2022 at 6:38 pm, Steve Baldwin <steve.baldwin@gmail.com>
wrote:

Show quoted text

Hi Perryn,

I don't know why you think advisory locks are the solution. It seems
regular row locks would ensure you have exclusive access to the customer.

Maybe something like this:

begin;
select * from customer where id = $1 for update skip locked;
if the query returns no rows it means something else already has a lock on
the customer so rollback and exit
otherwise call the external api (assume synchronous)
if successful insert a row into the ledger table and commit else rollback

There are some tricky aspects to this but nothing that can be helped by
advisory locks over row locks. For example, if the external call takes too
long and you time out, or your network connection drops, how do you know
whether or not it was successful? You also need to work out what happens if
the insert into the ledger fails. If you haven't already, maybe check out
the 'saga' pattern.

Cheers,

Steve

On Thu, Apr 14, 2022 at 5:11 PM Perryn Fowler <perryn@fresho.com> wrote:

Hi there,

We have identified a problem that we think advisory locks could help
with, but we wanted to get some advice on whether its a good idea to use
them this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
- For a customer:
- sum a ledger of transactions
- if the result shows that money is owed:
- charge a credit card (via a call to an external api)
- if the charge is successful, insert a
transaction into the ledger

We would like to serialise execution of this code on a per customer
basis, so that
we do not double charge their credit card if execution happens
concurrently.

We are considering taking an advisory lock using the customer id to
accomplish this.

OUR CONCERNS
- The fact that the key for an advisory lock is an integer makes us
wonder if this is designed for taking locks per process type, rather than
per record (like a customer)
- Is it a bad idea to hold an advisory lock while an external api
call happens? Should the locks be shorter lived?
- The documentation notes that these locks live in a memory pool
and that 'care should be taken not to exhaust this memory'. What are the
implications if it is exhausted? (Eg will the situation recover once locks
are released?). Are there established patterns for detecting and preventing
this situation?
- anything else we should know?

Thanks in advance for any advice!

Cheers
Perryn

steve.baldwin@gmail.com

almost 4 years ago

In reply to: Perryn Fowler (#3)

Re: Is this a reasonable use for advisory locks?

Ok, so you want to allow _other_ updates to a customer while this process
is happening? In that case, advisory locks will probably work. The only
consideration is that the 'id' is a bigint. If your customer id maps to
that, great. If not (for example we use UUID's), you will need some way to
convert that id to a bigint.

Cheers,

Steve

On Thu, Apr 14, 2022 at 7:06 PM Perryn Fowler <perryn@fresho.com> wrote:

Show quoted text

Hi Steve,

Thanks for your thoughts!

I was thinking to avoid using locks on the customer rows because there is
a lot of other unrelated access to that table. In particular I don’t want
writes to that table queueing up behind this process.

However, does the fact that you are suggesting row locks mean you think
advisory locks are a unsuitable?

(Thanks for the mention of network issues, but I am confident that we have
appropriate mechanisms in place to ensure fault tolerant and idempotent
processing - I’m specifically wanting to address the race condition)

Cheers
Perryn

On Thu, 14 Apr 2022 at 6:38 pm, Steve Baldwin <steve.baldwin@gmail.com>
wrote:

Hi Perryn,

I don't know why you think advisory locks are the solution. It seems
regular row locks would ensure you have exclusive access to the customer.

Maybe something like this:

begin;
select * from customer where id = $1 for update skip locked;
if the query returns no rows it means something else already has a lock
on the customer so rollback and exit
otherwise call the external api (assume synchronous)
if successful insert a row into the ledger table and commit else rollback

There are some tricky aspects to this but nothing that can be helped by
advisory locks over row locks. For example, if the external call takes too
long and you time out, or your network connection drops, how do you know
whether or not it was successful? You also need to work out what happens if
the insert into the ledger fails. If you haven't already, maybe check out
the 'saga' pattern.

Cheers,

Steve

On Thu, Apr 14, 2022 at 5:11 PM Perryn Fowler <perryn@fresho.com> wrote:

Hi there,

We have identified a problem that we think advisory locks could help
with, but we wanted to get some advice on whether its a good idea to use
them this way (and any tips, best practices or gotchas we should know about)

THE PROBLEM

We have some code that does the following
- For a customer:
- sum a ledger of transactions
- if the result shows that money is owed:
- charge a credit card (via a call to an external
api)
- if the charge is successful, insert a
transaction into the ledger

We would like to serialise execution of this code on a per customer
basis, so that
we do not double charge their credit card if execution happens
concurrently.

We are considering taking an advisory lock using the customer id to
accomplish this.

OUR CONCERNS
- The fact that the key for an advisory lock is an integer makes
us wonder if this is designed for taking locks per process type, rather
than per record (like a customer)
- Is it a bad idea to hold an advisory lock while an external api
call happens? Should the locks be shorter lived?
- The documentation notes that these locks live in a memory pool
and that 'care should be taken not to exhaust this memory'. What are the
implications if it is exhausted? (Eg will the situation recover once locks
are released?). Are there established patterns for detecting and preventing
this situation?
- anything else we should know?

Thanks in advance for any advice!

Cheers
Perryn

nick@cleaton.net

almost 4 years ago

In reply to: Steve Baldwin (#4)

Re: Is this a reasonable use for advisory locks?

On Thu, 14 Apr 2022 at 10:47, Steve Baldwin <steve.baldwin@gmail.com> wrote:

Ok, so you want to allow _other_ updates to a customer while this process
is happening? In that case, advisory locks will probably work. The only
consideration is that the 'id' is a bigint. If your customer id maps to
that, great. If not (for example we use UUID's), you will need some way to
convert that id to a bigint.

Alternatively, create a new table that records the start timestamp of the
most recent run of your code block for each customer, and update that as
the first action in your transaction. Then row locks on that table will
protect you from concurrent runs.

perryn@fresho.com

almost 4 years ago

In reply to: Nick Cleaton (#5)

Re: Is this a reasonable use for advisory locks?

Hey Nick,

Thanks! Yep that’s an alternative (together with a uniqueness constraint
and retry mechanism)

I guess what I’m trying to get my head around is whether and why this would
be better than using advisory locks…

Cheers
Perryn

On Thu, 14 Apr 2022 at 10:32 pm, Nick Cleaton <nick@cleaton.net> wrote:

Show quoted text

On Thu, 14 Apr 2022 at 10:47, Steve Baldwin <steve.baldwin@gmail.com>
wrote:

Ok, so you want to allow _other_ updates to a customer while this process
is happening? In that case, advisory locks will probably work. The only
consideration is that the 'id' is a bigint. If your customer id maps to
that, great. If not (for example we use UUID's), you will need some way to
convert that id to a bigint.

Alternatively, create a new table that records the start timestamp of the
most recent run of your code block for each customer, and update that as
the first action in your transaction. Then row locks on that table will
protect you from concurrent runs.

mlewis@entrata.com

almost 4 years ago

In reply to: Perryn Fowler (#6)

Re: Is this a reasonable use for advisory locks?

How many of these processes do you expect to have running concurrently? How
long does that API call take? Might it be better to update the customer (or
in a separate table as suggested) as "catch up charge process started at"
and then clear that or set completed time in another column to serialize?
That way, no need to hold that db connection while doing external work via
api.

Show quoted text