Re: Avoiding deadlocks when performing bulk update and delete operations - Mailing list pgsql-general

From Bill Moran
Subject Re: Avoiding deadlocks when performing bulk update and delete operations
Date
Msg-id 20141124124215.3dddf50028c3d4a24a3f12b4@potentialtech.com
Whole thread Raw
In response to Avoiding deadlocks when performing bulk update and delete operations  (Sanjaya Vithanagama <svithanagama@gmail.com>)
Responses Re: Avoiding deadlocks when performing bulk update and delete operations  (Sanjaya Vithanagama <svithanagama@gmail.com>)
List pgsql-general
On Mon, 24 Nov 2014 14:51:42 +1100
Sanjaya Vithanagama <svithanagama@gmail.com> wrote:

> Hi All,
>
> We have a single table which does not have any foreign key references.
>
> id_A (bigint)
> id_B (bigint)
> val_1 (varchar)
> val_2 (varchar)
>
> The primary key of the table is a composite of id_A and id_B.
>
> Reads and writes of this table are highly concurrent and the table has
> millions of rows. We have several stored procedures which do mass updates
> and deletes. Those stored procedures are being called concurrently mainly
> by triggers and application code.
>
> The operations usually look like the following where it could match
> thousands of records to update or delete:
>
> DELETE FROM table_name t
> USING (
>    SELECT id_A, id_B
>    FROM   table_name
>    WHERE  id_A = ANY(array_of_id_A)
>    AND    id_B = ANY(array_of_id_B)
>    ORDER  BY id_A, id_B
>    FOR    UPDATE
>    ) del
> WHERE  t.id_A = del.id_A
> AND    t.id_B = del.id_B;
>
>
> UPDATE table_name t
> SET    val_1 = 'some value'
>      , val_2 = 'some value'
> FROM (
>    SELECT id_A, id_B
>    FROM   table_name
>    WHERE  id_A = ANY(array_of_id_A)
>    AND    id_B = ANY(array_of_id_B)
>    ORDER  BY id_A, id_B
>    FOR    UPDATE
>    ) upd
> WHERE  t.id_A = upd.id_A
> AND    t.id_B = upd.id_B;
>
> We are experiencing deadlocks and all our attempts to perform operations
> with locks (row level using SELECT FOR UPDATE as used in the above queries
> and table level locks) do not seem to solve these deadlock issues. (Note
> that we cannot in any way use access exclusive locking on this table
> because of the performance impact)
>
> Is there another way that we could try to solve these deadlock situations?
> The reference manual says ? "The best defense against deadlocks is
> generally to avoid them by being certain that all applications using a
> database acquire locks on multiple objects in a consistent order."
>
> Is there a guaranteed way to do bulk update/delete operations in a
> particular order so that we can ensure deadlocks won't occur? Or are there
> any other tricks to avoid deadlocks in this situation?

Lots of stuff to say about this ...

First off, Igor has a valid point that the subselects are not helping any
and may be making the situation slightly worse. I can't see any reason not
to simiplify the queries as he suggested.

Secondly, a lot of your comments are too vague for me to understand what's
happening, so I'm going to ask a bunch of questions to clarify:

* How many UPDATE/INSERT queries are you running per second?
* How many in parallel on average?
* What's the typical execution time for an UPDATE/INSERT that might cause
  this problem?
* How frequently do deadlocks occur?
* Are there other tables involved in the transactions ... i.e., have you
  confirmed that these are the _only_ tables causing the deadlock?

Since you didn't include any log output, I'm fuzzy on some of those things
above ... but I'm assuming that you're unable to post specific details of
the precise problem.

I have a lot of suggestions, but instead of bombing you with all of them, I
think it would be better if you answered those questions, which will tell
me which suggestions are most likely to help.

--
Bill Moran
I need your help to succeed:
http://gamesbybill.com


pgsql-general by date:

Previous
From: Berend Tober
Date:
Subject: Re: How to individually list the DDL for all individual data base objects
Next
From: Olivier MATROT
Date:
Subject: Serialization exception : Who else was involved?