Thread: how to avoid deadlock on massive update with multiple deletes
Hi,
I have a table with about 10 million records. This table is updated and inserted into very often during the day (approx. 200 operations per second); at night the activity is much lower, so in the first seconds of the day (00:00:01) a batch process updates some columns (used as counters) of this table, setting their values to 0.
Yesterday, the first time this ran, I got a deadlock when another process tried to delete multiple (about 10 or 20) rows from the same table.
I think the situation may have been:
Process A (PA) (massive update)
Process B (PB) (multiple delete)
PA locks record 1, updates
PA locks record 2, updates
PA locks record 3, updates
PB locks record 4, deletes
PB locks record 5, deletes
PA tries to lock record 4, waits
PB tries to lock record 3, waits
The other possibility is that the update process, while locking rows, escalated to a page lock and then tried to escalate to a table lock while the delete process was still holding locks on some rows.
Any ideas how to prevent this situation?
Thanks!
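[Editor's note: the interleaving described above can be modeled outside the database. This is a toy Python sketch, not Postgres — plain thread locks stand in for row locks, and the names are invented — but it shows why making both jobs touch rows in the same order removes the deadlock.]

```python
import threading

# Toy stand-ins for five row locks; numbering is illustrative only.
row_locks = {i: threading.Lock() for i in range(1, 6)}

def touch_rows(rows):
    """Lock the given rows in sorted order, do the work, release.

    Because every worker acquires locks in the same (sorted) order,
    no two workers can each hold a lock the other is waiting for.
    """
    held = sorted(rows)
    for r in held:
        row_locks[r].acquire()
    try:
        pass  # the UPDATE or DELETE against those rows would go here
    finally:
        for r in reversed(held):
            row_locks[r].release()

# PA: bulk update of rows 1..5; PB: multi-row delete of rows 4, 5, 3.
pa = threading.Thread(target=touch_rows, args=([1, 2, 3, 4, 5],))
pb = threading.Thread(target=touch_rows, args=([4, 5, 3],))
pa.start(); pb.start()
pa.join(); pb.join()
print("both workers finished without deadlocking")
```

In the original scenario PB grabbed row 4 before row 3 while PA went in ascending order; with both sides sorting first, whichever worker reaches the lowest contested row first simply makes the other wait, and the wait always resolves.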
On Thu, Oct 4, 2012 at 7:01 AM, Anibal David Acosta <aa@devshock.com> wrote:
> Hi,
>
> I have a table with about 10 millions of records, this table is update and
> inserted very often during the day (approx. 200 per second), in the night
> the activity is a lot less, so in the first seconds of a day (00:00:01) a
> batch process update some columns (used like counters) of this table
> setting his value to 0.
>
> Yesterday, the first time it occurs, I got a deadlock when other process try
> to delete multiple (about 10 or 20) rows of the same table.
...
> Any ideas how to prevent this situation?

The bulk update could take an Exclusive (not Access Exclusive) lock. Or the delete could perhaps be arranged to delete the records in ctid order (although that might still deadlock). Or you could just repeat the failed transaction.

Cheers,

Jeff
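[Editor's note: on the "just repeat the failed transaction" option, a client-side retry loop is easy to sketch. Python here; `DeadlockDetected` is a stand-in for whatever error your driver raises for SQLSTATE 40P01, and `run_with_retry` is an invented helper name.]

```python
import random
import time

class DeadlockDetected(Exception):
    """Stand-in for the driver's deadlock error (SQLSTATE 40P01)."""

def run_with_retry(txn, max_attempts=5, base_delay=0.01):
    """Run a transaction callable, retrying on deadlock with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise
            # Small randomized sleep so the two contenders don't
            # immediately collide again on the same rows.
            time.sleep(base_delay * attempt * (1 + random.random()))

# Demo: fail twice with a simulated deadlock, then succeed.
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockDetected
    return "committed"

print(run_with_retry(flaky_txn))  # prints "committed" on the third try
```

The retry is safe precisely because Postgres rolls the whole failed transaction back, so re-running it starts from a clean slate.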
From: Anibal David Acosta [mailto:aa@devshock.com]
Sent: Thursday, October 04, 2012 10:01 AM
To: pgsql-performance@postgresql.org
Subject: how to avoid deadlock on massive update with multiple deletes

.....
.....
.....

The other situation could be that update process while blocking rows scale to block page and the try to scale to lock table while the delete process as some locked rows.

Thanks!

This (lock escalation from row -> to page -> to table) is an MS SQL Server "feature"; pretty sure Postgres does not do it.

Regards,
Igor Neyman
On Thu, Oct 4, 2012 at 1:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> The bulk update could take an Exclusive (not Access Exclusive) lock.
> Or the delete could perhaps be arranged to delete the records in ctid
> order (although that might still deadlock). Or you could just repeat
> the failed transaction.

How do you make pg update/delete records, in bulk, in some particular order? (ie, without issuing separate queries for each record)
Presumably something like this?:

maciek=# CREATE TABLE test AS SELECT g, random() FROM generate_series(1,1000) g;
CREATE
maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY ctid) x where x.g = test.g;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Delete on test  (cost=188.99..242.34 rows=1940 width=34)
   ->  Hash Join  (cost=188.99..242.34 rows=1940 width=34)
         Hash Cond: (x.g = public.test.g)
         ->  Subquery Scan on x  (cost=135.34..159.59 rows=1940 width=32)
               ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
                     Sort Key: public.test.ctid
                     ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
         ->  Hash  (cost=29.40..29.40 rows=1940 width=10)
               ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
(9 rows)
Maciek Sakrejda <m.sakrejda@gmail.com> writes:
> Presumably something like this?:
> maciek=# CREATE TABLE test AS SELECT g, random() FROM
> generate_series(1,1000) g;
> CREATE
> maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY
> ctid) x where x.g = test.g;

There's no guarantee that the planner won't re-sort the rows coming from the sub-select, unfortunately.

regards, tom lane
On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
> Maciek Sakrejda <m.sakrejda@gmail.com> writes:
> > Presumably something like this?:
> > maciek=# CREATE TABLE test AS SELECT g, random() FROM
> > generate_series(1,1000) g;
> > CREATE
> > maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY
> > ctid) x where x.g = test.g;
>
> There's no guarantee that the planner won't re-sort the rows coming from
> the sub-select, unfortunately.

More often than not you can prevent the planner from doing that by putting a OFFSET 0 in the query. Not 100% but better than nothing.

We really need ORDER BY for DML.

Andres
--
Andres Freund                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
>> There's no guarantee that the planner won't re-sort the rows coming from
>> the sub-select, unfortunately.

> More often than not you can prevent the planner from doing that by putting a
> OFFSET 0 in the query. Not 100% but better than nothing.

No, that will accomplish exactly nothing.  The ORDER BY is already an optimization fence.  The problem is that of the several ways the planner might choose to join the subquery output to the original table, not all will produce the join rows in the same order as the subquery's result is.  For instance, when I tried his example I initially got

 Delete on test  (cost=400.88..692.85 rows=18818 width=34)
   ->  Merge Join  (cost=400.88..692.85 rows=18818 width=34)
         Merge Cond: (test.g = x.g)
         ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
               Sort Key: test.g
               ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
         ->  Sort  (cost=265.53..270.38 rows=1940 width=32)
               Sort Key: x.g
               ->  Subquery Scan on x  (cost=135.34..159.59 rows=1940 width=32)
                     ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
                           Sort Key: test_1.ctid
                           ->  Seq Scan on test test_1  (cost=0.00..29.40 rows=1940 width=10)

which is going to do the deletes in "g" order, not ctid order; and then after an ANALYZE I got

 Delete on test  (cost=90.83..120.58 rows=1000 width=34)
   ->  Hash Join  (cost=90.83..120.58 rows=1000 width=34)
         Hash Cond: (test.g = x.g)
         ->  Seq Scan on test  (cost=0.00..16.00 rows=1000 width=10)
         ->  Hash  (cost=78.33..78.33 rows=1000 width=32)
               ->  Subquery Scan on x  (cost=65.83..78.33 rows=1000 width=32)
                     ->  Sort  (cost=65.83..68.33 rows=1000 width=10)
                           Sort Key: test_1.ctid
                           ->  Seq Scan on test test_1  (cost=0.00..16.00 rows=1000 width=10)

which is going to do the deletes in ctid order, but that's an artifact of using a seqscan on the test table; the order of the subquery's output is irrelevant, since it got hashed.

> We really need ORDER BY for DML.

Meh.  That's outside the SQL standard (not only outside the letter of the standard, but foreign to its very conceptual model) and I don't think the problem really comes up that often.  Personally, if I had to deal with this I'd use a plpgsql function (or DO command) that does

  FOR c IN SELECT ctid FROM table WHERE ... ORDER BY ... LOOP
    DELETE FROM table WHERE ctid = c;
  END LOOP;

which is not great but at least it avoids client-to-server traffic.

Having said all that, are we sure this is even a deletion-order problem?  I was wondering about deadlocks from foreign key references, for instance.

regards, tom lane
On Fri, Oct 5, 2012 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> No, that will accomplish exactly nothing.  The ORDER BY is already an
> optimization fence.  The problem is that of the several ways the planner
> might choose to join the subquery output to the original table, not all
> will produce the join rows in the same order as the subquery's result
> is.
> [...]
> which is going to do the deletes in ctid order, but that's an artifact
> of using a seqscan on the test table; the order of the subquery's output
> is irrelevant, since it got hashed.
>
>> We really need ORDER BY for DML.
>
> Meh.  That's outside the SQL standard (not only outside the letter of
> the standard, but foreign to its very conceptual model) and I don't
> think the problem really comes up that often.  Personally, if I had to
> deal with this I'd use a plpgsql function (or DO command) that does

Can't it be forced like this (assuming it is in fact a vanilla order by problem)?

EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY ctid FOR UPDATE) x where x.g = test.g;

(emphasis on 'for update')

merlin
On Fri, Oct 5, 2012 at 12:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> FOR c IN SELECT ctid FROM table WHERE ... ORDER BY ... LOOP
>   DELETE FROM table WHERE ctid = c;
> END LOOP;

Maybe, in that sense, it would be better to optimize the client-server protocol for batch operations:

PREPARE blah(c) AS DELETE FROM table WHERE ctid = $1;
EXECUTE blah(c1), blah(c2), blah(c3), ...

^ 1 transaction, 1 roundtrip, multiple queries
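[Editor's note: lacking such a protocol feature, the usual client-side compromise is one statement per batch of keys. A small Python helper, with invented names and the chunking as the only real logic; sorting the keys first also keeps concurrent jobs acquiring row locks in a consistent order, which is the point of this thread.]

```python
def key_batches(keys, batch_size=100):
    """Yield sorted chunks of keys, one chunk per round trip.

    Each chunk would feed a single parameterized statement such as
    DELETE FROM t WHERE id = ANY(%s); sorting keeps concurrent
    deleters locking rows in the same order.
    """
    ordered = sorted(keys)
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]

for chunk in key_batches([7, 3, 5, 1], batch_size=2):
    print(chunk)
# [1, 3]
# [5, 7]
```

This trades one giant statement for a few medium ones: fewer round trips than row-at-a-time deletes, but shorter lock hold times than a single multi-thousand-row DELETE.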
Merlin Moncure <mmoncure@gmail.com> writes:
> Can't it be forced like this (assuming it is in fact a vanilla order
> by problem)?
> EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY
> ctid FOR UPDATE) x where x.g = test.g;
> (emphasis on 'for update')

Hm ... yeah, that might work, once you redefine the problem as "get the row locks in a consistent order" rather than "do the updates in a consistent order".  But I'd be inclined to phrase it as

EXPLAIN DELETE FROM test USING (SELECT ctid FROM test ORDER BY g FOR UPDATE) x where x.ctid = test.ctid;

I'm not sure that "ORDER BY ctid" is really very meaningful here; think about FOR UPDATE switching its attention to updated versions of rows.

regards, tom lane
On Friday, October 05, 2012 05:46:05 PM Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
> >> There's no guarantee that the planner won't re-sort the rows coming from
> >> the sub-select, unfortunately.
> >
> > More often than not you can prevent the planner from doing that by
> > putting a OFFSET 0 in the query. Not 100% but better than nothing.
>
> No, that will accomplish exactly nothing.  The ORDER BY is already an
> optimization fence.

Yea, sorry. I was thinking of a related problem/solution.

> > We really need ORDER BY for DML.
>
> Meh.  That's outside the SQL standard (not only outside the letter of
> the standard, but foreign to its very conceptual model) and I don't
> think the problem really comes up that often.

Back when I mostly did consulting/development on client code it came up about once a week. I might have a warped view, though, because that's the kind of thing you would ask a consultant about...

> Having said all that, are we sure this is even a deletion-order
> problem?  I was wondering about deadlocks from foreign key references,
> for instance.

Absolutely not sure, no.

Andres
--
Andres Freund                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Process 1 (massive update):
update table A set column1=0, column2=0

Process 2 (multiple delete):
perform delete_row(user_name, column1, column2) from table A where user_name=YYY

The pgsql function delete_row deletes the row and does other business logic not related to table A.

-----Original Message-----
From: Claudio Freire [mailto:klaussfreire@gmail.com]
Sent: Friday, October 05, 2012 10:27 AM
To: Jeff Janes
CC: Anibal David Acosta; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] how to avoid deadlock on massive update with multiple deletes

On Thu, Oct 4, 2012 at 1:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> The bulk update could take an Exclusive (not Access Exclusive) lock.
> Or the delete could perhaps be arranged to delete the records in ctid
> order (although that might still deadlock). Or you could just repeat
> the failed transaction.

How do you make pg update/delete records, in bulk, in some particular order? (ie, without issuing separate queries for each record)