Thread: how to avoid deadlock on massive update with multiple deletes
Hi,
I have a table with about 10 million records. This table is updated and inserted into very often during the day (approx. 200 operations per second); at night the activity is much lower, so in the first seconds of the day (00:00:01) a batch process updates some columns (used as counters) of this table, setting their values to 0.
Yesterday, the first time this ran, I got a deadlock when another process tried to delete multiple (about 10 or 20) rows from the same table.
I think the situation may have been:
Process A (PA) (massive update)
Process B (PB) (multiple delete)
PA locks record 1, updates
PA locks record 2, updates
PA locks record 3, updates
PB locks record 4, deletes
PB locks record 5, deletes
PA tries to lock record 4, waits
PB tries to lock record 3, waits
The other possibility is that the update process, while locking rows, escalated to a page lock and then tried to escalate to a table lock while the delete process was still holding locks on some rows.
Any ideas how to prevent this situation?
Thanks!
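[Editor's note: the interleaving described above can be modeled outside the database. This is a toy Python sketch, not Postgres — plain thread locks stand in for row locks, and the names are invented — but it shows why making both jobs touch rows in the same order removes the deadlock.]

```python
import threading

# Toy stand-ins for five row locks; numbering is illustrative only.
row_locks = {i: threading.Lock() for i in range(1, 6)}

def touch_rows(rows):
    """Lock the given rows in sorted order, do the work, release.

    Because every worker acquires locks in the same (sorted) order,
    no two workers can each hold a lock the other is waiting for.
    """
    held = sorted(rows)
    for r in held:
        row_locks[r].acquire()
    try:
        pass  # the UPDATE or DELETE against those rows would go here
    finally:
        for r in reversed(held):
            row_locks[r].release()

# PA: bulk update of rows 1..5; PB: multi-row delete of rows 4, 5, 3.
pa = threading.Thread(target=touch_rows, args=([1, 2, 3, 4, 5],))
pb = threading.Thread(target=touch_rows, args=([4, 5, 3],))
pa.start(); pb.start()
pa.join(); pb.join()
print("both workers finished without deadlocking")
```

In the original scenario PB grabbed row 4 before row 3 while PA went in ascending order; with both sides sorting first, whichever worker reaches the lowest contested row first simply makes the other wait, and the wait always resolves.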
On Thu, Oct 4, 2012 at 7:01 AM, Anibal David Acosta <aa@devshock.com> wrote:
> Hi,
>
> I have a table with about 10 millions of records, this table is update and
> inserted very often during the day (approx. 200 per second), in the night
> the activity is a lot less, so in the first seconds of a day (00:00:01) a
> batch process update some columns (used like counters) of this table
> setting his value to 0.
>
> Yesterday, the first time it occurs, I got a deadlock when other process try
> to delete multiple (about 10 or 20) rows of the same table.
...
> Any ideas how to prevent this situation?

The bulk update could take an Exclusive (not Access Exclusive) lock. Or the delete could perhaps be arranged to delete the records in ctid order (although that might still deadlock). Or you could just repeat the failed transaction.

Cheers,

Jeff
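[Editor's note: on the "just repeat the failed transaction" option, a client-side retry loop is easy to sketch. Python here; `DeadlockDetected` is a stand-in for whatever error your driver raises for SQLSTATE 40P01, and `run_with_retry` is an invented helper name.]

```python
import random
import time

class DeadlockDetected(Exception):
    """Stand-in for the driver's deadlock error (SQLSTATE 40P01)."""

def run_with_retry(txn, max_attempts=5, base_delay=0.01):
    """Run a transaction callable, retrying on deadlock with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise
            # Small randomized sleep so the two contenders don't
            # immediately collide again on the same rows.
            time.sleep(base_delay * attempt * (1 + random.random()))

# Demo: fail twice with a simulated deadlock, then succeed.
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockDetected
    return "committed"

print(run_with_retry(flaky_txn))  # prints "committed" on the third try
```

The retry is safe precisely because Postgres rolls the whole failed transaction back, so re-running it starts from a clean slate.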
From: Anibal David Acosta [mailto:aa@devshock.com]
Sent: Thursday, October 04, 2012 10:01 AM
To: pgsql-performance@postgresql.org
Subject: how to avoid deadlock on massive update with multiple deletes

.....
.....
.....

The other situation could be that update process while blocking rows scale to block page and the try to scale to lock table while the delete process as some locked rows.

Thanks!

This (lock escalation from row -> to page -> to table) is an MS SQL Server "feature"; pretty sure Postgres does not do it.

Regards,
Igor Neyman
On Thu, Oct 4, 2012 at 1:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> The bulk update could take an Exclusive (not Access Exclusive) lock.
> Or the delete could perhaps be arranged to delete the records in ctid
> order (although that might still deadlock). Or you could just repeat
> the failed transaction.

How do you make pg update/delete records, in bulk, in some particular order? (ie, without issuing separate queries for each record)
Presumably something like this?:

maciek=# CREATE TABLE test AS SELECT g, random() FROM generate_series(1,1000) g;
CREATE
maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY ctid) x where x.g = test.g;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Delete on test  (cost=188.99..242.34 rows=1940 width=34)
   ->  Hash Join  (cost=188.99..242.34 rows=1940 width=34)
         Hash Cond: (x.g = public.test.g)
         ->  Subquery Scan on x  (cost=135.34..159.59 rows=1940 width=32)
               ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
                     Sort Key: public.test.ctid
                     ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
         ->  Hash  (cost=29.40..29.40 rows=1940 width=10)
               ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
(9 rows)
Maciek Sakrejda <m.sakrejda@gmail.com> writes:
> Presumably something like this?:
> maciek=# CREATE TABLE test AS SELECT g, random() FROM
> generate_series(1,1000) g;
> CREATE
> maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY
> ctid) x where x.g = test.g;

There's no guarantee that the planner won't re-sort the rows coming from the sub-select, unfortunately.

regards, tom lane
On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
> Maciek Sakrejda <m.sakrejda@gmail.com> writes:
> > Presumably something like this?:
> > maciek=# CREATE TABLE test AS SELECT g, random() FROM
> > generate_series(1,1000) g;
> > CREATE
> > maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY
> > ctid) x where x.g = test.g;
>
> There's no guarantee that the planner won't re-sort the rows coming from
> the sub-select, unfortunately.

More often than not you can prevent the planner from doing that by putting a OFFSET 0 in the query. Not 100% but better than nothing.

We really need ORDER BY for DML.

Andres
--
Andres Freund                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
>> There's no guarantee that the planner won't re-sort the rows coming from
>> the sub-select, unfortunately.

> More often than not you can prevent the planner from doing that by putting a
> OFFSET 0 in the query. Not 100% but better than nothing.

No, that will accomplish exactly nothing.  The ORDER BY is already an optimization fence.  The problem is that of the several ways the planner might choose to join the subquery output to the original table, not all will produce the join rows in the same order as the subquery's result is.  For instance, when I tried his example I initially got

 Delete on test  (cost=400.88..692.85 rows=18818 width=34)
   ->  Merge Join  (cost=400.88..692.85 rows=18818 width=34)
         Merge Cond: (test.g = x.g)
         ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
               Sort Key: test.g
               ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
         ->  Sort  (cost=265.53..270.38 rows=1940 width=32)
               Sort Key: x.g
               ->  Subquery Scan on x  (cost=135.34..159.59 rows=1940 width=32)
                     ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
                           Sort Key: test_1.ctid
                           ->  Seq Scan on test test_1  (cost=0.00..29.40 rows=1940 width=10)

which is going to do the deletes in "g" order, not ctid order; and then after an ANALYZE I got

 Delete on test  (cost=90.83..120.58 rows=1000 width=34)
   ->  Hash Join  (cost=90.83..120.58 rows=1000 width=34)
         Hash Cond: (test.g = x.g)
         ->  Seq Scan on test  (cost=0.00..16.00 rows=1000 width=10)
         ->  Hash  (cost=78.33..78.33 rows=1000 width=32)
               ->  Subquery Scan on x  (cost=65.83..78.33 rows=1000 width=32)
                     ->  Sort  (cost=65.83..68.33 rows=1000 width=10)
                           Sort Key: test_1.ctid
                           ->  Seq Scan on test test_1  (cost=0.00..16.00 rows=1000 width=10)

which is going to do the deletes in ctid order, but that's an artifact of using a seqscan on the test table; the order of the subquery's output is irrelevant, since it got hashed.

> We really need ORDER BY for DML.

Meh.  That's outside the SQL standard (not only outside the letter of the standard, but foreign to its very conceptual model) and I don't think the problem really comes up that often.  Personally, if I had to deal with this I'd use a plpgsql function (or DO command) that does

  FOR c IN SELECT ctid FROM table WHERE ... ORDER BY ... LOOP
    DELETE FROM table WHERE ctid = c;
  END LOOP;

which is not great but at least it avoids client-to-server traffic.

Having said all that, are we sure this is even a deletion-order problem?  I was wondering about deadlocks from foreign key references, for instance.

regards, tom lane
On Fri, Oct 5, 2012 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> No, that will accomplish exactly nothing.  The ORDER BY is already an
> optimization fence.  The problem is that of the several ways the planner
> might choose to join the subquery output to the original table, not all
> will produce the join rows in the same order as the subquery's result
> is.
> [...]
> which is going to do the deletes in ctid order, but that's an artifact
> of using a seqscan on the test table; the order of the subquery's output
> is irrelevant, since it got hashed.
>
>> We really need ORDER BY for DML.
>
> Meh.  That's outside the SQL standard (not only outside the letter of
> the standard, but foreign to its very conceptual model) and I don't
> think the problem really comes up that often.  Personally, if I had to
> deal with this I'd use a plpgsql function (or DO command) that does

Can't it be forced like this (assuming it is in fact a vanilla order by problem)?

EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY ctid FOR UPDATE) x where x.g = test.g;

(emphasis on 'for update')

merlin
On Fri, Oct 5, 2012 at 12:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> FOR c IN SELECT ctid FROM table WHERE ... ORDER BY ... LOOP
>   DELETE FROM table WHERE ctid = c;
> END LOOP;

Maybe, in that sense, it would be better to optimize the client-server protocol for batch operations:

PREPARE blah(c) AS DELETE FROM table WHERE ctid = $1;
EXECUTE blah(c1), blah(c2), blah(c3), ...

^ 1 transaction, 1 roundtrip, multiple queries
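[Editor's note: lacking such a protocol feature, the usual client-side compromise is one statement per batch of keys. A small Python helper, with invented names and the chunking as the only real logic; sorting the keys first also keeps concurrent jobs acquiring row locks in a consistent order, which is the point of this thread.]

```python
def key_batches(keys, batch_size=100):
    """Yield sorted chunks of keys, one chunk per round trip.

    Each chunk would feed a single parameterized statement such as
    DELETE FROM t WHERE id = ANY(%s); sorting keeps concurrent
    deleters locking rows in the same order.
    """
    ordered = sorted(keys)
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]

for chunk in key_batches([7, 3, 5, 1], batch_size=2):
    print(chunk)
# [1, 3]
# [5, 7]
```

This trades one giant statement for a few medium ones: fewer round trips than row-at-a-time deletes, but shorter lock hold times than a single multi-thousand-row DELETE.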
Merlin Moncure <mmoncure@gmail.com> writes:
> Can't it be forced like this (assuming it is in fact a vanilla order
> by problem)?
> EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY
> ctid FOR UPDATE) x where x.g = test.g;
> (emphasis on 'for update')

Hm ... yeah, that might work, once you redefine the problem as "get the row locks in a consistent order" rather than "do the updates in a consistent order".  But I'd be inclined to phrase it as

EXPLAIN DELETE FROM test USING (SELECT ctid FROM test ORDER BY g FOR UPDATE) x where x.ctid = test.ctid;

I'm not sure that "ORDER BY ctid" is really very meaningful here; think about FOR UPDATE switching its attention to updated versions of rows.

regards, tom lane
On Friday, October 05, 2012 05:46:05 PM Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
> >> There's no guarantee that the planner won't re-sort the rows coming from
> >> the sub-select, unfortunately.
> >
> > More often than not you can prevent the planner from doing that by
> > putting a OFFSET 0 in the query. Not 100% but better than nothing.
>
> No, that will accomplish exactly nothing.  The ORDER BY is already an
> optimization fence.

Yea, sorry. I was thinking of a related problem/solution.

> > We really need ORDER BY for DML.
>
> Meh.  That's outside the SQL standard (not only outside the letter of
> the standard, but foreign to its very conceptual model) and I don't
> think the problem really comes up that often.

Back when I mostly did consulting/development on client code it came up about once a week. I might have a warped view, though, because that's the kind of thing you would ask a consultant about...

> Having said all that, are we sure this is even a deletion-order
> problem?  I was wondering about deadlocks from foreign key references,
> for instance.

Absolutely not sure, no.

Andres
--
Andres Freund                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Process 1 (massive update):
update table A set column1=0, column2=0

Process 2 (multiple delete):
perform delete_row(user_name, column1, column2) from table A where user_name=YYY

The pgsql function delete_row deletes the row and does other business logic not related to table A.

-----Original Message-----
From: Claudio Freire [mailto:klaussfreire@gmail.com]
Sent: Friday, October 05, 2012 10:27 AM
To: Jeff Janes
CC: Anibal David Acosta; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] how to avoid deadlock on massive update with multiple deletes

On Thu, Oct 4, 2012 at 1:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> The bulk update could take an Exclusive (not Access Exclusive) lock.
> Or the delete could perhaps be arranged to delete the records in ctid
> order (although that might still deadlock). Or you could just repeat
> the failed transaction.

How do you make pg update/delete records, in bulk, in some particular order? (ie, without issuing separate queries for each record)