Thread: regression, deadlock in high-frequency single-row UPDATE
Hi,

I've discussed this problem on IRC a couple of times and think I've found a regression that plagues our application, introduced in some version newer than 9.1.9 and still present in 9.3.5. Multiple instances of the *exact* same single-row update:

    UPDATE "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag
       SET last_issued = DEFAULT
      FROM "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc
     WHERE oc.id = ag.client_id
       AND ag.entity_name = 'user'
       AND ag.entity_id = 129
       AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f'
    RETURNING ag.id;

will produce a deadlock if performed with sufficient concurrency.

I've attached a snippet of logs, all uncommented lines from postgresql.conf, and the definitions of the involved tables. The first instance of the deadlock appears at line 142 of the log; I left a bunch of lead-in in case it's relevant.

I think this is a regression because we only see the behavior under 9.3.x (reproduced locally on 9.3.4 and 9.3.5 in a VMware VM running Ubuntu 11.04, and also evident on 9.3.3 on Amazon RDS). I am unable to reproduce it on the earlier versions I've been able to test against (9.0.something and 9.1.9).

To reproduce the problem, I have to fork about 100 API calls against our application, with the results you see in the logs. The queries in the log are the *only* activity in the application and database at the time, and I have not filtered the logs at all, other than snipping to a reasonable window around the problem. I can produce more extensive logs if needed, but there really is nothing else.

I have not been able to reproduce the deadlock by making concurrent UPDATEs via what amounts to a bash fork-bomb with psql, but I suspect my methodology is too crude there. The application can spin up a large number of lightweight threads fairly quickly and presumably attains the needed level of concurrency.

Thanks,
A
On Fri, Aug 1, 2014 at 2:30 AM, Andrew Sackville-West <awest@janrain.com> wrote:
> I have not been able to reproduce the deadlock by making concurrent
> UPDATEs via what amounts to a bash fork-bomb w/ psql, but I suspect my
> methodology might be too crude there.

Another way is using pgbench. Write a script file with the SQL query in it (it must be on a single line), then launch it as:

    pgbench -f scriptfile.sql -n -c 10 dbname

Regards,
Marti
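For concreteness, the script file here need only contain the UPDATE from the original report, written on one line as pgbench requires (a sketch; schema and literal values exactly as posted above):

    -- scriptfile.sql: one statement, on a single line
    UPDATE "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag SET last_issued=DEFAULT FROM "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc WHERE oc.id = ag.client_id AND ag.entity_name = 'user' AND ag.entity_id = 129 AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f' RETURNING ag.id;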
On Fri, Aug 01, 2014 at 06:40:51PM +0300, Marti Raudsepp wrote:
> Another way is using pgbench. Write a script file with the SQL query
> in it (must be on a single line). Then launch it as:
>     pgbench -f scriptfile.sql -n -c 10 dbname

Perfect, thanks. I'll see if I can reproduce this way and follow up.

A
Andrew Sackville-West wrote:
> Hi,
>
> I've discussed this problem on irc a couple of times and think I've
> found a regression that plagues our application, introduced in some
> version newer than 9.1.9, and still present in 9.3.5.

I don't think you have provided everything: for instance, I see that your log has INSERTs into table access_tokens, which has a foreign key relationship to/from some other table you showed. I'm too lazy to unscramble the whole thing into a real schema right now. pg_dump output would be more helpful, as would some trivial sample data, enough to reproduce the problem.

It'd probably be easier to reproduce if you wrap the statements in transactions and have them sleep before COMMIT. (If you run two of these updates in parallel, the first one blocks the second until the first one commits, because they update the same row. What happens if the blocking session runs the aforementioned INSERT while the other is blocked? Perhaps three sessions are needed to show a problem without explicit transactions.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
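Spelled out, that suggestion amounts to something like the following (a sketch; the pg_sleep duration is arbitrary and simply widens the window between the UPDATE and the COMMIT):

    BEGIN;
    UPDATE "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag
       SET last_issued = DEFAULT
      FROM "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc
     WHERE oc.id = ag.client_id
       AND ag.entity_name = 'user'
       AND ag.entity_id = 129
       AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f'
    RETURNING ag.id;
    SELECT pg_sleep(0.5);  -- hold the row lock a little longer before committing
    COMMIT;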
On Thu, Jul 31, 2014 at 08:59:50PM -0400, Alvaro Herrera wrote:
> I don't think you have provided everything: for instance I see that your
> log has INSERTs into table access_tokens, which has a foreign key
> relationship to/from some other table you showed.

Ah, you're correct; I'll attach the description of that table. But, really, that is all that's going on in the database.

> I'm too lazy to unscramble the whole thing into a real schema right now.
> pg_dump output would be more helpful, as would some trivial sample data
> enough to reproduce the problem.

It's really minimal data. I'll attach a dump of the access_grants and oauth_clients tables. The access_tokens table is not empty, but I'm not sure how relevant the data is; what is shown in the log is typical. There is another table in play here, refresh_tokens, but we're not actually using it in this case -- the table is empty, and the reference to it in access_tokens is null.

> It'd probably be easier to reproduce if you wrap the statements in
> transactions and have them sleep before COMMIT.

The issue here is that, with no code changes, this works fine under very heavy load in production against 9.0.x, 9.1.9, etc., but under 9.3.{3,4,5} it fails. That code has no transaction, so I think adding a transaction is a distraction from what appears to me to be a regression. Interestingly, if I change the code to include a transaction around the UPDATE, the problem goes away (or so it seems so far in my testing).

> (If you run two of these updates in parallel, the first one blocks the
> second until the first one commits, because they update the same row. ...)

I've run pgbench from 3 machines, all pointing at the same database and running this same set of statements, and it fails to induce the deadlock.

It occurs to me that I haven't completely described the sequence of actions, so here is a snippet of the actual SQL that a given action induces, when it works as expected:

    SET standard_conforming_strings TO on;
    SET datestyle TO ISO;
    SET client_encoding TO utf8;

    SELECT "user_1"."id", "user_1"."uuid",
           "user_1"."lastUpdated" AT TIME ZONE 'UTC',
           "user_1"."id" AS _id
      FROM "z8z6px927zu6qzzbnb5ntgghxg"."user_1" AS "user_1"
     WHERE "user_1"."id" = 129;

    UPDATE "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag
       SET last_issued = DEFAULT
      FROM "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc
     WHERE oc.id = ag.client_id
       AND ag.entity_name = 'user'
       AND ag.entity_id = 129
       AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f'
    RETURNING ag.id;

    INSERT INTO "z8z6px927zu6qzzbnb5ntgghxg"."access_tokens"
           (token, access_id, refresh_token_id, issued, expiry)
    VALUES ('xys7vzs94huex2y7', 2, null,
            timezone('UTC'::text, now()),
            timezone('UTC'::text, now()) + '1:00:00'::interval);

This represents one complete action. There are several of these running concurrently, interleaved together. There is no transaction involved here. As far as I can tell, the application is using bog-standard libpq (via an FFI wrapper in Haskell) with no non-default configuration applied to the connection.

I realize there's not a lot to go on here, and I don't expect help debugging my application.
I've got a patch that seems to resolve the problem for us, but I'm suspicious of this change in behavior and hope that pointing it out can help in some way.

regards,
A
Andrew Sackville-West wrote:
> Ah you're correct, I'll attach the description of that table. But,
> really, that is all that's going on in the database.

See, access_tokens has a foreign key to access_grants. So when the insert into access_tokens happens, the process also acquires a lock on a tuple in access_grants, which should not conflict with the UPDATE; but apparently it is blocking in some cases, for unknown reasons. Note, however, that one of the original tables had a large number of foreign keys to other tables; are those necessary to reproduce the problem? I mean, if you just copy those three tables to an otherwise empty database and then run the hanging application, do you see a problem? If not, does adding all those other tables cause the problem to show up?

Note that the set of columns covered by unique indexes is very important. Your \d output does not show a unique index on the updated column AFAICS; but if it did have one, things would be completely different (because then the insert would definitely conflict with the update).

It would help a lot if you posted the table definitions as pg_dump -t output rather than \d, because that makes it much easier for me to reproduce the setup than cutting and pasting the \d output into CREATE TABLE commands.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
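That request amounts to something like this (a sketch; the database name is a placeholder, and -t may be repeated once per schema-qualified table):

    pg_dump -t z8z6px927zu6qzzbnb5ntgghxg.access_grants \
            -t z8z6px927zu6qzzbnb5ntgghxg.oauth_clients \
            -t z8z6px927zu6qzzbnb5ntgghxg.access_tokens \
            dbname > tables.sql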
Some months later (stupid business priorities...) we have something to report on this. We have been able to produce a minimal schema that demonstrates the problem, and successfully induced the deadlock using pgbench (thanks to Paulo Tanimoto, cc'd here). Please see:

    https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3

for detailed instructions on how to trigger the deadlock.

I would be thrilled to learn that we've done something wrong here; otherwise I think this represents a regression introduced in 9.3.

A

On Mon, Aug 04, 2014 at 12:30:27PM -0400, Alvaro Herrera wrote:
> It would help a lot if you posted the table definitions as pg_dump -t
> output rather than \d, because it's much easier for me to reproduce the
> setup.
On 09/12/14 12:55, Andrew Sackville-West wrote:
> Please see:
>
> https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3
>
> for detailed instructions on how to trigger the deadlock.

That is interesting - FWIW I can reproduce:

- 9.2.9: no deadlock
- 9.4rc1: many deadlocks

so something has changed after 9.2 for sure!

Just mucking about, I changed certain things in your schema setup (removed ON DELETE CASCADE, used CURRENT_TIMESTAMP instead of now(), lessened the use of DEFAULT)... however, I'm still seeing deadlocks in 9.4, so at least it is easy to reproduce!

Regards,
Mark
For whatever it's worth, I was able to bisect the problem to this commit:

    commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182
    Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
    Date:   Wed Jan 23 12:04:59 2013 -0300

        Improve concurrency of foreign key locking

depesz

On Tue, Dec 9, 2014 at 6:58 AM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
> That is interesting - FWIW I can reproduce:
>
> - 9.2.9: no deadlock
> - 9.4rc1: many deadlocks
>
> so something has changed after 9.2 for sure!
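For anyone repeating the bisection, its shape is roughly the following sketch. The endpoint tags are assumptions - any known-good 9.2 commit and known-bad 9.3 commit will do - and each step needs a rebuild plus a run of the pgbench repro:

    git bisect start
    git bisect bad REL9_3_0     # deadlock reproduces here
    git bisect good REL9_2_0    # no deadlock here
    # at each step: build and install, run the pgbench repro, then mark
    git bisect good             # or: git bisect bad
    git bisect reset            # when finished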
hubert depesz lubaczewski wrote:
> For whatever it's worth, I was able to bisect the problem to this commit:
>
>     commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182
>     Improve concurrency of foreign key locking

I'm not surprised in the least, TBH.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Dec 8, 2014 at 9:58 PM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
> Just mucking about I changed certain things in your schema setup (removed
> ON DELETE CASCADE, use CURRENT_TIMESTAMP instead of now()... lessen the
> use of DEFAULT)... however still seeing deadlocks in 9.4, so at least it
> is easy to reproduce!

Can you speak to what benefits those changes might yield? Or point to relevant docs?

thanks,
A
On Tue, Dec 9, 2014 at 6:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> hubert depesz lubaczewski wrote:
> > For whatever it's worth, I was able to bisect the problem to this commit:
> > 0ac5ad5134f2769ccbaefec73844f8504c4d6182
> > Improve concurrency of foreign key locking
>
> I'm not surprised in the least, TBH.

Thanks for your attention to this. Is there anything we're doing wrong that could improve this on current 9.3 versions?

A
On 2014-12-09 09:38:33 -0800, Andrew Sackville-West wrote:
> Thanks for your attention to this. Is there anything we're doing wrong
> that could improve this on current 9.3 versions?

I don't think anybody can seriously give you a workaround until the issue has been diagnosed further.

One thing you could try would be to create a unique index that spans all the columns in the table... That'd prevent usage of the lower lock levels added in the above commit.

Greetings,

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
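Against the reported schema, that experiment would look roughly like this (a sketch; the column list is guessed from the columns visible in this thread and must be adjusted to the real table definition):

    -- hypothetical: span every column of access_grants
    CREATE UNIQUE INDEX access_grants_all_cols
        ON "z8z6px927zu6qzzbnb5ntgghxg".access_grants
           (id, client_id, entity_name, entity_id, last_issued);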
Andrew Sackville-West wrote:
> Please see:
>
> https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3
>
> for detailed instructions on how to trigger the deadlock.

To avoid having to visit an external URL that might well cease to exist, and to simplify the test case, which still has too many extraneous details, here's a simpler version that still causes the UPDATE to deadlock:

    -- schema.sql:
    create table pktab (id int primary key, data serial not null);
    create table fktab (fk int references pktab);
    insert into pktab values (1);

    -- query.sql:
    update pktab set data = default;
    insert into fktab values (1);

This doesn't deadlock on 9.2, yet on 9.3 it raises an error within a few seconds under pgbench -c 16.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
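Putting the pieces together, the whole reproduction under these assumptions is just (the database name is arbitrary):

    createdb deadlock
    psql -f schema.sql deadlock
    pgbench -n -c 16 -f query.sql deadlock   # errors out within seconds on 9.3+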
On 10/12/14 06:28, Andrew Sackville-West wrote:
> Can you speak to what benefits those changes might yield? Or point to
> relevant docs?

Not so much advantages - just seeing if I could still reproduce the issue with a simpler test case, i.e., ensuring you were not doing anything 'odd' - and you are not :-)

The next step would be trying to figure out what commit introduced this behaviour - but depesz has already beaten me to that (nice work)!

Cheers,
Mark
On Tue, Dec 09, 2014 at 06:43:20PM +0100, Andres Freund wrote:
> I don't think anybody can seriously give you a workaround until the
> issue has been diagnosed further.

Agreed.

> One thing you could try would be to create a unique index that spans all
> the columns in the table... That'd prevent usage of the lower lock levels
> added in the above commit.

Thanks for this idea. I'll investigate.

thanks,
A
Mark Kirkwood wrote:
> The next step would be trying to figure out what commit introduced this
> behaviour - but depesz has already beaten me to that (nice work)!

So I traced through the problem using the simplified test case I posted. It goes like this:

* The tuple in the referenced table has been updated a bunch of times already, and keeps being updated, so there's a long update chain to follow.

* Two or more transactions (T1 and T2, say) have snapshots in which some version of the tuple (not the last one) is visible.

* Transaction T3 has a ForKeyShare lock on the latest version of the tuple, grabbed because of the INSERT into the referencing table. Note that these locks are propagated upwards in an update chain, so even if T3 locked an old version, all future versions are also locked by T3.

* Each of T1 and T2 tries to update the version it has visible; this fails because it's not the last version, so they need to go through EvalPlanQual, which walks the update chain.

At this point, T1 passes through heap_lock_tuple, gets the heavyweight (HW) tuple lock, and marks itself as locker in the tuple's Xmax. T1 is scheduled out; time for T2 to run. T2 goes through heap_lock_tuple and grabs the HW lock; it examines Xmax, sees that T1 is the locker, and goes to sleep until T1 finishes. T1 is scheduled to run again and runs heap_update. The first thing it needs to do is LockTuple ... but T2 is holding that one, so T1 goes to sleep until T2 finishes -- now they are both sleeping on each other. Kaboom.

I'm not seeing a solution here. There is something wrong in having to release the HW lock and then grab it again in that EvalPlanQual dance. It sounds like we need to hold on to the tuple's HW lock until after heap_update has run, but it's not at all clear how this would work: the interface through EvalPlanQual is messy enough as it is, without entertaining the idea that a "please keep the HW lock until later" flag needs to be passed up and down all over the place.

One simple idea that occurs to me is to have some global state in heapam.c: when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the HW lock, and it's only released once heap_update is finished. That would keep T2 away from the tuple. We'd need a PG_TRY block in ExecUpdate to reset the global state once the update is done. (It's not entirely clear that we need all this for deletes too. The main difference is that after a delete completes the tuple is gone, which is obviously not the case for an update. Also, deletes don't create update chains.)

One thing I haven't thought too much about is why this doesn't happen in 9.2.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
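The same dance can be sketched at the SQL level using the simplified schema; the comments paraphrase the analysis above, and the exact interleaving is of course up to the scheduler:

    -- Session T3: the FK check takes FOR KEY SHARE on the pktab row
    BEGIN;
    INSERT INTO fktab VALUES (1);   -- T3 is now in the pktab tuple's Xmax
    -- ... T3 stays open ...

    -- Sessions T1 and T2, concurrently, each with an older tuple version visible:
    UPDATE pktab SET data = DEFAULT;
    -- T1: heap_lock_tuple takes the HW tuple lock, marks itself in Xmax,
    --     then releases the HW lock before doing the actual update
    -- T2: takes the HW tuple lock, sees T1 in Xmax, sleeps waiting for T1
    -- T1: heap_update needs the HW tuple lock again, now held by T2: deadlock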
On 11/12/14 08:33, Alvaro Herrera wrote:
> One thing I haven't thought too much about is why this doesn't happen in
> 9.2.

Hmmm - interesting; yeah, I think the why-it-doesn't-happen-in-9.2 part is the next thing to understand :-)

Cheers,
Mark
On 11/12/14 10:56, Mark Kirkwood wrote:
> Hmmm - interesting; yeah, I think the why-it-doesn't-happen-in-9.2 part
> is the next thing to understand :-)

In the most glaringly obvious (and possibly not at all helpful) analysis: this is because we are using "FOR KEY SHARE" instead of "FOR SHARE" in the newer code. Reverting this (see attached very basic patch) gets rid of the deadlock (tested on 9.5devel).
However, this is obviously a less than desirable solution, as it loses the concurrency improvement that we worked hard to gain in the patch that added the new lock mode! We really want to figure out how to keep "FOR KEY SHARE" and have it *not* deadlock.

Cheers,
Mark
Mark Kirkwood wrote:
> In the most glaringly obvious (and possibly not at all helpful) analysis:
> this is because we are using "FOR KEY SHARE" instead of "FOR SHARE" in
> the newer code. Reverting this (see attached very basic patch) gets rid
> of the deadlock (tested on 9.5devel). ... We really want to figure out
> how to keep "FOR KEY SHARE" and have it *not* deadlock.

Yeah. The problem is that the INSERT into the fktab table takes a FOR KEY SHARE lock on the pktab row. This puts the inserting process's Xid into the tuple's Xmax but allows a concurrent UPDATE while the inserter transaction is still in progress; if you use a FOR SHARE lock, the UPDATE is disallowed until the inserter transaction has completed. Having the inserter be live is what causes the heap_lock_tuple and heap_update calls in the updater process to take the heavyweight tuple lock.

I think what we should do here is NOT take the HW lock unless we're actually going to sleep in heap_lock_tuple and heap_update. Right now we always take it if HeapTupleSatisfiesUpdate returns HeapTupleBeingUpdated, even if we end up not sleeping to wait for anything.

I'm going to experiment with that idea and see if it leads to a solution. I tried the other idea yesterday (to keep the HW tuple lock we acquire in heap_lock_tuple until heap_update is done), but aside from being very complicated and bug-prone, it doesn't solve the problem anyway.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera wrote:
> I'm going to experiment with that idea and see if it leads to a
> solution. I tried the other idea yesterday (to keep the HW tuple lock
> we acquire in heap_lock_tuple until heap_update is done), but aside from
> being very complicated and bug-prone, it doesn't solve the problem
> anyway.

Here's a preliminary patch. It does solve the deadlock in my simplified test case. If Andrew can confirm that it fixes his original problem too, that'd be good.

Before this can be committed I need an isolationtester spec file that reproduces the problem. Now that I understand why it happens, it should be easy to produce: have one transaction do BEGIN, then the INSERT, and keep the transaction open; then have enough other sessions run the UPDATE until the problem pops up. (Also, the comments on Would_MultiXactIdWait_Block need work.)

FWIW this code should also have slightly better performance than the original coding, since the heavyweight tuple lock acquisition is skipped in some cases. Not sure whether that is measurable, though. Maybe in extreme cases such as the one in #8470 ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 11, 2014 at 02:22:56PM -0300, Alvaro Herrera wrote:
> Here's a preliminary patch. It does solve the deadlock in my simplified
> test case. If Andrew can confirm that it fixes his original problem
> too, that'd be good.

We can try that. We don't currently build our own pg, so it'll take a bit for us to get there. I'll report back with results.

A
On 12/12/14 06:22, Alvaro Herrera wrote:
> Here's a preliminary patch. It does solve the deadlock in my simplified
> test case. If Andrew can confirm that it fixes his original problem
> too, that'd be good.

Yeah, works for me too. I've tried both your simplified test case and Andrew's original schema + queries against 9.5devel with deadlock.patch applied; neither shows any deadlocks in that case.

Cheers,
Mark
On 12/12/14 06:22, Alvaro Herrera wrote:
> Before this can be committed I need an isolationtester spec file that
> reproduces the problem.

FWIW, I was having a look at the isolationtester spec. Playing with the pgbench setup to see the minimal number of sessions needed to provoke the deadlock (unpatched 9.5 code), I seemed to converge on about 4. I managed to still see one deadlock with each session doing only one transaction, i.e.:

    $ pgbench -c 4 -C -n -t 1 -f query.sql deadlock

... which resulted in one deadlock. So we may be able to get a reasonable test case that shows this up! I'll take a look at setting up a test case later - but don't let that stop you from doing it first :-)

Cheers,
Mark
Mark Kirkwood wrote:
> ... So we may be able to get a reasonable test case that shows this up!
> I'll take a look at setting up a test case later - but don't let that
> stop you doing it 1st :-)

Okay, I created a reproducer ... but it doesn't work. I mean, the commands are fine: if I execute them manually in four sessions on an unpatched server, I get a deadlock. But isolationtester doesn't know how to handle more than one blocked session, so it just hangs when the third session blocks after the second one is already blocked. :-(

Here it is anyway. If we get around to improving isolationtester to handle more blocked sessions, it would be good to add this file.

    setup
    {
        DROP TABLE IF EXISTS pktab;
        CREATE TABLE pktab (id int PRIMARY KEY, data SERIAL NOT NULL);
        INSERT INTO pktab VALUES (1, DEFAULT);
    }

    teardown
    {
        DROP TABLE pktab;
    }

    session "s1"
    step "s1_advlock"   { SELECT pg_advisory_lock(142857); }
    step "s1_chain"     { UPDATE pktab SET data = DEFAULT; }
    step "s1_begin"     { BEGIN; }
    step "s1_grablock"  { SELECT * FROM pktab FOR KEY SHARE; }
    step "s1_advunlock" { SELECT pg_advisory_unlock(142857); }
    step "s1_commit"    { COMMIT; }

    session "s2"
    step "s2_update" { UPDATE pktab SET data = DEFAULT WHERE pg_advisory_lock_shared(142857) IS NOT NULL; }

    session "s3"
    step "s3_update" { UPDATE pktab SET data = DEFAULT WHERE pg_advisory_lock_shared(142857) IS NOT NULL; }

    session "s4"
    step "s4_update" { UPDATE pktab SET data = DEFAULT WHERE pg_advisory_lock_shared(142857) IS NOT NULL; }

    permutation "s1_advlock" "s2_update" "s3_update" "s4_update" "s1_chain" "s1_grablock" "s1_advunlock" "s1_commit"

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
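For reference, once isolationtester can cope with multiple blocked sessions, the usual way to run such a spec would be roughly this (a sketch; the file name is made up):

    # assumption: spec saved as src/test/isolation/specs/fk-update-deadlock.spec
    cd src/test/isolation
    make
    ./pg_isolation_regress fk-update-deadlock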
On Fri, Dec 12, 2014 at 12:52 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Okay, I created a reproducer ... but it doesn't work. I mean, the
> commands are fine: if I execute them manually in four sessions on an
> unpatched server, I get a deadlock. But isolationtester doesn't know
> how to handle more than one blocked session, so it just hangs when the
> third session blocks after the second one is already blocked. :-(

We need a stress-testing framework. I hacked together one for the ON CONFLICT UPDATE patch (https://github.com/petergeoghegan/upsert), but a more comprehensive effort is needed.

-- 
Peter Geoghegan
Peter Geoghegan wrote:
> We need a stress-testing framework. I hacked together one for the ON
> CONFLICT UPDATE patch (https://github.com/petergeoghegan/upsert), but
> a more comprehensive effort is needed.

Hm, that's nice, but I hope you realize that a framework based on shell scripts cannot possibly work for us.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Dec 12, 2014 at 1:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Hm, that's nice, but I hope you realize that a framework based on shell
> scripts cannot possibly work for us.

Of course it couldn't be committed, but maybe this test framework doesn't need to live in the authoritative git repo; the effort might be bike-shedded to death otherwise. Not everyone needs to run these types of tests.

-- 
Peter Geoghegan
Alvaro Herrera wrote:
> Here's a preliminary patch.

Here's a finished version of this patch, which I messed with a bit and which therefore needs some extra testing. I want to push this shortly and backpatch it to 9.4 and 9.3. I don't want to disrupt the 9.4.0 release next week, but it'd be nice not to ship it with this bug.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 13/12/14 10:57, Alvaro Herrera wrote:
> Here's a finished version of this patch, which I messed with a bit and
> which therefore needs some extra testing.

This patch seems good too. I'm possibly seeing it perform slightly slower than the previous patch (roughly 63 tps on average versus 68 for the previous one, on a completely untuned 9.5 running on a single SATA drive, but that could just be natural variation/noise).

Regards,
Mark
On 14/12/14 15:22, Mark Kirkwood wrote:
> This patch seems good too.

Blast - I forgot to clarify that "seems good" means I've tested it against both your and Andrew's deadlock-producing schemas (no deadlocks with this patch applied)...

Cheers,
Mark
On Fri, Dec 12, 2014 at 06:57:52PM -0300, Alvaro Herrera wrote:
> Here's a finished version of this patch, which I messed with a bit and
> which therefore needs some extra testing. I want to push this shortly
> and backpatch it to 9.4 and 9.3.

We have tested this patch on the latest PostgreSQL with our actual application, and it appears to work for us. We'll be happy to test a patch against the 9.3 series with our actual application as well, when it becomes available.

thanks!
A
Alvaro Herrera wrote:
> Here's a preliminary patch. It does solve the deadlock in my simplified
> test case. If Andrew can confirm that it fixes his original problem
> too, that'd be good.

Pushed this patch to master, 9.4, and 9.3.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Alvaro,

On Fri, Dec 26, 2014 at 8:56 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Pushed this patch to master, 9.4, and 9.3.

Thank you for porting and pushing your patch to these other branches. Andrew, our team lead, is out this week, but I wanted to give you some feedback as soon as possible.

I've built PostgreSQL 9.3 (git branch REL9_3_STABLE) with and without this patch, and at least in my tests I was able to confirm that it solves our issue. Besides testing it with our simplified pgbench scripts, I also tested it locally with our Haskell web server, which is how we ran into the deadlock error in the first place. In all my tests, I consistently get the deadlock errors without the patch, and no longer see them when the patch is applied.

Let us know if you'd like to see anything else tested on our end.

Take care,
Paulo