Re: regression, deadlock in high frequency single-row UPDATE - Mailing list pgsql-bugs
From | Andrew Sackville-West |
---|---|
Subject | Re: regression, deadlock in high frequency single-row UPDATE |
Date | |
Msg-id | 20140804161205.GA30380@andrew-ThinkPad-X230 Whole thread Raw |
In response to | Re: regression, deadlock in high frequency single-row UPDATE (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: regression, deadlock in high frequency single-row UPDATE
|
List | pgsql-bugs |
On Thu, Jul 31, 2014 at 08:59:50PM -0400, Alvaro Herrera wrote: > Andrew Sackville-West wrote: > > Hi, > > > > I've discussed this problem on irc a couple of times and think I've > > found a regression that plagues our application, introduced in some > > version newer than 9.1.9, and still present in 9.3.5. > > I don't think you have provided everything: for instance I see that your > log has INSERTs into table access_tokens, which has a foreign key > relationship to/from some other table you showed. Ah you're correct, I'll attach the description of that table. But, really, that is all that's going on in the database. > > I'm too lazy to unscramble the whole thing into a real schema right now. > pg_dump output would be more helpful, as would some trivial sample data > enough to reproduce the problem. It's really minimal data. I'll attach a dump of the access_grants and oauth_clients table. The access_tokens table is not empty, but I'm not sure how relevant the data is. What is shown in the log is typical. There is another table in play here, refresh_tokens, but we're not actually using them in this case -- the table is empty, and the reference to it in access_tokens in null. > It'd probably be easier to reproduce if you wrap the statements in > transactions and have them sleep before COMMIT. The issue here is, with no code changes, this works fine under very heavy load in production against 9.0.x, 9.1.9, etc. But under 9.3.{3,4,5} it fails. That code has no transaction. I think adding a transaction is a distraction from what appears to me to be a regression. Interestingly, if I change the code to include a transaction around the UPDATE, the problem goes away (or so it seems so far in my testing). > (If you run two of > these updates in parallel, the first one blocks the second until the > first one commits, because they update the same row. What happens if > the blocking session runs the aforementioned INSERT while the other is > blocked? Perhaps three sessions are needed to show a problem without > explicit transactions.) I've run pg_bench from 3 machines all pointing at the same database running this same set of statements and it fails to induce the deadlock. It occurs to me that I haven't completely described the sequence of actions, so here is a snippet of the actual SQL that a given action induces, when it works as expected: SET standard_conforming_strings TO on; SET datestyle TO ISO; SET client_encoding TO utf8; SELECT "user_1"."id", "user_1"."uuid", "user_1"."lastUpdated" AT TIME ZONE 'UTC', "user_1"."id" as _id FROM "z8z6px927zu6qzzbnb5ntgghxg"."user_1" AS "user_1" WHERE "user_1"."id" = 129; UPDATE "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag SET last_issued=DEFAULT FROM "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc WHERE oc.id = ag.client_id AND ag.entity_name = 'user' AND ag.entity_id = 129 AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f' RETURNING ag.id; INSERT INTO "z8z6px927zu6qzzbnb5ntgghxg"."access_tokens" (token, access_id, refresh_token_id, issued, expiry) VALUES ('xys7vzs94huex2y7', 2, null, timezone('UTC'::text, now()), timezone('UTC'::text, now()) + '1:00:00':: interval); This represents one complete action. There are several of these running concurrently, interleaved together. There is no transaction involved here. As far as I can tell, the application is using bog-standard libpq (via FFI wrapper in Haskell) with no non-default configuration applied to the connection. I realize there's not a lot to go on here, and I don't expect help debugging my application. I've got a patch that seems to resolve the problem for us. But I'm suspicious of this change in behavior and hope that pointing it out can help in some way. regards, A > > -- > Álvaro Herrera http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services --
Attachment
pgsql-bugs by date: