Re: regression, deadlock in high frequency single-row UPDATE - Mailing list pgsql-bugs
From | Mark Kirkwood |
---|---|
Subject | Re: regression, deadlock in high frequency single-row UPDATE |
Date | |
Msg-id | 5488C197.3060900@catalyst.net.nz Whole thread Raw |
In response to | Re: regression, deadlock in high frequency single-row UPDATE (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: regression, deadlock in high frequency single-row UPDATE
|
List | pgsql-bugs |
On 11/12/14 08:33, Alvaro Herrera wrote: > Mark Kirkwood wrote: > >> Not so much advantages - just seeing if I could still reproduce the issue >> with a simpler test case i.e ensuring you were not doing anything 'odd' - >> and you are not :-) >> >> The next step would be trying to figure out what commit introduced this >> behaviour - but depesz has already beaten me to that (nice work)! > > So I traced through the problem using the simplified test case I posted. > It goes like this: > > * tuple on the referenced table has been updated a bunch of times > already, and keeps being updated, so there's a large update chain to > follow. > > * two or more transactions (T1 and T2, say) have snapshots in which > some version of the tuple (not the last one) is visible. > > * transaction T3 has a ForKeyShare lock on the latest version of the > tuple, which is grabbed because of the INSERT in the referencing > table. Note that these locks are propagated upwards in an update > chain, so even if T3 locked an old version, all future versions are > also locked by T3. > > * each of T1 and T2 tries to update the version they have visible; this > fails because it's not the last version, so they need to go through > EvalPlanQual, which walks through the update chain. > > At this point, T1 passes through heap_lock_tuple, gets the HW tuple > lock, marks itself as locker in the tuple's Xmax. T1 is scheduled out; > time for T2 to run. T2 goes through heap_lock_tuple, grabs HW lock; > examines Xmax, sees that T1 is the locker, goes to sleep until T1 > awakes. T1 is scheduled to run again, runs heap_update. The first > thing it needs is to do LockTuple ... but T2 is holding that one, so T1 > goes to sleep until T2 awakes -- now they are both sleeping on each > other. > > Kaboom. > > I'm not seeing a solution here. There is something wrong in having to > release the HW lock then grab it again, in that EvalPlanQual dance. It > sounds like we need to hold on to the tuple's HW lock until after > heap_update has run, but it's not at all clear how this would work -- > the interface through EvalPlanQual is messy enough as it is without > entertaining the idea that a "please keep HW lock until later" flag > needs to be passed up and down all over the place. > > One simple idea that occurs to me is have some global state in heapam.c; > when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the > HW lock, and it's only released once heap_update is finished. That > would keep T2 away from the tuple. We'd need a PG_TRY block in > ExecUpdate to reset the global state once the update is done. (It's not > entirely clear that we need all this for deletes too. The main > difference is that after a delete completes the tuple is gone, which is > obviously not the case in an update. Also, deletes don't create update > chains.) > > One thing I haven't thought too much about is why doesn't this happen in > 9.2. > Hmmm - interesting, yeah I think the why-it-doesn't-in-9.2 part is the next thing to understand :-) Cheers Mark
pgsql-bugs by date: