Re: regression, deadlock in high frequency single-row UPDATE - Mailing list pgsql-bugs

From Mark Kirkwood
Subject Re: regression, deadlock in high frequency single-row UPDATE
Date
Msg-id 5488C197.3060900@catalyst.net.nz
In response to Re: regression, deadlock in high frequency single-row UPDATE  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: regression, deadlock in high frequency single-row UPDATE
List pgsql-bugs
On 11/12/14 08:33, Alvaro Herrera wrote:
> Mark Kirkwood wrote:
>
>> Not so much advantages - just seeing if I could still reproduce the issue
>> with a simpler test case, i.e. ensuring you were not doing anything 'odd' -
>> and you are not :-)
>>
>> The next step would be trying to figure out what commit introduced this
>> behaviour - but depesz has already beaten me to that (nice work)!
>
> So I traced through the problem using the simplified test case I posted.
> It goes like this:
>
> * the tuple in the referenced table has been updated a bunch of times
>    already, and keeps being updated, so there's a long update chain to
>    follow.
>
> * two or more transactions (T1 and T2, say) have snapshots in which
>    some version of the tuple (not the last one) is visible.
>
> * transaction T3 has a ForKeyShare lock on the latest version of the
>    tuple, which is grabbed because of the INSERT in the referencing
>    table.  Note that these locks are propagated upwards in an update
>    chain, so even if T3 locked an old version, all future versions are
>    also locked by T3.
>
> * each of T1 and T2 tries to update the version they have visible; this
>    fails because it's not the last version, so they need to go through
>    EvalPlanQual, which walks through the update chain.
>
> At this point, T1 passes through heap_lock_tuple, gets the heavyweight
> (HW) tuple lock, marks itself as locker in the tuple's Xmax, and
> releases the HW lock.  T1 is scheduled out; time for T2 to run.  T2 goes
> through heap_lock_tuple, grabs the HW lock, examines Xmax, sees that T1
> is the locker, and goes to sleep waiting for T1.  T1 is scheduled to run
> again and runs heap_update.  The first thing it needs to do is
> LockTuple ...  but T2 is holding that lock, so T1 goes to sleep waiting
> for T2 -- now they are both sleeping on each other.
>
> Kaboom.
>
> I'm not seeing a solution here.  There is something wrong in having to
> release the HW lock then grab it again, in that EvalPlanQual dance.  It
> sounds like we need to hold on to the tuple's HW lock until after
> heap_update has run, but it's not at all clear how this would work --
> the interface through EvalPlanQual is messy enough as it is without
> entertaining the idea that a "please keep HW lock until later" flag
> needs to be passed up and down all over the place.
>
> One simple idea that occurs to me is to have some global state in heapam.c;
> when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the
> HW lock, and it's only released once heap_update is finished.  That
> would keep T2 away from the tuple.  We'd need a PG_TRY block in
> ExecUpdate to reset the global state once the update is done.  (It's not
> entirely clear that we need all this for deletes too.  The main
> difference is that after a delete completes the tuple is gone, which is
> obviously not the case in an update.  Also, deletes don't create update
> chains.)
>
> One thing I haven't thought too much about is why this doesn't happen
> in 9.2.
>
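
The interleaving quoted above boils down to a two-node cycle in a wait-for graph. The following is a minimal sketch of that cycle, not PostgreSQL code: the resource names `hw_tuple_lock` and `xmax_locker` and the helper `find_cycle` are made up for illustration, and the model assumes each transaction waits on at most one other.

```python
# Toy model only: names mirror the quoted description, not PostgreSQL internals.

def find_cycle(waits_for):
    """Return the transactions forming a wait cycle, or None if there is none.

    waits_for maps each waiting transaction to the one it is blocked on.
    """
    for start in waits_for:
        path = [start]
        node = waits_for.get(start)
        # Follow the chain of waiters until it ends or revisits a node.
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node == start:
            return path
    return None


# Who holds what after the interleaving described in the quoted text.
holders = {
    "hw_tuple_lock": "T2",  # T2 grabbed the heavyweight tuple lock
    "xmax_locker": "T1",    # T1 marked itself as locker in the tuple's Xmax
}

# What each transaction needs next.
wants = {
    "T1": "hw_tuple_lock",  # heap_update starts with LockTuple
    "T2": "xmax_locker",    # T2 sleeps on the transaction recorded in Xmax
}

# Each waiter is blocked on whoever holds the resource it wants.
waits_for = {txn: holders[res] for txn, res in wants.items()}
cycle = find_cycle(waits_for)
print(cycle)
```

With only one of the two waits present the chain simply ends and `find_cycle` returns `None`; it is the combination of both waits that closes the loop.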

Hmmm - interesting. Yeah, I think the why-it-doesn't-happen-in-9.2 part is
the next thing to understand :-)
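
Separately, the "global state plus PG_TRY" idea from the quoted text can be pictured in miniature. This is only a sketch of the control flow under stated assumptions: `KEEP_HW_LOCK`, `lock_tuple_for_epq` and `update_via_epq` are hypothetical names, Python's `try`/`finally` stands in for a PG_TRY block, and no real locking is performed.

```python
# Toy model only: the flag and both functions are hypothetical, not names
# taken from heapam.c or the executor.

KEEP_HW_LOCK = False  # stand-in for the proposed global state
events = []           # records what the toy model did, in order


def lock_tuple_for_epq():
    """Stand-in for the tuple-locking step reached through EvalPlanQual."""
    events.append("acquire HW lock")
    if not KEEP_HW_LOCK:
        # Without the flag the lock is dropped here, which is the window
        # in which another transaction can grab it.
        events.append("release HW lock")


def update_via_epq(fail=False):
    """Stand-in for the update path: hold the lock until the update is done."""
    global KEEP_HW_LOCK
    KEEP_HW_LOCK = True
    try:
        lock_tuple_for_epq()
        if fail:
            raise RuntimeError("update aborted")
        events.append("heap_update")
        events.append("release HW lock")
    finally:
        # Always reset the global state, even when the update errors out.
        KEEP_HW_LOCK = False
```

The point of the `finally` clause is the same as the PG_TRY block mentioned in the quote: the flag must not stay set if the update fails partway through.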

Cheers

Mark
