Re: Spurious standby query cancellations - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Spurious standby query cancellations
Date
Msg-id CAB7nPqT7H=pep2H95b9bCyiaNQHSB6XGURNmmB0eix5QSUV+Vw@mail.gmail.com
In response to Re: Spurious standby query cancellations  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Thu, Sep 24, 2015 at 3:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Sep 16, 2015 at 2:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> On 14 September 2015 at 12:00, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>>>>
>>>> It's now possible to fix this by putting a lock wait on the actual lock
>>>> request, which wasn't available when I first wrote that, hence the crappy
>>>> wait loop. Using the timeout handler would now be the preferred way to solve
>>>> this. We can backpatch that to 9.3 if needed, where they were introduced.
>>>>
>>>> There's an example of how to use lock waits further down on
>>>> ResolveRecoveryConflictWithBufferPin(). Could you have a look at doing it
>>>> that way?
>>>
>>>
>>> It looks like this will take some major surgery.  The heavy weight lock
>>> manager doesn't play well with others when it comes to timeouts other than
>>> its own.  LockBufferForCleanup is a simple retry loop, but the lock manager
>>> is much more complicated than that.
>>
>>
>> Not sure I understand this objection. I can't see a reason that my
>> proposal wouldn't work.
>
>
> On further thought, neither do I.  The attached patch inverts
> ResolveRecoveryConflictWithLock to be called back from the lmgr code so that
> it is like the ResolveRecoveryConflictWithBufferPin code.  It does not try to
> cancel the conflicting lock holders from the signal handler, rather it just
> loops an extra time and cancels the transactions on the next call.
>
> It looks like the deadlock detection is adequately handled within normal
> lmgr code within the back-ends of the other parties to the deadlock, so I
> didn't do a timeout for deadlock detection purposes.
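
For readers following the thread, the control flow described in the quoted
text (the timeout handler only records that the timeout fired, and the wait
loop goes around once more and cancels the conflicting holders on the next
call) can be sketched in standalone C. This is not the attached patch and it
uses no PostgreSQL APIs: LockHolder, on_timeout, resolve_conflict and
wait_for_lock are hypothetical names, and the timeout is simulated by calling
the handler directly rather than by arming a real timer.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Set by the timeout handler; the handler itself does nothing else. */
static volatile sig_atomic_t conflict_timeout_fired = 0;

/* Hypothetical stand-in for a backend holding a conflicting lock. */
typedef struct LockHolder
{
    bool active;
} LockHolder;

/* Handler: record the event only, never cancel anyone from here. */
static void
on_timeout(int signo)
{
    (void) signo;
    conflict_timeout_fired = 1;
}

static bool
lock_is_free(const LockHolder *holders, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (holders[i].active)
            return false;
    return true;
}

/*
 * Called back from the wait loop. Cancels the holders only if the timeout
 * has already fired; returns how many were cancelled.
 */
static int
resolve_conflict(LockHolder *holders, size_t n)
{
    int cancelled = 0;

    if (!conflict_timeout_fired)
        return 0;

    for (size_t i = 0; i < n; i++)
    {
        if (holders[i].active)
        {
            holders[i].active = false;  /* "cancel" the transaction */
            cancelled++;
        }
    }
    conflict_timeout_fired = 0;
    return cancelled;
}

/*
 * Simplified wait loop. The timeout is simulated as arriving during the
 * sleep of iteration 'fire_after'. Returns the number of iterations used.
 */
static int
wait_for_lock(LockHolder *holders, size_t n, int fire_after)
{
    int iterations = 0;

    assert(fire_after >= 1);    /* otherwise the timeout never arrives */

    while (!lock_is_free(holders, n))
    {
        iterations++;
        resolve_conflict(holders, n);
        if (lock_is_free(holders, n))
            break;
        /* Here a real waiter would sleep; the timeout may interrupt it. */
        if (iterations == fire_after)
            on_timeout(0);
    }
    return iterations;
}
```

With the timeout arriving during the second iteration, the holders are
cancelled on the third: the handler interrupts the wait, and the extra pass
through the loop does the actual cancellation.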

Patch moved to the next CF for lack of reviews. Simon is registered
as reviewer, so I guess the ball is in his court.
-- 
Michael


