On Wed, Dec 23, 2015 at 9:40 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Sep 23, 2015 at 11:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> On further thought, neither do I. The attached patch inverts
>> ResolveRecoveryConflictWithLock to be called back from the lmgr code so that
>> is it like ResolveRecoveryConflictWithBufferPin code. It does not try to
>> cancel the conflicting lock holders from the signal handler, rather it just
>> loops an extra time and cancels the transactions on the next call.
>>
>> It looks like the deadlock detection is adequately handled within normal
>> lmgr code within the back-ends of the other parties to the deadlock, so I
>> didn't do a timeout for deadlock detection purposes.
>
That is how I've done it.
It's taken me a while to figure this out.
My testing showed a bug in disable_timeout(), which turned out to be a double-disable; I've fixed that. I'll submit a separate patch to add some diagnostics in case this shows up again, which could happen now that we have user-defined timeouts.
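For illustration only, a minimal sketch of what a double-disable diagnostic could look like. This is not the patch mentioned above and not the code in PostgreSQL's timeout module; every name here (sketch_enable_timeout, sketch_disable_timeout, the SKETCH_ constants) is hypothetical, and the bookkeeping is reduced to one "active" flag per timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical timeout ids; these are NOT PostgreSQL's TimeoutId values. */
enum sketch_timeout_id
{
    SKETCH_STANDBY_LOCK_TIMEOUT,
    SKETCH_MAX_TIMEOUTS
};

/* One flag per timeout: is it currently armed? */
static bool sketch_active[SKETCH_MAX_TIMEOUTS];

/* Number of times a timeout was disabled while it was not armed. */
static int sketch_double_disables = 0;

/* Arm a timeout (the actual timer scheduling is omitted in this sketch). */
static void
sketch_enable_timeout(enum sketch_timeout_id id)
{
    sketch_active[id] = true;
}

/*
 * Disarm a timeout. Returns true if it was armed. A redundant disable is
 * counted and reported instead of passing silently, and returns false.
 */
static bool
sketch_disable_timeout(enum sketch_timeout_id id)
{
    if (!sketch_active[id])
    {
        sketch_double_disables++;
        fprintf(stderr, "timeout %d disabled while not active\n", (int) id);
        return false;
    }

    sketch_active[id] = false;
    return true;
}
```

With something like this, enabling a timeout once and disabling it twice makes the second call return false and bump the counter, so a stray second disable becomes visible rather than silent.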
What surprises me is that I can't see how this patch ever worked as submitted when run on an assert-enabled build.
If you want this backpatched, please submit versions that apply cleanly and test them. I'm less inclined to do that myself; I just regard this as an improvement.
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services