Thread: Hot Standby and deadlock detection

Hot Standby and deadlock detection

From
Simon Riggs
Date:
Greg Stark has requested that I re-add max_standby_delay = -1.
I deferred that in favour of relation-specific conflict resolution,
though judging by the comments received, that seems too major a change.

As discussed in various other posts, in order to re-add the -1 option we
need to add deadlock detection. I woke up today with a simplifying
assumption and have worked out a solution, the easy parts of which I
have committed earlier.

Part #2 is to make the Startup process do deadlock detection. I attach a
WIP patch for comments, since signal handling has been a much-discussed
area in recent weeks.

Normal deadlock detection waits for deadlock_timeout before doing the
detection. That is a simple performance tuning mechanism which I think
is probably unnecessary with hot standby, at least in the first
instance.

The way this would work: if Startup waits on a buffer pin, we
immediately send a request to all backends to cancel themselves if
they are holding the required buffer pin and are also waiting on a lock.
We then sleep until max_standby_delay. When max_standby_delay = -1 we
only sleep until deadlock_timeout and then check (on the Startup process).

That keeps the signal handler code simple and reduces the number of test
cases required to confirm everything is solid.

This patch and the last commit together present everything we need to
reenable max_standby_delay = -1, so that change is included here also.


--
 Simon Riggs           www.2ndQuadrant.com


Re: Hot Standby and deadlock detection

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> The way this would work is if Startup waits on a buffer pin we
> immediately send out a request to all backends to cancel themselves if
> they are holding the buffer pin required && waiting on a lock. We then
> sleep until max_standby_delay. When max_standby_delay = -1 we only sleep
> until deadlock timeout and then check (on the Startup process).

Should wake up to check for deadlocks after deadlock_timeout also when
max_standby_delay > deadlock_timeout. max_standby_delay could be hours -
we want to detect a deadlock sooner than that.

Generally speaking, max_standby_delay==-1 codepath shouldn't be any
different from the max_standby_delay>0 codepath.
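
The point about sharing one codepath can be illustrated with a small sketch. It is not code from the patch: the helper name first_wakeup_ms and the choice of milliseconds as the unit are assumptions made for illustration. It only shows that the first wakeup for a deadlock check never needs to be later than deadlock_timeout, whatever max_standby_delay is set to.

```c
/*
 * Hypothetical helper, not taken from the patch.
 * All times are in milliseconds (an assumption for this sketch).
 * Returns how long the Startup process sleeps before its first wakeup.
 */
static long
first_wakeup_ms(long max_standby_delay_ms, long deadlock_timeout_ms)
{
    /* -1 means "wait forever for the pin": only the deadlock check bounds the sleep */
    if (max_standby_delay_ms == -1)
        return deadlock_timeout_ms;

    /* Otherwise wake at whichever comes first: the delay limit or the deadlock check */
    return (max_standby_delay_ms < deadlock_timeout_ms)
        ? max_standby_delay_ms
        : deadlock_timeout_ms;
}
```

With a max_standby_delay of one hour and a deadlock_timeout of one second, this wakes after one second, the same as in the -1 case.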

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot Standby and deadlock detection

From
Simon Riggs
Date:
On Mon, 2010-02-01 at 09:40 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > The way this would work is if Startup waits on a buffer pin we
> > immediately send out a request to all backends to cancel themselves if
> > they are holding the buffer pin required && waiting on a lock. We then
> > sleep until max_standby_delay. When max_standby_delay = -1 we only sleep
> > until deadlock timeout and then check (on the Startup process).
> 
> Should wake up to check for deadlocks after deadlock_timeout also when
> max_standby_delay > deadlock_timeout. max_standby_delay could be hours -
> we want to detect a deadlock sooner than that.

The patch does detect deadlocks sooner than that - "immediately", as
described above.

The simplified logic is

if (MaxStandbyDelay == 0)
    immediate time out any buffer pin holders
else if (MaxStandbyDelay == -1)
    wait for deadlock_timeout then check for deadlockers
else if (standby_delay > MaxStandbyDelay)
    immediate time out on buffer pin
else
{
    immediate(*) check for deadlockers
    wait for remainder of time then time out any buffer pin holders
}

(*) Doing it this way makes the SIGALRM handler logic simpler and less
bug-prone. The only thing given up is a potential performance gain from
not running deadlock detection early.
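
For readers who prefer compilable code, the branches above can be written as a pure decision function. This is only a sketch under stated assumptions: the enum StandbyPinAction and the function choose_pin_action are hypothetical names, not identifiers from the patch, and the function merely reports which branch applies rather than performing any signalling or sleeping.

```c
/* Hypothetical names throughout; this is not the code from the patch. */
typedef enum
{
    CANCEL_PIN_HOLDERS_NOW,     /* time out buffer pin holders immediately */
    WAIT_THEN_CHECK_DEADLOCK,   /* sleep deadlock_timeout, then check for deadlockers */
    CHECK_DEADLOCK_THEN_WAIT    /* check now, sleep the remainder, then time out pin holders */
} StandbyPinAction;

/*
 * Mirrors the four branches of the simplified logic.
 * max_standby_delay: 0 = never wait, -1 = wait forever, >0 = limit.
 * standby_delay: how far the standby is already behind, in the same units.
 */
static StandbyPinAction
choose_pin_action(int max_standby_delay, int standby_delay)
{
    if (max_standby_delay == 0)
        return CANCEL_PIN_HOLDERS_NOW;
    else if (max_standby_delay == -1)
        return WAIT_THEN_CHECK_DEADLOCK;
    else if (standby_delay > max_standby_delay)
        return CANCEL_PIN_HOLDERS_NOW;
    else
        return CHECK_DEADLOCK_THEN_WAIT;
}
```

Keeping the decision separate from the side effects is a choice made for this sketch so that each branch can be tested on its own.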

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot Standby and deadlock detection

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> On Mon, 2010-02-01 at 09:40 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> The way this would work is if Startup waits on a buffer pin we
>>> immediately send out a request to all backends to cancel themselves if
>>> they are holding the buffer pin required && waiting on a lock. We then
>>> sleep until max_standby_delay. When max_standby_delay = -1 we only sleep
>>> until deadlock timeout and then check (on the Startup process).
>> Should wake up to check for deadlocks after deadlock_timeout also when
>> max_standby_delay > deadlock_timeout. max_standby_delay could be hours -
>> we want to detect a deadlock sooner than that.
> 
> The patch does detect deadlocks sooner than that - "immediately", as
> described above.

Umm, so why not run the deadlock check immediately in
max_standby_delay=-1 case as well? Why is that case handled differently
from max_standby_delay>0 case?

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot Standby and deadlock detection

From
Simon Riggs
Date:
On Mon, 2010-02-01 at 17:50 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Mon, 2010-02-01 at 09:40 +0200, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> The way this would work is if Startup waits on a buffer pin we
> >>> immediately send out a request to all backends to cancel themselves if
> >>> they are holding the buffer pin required && waiting on a lock. We then
> >>> sleep until max_standby_delay. When max_standby_delay = -1 we only sleep
> >>> until deadlock timeout and then check (on the Startup process).
> >> Should wake up to check for deadlocks after deadlock_timeout also when
> >> max_standby_delay > deadlock_timeout. max_standby_delay could be hours -
> >> we want to detect a deadlock sooner than that.
> > 
> > The patch does detect deadlocks sooner than that - "immediately", as
> > described above.
> 
> Umm, so why not run the deadlock check immediately in
> max_standby_delay=-1 case as well? Why is that case handled differently
> from max_standby_delay>0 case?

Cos the code to do that is easy.

I'll do the deadlock check immediately and make it even easier.
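
One way the agreed change could look is sketched here, with the deadlock check run immediately whenever Startup is going to wait at all. This is a guess at the shape of the revised logic, not the committed code: the struct StandbyPinPlan, the function plan_pin_wait, and the convention that a wait of -1 means "indefinitely" are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is not the code from the patch. */
typedef struct
{
    bool check_deadlock_now;      /* run the deadlock check before sleeping */
    bool cancel_pin_holders_now;  /* time out buffer pin holders immediately */
    int  wait_before_cancel;      /* remaining delay; -1 = wait indefinitely (assumed convention) */
} StandbyPinPlan;

static StandbyPinPlan
plan_pin_wait(int max_standby_delay, int standby_delay)
{
    StandbyPinPlan plan = { false, false, 0 };

    /* No waiting allowed, or the allowed delay is already used up */
    if (max_standby_delay == 0 ||
        (max_standby_delay > 0 && standby_delay > max_standby_delay))
    {
        plan.cancel_pin_holders_now = true;
        return plan;
    }

    /* Every waiting case, including -1, now checks for deadlocks first */
    plan.check_deadlock_now = true;
    plan.wait_before_cancel = (max_standby_delay == -1)
        ? -1
        : max_standby_delay - standby_delay;
    return plan;
}
```

Under this sketch the -1 case differs from the >0 case only in how long Startup waits afterwards, which is the uniformity asked for earlier in the thread.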

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot Standby and deadlock detection

From
Simon Riggs
Date:
On Mon, 2010-02-01 at 17:50 +0200, Heikki Linnakangas wrote:

> Umm, so why not run the deadlock check immediately in
> max_standby_delay=-1 case as well? Why is that case handled differently
> from max_standby_delay>0 case?

Done, tested, working.

Will commit tomorrow if no further questions or comments.

--
 Simon Riggs           www.2ndQuadrant.com
