Re: max_standby_delay considered harmful - Mailing list pgsql-hackers

From Robert Haas
Subject Re: max_standby_delay considered harmful
Msg-id AANLkTinRrbhlHd3f__lg0S8_2AbKWye1Q19Ff7T4udtx@mail.gmail.com
In response to Re: max_standby_delay considered harmful  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: max_standby_delay considered harmful
List pgsql-hackers
On Wed, May 12, 2010 at 11:28 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2010-05-12 at 14:18 +0100, Simon Riggs wrote:
>> On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
>> > On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > > On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
>> > >
>> > >> I'm not sure what to make of this.  Sometimes not shutting down
>> > >> doesn't sound like a feature to me.
>> > >
>> > > It acts exactly the same in recovery as in normal running. It is not a
>> > > special feature of recovery at all, bug or otherwise.
>> >
>> > Simon, that doesn't make any sense.  We are talking about a backend
>> > getting stuck forever on an exclusive lock that is held by the startup
>> > process and which will never be released (for example, because the
>> > master has shut down and no more WAL can be obtained for replay).  The
>> > startup process does not hold locks in normal operation.
>>
>> When I test it, startup process holding a lock does not prevent shutdown
>> of a standby.
>>
>> I'd be happy to see your test case showing a bug exists and that the
>> behaviour differs from normal running.
>
> Let me put this differently: I accept that Stefan has reported a
> problem. Neither Tom nor myself can reproduce the problem. I've re-run
> Stefan's test case and restarted the server more than 400 times now
> without any issue.

OK, I'm glad to hear you've been testing this.  I wasn't aware of that.

> I re-read your post where you gave what you yourself called "uninformed
> speculation". There's no real polite way to say it, but yes your
> speculation does appear to be uninformed, since it is incorrect. Reasons
> would be not least that Stefan's tests don't actually send any locks to
> the standby anyway (!),

Hmm.  Well, assuming you're correct, that does seem to be a, uh,
slight problem with my theory.

> but even if they did your speculation as to the
> cause is still all wrong, as explained.

You lost me.  I don't understand why the problem that I'm referring to
couldn't happen, even if it's not what's happening here.

> There is no evidence to link this behaviour with HS, as yet, and you
> should be considering the possibility the problem lies elsewhere,
> especially since it could be code you committed that is at fault.

Huh?? The evidence that this bug is linked with HS is that it occurs
on a server running in HS mode, and not otherwise.  As for whether the
bug is code I committed, that's certainly possible, but keep in mind
it didn't work at all before IN HOT STANDBY MODE - and that will be
code you committed.

I'm going to go test this and see if I can figure out what's going on.
I hope you will keep at it also - as you point out, your knowledge of
this code far exceeds mine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

