Thread: Stefan's bug (was: max_standby_delay considered harmful)

Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Wed, May 12, 2010 at 3:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I am wondering if we are not correctly handling the case where we get
> a shutdown request while we are still in the PM_STARTUP state.  It
> looks like we might go ahead and switch to PM_RECOVERY and then
> PM_RECOVERY_CONSISTENT without noticing the shutdown.  There is some
> logic to handle the shutdown when the startup process exits, but if
> the startup process never exits it looks like we might get stuck.

I can reproduce the behavior Stefan is seeing consistently using the
attached patch.

W1: postgres -D ~/pgslave
W2: pg_ctl -D ~/pgslave stop

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachment

Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Alvaro Herrera
Date:
Excerpts from Robert Haas's message of mié may 12 20:48:41 -0400 2010:
> On Wed, May 12, 2010 at 3:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > I am wondering if we are not correctly handling the case where we get
> > a shutdown request while we are still in the PM_STARTUP state.  It
> > looks like we might go ahead and switch to PM_RECOVERY and then
> > PM_RECOVERY_CONSISTENT without noticing the shutdown.  There is some
> > logic to handle the shutdown when the startup process exits, but if
> > the startup process never exits it looks like we might get stuck.
> 
> I can reproduce the behavior Stefan is seeing consistently using the
> attached patch.
> 
> W1: postgres -D ~/pgslave
> W2: pg_ctl -D ~/pgslave stop

If there's anything to learn from this patch, is that sleep is
uninterruptible on some platforms.  This is why sleeps elsewhere are
broken down in loops, sleeping in small increments and checking
interrupts each time.  Maybe some of the sleeps in the new HS code need
to be handled this way?
-- 


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Wed, May 12, 2010 at 10:36 PM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Excerpts from Robert Haas's message of mié may 12 20:48:41 -0400 2010:
>> On Wed, May 12, 2010 at 3:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > I am wondering if we are not correctly handling the case where we get
>> > a shutdown request while we are still in the PM_STARTUP state.  It
>> > looks like we might go ahead and switch to PM_RECOVERY and then
>> > PM_RECOVERY_CONSISTENT without noticing the shutdown.  There is some
>> > logic to handle the shutdown when the startup process exits, but if
>> > the startup process never exits it looks like we might get stuck.
>>
>> I can reproduce the behavior Stefan is seeing consistently using the
>> attached patch.
>>
>> W1: postgres -D ~/pgslave
>> W2: pg_ctl -D ~/pgslave stop
>
> If there's anything to learn from this patch, is that sleep is
> uninterruptible on some platforms.  This is why sleeps elsewhere are
> broken down in loops, sleeping in small increments and checking
> interrupts each time.  Maybe some of the sleeps in the new HS code need
> to be handled this way?

I don't think the problem is that the sleep is uninterruptible.  I
think the problem is that a smart shutdown request received while in
the PM_STARTUP state does not acted upon until we enter the PM_RUN
state.  That is, there's a race condition between the SIGUSR1 that the
startup process sends to the postmaster to signal that recovery has
begun and the SIGTERM being sent by pg_ctl.

However, since I haven't succeeded in producing a fix yet, take that
with a grain of salt...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Wed, May 12, 2010 at 10:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 12, 2010 at 10:36 PM, Alvaro Herrera
> <alvherre@alvh.no-ip.org> wrote:
>> Excerpts from Robert Haas's message of mié may 12 20:48:41 -0400 2010:
>>> On Wed, May 12, 2010 at 3:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> > I am wondering if we are not correctly handling the case where we get
>>> > a shutdown request while we are still in the PM_STARTUP state.  It
>>> > looks like we might go ahead and switch to PM_RECOVERY and then
>>> > PM_RECOVERY_CONSISTENT without noticing the shutdown.  There is some
>>> > logic to handle the shutdown when the startup process exits, but if
>>> > the startup process never exits it looks like we might get stuck.
>>>
>>> I can reproduce the behavior Stefan is seeing consistently using the
>>> attached patch.
>>>
>>> W1: postgres -D ~/pgslave
>>> W2: pg_ctl -D ~/pgslave stop
>>
>> If there's anything to learn from this patch, is that sleep is
>> uninterruptible on some platforms.  This is why sleeps elsewhere are
>> broken down in loops, sleeping in small increments and checking
>> interrupts each time.  Maybe some of the sleeps in the new HS code need
>> to be handled this way?
>
> I don't think the problem is that the sleep is uninterruptible.  I
> think the problem is that a smart shutdown request received while in
> the PM_STARTUP state does not acted upon until we enter the PM_RUN
> state.  That is, there's a race condition between the SIGUSR1 that the
> startup process sends to the postmaster to signal that recovery has
> begun and the SIGTERM being sent by pg_ctl.
>
> However, since I haven't succeeded in producing a fix yet, take that
> with a grain of salt...

I'm replying to this thread rather than the max_standby_delay thread
because this line of discussion is not actually related to
max_standby_delay.

Fujii Masao previously reported this problem and proposed this fix:
http://archives.postgresql.org/pgsql-hackers/2010-04/msg00592.php

I responded by proposing that we instead simply adjust things so that
a smart shutdown is always handled at the end of recovery, just prior
to entering normal running.  On further reflection, though, I've
concluded that was a dumb idea.  Currently, when a smart shutdown
occurs during recovery, we handle it immediately UNLESS it happens
before we receive the signal that tells us it's OK to start the
background writer, in which case we get confused and lock everyone out
of the database until a fast or immediate shutdown request arrives.
So this is just a race condition, plain and simple.  Therefore I think
Fujii Masao's original idea was the best, but I have what I believe is
an equivalent but simpler implementation, which is attached.

Thoughts?  Should we try to fix this in 8.4 also, or just in HEAD?
8.3 and 8.2 never handle a smart shutdown prior to entering normal
running, and while that seems pretty useless, doing something
different would be a behavior change, so that seems like a
non-starter.  8.4 has the same behavior as HEAD, though it's not
documented in the release notes, so it's not clear how intentional the
change was.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachment

Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Sun, 2010-05-16 at 21:25 -0400, Robert Haas wrote:

> I have what I believe is
> an equivalent but simpler implementation, which is attached.

There's no code comments to explain this, so without in-depth analysis
of the problem, Masao's patch and this one its not possible to say
anything.

Please explain in detail why its the right approach and put that in a
comment, so we'll understand now and in the future.

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Mon, May 17, 2010 at 10:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Therefore I think
> Fujii Masao's original idea was the best, but I have what I believe is
> an equivalent but simpler implementation, which is attached.

Seems good.

I found another two problems related to shutdown in PM_STARTUP state:

(1)
Smart or fast shutdown requested in PM_STARTUP state always removes
the backup_label file if it exists. But it might be still required
for subsequent recovery. I changed your patch so that additionally
the postmaster skips deleting the backup_label in that case.

(2)
pg_ctl -ms stop emits the following warning whenever there is the
backup_label file in $PGDATA.

      WARNING: online backup mode is active
      Shutdown will not complete until pg_stop_backup() is called.

This warning doesn't fit in with the shutdown during recovery case.
Since smart shutdown might be requested by other than pg_ctl, the
warning should be emitted in server side rather than client, I think.
How about moving the warning to the server side?

> Thoughts?  Should we try to fix this in 8.4 also, or just in HEAD?
> 8.3 and 8.2 never handle a smart shutdown prior to entering normal
> running, and while that seems pretty useless, doing something
> different would be a behavior change, so that seems like a
> non-starter.  8.4 has the same behavior as HEAD, though it's not
> documented in the release notes, so it's not clear how intentional the
> change was.

In 8.4, smart shutdown during recovery waits until the startup process
has exited. So the backporting to 8.4 doesn't improve any situation,
I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 3:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Sun, 2010-05-16 at 21:25 -0400, Robert Haas wrote:
>
>> I have what I believe is
>> an equivalent but simpler implementation, which is attached.
>
> There's no code comments to explain this, so without in-depth analysis
> of the problem, Masao's patch and this one its not possible to say
> anything.
>
> Please explain in detail why its the right approach and put that in a
> comment, so we'll understand now and in the future.

The explanation is what I wrote in my previous email: a smart shutdown
request during recovery should be treated the same way BEFORE the
postmaster has been asked to start the background writer and AFTER the
postmaster has been asked to start the background writer.  I'll think
up a suitable comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Mon, 2010-05-17 at 06:30 -0400, Robert Haas wrote:
> On Mon, May 17, 2010 at 3:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Sun, 2010-05-16 at 21:25 -0400, Robert Haas wrote:
> >
> >> I have what I believe is
> >> an equivalent but simpler implementation, which is attached.
> >
> > There's no code comments to explain this, so without in-depth analysis
> > of the problem, Masao's patch and this one its not possible to say
> > anything.
> >
> > Please explain in detail why its the right approach and put that in a
> > comment, so we'll understand now and in the future.
> 
> The explanation is what I wrote in my previous email: a smart shutdown
> request during recovery should be treated the same way BEFORE the
> postmaster has been asked to start the background writer and AFTER the
> postmaster has been asked to start the background writer.  I'll think
> up a suitable comment.

I think we should review Masao's patch and ask him to make any changes
we think are appropriate. There's no benefit to have multiple patch
authors at one time.

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 6:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-05-17 at 06:30 -0400, Robert Haas wrote:
>> On Mon, May 17, 2010 at 3:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > On Sun, 2010-05-16 at 21:25 -0400, Robert Haas wrote:
>> >
>> >> I have what I believe is
>> >> an equivalent but simpler implementation, which is attached.
>> >
>> > There's no code comments to explain this, so without in-depth analysis
>> > of the problem, Masao's patch and this one its not possible to say
>> > anything.
>> >
>> > Please explain in detail why its the right approach and put that in a
>> > comment, so we'll understand now and in the future.
>>
>> The explanation is what I wrote in my previous email: a smart shutdown
>> request during recovery should be treated the same way BEFORE the
>> postmaster has been asked to start the background writer and AFTER the
>> postmaster has been asked to start the background writer.  I'll think
>> up a suitable comment.
>
> I think we should review Masao's patch and ask him to make any changes
> we think are appropriate. There's no benefit to have multiple patch
> authors at one time.

I did review his patch.  It duplicates a few lines of logic and I
found a way to avoid that, so I proposed it.  That seems totally
normal to me and I'm not sure what you're concerned about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 3:38 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, May 17, 2010 at 10:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Therefore I think
>> Fujii Masao's original idea was the best, but I have what I believe is
>> an equivalent but simpler implementation, which is attached.
>
> Seems good.
>
> I found another two problems related to shutdown in PM_STARTUP state:
>
> (1)
> Smart or fast shutdown requested in PM_STARTUP state always removes
> the backup_label file if it exists. But it might be still required
> for subsequent recovery. I changed your patch so that additionally
> the postmaster skips deleting the backup_label in that case.

Can you explain in a little more detail how this can cause a problem?
I'm not very familiar with how the backup label is used.

Also, why is this different in PM_STARTUP than in PM_RECOVERY?
PM_RECOVERY doesn't guarantee that we've reached consistency.

> (2)
> pg_ctl -ms stop emits the following warning whenever there is the
> backup_label file in $PGDATA.
>
>      WARNING: online backup mode is active
>      Shutdown will not complete until pg_stop_backup() is called.
>
> This warning doesn't fit in with the shutdown during recovery case.
> Since smart shutdown might be requested by other than pg_ctl, the
> warning should be emitted in server side rather than client, I think.
> How about moving the warning to the server side?

Hmm, I'm not sure whether that's a good idea or not.  Perhaps we
should discuss for 9.1?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Mon, 2010-05-17 at 06:55 -0400, Robert Haas wrote:

> > I think we should review Masao's patch and ask him to make any changes
> > we think are appropriate. There's no benefit to have multiple patch
> > authors at one time.
> 
> I did review his patch.  It duplicates a few lines of logic and I
> found a way to avoid that, so I proposed it.  That seems totally
> normal to me and I'm not sure what you're concerned about.

I think we should concentrate efforts on just one patch: Masao's.

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 7:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-05-17 at 06:55 -0400, Robert Haas wrote:
>
>> > I think we should review Masao's patch and ask him to make any changes
>> > we think are appropriate. There's no benefit to have multiple patch
>> > authors at one time.
>>
>> I did review his patch.  It duplicates a few lines of logic and I
>> found a way to avoid that, so I proposed it.  That seems totally
>> normal to me and I'm not sure what you're concerned about.
>
> I think we should concentrate efforts on just one patch: Masao's.

I understand that's your opinion, but you haven't explained why.  My
approach is simpler and Fujii Masao has already endorsed it.  I would
prefer that we focus on the technical issues here rather than who
wrote the patch.  I believe that my approach is better because it
avoids duplicating code, which should reduce the chance of future
bugs, since someone might conceivably update one chunk of code but not
the other.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Mon, 2010-05-17 at 07:33 -0400, Robert Haas wrote:
> I would
> prefer that we focus on the technical issues here rather than who
> wrote the patch.

There are three patches now from 2 authors. I agree we should focus on
the technical issues, but which issues, in which patch? If Masao had
accepted your version he would not have written another, would he?

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Mon, May 17, 2010 at 8:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> (1)
>> Smart or fast shutdown requested in PM_STARTUP state always removes
>> the backup_label file if it exists. But it might be still required
>> for subsequent recovery. I changed your patch so that additionally
>> the postmaster skips deleting the backup_label in that case.
>
> Can you explain in a little more detail how this can cause a problem?
> I'm not very familiar with how the backup label is used.
>
> Also, why is this different in PM_STARTUP than in PM_RECOVERY?
> PM_RECOVERY doesn't guarantee that we've reached consistency.

Before the startup process sends the PMSIGNAL_RECOVERY_STARTED signal
(i.e., when the postmaster is in PM_STARTUP state), it reads the
backup_label file to know the recovery starting WAL location, saves
that information in pg_control file, and rename the file "backup_label"
to "backup_label.old".

If the backup_label file is removed before pg_control is updated,
subsequent recovery cannot get the right recovery starting location.
This is the problem that I'm concerned.

The smart shutdown during recovery and the fast shutdown might call
CancelBackup() and remove the backup_label file. So if shutdown is
requested in PM_STARTUP state, the problem would happen.

In the patch, if shutdown is requested in PM_STARTUP, the postmaster
skips calling CancelBackup() since the backup_label file might be
required.

>> (2)
>> pg_ctl -ms stop emits the following warning whenever there is the
>> backup_label file in $PGDATA.
>>
>>      WARNING: online backup mode is active
>>      Shutdown will not complete until pg_stop_backup() is called.
>>
>> This warning doesn't fit in with the shutdown during recovery case.
>> Since smart shutdown might be requested by other than pg_ctl, the
>> warning should be emitted in server side rather than client, I think.
>> How about moving the warning to the server side?
>
> Hmm, I'm not sure whether that's a good idea or not.  Perhaps we
> should discuss for 9.1?

Okay, this is not a critical problem.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 7:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-05-17 at 07:33 -0400, Robert Haas wrote:
>> I would
>> prefer that we focus on the technical issues here rather than who
>> wrote the patch.
>
> There are three patches now from 2 authors. I agree we should focus on
> the technical issues, but which issues, in which patch? If Masao had
> accepted your version he would not have written another, would he?

He wrote his version first.  I reviewed it and submitted a simplified
version.  He reviewed mine and submitted an updated version that
addresses an additional possible issue, which is currently under
discussion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Mon, 2010-05-17 at 16:38 +0900, Fujii Masao wrote:
> On Mon, May 17, 2010 at 10:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Therefore I think
> > Fujii Masao's original idea was the best, but I have what I believe is
> > an equivalent but simpler implementation, which is attached.
> 
> Seems good.
> 
> I found another two problems related to shutdown in PM_STARTUP state:
> 
> (1)
> Smart or fast shutdown requested in PM_STARTUP state always removes
> the backup_label file if it exists. But it might be still required
> for subsequent recovery. I changed your patch so that additionally
> the postmaster skips deleting the backup_label in that case.

Don't like the name NeedBackupLabel seems too specific. That really
corresponds to "we were in recovery". We should have a couple of
super-states that correspond to am in recovery/am not in recovery so we
can drive it from that.

> (2)
> pg_ctl -ms stop emits the following warning whenever there is the
> backup_label file in $PGDATA.
> 
>       WARNING: online backup mode is active
>       Shutdown will not complete until pg_stop_backup() is called.
> 
> This warning doesn't fit in with the shutdown during recovery case.
> Since smart shutdown might be requested by other than pg_ctl, the
> warning should be emitted in server side rather than client, I think.
> How about moving the warning to the server side?

+1

> > Thoughts?  Should we try to fix this in 8.4 also, or just in HEAD?
> > 8.3 and 8.2 never handle a smart shutdown prior to entering normal
> > running, and while that seems pretty useless, doing something
> > different would be a behavior change, so that seems like a
> > non-starter.  8.4 has the same behavior as HEAD, though it's not
> > documented in the release notes, so it's not clear how intentional the
> > change was.
> 
> In 8.4, smart shutdown during recovery waits until the startup process
> has exited. So the backporting to 8.4 doesn't improve any situation,
> I think.

We shouldn't be discussing backporting a behaviour change.

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 7:44 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, May 17, 2010 at 8:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> (1)
>>> Smart or fast shutdown requested in PM_STARTUP state always removes
>>> the backup_label file if it exists. But it might be still required
>>> for subsequent recovery. I changed your patch so that additionally
>>> the postmaster skips deleting the backup_label in that case.
>>
>> Can you explain in a little more detail how this can cause a problem?
>> I'm not very familiar with how the backup label is used.
>>
>> Also, why is this different in PM_STARTUP than in PM_RECOVERY?
>> PM_RECOVERY doesn't guarantee that we've reached consistency.
>
> Before the startup process sends the PMSIGNAL_RECOVERY_STARTED signal
> (i.e., when the postmaster is in PM_STARTUP state), it reads the
> backup_label file to know the recovery starting WAL location, saves
> that information in pg_control file, and rename the file "backup_label"
> to "backup_label.old".
>
> If the backup_label file is removed before pg_control is updated,
> subsequent recovery cannot get the right recovery starting location.
> This is the problem that I'm concerned.
>
> The smart shutdown during recovery and the fast shutdown might call
> CancelBackup() and remove the backup_label file. So if shutdown is
> requested in PM_STARTUP state, the problem would happen.

OK, I think I understand now.  But, the SIGTERM sent by the postmaster
doesn't kill the recovery process unconditionally.  It will invoke
StartupProcShutdownHandler(), which will set set shutdown_requested =
true.  That gets checked by RestoreArchivedFile() and
HandleStartupProcInterrupts(), and I think that neither of those can
get invoked until after the control file has been updated.  Do you see
a way it can happen?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> OK, I think I understand now.  But, the SIGTERM sent by the postmaster
> doesn't kill the recovery process unconditionally.  It will invoke
> StartupProcShutdownHandler(), which will set set shutdown_requested =
> true.  That gets checked by RestoreArchivedFile() and
> HandleStartupProcInterrupts(), and I think that neither of those can
> get invoked until after the control file has been updated.  Do you see
> a way it can happen?

Yeah, the way is:
StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() -->
XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() -->
RestoreArchivedFile()

ReadCheckpointRecord() is called before pg_control is updated.


ISTM that walreceiver might be invoked even after shutdown is requested.
We should prevent the postmaster from starting up walreceiver if
Shutdown > NoShutdown?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Mon, May 17, 2010 at 9:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> (1)
>> Smart or fast shutdown requested in PM_STARTUP state always removes
>> the backup_label file if it exists. But it might be still required
>> for subsequent recovery. I changed your patch so that additionally
>> the postmaster skips deleting the backup_label in that case.
>
> Don't like the name NeedBackupLabel seems too specific. That really
> corresponds to "we were in recovery". We should have a couple of
> super-states that correspond to am in recovery/am not in recovery so we
> can drive it from that.

ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
Is this OK?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Tue, 2010-05-18 at 11:40 +0900, Fujii Masao wrote:
> ISTM that walreceiver might be invoked even after shutdown is
> requested. We should prevent the postmaster from starting up
> walreceiver if Shutdown > NoShutdown?

+1

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Tue, 2010-05-18 at 12:02 +0900, Fujii Masao wrote:
> On Mon, May 17, 2010 at 9:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> (1)
> >> Smart or fast shutdown requested in PM_STARTUP state always removes
> >> the backup_label file if it exists. But it might be still required
> >> for subsequent recovery. I changed your patch so that additionally
> >> the postmaster skips deleting the backup_label in that case.
> >
> > Don't like the name NeedBackupLabel seems too specific. That really
> > corresponds to "we were in recovery". We should have a couple of
> > super-states that correspond to am in recovery/am not in recovery so we
> > can drive it from that.
> 
> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
> Is this OK?

That can change state at any time. Would that work? 

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Tue, May 18, 2010 at 2:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, 2010-05-18 at 12:02 +0900, Fujii Masao wrote:
>> On Mon, May 17, 2010 at 9:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >> (1)
>> >> Smart or fast shutdown requested in PM_STARTUP state always removes
>> >> the backup_label file if it exists. But it might be still required
>> >> for subsequent recovery. I changed your patch so that additionally
>> >> the postmaster skips deleting the backup_label in that case.
>> >
>> > Don't like the name NeedBackupLabel seems too specific. That really
>> > corresponds to "we were in recovery". We should have a couple of
>> > super-states that correspond to am in recovery/am not in recovery so we
>> > can drive it from that.
>>
>> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
>> Is this OK?
>
> That can change state at any time. Would that work?

Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when
XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's
set to FALSE only at the end of recovery.

Here is the updated patch (fix_smart_shutdown_in_recovery_v3_fujii.patch).
I used XLogCtl->SharedRecoveryInProgress instead of NeedBackupLabel,
which made the patch very simple. And I prevented the postmaster
from invoking walreceiver after shutdown or recovery.

>> (2)
>> pg_ctl -ms stop emits the following warning whenever there is the
>> backup_label file in $PGDATA.
>>
>>       WARNING: online backup mode is active
>>       Shutdown will not complete until pg_stop_backup() is called.
>>
>> This warning doesn't fit in with the shutdown during recovery case.
>> Since smart shutdown might be requested by other than pg_ctl, the
>> warning should be emitted in server side rather than client, I think.
>> How about moving the warning to the server side?

Though I'm not sure if this should be fixed for 9.0, I attached the
patch (move_bkp_cancel_warning_v1.patch).

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Tue, 2010-05-18 at 15:09 +0900, Fujii Masao wrote:
> On Tue, May 18, 2010 at 2:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Tue, 2010-05-18 at 12:02 +0900, Fujii Masao wrote:
> >> On Mon, May 17, 2010 at 9:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> >> (1)
> >> >> Smart or fast shutdown requested in PM_STARTUP state always removes
> >> >> the backup_label file if it exists. But it might be still required
> >> >> for subsequent recovery. I changed your patch so that additionally
> >> >> the postmaster skips deleting the backup_label in that case.
> >> >
> >> > Don't like the name NeedBackupLabel seems too specific. That really
> >> > corresponds to "we were in recovery". We should have a couple of
> >> > super-states that correspond to am in recovery/am not in recovery so we
> >> > can drive it from that.
> >>
> >> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
> >> Is this OK?
> >
> > That can change state at any time. Would that work?
> 
> Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when
> XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's
> set to FALSE only at the end of recovery.

You should be using RecoveryInProgress()

> Here is the updated patch (fix_smart_shutdown_in_recovery_v3_fujii.patch).
> I used XLogCtl->SharedRecoveryInProgress instead of NeedBackupLabel,
> which made the patch very simple. And I prevented the postmaster
> from invoking walreceiver after shutdown or recovery.
> 
> >> (2)
> >> pg_ctl -ms stop emits the following warning whenever there is the
> >> backup_label file in $PGDATA.
> >>
> >>       WARNING: online backup mode is active
> >>       Shutdown will not complete until pg_stop_backup() is called.
> >>
> >> This warning doesn't fit in with the shutdown during recovery case.
> >> Since smart shutdown might be requested by other than pg_ctl, the
> >> warning should be emitted in server side rather than client, I think.
> >> How about moving the warning to the server side?
> 
> Though I'm not sure if this should be fixed for 9.0, I attached the
> patch (move_bkp_cancel_warning_v1.patch).
> 
> Regards,
> 
-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Tue, May 18, 2010 at 3:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, 2010-05-18 at 15:09 +0900, Fujii Masao wrote:
>> On Tue, May 18, 2010 at 2:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > On Tue, 2010-05-18 at 12:02 +0900, Fujii Masao wrote:
>> >> On Mon, May 17, 2010 at 9:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >> >> (1)
>> >> >> Smart or fast shutdown requested in PM_STARTUP state always removes
>> >> >> the backup_label file if it exists. But it might be still required
>> >> >> for subsequent recovery. I changed your patch so that additionally
>> >> >> the postmaster skips deleting the backup_label in that case.
>> >> >
>> >> > Don't like the name NeedBackupLabel seems too specific. That really
>> >> > corresponds to "we were in recovery". We should have a couple of
>> >> > super-states that correspond to am in recovery/am not in recovery so we
>> >> > can drive it from that.
>> >>
>> >> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
>> >> Is this OK?
>> >
>> > That can change state at any time. Would that work?
>>
>> Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when
>> XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's
>> set to FALSE only at the end of recovery.
>
> You should be using RecoveryInProgress()

Isn't access to a bool variable atomic?

And I think that postmaster should not take any locks since its
deadlock would cause a fatal situation. No?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Tue, May 18, 2010 at 3:45 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Tue, May 18, 2010 at 3:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Tue, 2010-05-18 at 15:09 +0900, Fujii Masao wrote:
>>> On Tue, May 18, 2010 at 2:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> > On Tue, 2010-05-18 at 12:02 +0900, Fujii Masao wrote:
>>> >> On Mon, May 17, 2010 at 9:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> >> >> (1)
>>> >> >> Smart or fast shutdown requested in PM_STARTUP state always removes
>>> >> >> the backup_label file if it exists. But it might be still required
>>> >> >> for subsequent recovery. I changed your patch so that additionally
>>> >> >> the postmaster skips deleting the backup_label in that case.
>>> >> >
>>> >> > Don't like the name NeedBackupLabel seems too specific. That really
>>> >> > corresponds to "we were in recovery". We should have a couple of
>>> >> > super-states that correspond to am in recovery/am not in recovery so we
>>> >> > can drive it from that.
>>> >>
>>> >> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
>>> >> Is this OK?
>>> >
>>> > That can change state at any time. Would that work?
>>>
>>> Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when
>>> XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's
>>> set to FALSE only at the end of recovery.
>>
>> You should be using RecoveryInProgress()
>
> Isn't access to a bool variable atomic?

If it's not atomic, I'll add the following comment into CancelBackup():
   Don't bother with lock to access XLogCtl->SharedRecoveryInProgress,   because there should be no other processes
runningwhen this code   is reached.
 

Thought?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Tue, 2010-05-18 at 16:05 +0900, Fujii Masao wrote:
> >>> >> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
> >>> >> Is this OK?
> >>> >
> >>> > That can change state at any time. Would that work?
> >>>
> >>> Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when
> >>> XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's
> >>> set to FALSE only at the end of recovery.
> >>
> >> You should be using RecoveryInProgress()
> >
> > Isn't access to a bool variable atomic?
> 
> If it's not atomic, I'll add the following comment into CancelBackup():
> 
>     Don't bother with lock to access XLogCtl->SharedRecoveryInProgress,
>     because there should be no other processes running when this code
>     is reached.

Call it via a function. There is no need for postmaster to know the
innards of xlog.c, which could change in future. Modularity.

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Tue, May 18, 2010 at 5:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, 2010-05-18 at 16:05 +0900, Fujii Masao wrote:
>> >>> >> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that.
>> >>> >> Is this OK?
>> >>> >
>> >>> > That can change state at any time. Would that work?
>> >>>
>> >>> Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when
>> >>> XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's
>> >>> set to FALSE only at the end of recovery.
>> >>
>> >> You should be using RecoveryInProgress()
>> >
>> > Isn't access to a bool variable atomic?
>>
>> If it's not atomic, I'll add the following comment into CancelBackup():
>>
>>     Don't bother with lock to access XLogCtl->SharedRecoveryInProgress,
>>     because there should be no other processes running when this code
>>     is reached.
>
> Call it via a function. There is no need for postmaster to know the
> innards of xlog.c, which could change in future. Modularity.

In the patch, it's accessed in CancelBackup() which is in xlog.c.
CancelBackup() should call further wrapping function?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 17, 2010 at 10:40 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> OK, I think I understand now.  But, the SIGTERM sent by the postmaster
>> doesn't kill the recovery process unconditionally.  It will invoke
>> StartupProcShutdownHandler(), which will set set shutdown_requested =
>> true.  That gets checked by RestoreArchivedFile() and
>> HandleStartupProcInterrupts(), and I think that neither of those can
>> get invoked until after the control file has been updated.  Do you see
>> a way it can happen?
>
> Yeah, the way is:
> StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() -->
> XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() -->
> RestoreArchivedFile()
>
> ReadCheckpointRecord() is called before pg_control is updated.

OK.  In that case, I'm wondering if we should reverse course and
rejigger the logic so that the shutdown gets processed when we
transition to PM_RECOVERY.  Seems like that might be simpler.

> ISTM that walreceiver might be invoked even after shutdown is requested.
> We should prevent the postmaster from starting up walreceiver if
> Shutdown > NoShutdown?

Well, when we did the previous shutdown patch, we decided it was not
right to kill walreceiver until all backends had exited, so it seems
inconsistent to make the opposite decision here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Tue, May 18, 2010 at 8:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, May 17, 2010 at 10:40 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> OK, I think I understand now.  But, the SIGTERM sent by the postmaster
>>> doesn't kill the recovery process unconditionally.  It will invoke
>>> StartupProcShutdownHandler(), which will set set shutdown_requested =
>>> true.  That gets checked by RestoreArchivedFile() and
>>> HandleStartupProcInterrupts(), and I think that neither of those can
>>> get invoked until after the control file has been updated.  Do you see
>>> a way it can happen?
>>
>> Yeah, the way is:
>> StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() -->
>> XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() -->
>> RestoreArchivedFile()
>>
>> ReadCheckpointRecord() is called before pg_control is updated.
>
> OK.  In that case, I'm wondering if we should reverse course and
> rejigger the logic so that the shutdown gets processed when we
> transition to PM_RECOVERY.  Seems like that might be simpler.

You mean keeping shutdown waiting until the postmaster has reached
PM_RECOVERY, i.e., the startup process has sent PMSIGNAL_RECOVERY_STARTED?

The startup process must call ReadCheckpointRecord() before sending
that signal. ReadCheckpointRecord() might get stuck for some reasons,
e.g., neither master nor standby might have the recovery starting
checkpoint WAL record. So that signal might not be sent forever,
in this case, shutdown would get stuck.

>> ISTM that walreceiver might be invoked even after shutdown is requested.
>> We should prevent the postmaster from starting up walreceiver if
>> Shutdown > NoShutdown?
>
> Well, when we did the previous shutdown patch, we decided it was not
> right to kill walreceiver until all backends had exited, so it seems
> inconsistent to make the opposite decision here.

Oh, right. How about allowing the postmaster only in PM_STARTUP,
PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
walreceiver? We can keep walreceiver alive until all read only
backends have gone, and prevent unexpected startup of walreceiver.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Tue, May 18, 2010 at 10:40 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Tue, May 18, 2010 at 8:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, May 17, 2010 at 10:40 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> OK, I think I understand now.  But, the SIGTERM sent by the postmaster
>>>> doesn't kill the recovery process unconditionally.  It will invoke
>>>> StartupProcShutdownHandler(), which will set set shutdown_requested =
>>>> true.  That gets checked by RestoreArchivedFile() and
>>>> HandleStartupProcInterrupts(), and I think that neither of those can
>>>> get invoked until after the control file has been updated.  Do you see
>>>> a way it can happen?
>>>
>>> Yeah, the way is:
>>> StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() -->
>>> XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() -->
>>> RestoreArchivedFile()
>>>
>>> ReadCheckpointRecord() is called before pg_control is updated.
>>
>> OK.  In that case, I'm wondering if we should reverse course and
>> rejigger the logic so that the shutdown gets processed when we
>> transition to PM_RECOVERY.  Seems like that might be simpler.
>
> You mean keeping shutdown waiting until the postmaster has reached
> PM_RECOVERY, i.e., the startup process has sent PMSIGNAL_RECOVERY_STARTED?
>
> The startup process must call ReadCheckpointRecord() before sending
> that signal. ReadCheckpointRecord() might get stuck for some reasons,
> e.g., neither master nor standby might have the recovery starting
> checkpoint WAL record. So that signal might not be sent forever,
> in this case, shutdown would get stuck.

Ah, OK.

In terms of removing the backup label file, can we simply have an
additional boolean in the postmaster that indicates whether we've ever
reached PM_RUN, and only consider removing the backup file if so?

>>> ISTM that walreceiver might be invoked even after shutdown is requested.
>>> We should prevent the postmaster from starting up walreceiver if
>>> Shutdown > NoShutdown?
>>
>> Well, when we did the previous shutdown patch, we decided it was not
>> right to kill walreceiver until all backends had exited, so it seems
>> inconsistent to make the opposite decision here.
>
> Oh, right. How about allowing the postmaster only in PM_STARTUP,
> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
> walreceiver? We can keep walreceiver alive until all read only
> backends have gone, and prevent unexpected startup of walreceiver.

Yes, that seems like something we should be checking, if we aren't already.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Wed, May 19, 2010 at 12:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> In terms of removing the backup label file, can we simply have an
> additional boolean in the postmaster that indicates whether we've ever
> reached PM_RUN, and only consider removing the backup file if so?

Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
same indicator as the boolean you suggested. Thought?

>>>> ISTM that walreceiver might be invoked even after shutdown is requested.
>>>> We should prevent the postmaster from starting up walreceiver if
>>>> Shutdown > NoShutdown?
>>>
>>> Well, when we did the previous shutdown patch, we decided it was not
>>> right to kill walreceiver until all backends had exited, so it seems
>>> inconsistent to make the opposite decision here.
>>
>> Oh, right. How about allowing the postmaster only in PM_STARTUP,
>> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>> walreceiver? We can keep walreceiver alive until all read only
>> backends have gone, and prevent unexpected startup of walreceiver.
>
> Yes, that seems like something we should be checking, if we aren't already.

I'll do that.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 19, 2010 at 12:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> In terms of removing the backup label file, can we simply have an
>> additional boolean in the postmaster that indicates whether we've ever
>> reached PM_RUN, and only consider removing the backup file if so?
>
> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
> same indicator as the boolean you suggested. Thought?

It feels cleaner and simpler to me to use the information that the
postmaster already collects rather than having it take locks and check
shared memory, but I might be wrong.  Why do you prefer doing it that
way?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
>> same indicator as the boolean you suggested. Thought?

> It feels cleaner and simpler to me to use the information that the
> postmaster already collects rather than having it take locks and check
> shared memory, but I might be wrong.  Why do you prefer doing it that
> way?

The postmaster must absolutely not take locks (once there are competing
processes).  This is non negotiable from a system robustness standpoint.
        regards, tom lane


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Wed, 2010-05-19 at 08:21 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> >> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
> >> same indicator as the boolean you suggested. Thought?
> 
> > It feels cleaner and simpler to me to use the information that the
> > postmaster already collects rather than having it take locks and check
> > shared memory, but I might be wrong.  Why do you prefer doing it that
> > way?
> 
> The postmaster must absolutely not take locks (once there are competing
> processes).  This is non negotiable from a system robustness standpoint.

Masao has not proposed this, in fact his proposal was to deliberately
avoid do so.

I proposed using the state recorded in xlog.c rather than attempting to
duplicate that with a second boolean in postmaster because that seems
likely to be more buggy.

-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Wed, May 19, 2010 at 8:49 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2010-05-19 at 08:21 -0400, Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>> > On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> >> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
>> >> same indicator as the boolean you suggested. Thought?
>>
>> > It feels cleaner and simpler to me to use the information that the
>> > postmaster already collects rather than having it take locks and check
>> > shared memory, but I might be wrong.  Why do you prefer doing it that
>> > way?
>>
>> The postmaster must absolutely not take locks (once there are competing
>> processes).  This is non negotiable from a system robustness standpoint.
>
> Masao has not proposed this, in fact his proposal was to deliberately
> avoid do so.
>
> I proposed using the state recorded in xlog.c rather than attempting to
> duplicate that with a second boolean in postmaster because that seems
> likely to be more buggy.

Well then how are we reading XLogCtl?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Wed, May 19, 2010 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 19, 2010 at 8:49 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Wed, 2010-05-19 at 08:21 -0400, Tom Lane wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>> > On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> >> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
>>> >> same indicator as the boolean you suggested. Thought?
>>>
>>> > It feels cleaner and simpler to me to use the information that the
>>> > postmaster already collects rather than having it take locks and check
>>> > shared memory, but I might be wrong.  Why do you prefer doing it that
>>> > way?
>>>
>>> The postmaster must absolutely not take locks (once there are competing
>>> processes).  This is non negotiable from a system robustness standpoint.
>>
>> Masao has not proposed this, in fact his proposal was to deliberately
>> avoid do so.
>>
>> I proposed using the state recorded in xlog.c rather than attempting to
>> duplicate that with a second boolean in postmaster because that seems
>> likely to be more buggy.
>
> Well then how are we reading XLogCtl?

In my patch, XLogCtl is directly read in xlog.c without any lock since
there should be no other processes running when CancelBackup() is called.


*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 8975,8980 **** CancelBackup(void)
--- 8975,8987 ---- {       struct stat stat_buf;

+       /*
+        * During recovery, we don't rename the "backup_label" file since
+        * it might be required for subsequent recovery.
+        */
+       if (XLogCtl->SharedRecoveryInProgress)
+               return;
+       /* if the file is not there, return */       if (stat(BACKUP_LABEL_FILE, &stat_buf) < 0)               return;

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Wed, May 19, 2010 at 2:47 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> Oh, right. How about allowing the postmaster only in PM_STARTUP,
>>> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>>> walreceiver? We can keep walreceiver alive until all read only
>>> backends have gone, and prevent unexpected startup of walreceiver.
>>
>> Yes, that seems like something we should be checking, if we aren't already.
>
> I'll do that.

Here is the updated version. I added the above-mentioned check
into the patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 24, 2010 at 1:27 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 19, 2010 at 2:47 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> Oh, right. How about allowing the postmaster only in PM_STARTUP,
>>>> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>>>> walreceiver? We can keep walreceiver alive until all read only
>>>> backends have gone, and prevent unexpected startup of walreceiver.
>>>
>>> Yes, that seems like something we should be checking, if we aren't already.
>>
>> I'll do that.
>
> Here is the updated version. I added the above-mentioned check
> into the patch.

This looks pretty reasonable to me, but I guess I feel like it would
be better to drive the CancelBackup() decision off of whether we've
ever reached PM_RUN rather than consulting XLogCtl.  It just feels
cleaner to me to drive all of the postmaster decisions off of the same
signalling mechanism rather than having a separate one (that only
works because it's used very late in shutdown when we theoretically
don't need a lock) just for this one case.

I could be all wet, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Simon Riggs
Date:
On Mon, 2010-05-24 at 09:26 -0400, Robert Haas wrote:
> On Mon, May 24, 2010 at 1:27 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > On Wed, May 19, 2010 at 2:47 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> >>>> Oh, right. How about allowing the postmaster only in PM_STARTUP,
> >>>> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
> >>>> walreceiver? We can keep walreceiver alive until all read only
> >>>> backends have gone, and prevent unexpected startup of walreceiver.
> >>>
> >>> Yes, that seems like something we should be checking, if we aren't already.
> >>
> >> I'll do that.
> >
> > Here is the updated version. I added the above-mentioned check
> > into the patch.
> 
> This looks pretty reasonable to me, but I guess I feel like it would
> be better to drive the CancelBackup() decision off of whether we've
> ever reached PM_RUN rather than consulting XLogCtl. 

That is exactly what XLogCtl tells us and why it is suggested for use.

>  It just feels
> cleaner to me to drive all of the postmaster decisions off of the same
> signalling mechanism rather than having a separate one (that only
> works because it's used very late in shutdown when we theoretically
> don't need a lock) just for this one case.
> 
> I could be all wet, though.
> 
-- Simon Riggs           www.2ndQuadrant.com



Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 24, 2010 at 9:28 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-05-24 at 09:26 -0400, Robert Haas wrote:
>> On Mon, May 24, 2010 at 1:27 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> > On Wed, May 19, 2010 at 2:47 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> >>>> Oh, right. How about allowing the postmaster only in PM_STARTUP,
>> >>>> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>> >>>> walreceiver? We can keep walreceiver alive until all read only
>> >>>> backends have gone, and prevent unexpected startup of walreceiver.
>> >>>
>> >>> Yes, that seems like something we should be checking, if we aren't already.
>> >>
>> >> I'll do that.
>> >
>> > Here is the updated version. I added the above-mentioned check
>> > into the patch.
>>
>> This looks pretty reasonable to me, but I guess I feel like it would
>> be better to drive the CancelBackup() decision off of whether we've
>> ever reached PM_RUN rather than consulting XLogCtl.
>
> That is exactly what XLogCtl tells us and why it is suggested for use.

Sure.  My only point is that the postmaster doesn't (and can't) use
that method of getting the information at any other time when it is
needed, so I don't know why we'd want to use it in just this one case.Maybe there's a reason, but it's not obvious to
me.

>>  It just feels
>> cleaner to me to drive all of the postmaster decisions off of the same
>> signalling mechanism rather than having a separate one (that only
>> works because it's used very late in shutdown when we theoretically
>> don't need a lock) just for this one case.
>>
>> I could be all wet, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, May 24, 2010 at 9:28 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Mon, 2010-05-24 at 09:26 -0400, Robert Haas wrote:
>>> This looks pretty reasonable to me, but I guess I feel like it would
>>> be better to drive the CancelBackup() decision off of whether we've
>>> ever reached PM_RUN rather than consulting XLogCtl.
>> 
>> That is exactly what XLogCtl tells us and why it is suggested for use.

> Sure.  My only point is that the postmaster doesn't (and can't) use
> that method of getting the information at any other time when it is
> needed, so I don't know why we'd want to use it in just this one case.
>  Maybe there's a reason, but it's not obvious to me.

I'm with Robert on this.  The postmaster is designed to be driven by an
internal state machine.  Making it rely on the contents of shared memory
is a fundamentally dangerous idea.  It might coincidentally be safe in
this one case, but I can easily imagine that property failing as a result
of subsequent changes.

The postmaster should not look at shared memory if there is any
reasonable alternative, and we clearly have a reasonable alternative.
        regards, tom lane


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Mon, May 24, 2010 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> This looks pretty reasonable to me, but I guess I feel like it would
> be better to drive the CancelBackup() decision off of whether we've
> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
> cleaner to me to drive all of the postmaster decisions off of the same
> signalling mechanism rather than having a separate one (that only
> works because it's used very late in shutdown when we theoretically
> don't need a lock) just for this one case.

Okay, how about the attached patch? It uses the postmaster-local flag
"ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 24, 2010 at 10:35 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, May 24, 2010 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> This looks pretty reasonable to me, but I guess I feel like it would
>> be better to drive the CancelBackup() decision off of whether we've
>> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
>> cleaner to me to drive all of the postmaster decisions off of the same
>> signalling mechanism rather than having a separate one (that only
>> works because it's used very late in shutdown when we theoretically
>> don't need a lock) just for this one case.
>
> Okay, how about the attached patch? It uses the postmaster-local flag
> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

Looks good to me.  I will test and apply.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Tom Lane
Date:
Fujii Masao <masao.fujii@gmail.com> writes:
> Okay, how about the attached patch? It uses the postmaster-local flag
> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

ReachedNormalRunning, perhaps?
        regards, tom lane


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Tue, May 18, 2010 at 3:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> (2)
>>> pg_ctl -ms stop emits the following warning whenever there is the
>>> backup_label file in $PGDATA.
>>>
>>>       WARNING: online backup mode is active
>>>       Shutdown will not complete until pg_stop_backup() is called.
>>>
>>> This warning doesn't fit in with the shutdown during recovery case.
>>> Since smart shutdown might be requested by other than pg_ctl, the
>>> warning should be emitted in server side rather than client, I think.
>>> How about moving the warning to the server side?
>
> Though I'm not sure if this should be fixed for 9.0, I attached the
> patch (move_bkp_cancel_warning_v1.patch).

This patch is worth applying for 9.0? If not, I'll add it into
the next CF for 9.1.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Mon, May 24, 2010 at 10:35 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, May 24, 2010 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> This looks pretty reasonable to me, but I guess I feel like it would
>> be better to drive the CancelBackup() decision off of whether we've
>> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
>> cleaner to me to drive all of the postmaster decisions off of the same
>> signalling mechanism rather than having a separate one (that only
>> works because it's used very late in shutdown when we theoretically
>> don't need a lock) just for this one case.
>
> Okay, how about the attached patch? It uses the postmaster-local flag
> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

I've committed part of this patch, with the naming change that Tom
suggested.  The parts I haven't committed are:

1. I don't see why we need to reset ReachedEndOfRecovery starting over
from PM_NO_CHILDREN.  It seems to me that once we reach PM_RUN, we
CAN'T go back to needing the backup label file, even if we have a
subsequent backend crash.  If I'm wrong, please let me know why and
I'll go put this back (with an appropriate comment).

2. The changes to avoid launching WALReceiver except during certain
PM_* states.  It seems fairly sensible, but what is the case where
adding this logic prevents a problem?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Tue, May 25, 2010 at 6:19 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Tue, May 18, 2010 at 3:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> (2)
>>>> pg_ctl -ms stop emits the following warning whenever there is the
>>>> backup_label file in $PGDATA.
>>>>
>>>>       WARNING: online backup mode is active
>>>>       Shutdown will not complete until pg_stop_backup() is called.
>>>>
>>>> This warning doesn't fit in with the shutdown during recovery case.
>>>> Since smart shutdown might be requested by other than pg_ctl, the
>>>> warning should be emitted in server side rather than client, I think.
>>>> How about moving the warning to the server side?
>>
>> Though I'm not sure if this should be fixed for 9.0, I attached the
>> patch (move_bkp_cancel_warning_v1.patch).
>
> This patch is worth applying for 9.0? If not, I'll add it into
> the next CF for 9.1.

I'm not convinced that this is a good idea, because ISTM it will make
the error message to be less likely to be seen by the person running
pg_ctl.  In any case, it's a behavior change, so I think that means
it's a no-go for 9.0.

In terms of 9.1, it might make sense to log something to both places.
But maybe we shouldn't just do it once - maybe it should happen every
30 s or so until we actually manage to shut down, with a list of
what's still blocking shutdown a la errdetail_busy_db.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Wed, May 26, 2010 at 9:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, May 24, 2010 at 10:35 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Mon, May 24, 2010 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> This looks pretty reasonable to me, but I guess I feel like it would
>>> be better to drive the CancelBackup() decision off of whether we've
>>> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
>>> cleaner to me to drive all of the postmaster decisions off of the same
>>> signalling mechanism rather than having a separate one (that only
>>> works because it's used very late in shutdown when we theoretically
>>> don't need a lock) just for this one case.
>>
>> Okay, how about the attached patch? It uses the postmaster-local flag
>> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.
>
> I've committed part of this patch, with the naming change that Tom
> suggested.

Thanks!

> The parts I haven't committed are:
>
> 1. I don't see why we need to reset ReachedEndOfRecovery starting over
> from PM_NO_CHILDREN.  It seems to me that once we reach PM_RUN, we
> CAN'T go back to needing the backup label file, even if we have a
> subsequent backend crash.  If I'm wrong, please let me know why and
> I'll go put this back (with an appropriate comment).

That reset has nothing to do with cancellation of the backup mode.
I just added it since postmaster might use ReachedNormalRunning for
another purpose in the future. For now, I have no objection not to
add it.

> 2. The changes to avoid launching WALReceiver except during certain
> PM_* states.  It seems fairly sensible, but what is the case where
> adding this logic prevents a problem?

The problem is that shutdown would get stuck when walreceiver is
invoked in PM_WAIT_BACKEND state. Imagine the following scenario:

1. Fast shutdown is requested just before the startup process calls  RequestXLogStreaming() which is the function to
requestpostmaster  to invoke walreceiver. 

2. pmdie() sends SIGTERM to the startup process, but not walreceiver  because it's not been started yet. Then, pmdie()
switchesthe  state into PM_WAIT_BACKENDS. 

3. The startup process goes through RequestXLogStreaming() and requests  postmaster to start walreceiver before
processingSIGTERM sent from  pmdie(). Then the startup process exits, and postmaster invokes  walreceiver in
PM_WAIT_BACKENDSstate. 

4. Once postmaster has reached PM_WAIT_BACKENDS, there is no way to send  SIGTERM to walreceiver. OTOH, postmaster
cannotadvance the state from  PM_WAIT_BACKENDS until walreceiver has gone away. So shutdown gets  stuck. 

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Fujii Masao
Date:
On Wed, May 26, 2010 at 9:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, May 25, 2010 at 6:19 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Tue, May 18, 2010 at 3:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>>> (2)
>>>>> pg_ctl -ms stop emits the following warning whenever there is the
>>>>> backup_label file in $PGDATA.
>>>>>
>>>>>       WARNING: online backup mode is active
>>>>>       Shutdown will not complete until pg_stop_backup() is called.
>>>>>
>>>>> This warning doesn't fit in with the shutdown during recovery case.
>>>>> Since smart shutdown might be requested by other than pg_ctl, the
>>>>> warning should be emitted in server side rather than client, I think.
>>>>> How about moving the warning to the server side?
>>>
>>> Though I'm not sure if this should be fixed for 9.0, I attached the
>>> patch (move_bkp_cancel_warning_v1.patch).
>>
>> This patch is worth applying for 9.0? If not, I'll add it into
>> the next CF for 9.1.
>
> I'm not convinced that this is a good idea, because ISTM it will make
> the error message to be less likely to be seen by the person running
> pg_ctl.  In any case, it's a behavior change, so I think that means
> it's a no-go for 9.0.
>
> In terms of 9.1, it might make sense to log something to both places.
> But maybe we shouldn't just do it once - maybe it should happen every
> 30 s or so until we actually manage to shut down, with a list of
> what's still blocking shutdown a la errdetail_busy_db.

Fair enough. I'll think of the issue in detail again for 9.1.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Stefan's bug (was: max_standby_delay considered harmful)

From
Robert Haas
Date:
On Wed, May 26, 2010 at 9:06 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 26, 2010 at 9:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, May 24, 2010 at 10:35 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Mon, May 24, 2010 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> This looks pretty reasonable to me, but I guess I feel like it would
>>>> be better to drive the CancelBackup() decision off of whether we've
>>>> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
>>>> cleaner to me to drive all of the postmaster decisions off of the same
>>>> signalling mechanism rather than having a separate one (that only
>>>> works because it's used very late in shutdown when we theoretically
>>>> don't need a lock) just for this one case.
>>>
>>> Okay, how about the attached patch? It uses the postmaster-local flag
>>> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.
>>
>> I've committed part of this patch, with the naming change that Tom
>> suggested.
>
> Thanks!
>
>> The parts I haven't committed are:
>>
>> 1. I don't see why we need to reset ReachedEndOfRecovery starting over
>> from PM_NO_CHILDREN.  It seems to me that once we reach PM_RUN, we
>> CAN'T go back to needing the backup label file, even if we have a
>> subsequent backend crash.  If I'm wrong, please let me know why and
>> I'll go put this back (with an appropriate comment).
>
> That reset has nothing to do with cancellation of the backup mode.
> I just added it since postmaster might use ReachedNormalRunning for
> another purpose in the future. For now, I have no objection not to
> add it.

OK, good.  I'm not sure it would be right to add it any way - reached
normal running sounds to me like it ought to mean "reached normal
running, ever" rather than "reached normal running since the last
backend crash".  In any event, it's moot for now.

>> 2. The changes to avoid launching WALReceiver except during certain
>> PM_* states.  It seems fairly sensible, but what is the case where
>> adding this logic prevents a problem?
>
> The problem is that shutdown would get stuck when walreceiver is
> invoked in PM_WAIT_BACKEND state. Imagine the following scenario:
>
> 1. Fast shutdown is requested just before the startup process calls
>   RequestXLogStreaming() which is the function to request postmaster
>   to invoke walreceiver.
>
> 2. pmdie() sends SIGTERM to the startup process, but not walreceiver
>   because it's not been started yet. Then, pmdie() switches the
>   state into PM_WAIT_BACKENDS.
>
> 3. The startup process goes through RequestXLogStreaming() and requests
>   postmaster to start walreceiver before processing SIGTERM sent from
>   pmdie(). Then the startup process exits, and postmaster invokes
>   walreceiver in PM_WAIT_BACKENDS state.
>
> 4. Once postmaster has reached PM_WAIT_BACKENDS, there is no way to send
>   SIGTERM to walreceiver. OTOH, postmaster cannot advance the state from
>   PM_WAIT_BACKENDS until walreceiver has gone away. So shutdown gets
>   stuck.

OK, makes sense.   I have committed this part also.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company