Thread: pg_ctl failover Re: Latches, signals, and waiting

pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Wed, Sep 15, 2010 at 11:14 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 15/09/10 16:55, Tom Lane wrote:
>>
>> So I'm wondering if we couldn't eliminate the five-second sleep
>> requirement here too.  It's problematic anyhow, since somebody looking
>> for energy efficiency will still feel it's too short, while somebody
>> concerned about fast failover will feel it's too long.
>
> Yep.
>
>>  Could the
>> standby triggering protocol be modified so that it involves sending a
>> signal, not just creating a file?
>
> Seems reasonable, at least if we still provide an option for more frequent
> polling and no need to send signal.
>
>> (One issue is that it's not clear what that'd translate to on Windows.)
>
> pg_ctl failover ? At the moment, the location of the trigger file is
> configurable, but if we accept a constant location like "$PGDATA/failover"
> pg_ctl could do the whole thing, create the file and send signal. pg_ctl on
> Window already knows how to send the "signal" via the named pipe signal
> emulation.

The attached patch implements the above-mentioned pg_ctl failover.

Comments? Objections?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: pg_ctl failover Re: Latches, signals, and waiting

From
Itagaki Takahiro
Date:
On Thu, Jan 13, 2011 at 00:14, Fujii Masao <masao.fujii@gmail.com> wrote:
>> pg_ctl failover ? At the moment, the location of the trigger file is
>> configurable, but if we accept a constant location like "$PGDATA/failover"
>> pg_ctl could do the whole thing, create the file and send signal. pg_ctl on
>> Window already knows how to send the "signal" via the named pipe signal
>> emulation.
>
> The attached patch implements the above-mentioned pg_ctl failover.

I have three comments:
- Will we call it "failover"? We will use the command also in "switchover"
  operations. "pg_ctl promote" might be more neutral, but users might
  find it hard to associate "promote" with the replication feature.

- pg_ctl should unlink failover_files when it failed to send failover signals.

- "standby_triggered" variable might be renamed to "failover_triggered" or so.

-- 
Itagaki Takahiro
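
The second comment above (remove the file again if the signal cannot be
sent) can be sketched as follows. This is a minimal illustration, not the
code from the attached patch: the helper name request_failover() is
hypothetical, the file name "failover" is the one under discussion at this
point in the thread, and the signal number is passed in by the caller
because the thread does not say which signal pg_ctl sends.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* File name as discussed in the thread; illustrative only. */
#define FAILOVER_SIGNAL_FILE "failover"

/*
 * Hypothetical helper: create <datadir>/failover, then signal the server.
 * If the signal cannot be delivered, remove the file again so that a
 * stale file cannot trigger a promotion later.
 * Returns 0 on success, -1 on failure.
 */
int
request_failover(const char *datadir, pid_t server_pid, int sig)
{
    char  path[1024];
    FILE *fp;
    int   n;

    assert(datadir != NULL);

    n = snprintf(path, sizeof(path), "%s/%s", datadir, FAILOVER_SIGNAL_FILE);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;              /* formatting error or path too long */

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;              /* could not create the file */
    if (fclose(fp) != 0)
    {
        unlink(path);
        return -1;
    }

    if (kill(server_pid, sig) != 0)
    {
        unlink(path);           /* do not leave the file behind */
        return -1;
    }
    return 0;
}
```

A caller would pass the data directory, the server PID read from
postmaster.pid, and the chosen signal. On failure no file remains in the
data directory, which is the behaviour the review comment asks for.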


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Thu, Jan 13, 2011 at 11:29 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Thu, Jan 13, 2011 at 00:14, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> pg_ctl failover ? At the moment, the location of the trigger file is
>>> configurable, but if we accept a constant location like "$PGDATA/failover"
>>> pg_ctl could do the whole thing, create the file and send signal. pg_ctl on
>>> Window already knows how to send the "signal" via the named pipe signal
>>> emulation.
>>
>> The attached patch implements the above-mentioned pg_ctl failover.
>
> I have three comments:

Thanks for the review!

> - Will we call it "failover"? We will use the command also in "switchover"
>  operations. "pg_ctl promote" might be more neutral, but users might be
>  hard to imagine replication feature from "promote".

OK. Similarly, should I also change the word "failover" used in function and
variable names to "promote"? For example,
#define PROMOTE_SIGNAL_FILE "promote" rather than
#define FAILOVER_SIGNAL_FILE "failover"?

> - pg_ctl should unlink failover_files when it failed to send failover signals.

Good catch.

> - "standby_triggered" variable might be renamed to "failover_triggered" or so.

Furthermore, should "failover_triggered" then be renamed to "promote_triggered"?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Heikki Linnakangas
Date:
On 13.01.2011 04:29, Itagaki Takahiro wrote:
> On Thu, Jan 13, 2011 at 00:14, Fujii Masao<masao.fujii@gmail.com>  wrote:
>>> pg_ctl failover ? At the moment, the location of the trigger file is
>>> configurable, but if we accept a constant location like "$PGDATA/failover"
>>> pg_ctl could do the whole thing, create the file and send signal. pg_ctl on
>>> Window already knows how to send the "signal" via the named pipe signal
>>> emulation.
>>
>> The attached patch implements the above-mentioned pg_ctl failover.
>
> I have three comments:
> - Will we call it "failover"? We will use the command also in "switchover"
>    operations. "pg_ctl promote" might be more neutral, but users might be
>    hard to imagine replication feature from "promote".

I agree that "failover" or even "switchover" is too specific. You might 
want to promote a server even if you keep the old master running, if 
you're creating a temporary copy of the master repository for testing 
purposes etc.

+1 for "promote". People unfamiliar with the replication stuff might not 
immediately understand that it's related to replication, but they 
wouldn't have any use for the option anyway. It should be clear to 
anyone who needs it.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Robert Haas
Date:
On Thu, Jan 13, 2011 at 5:00 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 13.01.2011 04:29, Itagaki Takahiro wrote:
>>
>> On Thu, Jan 13, 2011 at 00:14, Fujii Masao<masao.fujii@gmail.com>  wrote:
>>>>
>>>> pg_ctl failover ? At the moment, the location of the trigger file is
>>>> configurable, but if we accept a constant location like
>>>> "$PGDATA/failover"
>>>> pg_ctl could do the whole thing, create the file and send signal. pg_ctl
>>>> on
>>>> Window already knows how to send the "signal" via the named pipe signal
>>>> emulation.
>>>
>>> The attached patch implements the above-mentioned pg_ctl failover.
>>
>> I have three comments:
>> - Will we call it "failover"? We will use the command also in "switchover"
>>   operations. "pg_ctl promote" might be more neutral, but users might be
>>   hard to imagine replication feature from "promote".
>
> I agree that "failover" or even "switchover" is too specific. You might want
> promote a server even if you keep the old master still running, if you're
> creating a temporary copy of the master repository for testing purposes etc.
>
> +1 for "promote". People unfamiliar with the replication stuff might not
> immediately understand that it's related to replication, but they wouldn't
> have any use for the option anyway. It should be clear to anyone who needs
> it.

I agree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Thu, Jan 13, 2011 at 7:00 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> +1 for "promote". People unfamiliar with the replication stuff might not
> immediately understand that it's related to replication, but they wouldn't
> have any use for the option anyway. It should be clear to anyone who needs
> it.

I did s/failover/promote. Here is the updated patch.

>> - pg_ctl should unlink failover_files when it failed to send failover signals.

Done.

I also changed some descriptions of the trigger in high-availability.sgml
and recovery-config.sgml.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Thu, Jan 13, 2011 at 9:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> I did s/failover/promote. Here is the updated patch.

I rebased the patch to current git master.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tatsuo Ishii
Date:
>> I did s/failover/promote. Here is the updated patch.
> 
> I rebased the patch to current git master.

I'm thinking about implementing a function which promotes the standby.
It would make it a lot easier for pgpool to control the promotion,
since it would allow firing the promotion operation (either creating a
trigger file or sending a signal) via SQL, not ssh etc.

If there's enough interest, I will propose such a function for next CF.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Magnus Hagander
Date:
On Fri, Jan 28, 2011 at 08:44, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> I did s/failover/promote. Here is the updated patch.
>>
>> I rebased the patch to current git master.
>
> I'm thinking about implementing a function which does a promotion for
> the standby. It will make pgpool lot easier to control the promotion
> since it allow to fire the promotion operation (either creating a
> trigger file or sending a signal) via SQL, not ssh etc.

I agree that having this available via SQL would be useful in a number
of cases. pgpool or such being one, but also for example pgadmin.


> If there's enough interest, I will propose such a function for next CF.

Just as a reminder, remember that next CF means 9.2.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tatsuo Ishii
Date:
> On Fri, Jan 28, 2011 at 08:44, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>>> I did s/failover/promote. Here is the updated patch.
>>>
>>> I rebased the patch to current git master.
>>
>> I'm thinking about implementing a function which does a promotion for
>> the standby. It will make pgpool lot easier to control the promotion
>> since it allow to fire the promotion operation (either creating a
>> trigger file or sending a signal) via SQL, not ssh etc.
> 
> I agree that having this available via SQL would be useful in a number
> of cases. pgpool or such being one, but also for example pgadmin.
> 
> 
>> If there's enough interest, I will propose such a function for next CF.
> 
> Just as a reminder, remember that next CF means 9.2.

Oh, I meant the current CF (which started in January).
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Fri, Jan 28, 2011 at 4:57 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Fri, Jan 28, 2011 at 08:44, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>>> I did s/failover/promote. Here is the updated patch.
>>>
>>> I rebased the patch to current git master.
>>
>> I'm thinking about implementing a function which does a promotion for
>> the standby. It will make pgpool lot easier to control the promotion
>> since it allow to fire the promotion operation (either creating a
>> trigger file or sending a signal) via SQL, not ssh etc.
>
> I agree that having this available via SQL would be useful in a number
> of cases. pgpool or such being one, but also for example pgadmin.

Agreed. I submitted the patch before, but I forgot to update it
and add it to CF.
http://archives.postgresql.org/message-id/AANLkTimuHbxbuM+zLkaEX3aDqSeiMUE3xb4ww1QtsLmf@mail.gmail.com

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tatsuo Ishii
Date:
> On Fri, Jan 28, 2011 at 4:57 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Fri, Jan 28, 2011 at 08:44, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>>>> I did s/failover/promote. Here is the updated patch.
>>>>
>>>> I rebased the patch to current git master.
>>>
>>> I'm thinking about implementing a function which does a promotion for
>>> the standby. It will make pgpool lot easier to control the promotion
>>> since it allow to fire the promotion operation (either creating a
>>> trigger file or sending a signal) via SQL, not ssh etc.
>>
>> I agree that having this available via SQL would be useful in a number
>> of cases. pgpool or such being one, but also for example pgadmin.
> 
> Agreed. I submitted the patch before, but I forgot to update it
> and add it to CF.
> http://archives.postgresql.org/message-id/AANLkTimuHbxbuM+zLkaEX3aDqSeiMUE3xb4ww1QtsLmf@mail.gmail.com

Great!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Robert Haas
Date:
On Fri, Jan 28, 2011 at 3:40 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> On Fri, Jan 28, 2011 at 4:57 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> On Fri, Jan 28, 2011 at 08:44, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>>>>> I did s/failover/promote. Here is the updated patch.
>>>>>
>>>>> I rebased the patch to current git master.
>>>>
>>>> I'm thinking about implementing a function which does a promotion for
>>>> the standby. It will make pgpool lot easier to control the promotion
>>>> since it allow to fire the promotion operation (either creating a
>>>> trigger file or sending a signal) via SQL, not ssh etc.
>>>
>>> I agree that having this available via SQL would be useful in a number
>>> of cases. pgpool or such being one, but also for example pgadmin.
>>
>> Agreed. I submitted the patch before, but I forgot to update it
>> and add it to CF.
>> http://archives.postgresql.org/message-id/AANLkTimuHbxbuM+zLkaEX3aDqSeiMUE3xb4ww1QtsLmf@mail.gmail.com
>
> Great!

I hate to be a wet blanket, but the number of patches in this CF is
going the wrong direction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jan 28, 2011 at 3:40 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>>> Agreed. I submitted the patch before, but I forgot to update it
>>>> and add it to CF.
>>>> http://archives.postgresql.org/message-id/AANLkTimuHbxbuM+zLkaEX3aDqSeiMUE3xb4ww1QtsLmf@mail.gmail.com
>> 
>> Great!

> I hate to be a wet blanket, but the number of patches in this CF is
> going the wrong direction.

Yes.  I'm not sure that the fact that something was discussed months ago
entitles the submitter to a free exemption from the requirement to meet
the CF submission deadline.
        regards, tom lane


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tatsuo Ishii
Date:
>>>> I did s/failover/promote. Here is the updated patch.
>>>
>>> I rebased the patch to current git master.
>>
>> I'm thinking about implementing a function which does a promotion for
>> the standby. It will make pgpool lot easier to control the promotion
>> since it allow to fire the promotion operation (either creating a
>> trigger file or sending a signal) via SQL, not ssh etc.
> 
> I agree that having this available via SQL would be useful in a number
> of cases. pgpool or such being one, but also for example pgadmin.
> 
> 
>> If there's enough interest, I will propose such a function for next CF.
> 
> Just as a reminder, remember that next CF means 9.2.

Ok. I will write a C user function and add it to the pgpool source tree. I
think it will be fairly easy to create a trigger file in the function.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Sat, Jan 29, 2011 at 1:11 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> Ok. I will write a C user function and add to pgpool source tree. I
> think it will be fairly easy to create a trigger file in the function.

If the "pg_ctl promote" patch gets committed, I recommend that
the C function send the signal to the startup process rather than
create the trigger file, because the trigger file is checked only every 5s,
which would lengthen the failover time by 2.5s on average.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
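
The 2.5s figure above follows from a short calculation. As an assumption
not stated in the thread, suppose the trigger file appears at a time t that
is uniformly distributed within one polling interval of length T. The file
is then noticed at the next check, after a delay D = T - t, so the expected
delay is:

```latex
\mathbb{E}[D] = \frac{1}{T}\int_{0}^{T} (T - t)\,dt = \frac{T}{2},
\qquad T = 5\,\mathrm{s} \;\Rightarrow\; \mathbb{E}[D] = 2.5\,\mathrm{s}.
```

The worst case under the same assumption is a full interval, i.e. 5s.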


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tatsuo Ishii
Date:
> If the "pg_ctl promote" patch will have been committed, I recommend that
> the C function should send the signal to the startup process rather than
> creating the trigger file. Because the trigger file is checked every for 5s,
> which would lengthen the failover time by an average 2.5s.

Ok, probably I could make the function smart enough to signal or not
by looking at the PostgreSQL version.

BTW, is it possible to export the following variable in xlog.c?

static char *TriggerFile = NULL;

That would make coding the C function a lot easier.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Itagaki Takahiro
Date:
On Mon, Jan 31, 2011 at 11:52, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Sat, Jan 29, 2011 at 1:11 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> Ok. I will write a C user function and add to pgpool source tree. I
>> think it will be fairly easy to create a trigger file in the function.
>
> If the "pg_ctl promote" patch will have been committed, I recommend that
> the C function should send the signal to the startup process rather than
> creating the trigger file.

The C function needs to create a trigger file in $PGDATA/promote
before sending signals, no? system("pg_ctl promote") seems
the easiest way if you use an external module.

-- 
Itagaki Takahiro
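
The system("pg_ctl promote") approach suggested above can be sketched as
a small wrapper that runs a command and reports its exit status. The
function name run_promote_command() is hypothetical, and the command string
is left to the caller; an external module would also have to make sure
pg_ctl is on the PATH and knows the data directory, which this sketch does
not address.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Run a shell command via system() and return its exit status.
 * Returns -1 if the command could not be started or did not exit
 * normally (for example, because it was killed by a signal).
 */
int
run_promote_command(const char *command)
{
    int status;

    assert(command != NULL);

    status = system(command);
    if (status == -1)
        return -1;              /* child could not be created or waited for */
    if (!WIFEXITED(status))
        return -1;              /* terminated abnormally */
    return WEXITSTATUS(status);
}
```

For example, run_promote_command("pg_ctl promote -D /path/to/data") would
return 0 if pg_ctl reports success; the path here is only a placeholder.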


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Mon, Jan 31, 2011 at 12:31 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> The C function needs to create a trigger file in $PGDATA/promote
> before sending signals, no?

No. At least in the current patch, mere receipt of SIGUSR2 causes the
startup process to end recovery. The startup process doesn't check
for the existence of $PGDATA/promote, though the postmaster does.

>  system("pg_ctl promote") seems
> the easiest way if you use an external module.

Yeah, that's true.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
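
The behaviour described above, where receipt of SIGUSR2 alone is enough to
end recovery, follows a common pattern: the signal handler only sets a
flag, and the main loop checks that flag. The sketch below illustrates the
pattern in plain POSIX C. It is not the code from xlog.c, and all names in
it are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler, read by the main loop. */
static volatile sig_atomic_t promote_requested = 0;

/* Signal handler: only records that the signal arrived. */
static void
promote_signal_handler(int signo)
{
    (void) signo;
    promote_requested = 1;
}

/* Install the SIGUSR2 handler; returns 0 on success, -1 on failure. */
int
install_promote_handler(void)
{
    struct sigaction sa;
    int              rc;

    sa.sa_handler = promote_signal_handler;
    rc = sigemptyset(&sa.sa_mask);
    assert(rc == 0);            /* cannot fail for a valid pointer */
    (void) rc;
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Called from the recovery loop to decide whether to stop replaying. */
int
promotion_was_requested(void)
{
    return promote_requested != 0;
}
```

A recovery loop would call promotion_was_requested() between units of work
and leave the loop once it returns nonzero. No file check is involved in
this sketch, which mirrors the point made above about the startup process.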


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Mon, Jan 31, 2011 at 12:35 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> If the "pg_ctl promote" patch will have been committed, I recommend that
>> the C function should send the signal to the startup process rather than
>> creating the trigger file. Because the trigger file is checked every for 5s,
>> which would lengthen the failover time by an average 2.5s.
>
> Ok, probably I could make the function smart enough to signal or not
> by looking at the PostgreSQL version.
>
> BTW is it possible to export following variable in xlog.c?
>
> static char *TriggerFile = NULL;
>
> That would make coding of the C function lot easier.

If you change the function so that it sends the signal or calls
system("pg_ctl promote"), exporting that variable seems to
be unnecessary, because pg_ctl promote can promote
the server even if trigger_file is not supplied. You don't need
to check whether trigger_file is set or not in the C function.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Tatsuo Ishii
Date:
> On Mon, Jan 31, 2011 at 12:35 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> If the "pg_ctl promote" patch will have been committed, I recommend that
>>> the C function should send the signal to the startup process rather than
>>> creating the trigger file. Because the trigger file is checked every for 5s,
>>> which would lengthen the failover time by an average 2.5s.
>>
>> Ok, probably I could make the function smart enough to signal or not
>> by looking at the PostgreSQL version.
>>
>> BTW is it possible to export following variable in xlog.c?
>>
>> static char *TriggerFile = NULL;
>>
>> That would make coding of the C function lot easier.
> 
> If you change the function so that it sends the signal or call
> system("pg_ctl promote"), exporting that variable seems to
> be unnecessary. Because pg_ctl promote can promote
> the server even if trigger_file is not supplied. You don't need
> to check whether trigger_file is set or not, in the C function.

True.

BTW, for 9.0, perhaps copying parseRecoveryCommandFileLine() from
xlog.c into the C function is the only way to go.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Robert Haas
Date:
On Wed, Jan 19, 2011 at 1:01 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Jan 13, 2011 at 9:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> I did s/failover/promote. Here is the updated patch.
>
> I rebased the patch to current git master.

This patch looks fine to me.  I will mark it Ready for Committer.

(Someone else please feel free to pick it up for the actual commit, if
you have cycles.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Magnus Hagander
Date:
On Tue, Feb 8, 2011 at 05:24, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 1:01 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Thu, Jan 13, 2011 at 9:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> I did s/failover/promote. Here is the updated patch.
>>
>> I rebased the patch to current git master.
>
> This patch looks fine to me.  I will mark it Ready for Committer.
>
> (Someone else please feel free to pick it up for the actual commit, if
> you have cycles.)

I see that the docs part of the patch removes the mention of
reporting servers - is that intentional, or a mistake? It seems that
use case still remains, no?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Magnus Hagander
Date:
On Thu, Feb 10, 2011 at 15:25, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Feb 8, 2011 at 05:24, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jan 19, 2011 at 1:01 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Thu, Jan 13, 2011 at 9:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> I did s/failover/promote. Here is the updated patch.
>>>
>>> I rebased the patch to current git master.
>>
>> This patch looks fine to me.  I will mark it Ready for Committer.
>>
>> (Someone else please feel free to pick it up for the actual commit, if
>> you have cycles.)
>
> I see that the docs part of the patch removes the mentioning of
> reporting servers - is that intentional, or a mistake? Seems that
> usecase still remains, no?

Also, the patch no longer applies, since it conflicts with
faa0550572583f51dba25611ab0f1d1c31de559b.

Since you (Fujii-san) wrote both of them, feel like rebasing it
properly for current master?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
Thanks for the review!

On Thu, Feb 10, 2011 at 11:25 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I see that the docs part of the patch removes the mentioning of
> reporting servers - is that intentional, or a mistake? Seems that
> usecase still remains, no?

It was intentional, but I agree with you. I re-added the mention of
the reporting servers.

On Thu, Feb 10, 2011 at 11:30 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Also, the patch no longer applies, since it conflicts with
> faa0550572583f51dba25611ab0f1d1c31de559b.
>
> Since you (Fujii-san) wrote both of them, feel like rebasing it
> properly for current master?

Yeah, I rebased the patch to the current git master and attached it.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: pg_ctl failover Re: Latches, signals, and waiting

From
Stephen Frost
Date:
Fujii,

* Fujii Masao (masao.fujii@gmail.com) wrote:
> Yeah, I rebased the patch to the current git master and attached it.

Reviewing this, I just had a couple of comments and questions.  Overall,
I think it looks good and hence will be marking it 'Ready for
Committer'.

* You removed trigger_file from the list in
  doc/src/sgml/high-availability.sgml and I'm not sure I agree with
  that.  It's still perfectly valid and could be used by someone
  instead of pg_ctl promote.  I'd recommend two things:
  - Adding comments into this recovery.conf snippet
  - Adding a comment indicating that trigger_file is only needed if
    you're not using pg_ctl promote.

* I'm not happy that pg_ctl.c doesn't #include something which defines
  all the file names which are used, couldn't we use a header which
  makes sense and is pulled in by pg_ctl.c and xlog.c to #define all of
  these?  Still, that's not really the fault of this patch.

* I'm a bit worried that there are only so many USR signals that we
  can send and it looks like we're burning another one here.  Should we
  be considering a better way to do this?

      Thanks,
    Stephen

Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Tue, Feb 15, 2011 at 2:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Fujii,
>
> * Fujii Masao (masao.fujii@gmail.com) wrote:
>> Yeah, I rebased the patch to the current git master and attached it.
>
> Reviewing this, I just had a couple of comments and questions.  Overall,
> I think it looks good and hence will be marking it 'Ready for
> Committer'.

Thanks for the review!

>  * You removed trigger_file from the list in
>   doc/src/sgml/high-availability.sgml and I'm not sure I agree with
>   that.  It's still perfectly valid and could be used by someone
>   instead of pg_ctl promote.  I'd recommend two things:
>   - Adding comments into this recovery.conf snippet

Is adding the following enough?

+# NOTE that if you plan to use "pg_ctl promote" command to promote
+# the standby, no trigger file needs to be specified.

>   - Adding a comment indicationg that trigger_file is only needed if
>         you're not using pg_ctl promote.

Where should I add such a comment?

>  * I'm not happy that pg_ctl.c doesn't #include something which defines
>   all the file names which are used, couldn't we use a header which
>   makes sense and is pulled in by pg_ctl.c and xlog.c to #define all of
>   these?  Still, that's not really the fault of this patch.

That would make sense, but I'm not sure it's possible. As a trial,
I added '#include "access/xlog.h"' into pg_ctl.c and compiled the source,
but I got many compilation errors. So hacking the Makefiles is probably
required to do that. Do you know what should be changed?

>  * I'm a bit worried that there's just only so many USR signals that we
>   can send and it looks like we're burning another one here.  Should we
>   be considering a better way to do this?

Are you worried about the case where a user mistakenly sends SIGUSR2
to the startup process, and the standby is then unexpectedly promoted
to master?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Stephen Frost
Date:
* Fujii Masao (masao.fujii@gmail.com) wrote:
> On Tue, Feb 15, 2011 at 2:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> >  * You removed trigger_file from the list in
> >   doc/src/sgml/high-availability.sgml and I'm not sure I agree with
> >   that.  It's still perfectly valid and could be used by someone
> >   instead of pg_ctl promote.  I'd recommend two things:
> >   - Adding comments into this recovery.conf snippet
>
> Adding the following is enough?
>
> +# NOTE that if you plan to use "pg_ctl promote" command to promote
> +# the standby, no trigger file needs to be specified.

No..  I was thinking of having comments throughout the whole file,
similar to what we have for postgresql.conf.sample.  I realize it's an
example in the docs, but people are likely to copy & paste it into their
own files.

> >   - Adding a comment indicationg that trigger_file is only needed if
> >         you're not using pg_ctl promote.
>
> Where should I add such a comment?

This was still in the example recovery.conf in high-availability.sgml.

> >  * I'm not happy that pg_ctl.c doesn't #include something which defines
> >   all the file names which are used, couldn't we use a header which
> >   makes sense and is pulled in by pg_ctl.c and xlog.c to #define all of
> >   these?  Still, that's not really the fault of this patch.
>
> That would make sense. But I'm not sure that's possible. As a trial,
> I added '#include "access/xlog.h"' into pg_ctl.c and compiled the source,
> but I got many compilation errors. So probably hacking Makefiles is
> required to do that. Do you know where should be changed?

No, I wouldn't expect that to work..  I would have thought something
that was already included in the necessary places, eg, miscadmin.h.

> >  * I'm a bit worried that there's just only so many USR signals that we
> >   can send and it looks like we're burning another one here.  Should we
> >   be considering a better way to do this?
>
> You're worried about the case where users wrongly send the SIGUSR2
> to the startup process, and then the standby is brought up to the master
> unexpectedly?

No..  My concern was that it looked like we were adding a meaning to
SIGUSR2 which might make it unavailable for other uses later, and at
least according to man 7 signal on my system, there's only 2
User-defined signals available (SIGUSR1 and SIGUSR2)..

I just wouldn't want to use up something which is a finite resource
without good cause, in case we have a need for it later.  Now, perhaps
that's not happening and my concern is unfounded.
Thanks,
    Stephen

Re: pg_ctl failover Re: Latches, signals, and waiting

From
Robert Haas
Date:
On Sun, Feb 13, 2011 at 10:18 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Thanks for the review!
>
> On Thu, Feb 10, 2011 at 11:25 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> I see that the docs part of the patch removes the mentioning of
>> reporting servers - is that intentional, or a mistake? Seems that
>> usecase still remains, no?
>
> It was intentional, but I agree with you. I re-added the mention to
> the reporting servers.
>
> On Thu, Feb 10, 2011 at 11:30 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> Also, the patch no longer applies, since it conflicts with
>> faa0550572583f51dba25611ab0f1d1c31de559b.
>>
>> Since you (Fujii-san) wrote both of them, feel like rebasing it
>> properly for current master?
>
> Yeah, I rebased the patch to the current git master and attached it.

Committed with minor tweaks to comments and documentation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_ctl failover Re: Latches, signals, and waiting

From
Fujii Masao
Date:
On Wed, Feb 16, 2011 at 11:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Committed with minor tweaks to comments and documentation.

Thanks a lot!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center