On Fri, Apr 12, 2013 at 5:53 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> On 04/11/2013 07:29 PM, Fujii Masao wrote:
>>
>> On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing <hannu@2ndquadrant.com>
>> wrote:
>>>
>>> You just shut down the old master and let the standby catch
>>> up (takes a few microseconds ;) ) before you promote it.
>>>
>>> After this you can start up the former master with recovery.conf
>>> and it will follow nicely.
>>
>> No. When you shut down the old master, it might not have been
>> able to send all the WAL records to the standby.
>
> In what cases (other than a standby lagging too much or
> not listening at all) have you observed this?
>
>> I have observed
>> this situation several times. So in your approach, new standby
>> might fail to catch up with the master nicely.
>
> the page http://wiki.postgresql.org/wiki/Streaming_Replication claims this:
>
> * Graceful shutdown
>
> When smart/fast shutdown is requested, the primary waits to exit
> until XLOG records have been sent to the standby, up to the
> shutdown checkpoint record.
>
> Maybe you were requesting immediate shutdown?

No. I did a fast shutdown.

It's true that the master waits for the checkpoint record to be replicated to
the standby when a fast shutdown is performed. But the standby cannot always
successfully receive all the WAL records that the master sent.

To ensure that all WAL records have been replicated to the standby at fast
shutdown, we should make the walsender wait for the standby to write the
checkpoint record and send back the ACK.
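
Until something like that exists, one way to check by hand before promoting
is to compare WAL locations. The following is only a rough sketch in Python,
not anything PostgreSQL ships: parse_lsn() and standby_has_all_wal() are
hypothetical helper names, and how you obtain the two locations is an
assumption left to the operator (the standby side could come from
pg_last_xlog_receive_location(), for example). It assumes the master-side
value is the position just past the shutdown checkpoint record.

```python
def parse_lsn(text):
    """Convert a WAL location such as '0/3000060' into an integer.

    The textual form is two hexadecimal numbers separated by '/':
    the high 32 bits and the low 32 bits of the position.
    """
    hi, lo = text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standby_has_all_wal(master_end_lsn, standby_receive_lsn):
    """Return True if the standby has received WAL up to master_end_lsn.

    master_end_lsn is assumed to be the position just past the master's
    shutdown checkpoint record; standby_receive_lsn is the last location
    the standby reports as received. Both are hypothetical inputs.
    """
    return parse_lsn(standby_receive_lsn) >= parse_lsn(master_end_lsn)


if __name__ == "__main__":
    # The locations must be compared as numbers, not as strings:
    # '0/A000000' sorts after '0/10000000' textually but is earlier in WAL.
    print(standby_has_all_wal("0/10000000", "0/A000000"))
    print(standby_has_all_wal("0/3000060", "0/3000060"))
```

If the check fails, the old master holds WAL that the standby never received,
so it may not be able to follow the new master after promotion.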
Regards,
--
Fujii Masao