On 16 May 2012 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
>> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>>> However, this isn't true when I restart the standby. I've been
>>>> informed that this should work fine if a WAL archive has been
>>>> configured (which should be used anyway).
>>>
>>> The WAL archive should be shared by master-replica and replica-replica,
>>> and recovery_target_timeline should be set to latest in replica-replica.
>>> If you configure that way, replica-replica would successfully reconnect to
>>> master-replica with no need to restart it.
>>
>> I had set the archive_command on the primary, then produced a base
>> backup which would have copied the archive settings, but I also added
>> a corresponding restore_command setting, so everything was pointing
>> at the same archive.
>
> Hmm.. when doing the same, the replica-replica successfully reconnected
> to the master-replica after I shut down the master-master and promoted the
> master-replica. archive_command is the same in three servers,
> restore_command is the same in two standby servers (i.e., master-replica
> and replica-replica), and recovery_target_timeline is set to 'latest' in two
> standby servers.

I didn't shut down the master-master, but I didn't expect to need to.
I also had recovery_target_timeline set to latest. I also tried
explicitly setting it to the new timeline, and got an error saying
there was no such timeline.

>> But in any case, shouldn't the replication connection be
>> terminated when pg_basebackup is terminated?
>
> +1. To do this, we would need to define a SIGINT signal handler and make it
> send a QueryCancel packet when Ctrl-C is typed.
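
To illustrate the shape of that idea, here is a minimal Python sketch.
It is not pg_basebackup code (which is C): FakeConnection,
make_cancel_handler and install_sigint_cancel are hypothetical names,
and cancel() only stands in for whatever would actually send the
cancel request to the server.

```python
import signal


class FakeConnection:
    """Hypothetical stand-in for a replication connection."""

    def __init__(self):
        self.cancel_sent = False

    def cancel(self):
        # A real client would send a cancel request to the server here.
        self.cancel_sent = True


def make_cancel_handler(conn):
    """Build a signal handler that cancels conn before interrupting."""

    def handler(signum, frame):
        conn.cancel()
        # Re-raise the usual interrupt so the program still stops.
        raise KeyboardInterrupt

    return handler


def install_sigint_cancel(conn):
    """Install the handler for SIGINT (Ctrl-C); return the previous one."""
    return signal.signal(signal.SIGINT, make_cancel_handler(conn))
```

The handler cancels first and only then interrupts, so the server side
should not be left holding the connection open after Ctrl-C.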

Also, could we provide some feedback when using the -c spread option,
when there isn't progress within a short period of time? Something
like "Waiting for checkpoint. This can take up to
%checkpoint_timeout%", or something similar, rather than seeing
nothing happening and wondering if something has gone wrong. And also
a note in the documentation saying that, on "quiet" clusters, it may
take some time before the base backup commences. In fact, since
pg_start_backup will exhibit the same behaviour (i.e. no feedback when
waiting for a checkpoint), maybe that should return a notice (if there
are dirty pages) stating that it will complete when the next
checkpoint occurs.
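
As a rough sketch of the kind of feedback I mean (Python for brevity,
not a patch; wait_for_checkpoint and all of its parameters are
hypothetical, and the message wording is just the suggestion above):

```python
import time


def wait_for_checkpoint(is_done, checkpoint_timeout, notify_after=5.0,
                        poll_interval=0.5, clock=time.monotonic,
                        sleep=time.sleep, emit=print):
    """Poll is_done() until it returns True.

    If nothing has happened after notify_after seconds, emit a single
    notice so the user knows the wait is expected. Returns True if the
    notice was emitted, False otherwise.
    """
    start = clock()
    notified = False
    while not is_done():
        if not notified and clock() - start >= notify_after:
            emit("Waiting for checkpoint. This can take up to "
                 f"{checkpoint_timeout} seconds.")
            notified = True
        sleep(poll_interval)
    return notified
```

The notice is printed at most once, and not at all when the checkpoint
completes quickly, so busy clusters would see no extra output.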
--
Thom