On Wed, Jan 22, 2014 at 6:37 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 01/21/2014 07:31 PM, Fujii Masao wrote:
>>
>> On Fri, Dec 20, 2013 at 9:21 PM, MauMau <maumau307@gmail.com> wrote:
>>>
>>> From: "Fujii Masao" <masao.fujii@gmail.com>
>>>
>>>> ! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)
>>>>
>>>> Even when standby_mode is not enabled, we can use cascade replication
>>>> and
>>>> it needs the accumulated WAL files. So I think that
>>>> AllowCascadeReplication()
>>>> should be added into this condition.
>>>>
>>>> ! snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
>>>> ! XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
>>>> !
>>>> ! if (restoredFromArchive)
>>>>
>>>> Don't we need to check !StandbyModeRequested and
>>>> !AllowCascadeReplication()
>>>> here?
>>>
>>>
>>> Oh, you are correct. Okay, done.
>>
>>
>> Thanks! The patch looks good to me. Attached is the updated version of
>> the patch. I added the comments.
>
>
> Sorry for reacting so slowly, but I'm not sure I like this patch. It's a
> quite useful property that all the WAL files that are needed for recovery
> are copied into pg_xlog, even when restoring from archive, even when not
> doing cascading replication. It guarantees that you can restart the standby,
> even if the connection to the archive is lost for some reason. I
> intentionally changed the behavior for archive recovery too, when it was
> introduced for cascading replication. Also, I think it's good that the
> behavior does not depend on whether cascading replication is enabled - it's
> a quite subtle difference.
>
> So, IMHO this is not a bug, it's a feature.

Yep.

> To solve the original problem of running out of disk space in archive
> recovery, I wonder if we should perform restartpoints more aggressively. We
> intentionally don't trigger restartpoints by checkpoint_segments, only
> checkpoint_timeout, but I wonder if there should be an option for that.

That's an option.

> MauMau, did you try simply reducing checkpoint_timeout, while doing
> recovery?

The problem is that reducing checkpoint_timeout on the server doing archive
recovery might not make restartpoints any more frequent. A restartpoint can
only be performed at a checkpoint record found in the replayed WAL, so the
frequency of restartpoints depends not only on checkpoint_timeout but also
on how often checkpoints happened while the original server was running.
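
To illustrate the point, here is a rough toy model. It is not PostgreSQL
code: count_restartpoints() and its parameters are made up for this
example, and it assumes a restartpoint is attempted every `timeout` seconds
and succeeds only if a new checkpoint record was replayed since the
previous one.

```c
#include <stddef.h>

/*
 * Toy model, NOT PostgreSQL code (hypothetical function and parameters).
 *
 * replay_times[] holds the times, in seconds since recovery started and in
 * ascending order, at which checkpoint records are replayed. Every
 * `timeout` seconds the model attempts a restartpoint; the attempt succeeds
 * only if at least one new checkpoint record has been replayed since the
 * previous successful restartpoint.
 */
int
count_restartpoints(const int *replay_times, size_t n, int timeout,
                    int duration)
{
    int     count = 0;
    size_t  next = 0;       /* first checkpoint record not yet consumed */
    int     now;

    if (timeout <= 0)
        return 0;           /* avoid looping forever on a bad setting */

    for (now = timeout; now <= duration; now += timeout)
    {
        int     seen_new = 0;

        /* consume every checkpoint record replayed up to this attempt */
        while (next < n && replay_times[next] <= now)
        {
            seen_new = 1;
            next++;
        }

        if (seen_new)
            count++;        /* restartpoint performed */
    }

    return count;
}
```

With checkpoint records replayed at 100, 700 and 1300 seconds and a
duration of 1500 seconds, this model yields the same number of
restartpoints for timeout = 300 and timeout = 30, because the attempts
made in between find no new checkpoint record to use.
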
Regards,
--
Fujii Masao