Re: [9.3 bug] disk space in pg_xlog increases during archive recovery - Mailing list pgsql-hackers

From MauMau
Subject Re: [9.3 bug] disk space in pg_xlog increases during archive recovery
Msg-id 60EB3E4B873541D393B6364D1F008EDA@maumau
In response to Re: [9.3 bug] disk space in pg_xlog increases during archive recovery  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: [9.3 bug] disk space in pg_xlog increases during archive recovery
List pgsql-hackers
From: "Andres Freund" <andres@2ndquadrant.com>
> On 2014-02-02 23:50:40 +0900, Fujii Masao wrote:
>> Right. If standby_mode is enabled, checkpoint_segments can trigger
>> the restartpoint. But the problem is that the timing of restartpoint
>> depends on not only the checkpoint parameters (i.e.,
>> checkpoint_timeout and checkpoint_segments) that are used during
>> archive recovery but also the checkpoint WAL that was generated
>> by the master.
>
> Sure. But we really *need* all the WAL since the last checkpoint's redo
> location locally to be safe.
>
>> For example, could you imagine the case where the master generated
>> only one checkpoint WAL since the last backup and it crashed with
>> database corruption. Then DBA decided to perform normal archive
>> recovery by using the last backup. In this case, even if DBA reduces
>> both checkpoint_timeout and checkpoint_segments, only one
>> restartpoint can occur during recovery. This low frequency of
>> restartpoint might fill up the disk space with lots of WAL files.
>
> I am not sure I understand the point of this scenario. If the primary
> crashed after a checkpoint, there won't be that much WAL since it
> happened...
>
>> > If the issue is that you're not using standby_mode (if so, why?), then
>> > the fix maybe is to make that apply to a wider range of situations.
>>
>> I guess that he is not using standby_mode because, according to
>> his first email in this thread, he said he would like to prevent WAL
>> from accumulating in pg_xlog during normal archive recovery (i.e., PITR).
>
> Well, that doesn't necessarily prevent you from using
> standby_mode... But yes, that might be the case.
>
> I wonder if we shouldn't just always look at checkpoint segments during
> !crash recovery.

We could consider going in that direction, but there is a problem: archive
recovery becomes slower than in 9.1 because of the repeated restartpoints.
Archive recovery should be as fast as possible, because it typically applies
dozens or hundreds of WAL files, and the DBA wants operation to resume as
soon as possible.

So, I think we should restore the 9.1 behavior for archive recovery.  The
attached patch keeps restored archived WAL files in pg_xlog/ only during
standby recovery.  It is based on Fujii-san's revision of the patch, with the
AllowCascadeReplication() condition removed from two if statements.
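
As a rough illustration of the rule described above (this is not the patch
itself; the function and parameter names are hypothetical stand-ins, not
identifiers from xlog.c), the decision could be sketched like this:

```c
#include <stdbool.h>

/*
 * Hypothetical sketch: decide whether a WAL segment restored from the
 * archive should be kept in pg_xlog/ after it has been replayed.
 *
 * in_archive_recovery: recovery is reading WAL from the archive
 * standby_mode:        the server is running as a standby
 *
 * Only standby recovery keeps the restored segment. Normal archive
 * recovery (PITR) does not, so pg_xlog/ does not grow while dozens or
 * hundreds of archived segments are applied. Cascading replication is
 * deliberately not part of the condition here.
 */
static bool
keep_restored_wal_in_pg_xlog(bool in_archive_recovery, bool standby_mode)
{
    return in_archive_recovery && standby_mode;
}
```

With this rule, keep_restored_wal_in_pg_xlog(true, false), i.e. normal
archive recovery, returns false, and only the standby case returns true.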

Regards
MauMau

Attachment
