Re: [BUG] non archived WAL removed during production crash recovery - Mailing list pgsql-bugs

From Jehan-Guillaume de Rorthais
Subject Re: [BUG] non archived WAL removed during production crash recovery
Date
Msg-id 20200414180923.759f4d9c@firost
Whole thread Raw
In response to Re: [BUG] non archived WAL removed during production crash recovery  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-bugs
On Fri, 10 Apr 2020 11:00:31 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
[...]
> > > but the mistake here is it thinks that inRecovery represents whether it is
> > > running as a standby or not, but actually it is true on primary during
> > > crash recovery.  
> > 
> > Indeed.
> >   
> > > On the other hand, with the patch, standby with archive_mode=on
> > > wrongly archives WAL files during crash recovery.  
> > 
> > "without the patch" you mean? You are talking about 78ea8b5daab, right?  
> 
> No. I menat the v4 patch in [1].
> 
> [1] https://www.postgresql.org/message-id/20200407171736.61906608%40firost
> 
> Prior to appllying the patch (that is the commit 78ea..),
> XLogArchiveCheckDone() correctly returns true (= allow to remove it)
> for the same condition.
> 
> The proposed patch does the folloing thing.
> 
> if (!XLogArchivingActive() ||
>     recoveryState == ARCHIVE_RECOVERING && !XLogArchivingAlways())
>     return true;
> 
> It doesn't return for the condition "recoveryState=CRASH_RECOVERING
> and archive_mode = on". Then the WAL files is mitakenly marked as
> ".ready" if not marked yet.

Indeed.

But .ready files are then deleted during the first restartpoint. I'm not sure
how to fix this behavior without making the code too complex.

This is discussed in my last answer to Michael few minutes ago as well.

> By the way, the code seems not following the convention a bit
> here. Let the inserting code be in the same style to the existing code
> around.
> 
> +    if ( ! XLogArchivingActive() )
> 
> I think we don't put the  spaces within the parentheses above.

Indeed, This is fixed in patch v5 sent few minutes ago.

> | ARCHIVE_RECOVERING/CRASH_RECOVERING/NOT_IN_RECOVERY
> 
> The first two and the last one are in different style. *I* prefer them
> (if we use it) be "IN_ARCHIVE_RECOVERY/IN_CRASH_RECOVERY/NOT_IN_RECOVERY".

I like Michael's proposal. See v5 of the patch.

Thank you for your review!

Regards,



pgsql-bugs by date:

Previous
From: Jehan-Guillaume de Rorthais
Date:
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Next
From: PG Bug reporting form
Date:
Subject: BUG #16362: yum repo: duplicated definition