Re: [BUG] non archived WAL removed during production crash recovery - Mailing list pgsql-bugs

From Jehan-Guillaume de Rorthais
Subject Re: [BUG] non archived WAL removed during production crash recovery
Date
Msg-id 20200402154915.4984cff2@firost
In response to Re: [BUG] non archived WAL removed during production crash recovery  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: [BUG] non archived WAL removed during production crash recovery
Re: [BUG] non archived WAL removed during production crash recovery
List pgsql-bugs
On Thu, 2 Apr 2020 19:38:59 +0900
Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

> On 2020/04/02 16:23, Kyotaro Horiguchi wrote:
> > At Thu, 2 Apr 2020 14:19:15 +0900, Fujii Masao
> > <masao.fujii@oss.nttdata.com> wrote in
[...]
> >> is whether to remove such WAL files in archive recovery case with
> >> archive_mode=on. Those WAL files would be required when recovering
> >> from the backup taken before that archive recovery happens.
> >> So it seems unsafe to remove them in that case.
> >
> > I'm not sure I'm getting the intention correctly, but I think it is
> > the responsibility of the operator to provide a complete set of
> > archived WAL files for a backup.  Could you elaborate on what
> > operation steps you are assuming?
>
> Please imagine the case where you need to do archive recovery
> from the database snapshot taken while there are many WAL files
> with .ready files. Those WAL files have not been archived yet.
> In this case, ISTM those WAL files should not be removed until
> they are archived, when archive_mode = on.

If you rely on a snapshot taken without pg_start_backup/pg_stop_backup, I
agree. These WAL files should be archived if:

* archive_mode >= on for primary
* archive_mode = always for standby
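
The two conditions above can be written as a single predicate. This is only
a sketch to make the rule concrete, not PostgreSQL code: the function name
and the is_standby flag are hypothetical, and the off < on < always ordering
is assumed from the ">= on" notation.

```python
from enum import IntEnum


class ArchiveMode(IntEnum):
    # Ordering assumed from the ">= on" notation: off < on < always.
    OFF = 0
    ON = 1
    ALWAYS = 2


def should_archive_ready_wal(mode: ArchiveMode, is_standby: bool) -> bool:
    """Return True if WAL files marked .ready are expected to be archived.

    Hypothetical helper: a standby only archives with archive_mode=always,
    while a primary archives with archive_mode=on or always.
    """
    if is_standby:
        return mode == ArchiveMode.ALWAYS
    return mode >= ArchiveMode.ON
```

For example, should_archive_ready_wal(ArchiveMode.ON, is_standby=True) is
False, which is the case where a standby does not archive the .ready files.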

> >> Therefore, IMO the patch should change the code so that
> >> no unarchived WAL files are removed, not only in crash recovery
> >> but also in archive recovery. Thoughts?
> >
> > Agreed if "an unarchived WAL" means "a WAL file that is marked .ready"
> > and it should be archived immediately.  My previous mail is written
> > based on the same thought.
>
> Ok, so our *current* consensus seems to be the following. Right?
>
> - If archive_mode=off, any WAL files with .ready files are removed in
>     crash recovery, archive recovery and standby mode.

yes

> - If archive_mode=on, WAL files with .ready files are removed only in
>     standby mode. In crash recovery and archive recovery cases, they keep
>     remaining and would be archived after recovery finishes (i.e., during
>     normal processing).

yes

> - If archive_mode=always, in crash recovery, archive recovery and
>     standby mode, WAL files with .ready files are archived if WAL archiver
>     is running.

yes

> That is, WAL files with .ready files are removed when either
> archive_mode!=always in standby mode or archive_mode=off.

sounds fine to me.
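
To double check that the one-line summary matches the three per-mode rules,
here is a small sketch that enumerates every combination. The function names
and the string labels are made up for illustration; nothing here is actual
PostgreSQL code, and "archive" stands for "archived if the WAL archiver is
running".

```python
MODES = ("off", "on", "always")
STATES = ("crash recovery", "archive recovery", "standby mode")


def action_per_rule(mode: str, state: str) -> str:
    """Fate of a WAL file with a .ready file, following the three rules."""
    if mode == "off":
        return "remove"  # removed in all three states
    if mode == "on":
        # Removed only in standby mode; otherwise kept and archived
        # after recovery finishes.
        return "remove" if state == "standby mode" else "keep"
    return "archive"  # archive_mode=always, if the archiver is running


def removed_per_summary(mode: str, state: str) -> bool:
    """The summary: removed when either archive_mode!=always in standby
    mode, or archive_mode=off."""
    return mode == "off" or (mode != "always" and state == "standby mode")


def rules_agree() -> bool:
    """Check that both formulations agree on all nine combinations."""
    return all(
        (action_per_rule(m, s) == "remove") == removed_per_summary(m, s)
        for m in MODES
        for s in STATES
    )
```

Calling rules_agree() compares the two formulations over the nine
mode/state pairs, so a mismatch in either the rules or the summary would
show up as False.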

[...]
> >>>>>> Another is to make the startup process remove .ready file if
> >>>>>> necessary.
> >>>>>
> >>>>> I'm not sure to understand this one.
> >>
> >> I was thinking to make the startup process remove such unarchived WAL
> >> files if archive_mode=on and StandbyModeRequested/ArchiveRecoveryRequested
> >> is true.

Ok, understood.

> > As mentioned above, I don't understand the point of preserving WAL
> > files that are either marked as .ready or not marked at all on a
> > standby with archive_mode=on.
>
> Maybe yes. But I'm not confident that there is no such case.

Well, it seems to me that this is what you suggested a few paragraphs above:

  «.ready files are removed when either archive_mode!=always in standby mode»



