Re: [BUG] non archived WAL removed during production crash recovery - Mailing list pgsql-bugs
From | Jehan-Guillaume de Rorthais |
---|---|
Subject | Re: [BUG] non archived WAL removed during production crash recovery |
Date | |
Msg-id | 20200416234308.3e5f64bf@firost Whole thread Raw |
In response to | Re: [BUG] non archived WAL removed during production crash recovery (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: [BUG] non archived WAL removed during production crash recovery
Re: [BUG] non archived WAL removed during production crash recovery |
List | pgsql-bugs |
On Thu, 16 Apr 2020 17:11:00 +0900 Michael Paquier <michael@paquier.xyz> wrote: > On Tue, Apr 14, 2020 at 06:03:19PM +0200, Jehan-Guillaume de Rorthais wrote: > > Thanks for the new version. Thank you for v6. > > On Mon, 13 Apr 2020 16:14:14 +0900 Michael Paquier <michael@paquier.xyz> > > wrote: > >> It seems to me that the initial value of SharedRecoveryState should be > >> CRASH_RECOVERING all the time no? StartupXLOG() is a code path taken > >> even if starting cleanly, and the flag would be reset correctly at the > >> end to NOT_IN_RECOVERY. > > > > Previous version of the patch was setting CRASH_RECOVERING. Fujii-san > > reported (the 4 Apr 2020 02:49:50 +0900) that it could be useful to expose > > a better value until relevant code is reached in StartupXLOG() so > > GetRecoveryState() returns a safer value for futur use. > > > > As I answered upthread, I'm not sure who would need this information before > > the WAL machinery is up though. Note that ARCHIVE_RECOVERING and > > CRASH_RECOVERING are compatible with the previous behavior. > > I am not sure either if we need this information until the startup > process is up, but even if we need it I'd rather keep the code > consistent with the past practice, which was that > SharedRecoveryInProgress got set to true, the equivalent of crash > recovery as that's more generic than the archive recovery switch. OK. > > Maybe the solution would be to init with CRASH_RECOVERING and add some > > comment in GetRecoveryState() to warn the state is "enforced" after the > > XLOG machinery is started and is init'ed to RECOVERING in the meantime? > > > > I initialized it to CRASH_RECOVERING in the new v5 patch and added a comment > > to GetRecoveryState(). > > Not sure that the comment is worth it. Your description of the state > looks enough, and the code is clear that we have just an initial > shared memory state in this case. OK. Will remove later after your review of the tests. > >> It would be good to mention that while in crash recovery, files are > >> not considered as deletable. > > > > Well, in fact, I am still wondering about this. I was hesitant to add a > > shortcut like: > > > > /* no WAL segment cleanup during crash recovery */ > > if (recoveryState == RECOVERT_STATE_CRASH) > > return false; > > > > But, what if for example we crashed for lack of disk space during intensive > > writes? During crash recovery, any WAL marked as .done could be removed and > > allow the system to start again and maybe make even further WAL cleanup by > > archiving some more WAL without competing with previous high write ratio. > > > > When we recover a primary, this behavior seems conform with any value of > > archive_mode, even if it has been changed after crash and before starting > > it. On a standby, we might create .ready files, but they will be removed > > during the first restartpoint if needed. > > I guess that you mean .ready and not .done here? No, I meant .done. > Removing .done files does not matter as the segments related to them are > already gone. Not necessarily. There's a time windows between the moment the archiver set the .done file and when the checkpointer removes the associated WAL file. So, after a PANIC because of lack of space, WAL associated with .done files but not removed yet will be removed during the crash recovery. > Even with that, why should we need to make the backend smarter about > the removal of .ready files during crash recovery. It seems to me > that we should keep them, and an operator could always come by himself > and do some manual cleanup to free some space in the pg_wal partition. We are agree on this. > > I needed a failure so I can test pg_stat_archiver reports it as well. In my > > mind, using "false" would either trigger a failure because false returns 1 > > or... a failure because the command is not found. In either way, the result > > is the same. > > > > Using poll_query_until+pg_stat_file, is a good idea, but not enough as > > archiver reports a failure some moment after the .ready signal file > > appears. > > Using an empty string makes the test more portable, but while I looked > at it I have found out that it leads to delays in the archiver except > if you force the generation of more segments in the test, causing the > logic to get more complicated with the manipulation of the .ready and > .done files. And I was then finding myself to add an > archive_timeout.. Anyway, this reduced the readability of the test so > I am pretty much giving up on this idea. Unless I'm wrong, the empty string does not raise an error in pg_stat_archiver, and I wanted to add a test on this as well. > >> The end of the tests actually relies on the > >> fact that archive_command is set to "false" when the cold backup is > >> taken, before resetting it. I think that it would be cleaner to > >> enforce the configuration you want to test before starting each > >> standby. It becomes easier to understand the flow of the test for the > >> reader. > > > > done as well. > > I have put my hands on the code, and attached is a cleaned up > version for the backend part. Below are some notes. > > + * RECOVERT_STATE_ARCHIVE is set for archive recovery or for a > standby. > Typo here that actually comes from my previous email, and that you > blindly copy-pasted, repeated five times in the tree actually. Oops...yes, even in a comment with RECOVERT_STATE_NONE :/ Sorry. > + RecoveryState recoveryState = GetRecoveryState(); > + > + /* The file is always deletable if archive_mode is "off". */ > + if (!XLogArchivingActive()) > + return true; > There is no point to call GetRecoveryState() is archive_mode = off. good catch! > + * There's two different reasons to recover WAL: when standby mode is > requested > + * or after a crash to catchup with consistency. > Nit: s/There's/There are/. Anyway, I don't see much point in keeping > this comment as the description of each state value should be enough, > so I have removed it. OK > I am currently in the middle of reviewing the test and there are a > couple of things that can be improved but I lack of time today, so > I'll continue tomorrow on it. There is no need to send two separate > patches by the way as the code paths touched are different. OK. Thanks for your review! Let me know if you want me to add/change/fix some tests. Regards,
pgsql-bugs by date: