Re: pg_rewind WAL segments deletion pitfall - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: pg_rewind WAL segments deletion pitfall |
Date | |
Msg-id | 20220927.165054.2142431385277288474.horikyota.ntt@gmail.com |
In response to | Re: pg_rewind WAL segments deletion pitfall (Polina Bungina <bungina@gmail.com>) |
Responses | Re: pg_rewind WAL segments deletion pitfall |
List | pgsql-hackers |
At Thu, 1 Sep 2022 13:33:09 +0200, Polina Bungina <bungina@gmail.com> wrote in
> Here is the new version of the patch that includes the changes you
> suggested. It is smaller now but I doubt if it is as easy to understand as
> it used to be.

pg_rewind works in two steps. First it constructs the file map, which
decides the action for each file; then it performs file operations
according to the file map. So, if we are going to do something on some
files, that action should be recorded in the file map, I think.

Regarding the patch, pg_rewind starts reading segments from the
divergence point back to the nearest checkpoint, then moves forward
during rewinding. So, the fact that SimpleXLogPageRead has read a
segment suggests that the segment is required during the next startup.
So I don't think we need to move around the keepWalSeg flag. All files
that are wanted while rewinding should be preserved unconditionally.

It's annoying that the file paths for the file map and open(2) have
different top directories. But sharing the same path string between the
two seems rather ugly..

I feel uncomfortable directly touching the internals of file_entry_t
outside filemap.c. I'd like to hide the internals in filemap.c, but
pg_rewind already does that..

+        /*
+         * Some entries (WAL segments) already have an action assigned
+         * (see SimpleXLogPageRead()).
+         */
+        if (entry->action == FILE_ACTION_NONE)
+            continue;
         entry->action = decide_file_action(entry);

It might be more reasonable to call decide_file_action() when action is
UNDECIDED.

> The need of manipulations with the target’s pg_wal/archive_status directory
> is a question to discuss…
>
> At first glance it seems to be useless for .ready files: checkpointer
> process will anyway recreate them if archiving is enabled on the rewound
> old primary and we will finally have them in the archive. As for the .done
> files, it seems reasonable to follow the pg_basebackup logic and keep .done
> files together with the corresponding segments (those between the last
> common checkpoint and the point of divergence) to protect them from being
> archived once again.
>
> But on the other hand it seems to be not that straightforward: imaging we
> have WAL segment X on the target along with X.done file and we decide to
> preserve them both (or we download it from archive and force .done file
> creation), while archive_mode was set to ‘always’ and the source (promoted
> replica) also still has WAL segment X and X.ready file. After pg_rewind we
> will end up with both X.ready and X.done, which seems to be not a good
> situation (but most likely not critical either).

Thanks for the thought. Yes, it's not so straightforward. And, as you
mentioned, the worst result of not doing that is that some
already-archived segments are archived again at the next run, which is
generally harmless. So I think we're ok to ignore that in this patch,
then create another patch if we still want to do that.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
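For illustration, the two-step flow described in the message (record every decision in the file map first, act on it afterwards) and the suggestion to call the decision function only for entries that are still undecided can be sketched as below. This is a minimal, self-contained sketch and not pg_rewind's actual code: the `sketch_*` types and functions, the three-value action enum, and the "remove everything under pg_wal/" policy are all hypothetical stand-ins for file_entry_t, the FILE_ACTION_* values, and decide_file_action().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for pg_rewind's file map types. */
typedef enum
{
    ACTION_UNDECIDED = 0,   /* no decision recorded yet */
    ACTION_NONE,            /* keep the target's file as it is */
    ACTION_REMOVE           /* delete the file from the target */
} sketch_action;

typedef struct
{
    const char   *path;     /* path relative to the data directory */
    sketch_action action;
} sketch_entry;

/*
 * Called when the WAL reader has opened a segment: record in the file
 * map that this segment must be preserved, unconditionally.
 */
static void
sketch_keep_wal_segment(sketch_entry *entries, size_t n, const char *path)
{
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(entries[i].path, path) == 0)
            entries[i].action = ACTION_NONE;
    }
}

/*
 * Stand-in policy (an assumption for this sketch only): remove anything
 * under pg_wal/, keep everything else.
 */
static sketch_action
sketch_decide_action(const sketch_entry *entry)
{
    assert(entry != NULL);

    if (strncmp(entry->path, "pg_wal/", 7) == 0)
        return ACTION_REMOVE;
    return ACTION_NONE;
}

/*
 * Decide only the entries that nobody has decided yet, so that actions
 * recorded earlier (preserved WAL segments) are left untouched.
 */
static void
sketch_decide_all(sketch_entry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].action == ACTION_UNDECIDED)
            entries[i].action = sketch_decide_action(&entries[i]);
    }
}
```

Checking for an explicit "undecided" state, rather than skipping entries whose action is "none", keeps the loop from depending on which particular action an earlier step happened to assign. A later execution step would then only need to walk the entries and apply whatever action is recorded.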