Hi, Hackers
I noticed some strange behavior. Is it harmless?
I haven't detected any actual problem, but it makes me uneasy.
Steps:
- Two standbys are connected to a primary.
- Kill the primary and promote one standby.
- Restart the other standby with primary_conninfo reset to connect to the new primary.
I expected the latest WAL segment file of the old timeline to be renamed with the .partial suffix,
but it is not renamed on the restarted standby.
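
A minimal sketch of how this observation can be checked on each server. It only inspects file names in a WAL directory and relies on the standard naming of WAL segments (24 hex digits, the first 8 being the timeline ID). The function name and the path in the usage note are hypothetical, not part of PostgreSQL.

```python
import os
import re

# WAL segment file name: 8 hex digits of timeline ID followed by
# 16 hex digits of segment position, optionally with ".partial".
# Other files in pg_wal (e.g. *.history, archive_status) do not match.
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{16})(\.partial)?$")


def last_segment_of_timeline(wal_dir, tli):
    """Return (file_name, is_partial) for the highest-numbered segment
    of timeline `tli` found in `wal_dir`, or None if there is none.

    Hypothetical helper for illustration only.
    """
    candidates = []
    for name in os.listdir(wal_dir):
        m = WAL_NAME.match(name)
        if m and int(m.group(1), 16) == tli:
            # Fixed-width uppercase hex sorts correctly as a string.
            # If both the plain and the .partial file exist for the
            # same segment, the .partial one wins (True > False).
            candidates.append((m.group(2), m.group(3) is not None, name))
    if not candidates:
        return None
    _, is_partial, name = max(candidates)
    return name, is_partial
```

For example, last_segment_of_timeline("/path/to/standby/pg_wal", 1) could be run on both standbys after the promotion, assuming the old timeline is 1. On the restarted standby described above, the second element of the result would be False.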
The comment in xlog.c says the following, but I don't understand the problematic situation it describes.
* the archive. It's physically present in the new file with new TLI,
* but recovery won't look there when it's recovering to the older
--> * timeline. On the other hand, if we archive the partial segment, and
--> * the original server on that timeline is still running and archives
--> * the completed version of the same segment later, it will fail. (We
* used to do that in 9.4 and below, and it caused such problems).
*
* As a compromise, we rename the last segment with the .partial
* suffix, and archive it. Archive recovery will never try to read
* .partial segments, so they will normally go unused. But in the odd
* PITR case, the administrator can copy them manually to the pg_wal
* directory (removing the suffix). They can be useful in debugging,
* too.
Regards
Ryo Matsumura