Re: Duplicate history file? - Mailing list pgsql-hackers
From | Tatsuro Yamada |
---|---|
Subject | Re: Duplicate history file? |
Date | |
Msg-id | 606d269f-d902-0b21-2bbf-892aaeda2dc9@nttcom.co.jp_1 |
In response to | Re: Duplicate history file? (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Duplicate history file?
|
List | pgsql-hackers |
Hi Horiguchi-san,

>> Regarding "test ! -f",
>> I am wondering how many people are using the test command for
>> archive_command. If I remember correctly, the guide provided by
>> NTT OSS Center that we are using does not recommend using the test
>> command.
>
> I think, as the PG-REX documentation says, the simple cp works well as
> far as the assumption of PG-REX - no double failure happens, and
> following the instruction - holds.

I believe that this assumption started to be wrong after
archive_mode=always was introduced. As far as I can tell, it doesn't
happen when it's archive_mode=on.

> On the other hand, I found that the behavior happens more generally.
>
> If a standby with archive_mode=always crashes, it starts recovery from
> the last checkpoint. If the checkpoint were in an archived segment, the
> restarted standby will fetch the already-archived segment from archive
> then fails to archive it. (The attached).
>
> So, your fear stated upthread is applicable for wider situations. The
> following suggestion is rather harmful for the archive_mode=always
> setting.
>
> https://www.postgresql.org/docs/14/continuous-archiving.html
>> The archive command should generally be designed to refuse to
>> overwrite any pre-existing archive file. This is an important safety
>> feature to preserve the integrity of your archive in case of
>> administrator error (such as sending the output of two different
>> servers to the same archive directory).
>
> I'm not sure how we should treat this.. Since archive must store
> files actually applied to the server data, just being already archived
> cannot be the reason for omitting archiving. We need to make sure the
> new file is byte-identical to the already-archived version. We could
> compare just *restored* file to the same file in pg_wal but it might
> be too much of penalty for the benefit. (Attached second file.)

Thanks for creating the patch!
> Otherwise the documentation would need something like the following if
> we assume the current behavior.
>
>> The archive command should generally be designed to refuse to
>> overwrite any pre-existing archive file. This is an important safety
>> feature to preserve the integrity of your archive in case of
>> administrator error (such as sending the output of two different
>> servers to the same archive directory).
> + For standby with the setting archive_mode=always, there's a case where
> + the same file is archived more than once. For safety, it is
> + recommended that when the destination file exists, the archive_command
> + returns zero if it is byte-identical to the source file.

Agreed. That is the same solution as I mentioned earlier.
If possible, it would also be better to write it in postgresql.conf
(that might be overkill?!).

Regards,
Tatsuro Yamada
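
For illustration only, the behavior proposed in the quoted documentation
text (return zero when the destination already exists and is
byte-identical to the source, fail when it differs) could be sketched as
a small wrapper script. This is not the patch attached in this thread.
The function name archive_wal and the temporary-file naming are
assumptions made for the example, and the sketch does not fsync the
copied file or guard against two archivers racing on the same
destination.

```shell
#!/bin/sh
# Hypothetical archive_command helper (illustrative sketch, not from the thread).
# Usage: archive_wal SRC DST
#   SRC: path of the WAL or history file to archive (what %p expands to)
#   DST: full destination path inside the archive directory

archive_wal() {
    src=$1
    dst=$2

    if [ -f "$dst" ]; then
        # Destination already exists: report success only if the two
        # files are byte-identical, so re-archiving the same segment
        # after a standby restart is not treated as an error.
        if cmp -s "$src" "$dst"; then
            return 0
        fi
        echo "archive_wal: $dst already exists with different content" >&2
        return 1
    fi

    # Copy under a temporary name first, then rename, so that a partial
    # copy is never visible under the final name.
    tmp="$dst.tmp.$$"
    if cp "$src" "$tmp" && mv "$tmp" "$dst"; then
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# When run as a script with exactly two arguments, archive that one file.
# The exit status of archive_wal becomes the exit status of the script.
if [ "$#" -eq 2 ]; then
    archive_wal "$1" "$2"
fi
```

Assuming the script were installed at a hypothetical path such as
/usr/local/bin/archive_wal.sh, the postgresql.conf setting might look like
archive_command = '/usr/local/bin/archive_wal.sh %p /mnt/server/archivedir/%f',
where %p and %f are the usual archive_command placeholders and the
archive directory is the one used in the PostgreSQL documentation's
examples. A mismatch still returns non-zero, which keeps the protection
against two different servers writing to the same archive directory.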