Re: Allow pg_archivecleanup to remove backup history files - Mailing list pgsql-hackers
From | torikoshia |
---|---|
Subject | Re: Allow pg_archivecleanup to remove backup history files |
Date | |
Msg-id | e02ae6b16a32bdd2f4856e9b7f8a6439@oss.nttdata.com Whole thread Raw |
In response to | Re: Allow pg_archivecleanup to remove backup history files (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Allow pg_archivecleanup to remove backup history files
|
List | pgsql-hackers |
On 2023-05-15 09:18, Michael Paquier wrote: > On Fri, May 12, 2023 at 05:53:45PM +0900, torikoshia wrote: >> On 2023-05-10 17:52, Bharath Rupireddy wrote: >> I was a little concerned about what to do when deleting both the files >> ending in .gz and backup history files. >> Is making it possible to specify both "-x .backup" and "-x .gz" the >> way to >> go? >> >> I also concerned someone might add ".backup" to WAL files, but does >> that >> usually not happen? > > Depends on the archive command, of course. I have seen code using > this suffix for some segment names in the past, FWIW. Thanks for the information. I'm going to stop adding special logic for "-x .backup" and add a new option for removing backup history files. >>> Comments on the patch: >>> 1. Why just only the backup history files? Why not remove the >>> timeline >>> history files too? Is it because there may not be as many tli >>> switches >>> happening as backups? >> >> Yeah, do you think we should also add logic for '-x .history'? > > Timeline history files can be critical pieces when it comes to > assigning a TLI as these could be retrieved by a restore_command > during recovery for a TLI jump or just assign a new TLI number at the > end of recovery, still you could presumably remove the TLI history > files that are older than the TLI the segment defined refers too. > (pg_archivecleanup has no specific logic to look at the history with > child TLIs for example, to keep it simple, and I'd rather keep it this > way). There may be an argument for making that optional, of course, > but it does not strike me as really necessary compared to the backup > history files which are just around for debugging purposes. Agreed. >>> 2.+sub remove_backuphistoryfile_run_check >>> +{ >>> Why to invent a new function when run_check() can be made generic >>> with >>> few arguments passed? >> >> Thanks, I'm going to reconsider it. > > + <para> > + Remove files including backup history files, whose suffix is > <filename>.backup</filename>. > + Note that when <replaceable>oldestkeptwalfile</replaceable> > is a backup history file, > + specified file is kept and only preceding WAL files and > backup history files are removed. > + </para> > > This addition to the documentation looks unprecise to me. Backup > history files have a more complex format than just the .backup > suffix, and this is documented in backup.sgml. I'm going to remove the explanation for the backup history files and just add the hyperlink to the original explanation in backup.sgml. > How about plugging in some long options, and use something more > explicit like --clean-backup-history? Agreed. > > - if ((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) && > + if (((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) || > + (removeBackupHistoryFile && > IsBackupHistoryFileName(walfile))) && > strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0) > > Could it be a bit cleaner to split this check in two, saving one level > of indentation on the way for its most inner loop? I would imagine > something like: > /* Check file name */ > if (!IsXLogFileName(walfile) && > !IsPartialXLogFileName(walfile)) > continue; > /* Check cutoff point */ > if (strcmp(walfile + 8, exclusiveCleanupFileName + 8) >= 0) > continue; > //rest of the code doing the unlinks. > -- Thanks, that looks better. -- Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
pgsql-hackers by date: