Re: Allow pg_archivecleanup to remove backup history files - Mailing list pgsql-hackers
| From | torikoshia |
|---|---|
| Subject | Re: Allow pg_archivecleanup to remove backup history files |
| Date | |
| Msg-id | e02ae6b16a32bdd2f4856e9b7f8a6439@oss.nttdata.com Whole thread Raw |
| In response to | Re: Allow pg_archivecleanup to remove backup history files (Michael Paquier <michael@paquier.xyz>) |
| Responses |
Re: Allow pg_archivecleanup to remove backup history files
|
| List | pgsql-hackers |
On 2023-05-15 09:18, Michael Paquier wrote:
> On Fri, May 12, 2023 at 05:53:45PM +0900, torikoshia wrote:
>> On 2023-05-10 17:52, Bharath Rupireddy wrote:
>> I was a little concerned about what to do when deleting both the files
>> ending in .gz and backup history files.
>> Is making it possible to specify both "-x .backup" and "-x .gz" the
>> way to
>> go?
>>
>> I also concerned someone might add ".backup" to WAL files, but does
>> that
>> usually not happen?
>
> Depends on the archive command, of course. I have seen code using
> this suffix for some segment names in the past, FWIW.
Thanks for the information.
I'm going to stop adding special logic for "-x .backup" and add a new
option for removing backup history files.
>>> Comments on the patch:
>>> 1. Why just only the backup history files? Why not remove the
>>> timeline
>>> history files too? Is it because there may not be as many tli
>>> switches
>>> happening as backups?
>>
>> Yeah, do you think we should also add logic for '-x .history'?
>
> Timeline history files can be critical pieces when it comes to
> assigning a TLI as these could be retrieved by a restore_command
> during recovery for a TLI jump or just assign a new TLI number at the
> end of recovery, still you could presumably remove the TLI history
> files that are older than the TLI the segment defined refers too.
> (pg_archivecleanup has no specific logic to look at the history with
> child TLIs for example, to keep it simple, and I'd rather keep it this
> way). There may be an argument for making that optional, of course,
> but it does not strike me as really necessary compared to the backup
> history files which are just around for debugging purposes.
Agreed.
>>> 2.+sub remove_backuphistoryfile_run_check
>>> +{
>>> Why to invent a new function when run_check() can be made generic
>>> with
>>> few arguments passed?
>>
>> Thanks, I'm going to reconsider it.
>
> + <para>
> + Remove files including backup history files, whose suffix is
> <filename>.backup</filename>.
> + Note that when <replaceable>oldestkeptwalfile</replaceable>
> is a backup history file,
> + specified file is kept and only preceding WAL files and
> backup history files are removed.
> + </para>
>
> This addition to the documentation looks unprecise to me. Backup
> history files have a more complex format than just the .backup
> suffix, and this is documented in backup.sgml.
I'm going to remove the explanation for the backup history files and
just add the hyperlink to the original explanation in backup.sgml.
> How about plugging in some long options, and use something more
> explicit like --clean-backup-history?
Agreed.
>
> - if ((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) &&
> + if (((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) ||
> + (removeBackupHistoryFile &&
> IsBackupHistoryFileName(walfile))) &&
> strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
>
> Could it be a bit cleaner to split this check in two, saving one level
> of indentation on the way for its most inner loop? I would imagine
> something like:
> /* Check file name */
> if (!IsXLogFileName(walfile) &&
> !IsPartialXLogFileName(walfile))
> continue;
> /* Check cutoff point */
> if (strcmp(walfile + 8, exclusiveCleanupFileName + 8) >= 0)
> continue;
> //rest of the code doing the unlinks.
> --
Thanks, that looks better.
--
Regards,
--
Atsushi Torikoshi
NTT DATA CORPORATION
pgsql-hackers by date: