Re: Allow pg_archivecleanup to remove backup history files - Mailing list pgsql-hackers

From torikoshia
Subject Re: Allow pg_archivecleanup to remove backup history files
Date
Msg-id e02ae6b16a32bdd2f4856e9b7f8a6439@oss.nttdata.com
In response to Re: Allow pg_archivecleanup to remove backup history files  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Allow pg_archivecleanup to remove backup history files
List pgsql-hackers
On 2023-05-15 09:18, Michael Paquier wrote:
> On Fri, May 12, 2023 at 05:53:45PM +0900, torikoshia wrote:
>> On 2023-05-10 17:52, Bharath Rupireddy wrote:
>> I was a little concerned about what to do when deleting both the files
>> ending in .gz and backup history files.
>> Is making it possible to specify both "-x .backup" and "-x .gz" the way
>> to go?
>> 
>> I was also concerned that someone might add ".backup" to WAL files, but
>> does that usually not happen?
> 
> Depends on the archive command, of course.  I have seen code using
> this suffix for some segment names in the past, FWIW.

Thanks for the information.
I'm going to stop adding special logic for "-x .backup" and add a new
option for removing backup history files.

>>> Comments on the patch:
>>> 1. Why just only the backup history files? Why not remove the timeline
>>> history files too? Is it because there may not be as many tli switches
>>> happening as backups?
>> 
>> Yeah, do you think we should also add logic for '-x .history'?
> 
> Timeline history files can be critical pieces when it comes to
> assigning a TLI as these could be retrieved by a restore_command
> during recovery for a TLI jump or just assign a new TLI number at the
> end of recovery, still you could presumably remove the TLI history
> files that are older than the TLI the segment defined refers to.
> (pg_archivecleanup has no specific logic to look at the history with
> child TLIs for example, to keep it simple, and I'd rather keep it this
> way).  There may be an argument for making that optional, of course,
> but it does not strike me as really necessary compared to the backup
> history files which are just around for debugging purposes.

Agreed.

>>> 2. +sub remove_backuphistoryfile_run_check
>>> +{
>>> Why invent a new function when run_check() can be made generic with
>>> a few arguments passed?
>> 
>> Thanks, I'm going to reconsider it.
> 
> +       <para>
> +         Remove files including backup history files, whose suffix is <filename>.backup</filename>.
> +         Note that when <replaceable>oldestkeptwalfile</replaceable> is a backup history file,
> +         specified file is kept and only preceding WAL files and backup history files are removed.
> +       </para>
> 
> This addition to the documentation looks imprecise to me.  Backup
> history files have a more complex format than just the .backup
> suffix, and this is documented in backup.sgml.

I'm going to remove the explanation for the backup history files and
just add the hyperlink to the original explanation in backup.sgml.

> How about plugging in some long options, and use something more
> explicit like --clean-backup-history?

Agreed.
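For illustration, here is a rough sketch of how such a long option could be
wired up with getopt_long(). This is not the patch itself: the CleanupOptions
struct, the parse_cleanup_options() function and the short letter 'b' are
hypothetical names chosen for the example. Only -n, -x and the
--clean-backup-history spelling come from the existing tool and this thread.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option holder; the field names are illustrative only. */
typedef struct CleanupOptions
{
	bool		dry_run;				/* -n */
	bool		clean_backup_history;	/* --clean-backup-history */
	const char *strip_extension;		/* -x EXT, or NULL if not given */
} CleanupOptions;

/*
 * Parse the command line into *opts.
 *
 * Returns the index of the first non-option argument, or -1 if an
 * unknown option was seen.  The short letter 'b' is an assumption.
 */
static int
parse_cleanup_options(int argc, char **argv, CleanupOptions *opts)
{
	static const struct option long_options[] = {
		{"clean-backup-history", no_argument, NULL, 'b'},
		{NULL, 0, NULL, 0}
	};
	int			c;

	opts->dry_run = false;
	opts->clean_backup_history = false;
	opts->strip_extension = NULL;

	optind = 1;					/* restart scanning on every call */

	while ((c = getopt_long(argc, argv, "bnx:", long_options, NULL)) != -1)
	{
		switch (c)
		{
			case 'b':
				opts->clean_backup_history = true;
				break;
			case 'n':
				opts->dry_run = true;
				break;
			case 'x':
				opts->strip_extension = optarg;
				break;
			default:
				return -1;		/* unknown option */
		}
	}
	return optind;
}
```

With this, "-x .gz" and "--clean-backup-history" can be given together, so the
extension handling and the backup history handling stay independent.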

> 
> -   if ((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) &&
> +   if (((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) ||
> +           (removeBackupHistoryFile && IsBackupHistoryFileName(walfile))) &&
>             strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
> 
> Could it be a bit cleaner to split this check in two, saving one level
> of indentation on the way for its most inner loop?  I would imagine
> something like:
>     /* Check file name */
>     if (!IsXLogFileName(walfile) &&
>         !IsPartialXLogFileName(walfile))
>         continue;
>     /* Check cutoff point */
>     if (strcmp(walfile + 8, exclusiveCleanupFileName + 8) >= 0)
>         continue;
>     //rest of the code doing the unlinks.
> --
Thanks, that looks better.
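To check my understanding, a minimal standalone sketch of the split check with
the backup history case folded in might look like the following. The helper
functions are simplified stand-ins modeled on the IsXLogFileName()-style
macros, not the real ones, and should_remove() and clean_backup_history are
hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <string.h>

#define XLOG_FNAME_LEN 24		/* 8 hex digits each: TLI, log, segment */

/* Simplified stand-in: exactly 24 upper-case hex digits. */
static bool
is_xlog_file_name(const char *fname)
{
	return strlen(fname) == XLOG_FNAME_LEN &&
		strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN;
}

/* Simplified stand-in: 24 hex digits followed by ".partial". */
static bool
is_partial_xlog_file_name(const char *fname)
{
	return strlen(fname) == XLOG_FNAME_LEN + strlen(".partial") &&
		strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN &&
		strcmp(fname + XLOG_FNAME_LEN, ".partial") == 0;
}

/* Simplified stand-in: 24 hex digits, anything, then ".backup". */
static bool
is_backup_history_file_name(const char *fname)
{
	return strlen(fname) > XLOG_FNAME_LEN &&
		strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN &&
		strcmp(fname + strlen(fname) - strlen(".backup"), ".backup") == 0;
}

/*
 * Decide whether one archive entry would be removed.  In the real loop
 * each "return false" would be a "continue".
 */
static bool
should_remove(const char *walfile, const char *cutoff,
			  bool clean_backup_history)
{
	/* Check file name */
	if (!is_xlog_file_name(walfile) &&
		!is_partial_xlog_file_name(walfile) &&
		!(clean_backup_history && is_backup_history_file_name(walfile)))
		return false;

	/* Check cutoff point, ignoring the 8-character timeline prefix */
	if (strcmp(walfile + 8, cutoff + 8) >= 0)
		return false;

	return true;
}
```

Because the name check runs first, the "+ 8" offset is only applied to names
that are already known to be at least 24 characters long.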

-- 
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


