Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?) - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?) |
Date | |
Msg-id | CALj2ACWNHCEefPXqFw7LfZa_b3UAi4uvvv6tf7qQSjRF2tL61A@mail.gmail.com Whole thread Raw |
In response to | Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?) (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)
|
List | pgsql-hackers |
On Wed, Sep 21, 2022 at 12:27 PM Michael Paquier <michael@paquier.xyz> wrote: > > >>> I've also taken help of the error callback mechanism to clean up the > >>> allocated memory in case of a failure. For do_pg_abort_backup() cases, > >>> I think we don't need to clean the memory because that gets called on > >>> proc exit (before_shmem_exit()). > >> > >> Memory could still bloat while the process running the SQL functions > >> is running depending on the error code path, anyway. > > > > I didn't get your point. Can you please elaborate it? I think adding > > error callbacks at the right places would free up the memory for us. > > Please note that we already are causing memory leaks on HEAD today. > > I mean that HEAD makes no effort in freeing this memory in > TopMemoryContext on session ERROR. Correct. We can also solve it as part of this commit. Please let me know your thoughts on 0002 patch. > I have a few comments on 0001. > > +#include <access/xlogbackup.h> > Double quotes wanted here. Ah, my bad. Corrected now. > deallocate_backup_variables() is the only part of xlogbackup.c that > includes references of the tablespace map_and backup_label > StringInfos. I would be tempted to fully decouple that from > xlogbackup.c/h for the time being. There's no problem with it IMO, after all, they are backup related variables. And that function reduces a bit of duplicate code. > - tblspc_map_file = makeStringInfo(); > Not sure that there is a need for a rename here. We're referring tablespace_map and backup_label internally all around, just to be in sync, I wanted to rename it while we're refactoring this code. > +void > +build_backup_content(BackupState *state, bool ishistoryfile, StringInfo str) > +{ > It would be more natural to have build_backup_content() do by itself > the initialization of the StringInfo for the contents of backup_label > and return it as a result of the routine? This is short-lived in > xlogfuncs.c when the backup ends. See the below explaination. > @@ -248,18 +250,25 @@ perform_base_backup(basebackup_options *opt, bbsink *sink) > [...] > + /* Construct backup_label contents. */ > + build_backup_content(backup_state, false, backup_label); > > Actually, for base backups, perhaps it would be more intuitive to > build and free the StringInfo of the backup_label when we send it for > base.tar rather than initializing it at the beginning and freeing it > at the end? sendFileWithContent() is in a for-loop and we are good if we call build_backup_content() before do_pg_backup_start() just once. Also, allocating backup_label in the for loop makes error handling trickier, how do we free-up when sendFileWithContent() errors out? Of course, we can allocate backup_label once even in the for loop with bool first_time sort of variable and store StringInfo *ptr_backup_label; in error callback info, but that would make things unnecessarily complex, instead we're good allocating and creating backup_label content at the beginning and freeing-it up at the end. > - * pg_backup_start: set up for taking an on-line backup dump > + * pg_backup_start: start taking an on-line backup. > * > - * Essentially what this does is to create a backup label file in $PGDATA, > - * where it will be archived as part of the backup dump. The label file > - * contains the user-supplied label string (typically this would be used > - * to tell where the backup dump will be stored) and the starting time and > - * starting WAL location for the dump. > + * This function starts the backup and creates tablespace_map contents. > > The last part of the comment is still correct while the former is not, > so this loses some information. Added that part before pg_backup_stop() now where it makes sense with the refactoring. I'm attaching the v10 patch-set with the above review comments addressed. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-hackers by date: