Re: WIP/PoC for parallel backup - Mailing list pgsql-hackers

From Asif Rehman
Subject Re: WIP/PoC for parallel backup
Date
Msg-id CADM=Jejj10qjvMs4aviFck+Qq2wwXedoFOm9unD-z1q7Dxzffg@mail.gmail.com
Whole thread Raw
In response to Re: WIP/PoC for parallel backup  (Asif Rehman <asifr.rehman@gmail.com>)
List pgsql-hackers


On Wed, Oct 30, 2019 at 7:16 PM Asif Rehman <asifr.rehman@gmail.com> wrote:


On Mon, Oct 28, 2019 at 8:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 28, 2019 at 10:03 AM Asif Rehman <asifr.rehman@gmail.com> wrote:
> I have updated the patch to include the changes suggested by Jeevan. This patch also implements the thread workers instead of
> processes and fetches a single file at a time. The tar format has been disabled for first version of parallel backup.

Looking at 0001-0003:

It's not clear to me what the purpose of the start WAL location is
supposed to be. As far as I can see, SendBackupFiles() stores it in a
variable which is then used for exactly nothing, and nothing else uses
it.  It seems like that would be part of a potential incremental
backup feature, but I don't see what it's got to do with parallel full
backup.

'startptr' is used by sendFile() during checksum verification. Since
SendBackupFiles() is using sendFIle we have to set a valid WAL location.


The tablespace_path option appears entirely unused, and I don't know
why that should be necessary here, either.

This is to calculate the basepathlen. We need to exclude the tablespace location (or 
base path) from the filename before it is sent to the client with sendFile call. I added
this option primarily to avoid performing string manipulation on filename to extract the
tablespace location and then calculate the basepathlen.

Alternatively we can do it by extracting the base path from the received filename. What
do you suggest?
 

STORE_BACKUPFILE() seems like maybe it should be a function rather
than a macro, and also probably be renamed, because it doesn't store
files and the argument's not necessarily a file.
Sure.
 

SendBackupManifest() does not send a backup manifest in the sense
contemplated by the email thread on that subject.  It sends a file
list.  That seems like the right idea - IMHO, anyway - but you need to
do a thorough renaming.

I'm considering the following command names:
START_BACKUP
- Starts the backup process

SEND_BACKUP_FILELIST (Instead of SEND_BACKUP_MANIFEST)
- Sends the list of all files (along with file information such as filename, file type (directory/file/link),
file size and file mtime for each file) to be backed up.

SEND_BACKUP_FILES
- Sends one or more files to the client.

STOP_BACKUP
- Stops the backup process.

I'll update the function names accordingly after your confirmation. Of course, suggestions for
better names are welcome.
 

I think it would be fine to decide that this facility won't support
exclusive-mode backup.

Sure. Will drop this patch.
 

I don't think much of having both sendDir() and sendDir_(). The latter
name is inconsistent with any naming convention we have, and there
seems to be no reason not to just add an argument to sendDir() and
change the callers. 

I think we should rename - perhaps as a preparatory patch - the
sizeonly flag to dryrun, or something like that.

Sure, will take care of it.


The resource cleanup does not look right.  You've included calls to
PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, 0) in both StartBackup()
and StopBackup(), but what happens if there is an error or even a
clean shutdown of the connection in between? I think that there needs
to be some change here to ensure that a walsender will always call
base_backup_cleanup() when it exits; I think that'd probably remove
the need for any PG_ENSURE_ERROR_CLEANUP calls at all, including ones
we have already.  This might also be something that could be done as a
separate, prepatory refactoring patch.

You're right. I didn't handle this case properly. I will removed PG_ENSURE_ERROR_CLEANUP
calls and replace it with before_shmem_exit handler. This way whenever backend process exits,
base_backup_cleanup will be called:
- If it exists before calling the do_pg_stop_backup, base_backup_cleanup will take care of cleanup.
- otherwise in case of a clean shutdown (after calling do_pg_stop_backup) then base_backup_cleanup
will simply return without doing anything.



The updated patches are attached.

Thanks,
 
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: MarkBufferDirtyHint() and LSN update
Next
From: Robert Haas
Date:
Subject: Re: [Proposal] Global temporary tables