Re: WIP/PoC for parallel backup - Mailing list pgsql-hackers

From Jeevan Chalke
Subject Re: WIP/PoC for parallel backup
Date
Msg-id CAM2+6=VcMgSje+d7rtbcmqioymZKNqkoDMFSzacHjTfXEe=Bfw@mail.gmail.com
Whole thread Raw
In response to Re: WIP/PoC for parallel backup  (Asif Rehman <asifr.rehman@gmail.com>)
Responses Re: WIP/PoC for parallel backup  (Asif Rehman <asifr.rehman@gmail.com>)
List pgsql-hackers
Hi Asif,

On Thu, Jan 30, 2020 at 7:10 PM Asif Rehman <asifr.rehman@gmail.com> wrote:

Here are the the updated patches, taking care of the issues pointed
earlier. This patch adds the following commands (with specified option):

START_BACKUP [LABEL '<label>'] [FAST]
STOP_BACKUP [NOWAIT]
LIST_TABLESPACES [PROGRESS]
LIST_FILES [TABLESPACE]
LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X']
                            [NOVERIFY_CHECKSUMS]


Parallel backup is not making any use of tablespace map, so I have
removed that option from the above commands. There is a patch pending
to remove the exclusive backup; we can further refactor the do_pg_start_backup
function at that time, to remove the tablespace information and move the
creation of tablespace_map file to the client.


I have disabled the maxrate option for parallel backup. I intend to send
out a separate patch for it. Robert previously suggested to implement
throttling on the client-side. I found the original email thread [1] 
where throttling was proposed and added to the server. In that thread,
it was originally implemented on the client-side, but per many suggestions,
it was moved to server-side.

So, I have a few suggestions on how we can implement this:

1- have another option for pg_basebackup (i.e. per-worker-maxrate) where
the user could choose the bandwidth allocation for each worker. This approach
can be implemented on the client-side as well as on the server-side.

2- have the maxrate, be divided among workers equally at first. and the 
let the main thread keep adjusting it whenever one of the workers finishes.
I believe this would only be possible if we handle throttling on the client.
Also, as I understand it, implementing this will introduce additional mutex
for handling of bandwidth consumption data so that rate may be adjusted 
according to data received by threads.


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca



The latest changes look good to me. However, the patch set is missing the documentation.
Please add those.

Thanks

--
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Is custom MemoryContext prohibited?
Next
From: Craig Ringer
Date:
Subject: Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?