Re: WIP/PoC for parallel backup - Mailing list pgsql-hackers

From Robert Haas
Subject Re: WIP/PoC for parallel backup
Date
Msg-id CA+Tgmobd+Lh8sO7V4wow3-9cfOf45MoyDh3pHP+3mp+8VgNh_w@mail.gmail.com
In response to Re: WIP/PoC for parallel backup  (Asif Rehman <asifr.rehman@gmail.com>)
List pgsql-hackers
On Fri, Oct 4, 2019 at 7:02 AM Asif Rehman <asifr.rehman@gmail.com> wrote:
> Based on my understanding, your main concern is that the files won't be distributed fairly, i.e. one worker
> might get a big file and take more time while others get done early with smaller files? In this approach I have
> created a list of files in descending order based on their sizes, so all the big files will come at the top. The
> maximum file size in PG is 1GB, so if we have four workers picking up files from the list one by one, the worst-case
> scenario is that one worker gets a 1GB file to process while the others get smaller files. However, with this
> approach of sorting files by descending size and handing them out to workers one by one, there is a very high
> likelihood of the work being distributed evenly. Does this address your concerns?
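[The one-by-one pickup from a size-sorted list described above is essentially longest-processing-time-first (LPT) greedy scheduling, assuming transfer time is roughly proportional to file size. A minimal illustrative sketch, not the patch's actual code; all names are hypothetical:

```python
import heapq

def distribute(file_sizes, n_workers):
    """Assign files (sizes in bytes) to workers, largest file first.

    Each file goes to the currently least-loaded worker, which models
    idle workers picking the next file off a descending-sorted list.
    """
    # heap entries: (total_bytes_assigned_so_far, worker_id)
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignments = {w: [] for w in range(n_workers)}
    for size in sorted(file_sizes, reverse=True):
        load, w = heapq.heappop(heap)
        assignments[w].append(size)
        heapq.heappush(heap, (load + size, w))
    return assignments
```

With sizes [1000, 600, 500, 400, 300] and two workers, this yields loads of 1400 bytes each, illustrating the even-distribution claim; it does not, of course, model workers slowing down for unrelated reasons.]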

Somewhat, but I'm not sure it's good enough. There are lots of reasons
why two processes that are started at the same time with the same
amount of work might not finish at the same time.

I'm also not particularly excited about having the server do the
sorting based on file size. That seems like it ought to be the
client's job, if the client needs the sorting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


