Re: WIP/PoC for parallel backup - Mailing list pgsql-hackers

From Ahsan Hadi
Subject Re: WIP/PoC for parallel backup
Date
Msg-id CA+9bhCLKArdiEGETNyx36McDs+ssEDG=bKfj56p5ib8aLxndLw@mail.gmail.com
In response to Re: WIP/PoC for parallel backup  (Rushabh Lathia <rushabh.lathia@gmail.com>)
Responses Re: WIP/PoC for parallel backup
List pgsql-hackers


On Mon, May 4, 2020 at 6:22 PM Rushabh Lathia <rushabh.lathia@gmail.com> wrote:


On Thu, Apr 30, 2020 at 4:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage
<suraj.kharage@enterprisedb.com> wrote:
>
> Hi,
>
> We at EnterpriseDB did some performance testing of this parallel backup patch to check how beneficial it is; the results are below. In this testing, we ran the backup:
> 1) Without Asif’s patch
> 2) With Asif’s patch, using 1, 2, 4, and 8 workers.
>
> We ran these tests on two setups:
>
> 1) Client and Server both on the same machine (Local backups)
>
> 2) Client and server on a different machine (remote backups)
>
>
> Machine details:
>
> 1: Server (on which local backups were performed and which was used as the server for remote backups)
>
> 2: Client (Used as a client for remote backups)
>
>
...
>
>
> With the client and server on the same machine, the results show around a 50% improvement for the parallel runs with 4 and 8 workers.  We don’t see a large performance improvement as more workers are added.
>
>
> Whereas, when the client and server are on different machines, we don’t see any major performance benefit.  This result matches the testing results posted by David Zhang upthread.
>
>
>
> We ran the test for a 100GB backup with 4 parallel workers to see the CPU usage and other information. What we noticed is that the server is consuming almost 100% CPU the whole time, and pg_stat_activity shows that the server is busy with ClientWrite most of the time.
>
>

Was this for a setup where the client and server were on the same
machine or where the client was on a different machine?  If it was for
the case where both are on the same machine, then ideally, we should
see ClientRead events in a similar proportion?

In this particular setup, the client and server were on different machines.


During an offlist discussion with Robert, he pointed out that the
current basebackup code doesn't report a wait event for the reading
of files, which can change what pg_stat_activity shows.  Can you please
apply his latest patch to improve basebackup.c's code [1], which will
take care of that wait event, before getting the data again?

[1] - https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com
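
For comparing runs before and after that patch, a minimal sketch of how sampled wait events could be turned into proportions is below. It assumes the wait_event column of pg_stat_activity has been polled periodically for the backup backends and the values collected into a list; the function name and the sample data are hypothetical and are not results from this thread.

```python
from collections import Counter


def wait_event_shares(samples):
    """Return {wait_event: fraction of samples}.

    `samples` holds one wait_event value per poll of pg_stat_activity.
    A value of None (backend not waiting) is counted under "not waiting".
    """
    counts = Counter("not waiting" if ev is None else ev for ev in samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ev: n / total for ev, n in counts.items()}


# Made-up samples, only to show the shape of the input and output.
samples = ["ClientWrite"] * 7 + ["ClientRead"] * 2 + [None]
shares = wait_event_shares(samples)

# Print the events from most to least frequent.
for event, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{event}: {share:.0%}")
```

Running the same tally on both runs would show whether time currently attributed to ClientWrite moves to a file-read wait event once the patch is applied.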


Sure, we can try this out and do a similar run to collect the pg_stat_activity output.

Have you had the chance to try this out?
 
 
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




--
Rushabh Lathia


--
Highgo Software (Canada/China/Pakistan)
URL : http://www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
EMAIL: ahsan.hadi@highgo.ca
