Re: Parallel copy - Mailing list pgsql-hackers

From vignesh C
Subject Re: Parallel copy
Date
Msg-id CALDaNm0q-Nh+V_Lj87QndFiXM=FMUA=nk1SGQSZ+J0ozZg-AEQ@mail.gmail.com
In response to Re: Parallel copy  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel copy
List pgsql-hackers


On Thu, Aug 27, 2020 at 8:04 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> - Parallel Copy with 1 worker ran slower than normal Copy in a couple
> of cases (I did question if allowing 1 worker was useful in my patch
> review).

Thanks Greg for your review & testing.
I ran various tests with 1GB, 2GB & 5GB csv files with 100 columns, without parallel mode & with 1 parallel worker. The test results are given below:
Test                                                          Without parallel mode   With 1 parallel worker
1GB csv file, 100 columns (100 bytes data in each column)     62 seconds              47 seconds (1.32X)
1GB csv file, 100 columns (1000 bytes data in each column)    89 seconds              78 seconds (1.14X)
2GB csv file, 100 columns (1 byte data in each column)        277 seconds             256 seconds (1.08X)
5GB csv file, 100 columns (100 bytes data in each column)     515 seconds             445 seconds (1.16X)
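
For anyone cross-checking the numbers, the figures in parentheses are consistent with the time without parallel mode divided by the time with 1 worker. A minimal sketch that recomputes them from the timings above (the dictionary keys and the `speedup` helper are just illustrative names, not part of the patch or the test scripts):

```python
# Timings in seconds, copied from the results above:
# (without parallel mode, with 1 parallel worker).
timings = {
    "1GB, 100 bytes per column": (62, 47),
    "1GB, 1000 bytes per column": (89, 78),
    "2GB, 1 byte per column": (277, 256),
    "5GB, 100 bytes per column": (515, 445),
}


def speedup(serial_seconds, parallel_seconds):
    """Return how many times faster the parallel run was than the serial run."""
    return serial_seconds / parallel_seconds


# Round to two decimals, matching the precision reported in the results.
speedups = {name: round(speedup(s, p), 2) for name, (s, p) in timings.items()}

for name, value in speedups.items():
    print(f"{name}: {value:.2f}X")
```

A value above 1.00X means the run with 1 worker finished faster than the run without parallel mode.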

I ran the tests multiple times and saw similar execution times in all the runs.
The above results show a slight improvement with 1 worker. In my tests I did not observe any degradation for copy with 1 worker compared to non-parallel copy. Can you share the script you used to generate the data & the DDL of the table, so that I can check the scenario in which you faced the problem?
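
For comparison, a csv file of the shape described above (a fixed number of columns, a fixed number of bytes in each column) could be generated along the following lines. This is only a sketch under assumptions, not the script used for the runs above: the function name `generate_csv`, its parameters, and the choice of random lowercase letters as column data are all hypothetical.

```python
import csv
import os
import random
import string
import tempfile


def generate_csv(path, target_bytes, columns=100, bytes_per_column=100, seed=0):
    """Write rows of random lowercase letters until at least target_bytes are written.

    Returns the number of rows written. Letters never need csv quoting, so each
    row is exactly columns * (bytes_per_column + 1) bytes: the data, the commas
    between columns, and the trailing newline.
    """
    rng = random.Random(seed)
    row_bytes = columns * (bytes_per_column + 1)
    size = 0
    rows = 0
    with open(path, "w", newline="", encoding="ascii") as f:
        writer = csv.writer(f, lineterminator="\n")
        while size < target_bytes:
            row = [
                "".join(rng.choices(string.ascii_lowercase, k=bytes_per_column))
                for _ in range(columns)
            ]
            writer.writerow(row)
            size += row_bytes
            rows += 1
    return rows


# Small demonstration: 10 columns of 4 bytes each, about 5000 bytes in total.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    demo_rows = generate_csv(demo_path, target_bytes=5000, columns=10, bytes_per_column=4)
    demo_size = os.path.getsize(demo_path)

print(demo_rows, demo_size)
```

For a 1GB file with 100 bytes in each column, the call would use `target_bytes=1024**3` with the default `columns` and `bytes_per_column`. The fixed seed keeps the generated data identical between runs, so timings from different runs are comparable.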

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
