
Re: Parallel copy - Mailing list pgsql-hackers

From	vignesh C
Subject	Re: Parallel copy
Date	August 31, 2020 13:43:48
Msg-id	CALDaNm0q-Nh+V_Lj87QndFiXM=FMUA=nk1SGQSZ+J0ozZg-AEQ@mail.gmail.com
In response to	Re: Parallel copy (Greg Nancarrow <gregn4422@gmail.com>)
Responses	Re: Parallel copy
List	pgsql-hackers


On Thu, Aug 27, 2020 at 8:04 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> - Parallel Copy with 1 worker ran slower than normal Copy in a couple
> of cases (I did question if allowing 1 worker was useful in my patch
> review).

Thanks Greg for your review & testing.
I ran tests with 1GB, 2GB & 5GB csv files of 100 columns each, both without parallel mode & with 1 parallel worker. The results are given below:

Test                                                        Without parallel mode   With 1 parallel worker
1GB csv file, 100 columns (100 bytes of data per column)    62 seconds              47 seconds (1.32X)
1GB csv file, 100 columns (1000 bytes of data per column)   89 seconds              78 seconds (1.14X)
2GB csv file, 100 columns (1 byte of data per column)       277 seconds             256 seconds (1.08X)
5GB csv file, 100 columns (100 bytes of data per column)    515 seconds             445 seconds (1.16X)
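
The factor in parentheses is the time without parallel mode divided by the time with 1 worker. A minimal sketch of that arithmetic, using only the timings from the table above (the dictionary keys and the function name are just illustrative labels):

```python
# Timings in seconds, copied from the table above:
# (without parallel mode, with 1 parallel worker).
timings = {
    "1GB, 100 columns, 100 bytes/column": (62, 47),
    "1GB, 100 columns, 1000 bytes/column": (89, 78),
    "2GB, 100 columns, 1 byte/column": (277, 256),
    "5GB, 100 columns, 100 bytes/column": (515, 445),
}


def speedup(serial_seconds, parallel_seconds):
    """Return how many times faster the parallel run is than the serial run."""
    return serial_seconds / parallel_seconds


# Round to two decimals, matching the precision used in the table.
speedups = {
    name: round(speedup(serial, parallel), 2)
    for name, (serial, parallel) in timings.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}X")
```

A value above 1.00X means the run with 1 worker finished faster than the non-parallel run.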

I ran the tests multiple times and saw similar execution times in every run.
The results above show a slight improvement with 1 worker; in my tests I did not observe any degradation for copy with 1 worker compared to non-parallel copy. Could you share the script you used to generate the data & the DDL of the table, so that I can check the scenario in which you saw the problem?
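
For anyone wanting to reproduce files of the shape listed in the table, something along these lines could be used. This is only an illustrative sketch: the function name, the random lowercase column content and the fixed seed are assumptions, and it is not the script behind the numbers above.

```python
import csv
import random
import string


def write_test_csv(path, target_bytes, num_columns=100, bytes_per_column=100, seed=0):
    """Write rows of random lowercase text until at least target_bytes are written.

    Hypothetical helper for illustration only. Returns the number of rows written.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs produce the same file
    # Each row holds the column data, (num_columns - 1) commas and one newline.
    row_bytes = num_columns * (bytes_per_column + 1)
    written = 0
    rows = 0
    with open(path, "w", newline="") as f:
        # Use "\n" instead of the csv module's default "\r\n" line terminator.
        writer = csv.writer(f, lineterminator="\n")
        while written < target_bytes:
            row = [
                "".join(rng.choices(string.ascii_lowercase, k=bytes_per_column))
                for _ in range(num_columns)
            ]
            writer.writerow(row)
            written += row_bytes
            rows += 1
    return rows


if __name__ == "__main__":
    # Roughly 1GB with 100 columns of 100 bytes each, as in the first test row.
    write_test_csv("test_1gb.csv", target_bytes=1024 ** 3)
```

The resulting file can then be loaded into a table with 100 text columns using a plain COPY ... FROM ... WITH (FORMAT csv) to get the non-parallel baseline.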

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
