Re: Parallel copy - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Parallel copy
Date
Msg-id CALj2ACUEvYbad7Gjk8+GOdT2tNNfPbvvqLfdwcBNrPcin6zE_g@mail.gmail.com
Whole thread Raw
In response to Re: Parallel copy  (vignesh C <vignesh21@gmail.com>)
Responses Re: Parallel copy  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Parallel copy  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
On Wed, Jul 22, 2020 at 7:56 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for reviewing and providing the comments Ashutosh.
> Please find my thoughts below:
>
> On Fri, Jul 17, 2020 at 7:18 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Some review comments (mostly) from the leader side code changes:  
> >
> > 3) Should we allow Parallel Copy when the insert method is CIM_MULTI_CONDITIONAL?
> >
> > +   /* Check if the insertion mode is single. */
> > +   if (FindInsertMethod(cstate) == CIM_SINGLE)
> > +       return false;
> >
> > I know we have added checks in CopyFrom() to ensure that if any trigger (before row or instead of) is found on any of partition being loaded with data, then COPY FROM operation would fail, but does it mean that we are okay to perform parallel copy on partitioned table. Have we done some performance testing with the partitioned table where the data in the input file needs to be routed to the different partitions?
> >
>
> Partition data is handled like what Amit had told in one of earlier mails [1].  My colleague Bharath has run performance test with partition table, he will be sharing the results.
>

I ran tests for partitioned use cases - results are similar to that of non partitioned cases[1].

parallel workerstest case 1(exec time in sec): copy from csv file, 5.1GB, 10million tuples, 4 range partitions, 3 indexes on integer columns unique datatest case 2(exec time in sec): copy from csv file, 5.1GB, 10million tuples, 4 range partitions, unique data
0205.403(1X)135(1X)
2114.724(1.79X)59.388(2.27X)
499.017(2.07X)56.742(2.34X)
899.722(2.06X)66.323(2.03X)
1698.147(2.09X)66.054(2.04X)
2097.723(2.1X)66.389(2.03X)
3097.048(2.11X)70.568(1.91X)

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Default setting for enable_hashagg_disk
Next
From: "Lu, Chenyang"
Date:
Subject: [PATCH] keep the message consistent in buffile.c