Re: Parallel copy - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel copy
Date
Msg-id CAA4eK1L+aJ6O2u1_PEss_9FKgS8fUyWufFpZFk+jHbWvQoDkOA@mail.gmail.com
In response to Re: Parallel copy  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Sat, Oct 3, 2020 at 6:20 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> Hello Vignesh,
>
> I've done some basic benchmarking on the v4 version of the patches (but
> AFAIKC the v5 should perform about the same), and some initial review.
>
> For the benchmarking, I used the lineitem table from TPC-H - for 75GB
> data set, this largest table is about 64GB once loaded, with another
> 54GB in 5 indexes. This is on a server with 32 cores, 64GB of RAM and
> NVME storage.
>
> The COPY duration with varying number of workers (specified using the
> parallel COPY option) looks like this:
>
>       workers    duration
>      ---------------------
>             0        1366
>             1        1255
>             2         704
>             3         526
>             4         434
>             5         385
>             6         347
>             7         322
>             8         327
>
> So this seems to work pretty well - initially we get almost linear
> speedup, then it slows down (likely due to contention for locks, I/O
> etc.). Not bad.
>

+1. These numbers (> 4x speed up) look good to me.
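
For reference, the "> 4x" figure can be checked directly against the table quoted above. The snippet below is only a quick sketch: the durations are copied from Tomas's results, the variable names are mine, and since the unit of the duration column isn't stated in the mail, it only computes ratios against the 0-worker run.

```python
# Durations copied from the quoted benchmark table (workers -> duration).
# The unit is not stated in the mail, so only ratios are computed here.
durations = {
    0: 1366,
    1: 1255,
    2: 704,
    3: 526,
    4: 434,
    5: 385,
    6: 347,
    7: 322,
    8: 327,
}

baseline = durations[0]  # non-parallel COPY (0 workers)

# Speedup of each run relative to the 0-worker baseline.
speedup = {workers: baseline / d for workers, d in durations.items()}

# Worker count with the shortest duration, i.e. the highest speedup.
best_workers = max(speedup, key=speedup.get)

for workers, s in speedup.items():
    print(f"{workers:>7} {durations[workers]:>9} {s:>7.2f}x")

print(f"best: {best_workers} workers, {speedup[best_workers]:.2f}x")
```

The best run is at 7 workers (1366 / 322, a bit over 4.2x); the 8-worker run is marginally slower than that, which fits the flattening Tomas describes.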


-- 
With Regards,
Amit Kapila.


