
Re: Parallel copy - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Parallel copy
Date	October 3, 2020 00:49:59
Msg-id	20201003004959.73ot57oeikhtuq4u@development
In response to	Re: Parallel copy (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses	Re: Parallel copy Re: Parallel copy
List	pgsql-hackers


Hello Vignesh,

I've done some basic benchmarking on the v4 version of the patches (but
AFAICS the v5 should perform about the same), and some initial review.

For the benchmarking, I used the lineitem table from TPC-H - for a 75GB
data set, this largest table is about 64GB once loaded, with another
54GB in 5 indexes. This is on a server with 32 cores, 64GB of RAM and
NVMe storage.

The COPY duration with a varying number of workers (specified using the
parallel COPY option) looks like this:

      workers    duration
     ---------------------
            0        1366
            1        1255
            2         704
            3         526
            4         434
            5         385
            6         347
            7         322
            8         327

So this seems to work pretty well - initially we get almost linear
speedup, then the gains taper off (likely due to contention for locks,
I/O etc.), and 8 workers is no faster than 7. Not bad.
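
To make the scaling easier to read, here is a small sketch that turns the
durations from the table into speedup and per-worker efficiency figures.
The function names are made up for illustration, and treating the
1-worker run as the baseline for efficiency is an assumption on my part,
not something the benchmark itself prescribes.

```python
# Durations copied verbatim from the table above, keyed by worker count.
# Units are whatever the table reports; only the ratios matter here.
durations = {0: 1366, 1: 1255, 2: 704, 3: 526, 4: 434,
             5: 385, 6: 347, 7: 322, 8: 327}


def speedups(durations, baseline_workers=0):
    """Speedup of each run relative to the chosen baseline run."""
    base = durations[baseline_workers]
    return {w: base / d for w, d in durations.items()}


def efficiency(durations, baseline_workers=1):
    """Parallel efficiency: speedup over the baseline divided by workers.

    Assumption: the 1-worker run is the baseline, so 1.0 means perfectly
    linear scaling. The 0-worker (non-parallel) run is skipped.
    """
    base = durations[baseline_workers]
    return {w: base / (d * w) for w, d in durations.items() if w >= 1}


if __name__ == "__main__":
    s = speedups(durations)
    e = efficiency(durations)
    print("workers  speedup  efficiency")
    for w in sorted(durations):
        eff = f"{e[w]:.2f}" if w in e else "-"
        print(f"{w:7d}  {s[w]:7.2f}  {eff:>10}")
```

With these numbers the speedup over the non-parallel run peaks at 7
workers, at roughly 4.2x, and dips slightly at 8.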

I've only done a quick review, but overall the patch looks in fairly
good shape.

1) I don't quite understand why we need INCREMENTPROCESSED and
RETURNPROCESSED, considering they just do ++ or return. It just
obfuscates the code, I think.

2) I find it somewhat strange that BeginParallelCopy can just decide not
to do parallel copy after all. Why not make this decision in the
caller? Or maybe it's fine this way, not sure.

3) AFAIK we don't modify typedefs.list in patches, so these changes
should be removed. 

4) IsTriggerFunctionParallelSafe actually checks all triggers, not just
one, so the comment needs minor rewording.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
