Re: Parallel copy - Mailing list pgsql-hackers

From	Ants Aasma
Subject	Re: Parallel copy
Date	February 18, 2020 15:29:20
Msg-id	CANwKhkPmM18UYpOt_AEB4JC6fa0dfA1PfgiQyNzeNUxEpG=XUw@mail.gmail.com
In response to	Re: Parallel copy (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Parallel copy (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

On Tue, 18 Feb 2020 at 12:20, Amit Kapila <amit.kapila16@gmail.com> wrote:
> This is something similar to what I had also in mind for this idea.  I
> had thought of handing over complete chunk (64K or whatever we
> decide).  The one thing that slightly bothers me is that we will add
> some additional overhead of copying to and from shared memory which
> was earlier from local process memory.  And, the tokenization (finding
> line boundaries) would be serial.  I think that tokenization should be
> a small part of the overall work we do during the copy operation, but
> will do some measurements to ascertain the same.

I don't think any extra copying is needed. The reader can fread() or
pq_copymsgbytes() directly into shared memory, and the workers can run the
CopyReadLineText() inner loop directly off the buffer in shared memory.
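
To make the no-extra-copy idea concrete, here is a minimal sketch in C. It is
not PostgreSQL code: SharedChunk, chunk_fill() and chunk_next_line() are
hypothetical names, the struct merely stands in for a region that would live in
shared memory, and the line scan ignores CSV quoting, which the real
CopyReadLineText() has to handle.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical chunk size, matching the "64K or whatever" from the thread. */
#define CHUNK_SIZE 65536

/* Hypothetical layout; in a real design this would sit in shared memory. */
typedef struct SharedChunk
{
    size_t used;              /* bytes filled in by the reader */
    char   data[CHUNK_SIZE];  /* raw input bytes, never copied again */
} SharedChunk;

/* Reader side: fread() straight into the chunk, no intermediate buffer. */
static size_t
chunk_fill(SharedChunk *chunk, FILE *in)
{
    chunk->used = fread(chunk->data, 1, CHUNK_SIZE, in);
    return chunk->used;
}

/*
 * Worker side: scan the chunk in place. Starting at 'start', store the
 * length of the line (without its '\n') in *len and return the offset at
 * which the next line begins. If no newline is left, the rest of the chunk
 * is reported as one partial line and chunk->used is returned.
 * Assumption: newlines inside quoted CSV fields are not considered here.
 */
static size_t
chunk_next_line(const SharedChunk *chunk, size_t start, size_t *len)
{
    const char *base = chunk->data + start;
    size_t      remaining = chunk->used - start;
    const char *nl = memchr(base, '\n', remaining);

    if (nl == NULL)
    {
        *len = remaining;
        return chunk->used;
    }
    *len = (size_t) (nl - base);
    return start + *len + 1;
}
```

A worker would call chunk_next_line() repeatedly and parse each line at
chunk->data + start, so the only copy of the input is the one the reader made
when it filled the chunk.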

For serial performance of tokenization into lines, I really think a SIMD-based
approach will be fast enough for quite some time. I hacked up the code in the
simdcsv project to tokenize only on line endings, and it was able to tokenize a
CSV file with short lines at 8+ GB/s. There are going to be many other
bottlenecks before this one becomes the limiting factor. Patch attached if you'd
like to try that out.
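
For readers who want a feel for the technique without opening the patch, the
following is a rough sketch of SIMD newline detection. It is not the attached
diff and not simdcsv code: find_line_endings() is a hypothetical helper, it
uses 16-byte SSE2 vectors when the compiler defines __SSE2__ and a plain scalar
loop otherwise, it relies on the GCC/Clang builtin __builtin_ctz, and it ignores
quoted fields entirely.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Store the offsets of every '\n' in buf[0..len) into out[], up to max_out
 * entries, and return how many were stored.
 */
static size_t
find_line_endings(const char *buf, size_t len, size_t *out, size_t max_out)
{
    size_t i = 0;
    size_t n = 0;

#if defined(__SSE2__)
    const __m128i newline = _mm_set1_epi8('\n');

    /* Compare 16 bytes at a time against '\n'. */
    for (; i + 16 <= len && n < max_out; i += 16)
    {
        __m128i  v = _mm_loadu_si128((const __m128i *) (buf + i));
        /* One bit per byte: bit k is set if buf[i + k] == '\n'. */
        unsigned mask = (unsigned) _mm_movemask_epi8(_mm_cmpeq_epi8(v, newline));

        while (mask != 0 && n < max_out)
        {
            out[n++] = i + (size_t) __builtin_ctz(mask);
            mask &= mask - 1;   /* clear the lowest set bit */
        }
    }
#endif

    /* Scalar loop for the tail, or for everything when SSE2 is unavailable. */
    for (; i < len && n < max_out; i++)
    {
        if (buf[i] == '\n')
            out[n++] = i;
    }
    return n;
}
```

The point of the bitmask is that blocks without any newline cost only a load,
a compare and a test, which is why short-line input can still be scanned at
memory-bandwidth-like speeds.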

Regards,
Ants Aasma

Attachment

simdcsv-find-only-lineendings.diff
