Re: An idea for parallelizing COPY within one backend - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: An idea for parallelizing COPY within one backend
Date
Msg-id 1204109249.4252.477.camel@ebony.site
In response to Re: An idea for parallelizing COPY within one backend  (Dimitri Fontaine <dfontaine@hi-media.com>)
List pgsql-hackers
On Wed, 2008-02-27 at 09:09 +0100, Dimitri Fontaine wrote:
> Hi,
> 
> On Wednesday 27 February 2008, Florian G. Pflug wrote:
> > Upon reception of a COPY INTO command, a backend would
> > .) Fork off a "dealer" and N "worker" processes that take over the
> > client connection. The "dealer" distributes lines received from the
> > client to the N workers, while the original backend receives them
> > as tuples back from the workers.
> 
> This looks so much like what pgloader does now (version 2.3.0~dev2, release 
> candidate) at the client side, when configured for it, that I can't help 
> answering the mail :)
>  http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading
>   section_threads = N
>   split_file_reading = False
> 
> Of course, the backends still have to parse the input given by pgloader, which 
> only pre-processes data. I'm not sure having the client prepare the data some 
> more (binary format or whatever) is a wise idea, as you mentioned and wrt 
> Tom's follow-up. But maybe I'm all wrong, so I'm all ears!

ISTM the external parallelization approach is more likely to help us
avoid bottlenecks, so I support Dimitri's approach.

We also need error handling, which pgloader already has.

Writing error handling and parallelization into COPY isn't going to be
easy, and not very justifiable either if we already have both.

There might be a reason to re-write it in C one day, but that will be
a fairly easy task if we ever need to do it.
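
To make the external approach concrete, here is a rough Python sketch of
a client-side "dealer" handing lines to N workers, with bad lines set
aside instead of aborting the whole load. This is not pgloader's actual
code: parse_line, worker, parallel_load and the two-column tab-separated
format are all made up for illustration, and a real loader would
presumably have each worker feed its own COPY connection rather than
just collecting tuples.

```python
import queue
import threading


def parse_line(line, sep="\t"):
    """Hypothetical parser: turn one text line into an (int, str) tuple."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 2:
        raise ValueError("expected 2 fields, got %d" % len(fields))
    return (int(fields[0]), fields[1])  # int() raises ValueError on bad input


def worker(inbox, results, rejects):
    """Parse lines from inbox until the None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        lineno, line = item
        try:
            results.put((lineno, parse_line(line)))
        except ValueError as exc:
            # Error handling: keep the rejected line and the reason.
            rejects.put((lineno, line, str(exc)))


def drain(q):
    """Empty a queue into a list, sorted by line number."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return sorted(items)


def parallel_load(lines, n_workers=4):
    """Dealer: distribute lines round-robin to n_workers parser threads."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results, rejects = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, results, rejects))
        for inbox in inboxes
    ]
    for t in threads:
        t.start()
    for lineno, line in enumerate(lines, start=1):
        inboxes[lineno % n_workers].put((lineno, line))
    for inbox in inboxes:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return drain(results), drain(rejects)
```

Calling parallel_load(open("data.tsv"), n_workers=4) on a hypothetical
file would return the parsed tuples and the rejected lines separately,
each tagged with its original line number.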

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com 


