Re: An idea for parallelizing COPY within one backend - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: An idea for parallelizing COPY within one backend
Date
Msg-id 47C5874F.9090907@enterprisedb.com
In response to Re: An idea for parallelizing COPY within one backend  ("A.M." <agentm@themactionfaction.com>)
List pgsql-hackers
A.M. wrote:
> 
> On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote:
> 
>> Dimitri Fontaine wrote:
>>> Of course, the backends still have to parse the input given by 
>>> pgloader, which only pre-processes data. I'm not sure having the 
>>> client prepare the data some more (binary format or whatever) is a 
>>> wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm 
>>> all wrong, so I'm all ears!
>>
>> As far as I understand, pgloader starts N threads or processes that 
>> open up N individual connections to the server. In that case, moving 
>> the text->binary conversion from the backend into the loader won't 
>> give any additional performance, I'd say.
>>
>> The reason that I'd love some within-one-backend solution is that I'd 
>> allow you to utilize more than one CPU for a restore within a *single* 
>> transaction. This is something that a client-side solution won't be 
>> able to deliver, unless major changes to the architecture of postgres 
>> happen first...
> 
> It seems like multiple backends should be able to take advantage of 2PC 
> for transaction safety.

Yes, whatever is coordinating the multiple backends (a master backend? I 
haven't followed this thread closely) would then need logic to finish the 
prepared transactions if it crashes after committing some of them but not 
all. IOW, it would need a mini transaction log of its own.
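To make the recovery requirement concrete, here is a minimal sketch (all names hypothetical, backends simulated rather than real PostgreSQL connections) of a coordinator that keeps its own mini transaction log: it durably records the global commit decision after all participants have done PREPARE TRANSACTION, and on restart it issues COMMIT PREPARED for any participant the log says was decided but not yet committed.

```python
class Backend:
    """Stand-in for one PostgreSQL backend holding a prepared transaction."""
    def __init__(self, gid):
        self.gid = gid
        self.state = "prepared"  # result of PREPARE TRANSACTION 'gid'

    def commit_prepared(self):   # stands in for COMMIT PREPARED 'gid'
        self.state = "committed"


class Coordinator:
    """Coordinates phase 2 of 2PC and keeps a mini transaction log."""
    def __init__(self, backends):
        self.backends = backends
        self.log = []  # in a real system this would be fsync'd to disk

    def commit_all(self, crash_after=None):
        # Record the global commit decision *before* touching any backend,
        # so recovery knows these transactions must be finished.
        self.log.append(("commit-decided", [b.gid for b in self.backends]))
        for i, b in enumerate(self.backends):
            if crash_after is not None and i >= crash_after:
                raise RuntimeError("simulated crash mid-commit")
            b.commit_prepared()
            self.log.append(("committed", b.gid))

    def recover(self):
        # Finish any transaction that was decided but not yet committed.
        decided = [g for rec in self.log
                   if rec[0] == "commit-decided" for g in rec[1]]
        done = {rec[1] for rec in self.log if rec[0] == "committed"}
        for b in self.backends:
            if b.gid in decided and b.gid not in done:
                b.commit_prepared()
                self.log.append(("committed", b.gid))
```

The key design point is the ordering: the commit decision hits the log before the first COMMIT PREPARED, so a crash between individual commits leaves enough state behind for recovery to finish the remainder rather than leaving orphaned prepared transactions.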

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

