Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>>> Would it be possible to determine when the copy is starting that this
>>> case holds, and not use the parallel parsing idea in those cases?
>>
>> In theory, yes. In practice, I don't want to be the one who has to
>> answer to an angry user who just suffered a major drop in COPY
>> performance after adding an ENUM column to his table.
>>
> I am yet to be convinced that this is even theoretically a good path to
> follow. Any sufficiently large table could probably be partitioned and
> then we could use the parallelism that is being discussed for pg_restore
> without any modification to the backend at all. Similar tricks could be
> played by an external bulk loader for third party data sources.
That assumes that some specific bulk loader like pg_restore, pgloader
or similar is used to perform the load. Plain libpq users would either
need to duplicate the logic these loaders contain, or wouldn't be able
to take advantage of fast loads.
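To illustrate the kind of logic a client would have to duplicate: at
minimum, the loader must split the input rows into chunks and stream
each chunk over its own connection. A minimal sketch of just the
chunking step (hypothetical names, not taken from any existing loader):

```python
# Hypothetical sketch: split COPY-format rows into N contiguous chunks,
# one per worker connection. This is only the partitioning step; each
# chunk would then be streamed via PQputCopyData on its own connection.

def split_rows(rows, n_workers):
    """Split a list of COPY-format rows into up to n_workers chunks."""
    chunk_size = (len(rows) + n_workers - 1) // n_workers  # ceiling division
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]

rows = [f"{i}\tvalue-{i}" for i in range(10)]
chunks = split_rows(rows, 3)  # chunks of 4, 4, and 2 rows
```

Every plain libpq application would have to reimplement this (plus
error handling, transaction coordination, and so on), which is the
duplication being objected to above.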
Plus, I'd see this as a kind of testbed for gently introducing
parallelism into postgres backends (I'm especially thinking about
sorting here). CPUs gain more and more cores, so in the long run I fear
that we will have to find ways to utilize more than one of them to
execute a single query.
But of course the architectural details need to be sorted out before any
credible judgement about the feasibility of this idea can be made...
regards, Florian Pflug