Re: concurrent COPY performance - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: concurrent COPY performance
Date
Msg-id 4A387397.9090607@kaltenbrunner.cc
In response to Re: concurrent COPY performance  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers
Merlin Moncure wrote:
> On Tue, Jun 16, 2009 at 12:47 PM, Stefan
> Kaltenbrunner<stefan@kaltenbrunner.cc> wrote:
>> Hi!
>>
>> I have been doing some bulk loading testing recently, mostly with a focus
>> on answering why we are "only" getting a speedup of (at most) cores/2 (up
>> to around 8 cores, and even less with more) using parallel restore.
>> What I found is that on a fast IO subsystem we are CPU-bottlenecked on
>> concurrent COPY when it is able to utilize WAL bypass (scaling up to around
>> cores/2), and performance without WAL bypass is very bad.
>> In the WAL-logged case we only get a 50% speedup from adding the
>> second process, we never scale better than 3x (at up to 8 cores), and
>> performance even degrades beyond that point.
> 
> how are you bypassing wal?  do I read this properly that on your 8
> core system you are getting 4x speedup with wal bypass and 3x speedup
> without?

The test simply executes something like psql -c "BEGIN;TRUNCATE 
lineitem1;COPY lineitem1 FROM ....;COMMIT;" in parallel, with the source 
file hosted on a separate array and primed into the OS buffer cache.
The box actually has 8 cores/16 threads. Without WAL bypass I get a 3x 
speedup when using up to 8 processes, but performance degrades at higher 
connection counts.
With WAL bypass I get near-perfect scalability up to 4 connections and a 
maximum speedup of ~8x when using 16 connections (i.e. all threads).
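
For anyone who wants to try something similar, here is a rough sketch of how 
such a run could be driven. It is not the script used for the numbers above. 
It assumes that each connection loads its own table (lineitem1 .. lineitemN), 
that the source path and database name are passed in by the caller (both are 
hypothetical parameters), and that "speedup" means aggregate throughput 
relative to a single connection. The TRUNCATE sits in the same transaction as 
the COPY, which is what lets COPY skip WAL when WAL archiving is not enabled.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def build_sql(table, source_path):
    """TRUNCATE and COPY in one transaction, the shape described in the mail."""
    return f"BEGIN;TRUNCATE {table};COPY {table} FROM '{source_path}';COMMIT;"


def build_commands(n_connections, source_path, dbname):
    """One psql invocation per connection.

    Assumption: connection i loads table lineitem<i>; source_path and dbname
    are placeholders supplied by the caller, not values from the original test.
    """
    return [
        ["psql", "-d", dbname, "-c", build_sql(f"lineitem{i}", source_path)]
        for i in range(1, n_connections + 1)
    ]


def run_parallel(commands, runner=subprocess.run):
    """Launch all commands concurrently and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        # Each worker thread blocks on one psql process until it exits.
        list(pool.map(lambda cmd: runner(cmd, check=True), commands))
    return time.perf_counter() - start


def speedup(single_seconds, parallel_seconds, n_connections):
    """Aggregate throughput gain: N loads done in parallel_seconds vs. one load."""
    return n_connections * single_seconds / parallel_seconds
```

With this definition, 16 connections that together take twice as long as a 
single load correspond to a speedup of 8x.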


Stefan

