Re: Load experimentation - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Load experimentation
Msg-id dcc563d10912072359l771d7b7et8a231266d4d7bf7e@mail.gmail.com
In response to Re: Load experimentation  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On Tue, Dec 8, 2009 at 12:58 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Tue, Dec 8, 2009 at 12:22 AM, Ben Brehmer <benbrehmer@gmail.com> wrote:
>> Thanks for all the responses. I have one more thought;
>>
>> Since my input data is split into about 200 files (3GB each), I could
>> potentially spawn one load command for each file. What would be the maximum
>> number of input connections Postgres can handle without bogging down? When I
>> say 'input connection' I mean "psql -U postgres -d dbname -f
>> one_of_many_sql_files".
>
> This is VERY dependent on your IO capacity and number of cores.  My
> experience is that unless you're running on a decent number of disks,
> you'll run out of IO capacity first in most machines.  n pairs of
> mirrors in a RAID-10 can handle x input threads where x has some near
> linear relation to n.  Have 100 disks in a RAID-10 array?  You can
> surely handle dozens of load threads with no IO wait.  Have 4 disks in
> a RAID-10?  Maybe two to four load threads will max you out.  Once
> you're IO bound, adding more threads and more CPUs won't help, it'll
> hurt.  The only way to really know is to benchmark it, but I'd guess
> that about half as many import threads as mirror pairs in a RAID-10
> (or just drives if you're using RAID-0) would be a good place to start
> and work from there.

Note that if you run out of CPU horsepower first, the degradation
will be less harsh as you go past the knee in the performance curve.
IO has a sharper knee than CPU.
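
For what it's worth, here is a rough sketch of how the starting point
quoted above could be scripted: one psql per input file, capped at about
half as many concurrent loaders as mirror pairs.  The helper names
(starting_workers, psql_command, load_all) and the load_files directory
are made up for illustration, and the worker count is only a first guess
to benchmark from, not a tuned value.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def starting_workers(mirror_pairs):
    """Rule of thumb from the thread: about half as many loaders as mirror pairs."""
    return max(1, mirror_pairs // 2)


def psql_command(sql_file, dbname="dbname", user="postgres"):
    """Build the psql invocation quoted in the thread for one input file."""
    return ["psql", "-U", user, "-d", dbname, "-f", sql_file]


def load_all(sql_files, workers, run=subprocess.run):
    """Run one psql per file, never more than `workers` at a time.

    Returns (file, exit code) pairs in input order.  Note that psql only
    reports SQL errors in its exit code when ON_ERROR_STOP is set.
    """
    def load_one(path):
        return path, run(psql_command(str(path))).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_one, sql_files))


if __name__ == "__main__":
    # Hypothetical layout: the ~200 input files sit in ./load_files
    files = sorted(Path("load_files").glob("*.sql"))
    for path, code in load_all(files, starting_workers(mirror_pairs=4)):
        print(path, "ok" if code == 0 else "failed (%d)" % code)
```

Time a run at that worker count, then raise or lower it and time it
again.  Once the total load time stops improving you are past the knee.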
