Re: Load experimentation - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Load experimentation
Date
Msg-id dcc563d10912072358h6e3e37b9jdd9d349ffcc3e103@mail.gmail.com
Whole thread Raw
In response to Re: Load experimentation  (Ben Brehmer <benbrehmer@gmail.com>)
Responses Re: Load experimentation  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On Tue, Dec 8, 2009 at 12:22 AM, Ben Brehmer <benbrehmer@gmail.com> wrote:
> Thanks for all the responses. I have one more thought:
>
> Since my input data is split into about 200 files (3GB each), I could
> potentially spawn one load command for each file. What would be the maximum
> number of input connections Postgres can handle without bogging down? When I
> say 'input connection' I mean "psql -U postgres -d dbname -f
> one_of_many_sql_files".

This is VERY dependent on your IO capacity and number of cores.  My
experience is that unless you're running on a decent number of disks,
you'll run out of IO capacity first on most machines.  n pairs of
mirrors in a RAID-10 can handle x input threads, where x has some near
linear relation to n.  Have 100 disks in a RAID-10 array?  You can
surely handle dozens of load threads with no IO wait.  Have 4 disks in
a RAID-10?  Maybe two to four load threads will max you out.  Once
you're IO bound, adding more threads and more CPUs won't help, it'll
hurt.  The only way to really know is to benchmark it, but I'd guess
that about half as many import threads as mirror pairs in a RAID-10
(or just drives if you're using RAID-0) would be a good place to start
and work from there.
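For what it's worth, that starting point is easy to script. Here's a minimal sketch using `xargs -P` to cap concurrency; the file names, database name, and disk count are placeholders, and it's shown as a dry run (prepend `echo` removed once you're ready to actually load):

```shell
#!/bin/sh
# Rule of thumb from above: start with roughly half as many import
# threads as mirror pairs in the RAID-10.
MIRROR_PAIRS=4                     # e.g. an 8-disk RAID-10 (assumption)
THREADS=$(( MIRROR_PAIRS / 2 ))    # -> 2 concurrent load threads

# Dry run: print the psql commands that would be launched, at most
# $THREADS at a time.  File and db names here are hypothetical.
printf '%s\n' load_0001.sql load_0002.sql load_0003.sql |
    xargs -P "$THREADS" -I {} echo psql -U postgres -d dbname -f {}
```

Drop the `echo` and feed it the real file list (e.g. `ls *.sql`) to run the loads, then watch IO wait (`iostat`, `vmstat`) and adjust `THREADS` up or down from there.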
