Thread: How to achieve sustained disk performance of 1.25 GB write for 5 mins

How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Eric Comeau
Date:
This is not directly a PostgreSQL performance question but I'm hoping
some of the chaps that build high IO PostgreSQL servers on here can help.

We build file transfer acceleration s/w (and use PostgreSQL as our
database) but we need to build a test server that can handle a sustained
write throughput of 1.25 GB/s for 5 mins.

Why this number? Because we want to push a 10 Gbps network link for 5-8
mins. 10 Gbps = 1.25 GB/s of writes, and driving it for 5-8 mins would be
roughly 400-500 GB.
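
As a quick check of the arithmetic above, here is a minimal sketch. It
assumes decimal units (1 GB = 10^9 bytes) and ignores protocol overhead;
the function names are made up for illustration. Under those assumptions
the 400-500 GB figure is a round estimate, and the bounds come out at
375 GB for 5 mins and 600 GB for 8 mins.

```python
# Assumes decimal units (1 GB = 10**9 bytes) and no protocol overhead.
LINK_GBPS = 10


def link_bytes_per_sec(gbps):
    """Convert a link rate in gigabits/s to bytes/s (8 bits per byte)."""
    return gbps * 10**9 / 8


def total_gb(gbps, minutes):
    """Data volume in GB written while the link is saturated for `minutes`."""
    return link_bytes_per_sec(gbps) * minutes * 60 / 10**9


print(link_bytes_per_sec(LINK_GBPS) / 10**9, "GB/s")
for minutes in (5, 8):
    print(minutes, "min:", total_gb(LINK_GBPS, minutes), "GB")
```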

Note this is just a "test" server therefore it does not need fault
tolerance.

Thanks in advance,
Eric

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
"J. Roeleveld"
Date:
On Wednesday 17 November 2010 15:26:56 Eric Comeau wrote:
> This is not directly a PostgreSQL performance question but I'm hoping
> some of the chaps that build high IO PostgreSQL servers on here can help.
>
> We build file transfer acceleration s/w (and use PostgreSQL as our
> database) but we need to build a test server that can handle a sustained
> write throughput of 1,25 GB for 5 mins.
>
> Why this number, because we want to push a 10 Gbps network link for 5-8
> mins, 10Gbps = 1.25 GB write, and would like to drive it for 5-8 mins
> which would be 400-500 GB.
>
> Note this is just a "test" server therefore it does not need fault
> tolerance.
>
> Thanks in advance,
> Eric

I'm sure there are others with more experience on this, but if you don't need
fault tolerance, a bunch of fast disks in striping mode (so-called RAID-0) on
separate channels (e.g. different PCI-Express channels) would be my first step.

Alternatively, if you don't care if the data is actually stored, couldn't you
process it with a program that does a checksum over the data transmitted and
then ignores/forgets it? (eg. forget about disk-storage and do it all in
memory?)
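
A minimal sketch of that idea, assuming the receiving side can expose the
incoming data as a file-like object. `checksum_and_discard` is a
hypothetical helper name, and CRC32 is just one cheap checksum choice,
not something the thread prescribes.

```python
import io
import zlib


def checksum_and_discard(stream, chunk_size=1 << 20):
    """Read `stream` to EOF, keeping only a running CRC32 and a byte count."""
    crc = 0
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        crc = zlib.crc32(chunk, crc)  # fold this chunk into the running CRC
        total += len(chunk)           # the chunk itself is then dropped
    return crc, total


if __name__ == "__main__":
    # 8 MiB of dummy data standing in for a received transfer.
    payload = b"\x00" * (8 << 20)
    crc, total = checksum_and_discard(io.BytesIO(payload))
    print(f"received {total} bytes, crc32={crc:08x}")
```

Nothing is stored, so memory use stays at one chunk regardless of how
much data passes through.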

--
Joost

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Digimer
Date:
On 11/17/2010 09:26 AM, Eric Comeau wrote:
> This is not directly a PostgreSQL performance question but I'm hoping
> some of the chaps that build high IO PostgreSQL servers on here can help.
>
> We build file transfer acceleration s/w (and use PostgreSQL as our
> database) but we need to build a test server that can handle a sustained
> write throughput of 1,25 GB for 5 mins.
>
> Why this number, because we want to push a 10 Gbps network link for 5-8
> mins, 10Gbps = 1.25 GB write, and would like to drive it for 5-8 mins
> which would be 400-500 GB.
>
> Note this is just a "test" server therefore it does not need fault
> tolerance.
>
> Thanks in advance,
> Eric
>

Off hand, I would suggest:

8x http://www.kingston.com/ssd/vplus100.asp (180MB/sec sustained write)
striped (RAID 0, you did say that you don't care about safety). That
should be 1.44GB/sec write, minus overhead.

1x
http://www.lsi.com/channel/products/raid_controllers/3ware_9690sa8i/index.html
RAID card (note that it's the internal port model, despite the image)

4x http://usa.chenbro.com/corporatesite/products_detail.php?sku=114 (for
mounting the drives)

That would be about the minimum I would expect you to pay to get that
kind of performance. Others are free to dis/agree. :)

--
Digimer
E-Mail: digimer@alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Lutz Steinborn
Date:
On Wed, 17 Nov 2010 09:26:56 -0500
Eric Comeau <ecomeau@signiant.com> wrote:

> This is not directly a PostgreSQL performance question but I'm hoping
> some of the chaps that build high IO PostgreSQL servers on here can help.
>
> We build file transfer acceleration s/w (and use PostgreSQL as our
> database) but we need to build a test server that can handle a sustained
> write throughput of 1,25 GB for 5 mins.
>
> Why this number, because we want to push a 10 Gbps network link for 5-8
> mins, 10Gbps = 1.25 GB write, and would like to drive it for 5-8 mins
> which would be 400-500 GB.
>
> Note this is just a "test" server therefore it does not need fault
> tolerance.

Get a machine with enough RAM and run PostgreSQL from a RAM disk. Write a
start/stop script that copies the RAM disk back to a normal disk when
stopping and restores it to the RAM disk when starting.

This should be the fastest solution.

--
Lutz


Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Ivan Voras
Date:
On 11/17/10 15:26, Eric Comeau wrote:
> This is not directly a PostgreSQL performance question but I'm hoping
> some of the chaps that build high IO PostgreSQL servers on here can help.
>
> We build file transfer acceleration s/w (and use PostgreSQL as our
> database) but we need to build a test server that can handle a sustained
> write throughput of 1,25 GB for 5 mins.

Just to clarify: you need 1.25 GB/s write throughput?

For one thing, you need not only fast storage but also a fast CPU and
file system. If you are going to stream this data directly over the
network in a single FTP-like session, you need fast single-core
performance (so buy the fastest low-core-count CPU possible) and a file
system which doesn't interfere much with raw data streaming. If you're
using Linux I'd guess either something very simple like ext2 or complex
but designed for the task like XFS might be best. On FreeBSD, ZFS is
great for streaming but you'll spend a lot of time tuning it :)

From the hardware POV, since you don't really need high IOPS rates, you
can go much cheaper with a large number of cheap desktop drives than
with SSDs, if you can build something like this:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

You don't need the storage space here, but you *do* need many drives to
achieve speed in RAID (remember to overdesign and assume 50 MB/s per drive).
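
To put numbers on "many drives", here is a rough sketch. It assumes
perfect RAID-0 scaling (which real arrays will not quite reach) and
decimal units; `drives_needed` is a name made up for illustration. The
50 MB/s figure is the conservative per-drive rate above, and 180 MB/s is
the SSD rate quoted earlier in the thread.

```python
import math

TARGET_MB_S = 1250  # 1.25 GB/s in decimal units


def drives_needed(target_mb_s, per_drive_mb_s):
    """Smallest number of striped drives whose combined rate meets the target.

    Assumes the stripe scales linearly with drive count.
    """
    return math.ceil(target_mb_s / per_drive_mb_s)


for rate in (50, 180):
    print(f"{rate} MB/s per drive -> {drives_needed(TARGET_MB_S, rate)} drives")
```

At 50 MB/s per drive this works out to 25 drives, and at 180 MB/s to 7,
before allowing any headroom for overhead.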

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Merlin Moncure
Date:
On Wed, Nov 17, 2010 at 9:26 AM, Eric Comeau <ecomeau@signiant.com> wrote:
> This is not directly a PostgreSQL performance question but I'm hoping some
> of the chaps that build high IO PostgreSQL servers on here can help.
>
> We build file transfer acceleration s/w (and use PostgreSQL as our database)
> but we need to build a test server that can handle a sustained write
> throughput of 1,25 GB for 5 mins.
>
> Why this number, because we want to push a 10 Gbps network link for 5-8
> mins, 10Gbps = 1.25 GB write, and would like to drive it for 5-8 mins which
> would be 400-500 GB.
>
> Note this is just a "test" server therefore it does not need fault
> tolerance.

I really doubt you will see 1.25 GB/sec over a 10GigE link.  Even if you
do though, you will hit a number of bottlenecks if you want to see
anything close to those numbers.  Even with really fast storage you
will probably become cpu bound, or bottlenecked in the WAL, or some
other place.

*) what kind of data do you expect to be writing out at this speed?
*) how many transactions per second will you expect to have?
*) what is the architecture of the client? how many connections will
be open to postgres writing?
*) how many cores are in this box? what kind?

merlin

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Scott Carey
Date:
On Nov 17, 2010, at 7:28 AM, Digimer wrote:

> On 11/17/2010 09:26 AM, Eric Comeau wrote:
>> This is not directly a PostgreSQL performance question but I'm hoping
>> some of the chaps that build high IO PostgreSQL servers on here can help.
>>
>> We build file transfer acceleration s/w (and use PostgreSQL as our
>> database) but we need to build a test server that can handle a sustained
>> write throughput of 1,25 GB for 5 mins.
>>
>> Why this number, because we want to push a 10 Gbps network link for 5-8
>> mins, 10Gbps = 1.25 GB write, and would like to drive it for 5-8 mins
>> which would be 400-500 GB.
>>
>> Note this is just a "test" server therefore it does not need fault
>> tolerance.
>>
>> Thanks in advance,
>> Eric
>>
>
> Off hand, I would suggest:
>
> 8x http://www.kingston.com/ssd/vplus100.asp (180MB/sec sustained write)
> stripped (RAID 0, you did say that you don't care about safety). That
> should be 1.44GB/sec write, minus overhead.

You can get cheaper disks that do ~135MB/sec writes and use a couple more of them.

>
> 1x
> http://www.lsi.com/channel/products/raid_controllers/3ware_9690sa8i/index.html
> RAID card (note that it's the internal port model, despite the image)
>

You'll need 2 RAID cards with software raid-0 on top to sustain this rate, or
simply pure software raid-0.  A single raid card tends to be unable to sustain
reads or writes that high, no matter how many drives you put on it.

The last time I tried a 3ware card, it couldn't go past 380MB/sec with 10
drives. 6 to 10 drives in raid 10 all had the same sequential throughput; only
random iops went up.  Maybe raid0 is better.  Software raid is usually fastest
for raid0, 1, and 10, other than write cache effects (which are strong and
important for a real world db).

I get ~1000MB/sec out of 2 Adaptec 5805s with linux 'md' software raid 0 on top
of these (each is raid 10 with 10 drives). If I did not care about data
reliability I'd go with anything that had a lot of ports (perhaps a couple of
cheap SAS cards without complicated raid features) and software raid 0.


> 4x http://usa.chenbro.com/corporatesite/products_detail.php?sku=114 (for
> mounting the drives)
>
> That would be about the minimum I should expect you can pay to get that
> kind of performance. Others are free to dis/agree. :)
>
> --
> Digimer
> E-Mail: digimer@alteeve.com
> AN!Whitepapers: http://alteeve.com
> Node Assassin:  http://nodeassassin.org


Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Scott Carey
Date:
On Nov 17, 2010, at 10:48 AM, Scott Carey wrote:

>>
>> Off hand, I would suggest:
>>
>> 8x http://www.kingston.com/ssd/vplus100.asp (180MB/sec sustained write)
>> stripped (RAID 0, you did say that you don't care about safety). That
>> should be 1.44GB/sec write, minus overhead.
>
> Can get cheaper disks that go ~135MB/sec write and a couple more of them.
>

Another option, two of these (650MB+ /sec sustained) in raid 0:
http://www.anandtech.com/show/3997/ocz-revodrive-x2-review/3

No external enclosure required, no raid card required (the card is basically 4
ssd's raided together in one package). Just 2 PCIe slots.  The cost seems to be
not too bad, at least for "how much does it cost to go 600MB/sec".   $1200 will
get two of them, for a total of 480GB and 1300MB/sec.

Note, these are not data-safe on power failure.
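
A quick capacity check on this option, as a sketch that assumes an empty
volume, decimal GB, and no filesystem overhead (the function name is
illustrative only). At the 1.25 GB/s target, 480 GB lasts 384 seconds,
about 6.4 minutes, which covers a 5 minute run but not a full 8 minutes.

```python
def seconds_until_full(capacity_gb, write_gb_per_s):
    """How long a volume of `capacity_gb` can absorb a steady write rate."""
    return capacity_gb / write_gb_per_s


secs = seconds_until_full(480, 1.25)
print(f"{secs:.0f} s (~{secs / 60:.1f} min)")
```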



Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
"Eric Comeau"
Date:


On 10-11-17 12:28 PM, Merlin Moncure wrote:

> On Wed, Nov 17, 2010 at 9:26 AM, Eric Comeau <ecomeau@signiant.com> wrote:
>> This is not directly a PostgreSQL performance question but I'm hoping some
>> of the chaps that build high IO PostgreSQL servers on here can help.
>>
>> We build file transfer acceleration s/w (and use PostgreSQL as our database)
>> but we need to build a test server that can handle a sustained write
>> throughput of 1,25 GB for 5 mins.
>>
>> Why this number, because we want to push a 10 Gbps network link for 5-8
>> mins, 10Gbps = 1.25 GB write, and would like to drive it for 5-8 mins which
>> would be 400-500 GB.
>>
>> Note this is just a "test" server therefore it does not need fault
>> tolerance.
>
> I really doubt you will see 1.25 GB/sec over a 10GigE link.  Even if you
> do though, you will hit a number of bottlenecks if you want to see
> anything close to those numbers.  Even with really fast storage you
> will probably become cpu bound, or bottlenecked in the WAL, or some
> other place.
>
> *) what kind of data do you expect to be writing out at this speed?

Large video files ... our s/w is used to displace FTP.

> *) how many transactions per second will you expect to have?

Ideally 1 large file, but it may have to be multiple. We find that if we
send multiple files it just causes the disk to thrash more, so we get
better throughput by sending one large file.

> *) what is the architecture of the client? how many connections will
> be open to postgres writing?

Our s/w can do multiple streams, but I believe we get better performance
with 1 stream handling one large file. You could have 4 streams with 4
files in flight, but the disk thrashes more. Postgres will not be writing
the file data; our agent reports stats on the achieved transfer rate back
to postgres, so postgres transactions are not the issue. The client and
server are written in C and use UDP (with our own error correction), as
opposed to TCP, to achieve high network throughput.

> *) how many cores are in this box? what kind?

Well, obviously that's part of the equation as well, but it's not defined
right now. Our s/w is multi-threaded and can make use of multiple cores,
so I'll say for now a minimum of 4.

> merlin

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Ivan Voras
Date:
On 11/17/10 22:11, Eric Comeau wrote:

>> *) what kind of data do you expect to be writing out at this speed?
>>
> Large Video files ... our s/w is used to displace FTP.
>>
>> *) how many transactions per second will you expect to have?
>>
> Ideally 1 large file, but it may have to be multiple. We find that if we
> send multiple files it just causes the disk to thrash more so we get
> better throughput by sending one large file.
>
>> *) what is the architecture of the client? how many connections will
>> be open to postgres writing?
>>
> Our s/w can do multiple streams, but I believe we get better performance
> with 1 stream handling one large file, you could have 4 streams with 4
> files in flight, but the disk thrashes more... postgres is not be
> writing the file data, our agent reports back to postgres stats on the
> transfer rate being achieved ... postgres transactions is not the issue.
> The client and server are written in C and use UDP (with our own error
> correction) to achieve high network throughput as opposed to TCP.

I hope you know what you are doing: there is a long list of tricks used
by modern high performance FTP and web servers to get maximum
performance from hardware and the operating system while minimizing CPU
usage - and most of them don't work with UDP.

Before you test with real hardware, try simply sending dummy data or
/dev/null data (i.e. not from disks, not from file systems) and see how
it goes.
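
As a starting point, here is a minimal local baseline for the sending
side: it times how fast a process can hand zero-filled buffers to a sink
that discards them. This is only a sketch under assumptions. It does not
exercise the network stack at all, `push_dummy_data` is a hypothetical
name, and the real test would replace the sink with the UDP sender.

```python
import os
import time


def push_dummy_data(total_bytes, chunk_size=1 << 20, sink_path=os.devnull):
    """Write zero-filled chunks to `sink_path`; return (bytes_written, seconds)."""
    chunk = bytes(chunk_size)  # dummy data: no disk or file system read involved
    written = 0
    start = time.perf_counter()
    # buffering=0 opens a raw binary file, so each write() goes to the OS.
    with open(sink_path, "wb", buffering=0) as sink:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            written += sink.write(chunk[:n])
    return written, time.perf_counter() - start


if __name__ == "__main__":
    nbytes, secs = push_dummy_data(1 << 30)  # 1 GiB
    print(f"{nbytes / secs / 1e6:.0f} MB/s with no disk or network involved")
```

If this baseline is not comfortably above 1250 MB/s, faster storage on
its own will not get you to the target.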


Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Greg Smith
Date:
Eric Comeau wrote:
> Ideally 1 large file, but it may have to be multiple. We find that if we
> send multiple files it just causes the disk to thrash more so we get
> better throughput by sending one large file.

If it's really one disk, sure.  The problem you're facing is that your typical drive controller is going to top out at somewhere between 300 - 500MB/s of sequential writes before it becomes the bottleneck.  Above somewhere between 6 and 10 drives attached to one controller on current hardware, adding more to a RAID-0 volume only increases the ability to handle seeks quickly.  If you want to try and do this with traditional hard drives, I'd guess you'd need 3 controllers with at least 4 short-stroked drives attached to each to have any hope of hitting 1.25GB/s.  Once you do that, you'll run into CPU time as the next bottleneck.  At that point, you'll probably need one CPU per controller, all writing out at once, to keep up with your target.
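
The controller ceiling described above can be sketched as a simple model.
This is a back-of-the-envelope illustration, not a benchmark: the
100 MB/s per-drive rate is an assumed placeholder, the function names are
made up, and the 300 and 500 MB/s caps are the two ends of the range
given above.

```python
import math


def array_throughput(controllers, drives_per_controller,
                     drive_mb_s, controller_cap_mb_s):
    """Aggregate sequential rate when each controller saturates at a fixed cap."""
    per_controller = min(drives_per_controller * drive_mb_s, controller_cap_mb_s)
    return controllers * per_controller


def controllers_needed(target_mb_s, controller_cap_mb_s):
    """Minimum controller count if each one tops out at the given cap."""
    return math.ceil(target_mb_s / controller_cap_mb_s)


# One controller: going from 6 to 10 drives adds nothing once the cap is hit.
print(array_throughput(1, 6, 100, 500), array_throughput(1, 10, 100, 500))
# Controllers needed for 1250 MB/s at the optimistic and pessimistic caps.
print(controllers_needed(1250, 500), controllers_needed(1250, 300))
```

With a 500 MB/s cap the model needs 3 controllers for 1250 MB/s, and
with a 300 MB/s cap it needs 5.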

The only popular hardware design that comes to mind aimed at this sort of thing was Sun's "Thumper" design, most recently seen in the Sun Fire X4540.  That put 8 controllers with 6 disks attached to each, claiming "demonstrated up to 2 GB/sec from disk to network".  It will take a design like that, running across multiple controllers, to get what you're looking for on the disk side--presuming everything else keeps up.

One of the big SSD-on-PCI-e designs mentioned here already may very well end up being a better choice for you here though, as those aren't going to require quite as much hardware to get wired up.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

Re: How to achieve sustained disk performance of 1.25 GB write for 5 mins

From
Dimitri
Date:
You may also try Sun's F5100 (flash storage array): you can easily get
700 MB/s with just a single I/O stream (single process), so with just
2 streams you'll reach your throughput. The array has 2 TB of total
space, and max throughput should be around 4 GB/s.

Rgds,
-Dimitri


On 11/18/10, Greg Smith <greg@2ndquadrant.com> wrote:
> Eric Comeau wrote:
>> Ideally 1 large file, but it may have to be multiple. We find that if
>> we send multiple files it just causes the disk to thrash more so we
>> get better throughput by sending one large file.
>
> If it's really one disk, sure.  The problem you're facing is that your
> typical drive controller is going to top out at somewhere between 300 -
> 500MB/s of sequential writes before it becomes the bottleneck.  Above
> somewhere between 6 and 10 drives attached to one controller on current
> hardware, adding more to a RAID-0 volume only increases the ability to
> handle seeks quickly.  If you want to try and do this with traditional
> hard drives, I'd guess you'd need 3 controllers with at least 4
> short-stroked drives attached to each to have any hope of hitting
> 1.25GB/s.  Once you do that, you'll run into CPU time as the next
> bottleneck.  At that point, you'll probably need one CPU per controller,
> all writing out at once, to keep up with your target.
>
> The only popular hardware design that comes to mind aimed at this sort
> of thing was Sun's "Thumper" design, most recently seen in the Sun Fire
> X4540.  That put 8 controllers with 6 disks attached to each, claiming
> "demonstrated up to 2 GB/sec from disk to network".  It will take a
> design like that, running across multiple controllers, to get what
> you're looking for on the disk side--presuming everything else keeps up.
>
> One of the big SSD-on-PCI-e designs mentioned here already may very well
> end up being a better choice for you here though, as those aren't going
> to require quite as much hardware all get wired up.
>
> --
> Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
> PostgreSQL Training, Services and Support        www.2ndQuadrant.us
> "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
>
>