Re: Which hardware ? - Mailing list pgsql-performance
From | Scott Marlowe |
---|---|
Subject | Re: Which hardware ? |
Date | |
Msg-id | dcc563d10806171128s59518d6bi5468f621e7bd2ce3@mail.gmail.com |
In response to | Re: Which hardware ? ("Lionel" <lionel@art-informatique.com>) |
List | pgsql-performance |
On Tue, Jun 17, 2008 at 11:59 AM, Lionel <lionel@art-informatique.com> wrote:
> "Scott Marlowe" wrote:
>> You're absolutely right though, we really need to know the value of
>> fast performance here.
>
> the main problem is that my customers are used to have their reporting after
> few seconds.
> They want do have 10 times more data but still have the same speed, which
> is, I think, quite impossible.
>
>> If you're running aggregations of numbers used for filling out
>> quarterly reports, not so much.
>
> The application is used to analyse products sales behaviour, display charts,
> perform comparisons, study progression...
> 10-40 seconds seems to be a quite good performance.
> More than 1 minute will be too slow (meaning they won't pay for that).
>
> I did some test with a 20 millions lines database on a single disk dual core
> 2GB win XP system (default postgresql config), most of the time is spent in
> I/O: 50-100 secs for statements that scan 6 millions of lines, which will
> happen. Almost no CPU activity.
>
> So here is the next question: 4 disks RAID10 (did not find a french web host
> yet) or 5 disk RAID5 (found at 600euros/month) ?
> I don't want to have any RAID issue...
> I did not have any problem with my basic RAID1 since many years, and don't
> want that to change.

Do you have root access on your servers? Then just ask for 5 disks, with
one holding the OS / apps, and you'll do the rest. Software RAID is
probably a good fit for a cheap setup right now.

If you can set it up yourself, you might be best off with a RAID-1 of
more than 2 disks. Five 750G disks in a RAID-1 yield 750G of storage
(duh) but allow five different readers to operate without the heads
having to seek. Large amounts of data can be read at a medium speed from
a RAID-1 like this, but most RAID implementations don't aggregate
bandwidth for RAID-1. They do for RAID-0, so a huge RAID-0 array allows
reading a large chunk of data really fast from all disks at once.

RAID1+0 gives you the ability to tune this in either direction. But the
standard config of a 4 disk setup (striping two mirrors, each made from
two disks) is a good compromise to start with. Average read speed of the
array is doubled, and the ability to have two reads not conflict helps
too.

RAID5 is a compromise that provides the most storage while having
mediocre performance or, when degraded, horrifically poor performance.

Hard drives are cheap, hosting not as much.

Also, always look at optimizing their queries. A lot of analysis is done
by brute force queries that, rewritten intelligently, suddenly run in
minutes not hours, or seconds not minutes.
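
To put rough numbers on the RAID trade-offs above, here is a minimal
sketch in Python. It is an idealized model, not a benchmark: it assumes
identical disks, ignores controller and filesystem overhead, and takes
the per-disk sequential read rate (80 MB/s here) as a purely hypothetical
figure. The function names are made up for illustration. RAID-5 read
speed is left out on purpose, since nothing above gives numbers for it.

```python
from typing import Tuple


def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Usable space for an array of identical disks (idealized)."""
    if level == "raid0":
        return disks * disk_gb          # pure striping, no redundancy
    if level == "raid1":
        return disk_gb                  # every disk holds the same data
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("sketch models RAID-10 as a stripe over 2-disk mirrors")
        return (disks // 2) * disk_gb   # half the disks are mirror copies
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        return (disks - 1) * disk_gb    # one disk's worth of space goes to parity
    raise ValueError(f"unknown level: {level}")


def read_profile(level: str, disks: int, disk_mb_s: float) -> Tuple[float, int]:
    """Return (single-stream MB/s, readers that need not share a disk).

    Assumes striping aggregates bandwidth and mirroring does not, which
    matches the behaviour described above for most implementations.
    """
    if level == "raid0":
        return disks * disk_mb_s, 1     # one big read keeps every disk busy
    if level == "raid1":
        return disk_mb_s, disks         # each reader can get its own copy
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("sketch models RAID-10 as a stripe over 2-disk mirrors")
        return (disks // 2) * disk_mb_s, 2  # stripe width x mirror copies
    raise ValueError(f"no read model for level: {level}")


if __name__ == "__main__":
    DISK_GB = 750       # disk size used in the example above
    DISK_MB_S = 80.0    # hypothetical per-disk sequential read rate
    for level, n in (("raid1", 5), ("raid0", 4), ("raid10", 4)):
        cap = usable_capacity_gb(level, n, DISK_GB)
        stream, readers = read_profile(level, n, DISK_MB_S)
        print(f"{level:6} x{n}: {cap} GB, {stream} MB/s single stream, "
              f"{readers} independent reader(s)")
    print("raid5  x5:", usable_capacity_gb("raid5", 5, DISK_GB), "GB")
```

Under these assumptions the five-disk RAID-1 comes out at 750G with five
independent readers, the four-disk RAID-10 at 1500G with doubled
single-stream speed and two non-conflicting reads, and the five-disk
RAID-5 at 3000G of storage.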