Re: Hardware recommendations to scale to silly load - Mailing list pgsql-performance
From | scott.marlowe |
---|---|
Subject | Re: Hardware recommendations to scale to silly load |
Date | |
Msg-id | Pine.LNX.4.33.0308270909500.1338-100000@css120.ihs.com Whole thread Raw |
In response to | Hardware recommendations to scale to silly load (matt <matt@ymogen.net>) |
List | pgsql-performance |
On 27 Aug 2003, matt wrote: > I'm wondering if the good people out there could perhaps give me some > pointers on suitable hardware to solve an upcoming performance issue. > I've never really dealt with these kinds of loads before, so any > experience you guys have would be invaluable. Apologies in advance for > the amount of info below... > > My app is likely to come under some serious load in the next 6 months, > but the increase will be broadly predictable, so there is time to throw > hardware at the problem. > > Currently I have a ~1GB DB, with the largest (and most commonly accessed > and updated) two tables having 150,000 and 50,000 rows. > > A typical user interaction with the system involves about 15 > single-table selects, 5 selects with joins or subqueries, 3 inserts, and > 3 updates. The current hardware probably (based on benchmarking and > profiling) tops out at about 300 inserts/updates *or* 2500 selects per > second. > > There are multiple indexes on each table that updates & inserts happen > on. These indexes are necessary to provide adequate select performance. > > Current hardware/software: > Quad 700MHz PIII Xeon/1MB cache > 3GB RAM > RAID 10 over 4 18GB/10,000rpm drives > 128MB battery backed controller cache with write-back enabled > Redhat 7.3, kernel 2.4.20 > Postgres 7.2.3 (stock redhat issue) > > I need to increase the overall performance by a factor of 10, while at > the same time the DB size increases by a factor of 50. e.g. 3000 > inserts/updates or 25,000 selects per second, over a 25GB database with > most used tables of 5,000,000 and 1,000,000 rows. It will likely take a combination of optimizing your database structure / methods and increasing your hardware / OS performance. You probably, more than anything, should look at some kind of superfast, external storage array that has dozens of drives, and a large battery backed cache. You may be able to approximate this yourself with just a few dual channel Ultra 320 SCSI cards and a couple dozen hard drives. The more spindles you throw at a database, generally speaking, the more parallel load it can handle. You may find that once you get to 10 or 20 drives, RAID 5 or 5+0 or 0+5 will be outrunning 1+0/0+1 due to fewer writes. You likely want to look at the fastest CPUs with the fastest memory you can afford. those 700MHz xeons are likely using PC133 memory, which is painfully slow compared to the stuff pumping data out at 4 to 8 times the rate of the older stuff. Maybe an SGI Altix could do this? Have you looked at them? They're not cheap, but they do look to be quite fast, and can scale to 64 CPUs if need be. They're interbox communication fabric is faster than most CPU's front side busses. > Notably, the data is very time-sensitive, so the active dataset at any > hour is almost certainly going to be more on the order of 5GB than 25GB > (plus I'll want all the indexes in RAM of course). > > Also, and importantly, the load comes but one hour per week, so buying a > Starfire isn't a real option, as it'd just sit idle the rest of the > time. I'm particularly interested in keeping the cost down, as I'm a > shareholder in the company! Interesting. If you can't spread the load out, can you batch some parts of it? Or is the whole thing interactive therefore needing to all be done in real time at once? > So what do I need? whether you like it or not, you're gonna need heavy iron if you need to do this all in one hour once a week. > Can anyone who has (or has ever had) that kind of > load in production offer any pointers, anecdotes, etc? Any theoretical > musings also more than welcome. Comments upon my sanity will be > referred to my doctor. > > If the best price/performance option is a second hand 32-cpu Alpha > running VMS I'd be happy to go that way ;-) Actually, I've seen stuff like that going on Ebay pretty cheap lately. I saw a 64 CPU E10k (366 MHz CPUs) with 64 gigs ram and 20 hard drives going for $24,000 a month ago. Put Linux or BSD on it and Postgresql should fly.
pgsql-performance by date: