Re: Hardware recommendations to scale to silly load - Mailing list pgsql-performance
From | Ron Johnson |
---|---|
Subject | Re: Hardware recommendations to scale to silly load |
Date | |
Msg-id | 1061953146.18108.291.camel@haggis Whole thread Raw |
In response to | Hardware recommendations to scale to silly load (matt <matt@ymogen.net>) |
Responses |
Re: Hardware recommendations to scale to silly load
Re: Hardware recommendations to scale to silly load |
List | pgsql-performance |
On Tue, 2003-08-26 at 20:35, matt wrote: > I'm wondering if the good people out there could perhaps give me some > pointers on suitable hardware to solve an upcoming performance issue. > I've never really dealt with these kinds of loads before, so any > experience you guys have would be invaluable. Apologies in advance for > the amount of info below... > > My app is likely to come under some serious load in the next 6 months, > but the increase will be broadly predictable, so there is time to throw > hardware at the problem. > > Currently I have a ~1GB DB, with the largest (and most commonly accessed > and updated) two tables having 150,000 and 50,000 rows. > > A typical user interaction with the system involves about 15 > single-table selects, 5 selects with joins or subqueries, 3 inserts, and > 3 updates. The current hardware probably (based on benchmarking and > profiling) tops out at about 300 inserts/updates *or* 2500 selects per > second. > > There are multiple indexes on each table that updates & inserts happen > on. These indexes are necessary to provide adequate select performance. > > Current hardware/software: > Quad 700MHz PIII Xeon/1MB cache > 3GB RAM > RAID 10 over 4 18GB/10,000rpm drives > 128MB battery backed controller cache with write-back enabled Much more cache needed. Say 512MB per controller? > Redhat 7.3, kernel 2.4.20 > Postgres 7.2.3 (stock redhat issue) Upgrade to Pg 7.3.4! > I need to increase the overall performance by a factor of 10, while at > the same time the DB size increases by a factor of 50. e.g. 3000 Are you *sure* about that???? 3K updates/inserts per second xlates to 10,800,000 per hour. That, my friend, is a WHOLE HECK OF A LOT! > inserts/updates or 25,000 selects per second, over a 25GB database with Likewise: 90,000,000 selects per hour. > most used tables of 5,000,000 and 1,000,000 rows. > > Notably, the data is very time-sensitive, so the active dataset at any During the 1 hour surge, will SELECTs at 10 minutes after the hour depend on INSERTs at 5 minutes after the hour? If not, maybe you could pump the INSERT/UPDATE records into flat files, to be processed after the 1-hour surge is complete. That may reduce the h/w requirements. > hour is almost certainly going to be more on the order of 5GB than 25GB > (plus I'll want all the indexes in RAM of course). > > Also, and importantly, the load comes but one hour per week, so buying a Only one hour out of 168????? May I ask what kind of app it is? > Starfire isn't a real option, as it'd just sit idle the rest of the > time. I'm particularly interested in keeping the cost down, as I'm a > shareholder in the company! What a fun exercises. Ok, lets see: Postgres 7.3.4 RH AS 2.1 12GB RAM motherboard with 64 bit 66MHz PCI slots 4 - Xenon 3.0GHz (1MB cache) CPUs 8 - 36GB 15K RPM as RAID10 on a 64 bit 66MHz U320 controller having 512MB cache (for database) 2 - 36GB 15K RPM as RAID1 on a 64 bit 66MHz U320 controller having 512MB cache (for OS, swap, WAL files) 1 - library tape drive plugged into the OS' SCSI controller. I prefer DLT, but that's my DEC bias. 1 - 1000 volt UPS. If you know when the flood will be coming, you could perform SELECT * FROM ... WHERE statements on an indexed field, to pull the relevant data into Linux's buffers. Yes, the 8 disks is capacity-overkill, but the 8 high-speed spindles is what you're looking for. > So what do I need? Can anyone who has (or has ever had) that kind of > load in production offer any pointers, anecdotes, etc? Any theoretical > musings also more than welcome. Comments upon my sanity will be > referred to my doctor. > > If the best price/performance option is a second hand 32-cpu Alpha > running VMS I'd be happy to go that way ;-) I'd love to work on a GS320! You may even pick one up for a million or 2. The license costs for VMS & Rdb would eat you, though. Rdb *does* have ways, though, using large buffers and hashed indexes, with the table tuples stored on the same page as the hashed index keys, to make such accesses *blazingly* fast. > Many thanks for reading this far. -- ----------------------------------------------------------------- Ron Johnson, Jr. ron.l.johnson@cox.net Jefferson, LA USA "A C program is like a fast dance on a newly waxed dance floor by people carrying razors." Waldi Ravens
pgsql-performance by date: