Thread: Question about hosting and server grade
Hi. I have a question for people who run high-traffic websites.

We are considering a new dedicated server host for a set of 25 domains, about 5 of which are very high traffic (80 million clicks a day each). A lot of this is VIEW content, but there may be a million or so INSERTs and UPDATEs.

I am told that the biggest speed boost and performance comes from memory and fast hard disks. So I'm looking for at least 16GB of RAM and SCSI 10k 300GB hard disks.

We will use CentOS 5 with Apache 2. I am also told that PHP etc. is okay, but PostgreSQL (the database) is the one that hogs resources after a while. So for the database server I need a high-end server.

My question: What's the high-end recommendation? Would the following config of 4 x quadcore Dunnington Intels with 4 disks on RAID 10 be good enough for the above sites? Can I run a database on this config of servers for my kind of traffic, or do I need a separate one for PG? I suppose the traffic will grow large quite quickly so the 300GB may be low, but that we can add as we go along.

Thanks for any thoughts!

--------

Quad Processor Hex Core Intel 7450 - 2.40GHz (Dunnington) - 6 x 9MB (L2) 12MB (L3) cache
Second Processor Hex Core Intel 7450 - 2.40GHz (Dunnington) - 6 x 9MB (L2) 12MB (L3)
Third Processor Hex Core Intel 7450 - 2.40GHz (Dunnington) - 6 x 9MB (L2) 12MB (L3) cache
Fourth Processor Hex Core Intel 7450 - 2.40GHz (Dunnington) - 6 x 9MB (L2) 12MB (L3) cache
16 GB FB-DIMM Registered 533/667
1000 Mbps public uplink
1000 Mbps private uplink
Disk Controller RAID 10
HD1: 300GB SA-SCSI 10K RPM
HD2: 300GB SA-SCSI 10K RPM
HD3: 300GB SA-SCSI 10K RPM
HD4: 300GB SA-SCSI 10K RPM
CentOS 5 (32 bit)
On Wed, Mar 25, 2009 at 11:42 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Hi. I have a question for people who run high traffic websites.
>
> We are considering a new dedicated server host for a set of 25
> domains, about 5 of which are very high traffic (80 million clicks a
> day each). A lot of this is VIEW content, but there may be a million
> or so INSERTs and UPDATEs.

Given an 8 hour day, and all million happening then, that's 1000000 / (8*60*60), or about 35 transactions per second. That's not too bad.

> I am told that the biggest speed boost and performance comes from
> memory and fast hard disk. So I'm looking for at least a 16GB RAM and
> SCSI 10k 300GB hard disks.
>
> We will use CentOS 5 with Apache 2. I am also told that PHP etc is
> okay, but Postgresql (the database) is the one that hogs resources
> after a while. So for the database server I need a high end server.

It's not so much that the database hogs resources; it's that database load is hard to distribute across more than one server. I can build a farm with 10 PHP servers and a load balancer easily enough. Building a farm of 10 db servers that replicate between each other is much more work, and may or may not scale particularly well. So, with a DB, you are putting more eggs in fewer baskets.

> My question: What's the high end recommendation? Would the following
> config of 4 x quadcore Dunnington Intels with 4 disks on RAID 10 be
> good enough for the above sites? Can I run a database on this config
> of servers for my kind of traffic, or do I need a separate one for PG?
> I suppose the traffic will grow large quite quickly so the 300GB may
> be low, but that we can add as we go along.

I'd spend more money on your disks and RAID controllers, and less on CPUs. If you have all those cores and 16 or 32 Gig of RAM, and your RAID controller / 4 disk RAID-10 is your choke point, you can't just upgrade overnight.

Spend your money on more RAM (32G doesn't cost much more than 16G, and I've seen it make a world of difference on our servers). Spend it on disks. Number of disks is often more important than RPM etc. Spend it on fast RAID controllers with battery backed cache. Then, consider upgrading your CPUs. We have 8 Opteron cores in our servers, and 12 disk RAID-10s under a very fast RAID controller, and we are still I/O bound, not CPU bound.

Move pg_xlog to its own RAID-1 set. As a minimum, buy a server with enough expansion slots that you can add the disks later. The cost difference between a 4 drive 1U case and a 16 drive 3U case is not all that much, and it gives you the option of adding some drives as you go along.

But all of this depends on the type of workload your db has to do. If you're running memory hungry select queries, focus on more memory. If you're running lots and lots of little queries with a mix of update, insert, delete and select, focus on the drives / controller. If you're running queries that require a lot of CPU, then focus more on that.

I haven't seen a lot of workloads that are CPU heavy enough to need 16 cores and only 4 drives. I have seen a lot that required 2 cores and 40+ drives to run fast.

So the real answer is to test your workload on something close to what you're looking at using for a db server and look for bottlenecks. I'm betting I/O will be the biggest one once you've got enough memory.
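The back-of-the-envelope rate in the reply above can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, the 8 hour window is the assumption stated in the reply rather than a measured figure, and the second call assumes the clicks are spread evenly over 24 hours, which real traffic will not be.

```python
def average_rate_per_second(events, hours):
    """Average events per second if `events` are spread evenly over `hours`."""
    return events / (hours * 60 * 60)


# One million INSERTs/UPDATEs squeezed into an 8 hour day
# (the assumption made in the reply above).
writes_per_second = average_rate_per_second(1_000_000, 8)

# Five high-traffic sites at 80 million clicks a day each,
# assumed evenly spread over 24 hours (peaks will be higher).
clicks_per_second = average_rate_per_second(5 * 80_000_000, 24)

print(f"writes: {writes_per_second:.1f}/s")
print(f"clicks: {clicks_per_second:.1f}/s")
```

These are averages only. Sizing for the busiest hour rather than the daily mean would need real traffic logs, which the thread does not provide.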
On Wed, 2009-03-25 at 23:12 +0530, Phoenix Kiula wrote:
> Hi. I have a question for people who run high traffic websites.
>
> My question: What's the high end recommendation? Would the following
> config of 4 x quadcore Dunnington Intels with 4 disks on RAID 10 be
> good enough for the above sites? Can I run a database on this config
> of servers for my kind of traffic, or do I need a separate one for PG?
> I suppose the traffic will grow large quite quickly so the 300GB may
> be low, but that we can add as we go along.
>

A 4 disk RAID 10 will give you ~100MB/s of random writes, max. What type of IO are you using now?

(This also assumes an actual decent RAID controller.)

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
On Wed, 25 Mar 2009 13:19:12 -0600 Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Spend your money on more RAM, (32G isn't much more than 16G and
> I've seen it make a world of difference on our servers). Spend it
> on disks. Number of disks is often more important than RPM etc.
> Spend it on fast RAID controllers with battery backed cache.
> Then, consider upgrading your CPUs. We have 8 opteron cores in
> our servers, and 12 Disk RAID-10s under a very fast RAID
> controller, and we are still I/O not CPU bound.

[snip]

> But all of this depends on the type of workload your db has to
> do. If you're running memory hungry select queries, focus on more
> memory. If you're running lots and lots of little queries with a
> mix of update, insert, delete and select, focus on the drives /
> controller. If you're running queries that require a lot of CPU,
> then focus more on that.

Could IO load show up as apparent CPU load?

I mean, I have a pretty small DB. It should fit nearly all in RAM... or at least... after 1 day of load I can see the box may use 50K of swap.

Anyway, when I update the main table (~1M rows and a GIN index) I can see the CPU reaching its limit. The most frequent updates involve 5K-20K changed records.

Under normal workload the most intensive queries run in 200ms, with few exceptions, and the BIG table is mostly accessed read-only.

It would be nice if the updates were a bit faster, since I'm still forced to do them during working hours... because people on the other side are still convinced it is not worth cleaning up rubbish at the source, so sometimes updates fail because of inconsistent data.

Unfortunately... I can add RAM and disks, but all the CPU sockets are used.

The box has 2 old Xeon HT CPUs at 3.2GHz. It's on RAID5 (not my choice) on a decent controller and has 4GB of RAM.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
On Thu, Mar 26, 2009 at 09:06:03AM +0100, Ivan Sergio Borgonovo wrote:
> Could IO load show up as apparent CPU load?

I may not be interpreting you correctly; but, as I understand it, if your IO subsystem is too slow then your CPUs are going to be idling. So if your CPUs are sitting at 100% utilisation then you're CPU bound and not IO bound.

If your dataset mainly fits in RAM then SELECTs are always going to be CPU (or RAM to cache bandwidth) bound. You'll always be waiting for your disks when you modify data, so if you consider your UPDATEs too slow you should look at what's going on in your system when they're happening.

--
Sam http://samason.me.uk/
On Thursday 26 March 2009, Ivan Sergio Borgonovo <mail@webthatworks.it> wrote:
> Could IO load show up as apparent CPU load?

It would show up as CPU busy in iowait state. If the CPU is actually busy it would show mostly in user state, some in system.

--
Even a sixth-grader can figure out that you can’t borrow money to pay off your debt
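One way to see the iowait / user / system split described above is to sample the aggregate "cpu" line of /proc/stat twice and compare the counters. The following is a minimal sketch for Linux only, not something posted in the thread: the function names are hypothetical, and the field order is the one documented for /proc/stat in proc(5).

```python
import time

# Field order of the aggregate "cpu" line in Linux /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu  user nice system idle iowait ...' line into {state: ticks}."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {k: 100.0 * d / total for k, d in deltas.items()}


def sample_shares(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_shares(first, second)
```

Calling sample_shares() while one of the slow UPDATEs is running would give a rough picture: a large iowait share points at the disks, while a large user share points at the CPU itself. Tools such as top and vmstat report the same categories, so this is only meant to show where those numbers come from.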