Thread: Hardware suggestions for Linux/PGSQL server
Hi everyone,

I want to pick your brains for hardware suggestions about a Linux-based
PostgreSQL 7.4 server. It will be a dedicated DB server backing our web
sites and hit by application servers (which do connection pooling). I've
hopefully provided all relevant information below. Any thoughts,
comments or suggestions are welcome.

Our current server and database:
  Mac OS X Server 10.2.8
  single 1.25GHz G4
  2 GB 333MHz RAM
  7200 rpm SCSI drive for OS, logs
  15k rpm SCSI drive for data
  PostgreSQL 7.3.4
  1 database, 1.1 GB in size, growing by ~15 MB / week
  60 tables, 1 schema, largest is 1m rows, 1 at 600k, 3 at 100k

Peak traffic:
  500 UPDATEs, INSERTs and DELETEs / minute
  6000 SELECTs / minute
  90 connections

Performance is fine most of the time, but not during peak loads. We're
never swapping, and disk IO during the SELECT peaks is hardly anything
(under 3MB/sec). I think the UPDATE peaks might be saturating disk IO.
Normally, most queries finish in under .05 seconds. Some take 2-3
seconds. During peaks, the fast queries are just OK and the slower ones
take too long (over 8 seconds).

We're moving to Linux from OS X for improved stability and more hardware
options. We need to do this soon. The current server is maxed out at 2GB
RAM and I'm afraid it might start swapping in a month.

Projected database/traffic in 12 months:
  Database size will be at least 2.5 GB
  Largest table still 1m rows, but the 100k tables will grow to 250k
  Will be replicated to a suitable standby slave machine

Projected peak traffic:
  2k UPDATEs, INSERTs, DELETEs / minute
  20k SELECTs / minute
  150 - 200 connections

We're willing to shell out extra bucks to get something that will
undoubtedly handle the projected peak load in 12 months with excellent
performance. But we're not familiar with PG's performance on Linux and
don't like to waste money. I've been thinking of this (overkill? not
enough?):

  2 Intel 32-bit CPUs
  Lowest clock speed chip for the fastest available memory bus
  4 GB RAM (maybe we only need 3 GB to start with?)
  SCSI RAID 1 for OS
  For PostgreSQL data and logs:
    15k rpm SCSI disks
    RAID 5, 7 disks, 256MB battery-backed write cache
    (Should we save $ and get a 4-disk RAID 10 array?)

I wonder about the 32bit+bigmem vs. 64bit question. At what database
size will we need more than 4GB RAM? We'd like to always have enough RAM
to cache the entire database. While 64bit is in our long-term future,
we're willing to stick with 32bit Linux until 64bit Linux on
Itanium/Opteron and 64bit PostgreSQL "settle in" to proven production
quality.

TIA,
- Jeff

--
Jeff Bohmer
VisionLink, Inc.
_________________________________
303.402.0170
www.visionlink.org
_________________________________
People. Tools. Change. Community.
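P.S. For anyone who wants to sanity-check the projected peak load,
here's a quick back-of-envelope in Python. The cache hit ratio and
I/Os-per-write figures are my guesses, not measurements:

    # Projected peak load from above, converted to per-second rates and
    # a rough disk I/O estimate. The cache hit ratio and I/Os per write
    # are guesses, not measurements.
    writes_per_min = 2000    # UPDATEs, INSERTs, DELETEs
    selects_per_min = 20000

    writes_per_sec = writes_per_min / 60.0
    selects_per_sec = selects_per_min / 60.0

    cache_hit_ratio = 0.99   # assuming the whole DB fits in RAM
    ios_per_write = 2        # data page + WAL, very roughly

    read_iops = selects_per_sec * (1 - cache_hit_ratio)
    write_iops = writes_per_sec * ios_per_write

    print("writes/sec: %.0f, selects/sec: %.0f"
          % (writes_per_sec, selects_per_sec))
    print("rough read IOPS: %.0f, rough write IOPS: %.0f"
          % (read_iops, write_iops))

Even with generous guesses, the write side dominates the disk load,
which matches what we're seeing with the UPDATE peaks today.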
Jeff Bohmer wrote:
> We're willing to shell out extra bucks to get something that will
> undoubtedly handle the projected peak load in 12 months with excellent
> performance. But we're not familiar with PG's performance on Linux and
> don't like to waste money.

Properly tuned, PG on Linux runs really nicely. A few people have
mentioned that the VM swapping algorithm on Linux is semi-dumb. I get
around that problem by having a ton of memory and almost no swap.

> I've been thinking of this (overkill? not enough?):
>   2 Intel 32-bit CPUs
>   Lowest clock speed chip for the fastest available memory bus
>   4 GB RAM (maybe we only need 3 GB to start with?)
>   SCSI RAID 1 for OS
>   For PostgreSQL data and logs:
>     15k rpm SCSI disks
>     RAID 5, 7 disks, 256MB battery-backed write cache
>     (Should we save $ and get a 4-disk RAID 10 array?)
>
> I wonder about the 32bit+bigmem vs. 64bit question. At what database
> size will we need more than 4GB RAM?

With 4GB of RAM, you're already running into bigmem. By default, Linux
gives 2GB of address space to programs and 2GB to the kernel. I usually
see people quote a 5%-15% penalty in general for using PAE versus a flat
address space. I've seen simple MySQL benchmarks where 64-bit versions
run 35%+ faster than 32-bit+PAE, but how that translates to PG, I dunno
yet.

> We'd like to always have enough RAM to cache the entire database.
> While 64bit is in our long-term future, we're willing to stick with
> 32bit Linux until 64bit Linux on Itanium/Opteron and 64bit PostgreSQL
> "settle in" to proven production quality.

Well, if this is the case, you probably should get an Opteron server
*now* and just run 32-bit Linux on it until you're sure about the
software. No point in buying a Xeon and then throwing the machine away
in a year when you decide you need 64-bit for more speed.
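One footnote on "properly tuned": the kernel knob you'll almost
certainly have to raise is SHMMAX, or the postmaster won't start with a
decent shared_buffers setting. A minimal sketch in Python, run as root
(the 512MB value is just illustrative, not a recommendation):

    # Raise the kernel's SHMMAX so PostgreSQL can allocate its shared
    # memory segment at startup. 512MB here is illustrative only.
    shmmax = 512 * 1024 * 1024
    f = open('/proc/sys/kernel/shmmax', 'w')
    f.write(str(shmmax))
    f.close()

Setting kernel.shmmax in /etc/sysctl.conf makes it stick across reboots.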
>Properly tuned, PG on Linux runs really nicely. A few people have
>mentioned that the VM swapping algorithm on Linux is semi-dumb. I get
>around that problem by having a ton of memory and almost no swap.

I think we want your approach: enough RAM to avoid swapping altogether.

>With 4GB of RAM, you're already running into bigmem. By default,
>Linux gives 2GB of address space to programs and 2GB to the kernel.

It seems I don't fully understand the bigmem situation. I've searched
the archives, googled, checked RedHat's docs, etc. But I'm getting
conflicting, incomplete and/or out-of-date information. Does anyone have
pointers to bigmem info or configuration for the 2.4 kernel?

If Linux is set up with 2GB for kernel and 2GB for user, would that be
OK with a DB size of 2-2.5 GB? I'm figuring the kernel will cache
most/all of the DB in its 2GB and there's 2GB left for PG processes.
Where do PG's SHM buffers live, kernel or user? (I don't plan on going
crazy with buffers, but I'd guess we need about 128MB, 256MB at most.)

>I usually see people quote a 5%-15% penalty in general for using PAE
>versus a flat address space. I've seen simple MySQL benchmarks where
>64-bit versions run 35%+ faster than 32-bit+PAE, but how that
>translates to PG, I dunno yet.
>
>>We'd like to always have enough RAM to cache the entire database.
>>While 64bit is in our long-term future, we're willing to stick with
>>32bit Linux until 64bit Linux on Itanium/Opteron and 64bit
>>PostgreSQL "settle in" to proven production quality.
>
>Well, if this is the case, you probably should get an Opteron server
>*now* and just run 32-bit Linux on it until you're sure about the
>software. No point in buying a Xeon and then throwing the machine
>away in a year when you decide you need 64-bit for more speed.

That's a good point. I had forgotten about the option to run 32bit on an
Opteron. If we had 3GB or 4GB initially on an Opteron, we'd need bigmem
for 32bit Linux, right?

This might work nicely since we'd factor in the penalty from PAE for now
and have the performance boost from moving to 64bit available on demand.
Not having to build another DB server in a year would also be nice.

FYI, we need stability first and performance second.

Thank you,
- Jeff

--
Jeff Bohmer
VisionLink, Inc.
_________________________________
303.402.0170
www.visionlink.org
_________________________________
People. Tools. Change. Community.
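P.S. For my own reference, the arithmetic behind the buffer sizes I
mentioned, since shared_buffers is counted in 8KB pages in a default
build:

    # shared_buffers is a page count; pages are 8KB in a default build.
    page_kb = 8
    for target_mb in (128, 256):
        pages = target_mb * 1024 // page_kb
        print("%d MB -> shared_buffers = %d" % (target_mb, pages))
    # 128 MB -> shared_buffers = 16384
    # 256 MB -> shared_buffers = 32768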
Just one more piece of advice: you might want to look into a good
battery-backed cache hardware RAID controller. They work quite well for
heavily updated databases. The more drives you throw at the RAID array,
the faster it will be.
Jeff Bohmer wrote:
> It seems I don't fully understand the bigmem situation. I've searched
> the archives, googled, checked RedHat's docs, etc. But I'm getting
> conflicting, incomplete and/or out-of-date information. Does anyone
> have pointers to bigmem info or configuration for the 2.4 kernel?

Bigmem is the name for Linux's PAE support.

> If Linux is set up with 2GB for kernel and 2GB for user, would that be
> OK with a DB size of 2-2.5 GB? I'm figuring the kernel will cache
> most/all of the DB in its 2GB and there's 2GB left for PG processes.
> Where do PG's SHM buffers live, kernel or user? (I don't plan on going
> crazy with buffers, but I'd guess we need about 128MB, 256MB at most.)

PG's SHM buffers live in user space. As for whether Linux's OS cache
lives in user or kernel space: I think it's in kernel, and I remember
reading a max of ~950MB w/o bigmem, which means your 3.5GB of available
OS memory will definitely have to be swapped in and out of kernel space
using PAE.

>> Well, if this is the case, you probably should get an Opteron server
>> *now* and just run 32-bit Linux on it until you're sure about the
>> software. No point in buying a Xeon and then throwing the machine away
>> in a year when you decide you need 64-bit for more speed.
>
> That's a good point. I had forgotten about the option to run 32bit on
> an Opteron. If we had 3GB or 4GB initially on an Opteron, we'd need
> bigmem for 32bit Linux, right?
>
> This might work nicely since we'd factor in the penalty from PAE for
> now and have the performance boost from moving to 64bit available on
> demand. Not having to build another DB server in a year would also be
> nice.
>
> FYI, we need stability first and performance second.

We ordered a 2x Opteron server the moment the CPU was released and it's
been perfect -- except for one incident where the PCI riser card had
drifted out of the PCI slot due to the heavy SCSI cables connected to
the card. I think most of the Opteron server MBs are pretty solid, but
if you want extra peace of mind, you could get a server from Newisys as
they pack in a cartload of extra monitoring features.
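By the way, a quick way to see how much memory a given kernel treats as
low (direct-mapped) versus high (only reachable via PAE/highmem) is
/proc/meminfo. A small Python sketch, assuming the LowTotal/HighTotal
fields your 2.4 kernel exports:

    # Report how much RAM the kernel sees as low (direct-mapped) vs.
    # high (bigmem/PAE only). Field names are as on 2.4 kernels.
    info = {}
    for line in open('/proc/meminfo'):
        if ':' in line:
            key, value = line.split(':', 1)
            info[key.strip()] = value.strip()

    for key in ('MemTotal', 'LowTotal', 'HighTotal'):
        print('%-10s %s' % (key, info.get(key, 'not present')))
    # A HighTotal of 0 means the kernel was built without highmem
    # support and can't use RAM much beyond ~896MB.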
Jeff Bohmer wrote:
>> Well, if this is the case, you probably should get an Opteron server
>> *now* and just run 32-bit Linux on it until you're sure about the
>> software. No point in buying a Xeon and then throwing the machine away
>> in a year when you decide you need 64-bit for more speed.
>
> That's a good point. I had forgotten about the option to run 32bit on
> an Opteron. If we had 3GB or 4GB initially on an Opteron, we'd need
> bigmem for 32bit Linux, right?
>
> This might work nicely since we'd factor in the penalty from PAE for
> now and have the performance boost from moving to 64bit available on
> demand. Not having to build another DB server in a year would also be
> nice.

FWIW, there are only two pieces of software that need to be 64-bit aware
for a typical server job: the kernel and glibc. The rest of the apps can
do fine as 32-bit, unless you are Oracle and insist on outsmarting the
OS.

In fact, running 32-bit apps on a 64-bit OS has plenty of advantages,
like using the cache effectively. Unless you need 64bit, going for 64bit
software is not advised.

Shridhar

--
-----------------------------
Shridhar Daithankar
LIMS CPE Team Member, PSPL.
mailto:shridhar_daithankar@persistent.co.in
Phone:- +91-20-5676700 Extn.270
Fax :- +91-20-5676701
-----------------------------
Shridhar Daithankar wrote:
>
> FWIW, there are only two pieces of software that need to be 64-bit
> aware for a typical server job: the kernel and glibc. The rest of the
> apps can do fine as 32-bit, unless you are Oracle and insist on
> outsmarting the OS.
>
> In fact, running 32-bit apps on a 64-bit OS has plenty of advantages,
> like using the cache effectively. Unless you need 64bit, going for
> 64bit software is not advised.

This is a good point. While doing research on this matter a few months
back, I saw comments by people testing 64-bit MySQL that some operations
would run faster and some slower due to the use of 64-bit datatypes
versus 32-bit. The best solution in the end is probably to run 32-bit
Postgres under a 64-bit kernel -- unless your DB tends to have a lot of
64-bit datatypes.
>Just one more piece of advice: you might want to look into a good
>battery-backed cache hardware RAID controller. They work quite well
>for heavily updated databases. The more drives you throw at the RAID
>array, the faster it will be.

I've often seen this list recommend such a setup. We'll probably get
battery-backed write cache and start out with a 4-disk RAID 10 array.
Then add more disks and switch to RAID 5 if more read performance is
needed.

Thanks,
- Jeff

--
Jeff Bohmer
VisionLink, Inc.
_________________________________
303.402.0170
www.visionlink.org
_________________________________
People. Tools. Change. Community.
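P.S. Here's the rule-of-thumb arithmetic that pushed us toward RAID 10
for the write-heavy case, in Python. The per-drive figure is my guess
for a 15k rpm SCSI disk, and the I/O counts are the textbook ones:

    # Small-random-write throughput rule of thumb.
    # RAID 10: each logical write costs 2 disk I/Os (one per mirror).
    # RAID 5:  each logical write costs 4 disk I/Os (read data, read
    #          parity, write data, write parity).
    iops_per_drive = 150.0   # guess for a 15k rpm SCSI disk

    def raid10_write_iops(n_drives):
        return n_drives * iops_per_drive / 2

    def raid5_write_iops(n_drives):
        return n_drives * iops_per_drive / 4

    print("4-disk RAID 10: ~%.0f write IOPS" % raid10_write_iops(4))
    print("7-disk RAID 5:  ~%.0f write IOPS" % raid5_write_iops(7))

By that estimate, the 4-disk RAID 10 actually edges out the 7-disk
RAID 5 on random writes, though the RAID 5 wins on sequential reads and
raw capacity.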
>Shridhar Daithankar wrote:
>>
>>FWIW, there are only two pieces of software that need to be 64-bit
>>aware for a typical server job: the kernel and glibc. The rest of the
>>apps can do fine as 32-bit, unless you are Oracle and insist on
>>outsmarting the OS.
>>
>>In fact, running 32-bit apps on a 64-bit OS has plenty of advantages,
>>like using the cache effectively. Unless you need 64bit, going for
>>64bit software is not advised.
>
>This is a good point. While doing research on this matter a few
>months back, I saw comments by people testing 64-bit MySQL that some
>operations would run faster and some slower due to the use of 64-bit
>datatypes versus 32-bit. The best solution in the end is probably to
>run 32-bit Postgres under a 64-bit kernel -- unless your DB tends to
>have a lot of 64-bit datatypes.

Thanks Shridhar and William,

This advice has been very helpful. I would imagine a lot of folks are,
or soon will be, looking at 32- vs. 64-bit just for memory reasons and
not for 64-bit apps.

- Jeff

--
Jeff Bohmer
VisionLink, Inc.
_________________________________
303.402.0170
www.visionlink.org
_________________________________
People. Tools. Change. Community.
I don't know what your budget is, but there are now 10k RPM SATA 150
drives on the market. Their price/performance is impressive. You may
want to consider going with a bunch of these instead of SCSI disks (more
spindles vs. faster spindles). 3ware makes a hardware RAID card that can
drive up to 12 SATA disks. I have been told by a few people who have
used it that the Linux driver is very solid.

Drew

Jeff Bohmer wrote:
>> Just one more piece of advice: you might want to look into a good
>> battery-backed cache hardware RAID controller. They work quite well
>> for heavily updated databases. The more drives you throw at the RAID
>> array, the faster it will be.
>
> I've often seen this list recommend such a setup. We'll probably get
> battery-backed write cache and start out with a 4-disk RAID 10 array.
> Then add more disks and switch to RAID 5 if more read performance is
> needed.
>
> Thanks,
> - Jeff
In the last exciting episode, drew@xyzzy.dhs.org ("Andrew G. Hammond") wrote:
> I don't know what your budget is, but there are now 10k RPM SATA 150
> drives on the market. Their price/performance is impressive. You may
> want to consider going with a bunch of these instead of SCSI disks
> (more spindles vs. faster spindles). 3ware makes a hardware RAID card
> that can drive up to 12 SATA disks. I have been told by a few people
> who have used it that the Linux driver is very solid.

We got a couple of those in for testing purposes; when opportunity
presents itself, I'll have to check to see if they are any more honest
about commits than traditional IDE drives.

If they still "lie" the same way IDE drives do, it is entirely possible
that they are NOT nearly as impressive as you presently imagine. It's
not much good if they're "way fast" if you can't trust them to actually
store data when they claim it is stored...

--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/lisp.html
"Much of this software was user-friendly, meaning that it was intended
for users who did not know anything about computers, and furthermore had
absolutely no intention whatsoever of learning."
-- A. S. Tanenbaum, "Modern Operating Systems, ch 1.2.4"
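P.S. The honesty test I have in mind is crude but telling: time a stream
of fsync()ed writes and compare against the platter's rotational
ceiling. A sketch in Python (filename and sizes are arbitrary; run it on
a filesystem that lives on the drive in question):

    # Time a run of fsync()ed writes. A drive that really flushes to
    # the platter can't complete much more than one synced write per
    # revolution: ~120/sec at 7200 rpm, ~250/sec at 15k rpm.
    import os, time

    N = 500
    fd = os.open('fsync_test_file', os.O_WRONLY | os.O_CREAT, 0o600)
    start = time.time()
    for i in range(N):
        os.write(fd, b'x' * 512)
        os.fsync(fd)
    elapsed = time.time() - start
    os.close(fd)
    os.unlink('fsync_test_file')

    print("%.0f fsyncs/sec" % (N / elapsed))
    # Thousands per second here means the drive (or its controller) is
    # caching writes and acknowledging them before they're durable.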
>In the last exciting episode, drew@xyzzy.dhs.org ("Andrew G. Hammond") wrote:
>> I don't know what your budget is, but there are now 10k RPM SATA 150
>> drives on the market. Their price/performance is impressive. You may
>> want to consider going with a bunch of these instead of SCSI disks
>> (more spindles vs. faster spindles). 3ware makes a hardware RAID card
>> that can drive up to 12 SATA disks. I have been told by a few people
>> who have used it that the Linux driver is very solid.
>
>We got a couple of those in for testing purposes; when opportunity
>presents itself, I'll have to check to see if they are any more honest
>about commits than traditional IDE drives.
>
>If they still "lie" the same way IDE drives do, it is entirely
>possible that they are NOT nearly as impressive as you presently
>imagine. It's not much good if they're "way fast" if you can't trust
>them to actually store data when they claim it is stored...

We lost data because of this very problem when a UPS didn't signal the
shutdown before it ran out of juice. Here's an excellent explanation of
the problem:
http://archives.postgresql.org/pgsql-general/2003-10/msg01343.php

This post indicates that SATA drives still have problems, but a new ATA
standard might fix things in the future:
http://archives.postgresql.org/pgsql-general/2003-10/msg01395.php

SATA RAID is a good option for a testing server, though.

- Jeff

--
Jeff Bohmer
VisionLink, Inc.
_________________________________
303.402.0170
www.visionlink.org
_________________________________
People. Tools. Change. Community.