Thread: Server Hardware Configuration
We are running PostgreSQL as the back-end to a spam scanning system. The database holds suspected spam, and user configuration information. A web interface allows people to accept, or (usually) discard the trapped messages. So, most data is write once, read at most once, delete.

The total size of the db is about 16 gig, and we expect it could grow to 4 times this as more users are opted into spam scanning. During most of the day, the machine is only lightly loaded. There are two bursts of activity: the nightly vacuum, and the first-thing-in-the-morning spam checking.

Our current db machine has two hyper-threaded 2.4 GHz Xeon processors, 4 gig of main memory, and is attached to a JBOD configured with RAID 5 for the database, and mirrored disks for the DB logs.

It is time to upgrade the machine. Two possibilities present themselves.

1. PowerEdge 6850
   4 3.16 GHz Xeon processors
   16 gig of memory
   Internal RAID 5 (only 3 disks)
   2 Mirrored disks for root and db log.

2. PowerEdge 2850
   2 Dual core 2.8GHz Xeon processors
   8 gig of memory
   JBOD with RAID 5, and mirrored db log.

Both configurations will cost about the same, within $\Delta$ for an acceptable value of $\Delta$. The idea behind the first is to keep the entire database in memory, by way of the disk cache. Alas, to keep it affordable (the extra memory is expensive) the JBOD must be jettisoned. The second is a larger version of our current configuration. (The 6850 with a JBOD would stretch the budget beyond $\Delta$, and the expense would be difficult to justify.)

I'm looking for any comments, or suggestions. With expected growth, the first configuration seems out of balance---it will likely start off fast, but with growth the slower disk configuration will likely be a problem. Is anybody running PostgreSQL in a large memory, slower disk configuration? What are your experiences?

Thank You,

Mike

P.S. We are investigating if the current IBM JBOD will work with the Dell PERC cards. But, even if it does, the current JBOD is populated with disks that will soon need extended warranties, and so become progressively costly.

--
Michael D. Sofka sofkam@rpi.edu
C&CT Sr. Systems Programmer Email, TeX, epistemology.
Rensselaer Polytechnic Institute, Troy, NY. http://www.rpi.edu/~sofkam/
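The "keep the entire database in memory" idea can be put into rough numbers. The following is only a back-of-the-envelope sketch: the function name and the 1 gig reserved for the OS and PostgreSQL itself are assumptions for illustration, not figures from the post.

```python
def cache_coverage(ram_gb: float, db_gb: float, reserved_gb: float = 1.0) -> float:
    """Fraction of the database that could sit in the OS disk cache.

    reserved_gb is an assumed allowance for the OS and PostgreSQL itself;
    the real figure depends on the workload and configuration.
    """
    usable = max(ram_gb - reserved_gb, 0.0)
    return min(usable / db_gb, 1.0)


# The two proposed machines, at today's size (16 gig) and after 4x growth (64 gig).
for name, ram in (("PowerEdge 6850, 16 gig RAM", 16), ("PowerEdge 2850, 8 gig RAM", 8)):
    for db in (16, 64):
        share = cache_coverage(ram, db)
        print(f"{name}: {db} gig db -> about {share:.0%} cacheable")
```

Under these assumptions the 16 gig machine covers nearly all of today's database but only about a quarter of it after the expected growth, which is the imbalance the post worries about.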
Performance questions are terrible to answer because we all use our systems in different ways. Here's my 2 bits for what they're worth.

> The idea behind the first is to keep the
> entire database in memory, by way of the disk cache.

What you describe is a real-time system. Do your requirements call for real-time performance? Remember that performance from memory is much more expensive than disk i/o (at least up to a certain point).

> A web interface allows people to accept, or (usually)
> discard the trapped messages.

IMHO this means no real-time.

> So, most data is write once, read at most once, delete.

This is not exactly an optimum case for caching. I would suggest thinking really hard before going for an all-memory solution. From what you write I would suggest a firm focus on disk i/o.

But in the end the best person to ask really is yourself. How is the system loaded? Is disk i/o maxed out? Are the CPUs overloaded? Is it paging like crazy?

I suggest looking at your current bottleneck first. It's likely to be the most cost-efficient route out. There could also be some magic pg tuning tricks out there you've missed. Who knows. I spent two weeks on Google and tuning last year with amazing results.

Also remember that pg collects performance statistics. Those will help a lot in finding out what's really going on. They can also reveal where the database is being used inefficiently by the application, missing indexes etc.

At present I'm running a 130 gig database on a 4 gig mem four-way 6650 with an external (fiber channel) RAID 5 box. I admit the load is not that terrible or time sensitive (web self-service) but the performance is still pretty hot. The requests are very scattered so i/o is key. My cache benefit is mostly from the indexes.

Good luck in your quest for "bang per buck" ;-)

Cheers,
John
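As one way to act on the statistics advice above: PostgreSQL's I/O statistics views (for example pg_statio_user_tables) report per-table counters such as heap_blks_hit and heap_blks_read. A minimal sketch of turning two such counters into a hit ratio follows; the function name and the sample numbers are hypothetical, and the counters only see PostgreSQL's own buffer cache, not the OS disk cache.

```python
def hit_ratio(blocks_hit: int, blocks_read: int) -> float:
    """Share of block requests served from PostgreSQL's shared buffers.

    blocks_hit  -- e.g. heap_blks_hit for a table
    blocks_read -- e.g. heap_blks_read (blocks requested from the OS)
    Returns 0.0 when the table has seen no traffic at all.
    """
    total = blocks_hit + blocks_read
    return blocks_hit / total if total else 0.0


# Hypothetical counter values, for illustration only.
print(f"{hit_ratio(900, 100):.1%} of block requests were buffer hits")
```

A low ratio on a write-once, read-once table is expected and does not by itself argue for more memory; a low ratio on index blocks would be more telling.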
Two general comments: most people find that Opterons perform much better than Xeons. With some versions of PostgreSQL, the difference is over 50%.

RAID5 generally doesn't make for a fast database. The problem is that there is a huge amount of overhead every time you go to write something out to a RAID5 array. With careful tuning of the background writer you might be able to avoid some of that penalty, though your read performance will likely still be affected by the write overhead.

BTW, -performance is a better list for info about this. If you look in the archives you'll be able to read a lot of threads from people seeking hardware advice.

On Thu, Nov 17, 2005 at 09:54:38AM -0500, Michael D. Sofka wrote:
> Our current db machine has two hyper-threaded 2.4 GHz Xeon processors, 4
> gig of main memory, and is attached to a JBOD configured with RAID 5 for
> the database, and mirrored disks for the DB logs.

--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
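The RAID5 write overhead mentioned above is often summarised with a rule-of-thumb "write penalty": a small random write costs roughly four disk operations on RAID 5 (read data, read parity, write data, write parity) and two on a mirror. The sketch below applies that rule; the disk count and per-disk IOPS are hypothetical, and a battery-backed write cache or full-stripe writes can change the picture considerably.

```python
# Rule-of-thumb disk operations per small random write, by RAID level.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4}


def effective_write_iops(disks: int, iops_per_disk: float, level: str) -> float:
    """Approximate random-write IOPS an array can sustain."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Hypothetical array: 6 disks at 150 IOPS each.
for level in ("raid5", "raid10"):
    print(level, effective_write_iops(6, 150, level))
```

With the same six disks, this estimate gives RAID 1+0 about twice the random-write throughput of RAID 5, at the price of less usable capacity.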
I'd advise staying far away from RAID5 -- the link below is one frequently pointed to in Informix discussions, but I think the points apply to any RDBMS. If you value your data (and sanity) stay with a more reliable setup -- performance is not the only problem with RAID5.

<http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt>

In general, lots of spindles and lots of RAM seem to reduce the pain of large data sets (our larger servers have about 15-18 gigs), but we're not doing a lot of OLTP, so our setup might not be similar to what you need.

Greg Williamson
DBA
GlobeXplorer LLC

-----Original Message-----
From: pgsql-admin-owner@postgresql.org on behalf of Jim C. Nasby
Sent: Sun 11/20/2005 9:53 AM
To: Michael D. Sofka
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Server Hardware Configuration

RAID5 generally doesn't make for a fast database. The problem is that there is a huge amount of overhead every time you go to write something out to a RAID5 array. With careful tuning of the background writer you might be able to avoid some of that penalty, though your read performance will likely still be affected by the write overhead.
On Sun, 2005-11-20 at 11:53 -0600, Jim C. Nasby wrote:
> Two general comments: most people find that Opterons perform much better
> than Xeons. With some versions of PostgreSQL, the difference is over
> 50%.

Could you be more specific on that? Which versions of Postgres perform better on Opteron than on Xeon?

> RAID5 generally doesn't make for a fast database. The problem is that
> there is a huge amount of overhead every time you go to write something
> out to a RAID5 array. With careful tuning of the background writer you
> might be able to avoid some of that penalty, though your read
> performance will likely still be affected by the write overhead.

RAID5 was not meant to improve performance, but to minimize disaster and downtime when your hard disk dies. We're using RAID5 with postgres. In the last 3 years we changed 5 disks, but the system downtime was zero minutes.

Mike

--
Mario Splivalo
Mob-Art
mario.splivalo@mobart.hr

"I can do it quick, I can do it cheap, I can do it well. Pick any two."
On Monday, 21 November 2005 10:34, Mario Splivalo wrote:
> On Sun, 2005-11-20 at 11:53 -0600, Jim C. Nasby wrote:
> > Two general comments: most people find that Opterons perform much better
> > than Xeons. With some versions of PostgreSQL, the difference is over
> > 50%.
>
> Could you be more specific on that? Which versions of Postgres perform
> better on Opteron than on Xeon?

Try http://85.128.68.44 - I ran some tests comparing Xeon and Opteron.

> > RAID5 generally doesn't make for a fast database. The problem is that
> > there is a huge amount of overhead every time you go to write something
> > out to a RAID5 array. With careful tuning of the background writer you
> > might be able to avoid some of that penalty, though your read
> > performance will likely still be affected by the write overhead.
>
> RAID5 was not meant to improve performance, but to minimize disaster and
> downtime when your hard disk dies. We're using RAID5 with postgres. In
> the last 3 years we changed 5 disks, but the system downtime was zero
> minutes.

I'm not ready to publish tests of different RAID levels for PostgreSQL yet, but I will be soon. However, almost all the people I know prefer RAID10 for a database like PGSQL.

Marcin

> Mike
On Mon, 2005-11-21 at 10:58 +0100, Marcin Giedz wrote:
> On Monday, 21 November 2005 10:34, Mario Splivalo wrote:
> > On Sun, 2005-11-20 at 11:53 -0600, Jim C. Nasby wrote:
> > > Two general comments: most people find that Opterons perform much better
> > > than Xeons. With some versions of PostgreSQL, the difference is over
> > > 50%.
> >
> > Could you be more specific on that? Which versions of Postgres perform
> > better on Opteron than on Xeon?
>
> Try http://85.128.68.44 - I ran some tests comparing Xeon and Opteron.

Cool! :) Could you give specs on the processors themselves? The clock, cache, and stuff?

> > RAID5 was not meant to improve performance, but to minimize disaster and
> > downtime when your hard disk dies. We're using RAID5 with postgres. In
> > the last 3 years we changed 5 disks, but the system downtime was zero
> > minutes.
>
> I'm not ready to publish tests of different RAID levels for PostgreSQL yet,
> but I will be soon. However, almost all the people I know prefer RAID10 for
> a database like PGSQL.

What do you mean when you say RAID10? RAID 1+0, or striped RAID 5?

Mike

--
Mario Splivalo
Mob-Art
mario.splivalo@mobart.hr

"I can do it quick, I can do it cheap, I can do it well. Pick any two."
On Monday, 21 November 2005 11:23, Mario Splivalo wrote:
> On Mon, 2005-11-21 at 10:58 +0100, Marcin Giedz wrote:
> > Try http://85.128.68.44 - I ran some tests comparing Xeon and Opteron.
>
> Cool! :) Could you give specs on the processors themselves? The clock, cache,
> and stuff?

Look at the site once again - I have added some more information.

> > > RAID5 was not meant to improve performance, but to minimize disaster and
> > > downtime when your hard disk dies. We're using RAID5 with postgres. In
> > > the last 3 years we changed 5 disks, but the system downtime was zero
> > > minutes.
> >
> > I'm not ready to publish tests of different RAID levels for PostgreSQL
> > yet, but I will be soon. However, almost all the people I know prefer
> > RAID10 for a database like PGSQL.
>
> What do you mean when you say RAID10? RAID 1+0, or striped RAID 5?

RAID 1+0.

> Mike
Marcin Giedz wrote:
> On Monday, 21 November 2005 11:23, Mario Splivalo wrote:
> > Cool! :) Could you give specs on processors itself? The clock, cache,
> > and stuff?
>
> Look at this site once again - made some additional info.

It says you have "2 x Opteron Dual Core" -- so is it two dual core CPUs, or one dual core?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Monday, 21 November 2005 16:31, Alvaro Herrera wrote:
> Marcin Giedz wrote:
> > On Monday, 21 November 2005 11:23, Mario Splivalo wrote:
> > > Cool! :) Could you give specs on processors itself? The clock, cache,
> > > and stuff?
> >
> > Look at this site once again - made some additional info.
>
> It says you have "2 x Opteron Dual Core" -- so is it two dual core CPUs,

Exactly!!!

Marcin

> or one dual core?
--On Saturday, November 19, 2005 12:15:46 AM +0000 John Jensen <jrj@ft.fo> wrote:

> What you describe is a real-time system. Does your
> requirements call for real-time performance ?
> Remember that performance from memory is much more
> expensive than disk i/o (at least up to a certain point).

No, we are not RT. This is a good point. Stability, reliability, and reasonable response time are what we require. It stores spam. We don't want to go to heroic efforts to save the spam in the event of failure. (We've been there, and don't want to do it again. :-)

> This is not exactly an optimum case for caching. I would
> suggest thinking really hard before going for an all memory
> solution. From what you write I would suggest a firm focus
> on disc i/o.
...
> I suggest looking at your current bottleneck first.
> It's likely to be the most cost-efficient route out.

I/O is our bottleneck. The machine is not CPU loaded. And, in fact, our current performance is good. The machine upgrade is planned with a service upgrade. Current hardware is old, and so getting more expensive to support. We also anticipate service growth (read, more spam), and so are planning accordingly.

> Good luck in your quest for "bang per buck" ;-)

Thank You,

--On Sunday, November 20, 2005 11:53:37 AM -0600 "Jim C. Nasby" <jnasby@pervasive.com> wrote:

> Two general comments: most people find that Opterons perform much better
> than Xeons. With some versions of PostgreSQL, the difference is over
> 50%.

Interesting. Alas, Opteron is not a choice. :-(

> RAID5 generally doesn't make for a fast database. The problem is that
> there is a huge amount of overhead everytime you go to write something
> out to a RAID5 array. With careful tuning of the background writer you
> might be able to avoid some of that penalty, though your read
> performance will likely still be affected by the write overhead.

We've had experience with RAID 5, and RAID 1+0 on various servers. We always use RAID with battery backed RAM, which greatly improves RAID 5 performance. RAID 1+0 is always faster, but cost is always an issue.

> BTW, -performance is a better list for info about this. If you look in
> the archives you'll be able to read a lot of threads from people seeking
> hardware advice.

Thank you, I'll look there.

Mike

--
Michael D. Sofka sofkam@rpi.edu
C&CT Sr. Systems Programmer Email, TeX, epistemology.
Rensselaer Polytechnic Institute, Troy, NY. http://www.rpi.edu/~sofkam/
Marcin Giedz wrote:
> On Monday, 21 November 2005 16:31, Alvaro Herrera wrote:
> > Marcin Giedz wrote:
> > > On Monday, 21 November 2005 11:23, Mario Splivalo wrote:
> > > > Cool! :) Could you give specs on processors itself? The clock, cache,
> > > > and stuff?
> > >
> > > Look at this site once again - made some additional info.
> >
> > It says you have "2 x Opteron Dual Core" -- so is it two dual core CPUs,
> Exactly!!!

So the comparison is not really very fair. You are comparing the performance of 4 Opterons versus 2 Xeons.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Nov 21, 2005 at 10:34:36AM +0100, Mario Splivalo wrote:
> > RAID5 generally doesn't make for a fast database. The problem is that
> > there is a huge amount of overhead every time you go to write something
> > out to a RAID5 array. With careful tuning of the background writer you
> > might be able to avoid some of that penalty, though your read
> > performance will likely still be affected by the write overhead.
>
> RAID5 was not meant to improve performance, but to minimize disaster and
> downtime when your hard disk dies. We're using RAID5 with postgres. In
> the last 3 years we changed 5 disks, but the system downtime was zero
> minutes.

And the same would have been true with RAID10. In fact, RAID10 is more reliable than RAID5; depending on which drives fail, it's possible to lose up to half of a RAID10 array without any data loss. If you ever lose more than one drive at once with RAID5, your data is gone.

--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
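The reliability comparison can be checked by brute force. The sketch below enumerates every combination of simultaneous disk failures for a hypothetical six-disk array and counts how many leave the data intact, assuming single-parity RAID 5 (which tolerates one failed disk) and RAID 1+0 built from three mirror pairs. The array size and pair layout are assumptions chosen for illustration.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """RAID 1+0 keeps its data unless both disks of some mirror pair fail."""
    return not any(a in failed and b in failed for a, b in mirror_pairs)


def raid5_survives(failed):
    """A single-parity RAID 5 array tolerates at most one failed disk."""
    return len(failed) <= 1


def survival_fraction(n_disks, n_failed, survives):
    """Share of all n_failed-disk failure combinations that lose no data."""
    combos = [set(c) for c in combinations(range(n_disks), n_failed)]
    return sum(1 for failed in combos if survives(failed)) / len(combos)


# Hypothetical 6-disk array: disks (0,1), (2,3), (4,5) are the mirror pairs.
PAIRS = [(0, 1), (2, 3), (4, 5)]

for k in (1, 2, 3):
    r10 = survival_fraction(6, k, lambda f: raid10_survives(f, PAIRS))
    r5 = survival_fraction(6, k, raid5_survives)
    print(f"{k} failed disk(s): RAID 1+0 survives {r10:.0%}, RAID 5 survives {r5:.0%}")
```

This treats all failure combinations as equally likely and ignores rebuild windows, so it illustrates the "up to half the array" point rather than giving a real reliability estimate.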
On Mon, Nov 21, 2005 at 09:41:36AM -0500, Michael D. Sofka wrote:
> >I suggest looking at your current bottleneck first.
> >It's likely to be the most cost-efficient route out.
>
> I/O is our bottleneck. The machine is not CPU loaded. And, in fact,
> our current performance is good. The machine upgrade is planned with a
> service upgrade. Current hardware is old, and so getting more expensive
> to support. We also anticipate service growth (read, more spam), and
> so are planning accordingly.

Which, as I mentioned, is why RAID5 is not a good solution if you're doing any writes at all.

You're talking about a 16G database that you expect to grow to 64G. That would fit happily in a RAID1 (mirror) of two SCSI 72G drives. I haven't priced that kind of stuff out recently, but I believe you're looking at $300-$500. If that doesn't provide enough performance, go to a RAID10 and add more drives. If you're doing much writing at all, spring for a battery-backed controller so you can enable write caching.

--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
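The sizing argument above is simple arithmetic on usable capacity. A minimal sketch follows, using nominal drive sizes and ignoring filesystem overhead, WAL, and the free space that vacuum and growth need; the function name is hypothetical.

```python
def usable_gb(level: str, disks: int, disk_gb: float) -> float:
    """Nominal usable capacity of an array of identical disks."""
    if level == "raid1":
        if disks != 2:
            raise ValueError("this sketch models RAID 1 as a two-disk mirror")
        return disk_gb
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
        return disks / 2 * disk_gb
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    raise ValueError(f"unknown RAID level: {level}")


projected_db_gb = 64  # 16G today, expected to grow 4x
for level, disks in (("raid1", 2), ("raid10", 4), ("raid5", 3)):
    cap = usable_gb(level, disks, 72)
    print(f"{level} x{disks} 72G: {cap:.0f}G usable, fits 64G db: {cap >= projected_db_gb}")
```

A two-disk mirror of 72G drives does hold the projected 64G, though with little headroom; moving to four disks in RAID 1+0 doubles the capacity and adds spindles.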
Way overkill... all you really need is a decently sized hard disk (160G, or a pair of 80s) and a P-III / AMD Athlon 750 MHz with 512MB RAM, FreeBSD 5 and PostgreSQL.... With databases... it's all about disk performance anyway...

""Michael D. Sofka"" <sofkam@rpi.edu> wrote in message news:E8FEC853EDA289B924D8F213@betelgeuse.cct.rpi.edu...
> The total size of the db is about 16gig in size. And, we expect it
> could grow to 4 times this as more users are opted into spam scanning.
On Monday 21 November 2005 12:12, you wrote:
> On Mon, Nov 21, 2005 at 09:41:36AM -0500, Michael D. Sofka wrote:

> You're talking about a 16G database that you expect to grow to 64G. That
> would fit happily in a RAID1 (mirror) of two SCSI 72G drives. I haven't
> priced that kind of stuff out recently, but I believe you're looking at
> $300-$500. If that doesn't provide enough performance, go to a RAID10
> and add more drives. If you're doing much writing at all, spring for a
> battery-backed controller so you can enable write caching.

More like $7k, once you add in the JBOD, the PERC card and 15,000 rpm SCSI disks, with hot spares (the journal mirror goes on the same JBOD). But you're right, we will have sufficient disks on the JBOD for RAID 1+0.

Mike

--
Michael D. Sofka sofkam@rpi.edu
C&CT Sr. Systems Programmer Email, TeX, epistemology.
Rensselaer Polytechnic Institute, Troy, NY. http://www.rpi.edu/~sofkam/