Thread: the RAID question, again
Hi,

I want to ask the 'which RAID setup is best for PostgreSQL?' question again. I've read a large portion of the archives of this list, but generally the answer is 'depends on your needs', with a few different camps.

My needs are as follows: a dedicated PostgreSQL server for a website, which does many more select queries than inserts/updates (although, due to a lot of caching outside the database, we will be doing more updates than is usual for a website).

The machine that will be built for this is going to be something like a dual Xeon 2.4GHz, 4GB RAM, and a SCSI hardware RAID controller with some cache RAM and 6-7 36GB 15K rpm disks. We have good experiences with ICP Vortex controllers, so I'll probably end up buying one of those again (the GDT8514RZ looks nice: http://www.icp-vortex.com/english/product/pci/rzu320/8514rz_e.htm ).

We normally use Debian Linux with a 2.4 kernel, but we're thinking we might play around with FreeBSD and see how that runs before making the final choice.

The RAID setup I have in my head is as follows:

4 disks for a RAID 10 array, for the PG data area
2 disks for a RAID 1 array, for the OS, swap (it won't swap) and, most importantly, the WAL files
1 disk for a hot spare

RAID 1 isn't ideal for a WAL disk because of the (small) write penalty, but I'm not sure I want to risk losing the WAL files. As far as I know PG doesn't really like losing them :) This array shouldn't see much I/O outside of the WAL files, since the OS and PG itself should be completely in RAM once it's started up.

RAID 5 is more cost-effective for the data storage, but its write performance is much lower than RAID 10's.

The hot spare is non-negotiable; it has saved my life a number of times ;)

Performance and reliability are the prime concerns for this setup. We normally run our boxes at extremely high loads because we don't have the budget we need.
Cost is an issue, but since our website is always growing at an insane pace I'd rather drop some cash on a fast server now and hope to hold out till the end of this year than have to rush out and buy another mediocre server in a few months.

Am I on the right track, or does anyone have any tips I could use?

On a side note: this box will be bought a few days or weeks from now and tested for a week or so before we put it in our production environment (if everything goes well). If anyone is interested in any benchmark results from it (possibly even FreeBSD vs Linux :)) that can probably be arranged.

Vincent van Leeuwen
Media Design - http://www.mediadesign.nl/
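The RAID 5 vs RAID 10 write-performance gap mentioned above comes down to the classic write-penalty arithmetic: each logical random write costs 2 back-end I/Os on RAID 10 (both mirror copies) but 4 on RAID 5 (read old data and parity, write new data and parity). A minimal sketch; the per-disk IOPS figure is an illustrative assumption for a 15K rpm spindle, not a measured number for these drives:

```python
# Rough write-penalty arithmetic behind the RAID 10 vs RAID 5 comparison.
# DISK_IOPS is an assumed, illustrative figure for one 15K rpm spindle.

DISK_IOPS = 175          # assumed random IOPS per spindle
WRITE_PENALTY = {        # back-end I/Os generated per logical random write
    "raid10": 2,         # write both mirror copies
    "raid5": 4,          # read old data + old parity, write new data + new parity
}

def effective_write_iops(level: str, disks: int, per_disk: int = DISK_IOPS) -> int:
    """Rough ceiling on random-write IOPS for an array of `disks` spindles."""
    return disks * per_disk // WRITE_PENALTY[level]

print(effective_write_iops("raid10", 4))  # 350
print(effective_write_iops("raid5", 4))   # 175
```

Under these assumptions a 4-disk RAID 10 sustains roughly twice the random-write rate of a 4-disk RAID 5, which is the trade-off being weighed against RAID 5's better usable capacity.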
Vincent,

In my eyes the best disk I/O configuration is a balance of performance, price and administrative effort. Your setup looks relatively good. However, price seems not to be your greatest concern; otherwise you would favor RAID 5 and/or leave out the spare disk.

One improvement may be to put all 6 disks into a single RAID 10 group. That way you have more I/O bandwidth.

One watchout is that the main memory of your machine may be better than that of your RAID controller. The RAID controller has an integrated 128MB of PC133 ECC SDRAM; you did not state what kind of memory your server has.

Regards,
Nikolaus
Vincent,

> One watchout is that the main memory of your machine may be better than
> that of your RAID controller. The RAID controller has integrated 128MB
> PC133 ECC SDRAM. You did not state what kind of memory your server has.

Nikolaus has a good point. With a high-end Linux server and a medium-end RAID card, it's sometimes faster to use Linux software RAID than hardware RAID. Not all the time, though.

--
Josh Berkus
Aglio Database Solutions
San Francisco
On 2003-04-16 19:32:54 -0700, Nikolaus Dilger wrote:
> One improvement area may be to put all 6 disks into a RAID 10 group.
> That way you have more I/O bandwidth.

A concern I have about that setup is that a large WAL write would have to wait for 6 spindles to write the data before returning, instead of 2 spindles. But as you say, it does create way more I/O bandwidth. I think I'll just test that when the box is here instead of speculating further :)

> One watchout is that the main memory of your machine may be better than
> that of your RAID controller. The RAID controller has integrated 128MB
> PC133 ECC SDRAM. You did not state what kind of memory your server has.

On 2003-04-16 20:20:50 -0700, Josh Berkus wrote:
> Nikolaus has a good point. With a high-end Linux server and a medium-end
> RAID card, it's sometimes faster to use Linux software RAID than hardware
> RAID. Not all the time, though.

I've heard rumors that software RAID performs poorly when stacking RAID layers (RAID 0 on RAID 1). Not sure if that's still true, though. My own experiences with Linux software RAID (RAID 5 on a low-cost fileserver for personal use) are very good, especially in the reliability department: I've recovered from two-disk failures due to controllers hanging up with only a few percent data loss. I've never been overly concerned with performance on that setup, though, so I haven't really tested that.

But if this controller is medium-end, could anyone recommend a high-end RAID card that has excellent Linux support? One of the things I especially like about ICP Vortex products is the official Linux support and the excellent software utility for monitoring and (re)configuring the RAID arrays. Comes in handy when replacing hot spares and rebuilding failed arrays while keeping the box running :)

Vincent van Leeuwen
Media Design - http://www.mediadesign.nl/
Vincent,

> But if this controller is medium-end, could anyone recommend a high-end
> RAID card that has excellent Linux support? One of the things I especially
> like about ICP Vortex products is the official Linux support and the
> excellent software utility for monitoring and (re)configuring the RAID
> arrays.

No, just negative advice. Mylex support is dead until someone steps into the shoes of the late developer of that driver. Adaptec is only paying their Linux guy to do Red Hat support for their new RAID cards, so you're SOL with other distributions.

--
Josh Berkus
Aglio Database Solutions
San Francisco
On Tue, 22 Apr 2003, Vincent van Leeuwen wrote:
> On 2003-04-16 19:32:54 -0700, Nikolaus Dilger wrote:
> > One improvement area may be to put all 6 disks into a RAID 10 group.
> > That way you have more I/O bandwidth.
>
> A concern I have about that setup is that a large WAL write will have to
> wait for 6 spindles to write the data before returning instead of 2
> spindles. But as you say it does create way more I/O bandwidth. I think
> I'll just test that when the box is here instead of speculating further :)

Not in a RAID 10. Assuming the setup is:

RAID0-0: disk0, disk1, disk2
RAID0-1: disk3, disk4, disk5
RAID1-0: RAID0-0, RAID0-1

then a write would only have to wait on two disks. Assuming the physical setup is one SCSI channel for RAID0-0 and one for RAID0-1, both drives can write at the same time and your write performance is virtually identical to a single drive's.

> On 2003-04-16 20:20:50 -0700, Josh Berkus wrote:
> > Nikolaus has a good point. With a high-end Linux server and a medium-end
> > RAID card, it's sometimes faster to use Linux software RAID than
> > hardware RAID. Not all the time, though.
>
> I've heard rumors that software RAID performs poorly when stacking RAID
> layers (RAID 0 on RAID 1). Not sure if that's still true though.

I tested it, and I was probably the one spreading the rumors. I was testing on Linux kernel 2.4.9 at the time, on a dual PPro-200 with 256MB RAM and 6 Ultra Wide 4GB SCSI drives at 10K rpm. I've also tested other setups. My experience was that RAID 5 and RAID 1 were no faster on top of RAID 0 than on bare drives. Note that I didn't test for massively parallel performance, which would probably be better with the extra platters; I was testing something like 4 to 10 simultaneous connections with pgbench and my own queries, some large, some small.
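The point that a small write waits on only one disk per mirror side, not all six, can be sketched with the 3+3 layout described above (two 3-disk RAID 0 stripe sets mirrored by RAID 1). The chunk size and disk numbering here are illustrative assumptions, not the actual parameters of any particular controller:

```python
# Which member disks a single small write touches in the 3+3 RAID 0+1
# layout sketched above. CHUNK_SECTORS is an assumed stripe chunk size.

CHUNK_SECTORS = 128       # assumed chunk size, in sectors
DISKS_PER_STRIPE = 3      # disk0-2 form RAID0-0, disk3-5 form RAID0-1

def disks_for_write(logical_sector: int) -> tuple:
    """Return the two member disks (one per mirror side) that a
    single-chunk write at logical_sector must wait for."""
    stripe_col = (logical_sector // CHUNK_SECTORS) % DISKS_PER_STRIPE
    return (stripe_col, stripe_col + DISKS_PER_STRIPE)

print(disks_for_write(0))    # (0, 3)
print(disks_for_write(130))  # (1, 4)
print(disks_for_write(300))  # (2, 5)
```

Each write lands on exactly two spindles, so commit latency looks like a single mirrored pair; the extra spindles buy parallelism across concurrent writes rather than slower individual ones.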
> My own experiences with Linux software RAID (RAID 5 on a low-cost
> fileserver for personal use) are very good (especially in the reliability
> department, I've recovered from two-disk failures due to controllers
> hanging up with only a few percent data loss), although I've never been
> overly concerned with performance on that setup so haven't really tested
> that.

My experience with Linux RAID is similar to yours. It's always been rock-solid reliable, and it actually seems more intuitive to me now than any of the hardware RAID cards I've played with. Plus, you can FORCE it to do what you want, whereas many cards refuse to.

For really fast RAID, look at external RAID enclosures that take x drives and make them look like one great big drive. Good speed and easy to manage, and to Linux it's just a big drive, so you don't need any special drivers for it.
We use LSI MegaRAID cards for all of our servers. Their older cards are a bit dated now, but the new Elite 1650 is a pretty nice card. The Adaptec cards are pretty hot, but as Josh has pointed out their reference driver is for Red Hat. Granted, that doesn't bother us here at OFS because that's all we use on our machines, but to each their own.

Sincerely,

Will LaShell

On Tue, 2003-04-22 at 10:18, Josh Berkus wrote:
> No, just negative advice. Mylex support is dead until someone steps into
> the shoes of the late developer of that driver. Adaptec is only paying
> their Linux guy to do Red Hat support for their new RAID cards, so you're
> SOL with other distributions.