Thread: Asking for assistance in determining storage requirements
Your assistance is appreciated.
I have a question regarding disk storage for Postgres servers.
We are thinking long term about scalable storage and performance, and would like some advice
or feedback about what other people are using.
We would like to get as much performance from our file systems as possible.
We use IBM x3650 quad-processor servers with an onboard SAS controller (3 Gb/s) and 15,000 rpm drives.
We use RAID 1 for the CentOS operating system and the WAL archive logs.
The Postgres database is on 5 drives configured as RAID 5 with a global hot spare.
We are curious about using a SAN with Fibre Channel HBAs, and whether anyone else uses this technology.
We would also like to know whether people have a preference for a particular RAID level, with or without striping.
Sincerely,
Chris Barnes
Recognia Inc.
Senior DBA
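[Editor's note: as a rough illustration of the "as much performance from our file systems as possible" question, here is a minimal sequential-write throughput sketch. It is not from the thread; the file name, sizes, and block size are arbitrary, and a real benchmark would use a file much larger than the OS page cache.]

```python
import os
import tempfile
import time

def seq_write_throughput(path: str, total_mb: int = 64, block_kb: int = 1024) -> float:
    """Write total_mb of zeros in block_kb chunks, fsync, and return MB/s."""
    block = b"\0" * (block_kb * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(total_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force the data to stable storage before stopping the clock
    elapsed = time.perf_counter() - start
    os.remove(path)
    return total_mb / elapsed

if __name__ == "__main__":
    tmp = os.path.join(tempfile.gettempdir(), "pg_seq_bench.tmp")
    print(f"sequential write: {seq_write_throughput(tmp):.1f} MB/s")
```

Run it on the same file system that holds the database, not on a tmpfs, or the number is meaningless.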
On Thu, Jul 9, 2009 at 11:15 AM, Chris Barnes <compuguruchrisbarnes@hotmail.com> wrote:
> We are curious about using SAN with fiber channel hba and if anyone else
> uses this technology.
>
> We would also like to know if people have preference to the level of raid
> with/out striping.

I use SurfRAID Triton external RAID units connected to Sun X4100 boxes via LSI Fibre Channel cards. I run them as RAID 6 plus a hot spare with a total of 16 drives. This is extremely fast and tolerates up to two disk failures. The key is to have 1 or 2 GB of cache on the RAID units. I also crank the RAM on the servers up to at least 20 GB.
No other takers on this one?

I'm wondering what exactly "direct attached storage" entails. At PGCon I heard a lot about using only direct-attached storage, and not a SAN. Are there numbers to back this up? Does Fibre Channel count as direct-attached storage? I'm thinking it would. What exactly is recommended against? Any storage that is TCP/IP based?

On Thu, Jul 9, 2009 at 11:15 AM, Chris Barnes <compuguruchrisbarnes@hotmail.com> wrote:
> We are curious about using SAN with fiber channel hba and if anyone else
> uses this technology.
>
> We would also like to know if people have preference to the level of raid
> with/out striping.

--
"Don't eat anything you've ever seen advertised on TV"
   - Michael Pollan, author of "In Defense of Food"
On Thu, 2009-07-09 at 11:15 -0400, Chris Barnes wrote:
> We would like to get as much performance from our file systems
> as possible.

Then avoid RAID 5. RAID 10 is a pretty good option for most loads.

Actually, RAID 5 is quite decent for read-mostly, large-volume storage where you really need to be disk-space efficient. However, if you spread the RAID 5 out over enough disks for it to start getting fast reads, you face a high risk of disk failure during a RAID rebuild. For that reason, consider using RAID 6 instead, over a large set of disks, so you're better protected against disk failures during rebuild.

If you're doing much INSERTing / UPDATEing, then RAID 5/6 are not for you. RAID 10 is pretty much the default choice for write-heavy loads.

> The postgres database is on 5 drives configured as raid 5 with
> a global hot spare.
>
> We are curious about using SAN with fiber channel hba and if
> anyone else uses this technology.

There are certainly people on the list using PostgreSQL on an FC SAN. It comes up in passing quite a bit.

It's really, REALLY important to make sure your SAN honours fsync(), though, at least to the point of making sure the SAN hardware has the data in battery-backed cache before returning from the fsync() call. Otherwise you risk serious data loss. I'd be unpleasantly surprised if any SAN shipped with a SAN or FC HBA configuration that disregarded fsync(), but it _would_ make benchmark numbers look better, so it's not safe to assume without testing.

From general impressions gathered from the list (I don't use such large-scale gear myself and can't speak personally), it does seem like most systems built for serious performance use direct-attached SAS arrays. People also seem to separate out read-mostly/archival tables, update-heavy tables, the WAL, temp table space, and disk sort space into different RAID sets.

--
Craig Ringer
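[Editor's note: the fsync() point above can be spot-checked by timing repeated write+fsync cycles. A 15,000 rpm disk without write cache can honestly complete only on the order of 250 fsyncs per second (roughly one per revolution), so rates far above that mean a cache, hopefully battery-backed, is absorbing the flush. Recent PostgreSQL releases ship a pg_test_fsync utility for exactly this; the hand-rolled sketch below is an illustration, not from the thread, and the probe file path is arbitrary.]

```python
import os
import tempfile
import time

def fsync_rate(path: str, iterations: int = 200) -> float:
    """Time repeated small write+fsync cycles and return fsyncs per second."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, b"x" * 512)
            os.fsync(fd)  # should not return until the data is on stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return iterations / elapsed

if __name__ == "__main__":
    probe = os.path.join(tempfile.gettempdir(), "fsync_probe.tmp")
    # Thousands of fsyncs/s from a single spinning disk implies a lying cache.
    print(f"{fsync_rate(probe):.0f} fsyncs/s")
```

Run it against the volume that will hold pg_xlog; a result that looks "too good" for the hardware is the warning sign.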
On Thu, Jul 9, 2009 at 9:15 AM, Chris Barnes <compuguruchrisbarnes@hotmail.com> wrote:
> We use raid 1 for the centos operating system and the wal archive logs.
>
> The postgres database is on 5 drives configured as raid 5 with a global hot
> spare.

OK, two things jump out at me: you aren't using a hardware RAID controller with battery-backed cache, and you're using RAID-5.

For most non-DB applications, RAID-5 and no battery-backed cache is just fine. For some DB applications, like a reporting DB or batch processing, it's OK too. For DB applications that handle lots of small transactions, it's a really bad choice. Looking through the pgsql-performance archives, you'll see RAID-10 and HW RAID with battery-backed cache mentioned over and over again, and for good reasons. RAID-10 is much more resilient, and a good HW RAID controller with battery-backed cache can re-order writes into groups that are near each other on the same drive pair to make overall throughput higher, as well as making burst throughput higher by "fsyncing" immediately when you issue a write.

I'm assuming you have 8 hard drives to play with. If that's the case, you can have a RAID-1 for the OS etc. and a RAID-10 with 4 disks and two hot spares, OR a RAID-10 with 6 disks and no hot spares. As long as you pay close attention to your server and catch failed drives and replace them by hand, that might work, but it really sits wrong with me.

> We are curious about using SAN with fiber channel hba and if anyone else
> uses this technology.
Yep, again, check the pgsql-performance archives. Note that the level of complexity is much higher, as is the cost, and if you're talking about a dozen or two dozen drives, you're often much better off just having a good direct-attached set of disks, either with an embedded RAID controller, or JBOD using an internal RAID controller to handle them. The top-of-the-line RAID controllers that can handle 24 or so disks run $1200 to $1500. Taking the cost of the drives out of the equation, I'm pretty sure any FC/SAN setup is gonna cost a LOT more than that single RAID card. I can buy a 16-drive 32TB DAS box for about $6k to $7k or so, plug it into a simple but fast SCSI controller ($400 tops), and be up in a few minutes. Setting up a new SAN is never that fast, easy, or cheap.

OTOH, if you've got a dozen servers that need lots and lots of storage, a SAN will start making more sense, since it makes managing lots of hard drives easier.

> We would also like to know if people have preference to the level of raid
> with/out striping.

RAID-10, then RAID-10 again, then RAID-1. RAID-6 for really big reporting DBs where storage is more important than performance and the data is mostly read anyway. RAID-5 is to be avoided, period. If you have 6 disks in a RAID-6 with no spare, you're better off than a RAID-5 with 5 disks and a spare, as in RAID-6 the "spare" is kind of already built in.
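[Editor's note: the RAID-5-plus-spare vs RAID-6 argument above reduces to simple arithmetic over disk counts. The sketch below illustrates it; the 146 GB disk size is an arbitrary assumption, and "tolerated failures" means the guaranteed worst case (RAID-10 can survive more than one failure if the failures land in different mirror pairs, but one is all it guarantees).]

```python
def raid_usable(level: str, disks: int, spare: int = 0, disk_gb: int = 146) -> tuple:
    """Return (usable GB, guaranteed tolerated disk failures) for common RAID levels."""
    data = disks - spare  # hot spares hold no data
    if level == "raid5":
        return ((data - 1) * disk_gb, 1)  # one disk's worth of parity
    if level == "raid6":
        return ((data - 2) * disk_gb, 2)  # two disks' worth of parity
    if level == "raid10":
        return (data // 2 * disk_gb, 1)   # mirrored pairs, striped
    raise ValueError(f"unsupported level: {level}")

# Six disks, as in the last paragraph above:
print(raid_usable("raid5", 6, spare=1))  # 5-disk RAID-5 plus a spare
print(raid_usable("raid6", 6))           # 6-disk RAID-6, no spare
print(raid_usable("raid10", 6))          # 6-disk RAID-10, no spare
```

Both layouts in the first two lines yield the same usable capacity, but the RAID-6 tolerates two concurrent failures where the RAID-5 tolerates one while its spare rebuilds, which is exactly the point being made.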