Thread: Hardware/OS recommendations for large databases (5TB)
Does anyone have recommendations for hardware and/or OS to work with around 5TB datasets?
The data is for analysis, so there is virtually no inserting besides a big bulk load. Analysis involves full-database aggregations - mostly basic arithmetic and grouping. In addition, much smaller subsets of data would be pulled and stored to separate databases.
I have been working with datasets no bigger than around 30GB, and that (I'm afraid to admit) has been in MSSQL.
Thanks,
Adam
> Does anyone have recommendations for hardware and/or OS to work with around
> 5TB datasets?

Hardware-wise I'd say dual-core Opterons. One dual-core Opteron performs better than two single-core ones at the same speed. Tyan makes some boards that have four sockets, thereby giving you 8 CPUs (if you need that many). Sun and HP also make nice hardware, although the Tyan board is more competitively priced.

OS-wise I would choose the FreeBSD amd64 port, but partitions larger than 2 TB need some special care (using gpt rather than disklabel, etc.), and tools like fsck may not be able to completely check partitions larger than 2 TB. Linux or Solaris with either LVM or Veritas FS sound like candidates.

> I have been working with datasets no bigger than around 30GB, and that (I'm
> afraid to admit) has been in MSSQL.

Well, our data are just below 30 GB so I can't help you there :-)

regards
Claus
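As a rough illustration of the 2 TB caveat above: the sketch below assumes the limit comes from a 32-bit sector count combined with 512-byte sectors, which is how traditional disklabel/MBR-style partition tables are usually described, and that GPT uses 64-bit block addresses. The function names are made up for this example and are not part of any FreeBSD tool.

```python
# Back-of-the-envelope check of the 2 TB partition limit.
# Assumptions (not from the thread): 32-bit sector fields and
# 512-byte sectors for the old label format, 64-bit for GPT.

TIB = 2 ** 40  # bytes in one tebibyte


def max_addressable_bytes(address_bits=32, sector_bytes=512):
    """Largest size a label with `address_bits`-wide sector fields can describe."""
    return (2 ** address_bits) * sector_bytes


def fits_in_label(size_bytes, address_bits=32, sector_bytes=512):
    """True if a partition of `size_bytes` can be described by such a label."""
    return size_bytes <= max_addressable_bytes(address_bits, sector_bytes)


if __name__ == "__main__":
    dataset = 5 * 10 ** 12  # the 5TB dataset from the original question
    print("32-bit label limit (TiB):", max_addressable_bytes() / TIB)
    print("5TB fits in 32-bit label:", fits_in_label(dataset))
    print("5TB fits in 64-bit label:", fits_in_label(dataset, address_bits=64))
```

Under these assumptions a single 5TB partition does not fit in the older label format, which is why gpt comes up at all.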
> Hardware-wise I'd say dual-core Opterons. One dual-core Opteron
> performs better than two single-core ones at the same speed. Tyan makes
> some boards that have four sockets, thereby giving you 8 CPUs (if you
> need that many). Sun and HP also make nice hardware, although the Tyan
> board is more competitively priced.

Just FYI: Tyan makes an 8-socket motherboard (up to 16 cores!):
http://www.swt.com/vx50.html

It can be loaded with up to 128 GB of memory if all the sockets are filled :).

Merlin
On Nov 15, 2005, at 3:28 AM, Claus Guttesen wrote:
> Hardware-wise I'd say dual-core Opterons. One dual-core Opteron
> performs better than two single-core ones at the same speed. Tyan makes

At 5TB of data, I'd vote that the application is disk I/O bound, and the difference in CPU speed at the level of dual Opteron vs. dual-core Opteron is not going to be noticed.

To maximize disk performance, try getting a dedicated high-end disk system like nStor or NetApp file servers hooked up via Fibre Channel, then use a good high-end Fibre Channel controller like one from LSI.

And go with the FreeBSD amd64 port. It is *way* fast, especially the FreeBSD 6.0 disk system.
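To put the "disk I/O bound" argument in numbers, here is a minimal sketch that estimates how long one full pass over 5TB takes at a given sustained read rate. The throughput figures are hypothetical placeholders, not measurements of any of the hardware named in this thread, and the function name is invented for the example.

```python
# Rough estimate of full-scan time for an aggregation that must read
# the whole dataset once. All throughput values are assumptions.

DATASET_BYTES = 5 * 10 ** 12  # 5TB, as in the original question


def full_scan_hours(dataset_bytes, throughput_mb_per_s):
    """Hours needed to read `dataset_bytes` at a sustained MB/s rate."""
    seconds = dataset_bytes / (throughput_mb_per_s * 10 ** 6)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical sustained sequential read rates in MB/s.
    for rate in (100, 400, 1000):
        hours = full_scan_hours(DATASET_BYTES, rate)
        print(f"{rate:>5} MB/s -> {hours:.1f} hours per full scan")
```

At an assumed 100 MB/s a single full-database aggregation takes more than half a day just to read the data, so the storage path matters far more than a modest difference in CPU speed.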
> At 5TB of data, I'd vote that the application is disk I/O bound, and the
> difference in CPU speed at the level of dual Opteron vs. dual-core
> Opteron is not going to be noticed.
>
> To maximize disk performance, try getting a dedicated high-end disk system
> like nStor or NetApp file servers hooked up via Fibre Channel, then use a
> good high-end Fibre Channel controller like one from LSI.
>
> And go with the FreeBSD amd64 port. It is *way* fast, especially the
> FreeBSD 6.0 disk system.

I'm (also) FreeBSD-biased, but I'm not sure whether a 5 TB filesystem will work so well if tools like fsck are needed. Gvinum could be one option, but I don't have any experience in that area.

regards
Claus
On Nov 16, 2005, at 4:50 PM, Claus Guttesen wrote:
> I'm (also) FreeBSD-biased, but I'm not sure whether a 5 TB filesystem will
> work so well if tools like fsck are needed. Gvinum could be one option,
> but I don't have any experience in that area.

Then look into an external filer and mount it via NFS. Then it is not FreeBSD's responsibility to manage the volume.