Thread: Large Databases
What is the linux and/or postgres limitation for very large databases, if any? We are looking at 6T-20T. My understanding is that if the hardware supports it, then it can be done in postgres. But can hardware support that?

--elein
elein@varlena.com
Varlena, LLC  www.varlena.com
PostgreSQL Consulting, Support & Training
PostgreSQL General Bits  http://www.varlena.com/GeneralBits/
I have always depended on the [QA] of strangers.
elein wrote:
> What is the linux and/or postgres limitation for very
> large databases, if any? We are looking at 6T-20T.
> My understanding is that if the hardware supports it,
> then it can be done in postgres. But can hardware
> support that?

I've never had the pleasure of actually seeing one of those, but for example IBM sells storage servers that can hold up to 224 SCSI drives. The largest drive now is 146G, so that gives you 32T. With some RAIDing you are looking at 16T of storage in one box (huge as it may be; it actually occupies a whole 90U rack). You can use more of them, I guess, since version 8 has tablespaces.

Be prepared to spend some half a mil for this beast, and we are talking US dollars here (list price is $760k, but who would shop at list price, eh?). Beware, nuclear power plant not included.

If you get one of those, let us know how Postgres hums on it :)

--
Michal Taborsky
http://www.taborsky.cz
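The capacity figures above can be checked with a few lines of arithmetic. This is only a rough sketch: the drive count and drive size come from the message, while reading "some RAIDing" as plain mirroring (which halves usable space) is an assumption.

```python
# Back-of-the-envelope capacity check for the storage server described above.
DRIVES = 224      # drive bays quoted in the message
DRIVE_GB = 146    # largest SCSI drive size quoted, in GB

raw_gb = DRIVES * DRIVE_GB   # total raw capacity in GB
raw_tb = raw_gb / 1000       # decimal terabytes, as drive vendors count them

# Assumption: "some RAIDing" means mirroring (RAID 1/10), which halves usable space.
usable_tb = raw_tb / 2

print(f"raw: {raw_tb:.1f} TB, mirrored: {usable_tb:.1f} TB")
```

Other RAID levels (RAID 5, for example) would leave more usable space than mirroring, so 16T is closer to a lower bound under this reading.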
elein wrote:
> What is the linux and/or postgres limitation for very
> large databases, if any? We are looking at 6T-20T.
> My understanding is that if the hardware supports it,
> then it can be done in postgres. But can hardware
> support that?

I've recently been going through a project to support what will become a 5 to 6 TB Postgres database (initially it will be about 300GB after conversion from the source system). A few significant things I've learned along the way:

1) The linux 2.4 kernel has a block device size limit of 2 TB.

2) The linux 2.6 kernel supports *huge* block device size -- I don't have it in front of me, but IIRC it was in the petabyte range.

3) xfs, jfs, and ext3 all can handle more than the 6TB we needed them to handle.

4) One of the leading SAN vendors initially claimed to be able to support our desire to have a single 6TB volume. We found that when pushed hard, we would get disk corruption (archives are down, but see HACKERS on 8/21/04 for a message I posted on the topic). Now we are being told that they don't support the linux 2.6 kernel, and therefore don't support > 2TB volumes.

So the choices seem to be:

a) Use symlinks or Postgres 8.0.0beta tablespaces to split your data across multiple 2 TB volumes.

b) Use NFS mounted NAS.

We are already a big NetApp shop, so NFS mounted NAS is the direction we'll likely take. It appears (from their online docs) that NetApp can have individual volumes up to 16 TB. We should be confirming that with them in the next day or two.

HTH,

Joe
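To make option (a) concrete, here is a minimal sketch of how many 2 TB volumes a given database size would need, along with the matching CREATE TABLESPACE statements (PostgreSQL 8.0 syntax). Several things here are assumptions and not from the messages above: the explanation of the 2 TB limit as a 32-bit sector index with 512-byte sectors, the mount points /mnt/pgvol1, /mnt/pgvol2, ..., and the tablespace names vol1, vol2, .... The sketch also ignores the small difference between TB and TiB.

```python
import math

SECTOR_BYTES = 512
# Assumption: the 2.4-era limit follows from a 32-bit sector index
# combined with 512-byte sectors, i.e. 2**32 * 512 bytes = 2 TiB.
limit_bytes = 2**32 * SECTOR_BYTES

def volumes_needed(db_tb, volume_tb=2.0):
    """Number of volumes of size volume_tb needed to hold db_tb of data."""
    return math.ceil(db_tb / volume_tb)

def tablespace_sql(count, mount_prefix="/mnt/pgvol"):
    """Build one CREATE TABLESPACE statement per volume.

    The names vol1..volN and the mount prefix are hypothetical.
    """
    return [
        f"CREATE TABLESPACE vol{i} LOCATION '{mount_prefix}{i}';"
        for i in range(1, count + 1)
    ]

if __name__ == "__main__":
    print(f"2.4 block device limit: {limit_bytes / 1024**4:.0f} TiB")
    for size_tb in (6, 20):
        print(f"{size_tb} TB database -> {volumes_needed(size_tb)} volumes")
    for statement in tablespace_sql(volumes_needed(6)):
        print(statement)
```

Individual tables and indexes would then be placed with the TABLESPACE clause of CREATE TABLE or CREATE INDEX, which is what makes the split workable but also why it adds ongoing administrative overhead.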
I thought NFS was not recommended. Did I misunderstand this or is there some kind of limitation to using different kinds(?) of NFS.

Thank you for the excellent info.

--elein

On Tue, Aug 31, 2004 at 01:54:41PM -0700, Joe Conway wrote:
> So the choices seem to be:
> a) Use symlinks or Postgres 8.0.0beta tablespaces to split your data
> across multiple 2 TB volumes.
>
> b) Use NFS mounted NAS.
elein wrote:
> I thought NFS was not recommended. Did I misunderstand this
> or is there some kind of limitation to using different kinds(?)
> of NFS.

I've seen that sentiment voiced over and over. And a few years ago, I would have joined in. But the fact is *many* large Oracle installations now run over NFS to NAS. When it was first suggested to us, our Oracle DBAs said "no way". But when we were forced to try it due to hardware failure (on our attached fibre channel array) a few years ago, we found it to be *faster* than the locally attached array, much more flexible, and very robust. Our Oracle DBAs would never give it up at this point.

I suppose there *may* be some fundamental technical difference that makes Postgres less reliable than Oracle when using NFS, but I'm not sure what it would be -- if anyone knows of one, please speak up ;-). Early testing on NFS mounted NAS has been favorable, i.e. at least the data does not get corrupted as it did on the SAN. And like I said, our only other option appears to be spreading the data over multiple volumes, which is a route we'd rather not take.

Joe
On Tue, 2004-08-31 at 15:07, Joe Conway wrote:
> I suppose there *may* be some fundamental technical difference that
> makes Postgres less reliable than Oracle when using NFS, but I'm not
> sure what it would be -- if anyone knows of one, please speak up ;-).
> Early testing on NFS mounted NAS has been favorable, i.e. at least the
> data does not get corrupted as it did on the SAN. And like I said, our
> only other option appears to be spreading the data over multiple
> volumes, which is a route we'd rather not take.

I have been doing a *lot* of testing of PG 7.4 over NFS with a couple of EMC Celerras and have had excellent results thus far.

My best NFS results were within about 15% of the speed of my best SAN results.

However, my results changed drastically under the 2.6 kernel, when the NFS results stayed about the same as 2.4, but the SAN jumped about 50% in transactions per second.
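To see what those two percentages imply together, here is a small worked example. The baseline of 100 is an arbitrary normalized unit, not a measured figure, and "within about 15%" is read as NFS being 15% slower than the SAN under 2.4; both are assumptions for illustration.

```python
# Normalized throughput, arbitrary units (hypothetical baseline, not measured data).
san_24 = 100.0
nfs_24 = san_24 * (1 - 0.15)   # NFS read as about 15% slower than SAN on 2.4

san_26 = san_24 * 1.5          # SAN jumped about 50% under 2.6
nfs_26 = nfs_24                # NFS stayed about the same under 2.6

# Fraction by which NFS trails the SAN under each kernel.
gap_24 = 1 - nfs_24 / san_24
gap_26 = 1 - nfs_26 / san_26

print(f"2.4 kernel: NFS trails SAN by {gap_24:.0%}")
print(f"2.6 kernel: NFS trails SAN by {gap_26:.0%}")
```

Under these assumptions the gap widens from roughly 15% to a bit over 40%, which is why the 2.6 result changes the comparison so much.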
Cott Lang wrote:
> My best NFS results were within about 15% of the speed of my best SAN
> results.

Good info, and consistent with what I've seen.

> However, my results changed drastically under the 2.6 kernel, when the
> NFS results stayed about the same as 2.4, but the SAN jumped about 50%
> in transactions per second.

Very interesting. Whose SAN are you using that supports the 2.6 kernel?

Thanks,

Joe
On Tue, 2004-08-31 at 20:37, Joe Conway wrote:
> > However, my results changed drastically under the 2.6 kernel, when the
> > NFS results stayed about the same as 2.4, but the SAN jumped about 50%
> > in transactions per second.
>
> Very interesting. Whose SAN are you using that supports the 2.6 kernel?

I'm using EMC Clariions, but they do not officially support using the 2.6 kernel. Rumor (from them) has it that it will be supported in October.

I tested it because I felt like it would be useful knowledge moving forward. :)