Thread: Row Limit on tables
Is there a limit on the number of rows you can have in a table? A couple of experts have predicted that 20 million or 200 million was the limit. I would like to hear your experiences.

My machine is a Red Hat 7.1 x86 box running Postgres 7.1.2.

Thank you,
Sheheryar Sewani
Houston, TX
According to this, it's unlimited based on HD and RAM size:
http://www.us.postgresql.org/users-lounge/limitations.html

Ray
Hi,

On Fri, 31 May 2002, shey sewani wrote:

> Is there a limit on the number of rows you can have in a table? A couple
> of experts have predicted that 20 million or 200 million was the limit.
> I would like to hear your experiences.
>
> My machine is a Red Hat 7.1 x86 box running Postgres 7.1.2.

I cannot remember the exact definition, but as far as I remember it involved the count() function, which returns a 32-bit integer, so for tables having more than 2 billion rows count() has really no "meaning". That's why the number of rows in a table is limited only by your available physical disk space.

Best regards.
--
Devrim GUNDUZ
devrim@oper.metu.edu.tr
devrim.gunduz@linux.org.tr
Web : http://devrim.oper.metu.edu.tr
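The 2 billion figure mentioned here is just the range of a signed 32-bit integer. A quick sketch of the arithmetic (assuming, per this thread, that count() returned a 32-bit result in releases before 7.2):

```python
# Range of a signed 32-bit integer -- the return type attributed to
# count() in pre-7.2 PostgreSQL by the discussion in this thread.
INT4_MAX = 2**31 - 1
print(INT4_MAX)  # 2147483647, i.e. just over 2 billion

# The 20M / 200M figures quoted by the "experts" are far below even
# this bound:
print(200_000_000 < INT4_MAX)  # True
```

So even a 32-bit count() would only become a problem at roughly ten times the largest limit the "experts" predicted.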
On Fri, 31 May 2002, shey sewani wrote:

> Is there a limit on the number of rows you can have in a table? A couple
> of experts have predicted that 20 million or 200 million was the limit.
> I would like to hear your experiences.

Nowhere near that low, that's for sure. I'm playing with a 500 million row table right now. And this is just on a single cheap IDE disk, not on real hardware that would let me work with properly large databases....

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
I'd like to see the EXPLAIN on a VIEW for this. :-)

Ray
"shey sewani" <pakix2000@hotmail.com> writes: > Is there a limit on the number of rows you can have in a table? http://www.ca.postgresql.org/docs/faq-english.html#4.5 The smallest useful row size would probably be 40 bytes (2 int columns plus 32 bytes overhead), so 16TB/40 = 400 billion rows is the most you could possibly fit given the table size limit (unless you raise BLCKSZ to 32K, which would give you another factor of 4). In practice, "what will fit on your disk" is the limit. > Couple experts have predicted that 20Million or 200Million was the > limit. These "experts" evidently have no clue what they are talking about. Perhaps they neglected to read the thread running parallel to this one that mentions throwaway tests on 500M-row tables... I saw nearby a claim that count(*) would fail at 2G rows --- but count(*) returns int8 in 7.2 and later. regards, tom lane Sheheryar Sewani Houston, TX _________________________________________________________________ Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp.
Thank you for your replies. We have eclipsed the 200 million mark in our table. I have read the FAQs about Postgres that stated there was no limit on the number of rows, and that the size of the database would be limited only by the size of your hard drives. The performance of select queries has not degraded as much as I expected; the indexes are doing their job.

Sheheryar Sewani
On Fri, 31 May 2002, Tom Lane wrote:

> In practice, "what will fit on your disk" is the limit.

Actually, not even. Or so I think for most cases.

Playing around with my little 500 million row table, I notice that the table itself is only 25 GB, and even with 4 indices, the whole shebang is less than 60 GB. What's killing me is disk access speed. Currently, my "standard" query takes 70 seconds, entirely due to limitations on the number of random I/Os I can do per second.

I can fairly cheaply halve this problem by striping the database across two disks, but then I double the space available. If that leads me to double the database size, I'm back in the same hole I was in before, maybe worse.

It's way, way too easy these days to run up a terabyte of RAID-5 storage. One Escalade 7850 controller ($500) plus eight 160 GB drives ($250 each) sets you back about $2500. But the problem is, all this storage often doesn't have the I/O bandwidth you need to actually make use of it....

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
> > In practice, "what will fit on your disk" is the limit.
>
> Actually, not even. Or so I think for most cases.

(snip)

> I can fairly cheaply halve this problem by striping the database across
> two disks, but then I double the space available. If that leads me to
> double the database size, I'm back in the same hole I was in before,
> maybe worse.

You seem to be confusing the question "How many rows can I have in a table?" with "How fast will query (X) run on a table with (Y) rows?" The question and documents we're talking about are dealing with hard ceilings, not with performance. In fact, trying to give performance estimates for such situations is kind of silly - queries on such tables could be anywhere from very, very fast to very, very slow. Many, many factors are involved.

> It's way, way too easy these days to run up a terabyte of RAID-5
> storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
> ($250 each) sets you back about $2500. But the problem is, all this
> storage often doesn't have the I/O bandwidth you need to actually make
> use of it....

That's one extreme end of the spectrum, why not look at the other? You could load up on low-latency, 10K RPM 9-gig drives, and have amazing throughput with only a very small fraction of the total storage capacity. It's all in what you're looking for.

steve
On Fri, 31 May 2002, Steve Wolfe wrote:

> You seem to be confusing the question "How many rows can I have in a
> table?" with "How fast will query (X) run on a table with (Y) rows?"

IMHO, they're part of the same question. Given unlimited time, sure, your table size limit is disk space. But it seems to me, from looking at the conversations here, that people don't care to wait unlimited time for the results of a query.

> That's one extreme end of the spectrum, why not look at the other? You
> could load up on low-latency, 10K RPM 9-gig drives, and have amazing
> throughput with only a very small fraction of the total storage capacity.

I'm not convinced that the throughput would be all that much better, in fact, given an equivalent number of drives. I'd be interested in hearing about the results of tests that show otherwise. How many I/Os per second does a good modern 10K RPM SCSI drive do these days, anyway?

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
At 03:25 AM 6/1/02 +0900, Curt Sampson wrote:

>It's way, way too easy these days to run up a terabyte of RAID-5
>storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
>($250 each) sets you back about $2500. But the problem is, all this
>storage often doesn't have the I/O bandwidth you need to actually make
>use of it....

Use smaller HDDs, or if cost is not a big issue, just use the faster portions of the big HDDs. The smaller HDDs are about the same speed as the big ones of the same model range - the bigger ones just have more platters. But the bigger ones allow you to throw away the slower space: by using just the faster portions (the outer edges) you can reduce seek time by a few milliseconds and keep throughput in the higher bands.

Cheerio,
Link.
On Sat, 1 Jun 2002, Lincoln Yeoh wrote:

> The smaller HDDs are about the same speed as the big ones of the same
> model range - the bigger ones just have more platters.

That used to be the case, but it's often not so any more.

> But the bigger ones allow you to throw away slower space. By using just
> the faster portions (the outer edges) you can reduce seek time by a few
> milliseconds, and keep throughput in the higher bands.

Right. That still works!

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
>I cannot remember the definition exactly; but as far as I remember, it was
>with count() function; which returns 32-bit integer, and for tables having
>more than 2 billion rows, count() has really no "meaning".

In 7.2.1 the aggregate functions now return int8 (64-bit) values.

Cheers,
Doug
>It's way, way too easy these days to run up a terabyte of RAID-5
>storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
>($250 each) sets you back about $2500. But the problem is, all this
>storage often doesn't have the I/O bandwidth you need to actually make
>use of it....

FYI... the Escalade controller configured as you suggest here is known to do 125-145 MB/sec at least... much faster than any single IDE drive. (I used Escalades until very recently, when I had to switch to SCSI for just that last bit more of performance.)

Cheers,
Doug
On Sat, 1 Jun 2002, Doug Fields wrote:

> FYI... The Escalade controller configured as you suggest here is known
> to do 125-145 MB/sec at least...

When doing database stuff, you're most likely to run into the limit on the number of I/Os per second long before the bandwidth limit. On my queries, an IDE drive perfectly capable of transferring 30 MB/sec. ends up transferring more like two or three, because it can't do more than about 140 reads per second, and the reads are only 16K.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
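Curt's "two or three" MB/sec figure follows directly from his numbers (a rough estimate using ~140 random reads/sec at 16 KB per read, not a benchmark):

```python
# Effective throughput of a random-I/O-bound workload, using the
# figures from the message above.
reads_per_sec = 140   # random reads/sec the IDE drive can sustain
read_size_kb = 16     # size of each read, in KB

effective_kb_per_sec = reads_per_sec * read_size_kb
print(effective_kb_per_sec / 1024)  # ~2.2 MB/sec

# Versus the drive's ~30 MB/sec sequential rate: random access is
# leaving over 90% of the raw bandwidth on the table.
print(30 / (effective_kb_per_sec / 1024))
```

This is why a controller good for 125-145 MB/sec of sequential transfer can still deliver only a few MB/sec on a seek-heavy query.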