Thread: Billions of records?
Hi, Can anyone tell me what the limitations are regarding records? Say I have a table with 50 fields of 20 characters each. If a table can be 16 TB, how many records can it hold? If you all have any other size benchmarks or such please include it. Is PG ready for huge db's? How big is the biggest known? -- Best regards, John Bercik E-Mail: bercikj@musc.edu Systems Programmer Office: 843/792-1715 Medical University of South Carolina Fax: 843/792-0816 Room 826F Clinical Science Building 76 Jonathan Lucas Street Charleston, SC, USA 29425
On Tuesday 15 July 2003 19:15, John Bercik wrote: > Hi, > > Can anyone tell me what the limitations are regarding records? > Say I have a table with 50 fields of 20 characters each. If a table can > be 16 TB, how many records can it hold? Each tuple in postgresql has around 28 bytes of overhead. Index tuple has 12 bytes. Besides varchar would include additional overhead. And char won't be true char because it always stores length of string. Although I have specialised char types developed by some kind hackers which are true chars. Other than that there is no limit on size of table. Check this http://www.postgresql.org/docs/faqs/FAQ.html#4.5 BTW, is it coincidence that your figure of 16TB matches the limit there? > If you all have any other size benchmarks or such please include it. Is > PG ready for huge db's? How big is the biggest known? I have had references of databases as big as 300GB. People routinely have more than 10GB databases. You can dig archives for the same. HTH Shridhar
Shridhar Daithankar <shridhar_daithankar@persistent.co.in> writes: > Other than that there is no limit on size of table. Check this > http://www.postgresql.org/docs/faqs/FAQ.html#4.5 > BTW, is it coincidence that your figure of 16TB matches the limit there? No, because he was correctly quoting the FAQ ;-). The 16TB limit comes from the fact that BlockNumber is 32 bits, so you cannot have a table larger than 4 billion blocks. (The FAQ is conservatively assuming that the limit is 2 billion blocks; 2G blocks * 8KB block size = 16TB. In principle 4 billion should work, since BlockNumber is an unsigned int. But there used to be places that sloppily used signed arithmetic on block numbers, and we aren't entirely sure we've gotten rid of all of 'em. If anyone has facilities to test behavior with tables between 16TB and 32TB in size, I'd be interested to hear the results ...) >> If you all have any other size benchmarks or such please include it. Is >> PG ready for huge db's? How big is the biggest known? The 4TB database mentioned in the FAQ belongs to the American Chemical Society (something about scanning all their journals back to the mid-1800s...); you can dig in the PG list archives for details. Another large database I'm aware of is the 2-micron sky survey: http://pegasus.astro.umass.edu/ which covers a bit under half a billion stars; it is reportedly about 150GB when loaded into Postgres. The UMass people seem to be happy with the performance they get... regards, tom lane
On Tuesday 15 July 2003 20:01, John Bercik wrote: > Thanks for the reply. So given my conditions, how many records can PG > hold? I got the 16TB from the limitations page but I don't see how to > calculate how many records I can hold. OK. Let's calculate. You have 50 fields of 20 char each. So each field would go for 24 bytes at least assuming a 4 byte integer for length. So a tuple size is 50*24+28=1228 bytes. So in a page of 8KB, you would get 6 tuples. In 16TB you can accomodate, 2GB pages. So you can accomodate around 12 billion tuples for your table schema. If you get true char types, that would accomodate 7 tuples per page. So you could get around 14 billion tuples in there. HTH BTW How much data you actually have? Shridhar
On Tue, Jul 15, 2003 at 07:53:37PM +0530, Shridhar Daithankar wrote: > Each tuple in postgresql has around 28 bytes of overhead. Index tuple has 12 Is this accurate? http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=page.html indicates a per-tuple overhead of 23 bytes. Or have things changed in 7.4? BTW, is there any documentation on the structure of index pages? -- Jim C. Nasby, Database Consultant jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
"Jim C. Nasby" <jim@nasby.net> writes: > On Tue, Jul 15, 2003 at 07:53:37PM +0530, Shridhar Daithankar wrote: >> Each tuple in postgresql has around 28 bytes of overhead. Index tuple has 12 > Is this accurate? > http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=page.html > indicates a per-tuple overhead of 23 bytes. Or have things changed in > 7.4? He's probably assuming you are using OIDs (which is the default). > BTW, is there any documentation on the structure of index pages? Same as heap pages... http://developer.postgresql.org/docs/postgres/page.html The index tuple header layout is different from heap tuple headers, but the other info on that page applies. regards, tom lane
On Wed, Jul 16, 2003 at 06:07:22PM -0400, Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: > > BTW, is there any documentation on the structure of index pages? > > Same as heap pages... > http://developer.postgresql.org/docs/postgres/page.html > The index tuple header layout is different from heap tuple headers, > but the other info on that page applies. Sorry, said one thing and meant another... is there documentation on index *tuple* layout? -- Jim C. Nasby, Database Consultant jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
I have an internals presentation on my home page below. There is a PDF that will give a general overview,and of course the source code has the details. --------------------------------------------------------------------------- Jim C. Nasby wrote: > On Wed, Jul 16, 2003 at 06:07:22PM -0400, Tom Lane wrote: > > "Jim C. Nasby" <jim@nasby.net> writes: > > > BTW, is there any documentation on the structure of index pages? > > > > Same as heap pages... > > http://developer.postgresql.org/docs/postgres/page.html > > The index tuple header layout is different from heap tuple headers, > > but the other info on that page applies. > > Sorry, said one thing and meant another... is there documentation on > index *tuple* layout? > -- > Jim C. Nasby, Database Consultant jim@nasby.net > Member: Triangle Fraternity, Sports Car Club of America > Give your computer some brain candy! www.distributed.net Team #1828 > > Windows: "Where do you want to go today?" > Linux: "Where do you want to go tomorrow?" > FreeBSD: "Are you guys coming, or what?" > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073