Thread: 4G row table?
[linux, 700MHz athlon, 512MB RAM, 700GB 10kRPM SCSI HW RAID, postgresql 7.2]

We're setting up a DB of IC test data, which is very simple and regular, but
large.  One project (we get three or four per year) has ~4 giga bits, each
specified by a few small integer values, e.g.:

    Name      Type    Values
    ------------------------------------------------
    wafer     int     1-50
    die_row   int     2-7
    die_col   int     2-7
    testtype  string  (~10 different short strings)
    vdd       int     0-25
    bit_col   int     0-127
    bit_row   int     0-511
    value     bit     0 or 1

with 4 billion (4e9) rows.  I would guess to make wafer, die_row, etc. be of
type "char", probably testtype a char too with a separate testtype lookup
table.  Even so, this will be a huge table.

Questions: How much overhead will there be in the table in addition to the
9 bytes of data I see?  How big will the primary index on the first seven
columns be?  Will this schema work at all?

Of course, we could pack 128 bits into an 8 byte "text" field (or should we
use bit(128)?), but lose some ease of use, especially for naive (but
important) users.

Comments, suggestions?

-- George

--
 I cannot think why the whole bed of the ocean is not one solid mass of
 oysters, so prolific they seem.  Ah, I am wandering!  Strange how the brain
 controls the brain!
        -- Sherlock Holmes in "The Dying Detective"
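For concreteness, a minimal sketch of the table as posted, before any type
tuning (the table name and the text/boolean choices are illustrative; the
seven-column primary key is the index asked about above):

    CREATE TABLE bit_test (
        wafer     integer NOT NULL,   -- 1-50
        die_row   integer NOT NULL,   -- 2-7
        die_col   integer NOT NULL,   -- 2-7
        testtype  text    NOT NULL,   -- ~10 different short strings
        vdd       integer NOT NULL,   -- 0-25
        bit_col   integer NOT NULL,   -- 0-127
        bit_row   integer NOT NULL,   -- 0-511
        value     boolean NOT NULL,   -- 0 or 1
        PRIMARY KEY (wafer, die_row, die_col, testtype, vdd, bit_col, bit_row)
    );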
On Thu, 19 Dec 2002 14:10:58 -0500, george young <gry@ll.mit.edu> wrote:
>with 4 billion(4e9) rows. I would guess to make wafer, die_row, etc. be of
>type "char", probably testtype a char too with a separate testtype lookup table.
>Even so, this will be a huge table.

Don't know if you can store 0-127 in a "char" column ...  Anyway, it doesn't
matter if it does not cause the tuple size to cross a 4 byte boundary,
because the tuple size will be rounded up to a multiple of 4.

>Questions: How much overhead will there be in the table in addition to the
>9 bytes of data I see?

There is a page header (ca. 20 bytes) per page (8K by default).  Then you
have a tuple header and 4 bytes ItemIdData per tuple.

PG 7.2: Without NULLs a tuple header is 32 bytes; add 4 bytes for each tuple
containing at least one NULL column.

PG 7.3: 24 bytes tuple header (with and without NULLs, because you have only
8 columns).

>How big will the primary index on the first seven columns be?

Don't know offhand.  No time now to dig it out.  Will answer tomorrow, if
nobody else jumps in ...

Servus
 Manfred
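Putting those figures together for 7.2 (and assuming the 9 data bytes pad
out to 12 under the multiple-of-4 rounding above), a rough size for the heap
alone can be computed right in psql:

    -- 32-byte tuple header + 4-byte ItemIdData + ~12 bytes of padded data
    SELECT 4000000000::bigint * (32 + 4 + 12) / 1000000000 AS approx_heap_gb;
    -- => 192 (GB), before page headers, free space, and any indexes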
George,

> [linux, 700MHz athlon, 512MB RAM, 700GB 10kRPM SCSI HW RAID, postgresql 7.2]

What kind of RAID?  How many drives?  Will you be updating the data
frequently, or mostly just running reports on it?

With 4G rows, you will have *heavy* disk access, so the configuration and
quality of your disk array is a big concern.  You also might think about
upping the memory if you can.

> We're setting up a DB of IC test data, which is very simple and regular,
> but large.  One project (we get three or four per year) has ~4 giga bits,
> each specified by a few small integer values, e.g.:
>     Name      Type    Values
>     ------------------------------------------------
>     wafer     int     1-50
>     die_row   int     2-7
>     die_col   int     2-7
>     testtype  string  (~10 different short strings)
>     vdd       int     0-25
>     bit_col   int     0-127
>     bit_row   int     0-511
>     value     bit     0 or 1
>
> with 4 billion (4e9) rows.  I would guess to make wafer, die_row, etc. be
> of type "char", probably testtype a char too with a separate testtype
> lookup table.  Even so, this will be a huge table.

1. Use INT2 and not INT for the INT values above.  If you can hire a
PostgreSQL hacker, have them design a new data type for you, an unsigned
INT1, which will cut your storage space even further.

2. Do not use CHAR for wafer & die_row.  CHAR requires a minimum of 3 bytes
of storage; INT2 is only 2 bytes.

3. If you can use a lookup table for testtype, make it another INT2 and
create a numeric key for the lookup table.

> Questions: How much overhead will there be in the table in addition to
> the 9 bytes of data I see?

There's more than 9 bytes in the above.  Count again.

> How big will the primary index on the first seven columns be?  Will this
> schema work at all?

As large as the 7 columns themselves, plus a little more.  I suggest
creating a surrogate key as an int8 sequence to refer to most rows.

> Of course, we could pack 128 bits into an 8 byte "text" field (or should
> we use bit(128)?), but lose some ease of use, especially for naive (but
> important) users.

This is also unlikely to be more efficient, due to the translation<->conversion
process required to access the data when you query.

> Comments, suggestions?

Unless you have a *really* good RAID array, expect slow performance on this
hardware platform.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
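Concretely, suggestions 1-3 plus the int8 surrogate might look like the
following revision of the earlier bit_test sketch (all names are
illustrative; the custom unsigned INT1 type is left out):

    CREATE TABLE testtype (
        testtype_id int2 PRIMARY KEY,
        name        text NOT NULL UNIQUE   -- the ~10 short strings
    );

    CREATE SEQUENCE bit_test_id_seq;

    CREATE TABLE bit_test (
        id          int8 NOT NULL DEFAULT nextval('bit_test_id_seq'),  -- surrogate key
        wafer       int2 NOT NULL,
        die_row     int2 NOT NULL,
        die_col     int2 NOT NULL,
        testtype_id int2 NOT NULL REFERENCES testtype,
        vdd         int2 NOT NULL,
        bit_col     int2 NOT NULL,
        bit_row     int2 NOT NULL,
        value       boolean NOT NULL
    );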
Josh:

Why do you say to expect slow performance on this hardware?  Is there
something specific about the configuration that worries you?  Or, just lots
of data in the database, so the data will be on disk and not in the cache
(system or postgresql)?

What do you classify as *slow*?  Obviously, he is dependent on the I/O
channel given the size of the tables.  So, good indexing will be required to
help on the queries.  No comments on the commit rate for this data (I am
guessing that it is slow, given the description of the database), so that
may or may not be an issue.

Depending on the type of queries, perhaps clustering will help, along with
some good partitioning indexes.

I just don't see the slow in the hardware.  Of course, if he is targeting
lots of concurrent queries, better add some additional processors, or better
yet, use ERSERVER and replicate the data to a farm of machines.  [To avoid
the I/O bottleneck of lots of concurrent queries against these large
tables].

I guess there are a lot of assumptions on the data's use to decide if the
hardware is adequate or not :-)

Charlie

Josh Berkus wrote:

>George,
>
>>[linux, 700MHz athlon, 512MB RAM, 700GB 10kRPM SCSI HW RAID, postgresql 7.2]
>
>What kind of RAID?  How many drives?  Will you be updating the data
>frequently, or mostly just running reports on it?
>
>Unless you have a *really* good RAID array, expect slow performance on this
>hardware platform.

--
Charles H. Woloszynski
ClearMetrix, Inc.
115 Research Drive
Bethlehem, PA 18015

tel: 610-419-2210 x400
fax: 240-371-3256
web: www.clearmetrix.com
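One way to read "partitioning indexes" here is PostgreSQL's partial indexes,
e.g. one index per test type so each stays much smaller than a single
seven-column index (names follow the revised bit_test sketch above; the
predicate value 1 is a made-up lookup id):

    -- index only the rows for one test type
    CREATE INDEX bit_test_tt1_idx
        ON bit_test (wafer, die_row, die_col, vdd, bit_col, bit_row)
     WHERE testtype_id = 1;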
On Thu, 2002-12-19 at 13:10, george young wrote:
> [linux, 700MHz athlon, 512MB RAM, 700GB 10kRPM SCSI HW RAID, postgresql 7.2]
> We're setting up a DB of IC test data, which is very simple and regular,
> but large.  One project (we get three or four per year) has ~4 giga bits,
> each specified by a few small integer values, e.g.:
>     Name      Type    Values
>     ------------------------------------------------
>     wafer     int     1-50
>     die_row   int     2-7
>     die_col   int     2-7
>     testtype  string  (~10 different short strings)
>     vdd       int     0-25
>     bit_col   int     0-127
>     bit_row   int     0-511
>     value     bit     0 or 1
>
> with 4 billion (4e9) rows.  I would guess to make wafer, die_row, etc. be
> of type "char", probably testtype a char too with a separate testtype
> lookup table.  Even so, this will be a huge table.

How many records per day will be inserted?  Will they ever be updated?
Do you have to have *ALL* 4 billion records in the same table at the same
time?

As Josh Berkus mentioned, wafer thru bit_col can be converted to INT2, if
you make testtype use a lookup table; thus, each tuple could be shrunk to
20 bytes, plus 24 bytes of tuple header (in v7.3).  That would make the
table a minimum of 189 billion bytes, not including indexes!!!

Rethink your solution...

One possibility would be to have a set of tables, with names like:
    TEST_DATA_200301
    TEST_DATA_200302
    TEST_DATA_200303
    TEST_DATA_200304
    TEST_DATA_200305
    TEST_DATA_200306
    TEST_DATA_200307
    TEST_DATA_<etc>

Then, each month do
    CREATE VIEW TEST_DATA AS SELECT * FROM TEST_DATA_yyyymm
for the current month.

> Questions: How much overhead will there be in the table in addition to
> the 9 bytes of data I see?  How big will the primary index on the first
> seven columns be?  Will this schema work at all?
>
> Of course, we could pack 128 bits into an 8 byte "text" field (or should
> we use bit(128)?), but lose some ease of use, especially for naive (but
> important) users.
>
> Comments, suggestions?
>
> -- George

--
+---------------------------------------------------------------+
| Ron Johnson, Jr.        mailto:ron.l.johnson@cox.net           |
| Jefferson, LA  USA      http://members.cox.net/ron.l.johnson   |
|                                                                |
| "My advice to you is to get married: If you find a good wife,  |
|  you will be happy; if not, you will become a philosopher."    |
|     Socrates                                                   |
+---------------------------------------------------------------+
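A sketch of that monthly scheme (there is no built-in partitioning in
7.2/7.3, so both the loader and the view have to be pointed at the right
month's table by hand; column names follow the INT2 version discussed
above):

    -- one identically structured table per month
    CREATE TABLE TEST_DATA_200301 (
        wafer int2, die_row int2, die_col int2, testtype_id int2,
        vdd int2, bit_col int2, bit_row int2, value boolean
    );
    -- ... and likewise for TEST_DATA_200302, TEST_DATA_200303, ...

    -- at the start of each month, re-point the view at the current table
    DROP VIEW TEST_DATA;    -- skip the first time, when no view exists yet
    CREATE VIEW TEST_DATA AS SELECT * FROM TEST_DATA_200302;

Queries that need to span months would have to name the monthly tables
explicitly, or go through a separate UNION ALL view.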
On Thu, 19 Dec 2002 14:10:58 -0500, george young <gry@ll.mit.edu> wrote:
>with 4 billion(4e9) rows.
>How big will the primary index on the first seven columns be?

If you manage to pack the key into 8 bytes (by using a custom 1 byte integer
datatype) and if there are no NULLs: 75 GB with a 100% fill factor, 114 GB
with a 66% fill factor, realistically something in between.  Note that
frequent updates can cause index growth.

>Will this schema work at all?

You have a somewhat unusual identifier : payload ratio (8B : 1b).  It
depends on the planned use, but I'm not sure if *any* database is the right
solution.  You have "only" 30670848000 (30G) possible different key
combinations, more than 1/8 of them (4G) are actually used.  A 7-dimensional
array of double-bits (1 bit to indicate a valid value and 1 bit payload)
would require not more than 8 GB.

If you plan to use a database because you have to answer ad-hoc queries, you
will almost certainly need additional indices.

Servus
 Manfred
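The key-space figure follows directly from the value ranges in the original
post, and can be cross-checked in psql (assuming 2 bits per possible
combination, as described above):

    -- 50 wafers * 6 die rows * 6 die cols * 10 test types * 26 vdd values
    --   * 128 bit cols * 512 bit rows = 30,670,848,000 combinations
    SELECT 50::bigint * 6 * 6 * 10 * 26 * 128 * 512         AS key_combinations,
           50::bigint * 6 * 6 * 10 * 26 * 128 * 512 * 2 / 8 AS array_bytes;
    -- about 7.7 billion bytes, i.e. under the 8 GB mentioned above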
Charlie,

> Why do you say to expect slow performance on this hardware?  Is there
> something specific about the configuration that worries you?  Or, just
> lots of data in the database, so the data will be on disk and not in the
> cache (system or postgresql)?
>
> What do you classify as *slow*?  Obviously, he is dependent on the I/O
> channel given the size of the tables.  So, good indexing will be required
> to help on the queries.  No comments on the commit rate for this data (I
> am guessing that it is slow, given the description of the database), so
> that may or may not be an issue.
>
> Depending on the type of queries, perhaps clustering will help, along
> with some good partitioning indexes.
>
> I just don't see the slow in the hardware.  Of course, if he is targeting
> lots of concurrent queries, better add some additional processors, or
> better yet, use ERSERVER and replicate the data to a farm of machines.
> [To avoid the I/O bottleneck of lots of concurrent queries against these
> large tables].
>
> I guess there are a lot of assumptions on the data's use to decide if the
> hardware is adequate or not :-)

Well, slow is relative.  It may be fast enough for him.  Me, I'd be
screaming in frustration.

Take, for example, an index scan on the primary key.  Assuming that he can
get the primary key down to 12 bytes per node using custom data types,
that's still:

    12 bytes * 4,000,000,000 rows = 48 GB for the index

As you can see, it's utterly impossible for him to load even the primary key
index into his 512 MB of RAM (of which no more than 200 MB can go to
Postgres anyway without risking conflicts over RAM).  A sort-and-limit on
the primary key, for example, would require swapping the index from RAM to
swap space as much as 480 times! (though probably more like 100 times on
average)

With a slow RAID array and the hardware he described to us, this would mean,
most likely, that a simple sort-and-limit on primary key query could take
hours to execute.  Even with really fast disk access, we're talking tens of
minutes at least.

-Josh Berkus
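For reference, the "sort-and-limit on the primary key" being discussed is a
query shaped like this (table and column names follow the revised bit_test
sketch earlier in the thread):

    -- an ORDER BY over the full seven-column natural key, with a LIMIT
    SELECT *
      FROM bit_test
     ORDER BY wafer, die_row, die_col, testtype_id, vdd, bit_col, bit_row
     LIMIT 100;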