Thread: database size estimates
Hi, I am trying to estimate the size of a table composed of 51754000 rows. Each row has 31 attributes: 16 x bit(24) and 15 x bit(8) So, the payload should be: 51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB Now, from what I understand from postgresql manual is that the overhead is composed of 32 bytes per row + 20 bytes per page. This leads me to approx. 1700 MB overhead. Therefore a total table size of 4900 MB. However, when I load the table, the table reaches the size of 21500 MB, i.e., 400% of my estimate. The table has no toast table, no index. So I am wondering if someone could give me a better estimate. Cheers, Francois
Attachment
Francois Deliege wrote: > Hi, > > I am trying to estimate the size of a table composed of 51754000 rows. > Each row has 31 attributes: 16 x bit(24) and 15 x bit(8) > > So, the payload should be: > 51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB What data types are those exactly? If those 24-bit fields are in fact text, varchar, char(x) or other data types that are stored as variable length fields, the varlen header will take 4 bytes. And then there's alignment, those 24-bit fields are most almost certainly 4-byte aligned, which means that there'll be one byte of padding between them. The upcoming 8.3 release will be much better in that respect, It'll use just a 1 byte varlen header per field instead of 4 bytes for small values like yours. You might want to test a CVS snapshot. > Now, from what I understand from postgresql manual is that the overhead > is composed of > 32 bytes per row + 20 bytes per page. > This leads me to approx. 1700 MB overhead. > Therefore a total table size of 4900 MB. In addition, there will be on average 1/2 rows worth of wasted space on every page. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > Francois Deliege wrote: >> Hi, >> >> I am trying to estimate the size of a table composed of 51754000 rows. >> Each row has 31 attributes: 16 x bit(24) and 15 x bit(8) >> >> So, the payload should be: >> 51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB > > What data types are those exactly? If those 24-bit fields are in fact text, > varchar, char(x) or other data types that are stored as variable length fields, And sadly that includes bit() if you're being literal. As of Postgres 8.1 you can see how much space a column is taking up using the pg_column_size() function. This won't include alignment padding but will include the length header for that column. You can see how much a given row is taking up by passing the entire row to og_column_size with something like pg_column_size(tab.*) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com