Thread: Smaller data types use same disk space
Hi,

I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data, using 493MB and 487MB of disk space respectively. The difference is that the "Big" table uses data types that take up more space (integer rather than smallint, float rather than real, etc.). The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In reality, it's only 1% bigger, or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+.

Thanks, Robert

Table "Big"
  Column  |       Type       | Bytes
----------+------------------+-----------------------------
 rmid     | integer          | 4
 date     | date             | 4
 rmfactor | text             | 7 (about 3 characters/cell)
 id       | integer          | 4
 value    | double precision | 8
---------------------------------
 Total Bytes/Row   27
 Rows              10M
 Actual Size       493MB

Table "Small"
 Column |   Type   | Bytes
--------+----------+--------------------------------------------------------
 rmid   | smallint | 2
 date   | date     | 4
 rmfid  | smallint | 2 (rmfid is a smallint index into the rmfactor table)
 id     | integer  | 4
 value  | real     | 4
---------------------------------
 Total Bytes/Row   16
 Rows              10M
 Actual Size       487MB
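For reference, a minimal DDL sketch of the two tables as described in the post (column names come from the tables above; the exact definitions, constraints and defaults are an assumption):

    -- Hypothetical reconstruction of the two tables described above.
    CREATE TABLE big (
        rmid     integer,
        date     date,
        rmfactor text,             -- about 3 characters per value
        id       integer,
        value    double precision
    );

    CREATE TABLE small (
        rmid  smallint,
        date  date,
        rmfid smallint,            -- smallint key into the rmfactor lookup table
        id    integer,
        value real
    );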
On 07/24/2012 03:21 PM, McGehee, Robert wrote:
> I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data, using 493MB and 487MB of disk space respectively. The difference is that the "Big" table uses data types that take up more space (integer rather than smallint, float rather than real, etc.). The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In reality, it's only 1% bigger, or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+

See here for the gory details:
http://www.postgresql.org/docs/9.1/interactive/storage-page-layout.html

See in particular:
Table 55-4. HeapTupleHeaderData Layout

From the text: "All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines) ...", which breaks your assumption in the Big/Small row size comparison.

--
Adrian Klaver
adrian.klaver@gmail.com
On 07/24/2012 03:21 PM, McGehee, Robert wrote:
> I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data, using 493MB and 487MB of disk space respectively. The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In reality, it's only 1% bigger, or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+

More questions than answers:

What version of PostgreSQL?

How are you determining the space used by a table?

Why are you assuming 7 bytes for a 3-character value? (Character values up to 126 characters long only have 1-character overhead.)

What is the fill-factor on the tables? (Should default to 100%, but I don't know how you are configured.)

Do the tables have OIDs or not?

Other considerations are that rows don't split across pages, so there is a bit of waste per page. Also there could be compression considerations, though I'm not sure that small rows like this will be compressed.

Cheers,
Steve
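On the question of how space is being measured, a minimal sketch of the built-in size functions (the table name is assumed to have been created unquoted, hence lowercase; whether indexes and TOAST should be counted depends on what is being compared):

    -- Heap only vs. heap plus indexes and TOAST.
    SELECT pg_size_pretty(pg_relation_size('small'))       AS heap_only,
           pg_size_pretty(pg_total_relation_size('small')) AS with_indexes_and_toast;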
Adrian Klaver <adrian.klaver@gmail.com> writes:
> On 07/24/2012 03:21 PM, McGehee, Robert wrote:
>> I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data, using 493MB and 487MB of disk space respectively. The difference is that the "Big" table uses data types that take up more space (integer rather than smallint, float rather than real, etc.). The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In reality, it's only 1% bigger, or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+

> See here for the gory details:
> http://www.postgresql.org/docs/9.1/interactive/storage-page-layout.html

> See in particular:
> Table 55-4. HeapTupleHeaderData Layout

> From the text: "All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines) ...", which breaks your assumption in the Big/Small row size comparison.

Aside from the tuple header, there are alignment considerations that you have to allow for. So for instance, if your columns are int, smallint, int, that will take the same amount of space as 3 ints, because the savings disappears into alignment of the third int. You need two adjacent smallints to get any benefit.

			regards, tom lane
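Tom's point can be checked directly with pg_column_size() on row values, since a composite datum carries a 24-byte header plus the same alignment padding as a heap tuple. A rough sketch (the commented numbers are what one would typically expect, not guarantees):

    SELECT pg_column_size(ROW(1::int, 1::smallint, 1::int))      AS int_smallint_int,   -- should be 36
           pg_column_size(ROW(1::int, 1::int, 1::int))           AS three_ints,         -- should also be 36
           pg_column_size(ROW(1::smallint, 1::smallint, 1::int)) AS adjacent_smallints; -- should be 32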
Very interesting points. Thanks for the documentation link and the point about alignment.

As a test of Tom's suggestion to group smallints together to avoid alignment problems, I changed the column order from

    smallint, date, smallint, integer, real
TO
    smallint, smallint, date, integer, real

This resulted in a "Small" table that is 15% smaller than the original "Small" table (414MB vs 487MB). I wasn't aware that tables could be optimized by switching column order like this. This could be a good note to make in either the "Data Types" portion of the PostgreSQL manual or the "Performance Optimization" portion of the PostgreSQL Wiki.

One might even imagine a future version of PostgreSQL using an efficient disk layout that may not match the table layout in order to avoid wasted space from padding. I suppose this already happens to some extent with the different storage modes (plain, extended, external).

Steve also correctly pointed out that my text string probably only takes up 3-4 bytes rather than 7 bytes, meaning the "Small" table uses only 7-8 bytes/row less than the "Big" table. For 10M rows, the expected savings should be about 70-80MB. This matches the 79MB of savings I see once I reorder the columns following Tom's suggestion. All is right in the world!

Thanks to all,
Robert

SC> More questions than answers:
SC> What version of PostgreSQL?

9.1.1 (I should have mentioned that)

SC> How are you determining the space used by a table?

\d+

SC> Why are you assuming 7 bytes for a 3-character value? (Character values
SC> up to 126 characters long only have 1-character overhead.)

You are correct. I read the wrong documentation (the docs for 8.2 say the overhead is 4 bytes, but the docs for my 9.1 version say the overhead is only one byte for small strings). So I believe I should assume only 4 bytes of total usage here, maybe less if it's compressed.

SC> What is the fill-factor on the tables? (Should default to 100% but don't
SC> know how you are configured.)

I'm using 100%, but it shouldn't matter as I was giving sizes without the index anyway.

SC> Do the tables have OIDs or not?

No
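Robert's experiment is easy to reproduce at a smaller scale. A sketch with made-up table names and generated data (absolute sizes will differ, but the ratio should be similar):

    -- Same columns, two different physical orders.
    CREATE TEMP TABLE small_orig   (rmid smallint, date date, rmfid smallint, id integer, value real);
    CREATE TEMP TABLE small_packed (rmid smallint, rmfid smallint, date date, id integer, value real);

    INSERT INTO small_orig   SELECT 1, current_date, 1, g, g FROM generate_series(1, 1000000) g;
    INSERT INTO small_packed SELECT 1, 1, current_date, g, g FROM generate_series(1, 1000000) g;

    SELECT pg_size_pretty(pg_relation_size('small_orig'))   AS original_order,
           pg_size_pretty(pg_relation_size('small_packed')) AS smallints_grouped;
    -- The reordered table should come out roughly 15% smaller, in line with the 487MB -> 414MB result.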
"McGehee, Robert" <Robert.McGehee@geodecapital.com> writes: > One might even imagine a future version of PostgreSQL using an > efficient disk layout that may not match the table layout in order to > avoid wasted space from padding. Yeah, this has been discussed multiple times. The sticking point is the extra infrastructure needed to have a physical column numbering different from the user-visible numbering, and the 100% certainty of introducing a lot of bugs due to bits of code using one type of column number where they should have used the other. We'll probably get it done someday, but don't hold your breath ... regards, tom lane
On Wed, Jul 25, 2012 at 1:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "McGehee, Robert" <Robert.McGehee@geodecapital.com> writes:
>> One might even imagine a future version of PostgreSQL using an efficient disk layout that may not match the table layout in order to avoid wasted space from padding.
>
> Yeah, this has been discussed multiple times. The sticking point is the extra infrastructure needed to have a physical column numbering different from the user-visible numbering, and the 100% certainty of introducing a lot of bugs due to bits of code using one type of column number where they should have used the other. We'll probably get it done someday, but don't hold your breath ...

Has there been any discussion of providing the ability to re-order table columns through an ALTER TABLE command? I would love to see this; when I add in a new column, I often want to put it next to something just to be more visually appealing when I'm running ad-hoc queries. It could potentially address this problem as well.

Mike
On Wed, Jul 25, 2012 at 5:20 PM, Mike Christensen <mike@kitchenpc.com> wrote:
> Has there been any discussion of providing the ability to re-order table columns through an ALTER TABLE command? I would love to see this; when I add in a new column, I often want to put it next to something just to be more visually appealing when I'm running ad-hoc queries. It could potentially address this problem as well.

see: http://wiki.postgresql.org/wiki/Alter_column_position
as well as extensive discussion in the archives.

merlin
Mike Christensen <mike@kitchenpc.com> writes:
> On Wed, Jul 25, 2012 at 1:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, this has been discussed multiple times. The sticking point is the extra infrastructure needed to have a physical column numbering different from the user-visible numbering, and the 100% certainty of introducing a lot of bugs due to bits of code using one type of column number where they should have used the other. We'll probably get it done someday, but don't hold your breath ...

> Has there been any discussion of providing the ability to re-order table columns through an ALTER TABLE command?

It's more or less the same discussion. To do either one you need to decouple the internal column order from what the user sees. I do not think we'd bother with building the infrastructure involved if the only application were squeezing out alignment padding; it's really the (constant) requests for some kind of "ALTER TABLE REORDER COLUMNS" feature that make it worth thinking about.

			regards, tom lane
On 07/26/2012 07:12 AM, Tom Lane wrote:
> It's more or less the same discussion. To do either one you need to decouple the internal column order from what the user sees. I do not think we'd bother with building the infrastructure involved if the only application were squeezing out alignment padding;

Is there any way to determine the amount of space lost to alignment padding within a row at the moment - without creation of dummy tables? I don't see any admin functions for determining the on-disk size of a row.

Of course, I've been known to be hopelessly blind before.

--
Craig Ringer
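One rough approach, a sketch only: compare a whole-row pg_column_size() against an assumed 24-byte tuple header (no nulls, no OIDs) plus the summed per-column sizes; the difference is approximately the alignment padding inside the row. Column names are the ones from the "Small" table above.

    SELECT pg_column_size(t.*) AS row_bytes,
           24                  AS assumed_header_bytes,
           pg_column_size(rmid) + pg_column_size(date) + pg_column_size(rmfid)
             + pg_column_size(id) + pg_column_size(value) AS data_bytes,
           pg_column_size(t.*) - 24
             - (pg_column_size(rmid) + pg_column_size(date) + pg_column_size(rmfid)
                + pg_column_size(id) + pg_column_size(value)) AS approx_padding_bytes
    FROM small t
    LIMIT 1;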
On 26/07/12 04:09, McGehee, Robert wrote:
> As a test of Tom's suggestion to group smallints together to avoid alignment problems, I changed the column order from smallint, date, smallint, integer, real TO smallint, smallint, date, integer, real.
>
> This resulted in a "Small" table that is 15% smaller than the original "Small" table (414MB vs 487MB).

Curious, does changing the order to:

    date, real, integer, smallint, smallint

make any difference? More specifically, putting the types into decreasing order of the size of alignment requirements - i.e. putting types which require 8 byte alignment before those requiring 4 byte alignment...

Cheers,
Gavin
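The alignment and size requirements that drive such an ordering can be read straight from the catalogs. A sketch against the reordered table (the name small is assumed):

    SELECT a.attname, t.typname, t.typlen, t.typalign
    FROM pg_attribute a
    JOIN pg_type t ON t.oid = a.atttypid
    WHERE a.attrelid = 'small'::regclass
      AND a.attnum > 0
      AND NOT a.attisdropped
    ORDER BY t.typlen DESC, a.attname;
    -- typalign: 'd' = 8-byte alignment, 'i' = 4-byte, 's' = 2-byte, 'c' = none.
    -- Variable-length types show typlen = -1 and generally go last.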
> > Has there been any discussion of providing the ability to re-order
> > table columns through an ALTER TABLE command? I would love to see
> > this; when I add in a new column, I often want to put it next to
> > something just to be more visually appealing when I'm running ad-hoc
> > queries. It could potentially address this problem as well.
>
> see: http://wiki.postgresql.org/wiki/Alter_column_position
> as well as extensive discussion in the archives.
>
> merlin

Hello,

actually, column positions can be changed with ALTER TABLE using ALTER ... TYPE ... USING. All constraints, indexes and possible serials naturally still need to be handled as well. However, I'm not sure this alternative has any advantage over the method described on the wiki page.

create temp table test (a serial, b varchar);
insert into test (b) values ('a'),('b');

create index a_i on test(a);
create index b_i on test(b);
create index ab_i on test(a,b);

alter table test
  ALTER a TYPE varchar USING b,
  ALTER b TYPE int USING a;

alter table test rename b to _b;
alter table test rename a to b;
alter table test rename _b to a;

select * from test;
\d+ test

Marc Mamin
On Wed, Jul 25, 2012 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Mike Christensen <mike@kitchenpc.com> writes:
>> Has there been any discussion of providing the ability to re-order table columns through an ALTER TABLE command?
>
> It's more or less the same discussion. To do either one you need to decouple the internal column order from what the user sees. I do not think we'd bother with building the infrastructure involved if the only application were squeezing out alignment padding; it's really the (constant) requests for some kind of "ALTER TABLE REORDER COLUMNS" feature that make it worth thinking about.

I don't really think you'd need to decouple the internal column order from what the user sees. A REORDER COLUMNS command should re-build the table with the columns in the specified order. Internally, it should be no different from making a new table, copying all the data over, then deleting the old table. If there's any optimizations that can be done (such as making this faster on large tables), those could be done in future versions. I'd just like to make changing column order easier without remaking the table or renaming columns and changing their data types (as suggested by Marc).

Mike
On Thu, Jul 26, 2012 at 11:02 AM, Mike Christensen <mike@kitchenpc.com> wrote:
> I don't really think you'd need to decouple the internal column order from what the user sees. A REORDER COLUMNS command should re-build the table with the columns in the specified order. Internally, it should be no different from making a new table, copying all the data over, then deleting the old table. If there's any optimizations that can be done (such as making this faster on large tables), those could be done in future versions. I'd just like to make changing column order easier without remaking the table or renaming columns and changing their data types (as suggested by Marc).

That's a controversial point: doing it that way makes reordering of large tables highly impractical. A column map turns that into a catalog update which can be done at any time. I would argue that you can have it both ways: implement the map and have table rebuilding operations (like TRUNCATE and CLUSTER) opportunistically do the physical swap.

merlin
Merlin Moncure <mmoncure@gmail.com> writes:
> On Thu, Jul 26, 2012 at 11:02 AM, Mike Christensen <mike@kitchenpc.com> wrote:
>> I don't really think you'd need to decouple the internal column order from what the user sees. A REORDER COLUMNS command should re-build the table with the columns in the specified order.

> That's a controversial point: doing it that way makes reordering of large tables highly impractical.

In particular, if the implementation works like that, you hardly need any system support at all. You can do the equivalent today with a few SQL commands: create a new table by selecting columns from the old, drop old table, rename new into place. The universal assumption has been that REORDER COLUMNS needs to work by just adjusting a few catalog entries, or it's not worth bothering with.

			regards, tom lane
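The manual equivalent Tom describes looks roughly like this — a sketch using the "Small" table from earlier (indexes, constraints, defaults, triggers and permissions all have to be recreated by hand afterwards):

    BEGIN;
    CREATE TABLE small_reordered AS
        SELECT rmid, rmfid, date, id, value   -- columns listed in the desired order
        FROM small;
    DROP TABLE small;
    ALTER TABLE small_reordered RENAME TO small;
    COMMIT;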
On Thu, Jul 26, 2012 at 9:12 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> That's a controversial point: doing it that way makes reordering of large tables highly impractical. A column map turns that into a catalog update which can be done at any time. I would argue that you can have it both ways: implement the map and have table rebuilding operations (like TRUNCATE and CLUSTER) opportunistically do the physical swap.

Yea, it all comes down to level of effort. It sounds like adding the ability to store columns in a different order than they're displayed would have a huge development cost associated with it, however it would have some other advantages. For example, perhaps Postgres could always optimize the column order on disk automatically (similar to how a compiler does). However, once the syntax is defined for how you re-order a column, the underlying implementation could be changed and improved in future versions without breaking changes.

Mike
On Thu, Jul 26, 2012 at 11:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> That's a controversial point: doing it that way makes reordering of large tables highly impractical.
>
> In particular, if the implementation works like that, you hardly need any system support at all. You can do the equivalent today with a few SQL commands: create a new table by selecting columns from the old, drop old table, rename new into place.

Er, well, if you did that you'd also have to:

1) drop and recreate any foreign keys referring to your table, rebuild triggers, etc.
2) drop and recreate any views and functions with a dependency on the table type
3) if you happened to have inlined the table type directly into another table for purposes of storage, punt (table-based composites seem to suggest only logical reordering is possible anyway, unless you want to cascade the physical reorder).

Definitely non-trivial. CLUSTER requires a lock, but doesn't require messing around with the RI and other dependencies. You'd still have to work out #3 though.

merlin
Tom Lane wrote:
>> That's a controversial point: doing it that way makes reordering of large tables highly impractical.
>
> In particular, if the implementation works like that, you hardly need any system support at all. You can do the equivalent today with a few SQL commands: create a new table by selecting columns from the old, drop old table, rename new into place. The universal assumption has been that REORDER COLUMNS needs to work by just adjusting a few catalog entries, or it's not worth bothering with.

But if the table has indexes, triggers, integrity constraints, check constraints, or default values for columns, moving these after a CREATE TABLE AS SELECT ... is not easy. Personally, every time I had the need to reorder columns, having it as a fast operation was irrelevant to me, whereas figuring out how to deal with the above was the time-consuming part. If we had the feature but it was as slow as, say, CLUSTER, that would already be quite nice.

Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
On 2012-07-25, Steve Crawford <scrawford@pinpointresearch.com> wrote:
> How are you determining the space used by a table?
>
> Why are you assuming 7 bytes for a 3-character value? (Character values up to 126 characters long only have 1-character overhead.)

character values up to 127 octets long only have one octet overhead. three characters could take as many as 18 octets before overhead. (not that I've ever encountered a unicode code point past 2^24 -)

--
⚂⚃ 100% natural
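The short-string overhead is easy to check on stored values. A sketch, assuming a UTF-8 database (the commented numbers are what one would typically expect, not guarantees):

    CREATE TEMP TABLE t_text (v text);
    INSERT INTO t_text VALUES ('abc'), ('αβγ');
    SELECT v, pg_column_size(v) AS stored_octets FROM t_text;
    -- 'abc' should report 4 octets (1-byte header + 3 single-byte characters);
    -- 'αβγ' should report 7 octets (1-byte header + 3 two-byte UTF-8 characters).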