Thread: probable faq: internal format of numerics
This is probably a FAQ, but I've googled the question and haven't found an answer. What is the internal format of numerics? I have a situation where I'm dealing with monetary values (so I want to use numerics), but I can probably put fairly hard limits on how many digits of precision I need. So I'm wondering where the stepping points are- the points were adding another digit of precision increases the size of the row. I can think of three different ways numerics could be stored in PostGreSQL: 1) fixed precision integer format. Numeric(a,b) is stored as an n-bit signed binary number (where n is large enough that 2^(n-1) > 10^a), and a integer value i represents the value i * 10^(-b). This means a numeric(18,4) takes up two 32-bit word, while a numeric(19,4) takes up three (or four?) 32-bit words, and the number 12345 represents the value 1.2345. 2) nibble-based BCD Every digit is stored (in BCD form) in a 4-bit nibble, so two digits fit in a byte. Here, a numeric(16,4) takes up 64 bits of space, while a numeric(17,4) takes up 68 (aka 96 or 128) bits of space, and the value 1.2345 is stored as the hexadecimal number 0x12345. 3) byte-based BCD. Same as above, except that only one digit is stored per byte, not two. Generally, you're storing the ASCII text strings if you're doing this, so that the value 1.2345 is stored as the hexadecimal string 0x31 0x32 0x33 0x34 0x35, and a numeric(16,4) takes up 16 bytes, while a numeric(17,4) takes up 17 (20? 24?) bytes. I'm hoping that #1 will be the correct answer, but I haven't been able to find out yet. Sorry for asking a faq. Brian
Brian Hurt <bhurt@janestcapital.com> writes: > This is probably a FAQ, but I've googled the question and haven't found > an answer. What is the internal format of numerics? "Use the source, Luke" http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/numeric.h http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/numeric.c > I can think of three different ways numerics could be stored > in PostGreSQL: It's actually not any of the above, but it's definitely decimal not binary. regards, tom lane
Tom Lane wrote:
This is what I was hoping to not do. It's generally not possible to understand a peice of a project out of context- even well commented and well structured code like the Postgres code looks to be.
That said, it looks like they're doing a variant of my #2- they're holding 5 decimal digits every 2 bytes, plus an extra byte or so for the sign. But consider the NumericVar structure defined in numeric.c- is one of those stored with every row of a table, or are most of the values stored once per table and only the digits stored in the row? From a cursory reading of the code, it's hard for me to tell.
Brian
Brian Hurt <bhurt@janestcapital.com> writes:This is probably a FAQ, but I've googled the question and haven't found an answer. What is the internal format of numerics?"Use the source, Luke"
This is what I was hoping to not do. It's generally not possible to understand a peice of a project out of context- even well commented and well structured code like the Postgres code looks to be.
That said, it looks like they're doing a variant of my #2- they're holding 5 decimal digits every 2 bytes, plus an extra byte or so for the sign. But consider the NumericVar structure defined in numeric.c- is one of those stored with every row of a table, or are most of the values stored once per table and only the digits stored in the row? From a cursory reading of the code, it's hard for me to tell.
Brian
Brian Hurt <bhurt@janestcapital.com> writes: > That said, it looks like they're doing a variant of my #2- they're > holding 5 decimal digits every 2 bytes, plus an extra byte or so for the > sign. Four digits per 2 bytes, actually ... five wouldn't fit (99999 > 65535). > But consider the NumericVar structure defined in numeric.c- is > one of those stored with every row of a table, NumericVar is just a transient calculational representation. NumericData (in numeric.h) is what's stored in the database. So it's basically 8 bytes overhead plus 2 bytes for each group of 4 decimal digits. regards, tom lane