Thread: probable faq: internal format of numerics

probable faq: internal format of numerics

From
Brian Hurt
Date:
This is probably a FAQ, but I've googled the question and haven't found
an answer.  What is the internal format of  numerics?  I have a
situation where I'm dealing with monetary values (so I want to use
numerics), but I can probably put fairly hard limits on how many digits
of precision I need.  So I'm wondering where the stepping points are-
the points were adding another digit of precision increases the size of
the row.  I can think of three different ways numerics could be stored
in PostGreSQL:

1) fixed precision integer format.  Numeric(a,b) is stored as an n-bit
signed binary number (where n is large enough that 2^(n-1) > 10^a), and
a integer value i represents the value i * 10^(-b).  This means a
numeric(18,4) takes up two 32-bit word, while a numeric(19,4) takes up
three (or four?) 32-bit words, and the number 12345 represents the value
1.2345.

2) nibble-based BCD  Every digit is stored (in BCD form) in a 4-bit
nibble, so two digits fit in a byte.  Here, a numeric(16,4) takes up 64
bits of space, while a numeric(17,4) takes up 68 (aka 96 or 128) bits of
space, and the value 1.2345 is stored as the hexadecimal number 0x12345.

3) byte-based BCD.  Same as above, except that only one digit is stored
per byte, not two.  Generally, you're storing the ASCII text strings if
you're doing this, so that the value 1.2345 is stored as the hexadecimal
string 0x31 0x32 0x33 0x34 0x35, and a numeric(16,4) takes up 16 bytes,
while a numeric(17,4) takes up 17 (20?  24?) bytes.

I'm hoping that #1 will be the correct answer, but I haven't been able
to find out yet.  Sorry for asking a faq.

Brian


Re: probable faq: internal format of numerics

From
Tom Lane
Date:
Brian Hurt <bhurt@janestcapital.com> writes:
> This is probably a FAQ, but I've googled the question and haven't found
> an answer.  What is the internal format of  numerics?

"Use the source, Luke"

http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/numeric.h
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/numeric.c

> I can think of three different ways numerics could be stored
> in PostGreSQL:

It's actually not any of the above, but it's definitely decimal not
binary.

            regards, tom lane

Re: probable faq: internal format of numerics

From
Brian Hurt
Date:
Tom Lane wrote:
Brian Hurt <bhurt@janestcapital.com> writes: 
This is probably a FAQ, but I've googled the question and haven't found 
an answer.  What is the internal format of  numerics?   
"Use the source, Luke"
 

This is what I was hoping to not do.  It's generally not possible to understand a peice of a project out of context- even well commented and well structured code like the Postgres code looks to be.

That said, it looks like they're doing a variant of my #2- they're holding 5 decimal digits every 2 bytes, plus an extra byte or so for the sign.  But consider the NumericVar structure defined in numeric.c- is one of those stored with every row of a table, or are most of the values stored once per table and only the digits stored in the row?  From a cursory reading of the code, it's hard for me to tell.

Brian

Re: probable faq: internal format of numerics

From
Tom Lane
Date:
Brian Hurt <bhurt@janestcapital.com> writes:
> That said, it looks like they're doing a variant of my #2- they're
> holding 5 decimal digits every 2 bytes, plus an extra byte or so for the
> sign.

Four digits per 2 bytes, actually ... five wouldn't fit (99999 > 65535).

> But consider the NumericVar structure defined in numeric.c- is
> one of those stored with every row of a table,

NumericVar is just a transient calculational representation.
NumericData (in numeric.h) is what's stored in the database.  So it's
basically 8 bytes overhead plus 2 bytes for each group of 4 decimal
digits.

            regards, tom lane