Thread: libpq binary transfer of the numeric data type

libpq binary transfer of the numeric data type

From

Eliot Simcoe

Date:

14 June 2004, 15:28:50

Hello Everyone,

I am writing an Objective-C wrapper of libpq and would like to
implement automatic type conversion between the standard PostgreSQL
data types and associated Objective-C wrapper classes.

it has been a struggle using the binary I/O method supported by
PQexecParams() for both parameter passing and result retrieval due to
the lack of documentation on the internal data representations used by
PostgreSQL, and after reading a number of posts from 1999 frowning upon
its usage I am aware of the issues associate with this approach. I've
searched the lists numerous times for information on the binary
representations of a number of types such as Date and Numeric in vain.
Finally, after about a day of poking header files (literally) I was
able to extract the internal data structures from the PgSQL source
code. Now I am facing the challenge of deciphering their usage.

typedef struct
{
    SInt32        varlen;                    // variable size (std varlena header)
    SInt16        n_weight;                // weight of 1st digit
    UInt16        n_sign_dscale;            // sign + display scale
    SInt8        n_data[1];                // digits
}
sql_numeric;

Could someone please explain to me what the varlen field is used for?
It seems to hold enormous values in my testing, so it couldn't possibly
be the allocated number of bytes or bits... or the number of digits
stored in the n_data field. The n_data field itself is a mystery. I
can't seem to figure out exactly how it is storing the digits.
According to numeric.c n_data is an array of signed short integers.
Each short integer represents four numbers in some sort of
binary-coded-decimal like means. I can extract the values in the first
two shorts, but the rest of the results seem garbled. Here is some
source code illustrating what I have done:

sql_numeric numeric = (sql_numeric)(*((sql_numeric *)[data bytes]));
UInt16 dscale = numeric.n_sign_dscale & NUMERIC_DISPLAY_SCALE_MASK;
NSMutableString *string = [NSMutableString string];
UInt32 i, c = ([data length] - 8) / 2;
UInt32 partShorts = (dscale + 3) / 4;
UInt32 wholeShorts = c - partShorts;

for( i = 0 ; i < wholeShorts ; i++ )
{
    SInt16 value = (SInt16)(((SInt16 *)(numeric.n_data))[i]);
    char block[5];
    UInt32 j = 0;
    char t;

    sprintf( block, "%d", value );

    while( (t = block[j++]) != '\0' )
    {
        [string appendFormat:@"%c", t];
    }
}

[string appendString:@"."];

for( i = wholeShorts ; i < c ; i++ )
{
    SInt16 value = (SInt16)(((SInt16 *)(numeric.n_data))[i]);
    char block[5];
    UInt32 j = 0;
    char t;

    sprintf( block, "%d", value );

    while( (t = block[j++]) != '\0' )
    {
        [string appendFormat:@"%c", t];
    }
}

NSLog( string );

This is obviously a quick hack in a desperate attempt to figure out how
the data is packed, but the results are strange. The input to the above
is the numeric 123456777654.890123. The output (as displayed by
NSLog()) is 123456774353.-267846, indicating that the first two shorts
are decoded "correctly", but the rest is completely wrong. I am
puzzled.

Anyway, my first question is: How can I convert this into a double
value? I know it isn't ideal, but at least it will allow the developer
to access the number is some sort of meaningful way.

My second question is related to an apparent difference between the
documentation and one of the examples provided. Is the data always
converted to big endian byte order when transferred over the network,
or is it dependent on the serving architecture? It seems to me that
this should be standardized, and the third example program seems to
state that it is. I have read, however, posts to these lists indicating
that it is not.

Anyway, sorry for this long winded post, but I have been struggling for
a while and would really appreciate any help anyone can give me.
Thanks in advance,

Eliot Simcoe
Software Engineer
Vantine Imaging, LLC.
315-790-1773

Re: libpq binary transfer of the numeric data type

From

David Stanaway

Date:

14 June 2004, 15:57:20

On Mon, 2004-06-14 at 13:25, Eliot Simcoe wrote:
> Hello Everyone,
> 
> I am writing an Objective-C wrapper of libpq and would like
> toimplement automatic type conversion between the standard
> PostgreSQLdata types and associated Objective-C wrapper classes.

Hi,

I don't use Objective-C but I thought you might want to have a look at
opengroupware.org  They use postgresql as their RDBMS and interface with
it with Objective-C so they already have a wrapper for it.

-- 
David Stanaway <david@stanaway.net>

Re: libpq binary transfer of the numeric data type

From

Eliot Simcoe

Date:

14 June 2004, 16:02:42

  On Jun 14, 2004, at 2:51 PM, Jeroen T. Vermeulen wrote:

> On Mon, Jun 14, 2004 at 02:25:45PM -0400, Eliot Simcoe wrote:
>
>> Could someone please explain to me what the varlen field is used for?
>> It seems to hold enormous values in my testing, so it couldn't 
>> possibly
>> be the allocated number of bytes or bits... or the number of digits
>
> In case you're on a little-endian platform, have you remembered to 
> convert
> it to network byte order first?
>
>
> Jeroen
>
>

The platform I am developing on is big endian, so this shouldn't be an 
issue as far as I understand.

Eliot Simcoe

Re: libpq binary transfer of the numeric data type

From

"Jeroen T. Vermeulen"

Date:

14 June 2004, 16:28:00

On Mon, Jun 14, 2004 at 03:02:10PM -0400, Eliot Simcoe wrote:
> The platform I am developing on is big endian, so this shouldn't be an 
> issue as far as I understand.

That's right.  The data are all in network byte order, i.e. big-endian.

FWIW I'm doing the C++ frontend and I decided not to touch binary transfers
at all for now.  There are other reasons as well, but lack of a clear and
stable description was one of them...

Jeroen

Re: libpq binary transfer of the numeric data type

From

Eliot Simcoe

Date:

14 June 2004, 16:35:10

I'm aware of a few wrappers that exist, but none that do type 
conversion. The particular wrapper in use by opengroupware.org seems to 
rely on Apple's Enterprise Objects framework which is no longer 
supported on Mac OS X AFAIK. In addition, I don't believe it does 
anything but extremely limited type conversion.

On Jun 14, 2004, at 2:57 PM, David Stanaway wrote:

> On Mon, 2004-06-14 at 13:25, Eliot Simcoe wrote:
>> Hello Everyone,
>>
>> I am writing an Objective-C wrapper of libpq and would like
>> toimplement automatic type conversion between the standard
>> PostgreSQLdata types and associated Objective-C wrapper classes.
>
>
> Hi,
>
> I don't use Objective-C but I thought you might want to have a look at
> opengroupware.org  They use postgresql as their RDBMS and interface 
> with
> it with Objective-C so they already have a wrapper for it.
>
> -- 
> David Stanaway <david@stanaway.net>
>
> ---------------------------(end of 
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html

Re: libpq binary transfer of the numeric data type

From

Tom Lane

Date:

14 June 2004, 16:38:05

Eliot Simcoe <esimcoe@mac.com> writes:
> it has been a struggle using the binary I/O method supported by 
> PQexecParams() for both parameter passing and result retrieval due to 
> the lack of documentation on the internal data representations used by 
> PostgreSQL,

As of 7.4 the "binary" representation used on-the-wire is NOT
necessarily the same as the server internal representation; it is
big-endian for one thing, and the layout is simplified where possible.
The documentation is still somewhat lacking; best bet is to read the
source code for the send/recv routines for the data types you are
interested in.

> and after reading a number of posts from 1999 frowning upon 
> its usage I am aware of the issues associate with this approach.

Anything dated 1999 is not very relevant to the approach we are actually
using today.

> Could someone please explain to me what the varlen field is used for? 

varlen is not present in any of the 7.4 on-the-wire formats.  According
to numeric_recv, * External format is a sequence of int16's: * ndigits, weight, sign, dscale, NumericDigits.

Some other relevant comments are
* The value represented by a NumericVar is determined by the sign, weight,* ndigits, and digits[] array.* Note: the
firstdigit of a NumericVar's value is assumed to be multiplied* by NBASE ** weight.    Another way to say it is that
thereare weight+1* digits before the decimal point.  It is possible to have weight < 0.

* dscale, or display scale, is the nominal precision expressed as number* of digits after the decimal point (it must
alwaysbe >= 0 at present).* dscale may be more than the number of physically stored fractional digits,* implying that
wehave suppressed storage of significant trailing zeroes.* It should never be less than the number of stored digits,
sincethat would* imply hiding digits that are present.  NOTE that dscale is always expressed* in *decimal* digits, and
soit may correspond to a fractional number of* base-NBASE digits --- divide by DEC_DIGITS to convert to NBASE digits.

Personally I would suggest using text format instead of binary, at least
for "numeric"-type values.  There really is not enough gain from dealing
with the binary format to justify doing the work and dealing with
possible future incompatibilities.  Binary format makes sense to me for
simple integers and maybe for floats.
        regards, tom lane