Thread: libpq binary transfer of the numeric data type
Hello Everyone, I am writing an Objective-C wrapper of libpq and would like to implement automatic type conversion between the standard PostgreSQL data types and associated Objective-C wrapper classes. it has been a struggle using the binary I/O method supported by PQexecParams() for both parameter passing and result retrieval due to the lack of documentation on the internal data representations used by PostgreSQL, and after reading a number of posts from 1999 frowning upon its usage I am aware of the issues associate with this approach. I've searched the lists numerous times for information on the binary representations of a number of types such as Date and Numeric in vain. Finally, after about a day of poking header files (literally) I was able to extract the internal data structures from the PgSQL source code. Now I am facing the challenge of deciphering their usage. typedef struct { SInt32 varlen; // variable size (std varlena header) SInt16 n_weight; // weight of 1st digit UInt16 n_sign_dscale; // sign + display scale SInt8 n_data[1]; // digits } sql_numeric; Could someone please explain to me what the varlen field is used for? It seems to hold enormous values in my testing, so it couldn't possibly be the allocated number of bytes or bits... or the number of digits stored in the n_data field. The n_data field itself is a mystery. I can't seem to figure out exactly how it is storing the digits. According to numeric.c n_data is an array of signed short integers. Each short integer represents four numbers in some sort of binary-coded-decimal like means. I can extract the values in the first two shorts, but the rest of the results seem garbled. Here is some source code illustrating what I have done: sql_numeric numeric = (sql_numeric)(*((sql_numeric *)[data bytes])); UInt16 dscale = numeric.n_sign_dscale & NUMERIC_DISPLAY_SCALE_MASK; NSMutableString *string = [NSMutableString string]; UInt32 i, c = ([data length] - 8) / 2; UInt32 partShorts = (dscale + 3) / 4; UInt32 wholeShorts = c - partShorts; for( i = 0 ; i < wholeShorts ; i++ ) { SInt16 value = (SInt16)(((SInt16 *)(numeric.n_data))[i]); char block[5]; UInt32 j = 0; char t; sprintf( block, "%d", value ); while( (t = block[j++]) != '\0' ) { [string appendFormat:@"%c", t]; } } [string appendString:@"."]; for( i = wholeShorts ; i < c ; i++ ) { SInt16 value = (SInt16)(((SInt16 *)(numeric.n_data))[i]); char block[5]; UInt32 j = 0; char t; sprintf( block, "%d", value ); while( (t = block[j++]) != '\0' ) { [string appendFormat:@"%c", t]; } } NSLog( string ); This is obviously a quick hack in a desperate attempt to figure out how the data is packed, but the results are strange. The input to the above is the numeric 123456777654.890123. The output (as displayed by NSLog()) is 123456774353.-267846, indicating that the first two shorts are decoded "correctly", but the rest is completely wrong. I am puzzled. Anyway, my first question is: How can I convert this into a double value? I know it isn't ideal, but at least it will allow the developer to access the number is some sort of meaningful way. My second question is related to an apparent difference between the documentation and one of the examples provided. Is the data always converted to big endian byte order when transferred over the network, or is it dependent on the serving architecture? It seems to me that this should be standardized, and the third example program seems to state that it is. I have read, however, posts to these lists indicating that it is not. Anyway, sorry for this long winded post, but I have been struggling for a while and would really appreciate any help anyone can give me. Thanks in advance, Eliot Simcoe Software Engineer Vantine Imaging, LLC. 315-790-1773
On Mon, 2004-06-14 at 13:25, Eliot Simcoe wrote: > Hello Everyone, > > I am writing an Objective-C wrapper of libpq and would like > toimplement automatic type conversion between the standard > PostgreSQLdata types and associated Objective-C wrapper classes. Hi, I don't use Objective-C but I thought you might want to have a look at opengroupware.org They use postgresql as their RDBMS and interface with it with Objective-C so they already have a wrapper for it. -- David Stanaway <david@stanaway.net>
On Jun 14, 2004, at 2:51 PM, Jeroen T. Vermeulen wrote: > On Mon, Jun 14, 2004 at 02:25:45PM -0400, Eliot Simcoe wrote: > >> Could someone please explain to me what the varlen field is used for? >> It seems to hold enormous values in my testing, so it couldn't >> possibly >> be the allocated number of bytes or bits... or the number of digits > > In case you're on a little-endian platform, have you remembered to > convert > it to network byte order first? > > > Jeroen > > The platform I am developing on is big endian, so this shouldn't be an issue as far as I understand. Eliot Simcoe
On Mon, Jun 14, 2004 at 03:02:10PM -0400, Eliot Simcoe wrote: > The platform I am developing on is big endian, so this shouldn't be an > issue as far as I understand. That's right. The data are all in network byte order, i.e. big-endian. FWIW I'm doing the C++ frontend and I decided not to touch binary transfers at all for now. There are other reasons as well, but lack of a clear and stable description was one of them... Jeroen
I'm aware of a few wrappers that exist, but none that do type conversion. The particular wrapper in use by opengroupware.org seems to rely on Apple's Enterprise Objects framework which is no longer supported on Mac OS X AFAIK. In addition, I don't believe it does anything but extremely limited type conversion. On Jun 14, 2004, at 2:57 PM, David Stanaway wrote: > On Mon, 2004-06-14 at 13:25, Eliot Simcoe wrote: >> Hello Everyone, >> >> I am writing an Objective-C wrapper of libpq and would like >> toimplement automatic type conversion between the standard >> PostgreSQLdata types and associated Objective-C wrapper classes. > > > Hi, > > I don't use Objective-C but I thought you might want to have a look at > opengroupware.org They use postgresql as their RDBMS and interface > with > it with Objective-C so they already have a wrapper for it. > > -- > David Stanaway <david@stanaway.net> > > ---------------------------(end of > broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html
Eliot Simcoe <esimcoe@mac.com> writes: > it has been a struggle using the binary I/O method supported by > PQexecParams() for both parameter passing and result retrieval due to > the lack of documentation on the internal data representations used by > PostgreSQL, As of 7.4 the "binary" representation used on-the-wire is NOT necessarily the same as the server internal representation; it is big-endian for one thing, and the layout is simplified where possible. The documentation is still somewhat lacking; best bet is to read the source code for the send/recv routines for the data types you are interested in. > and after reading a number of posts from 1999 frowning upon > its usage I am aware of the issues associate with this approach. Anything dated 1999 is not very relevant to the approach we are actually using today. > Could someone please explain to me what the varlen field is used for? varlen is not present in any of the 7.4 on-the-wire formats. According to numeric_recv, * External format is a sequence of int16's: * ndigits, weight, sign, dscale, NumericDigits. Some other relevant comments are * The value represented by a NumericVar is determined by the sign, weight,* ndigits, and digits[] array.* Note: the firstdigit of a NumericVar's value is assumed to be multiplied* by NBASE ** weight. Another way to say it is that thereare weight+1* digits before the decimal point. It is possible to have weight < 0. * dscale, or display scale, is the nominal precision expressed as number* of digits after the decimal point (it must alwaysbe >= 0 at present).* dscale may be more than the number of physically stored fractional digits,* implying that wehave suppressed storage of significant trailing zeroes.* It should never be less than the number of stored digits, sincethat would* imply hiding digits that are present. NOTE that dscale is always expressed* in *decimal* digits, and soit may correspond to a fractional number of* base-NBASE digits --- divide by DEC_DIGITS to convert to NBASE digits. Personally I would suggest using text format instead of binary, at least for "numeric"-type values. There really is not enough gain from dealing with the binary format to justify doing the work and dealing with possible future incompatibilities. Binary format makes sense to me for simple integers and maybe for floats. regards, tom lane