Re: Portable interfaces ... - Mailing list pgsql-interfaces
From | Preston A. Elder |
---|---|
Subject | Re: Portable interfaces ... |
Date | |
Msg-id | 1080412297.23111.147.camel@temple.srealm.net.au Whole thread Raw |
In response to | Re: Portable interfaces ... (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: Portable interfaces ...
|
List | pgsql-interfaces |
On Fri, 2004-03-26 at 05:22, Peter Eisentraut wrote: > I don't understand what the API has to do with Unicode. Either your > strings are in Unicode or they are not. This is set by your > application. OK, lets say I'm writing a library that is not supposed to know anything at all about the database structure. I'm just writing something to give the client-programmer a generic interface to many different storage types (postgres, berkelydb, an XML file, etc). In this instance, they may give me a hint as to whether to allow standard or wide strings. I need to: 1) verify that the database will allow wide (multi-byte) strings if they have chosen to. 2) find out the encoding style of that db/table so I know how to encode it into an multi-byte string (since I don't think the libpq API accepts wchar_t strings, but I think it accepts multi-byte strings). 3) Also know the encoding style so that I can actually encode my wchar_t strings into multi-byte strings (and vice versa) - since this is very locale dependant. As I said, unicode in postgres seems to be an afterthought - this is also evidenced that encoding can only be specified at the database level, when most databases I've seen allow it down to the column level (with inheritance from table then database). I found out the encoding is available with pg_result->content_encoding. However 1) The definition of pg_result is not part of the 'standard' API (its in the 'private' libpq headers). So again, I cannot rely on systems having it defined. 2) It is an integer, which maps more to an internal encoding type than to anything I can use with a standard multibyte<->wchar_t function. > > For now, I am (grudgingly) copying the OID codes, and going to use > > system-based multibyte/unicode functions (and hope to heck that they > > are compatible with postgres, but I get the feeling they are not > > fully compatible after browsing pg_wchar.h) > They should be. I should also not have to 'hope it works' for standard (non-user defined) types. Nor should I have to execute a select to figure out the oid type for a standard type - which incurs not just the penalty of having to do a select and waiting for SQL to get back to me, but also a string comparison of the results (often repeatedly - strcmp(result, "bool")==0, strcmp(result, "int4")==0, ...) to check what it is. I realise I could just do this on startup, and build my own table, but its still a CPU/wait time penalty. And copying constants out of elsewhere when the constants should be provided is also rather hackish. > > is, right now, I have no verification on things like string lengths > > because theres no real way to find out the maximum length of a field > > (encoding-specific - ie. for UTF-8, a length of 20 means 20 chars, > > for UTF-16, a length of 20 means 20 wchar_t's (40 bytes)). > The libpq API gives you a way to determine the binary length of a > returned datum. You need to allocate that dynamically. I'm not talking about allocation here. I'm talking about string length. Or more correctly, maximum string length. I want to find out from the database the maximum length of a string I can pass - even if its in bytes (in which case I divide it by 2 for 2-byte encoding systems ... kindof). Which is very different to how many bytes it uses up in the database's storage. > The plans are more or less that if you don't like it, try using a > different one, such as ODBC, or if a different programming language > than C. Which is the wrong attitude for a database product trying to compete in what is turning into a more and more crowded field. I'm not using ODBC for one simple reason - I don't want the users of my library to have to configure an ODBC interface to their database, etc. Nor do I want them to have to install anything extra apart from postgres that they may have to install for ODBC support. I'm trying to keep my library as native as possible with database interfaces, which is why I'm not even using libpq++. For now, I'm going to have to hack my way around these issues, with a series of ugly hacks - but there is no way I could ever in good concience recommend any database application developer who is writing code in C or C++ use postgres as their back end database without serious API changes that will actually provide decent information on standard types and column/table/database configuration. Without having to manually do some selects, of which are not documented anywhere. For example, there is nowhere in the docs that tells me "To get the OID of column 'y' in table 'x', do: SELECT atttypid FROM pg_attribute WHERE attrelid = (SELECT relfilenode FROM pg_class WHERE relname = 'x') AND attname = 'y' ". That would be an extremely useful statement to have in the docs. Then even if the API did not support such blatantly obvious things like this, I would at least know how to hack it without having to research what I imagine would be commonly needed functionality in any reasonably complex application or library where the application/library is not supposed to know, or rely on the underlying data types. -- PreZ Founder The Neuromancy Society http://www.neuromancy.net
pgsql-interfaces by date: