Thread: libpq / SQL3
I'm looking at implementing the SQL99 C interface, which looks more or less reasonable. There are some annoyances, however.

The API to describe the returned result, SQLDescribeCol, takes a pointer to a SQLSMALLINT to return the type of the returned column. There are a whole lot of #define values in the standards document with specified values for each of the standard types. This is annoying because an Oid is bigger than a SQLSMALLINT, and the postgres oid values are not the same as the standard's #define values.

Now what is tempting to do is to change the API to instead take a pointer to an Oid, and redefine the #define values to the standard oid values for postgres. However, this would obviously be a change to the API. Or I could define a new API, which kinda defeats the purpose of using a standard API, since the standard would become largely useless for postgres users. Any thoughts?

I'm tempted to define a new datatype

	typedef Oid SQLDATATYPE;

This is what the standard should have done IMHO. It would be one of those minor incompatibilities that people trying to write portable code could easily fix to be portable between implementations, simply by declaring this variable as a SQLDATATYPE instead of a SQLSMALLINT. Or I could go for a custom API.

I guess it's probably all a bit of a wasted argument since only postgres will have implemented the API. Maybe we can set the standard?
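[Editor's note: the width clash can be shown in a few lines of C. SQLDATATYPE is the hypothetical typedef proposed above, not anything from the standard header, and fits_in_smallint is a made-up helper for illustration.]

```c
#include <assert.h>

typedef short SQLSMALLINT;   /* what the CLI header mandates          */
typedef unsigned int Oid;    /* PostgreSQL's 4-byte object identifier */

/* The proposed implementation-defined alias: wide enough for an Oid,
 * so CLI type codes and Postgres oids could share one field. */
typedef Oid SQLDATATYPE;

/* A built-in oid such as 1043 (varchar) happens to fit in a short,
 * but nothing guarantees that for user-defined types, whose oids
 * can easily exceed 32767. */
int fits_in_smallint(Oid oid)
{
    return oid <= 32767;
}
```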
Chris Bitmead writes:

> I'm looking at implementing the SQL99 C interface, which looks more or
> less reasonable. There are some annoyances however.
>
> The API to describe the return result SQLDescribeCol takes a pointer
> to a SQLSMALLINT to return the type of the returned column. There are
> a whole lot of #define values in the standards document with specified
> values for each of the standard types. This is annoying because an Oid
> is bigger than a SQLSMALLINT, and the postgres oid values are not the
> same as the standards #define values.

Then it seems we need to add a column to pg_type to keep track of the "sqltypeid" as an int2. It would be annoying but doable. The alternative for the moment would be to hard-code the translation at the client side, i.e., have SQLDescribeCol translate the oid it received to some standard number, but that would have obvious problems with user-defined types.

> I'm tempted to define a new datatype
> typedef Oid SQLDATATYPE;
> This is what the standard should have done IMHO.

The standard doesn't require that system catalogs are implemented as user-space tables, but equating types to oids would have effectively imposed that requirement.

> I guess it's probably all a bit of a wasted argument since only
> postgres will have implemented the API. Maybe we can set the standard?

I wonder. I doubt that they invented this API out of the blue. (Although quite honestly it sometimes looks like it. Note how they religiously avoid pointers everywhere.) It looks like a great goal to achieve, though. Having a standard as reference is always good. ("It is so because SQL says so." "Your API might be nice, but ours is standard.")

Btw., I've been considering implementing this as a rather thin layer on top of libpq, the idea being that those who want to write portable applications can use SQL/CLI, and those who want to use Postgres-specific features use libpq. I guess you'd rather completely replace libpq?
I'd be afraid of effectively abandoning libpq, with everything that's built upon it.

--
Peter Eisentraut        Sernanders väg 10:115
peter_e@gmx.net         75262 Uppsala
http://yi.org/peter-e/  Sweden
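[Editor's note: the client-side translation described above might look something like this hypothetical table inside SQLDescribeCol. The SQL_* codes are the ones listed in the standard's sqlcli.h, the oids are the built-in Postgres ones, and oid_to_cli_type is a made-up helper name.]

```c
#include <assert.h>

typedef short SQLSMALLINT;
typedef unsigned int Oid;

/* Type codes from the sqlcli.h listing in the standard */
enum { SQL_CHAR = 1, SQL_INTEGER = 4, SQL_SMALLINT = 5, SQL_VARCHAR = 12 };

/* Hypothetical client-side table: built-in Postgres oids only.
 * User-defined types fall through to 0 ("no standard code"),
 * which is exactly the problem being discussed. */
static SQLSMALLINT oid_to_cli_type(Oid typoid)
{
    switch (typoid)
    {
        case 18:   return SQL_CHAR;      /* char    */
        case 21:   return SQL_SMALLINT;  /* int2    */
        case 23:   return SQL_INTEGER;   /* int4    */
        case 1043: return SQL_VARCHAR;   /* varchar */
        default:   return 0;             /* user-defined: no standard code */
    }
}
```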
> Then it seems we need to add a column to pg_type to keep track of the
> "sqltypeid" as an int2. It would be annoying but doable.

That occurred to me, but it doesn't seem worth cluttering up the back-end to achieve.

> The alternative for the moment would be to hard-code the translation at
> the client side, i.e., have SQLDescribeCol translate the oid it received
> to some standard number, but that would have obvious problems with
> user-defined types.

For user-defined types, you're out of luck anyway as far as strictly following the standard. I think what you're saying is strictly follow the standard anyway. I guess you're right; it's just annoying when the standard is slightly lame but could be fixed with some ever-so-subtle changes.

> > I'm tempted to define a new datatype
> > typedef Oid SQLDATATYPE;
> > This is what the standard should have done IMHO.
>
> The standard doesn't require that system catalogs are implemented as
> user-space tables, but equating types to oids would have effectively
> imposed that requirement.

What I'm saying is that if the standard allowed for an SQLDATATYPE type, whose exact type is implementation-defined, then implementations could choose without affecting portability.

> > I guess it's probably all a bit of a wasted argument since only
> > postgres will have implemented the API. Maybe we can set the standard?
>
> I wonder. I doubt that they invented this API out of the blue. (Although
> quite honestly it sometimes looks like it. Note how they religiously avoid
> pointers everywhere.)

Yes, that is strange. I started off by defining various types as structures, thinking that all those SQLSMALLINTs everywhere were either suggestions, or put there because they didn't have an idea of what the implementation might do. Later I realised that maybe they really mean them to be short ints, and I wonder whether I should change them back or whether the client doesn't need to know.
Looks like I'll change them back, I guess, to be really strict about it.

> It looks like a great goal to achieve though. Having a standard as
> reference is always good. ("It is so because SQL says so." "Your API
> might be nice, but ours is standard.")
>
> Btw., I've been considering implementing this as a rather thin layer on
> top of libpq,

That would be worth considering, except that part of the idea of me going to the new API is to avoid some of the annoyances of libpq, such as its non-streaming nature. Maybe when everyone is comfortable with the stability of the new library, libpq can be redone in terms of SQL3? It would be pretty easy, I think.

> the idea being that those who want to write portable applications can
> use SQL/CLI, and those who want to use Postgres-specific features use
> libpq. I guess you'd rather completely replace libpq? I'd be afraid of
> effectively abandoning libpq, with everything that's built upon it.
Peter Eisentraut <peter_e@gmx.net> writes:

> Then it seems we need to add a column to pg_type to keep track of the
> "sqltypeid" as an int2. It would be annoying but doable. The alternative
> for the moment would be to hard-code the translation at the client side,
> i.e., have SQLDescribeCol translate the oid it received to some standard
> number, but that would have obvious problems with user-defined types.

But there are no standard numbers for user-defined types, now are there? Might as well use the type OID for them.

Adding another column to pg_type inside the backend is not too hard, but to transmit that data to the frontend in every query would mean an incompatible protocol change, which is a much greater amount of pain. I doubt it's worth it. Putting the translation table into SQLDescribeCol is no worse than having the ODBC driver do a similar translation, which no one has complained about in my recollection.

			regards, tom lane
Tom Lane wrote:
>
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Then it seems we need to add a column to pg_type to keep track of the
> > "sqltypeid" as an int2. It would be annoying but doable. The alternative
> > for the moment would be to hard-code the translation at the client side,
> > i.e., have SQLDescribeCol translate the oid it received to some standard
> > number, but that would have obvious problems with user-defined types.
>
> But there are no standard numbers for user-defined types, now are there?

Well, the standard lists numbers for each type, with the comment in the header:

    sqlcli.h Header File for SQL CLI.
    * The actual header file must contain at least the information
    * specified here, except that the comments may vary.

So if you are pedantic, I guess you have to use their numbers?? The other problem, as I said, is that their type is a short, whereas Oid is a long, so there is no guarantee it will fit. I guess the core types will fit because they happen to be smaller than this.

> Might as well use the type OID for them.
>
> Adding another column to pg_type inside the backend is not too hard,
> but to transmit that data to the frontend in every query would mean
> an incompatible protocol change, which is a much greater amount of
> pain. I doubt it's worth it. Putting the translation table into
> SQLDescribeCol is no worse than having the ODBC driver do a similar
> translation, which no one has complained about in my recollection.

I agree.
Chris Bitmead <chris@bitmead.com> writes:
> Tom Lane wrote:
>> But there are no standard numbers for user-defined types, now are there?

> Well the standard lists numbers for each type, ...
> So if you are pedantic I guess you have to use their numbers?? The other
> problem as I said is that their type is a short, whereas Oid is a long,
> so there is no guarantee it will fit.

I'd read that as saying that you have to use their numbers for the types that are called out in the standard. But user-defined types cannot be called out in the standard (or have they standardized prescience as well?), so we're on our own about how to represent those.

>> Might as well use the type OID for them.

I had second thoughts about this, because one of the things I think will be happening in the not-too-distant future is that we'll be offering a configure-time choice about whether OID is 4 or 8 bytes (that is, long or long long). I suspect it'd be a bad idea to have core aspects of libpq's API change in a binary-incompatible fashion depending on a server configuration choice.

What might be the best bet is for this translation function to return "short" as in the spec, with the spec-defined values for the datatypes known to the spec, and a single "UNKNOWN" value for everything else. Apps that need to tell the difference among user-defined types could look at either the type OID or the type name, taking a binary-compatibility risk if they insist on using the OID in binary form (as long as they treat it as an ASCII string they probably aren't affected by 4 vs 8 bytes...). But a bog-standard app would never look at either, because it's only using bog-standard datatypes, no?

			regards, tom lane
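[Editor's note: a sketch of Tom's scheme. SQL_UDT_UNKNOWN is an invented sentinel, not a code from the spec; the point is only that every non-standard type maps to one value, and apps that care fall back to the type name or the oid rendered as ASCII text, which stays stable whether Oid is 4 or 8 bytes.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef short SQLSMALLINT;
typedef unsigned int Oid;

enum { SQL_INTEGER = 4, SQL_VARCHAR = 12, SQL_UDT_UNKNOWN = -9999 };

/* Spec-defined codes for spec-defined types; one catch-all otherwise. */
SQLSMALLINT describe_type(Oid typoid)
{
    switch (typoid)
    {
        case 23:   return SQL_INTEGER;   /* int4    */
        case 1043: return SQL_VARCHAR;   /* varchar */
        default:   return SQL_UDT_UNKNOWN;
    }
}

/* Render the oid as text for apps that must distinguish UDTs without
 * taking a binary-compatibility risk on the width of Oid. */
void oid_as_text(Oid typoid, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%u", typoid);
}
```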
> What might be the best bet is for this translation function to return
> "short" as in the spec, with the spec-defined values for the datatypes
> known to the spec, and a single "UNKNOWN" value for everything else.
> Apps that need to tell the difference among user-defined types could
> look at either the type OID or the type name, taking a binary-
> compatibility risk if they insist on using the OID in binary form
> (as long as they treat it as an ASCII string they probably aren't
> affected by 4 vs 8 bytes...) But a bog-standard app would never look
> at either, because it's only using bog-standard datatypes, no?

So you are saying map to the standard-defined values. Good idea.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
> What might be the best bet is for this translation function to return
> "short" as in the spec, with the spec-defined values for the datatypes
> known to the spec, and a single "UNKNOWN" value for everything else.
> Apps that need to tell the difference among user-defined types could
> look at either the type OID or the type name, taking a binary-
> compatibility risk if they insist on using the OID in binary form
> (as long as they treat it as an ASCII string they probably aren't
> affected by 4 vs 8 bytes...) But a bog-standard app would never look
> at either, because it's only using bog-standard datatypes, no?

I agree, but perhaps for different reasons: I don't see any other choice.

I'm making good progress on implementing the SQL99 API, but it is a lot trickier than I thought. libpq is cruftier than meets the eye.

Can anybody (i.e. Peter :) provide any insight on how the SQL99 API handles variable length datatypes where you don't know the length in a particular tuple in advance?
Chris Bitmead writes:

> Can anybody (i.e. Peter :) provide any insight on how the SQL99 API
> handles variable length datatypes where you don't know the length in a
> particular tuple in advance?

Clause 5.9 "Character string retrieval" might provide some insight, although it's probably not what you had hoped for.

    T  = target (where you want to store it)
    TL = length in octets of T
    L  = length of value
    V  = the value

    b) Otherwise, let NB be the length in octets of a null terminator
       in the character set of T.

       Case:

       i) If L is not greater than (TL - NB), then the first (L + NB)
          octets of T are set to V concatenated with a single
          implementation-defined null character that terminates a C
          character string. The values of the remaining characters of
          T are implementation-dependent.

       ii) Otherwise, T is set to the first (TL - NB) octets of V
           concatenated with a single implementation-defined null
           character that terminates a C character string, and a
           completion condition is raised: warning: string data,
           right truncation.

So highly robust applications would have to call DescribeCol before any GetData or similar call in order to allocate a sufficiently sized buffer. Which is a problem if DescribeCol doesn't know about user-defined data types.

But remember that SQL does not provide any variable-without-limit length types, so there is theoretically never any uncertainty about what kind of buffer to allocate if you know the query.
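[Editor's note: stripped of the standardese, the clause 5.9 rule above is an ordinary truncating copy. A minimal sketch, with the completion condition modeled as a return flag; cli_copy_string is an invented name, and it assumes TL >= 1.]

```c
#include <assert.h>
#include <string.h>

/* Copy value V (length L octets) into target T of size TL octets, per
 * the clause 5.9 rule: always NUL-terminate, and report right
 * truncation when the value did not fit.
 * Returns 1 on truncation ("warning: string data, right truncation"),
 * 0 when the whole value fit.  Assumes TL >= 1. */
int cli_copy_string(char *T, size_t TL, const char *V, size_t L)
{
    size_t NB = 1;                 /* octets in a C NUL terminator */

    if (L <= TL - NB) {
        memcpy(T, V, L);
        T[L] = '\0';
        return 0;                  /* case i: whole value fit */
    }
    memcpy(T, V, TL - NB);
    T[TL - NB] = '\0';
    return 1;                      /* case ii: right truncation */
}
```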
Peter Eisentraut wrote:

> So highly robust applications would have to call DescribeCol before any
> GetData or similar call in order to allocate a sufficiently sized buffer.
> Which is a problem if DescribeCol doesn't know about user-defined data
> types.

DescribeCol can be made to know about all data types. The problem is that I don't think DescribeCol is designed to be called after every fetch, so it doesn't know how big each entry is.

> But remember that SQL does not provide any variable-without-limit length
> types, so there is theoretically never any uncertainty about what kind of
> buffer to allocate if you know the query.

Pretty lame. But I saw somewhere in the document that GetData is able to retrieve big fields piece by piece, though I could never figure out how that is supposed to happen. Then there is the stuff about handling blobs; I get the feeling from some of the wording that this interface is supposed to handle any big field, but it's also a bit obscure.
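[Editor's note: for what it's worth, piecewise retrieval in ODBC-style APIs works by calling GetData repeatedly on the same column: each call fills the buffer with the next chunk and raises the truncation warning until the value is exhausted. A toy model of that idea, not the real CLI entry point; struct cursor and get_data_piece are invented for illustration.]

```c
#include <assert.h>
#include <string.h>

/* Toy model: each call hands out the next chunk of the remaining value,
 * NUL-terminated, and reports whether more data remains (which in the
 * real API would be signaled as a truncation warning). */
struct cursor { const char *rest; };

int get_data_piece(struct cursor *c, char *buf, size_t buflen)
{
    size_t avail = strlen(c->rest);
    size_t n = avail < buflen - 1 ? avail : buflen - 1;

    memcpy(buf, c->rest, n);
    buf[n] = '\0';
    c->rest += n;
    return *c->rest != '\0';       /* 1 while more pieces remain */
}
```

Callers would loop, reassembling the field from fixed-size pieces, which is how a long value could be fetched without knowing its length in advance.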