Thread: Alternative new libpq interface.
Some people suggested it might be a good idea to define a new interface, maybe call it libpq2. (Actually this would be a good time to abandon pq = postquel in favour of libpg.a.) I'll therefore put forward the following proposal as a possible starting point. I'm not particularly committed to either this proposal, my previous proposal, or perhaps Peter's proposal. A new interface is probably the cleanest, but the current library probably isn't all bad either.

My idea is that there should be a very low level interface that has a minimum of bloat and features and caching and copying. This would be especially nice for me writing an ODMG interface, because the ODMG interface will need to cache and copy things about, so having libpq doing it too is extra overhead. It could also form the basis of a trivial re-implementation of the current libpq in terms of this interface.

So the criteria I used for the low level interface are...

*) Future-proof. In preference to a PGconnect routine with 6 different arguments, you create an empty PG_Connection, set various attributes via setter functions and then call connect. That way it is future-proof against needing more arguments. Similar for execQuery.

*) Speed. Lean and mean. We can write a fancier interface on top of this for people who want convenience over speed; at this point I haven't attempted to design one. Thus the getValue routine (pg_value below) is not null-terminated. The higher-level interface can make sure of that if needed. In any case some sorts of data may contain nulls.

The main thing I dislike about the current interface is that it's not low-level enough. It won't let me get around the features that I don't want (like caching the entire result).

Ok guys, which way do you want me to go? Or will someone else come up with something better?

/* The Postgres Low-Level Interface */

typedef int PG_ErrorCode;

/* Just creates an empty connection object. Like C++ new() */
PG_Connection *pg_newConnection(void);
void pg_freeConnection(PG_Connection *con);

/* setter functions */
void pg_setDb(PG_Connection *con, char *db);
void pg_setUserName(PG_Connection *con, char *name);
void pg_setPassword(PG_Connection *con, char *password);

/* Connect to the database. TRUE for success */
PG_Boolean pg_connect(PG_Connection *con);

/* Find out the error code for what happened */
/* In the future there should be a unified error code system */
PG_ErrorCode pg_connect_error(PG_Connection *con);

/* Just creates an empty query object */
PGquery *pg_newQuery(PG_Connection *con);
void pg_freeQuery(PGquery *q);

/* setter function */
void pg_setSQL(PGquery *q, char *query);

/* Executes the query */
PG_Boolean pg_exec_sql(PGquery *q);

typedef int PG_NextStatus;
#define PG_NEXT_EOF     0   /* No more records */
#define PG_NEXT_OK      1   /* Returned a record */
#define PG_NEXT_ERROR  -1   /* Error */

/* get the next record */
PG_NextStatus pg_next(PG_Connection *con);

/* did the last record returned mark the start of a new group? */
PG_Boolean pg_new_group(PGquery *q);

typedef int PG_Length;

/* Get the data from a field, specifying the field number and
   returning the length of the data */
void *pg_value(PGquery *q, int field_num, PG_Length *len);

PG_Boolean pg_is_null(PGquery *q, int field_num);

/* If update/insert or delete, returns the number of rows affected */
int pg_num_rows_affected(PGquery *q);

/* Returns the oid of the last inserted object */
Oid pg_last_oid(PGquery *q);

/* Get the field name */
char *pg_field_name(PGquery *q, int field_num);

/* Get the field type */
Oid pg_field_type(PGquery *q, int field_num);

/* Find out the error code for what happened */
/* In the future there should be a unified error code system */
PG_ErrorCode pg_query_error(PGquery *q);

/* Get a meaningful error message for a code */
char *pg_errorMessage(PG_ErrorCode code);
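Roughly, a caller would drive it like this. This is only a sketch of the intended call pattern, not working code: the database name, user, password and query text are placeholders, and error handling is kept minimal.

/* Sketch of the intended call pattern; placeholder values throughout. */
PG_ErrorCode run_example(void)
{
    PG_Connection *con = pg_newConnection();
    PGquery *q;
    PG_NextStatus st;

    pg_setDb(con, "mydb");                     /* placeholder connection settings */
    pg_setUserName(con, "me");
    pg_setPassword(con, "secret");
    if (!pg_connect(con))
    {
        PG_ErrorCode err = pg_connect_error(con);
        pg_freeConnection(con);
        return err;
    }

    q = pg_newQuery(con);
    pg_setSQL(q, "SELECT * FROM person*");     /* placeholder query over a class hierarchy */
    if (pg_exec_sql(q))
    {
        while ((st = pg_next(con)) == PG_NEXT_OK)
        {
            PG_Length len;
            void *name = pg_value(q, 0, &len); /* not null-terminated; len gives the size */

            /* ... copy or use the value before fetching the next record ... */
        }
        if (st == PG_NEXT_ERROR)
        {
            /* an error can arrive partway through the stream */
        }
    }
    pg_freeQuery(q);
    pg_freeConnection(con);
    return 0;
}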
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> The main thing I dislike about the current interface is that it's not
> low-level enough. It won't let me get around the features that I don't
> want (like caching the entire result).

Bear in mind that "avoiding the features you don't want" is not cost-free. In particular, I have seen no discussion in this thread of the implications that streaming read would have for error handling.

In the current libpq, you either get a complete error-free result set or you don't. If there is to be a streaming interface then it must take into account the possibility of an error partway through the fetch. Applications that use the interface will also incur extra complexity from having to undo whatever they might have done with the initial part of the result data.

Still, something along the lines of your sketch seems worth pursuing. Personally I've never once had any use for the "random access to result set" aspect of libpq's API, so it seems like buffering the whole set is a pretty high price to pay for a small simplification in error handling.

My gut feeling about this is that if a complete rewrite is being considered, it ought to be done as a new interface library that's independent of libpq. libpq has its limitations, but it's moderately well debugged and lots of apps depend on it. A rewrite will need time to stabilize and to attract new apps --- unless you want to guarantee 100.00% backward compatibility, which I bet you won't.

regards, tom lane
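To spell out the extra application burden: with streaming, the error can arrive after the application has already consumed part of the result, so it has to be able to undo that partial work. A sketch, illustrative only, using the function names from Chris's proposal:

#include <stdlib.h>
#include <string.h>

/* Collect column 0 of every row; on a mid-stream error, throw away
   everything accumulated so far. */
int fetch_names(PG_Connection *con, PGquery *q, char ***names_out, int *count_out)
{
    char **names = NULL;
    int n = 0;
    PG_NextStatus st;

    while ((st = pg_next(con)) == PG_NEXT_OK)
    {
        PG_Length len;
        void *v = pg_value(q, 0, &len);
        char *copy = malloc(len + 1);

        memcpy(copy, v, len);
        copy[len] = '\0';
        names = realloc(names, (n + 1) * sizeof(char *));
        names[n++] = copy;
    }

    if (st == PG_NEXT_ERROR)
    {
        while (n > 0)                  /* undo the work done on the partial result */
            free(names[--n]);
        free(names);
        return -1;
    }

    *names_out = names;
    *count_out = n;
    return 0;
}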
> My gut feeling about this is that if a complete rewrite is being
> considered, it ought to be done as a new interface library that's
> independent of libpq.

I was thinking more along the lines of massaging the current libpq to support the new interface/features rather than starting with a blank slate. As you say libpq is well debugged and there are a lot of fine details in there I don't want to mess with.

My aims are to get the OO features and streaming behaviour working with a hopefully stable interface.

Does that affect your gut feeling? Your error observations are significant and I think they dismiss my 1st suggestion. That leaves the possibilities of the whole new interface versus massaging the current interface with streaming/grouping APIs.
On Thu, 6 Jul 2000, Tom Lane wrote:

> Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> > The main thing I dislike about the current interface is that it's not
> > low-level enough. It won't let me get around the features that I don't
> > want (like caching the entire result).
>
> Bear in mind that "avoiding the features you don't want" is not
> cost-free. In particular, I have seen no discussion in this thread
> of the implications that streaming read would have for error handling.
>
> In the current libpq, you either get a complete error-free result set
> or you don't. If there is to be a streaming interface then it must
> take into account the possibility of an error partway through the
> fetch. Applications that use the interface will also incur extra
> complexity from having to undo whatever they might have done with
> the initial part of the result data.
>
> Still, something along the lines of your sketch seems worth pursuing.
> Personally I've never once had any use for the "random access to result
> set" aspect of libpq's API, so it seems like buffering the whole set
> is a pretty high price to pay for a small simplification in error
> handling.
>
> My gut feeling about this is that if a complete rewrite is being
> considered, it ought to be done as a new interface library that's
> independent of libpq. libpq has its limitations, but it's moderately
> well debugged and lots of apps depend on it. A rewrite will need time
> to stabilize and to attract new apps --- unless you want to guarantee
> 100.00% backward compatibility, which I bet you won't.

Agreed, which was why I had suggested going to a libpq2 and leaving the current libpq intact ... but, I was always confused as to why pq vs pg, so Chris going to a libpg.a sounds like a really nice way to accomplish this without causing any headaches with 'legacy apps' that are tied to libpq ...

What I'd suggest is leave libpq in for a few releases, until libpg stabilizes, and then look at removing it and directing ppl over to libpg ...
On Thu, 6 Jul 2000, Chris Bitmead wrote:

> > My gut feeling about this is that if a complete rewrite is being
> > considered, it ought to be done as a new interface library that's
> > independent of libpq.
>
> I was thinking more along the lines of massaging the current libpq to
> support the new interface/features rather than starting with a blank
> slate. As you say libpq is well debugged and there are a lot of fine
> details in there I don't want to mess with.
>
> My aims are to get the OO features and streaming behaviour working with
> a hopefully stable interface.
>
> Does that affect your gut feeling? Your error observations are
> significant and I think they dismiss my 1st suggestion. That leaves the
> possibilities of the whole new interface versus massaging the current
> interface with streaming/grouping APIs.

cp -rp libpq libpg; cvs add libpg? If nothing else, it would give a template to build from without risking problems to current apps using libpq ...

I'm not 100% certain that I'm reading Tom correctly, but by 'independent of libpq' I take it that libpg wouldn't need libpq to compile ... ?
Chris Bitmead <chris@bitmead.com> writes:
>> My gut feeling about this is that if a complete rewrite is being
>> considered, it ought to be done as a new interface library that's
>> independent of libpq.

> I was thinking more along the lines of massaging the current libpq to
> support the new interface/features rather than starting with a blank
> slate. As you say libpq is well debugged and there are a lot of fine
> details in there I don't want to mess with.

No reason you shouldn't steal liberally from the existing code, of course.

> My aims are to get the OO features and streaming behaviour working with
> a hopefully stable interface.
> Does that affect your gut feeling?

The thing that was bothering me was offhand suggestions about "let's reimplement the existing libpq API atop some redesigned lower layer". I think that's a recipe for trouble, in that it could introduce bugs and incompatibilities that will break existing applications. I'd rather see us leave libpq alone and start a separate development thread for the new version. That also has the advantage that you're not hogtied by compatibility considerations.

regards, tom lane
Chris Bitmead writes:

> Some people suggested it might be a good idea to define a new
> interface, maybe call it libpq2.

If you want to implement a new C API, look at SQL/CLI in ISO/IEC 9075-3:1999. It would be a shame if we created yet another proprietary API.

Having said that, I don't follow the reasoning to create a completely new client library just for streaming results. A lot of work was put into the existing one, and if you extend it carefully then you might reap the benefits of that. Creating a new API is a tedious process that needs to be done very carefully.

And also keep in mind that the majority of users these days don't use libpq directly. All the other language interfaces would have to be converted; that's a major effort that will never get done. What we'd end up with are two different APIs that are only half-maintained each. And a backend that has to support them both.

> The main thing I dislike about the current interface is that it's not
> low-level enough. It won't let me get around the features that I don't
> want (like caching the entire result).

Then factor out the low-level routines and make them part of the API. You could certainly re-implement the current "get all rows" as "while (rows left) { row = malloc(); read(&row); }".

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
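That factoring is easy to picture. As a sketch only: the pg_* calls below follow Chris's proposed low-level interface (they are not existing libpq entry points), and the field count is passed in as a placeholder because the proposal has no field-count routine yet. Today's "buffer the whole result" behaviour could sit on top of a streaming layer like this:

#include <stdlib.h>
#include <string.h>

typedef struct
{
    int    ntuples;
    int    nfields;
    char **values;      /* ntuples * nfields null-terminated copies */
} BufferedResult;

/* Rebuild the current "fetch everything up front" behaviour by looping
   over the streaming layer and copying each value. */
BufferedResult *buffer_whole_result(PG_Connection *con, PGquery *q, int nfields)
{
    BufferedResult *res = calloc(1, sizeof(*res));
    int f;

    res->nfields = nfields;
    while (pg_next(con) == PG_NEXT_OK)
    {
        res->values = realloc(res->values,
                              (res->ntuples + 1) * nfields * sizeof(char *));
        for (f = 0; f < nfields; f++)
        {
            PG_Length len;
            void *v = pg_value(q, f, &len);
            char *copy = malloc(len + 1);

            memcpy(copy, v, len);
            copy[len] = '\0';
            res->values[res->ntuples * nfields + f] = copy;
        }
        res->ntuples++;
    }
    return res;
}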
Peter Eisentraut wrote:

> If you want to implement a new C API, look at SQL/CLI in ISO/IEC
> 9075-3:1999. It would be a shame if we created yet another proprietary
> API.

As usual, our resident standards guru comes and saves the day. :-)

Ok, I'm going to implement the SQL3 C API, which is a streaming API. The one change I'll make is I'll be adding a Boolean SQLIsNewGroup(hstmt), so that the OO stuff can tell when a new object type is on the way. Oh, and I'll have some appropriate APIs for postgres-specific extensions, like SQLLastInsertOid().
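For concreteness, a rough sketch of what a CLI-style streaming fetch looks like. The SQLAllocHandle / SQLExecDirect / SQLFetch / SQLGetData calls are the standard CLI/ODBC ones; SQLIsNewGroup is only the proposed Postgres extension (declared here so the fragment stands alone), and the query text is a placeholder.

#include <sql.h>
#include <sqlext.h>

/* Proposed Postgres extension, not part of standard CLI. */
extern SQLSMALLINT SQLIsNewGroup(SQLHSTMT hstmt);

void fetch_demo(SQLHDBC hdbc)
{
    SQLHSTMT hstmt;
    SQLCHAR  name[256];
    SQLLEN   ind;

    SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);
    SQLExecDirect(hstmt, (SQLCHAR *) "SELECT name FROM person*", SQL_NTS);

    while (SQLFetch(hstmt) == SQL_SUCCESS)      /* rows stream in one at a time */
    {
        if (SQLIsNewGroup(hstmt))
        {
            /* a new object type starts here: switch constructors */
        }
        SQLGetData(hstmt, 1, SQL_C_CHAR, name, sizeof(name), &ind);
        /* ... build the language-level object from the column data ... */
    }

    SQLFreeHandle(SQL_HANDLE_STMT, hstmt);
}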
On Thu, 06 Jul 2000 15:50:13 +1000, Chris Bitmead wrote:

> My idea is that there should be a very low level interface that has a
> minimum of bloat and features and caching and copying. This would be
> especially nice for me writing an ODMG interface, because the ODMG
> interface will need to cache and copy things about, so having libpq
> doing it too is extra overhead. It could also form the basis of a
> trivial re-implementation of the current libpq in terms of this
> interface.

What do you mean by an ODMG interface? I have the ODMG 3.0 book in front of me and I do not know what you would like to create ... and why are caching and copying a need for ODMG?

Marten

----
Marten Feldtmann, Germany
Marten Feldtmann wrote:

> What do you mean by an ODMG interface? I have the ODMG 3.0 book in
> front of me and I do not know what you would like to create ... and
> why are caching and copying a need for ODMG?

Each programming language has a specified ODMG interface. Database objects are mapped 1:1 with language objects: every time you read a database object, a language object is created to represent it.

Now suppose you read the same database object in different places in your code, perhaps because the same object is "navigated" to via different paths. You don't want two objects created in memory to represent that one database object; if that happened you could have a confusing integrity situation.

So with an ODMG interface it keeps track of which database objects are in memory at any one time (think of it as a cache), and makes sure that if you request the same object again it doesn't construct a new one but returns the existing one. Of course when you create one of these language objects, the values must be copied into the fields of the object. That's where the copying comes in.

Now some object databases are implemented by just transferring whole database pages to the client side. Obviously they have pretty low overhead in terms of memory copying data from one place to another. A postgres-style architecture _can_ compete with this, but I suspect it must try harder in libpq in terms of how many times a bit of memory coming in is copied around the place. (Or maybe not. Maybe that is premature optimisation.)
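In code terms the bookkeeping is essentially an identity map keyed on the object's oid. A rough sketch follows; the linear list, the Oid typedef and the construct callback are stand-ins, and a real binding would hang this off the connection and use a proper hash table.

#include <stdlib.h>

typedef unsigned int Oid;               /* stand-in for the backend's oid type */

typedef struct CachedObject
{
    Oid                  oid;
    void                *lang_object;   /* the language-level object */
    struct CachedObject *next;
} CachedObject;

static CachedObject *identity_map = NULL;

/* Return the already-constructed object for this oid, or build and remember one. */
void *lookup_or_create(Oid oid, void *(*construct)(Oid))
{
    CachedObject *c;

    for (c = identity_map; c != NULL; c = c->next)
        if (c->oid == oid)
            return c->lang_object;       /* same database object, same in-memory object */

    c = malloc(sizeof(*c));
    c->oid = oid;
    c->lang_object = construct(oid);     /* copy the field values into a new object */
    c->next = identity_map;
    identity_map = c;
    return c->lang_object;
}

/* Simplest lifetime policy: throw the whole map away at transaction commit. */
void discard_identity_map(void)
{
    while (identity_map != NULL)
    {
        CachedObject *c = identity_map;

        identity_map = c->next;
        free(c);
    }
}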
On Tue, 11 Jul 2000 11:05:04 +1000, Chris Bitmead wrote:

> Each programming language has a specified ODMG interface. Database
> objects are mapped 1:1 with language objects: every time you read a
> database object, a language object is created to represent it.

Ok, this is defined as the language bindings mentioned in this book.

> Now suppose you read the same database object in different places in
> your code, perhaps because the same object is "navigated" to via
> different paths. You don't want two objects created in memory to
> represent that one database object; if that happened you could have a
> confusing integrity situation.
>
> So with an ODMG interface it keeps track of which database objects are
> in memory at any one time (think of it as a cache), and makes sure
> that if you request the same object again it doesn't construct a new
> one but returns the existing one.

Hmmm, what you want is not that easy. It means that the object data is stored several times on the client:

- you MUST hold an independent cache for each open connection to the database.
- you MUST copy the values from the cache to the language-dependent representation.

And you still do not get the result you want to have: the integrity problem. What happens if the cache is not big enough? How are cached objects thrown away? A garbage collector in the cache system?

And another point: this has nothing to do with an ODMG interface. It's just a nice performance hint for database access, but ODMG has nothing to do with it. Normally the identity is assured by the language binding, either by the database (as you would like it) or by the binding of a particular language to this database.

To get an ODMG language binding you may use libpq. You may put a cache system on top of libpq and you have the thing you perhaps want to have. That's all you really need.

What indeed would be a big win is the chance to retrieve different result sets with one query!

Marten

----
Marten Feldtmann, Germany
Marten Feldtmann wrote:

> Hmmm, what you want is not that easy. It means that the object data is
> stored several times on the client:
>
> - you MUST hold an independent cache for each open connection to the
>   database.
> - you MUST copy the values from the cache to the language-dependent
>   representation.

No, it's stored once on the client. The language-dependent cache IS the cache.

> And you still do not get the result you want to have: the integrity
> problem. What happens if the cache is not big enough? How are cached
> objects thrown away? A garbage collector in the cache system?

The most simple scenario is that all objects are discarded upon transaction commit. Beyond that, there are other scenarios, like if you want to reclaim some cache then UPDATE the database with any changes and leave the transaction open. If you need an object again then you read it in again. But to a large extent, memory management is based on the model of the programming language that you use, and managing it properly. Even if you use JDBC you can't just slurp gigabytes into memory. You have to re-use memory according to the conventions of the language in use.

> And another point: this has nothing to do with an ODMG interface. It's
> just a nice performance hint for database access, but ODMG has nothing
> to do with it.

What has nothing to do with ODMG?

> Normally the identity is assured by the language binding, either by
> the database (as you would like it) or by the binding of a particular
> language to this database.
>
> To get an ODMG language binding you may use libpq. You may put a cache
> system on top of libpq and you have the thing you perhaps want to
> have. That's all you really need.

Yes, but it's nice to compete on performance too. Whether libpq has inefficiencies that prevent that remains to be seen. Many commercial ODBMSes are blindingly fast on object retrieval.

> What indeed would be a big win is the chance to retrieve different
> result sets with one query!

I'm working on it.
On Tue, 11 Jul 2000 15:50:10 +1000, Chris Bitmead wrote:

> Marten Feldtmann wrote:
>
> > Hmmm, what you want is not that easy. It means that the object data
> > is stored several times on the client:
> >
> > - you MUST hold an independent cache for each open connection to the
> >   database.
> > - you MUST copy the values from the cache to the language-dependent
> >   representation.
>
> No, it's stored once on the client. The language-dependent cache IS
> the cache.

Ok, then the new libpg has no cache of its own. That was not clear from your posting. Databases like Versant and Oracle do have a client-based caching system which is NOT the language-dependent cache, but an overall client-based cache. This is mainly due to the performance improvements they expect from that feature.

> > And you still do not get the result you want to have: the integrity
> > problem. What happens if the cache is not big enough? How are cached
> > objects thrown away? A garbage collector in the cache system?
>
> The most simple scenario is that all objects are discarded upon
> transaction commit.

Which is handled by the language binding ... correct?

I had a strange feeling when you wrote that you want to write an ODMG interface but never mentioned a programming language! Therefore I thought you would like to create a new libpg with some support for an ODMG interface, and I asked myself: what is so important that I need to write a new libpq?

> > Normally the identity is assured by the language binding, either by
> > the database (as you would like it) or by the binding of a
> > particular language to this database.
> >
> > To get an ODMG language binding you may use libpq. You may put a
> > cache system on top of libpq and you have the thing you perhaps want
> > to have. That's all you really need.
>
> Yes, but it's nice to compete on performance too. Whether libpq has
> inefficiencies that prevent that remains to be seen. Many commercial
> ODBMSes are blindingly fast on object retrieval.

Hmmm, what should I say. I've seen PostgreSQL being blindingly fast fetching objects belonging to associations from one object. This has mostly something to do with the object model and its mapping into the database ...

Marten

----
Marten Feldtmann, Germany