Thread: Alternative new libpq interface.

Alternative new libpq interface.

From
Chris Bitmead
Date:
Some people suggested it might be a good idea to define a new
interface, maybe call it libpq2. (Actually this would be a good time
to abandon pq = postquel in favour of libpg.a ).

I'll therefore put forward the following proposal as a possible
starting point. I'm not particularly committed to either this
proposal, my previous proposal, or perhaps Peter's proposal. A new
interface is probably the cleanest, but the current library probably
isn't all bad either.

My idea is that there should be a very low level interface that has a
minimum of bloat and features and caching and copying. This would be
especially nice for me writing an ODMG interface because the ODMG
interface would be needing to cache and copy things about so having
libpq doing it too is extra overhead. It could also form the basis of
a trivial re-implementation of the current libpq in terms of this
interface.

So the criteria I used for the low-level interface are...

*) Future-Proof. In preference to a PGconnect routine with 6 different
arguments, you create an empty PGConnection, set various attributes
via setter functions and then call connect. That way it is future-proof
against needing more arguments. Similar for execQuery.
 

*) Speed. Lean and mean. We can write a fancier interface on top of
this for people who want convenience over speed. At this point I
haven't attempted to design one. Thus the value returned by the
getValue routine (pg_value below) is not null-terminated. The higher
level interface can make sure of that if needed. In any case some
sorts of data may contain nulls.
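
A minimal sketch of the setter-style construction described in the first point, assuming a made-up struct layout, assuming each setter takes the connection plus a string value, and using a stub pg_connect that only checks that a database name was supplied. None of this is an existing libpq API; it only illustrates why new attributes can be added later without changing any existing call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int PG_Boolean;

/* Hypothetical stand-in for the proposed connection object; the field
 * layout is invented for illustration only. */
typedef struct PG_Connection {
    char *db;
    char *user;
    char *password;
    PG_Boolean connected;
} PG_Connection;

/* Store a private copy of value in *slot, releasing any previous value. */
static void replace_string(char **slot, const char *value)
{
    char *copy = NULL;
    if (value != NULL) {
        copy = malloc(strlen(value) + 1);
        if (copy != NULL)
            strcpy(copy, value);
    }
    free(*slot);
    *slot = copy;
}

PG_Connection *pg_newConnection(void)
{
    PG_Connection *con = malloc(sizeof *con);
    if (con != NULL) {
        con->db = con->user = con->password = NULL;
        con->connected = 0;
    }
    return con;
}

void pg_freeConnection(PG_Connection *con)
{
    if (con == NULL)
        return;
    free(con->db);
    free(con->user);
    free(con->password);
    free(con);
}

/* Each attribute gets its own setter, so adding an attribute later
 * means adding a function, not changing an existing signature. */
void pg_setDb(PG_Connection *con, const char *db)         { replace_string(&con->db, db); }
void pg_setUserName(PG_Connection *con, const char *user) { replace_string(&con->user, user); }
void pg_setPassword(PG_Connection *con, const char *pw)   { replace_string(&con->password, pw); }

/* Stub: a real implementation would open a socket here. This one only
 * checks that the one attribute it treats as mandatory was set. */
PG_Boolean pg_connect(PG_Connection *con)
{
    con->connected = (con->db != NULL);
    return con->connected;
}
```

A caller would create the object, call only the setters it cares about, and then call pg_connect; attributes left unset keep their defaults.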
 

The main thing I dislike about the current interface is that it's not 
low-level enough. It won't let me get around the features that I don't 
want (like caching the entire result).

Ok guys, which way do you want me to go? Or will someone else come 
up with something better?

/* The Postgres Low-Level Interface 
*/

typedef int PG_ErrorCode;
/* Just creates an empty connection object. Like C++ new() */
PG_Connection *pg_newConnection();
void pg_freeConnection(PG_Connection *con);
/* setter functions */
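void pg_setDb(PG_Connection *con, const char *db);
void pg_setUserName(PG_Connection *con, const char *userName);
void pg_setPassword(PG_Connection *con, const char *password);
/* Connect to the database. TRUE for success */
PG_Boolean pg_connect(PG_Connection *con);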
/* Find out the error code for what happened */
/* In the future there should be a unified error code system */
PG_ErrorCode pg_connect_error(PG_Connection * con);

/* Just creates an empty query object */
PGquery *pg_newQuery(PG_Connection *con);
void pg_freeQuery(PGquery *q);
/* setter function */
void pg_setSQL(PGquery *q, const char *query);
/* Executes the query */
PG_Boolean pg_exec_sql(PGquery *q);

typedef int PG_NextStatus;
#define PG_NEXT_EOF          0 /* No more records */
#define PG_NEXT_OK           1 /* Returned a record */
#define PG_NEXT_ERROR       -1 /* Error */

/* get the next record */
PG_NextStatus pg_next(PG_Connection *con);
/* did the last record returned mark the start of a new group? */
PG_Boolean pg_new_group(PGquery *q);

typedef int PG_Length;

/* Get the data from a field, specifying the field number and
returning the length of the data */
void *pg_value(PGquery *q, int field_num, PG_Length *len);
PG_Boolean pg_is_null(PGquery *q, int field_num);
/* If update/insert or delete, returns the number of rows affected */
int pg_num_rows_affected(PGquery *q);
/* Returns the oid of the last inserted object */
Oid pg_last_oid(PGquery *q);
/* Get the field name */
char *pg_field_name(PGquery *q, int field_num);
/* Get the field type */
Oid pg_field_type(PGquery *q, int field_num);
/* Find out the error code for what happened */
/* In the future there should be a unified error code system */
PG_ErrorCode pg_query_error(PGquery *q);

/* Get a meaningful Error message for a code */
char *pg_errorMessage(PG_ErrorCode);
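
To make the "not null-terminated" point concrete, here is a small sketch of a row-at-a-time loop over length-counted values. The demo_* names and the in-memory DemoQuery are hypothetical stand-ins, not the functions declared above (in particular, demo_next takes the query, which is an assumption). The helper demo_copy_cstring shows what a higher-level layer might do to add the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NEXT_EOF 0   /* no more records */
#define DEMO_NEXT_OK  1   /* returned a record */

/* One field value: raw bytes plus an explicit length, no terminator. */
typedef struct {
    const void *data;
    int len;
} DemoField;

/* Hypothetical in-memory stand-in for a streaming query result. */
typedef struct {
    const DemoField *rows;  /* nrows * nfields values, row-major */
    int nrows;
    int nfields;
    int current;            /* index of the current row, -1 before the first */
} DemoQuery;

/* Advance to the next row. */
int demo_next(DemoQuery *q)
{
    if (q->current + 1 >= q->nrows)
        return DEMO_NEXT_EOF;
    q->current++;
    return DEMO_NEXT_OK;
}

/* Return a pointer to the raw bytes of a field in the current row and
 * store its length in *len. Only valid after demo_next returned OK. */
const void *demo_value(const DemoQuery *q, int field_num, int *len)
{
    const DemoField *f = &q->rows[q->current * q->nfields + field_num];
    *len = f->len;
    return f->data;
}

/* Copy a field into buf and append the terminator that the low-level
 * call omits. Returns the value's true length, or -1 if buf is too small. */
int demo_copy_cstring(const DemoQuery *q, int field_num,
                      char *buf, size_t bufsize)
{
    int len;
    const void *p = demo_value(q, field_num, &len);

    if ((size_t) len + 1 > bufsize)
        return -1;
    memcpy(buf, p, (size_t) len);
    buf[len] = '\0';
    return len;
}
```

Note that for a value containing an embedded zero byte, the returned length and strlen() of the copy disagree, which is why the low-level call has to report the length explicitly.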


Re: Alternative new libpq interface.

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> The main thing I dislike about the current interface is that it's not 
> low-level enough. It won't let me get around the features that I don't 
> want (like caching the entire result).

Bear in mind that "avoiding the features you don't want" is not
cost-free.  In particular, I have seen no discussion in this thread
of the implications that streaming read would have for error handling.

In the current libpq, you either get a complete error-free result set
or you don't.  If there is to be a streaming interface then it must
take into account the possibility of an error partway through the
fetch.  Applications that use the interface will also incur extra
complexity from having to undo whatever they might have done with
the initial part of the result data.
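
One way an application might cope with an error partway through a streamed fetch is to stage its work and publish it only after a clean end-of-data. The sketch below assumes a hypothetical RowSource that yields integers and can be told to fail at a given row; the status values mirror the EOF/OK/ERROR codes in the proposal but the names are invented.

```c
#include <assert.h>
#include <stddef.h>

enum { SRC_EOF = 0, SRC_OK = 1, SRC_ERROR = -1 };

/* Hypothetical row source: yields ints, optionally failing at a given row. */
typedef struct {
    const int *values;
    int count;
    int pos;
    int fail_at;    /* index at which to report an error, or -1 for never */
} RowSource;

/* Fetch the next value into *out, or report EOF or a mid-stream error. */
int source_next(RowSource *s, int *out)
{
    if (s->pos == s->fail_at)
        return SRC_ERROR;
    if (s->pos >= s->count)
        return SRC_EOF;
    *out = s->values[s->pos++];
    return SRC_OK;
}

/* Add every streamed value to *account, all or nothing.
 * Work happens on a local copy; *account is written only after a
 * clean EOF, so an error partway through leaves it untouched. */
int apply_all(RowSource *s, long *account)
{
    long staged = *account;
    int v;
    int status;

    while ((status = source_next(s, &v)) == SRC_OK)
        staged += v;

    if (status == SRC_ERROR)
        return 0;       /* discard the staged work */

    *account = staged;
    return 1;
}
```

Staging like this gives back the all-or-nothing behaviour of a fully buffered result, at the cost of the application doing the bookkeeping itself.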

Still, something along the lines of your sketch seems worth pursuing.
Personally I've never once had any use for the "random access to result
set" aspect of libpq's API, so it seems like buffering the whole set
is a pretty high price to pay for a small simplification in error
handling.

My gut feeling about this is that if a complete rewrite is being
considered, it ought to be done as a new interface library that's
independent of libpq.  libpq has its limitations, but it's moderately
well debugged and lots of apps depend on it.  A rewrite will need time
to stabilize and to attract new apps --- unless you want to guarantee
100.00% backward compatibility, which I bet you won't.
        regards, tom lane


Re: Alternative new libpq interface.

From
Chris Bitmead
Date:
-- 

> My gut feeling about this is that if a complete rewrite is being
> considered, it ought to be done as a new interface library that's
> independent of libpq.  

I was thinking more along the lines of massaging the current libpq to
support the new interface/features rather than starting with a blank
slate. As you say libpq is well debugged and there are a lot of fine
details in there I don't want to mess with.

My aims are to get the OO features and streaming behaviour working with
a hopefully stable interface.

Does that affect your gut feeling? Your error observations are
significant and I think they dismiss my 1st suggestion. That leaves the
possibilities of the whole new interface versus massaging the current
interface with streaming/grouping APIs.


Re: Alternative new libpq interface.

From
The Hermit Hacker
Date:
On Thu, 6 Jul 2000, Tom Lane wrote:

> Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> > The main thing I dislike about the current interface is that it's not 
> > low-level enough. It won't let me get around the features that I don't 
> > want (like caching the entire result).
> 
> Bear in mind that "avoiding the features you don't want" is not
> cost-free.  In particular, I have seen no discussion in this thread
> of the implications that streaming read would have for error handling.
> 
> In the current libpq, you either get a complete error-free result set
> or you don't.  If there is to be a streaming interface then it must
> take into account the possibility of an error partway through the
> fetch.  Applications that use the interface will also incur extra
> complexity from having to undo whatever they might have done with
> the initial part of the result data.
> 
> Still, something along the lines of your sketch seems worth pursuing.
> Personally I've never once had any use for the "random access to result
> set" aspect of libpq's API, so it seems like buffering the whole set
> is a pretty high price to pay for a small simplification in error
> handling.
> 
> My gut feeling about this is that if a complete rewrite is being
> considered, it ought to be done as a new interface library that's
> independent of libpq.  libpq has its limitations, but it's moderately
> well debugged and lots of apps depend on it.  A rewrite will need time
> to stabilize and to attract new apps --- unless you want to guarantee
> 100.00% backward compatibility, which I bet you won't.

Agreed, which was why I had suggested going to a libpq2 and leaving the
current libpq intact ... but, I was always confused as to why pq vs pg, so
Chris going to a libpg.a sounds like a really nice way to accomplish this
without causing any headaches with 'legacy apps' that are tied to libpq
...

What I'd suggest is leave libpq in for a few releases, until libpg
stabilizes and then look at removing it and directing ppl over to libpg
...





Re: Alternative new libpq interface.

From
The Hermit Hacker
Date:
On Thu, 6 Jul 2000, Chris Bitmead wrote:

> 
> -- 
> 
> > My gut feeling about this is that if a complete rewrite is being
> > considered, it ought to be done as a new interface library that's
> > independent of libpq.  
> 
> I was thinking more along the lines of massaging the current libpq to
> support the new interface/features rather than starting with a blank
> slate. As you say libpq is well debugged and there are a lot of fine
> details in there I don't want to mess with.
> 
> My aims are to get the OO features and streaming behaviour working with
> a hopefully stable interface.
> 
> Does that affect your gut feeling? Your error observations are
> significant and I think they dismiss my 1st suggestion. That leaves the
> possibilities of the whole new interface versus massaging the current
> interface with streaming/grouping APIs.

cp -rp libpq libpg;cvs add libpg?

if nothing else, it would give a template to build from without risking
problems to current apps using libpq ... I'm not 100% certain that I'm
reading Tom correctly, but by 'independent of libpq', I'm taking it that
libpg wouldn't need libpq to compile ... ?



Re: Alternative new libpq interface.

From
Tom Lane
Date:
Chris Bitmead <chris@bitmead.com> writes:
>> My gut feeling about this is that if a complete rewrite is being
>> considered, it ought to be done as a new interface library that's
>> independent of libpq.  

> I was thinking more along the lines of massaging the current libpq to
> support the new interface/features rather than starting with a blank
> slate. As you say libpq is well debugged and there are a lot of fine
> details in there I don't want to mess with.

No reason you shouldn't steal liberally from the existing code, of
course.

> My aims are to get the OO features and streaming behaviour working with
> a hopefully stable interface.

> Does that affect your gut feeling?

The thing that was bothering me was offhand suggestions about "let's
reimplement the existing libpq API atop some redesigned lower layer".
I think that's a recipe for trouble, in that it could introduce bugs
and incompatibilities that will break existing applications.  I'd
rather see us leave libpq alone and start a separate development
thread for the new version.  That also has the advantage that you're
not hogtied by compatibility considerations.
        regards, tom lane


Re: Alternative new libpq interface.

From
Peter Eisentraut
Date:
Chris Bitmead writes:

> Some people suggested it might be a good idea to define a new
> interface, maybe call it libpq2.

If you want to implement a new C API, look at SQL/CLI in ISO/IEC
9075-3:1999. It would be a shame if we created yet another proprietary
API.

Having said that, I don't follow the reasoning to create a completely new
client library just for streaming results. A lot of work was put in the
existing one, and if you extend it carefully then you might reap the
benefits of that.

Creating a new API is a tedious process that needs to be done very
carefully. And also keep in mind that the majority of users these days
don't use libpq directly. All the other language interfaces would have
to be converted; that's a major effort that will never get done. What we'd
end up with are two different APIs that are only half-maintained each. And
a backend that has to support them both.


> The main thing I dislike about the current interface is that it's not
> low-level enough. It won't let me get around the features that I don't
> want (like caching the entire result).

Then factor out the low-level routines and make them part of the API. You
could certainly re-implement the current "get all rows" as "while (rows
left) { row = malloc(); read(&row); }".
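
The "while (rows left)" loop could be fleshed out roughly as follows. This is only a sketch: Stream and stream_read are hypothetical stand-ins for a factored-out low-level row reader, rows are plain ints, and the buffer grows by doubling rather than by one malloc per row.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical streaming source that counts down: yields n, n-1, ..., 1. */
typedef struct {
    int remaining;
} Stream;

/* Returns 1 and stores a row in *row, or 0 when no rows are left. */
int stream_read(Stream *s, int *row)
{
    if (s->remaining <= 0)
        return 0;
    *row = s->remaining--;
    return 1;
}

/* A fully buffered result, as the current "get all rows" call provides. */
typedef struct {
    int *rows;
    size_t count;
    size_t capacity;
} ResultSet;

/* Drain the stream into a heap buffer, doubling capacity as needed.
 * Returns 0 on allocation failure, in which case nothing is kept. */
int fetch_all(Stream *s, ResultSet *out)
{
    int row;

    out->rows = NULL;
    out->count = out->capacity = 0;

    while (stream_read(s, &row)) {
        if (out->count == out->capacity) {
            size_t cap = out->capacity ? out->capacity * 2 : 4;
            int *grown = realloc(out->rows, cap * sizeof *grown);
            if (grown == NULL) {
                free(out->rows);
                out->rows = NULL;
                out->count = out->capacity = 0;
                return 0;
            }
            out->rows = grown;
            out->capacity = cap;
        }
        out->rows[out->count++] = row;
    }
    return 1;
}
```

The caller owns out->rows afterwards and releases it with free().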


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: Alternative new libpq interface.

From
Chris Bitmead
Date:
Peter Eisentraut wrote:

> If you want to implement a new C API, look at SQL/CLI in ISO/IEC
> 9075-3:1999. It would be a shame if we created yet another proprietary
> API.

As usual, our resident standards guru comes and saves the day. :-)

Ok, I'm going to implement the SQL3 C API, which is a streaming API. The
one change I'll make is I'll be adding a
Boolean SQLIsNewGroup(hstmt), so that the OO stuff can tell when a new
object type is on the way. Oh and I'll have some appropriate APIs for
postgres specific extensions, like SQLLastInsertOid().


Re: Alternative new libpq interface.

From
M.Feldtmann@t-online.de (Marten Feldtmann)
Date:
On Thu, 06 Jul 2000 15:50:13 +1000, Chris Bitmead wrote:

>
>My idea is that there should be a very low level interface that has a
>minimum of bloat and features and caching and copying. This would be
>especially nice for me writing an ODMG interface because the ODMG
>interface would be needing to cache and copy things about so having
>libpq doing it too is extra overhead. It could also form the basis of
>a trivial re-implementation of the current libpq in terms of this
>interface.
What does it mean: ODMG interface. I've the ODMG 3.0 book in front
of me and i do not know, what you would like to create ... why is
caching and copying a need for ODMG ???
Marten Feldtmann

----

Marten Feldtmann, Germany



Re: Alternative new libpq interface.

From
Chris Bitmead
Date:
Marten Feldtmann wrote:
> 
> On Thu, 06 Jul 2000 15:50:13 +1000, Chris Bitmead wrote:
> 
> >
> >My idea is that there should be a very low level interface that has a
> >minimum of bloat and features and caching and copying. This would be
> >especially nice for me writing an ODMG interface because the ODMG
> >interface would be needing to cache and copy things about so having
> >libpq doing it too is extra overhead. It could also form the basis of
> >a trivial re-implementation of the current libpq in terms of this
> >interface.
> 
>  What does it mean: ODMG interface. I've the ODMG 3.0 book in front
> of me and i do not know, what you would like to create ... why is
> caching and copying a need for ODMG ???

Each programming language has a specified ODMG interface. Database
objects are mapped 1:1 with language objects. Every time you read
a database object a language object is created to represent it.

Now if you read the same database object in different places in your
code. Maybe the same object is "navigated" to via different paths,
you don't want two objects created in memory to represent that object.
If that happened you could have a confusing integrity situation.

So with an ODMG interface it keeps track of what database objects
are in memory at any one time - think of it as a cache, and makes
sure that if you request the same object again, it doesn't construct
a new one but returns the existing one.
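
The bookkeeping described here is often called an identity map. A rough sketch in C, assuming objects are keyed by OID and using an invented Object struct and a fixed-size chained hash table (none of these names come from ODMG or libpq):

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;   /* stand-in for the database object identifier */

/* Hypothetical language-side object representing one database object. */
typedef struct Object {
    Oid oid;
    int value;
    struct Object *next;    /* chain within a hash bucket */
} Object;

#define BUCKETS 64

typedef struct {
    Object *bucket[BUCKETS];
} IdentityMap;

/* Return the in-memory object for oid, constructing it only on the
 * first request. *created reports whether a new object was built.
 * Returns NULL if memory for a new object cannot be allocated. */
Object *identity_get(IdentityMap *map, Oid oid, int *created)
{
    Object **head = &map->bucket[oid % BUCKETS];
    Object *o;

    for (o = *head; o != NULL; o = o->next) {
        if (o->oid == oid) {
            *created = 0;
            return o;       /* same database object, same pointer */
        }
    }

    o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->oid = oid;
    o->value = 0;
    o->next = *head;
    *head = o;
    *created = 1;
    return o;
}

/* Drop every cached object and leave the map empty. */
void identity_clear(IdentityMap *map)
{
    int i;

    for (i = 0; i < BUCKETS; i++) {
        Object *o = map->bucket[i];
        while (o != NULL) {
            Object *next = o->next;
            free(o);
            o = next;
        }
        map->bucket[i] = NULL;
    }
}
```

Two lookups of the same OID return the same pointer, so navigating to an object via different paths cannot produce two diverging copies.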

Of course when you create one of these language objects, the values
must be copied into the fields of the object. That's where the copying
comes in. Now some object databases are implemented by just transferring
whole database pages to the client side. Obviously they have pretty low
overhead in terms of memory copying data from one place to another. A 
postgres style architecture _can_ compete with this, but I suspect
it must try harder in libpq in terms of how many times a bit of 
memory coming in is copied around the place. (Or maybe not. Maybe that
is premature optimisation).


Re: Alternative new libpq interface.

From
M.Feldtmann@t-online.de (Marten Feldtmann)
Date:
On Tue, 11 Jul 2000 11:05:04 +1000, Chris Bitmead wrote:

>
>Each programming language has a specified ODMG interface. Database
>objects are mapped 1:1 with language objects. Every time you read
>a database object a language object is created to represent it.
>
Ok, this is defined as the language bindings mentioned in this
book.

>Now if you read the same database object in different places in your
>code. Maybe the same object is "navigated" to via different paths,
>you don't want two objects created in memory to represent that object.
>If that happened you could have a confusing integrity situation.
>
>So with an ODMG interface it keeps track of what database objects
>are in memory at any one time - think of it as a cache, and makes
>sure that if you request the same object again, it doesn't construct
>a new one but returns the existing one.
>
Hmmm, what you want is not that easy. It means, that the object
data is stored several times on the client:

 - you MUST hold an independent cache for each open connection
   to the database.
 - you MUST copy the values from the cache to the language
   dependent representation.
 
And you still do not get the result you want to have: the
integrity problem. What happens, if the cache is not big
enough. How are cached objects thrown away ? Garbage Collector
in the cache system ??
And another point: this has nothing to do with an ODMG interface.
It's just a nice performance hint for database access, but
ODMG has nothing to do with it.
Normally the identity is assured by the language binding - either
by the database (as you would like it) or by the binding of a
particular language to this database.
To get an ODMG language binding you may use the libpq. You may
put a cache system on top of this libpq and you have the thing 
you perhaps want to have. That's all you really need. 
What indeed would be a big win is the chance to retrieve different
result sets with one query !

Marten


----

Marten Feldtmann, Germany



Re: Alternative new libpq interface.

From
Chris Bitmead
Date:
Marten Feldtmann wrote:

>  Hmmm, what you want is not that easy. It means, that the object
> data is stored several times on the client:
> 
>  - you MUST hold an independent cache for each open connection
>    to the database.
>  - you MUST copy the values from the cache to the language
>    dependent representation.

No it's stored once on the client. The language dependant cache IS
the cache.
>  And you still do not get the result you want to have: the
> integrity problem. What happens, if the cache is not big
> enough. How are cached objects thrown away ? Garbage Collector
> in the cache system ??

The most simple scenario is that all objects are discarded upon
transaction commit.

Beyond that, there are other scenarios. Like if you want to reclaim some
cache then UPDATE the database with any changes and leave the
transaction open. If you need an object again then you read it in again.

But to a large extent, memory management is based on the model of
the programming language that you use, and managing it properly. Even
if you use JDBC you can't just slurp gigabytes into memory. You have
to re-use memory according to the conventions of the language in use.

>  And another point: this has nothing to do with an ODMG interface.
> It's just a nice performance hint for database access, but
> ODMG has nothing to do with it.

What has nothing to do with ODMG?

>  Normally the identity is assured by the language binding - either
> by the database (as you would like it) or by the binding of a
> particular language to this database.
> 
>  To get an ODMG language binding you may use the libpq. You may
> put a cache system on top of this libpq and you have the thing
> you perhaps want to have. That's all you really need.

Yes, but it's nice to compete on performance too. Whether libpq has
inefficiencies that prevent that is to be seen. Many commercial
ODBMSes are blindingly fast on object retrieval.
>  What indeed would be a big win, it the chance to retrieve different
> result sets with one query !

I'm working on it.


Re: Alternative new libpq interface.

From
M.Feldtmann@t-online.de (Marten Feldtmann)
Date:
On Tue, 11 Jul 2000 15:50:10 +1000, Chris Bitmead wrote:

>Marten Feldtmann wrote:
>
>>  Hmmm, what you want is not that easy. It means, that the object
>> data is stored several times on the client:
>> 
>>  - you MUST hold an independent cache for each open connection
>>    to the database.
>>  - you MUST copy the values from the cache to the language
>>    dependent representation.
>
>No it's stored once on the client. The language dependant cache IS
>the cache.

Ok, then the new libpg has no cache of its own. That was not clear in your
posting. Databases like Versant and Oracle do have client-based
caching systems, which are NOT the language dependant cache - but
an overall client-based cache. This is mainly due to performance
improvements they expect from that feature.

> 
>>  And you still do not get the result you want to have: the
>> integrity problem. What happens, if the cache is not big
>> enough. How are cached objects thrown away ? Garbage Collector
>> in the cache system ??
>
>The most simple scenario is that all objects are discarded upon
>transaction
>commit.
Which is handled by the language binding ... correct ?
I had a strange feeling when you wrote, that you want to write
an ODMG interface, but never ever mentioned a programming language ! 
Therefore I thought you would like to create a new libpg with 
some support for an ODMG interface and I asked myself: what is
so important, that I need to write a new libpq?


>
>>  Normally the identity is assured by the language binding - either
>> by the database (as you would like it) or by the binding of a
>> particular language to this database.
>> 
>>  To get an ODMG language binding you may use the libpq. You may
>> put a cache system on top of this libpq and you have the thing
>> you perhaps want to have. That's all you really need.
>
>Yes, but it's nice to compete on performance too. Whether libpq has
>inefficiencies that prevent that is to be seen. Many commercial
>ODBMSes are blindingly fast on object retrieval.
> 
Hmmm, what should I say. I've seen PostgreSQL being blindingly
fast fetching objects belonging to associations from one
object. This has mostly something to do with the object model
and its mapping into the database ...
Marten
----

Marten Feldtmann, Germany