Proposed new libpq API - Mailing list pgsql-hackers

From Chris Bitmead
Subject Proposed new libpq API
Msg-id 3962E1CF.773ADF9E@nimrod.itg.telecom.com.au
Responses Re: Proposed new libpq API  (The Hermit Hacker <scrappy@hub.org>)
List pgsql-hackers
I've been thinking about what changes are necessary to the libpq
interface to support returning variable type tuples. This was
discussed a number of months back but an exact interface wasn't nailed
down. 

Let me then put forward the following suggestion open for comment. The
suggestion is very similar to the original postgres solution to this
problem. What I have added is some consideration of how a streaming
interface should work, and hopefully I will incorporate that
enhancement while I'm at it.

The concept of a group will be (re)introduced into libpq. Each tuple
returned will have one of a finite number of different layouts, and
each layout is a group.

Thus there will be an API PQnfieldsGroup(PGresult, group_num), and
similarly PQftypeGroup etc. There will also be a PQgroup(PGresult,
tuple_num), which tells you which group any given tuple belongs to.
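
As a rough illustration of the group idea (not an implementation), the
sketch below models a result whose tuples each point at one of a few
layouts. Everything in it is hypothetical: MockResult, MockGroup and
the mock* functions are stand-ins invented for this example and are not
libpq types or calls, and the field-index argument to mockFnameGroup is
an assumption about how a per-group name lookup might look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT real libpq types. */
#define MOCK_MAX_FIELDS 8

typedef struct {
    int nfields;                           /* number of columns in this layout */
    const char *fnames[MOCK_MAX_FIELDS];   /* column names for this layout */
} MockGroup;

typedef struct {
    int ngroups;
    const MockGroup *groups;
    int ntuples;
    const int *tuple_group;                /* tuple_group[i] = layout of tuple i */
} MockResult;

/* Number of fields in a group, or -1 if the group number is out of range. */
int mockNfieldsGroup(const MockResult *res, int group_num)
{
    assert(res != NULL);
    if (group_num < 0 || group_num >= res->ngroups)
        return -1;
    return res->groups[group_num].nfields;
}

/* Name of one field of a group, or NULL if either index is out of range. */
const char *mockFnameGroup(const MockResult *res, int group_num, int field_num)
{
    int nfields = mockNfieldsGroup(res, group_num);
    if (field_num < 0 || field_num >= nfields)
        return NULL;
    return res->groups[group_num].fnames[field_num];
}

/* Group that a tuple belongs to, or -1 if the tuple number is out of range. */
int mockGroup(const MockResult *res, int tuple_num)
{
    assert(res != NULL);
    if (tuple_num < 0 || tuple_num >= res->ntuples)
        return -1;
    return res->tuple_group[tuple_num];
}

/* Sample data: two tuples with layout (aa), then two with layout (aa, bb). */
static const MockGroup sample_groups[] = {
    { 1, { "aa" } },
    { 2, { "aa", "bb" } },
};
static const int sample_tuple_group[] = { 0, 0, 1, 1 };
static const MockResult sample_result = { 2, sample_groups, 4, sample_tuple_group };
```

With this sample data, mockGroup(&sample_result, 2) reports group 1,
and mockNfieldsGroup(&sample_result, 1) reports two fields, so a caller
can tell that the third tuple carries both aa and bb.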

To support streaming of results, a new function PQflush(PGresult,
tuple_num) would be introduced. It discards previously cached results.
PQexec would be changed so that it no longer absorbs the full result
set straight away as it does now (*). Instead it would absorb results
only as needed, for example when PQgetValue is called.

Currently you might read results like this...

PGresult *res = PQexec("select * from foo");
for (int i = 0; i < PQntuples(res); i++) {
    printf("%s\n", PQgetValue(res, i, 0));
}

It has the disadvantage that all the results are kept in memory at
once. This code would in the future be modified to be...

PGresult *res = PQexec("select * from foo");
for (int i = 0; i < PQntuples(res); i++) {
    printf("%s\n", PQgetValue(res, i, 0));
    PQflush(res);  /* NEW */
}

Now PQexec doesn't absorb all the results at once. PQgetValue reads
them as needed, and PQflush discards each result on every pass through
the loop.

I could also write...

PGresult *res = PQexec("select * from foo");
for (int i = 0; i < PQntuples(res); i++) {
    printf("%s\n", PQgetValue(res, i, 0));
    if ((i + 1) % 20 == 0) {  /* flush after every 20th tuple */
        PQflush(res, -1);
    }
}

In this case the results are cached in chunks of 20. Or I could write...

PGresult *res = PQexec("select * from foo");
for (int i = 0; i < PQntuples(res); i++) {
    printf("%s\n", PQgetValue(res, i, 0));
    PQflush(res, i - 20);
}

In this case the last 20 tuples are kept in memory at any one time as
a sliding window. If I try to access a tuple outside the range of the
current cache, I get a NULL result.
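
To make the sliding-window behaviour concrete, here is a minimal sketch
of one way it could work. It is not libpq code: MockStream and the
mock* functions are hypothetical names invented for this example, the
"backend" is faked by generating strings, and the reading of the flush
argument as "discard every cached tuple with a lower index" is an
assumption, since the exact semantics are not pinned down above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MOCK_NTUPLES 100   /* size of the pretend result set */

/* Hypothetical stand-in for a streaming PGresult; NOT a libpq type. */
typedef struct {
    int first_cached;            /* tuples below this index were discarded */
    int next_unread;             /* next tuple to pull from the pretend backend */
    char *rows[MOCK_NTUPLES];    /* rows[i] is non-NULL only while tuple i is cached */
} MockStream;

void mockStreamInit(MockStream *s)
{
    s->first_cached = 0;
    s->next_unread = 0;
    for (int i = 0; i < MOCK_NTUPLES; i++)
        s->rows[i] = NULL;
}

/* Pretend to read one more tuple off the wire and cache it. */
static void mockFetchNext(MockStream *s)
{
    char *row = malloc(16);
    assert(row != NULL);
    snprintf(row, 16, "row%d", s->next_unread);
    s->rows[s->next_unread++] = row;
}

/* Return the value of a tuple, fetching lazily up to it if needed.
 * Returns NULL for tuples that were flushed or that do not exist. */
const char *mockGetValue(MockStream *s, int tuple_num)
{
    if (tuple_num < s->first_cached || tuple_num >= MOCK_NTUPLES)
        return NULL;
    while (s->next_unread <= tuple_num)
        mockFetchNext(s);
    return s->rows[tuple_num];
}

/* Assumed semantics: discard every cached tuple with index < keep_from. */
void mockFlush(MockStream *s, int keep_from)
{
    if (keep_from > s->next_unread)
        keep_from = s->next_unread;
    for (int i = s->first_cached; i < keep_from; i++) {
        free(s->rows[i]);
        s->rows[i] = NULL;
    }
    if (keep_from > s->first_cached)
        s->first_cached = keep_from;
}

/* Number of tuples currently held in memory. */
int mockCachedCount(const MockStream *s)
{
    return s->next_unread - s->first_cached;
}
```

A loop that calls mockGetValue(&s, i) followed by mockFlush(&s, i - 19)
never holds more than 20 tuples at once, and asking for a tuple that
has already slid out of the window yields NULL, which matches the
behaviour described above.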

Back to the multiple tuple return types issue. psql code may do
something like...

int currentGroup = -1, group;
PGresult *res = PQexec(someQuery);
for (int i = 0; i < PQntuples(res); i++) {
    group = PQgroup(res, i);
    if (group != currentGroup) {
        /* layout changed: print a new set of headers */
        printHeaders(res, group);
    }
    currentGroup = group;
    for (int j = 0; j < PQnfieldsGroup(res, group); j++) {
        printf("%s |", PQgetValue(res, i, j));
    }
    printf("\n");
    PQflush(res);
}

void printHeaders(PGresult *res, int group) {
    for (int j = 0; j < PQnfieldsGroup(res, group); j++) {
        /* field index j added; the name lookup needs to know which field */
        printf("%s |", PQfnameGroup(res, group, j));
    }
    printf("\n");
}

This would print different result types with appropriate headers...
create table a (aa text);
create table b under a (bb text);
select ** from a;
aa |
----
foo
jar

aa | bb
-------
bar|baz
boo|bon

(*) Assuming that this doesn't unduly affect current behaviour. I
can't see that it would, but if it did, a separate API, PQexecStream,
would be needed.

