I'm using libpq's asynchronous interface. I'd like to do the
equivalent of setting fetchSize on the JDBC driver - get a chunk of
results, handle them, free the memory they use, and then get another
until I've handled an entire query.
I can do this at the SQL layer using "declare cursor ..." and "fetch
forward n ..." but it seems like the lower level should be able to do
this for me. It'd also let me have a more natural interface that (A)
doesn't make the caller take a PostgreSQL-specific declare/fetch path
for each query (B) can still use the JDBC-style "execute" that
doesn't care if it's dealing with a row-returning statement.
I see that JDBC driver (at least in protocol version 3; I don't care
about 2) does this by passing a maximum number of rows when sending
Execute, then handling PortalSuspended and Execute again. I also see
that libpq never sends a maximum number of rows or handles
PortalSuspended.
Still, I think it should be able to do what I want. The results are
sent from the database in order. This message type would be necessary
to ensure the database sends no more than N rows, but it's not
necessary to ensure the client handles N rows as soon as it has them.
I had been retrieving results from a query in this fashion:
while True: readfds = [PQsocket(conn)] writefds = [] if PQflush(conn): writefds =
readfds if PQconsumeInput(conn): error if not PQisBusy(conn): break
poll(readfds,writefds) return PQgetResult(conn)
which the documentation recommends. But PQisBusy(conn) doesn't return
false until the _entire_ resultset has been retrieved from the
server. And if I skip PQisBusy(conn) and go straight for the
PQgetResult(), it blocks internally until it can complete.
I looked inside libpq, and came up with this instead:
while True: readfds = [PQsocket(conn)] writefds = [] if PQflush(conn): writefds =
readfds if PQconsumeInput(conn): error if PQisBusy(conn): break if
conn->result!= NULL and PQntuples(conn->result) >
retrieved: return conn->result poll(readfds, writefds) last = True return PQgetResult(conn)
where "retrieved" is the number of rows I've examined so far, and
"last" indicates that I shouldn't call again.
which is 1/3rd right:
- It does return results incrementally; good.
- It pokes inside libpq; ugh.
- It doesn't free any memory until the whole query's done. I suppose
I could do that by changing conn->result myself, but...ugh. Is there
a better way?
--
Scott Lamb <http://www.slamb.org/>