Incremental results from libpq - Mailing list pgsql-interfaces

From Scott Lamb
Subject Incremental results from libpq
Date
Msg-id 4D426B54-E08F-4DE9-9776-D46610CA4968@slamb.org
Whole thread Raw
Responses Re: Incremental results from libpq  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-interfaces
I'm using libpq's asynchronous interface. I'd like to do the  
equivalent of setting fetchSize on the JDBC driver - get a chunk of  
results, handle them, free the memory they use, and then get another  
until I've handled an entire query.

I can do this at the SQL layer using "declare cursor ..." and "fetch  
forward n ..." but it seems like the lower level should be able to do  
this for me. It'd also let me have a more natural interface that (A)  
doesn't make the caller take a PostgreSQL-specific declare/fetch path  
for each query (B) can still use the JDBC-style "execute" that  
doesn't care if it's dealing with a row-returning statement.

I see that JDBC driver (at least in protocol version 3; I don't care  
about 2) does this by passing a maximum number of rows when sending  
Execute, then handling PortalSuspended and Execute again. I also see  
that libpq never sends a maximum number of rows or handles  
PortalSuspended.

Still, I think it should be able to do what I want. The results are  
sent from the database in order. This message type would be necessary  
to ensure the database sends no more than N rows, but it's not  
necessary to ensure the client handles N rows as soon as it has them.

I had been retrieving results from a query in this fashion:
    while True:        readfds = [PQsocket(conn)]        writefds = []        if PQflush(conn):            writefds =
readfds       if PQconsumeInput(conn):            error        if not PQisBusy(conn):            break
poll(readfds,writefds)    return PQgetResult(conn)
 

which the documentation recommends. But PQisBusy(conn) doesn't return  
false until the _entire_ resultset has been retrieved from the  
server. And if I skip PQisBusy(conn) and go straight for the  
PQgetResult(), it blocks internally until it can complete.

I looked inside libpq, and came up with this instead:
    while True:        readfds = [PQsocket(conn)]        writefds = []        if PQflush(conn):            writefds =
readfds       if PQconsumeInput(conn):            error        if PQisBusy(conn):            break        if
conn->result!= NULL and PQntuples(conn->result) >  
 
retrieved:            return conn->result        poll(readfds, writefds)    last = True    return PQgetResult(conn)

where "retrieved" is the number of rows I've examined so far, and  
"last" indicates that I shouldn't call again.

which is 1/3rd right:

- It does return results incrementally; good.
- It pokes inside libpq; ugh.
- It doesn't free any memory until the whole query's done. I suppose  
I could do that by changing conn->result myself, but...ugh. Is there  
a better way?

-- 
Scott Lamb <http://www.slamb.org/>




pgsql-interfaces by date:

Previous
From: Michael Fuhr
Date:
Subject: Re: PLPythonU & Out of Memory - Importing Query
Next
From: Tom Lane
Date:
Subject: Re: Incremental results from libpq