Thread: Not fetching the entire result set when using holdable cursors

Not fetching the entire result set when using holdable cursors

From
Arie Ozarov
Date:
Hi,

Are there any plans to support holdable cursors without loading the entire data-set (which can potentially be very big)? 
The database seems to support it using holdable cursors.
My usage scenario is as follow:
I have many threads that share the same connection. They use different statements and iterate their own result sets.
It is very likely for threads to process their result set concurrently. All statements are running queries only.
We are running with auto-commit false as we don't want to load the entire data-set in memory. 
Every so often a thread will commit the transaction as it appears that running the transaction for too long slows down the process (DB?).
Upon commit any other thread that is currently iterating its result set using the same connection will get an exception suggesting the cursor is invalid as the transaction was committed.
We thought to solve this issue by using ResultSet.HOLD_CURSORS_OVER_COMMIT but as it appears that will trigger the driver to load the whole result set in memory (similar to autoCommit true).
Any suggestions?

Also how effective is connection multiplexing and processing/reading from multiple result set in parallel?

BTW, we noticed that all types are returned to the driver as strings and then converted to the specific type upon request (e.g. getInt). Any reason for that?

Some information:
driver: postgresql-8.2-506.jdbc3.jar (PostgreSQL 8.2 JDBC3 with SSL (build 506))
server: 8.2.4
OS: Linux XXX 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux

Thanks,
Arie.



Re: Not fetching the entire result set when using holdable cursors

From
Kris Jurka
Date:

On Wed, 14 May 2008, Arie Ozarov wrote:

> Are there any plans to support holdable cursors without loading the
> entire data-set (which can potentially be very big)?

There are no immediate plans.  Right now the way cursors are created on
the server is via the frontend/backend protocol not a DECLARE command.
The protocol level cursor creation does not allow specifying options like
holdable, so the driver can't do this.  The server folks aren't ready for
a new protocol revision yet, so we're in a little bit of a holding pattern
for the moment.

> Also how effective is connection multiplexing and processing/reading
> from multiple result set in parallel?

Only one thread can be fetching data from the backend at a time.  With an
appropriately set fetchsize, you should hopefully get enough data to spend
some time processing it instead of waiting for your turn to talk to the
server.  It all depends on how much processing you do per row.  If you're
fetching as fast as possible, using multiple threads won't help much.

> BTW, we noticed that all types are returned to the driver as strings and
> then converted to the specific type upon request (e.g. getInt). Any
> reason for that?

The problem with binary format is that you might not understand it
(especially considering user defined types).  So you can't ask for all
data to be returned as binary.  You must know what the data type to be
returned is to know if you can handle it as binary.  To get the type
information prior to fetching the data requires an extra network roundtrip
(at least for the first execution).  There is an outstanding patch to do
binary transfer when possible, but it has not been applied yet.

Kris Jurka