Thread: Result Set Cursor Patch

Result Set Cursor Patch

From
Andy Zeneski
Date:
I am not sure if this is the right place, or if the pgsql-patches list
is where this should go, since the CVS repositories are now in
different locations, I hoped this would be the right area.

Attached is a patch against the current CVS head for added support for
cursors in result set.

The reason for this patch I needed to iterate over a very large result
set, at first I was using PGSQL 7.3.x and the query crashed each time.
I believe out of memory errors were the cause. Later I realized that
the entire result set was being sent over, this was several millions of
rows.

I upgraded to PGSQL 7.4.2 since I saw it has support for cursors in
result sets, but it didn't work with my code. The problem was, I would
jump to last() and obtain the row for the total count, then jump back
to first() to begin my iteration. When setting the fetch size to 50, it
would only report 50 rows total. The last(), first(), absolute()
methods only looked at the current 'chunk' of rows. I think this may
have been a bug since later it appears that when scrollable was set it
was supposed to ignore the fetch size and grab all rows. This just will
not do.

This patch implements the absolute() method, using the cursor. The code
which checks for scrollable now creates the cursor with the SCROLL
keyword rather then avoiding cursors all together.

The patch passes the entire test suite, and I added a new test to test
absolute positioning. I am happy to answer any questions, and if anyone
sees any problems with the code please let me know. I think this will
be a great addition to PostgreSQL as I am having users switch to MySQL
MaxDB (SAPDB) due to performance problems. This patch solved a good
number of those problems!

Again, let me know if you see anything wrong and I will try to get it
corrected ASAP so this can get included with the next release!

Thanks,

-Andy


Attachment

Re: Result Set Cursor Patch

From
Kris Jurka
Date:

On Fri, 30 Apr 2004, Andy Zeneski wrote:

> I am not sure if this is the right place, or if the pgsql-patches list
> is where this should go, since the CVS repositories are now in
> different locations, I hoped this would be the right area.

This is the place for all things JDBC.

> Attached is a patch against the current CVS head for added support for
> cursors in result set.
>
> The patch passes the entire test suite, and I added a new test to test
> absolute positioning.

Some more tests for the other wrappers you have made around absolute and
other fetching methods would be nice (previous, relative, last, ...).

> Again, let me know if you see anything wrong and I will try to get it
> corrected ASAP so this can get included with the next release!

This patch does not support updateable ResultSets (deleteRow, insertRow),
but doesn't do anything to prevent a cursor from being used in that case.

Kris Jurka

Re: Result Set Cursor Patch

From
Andy Zeneski
Date:
On Apr 30, 2004, at 11:43 AM, Kris Jurka wrote:

>
> Some more tests for the other wrappers you have made around absolute
> and
> other fetching methods would be nice (previous, relative, last, ...).

Sure, I was thinking of that myself. Since most of the code goes around
absolute() that was the primary one  I wanted to make sure was tested.
But I can add some more.

> This patch does not support updateable ResultSets (deleteRow,
> insertRow),
> but doesn't do anything to prevent a cursor from being used in that
> case.

I don't recall changing this, if it previously prevented the cursor
from being used on update, then it should still be the same. I will
look into this and get it fixed. I don't think PostgreSQL supports
updateable cursors, so should the JDBC driver throw an exception if you
try to use updateable with a fetch size > 0, or should it avoid the
cursor and hope it can handle the results?

-Andy

Attachment

Re: Result Set Cursor Patch

From
Andy Zeneski
Date:
Kris,

Does JDBC1 support CONCUR_UPDATABLE or just CONCUR_READ_ONLY?

-Andy

On Apr 30, 2004, at 11:43 AM, Kris Jurka wrote:

> This patch does not support updateable ResultSets (deleteRow,
> insertRow),
> but doesn't do anything to prevent a cursor from being used in that
> case.
>
> Kris Jurka
>

Attachment

Re: Result Set Cursor Patch

From
Kris Jurka
Date:

On Fri, 30 Apr 2004, Andy Zeneski wrote:
>
> On Apr 30, 2004, at 11:43 AM, Kris Jurka wrote:
>
> > This patch does not support updateable ResultSets (deleteRow,
> > insertRow),
> > but doesn't do anything to prevent a cursor from being used in that
> > case.
>
> I don't recall changing this, if it previously prevented the cursor
> from being used on update, then it should still be the same. I will
> look into this and get it fixed. I don't think PostgreSQL supports
> updateable cursors, so should the JDBC driver throw an exception if you
> try to use updateable with a fetch size > 0, or should it avoid the
> cursor and hope it can handle the results?
>

You didn't, but before a scrollable updateable resultset would fetch all
the rows so it could manipulate them.  With a cursor based method you will
be discarding the changes when moving to a new block and when refetching
you won't see them because the cursor won't pick up the changes.  The
scrollable updateable case must fall back to fetching all rows instead of
throwing an exception because so many people are already using it.

Kris Jurka

Re: Result Set Cursor Patch

From
Kris Jurka
Date:

On Fri, 30 Apr 2004, Andy Zeneski wrote:

> Kris,
>
> Does JDBC1 support CONCUR_UPDATABLE or just CONCUR_READ_ONLY?

It doesn't know the difference.  These constants were only introduced in
JDBC2, but all of ResultSet.updateXXX methods are JDBC2 so effectively all
JDBC1 resultsets are readonly.

Kris Jurka


Re: Result Set Cursor Patch

From
Andy Zeneski
Date:
We are talking about concurrency right, not scroll type. The problem
will be not when the result set is scrollable, but rather then the
concurrency is updatable. Is this correct? If so, I have this fixed, I
am adding some test cases now and will send over a new patch.

-Andy

On Apr 30, 2004, at 12:18 PM, Kris Jurka wrote:

>
>
> You didn't, but before a scrollable updateable resultset would fetch
> all
> the rows so it could manipulate them.  With a cursor based method you
> will
> be discarding the changes when moving to a new block and when
> refetching
> you won't see them because the cursor won't pick up the changes.  The
> scrollable updateable case must fall back to fetching all rows instead
> of
> throwing an exception because so many people are already using it.
>
> Kris Jurka

Attachment

Re: Result Set Cursor Patch

From
Kris Jurka
Date:

On Fri, 30 Apr 2004, Andy Zeneski wrote:

> We are talking about concurrency right, not scroll type. The problem
> will be not when the result set is scrollable, but rather then the
> concurrency is updatable. Is this correct? If so, I have this fixed, I
> am adding some test cases now and will send over a new patch.

Right, well it's both because you can have a forward only updateable
resultset use a cursor because it never has to refetch, but the scrollable
case causes problems.

Kris Jurka

Re: Result Set Cursor Patch

From
Andy Zeneski
Date:
I expect this should take care of that issue.

-Andy



On Apr 30, 2004, at 12:29 PM, Kris Jurka wrote:

>>
>> concurrency is updatable. Is this correct? If so, I have this fixed, I
>> am adding some test cases now and will send over a new patch.
>
> Right, well it's both because you can have a forward only updateable
> resultset use a cursor because it never has to refetch, but the
> scrollable
> case causes problems.
>
> Kris Jurka

Attachment

Re: Result Set Cursor Patch

From
Andy Zeneski
Date:
Please let me know if there is anything else which would prevent this
from getting into CVS. Also, if all is good, could you please let me
know when has been committed so I can update my local tree.

-Andy

On Apr 30, 2004, at 12:58 PM, Andy Zeneski wrote:

> I expect this should take care of that issue.
>
> -Andy
>
> <pgjdbc-cursor.patch.gz>
>
> On Apr 30, 2004, at 12:29 PM, Kris Jurka wrote:
>
>>>
>>> concurrency is updatable. Is this correct? If so, I have this fixed,
>>> I
>>> am adding some test cases now and will send over a new patch.
>>
>> Right, well it's both because you can have a forward only updateable
>> resultset use a cursor because it never has to refetch, but the
>> scrollable
>> case causes problems.
>>
>> Kris Jurka

Attachment

Re: Result Set Cursor Patch

From
Oliver Jowett
Date:
Andy Zeneski wrote:
> Please let me know if there is anything else which would prevent this
> from getting into CVS. Also, if all is good, could you please let me
> know when has been committed so I can update my local tree.

A few comments from looking at the patch..

resultSetSize() does a number of FETCH ABSOLUTEs, throwing away the
results, and then backtracking when it reaches a row that doesn't exist.
A better strategy might be to use MOVE FORWARD ALL and examine the
rowcount in the returned command tag.

fetchAbsolute() and things that end up calling it doesn't seem to
respect the fetchsize, i.e. you always end up with a single-row
resultset in memory. How about executing "MOVE ABSOLUTE n; FETCH FORWARD
fetchsize" instead of just "FETCH ABSOLUTE n"?

How does the performance of iterating backwards through a resultset
compare with the non-cursor case or the forward iteration case? It seems
like with the patch it will end up doing a FETCH ABSOLUTE of a single
row on each iteration. Really fetchAbsolute needs to do either a "MOVE
ABSOLUTE n; FETCH FORWARD fetchsize" or "MOVE ABSOLUTE n; FETCH BACKWARD
fetchsize" depending on the resultset's preferred fetch direction (see
setFetchDirection)

I have a number of code style comments too, let me know if you'd like
the gory details :)

-O

Re: Result Set Cursor Patch

From
Andy Zeneski
Date:
>
> resultSetSize() does a number of FETCH ABSOLUTEs, throwing away the
> results, and then backtracking when it reaches a row that doesn't
> exist. A better strategy might be to use MOVE FORWARD ALL and examine
> the rowcount in the returned command tag.
>

Nice. I didn't know about MOVE. I have re-coded the size method to use
this pattern. Much better. I wish I had know about that before, I only
use SQL92 standard in my code but this is perfect for here. Thanks!

> fetchAbsolute() and things that end up calling it doesn't seem to
> respect the fetchsize, i.e. you always end up with a single-row
> resultset in memory. How about executing "MOVE ABSOLUTE n; FETCH
> FORWARD fetchsize" instead of just "FETCH ABSOLUTE n"?
>

Now that I have code in place for MOVE, this will be simple to
implement.

> How does the performance of iterating backwards through a resultset
> compare with the non-cursor case or the forward iteration case? It
> seems like with the patch it will end up doing a FETCH ABSOLUTE of a
> single row on each iteration. Really fetchAbsolute needs to do either
> a "MOVE ABSOLUTE n; FETCH FORWARD fetchsize" or "MOVE ABSOLUTE n;
> FETCH BACKWARD fetchsize" depending on the resultset's preferred fetch
> direction (see setFetchDirection)
>

Okay, now I must ask for some help. In the case that the direction is
reverse, does that mean that the pointer should position itself at the
last record at the beginning? What about unknown, should that default
to forward?

Also, when in reverse mode should next() still go forward, or should
everything be reversed? Meaning, next() would go backwards and
previous() would go forwards?

> I have a number of code style comments too, let me know if you'd like
> the gory details :)
>

I usually stick to the Sun standards with 4 spaces for tabs. I noticed
that in this code there were a lot of different styles being used,
probably from different people making changes. I assumed that a
convention was not official. If there is a certain style that you would
like me to use let me know, if you like I can let IDEA just reformat
everything based on Sun standards, but that would be a big patch.

-Andy

Attachment

Re: Result Set Cursor Patch

From
Oliver Jowett
Date:
Andy Zeneski wrote:

>> fetchAbsolute() and things that end up calling it doesn't seem to
>> respect the fetchsize, i.e. you always end up with a single-row
>> resultset in memory. How about executing "MOVE ABSOLUTE n; FETCH
>> FORWARD fetchsize" instead of just "FETCH ABSOLUTE n"?
>>
>
> Now that I have code in place for MOVE, this will be simple to implement.

I realized there's an off-by-one error in my example above (I think!) --
as the FETCH will start with the next row after 'n'. Something to watch for.

>> How does the performance of iterating backwards through a resultset
>> compare with the non-cursor case or the forward iteration case? It
>> seems like with the patch it will end up doing a FETCH ABSOLUTE of a
>> single row on each iteration. Really fetchAbsolute needs to do either
>> a "MOVE ABSOLUTE n; FETCH FORWARD fetchsize" or "MOVE ABSOLUTE n;
>> FETCH BACKWARD fetchsize" depending on the resultset's preferred fetch
>> direction (see setFetchDirection)
>>
>
> Okay, now I must ask for some help. In the case that the direction is
> reverse, does that mean that the pointer should position itself at the
> last record at the beginning? What about unknown, should that default to
> forward?

What I mean is that if you've set the fetch direction to backwards, the
driver should probably fetch a block *ending* at the row that it wants
to fetch but doesn't currently have in memory, rather than a block
starting at that row. FETCH BACKWARD is one way of doing this (you'll
need to reverse the order of rows returned though).

i.e. with fetchsize 5 and FETCH_FORWARD we fetch this block:

   10  <= desired row
   11
   12
   13
   14

With FETCH_REVERSE we should instead fetch this block:

   6
   7
   8
   9
   10  <= desired row

After thinking about this a bit it's probably simpler to use a FETCH
FORWARD to get a block of 'fetchsize' rows starting at max(1, row -
fetchsize + 1), i.e. fetch forward from row 6 in the above example.

> Also, when in reverse mode should next() still go forward, or should
> everything be reversed? Meaning, next() would go backwards and
> previous() would go forwards?

setFetchDirection() lets the application provide a hint about the likely
access pattern; the actual meaning of all the resultset positioning
operations are unchanged. it's just there to help the driver load rows
efficiently.

The javadoc says:

> public static final int FETCH_REVERSE
>
> The constant indicating that the rows in a result set will be processed
> in a reverse direction; last-to-first. This constant is used by the
> method setFetchDirection  as a hint to the driver, which the driver may
> ignore.

-O