Thread: Result Set Cursor Patch
I am not sure if this is the right place, or if the pgsql-patches list is where this should go, since the CVS repositories are now in different locations, I hoped this would be the right area. Attached is a patch against the current CVS head for added support for cursors in result set. The reason for this patch I needed to iterate over a very large result set, at first I was using PGSQL 7.3.x and the query crashed each time. I believe out of memory errors were the cause. Later I realized that the entire result set was being sent over, this was several millions of rows. I upgraded to PGSQL 7.4.2 since I saw it has support for cursors in result sets, but it didn't work with my code. The problem was, I would jump to last() and obtain the row for the total count, then jump back to first() to begin my iteration. When setting the fetch size to 50, it would only report 50 rows total. The last(), first(), absolute() methods only looked at the current 'chunk' of rows. I think this may have been a bug since later it appears that when scrollable was set it was supposed to ignore the fetch size and grab all rows. This just will not do. This patch implements the absolute() method, using the cursor. The code which checks for scrollable now creates the cursor with the SCROLL keyword rather then avoiding cursors all together. The patch passes the entire test suite, and I added a new test to test absolute positioning. I am happy to answer any questions, and if anyone sees any problems with the code please let me know. I think this will be a great addition to PostgreSQL as I am having users switch to MySQL MaxDB (SAPDB) due to performance problems. This patch solved a good number of those problems! Again, let me know if you see anything wrong and I will try to get it corrected ASAP so this can get included with the next release! Thanks, -Andy
Attachment
On Fri, 30 Apr 2004, Andy Zeneski wrote: > I am not sure if this is the right place, or if the pgsql-patches list > is where this should go, since the CVS repositories are now in > different locations, I hoped this would be the right area. This is the place for all things JDBC. > Attached is a patch against the current CVS head for added support for > cursors in result set. > > The patch passes the entire test suite, and I added a new test to test > absolute positioning. Some more tests for the other wrappers you have made around absolute and other fetching methods would be nice (previous, relative, last, ...). > Again, let me know if you see anything wrong and I will try to get it > corrected ASAP so this can get included with the next release! This patch does not support updateable ResultSets (deleteRow, insertRow), but doesn't do anything to prevent a cursor from being used in that case. Kris Jurka
On Apr 30, 2004, at 11:43 AM, Kris Jurka wrote: > > Some more tests for the other wrappers you have made around absolute > and > other fetching methods would be nice (previous, relative, last, ...). Sure, I was thinking of that myself. Since most of the code goes around absolute() that was the primary one I wanted to make sure was tested. But I can add some more. > This patch does not support updateable ResultSets (deleteRow, > insertRow), > but doesn't do anything to prevent a cursor from being used in that > case. I don't recall changing this, if it previously prevented the cursor from being used on update, then it should still be the same. I will look into this and get it fixed. I don't think PostgreSQL supports updateable cursors, so should the JDBC driver throw an exception if you try to use updateable with a fetch size > 0, or should it avoid the cursor and hope it can handle the results? -Andy
Attachment
Kris, Does JDBC1 support CONCUR_UPDATABLE or just CONCUR_READ_ONLY? -Andy On Apr 30, 2004, at 11:43 AM, Kris Jurka wrote: > This patch does not support updateable ResultSets (deleteRow, > insertRow), > but doesn't do anything to prevent a cursor from being used in that > case. > > Kris Jurka >
Attachment
On Fri, 30 Apr 2004, Andy Zeneski wrote: > > On Apr 30, 2004, at 11:43 AM, Kris Jurka wrote: > > > This patch does not support updateable ResultSets (deleteRow, > > insertRow), > > but doesn't do anything to prevent a cursor from being used in that > > case. > > I don't recall changing this, if it previously prevented the cursor > from being used on update, then it should still be the same. I will > look into this and get it fixed. I don't think PostgreSQL supports > updateable cursors, so should the JDBC driver throw an exception if you > try to use updateable with a fetch size > 0, or should it avoid the > cursor and hope it can handle the results? > You didn't, but before a scrollable updateable resultset would fetch all the rows so it could manipulate them. With a cursor based method you will be discarding the changes when moving to a new block and when refetching you won't see them because the cursor won't pick up the changes. The scrollable updateable case must fall back to fetching all rows instead of throwing an exception because so many people are already using it. Kris Jurka
On Fri, 30 Apr 2004, Andy Zeneski wrote: > Kris, > > Does JDBC1 support CONCUR_UPDATABLE or just CONCUR_READ_ONLY? It doesn't know the difference. These constants were only introduced in JDBC2, but all of ResultSet.updateXXX methods are JDBC2 so effectively all JDBC1 resultsets are readonly. Kris Jurka
We are talking about concurrency right, not scroll type. The problem will be not when the result set is scrollable, but rather then the concurrency is updatable. Is this correct? If so, I have this fixed, I am adding some test cases now and will send over a new patch. -Andy On Apr 30, 2004, at 12:18 PM, Kris Jurka wrote: > > > You didn't, but before a scrollable updateable resultset would fetch > all > the rows so it could manipulate them. With a cursor based method you > will > be discarding the changes when moving to a new block and when > refetching > you won't see them because the cursor won't pick up the changes. The > scrollable updateable case must fall back to fetching all rows instead > of > throwing an exception because so many people are already using it. > > Kris Jurka
Attachment
On Fri, 30 Apr 2004, Andy Zeneski wrote: > We are talking about concurrency right, not scroll type. The problem > will be not when the result set is scrollable, but rather then the > concurrency is updatable. Is this correct? If so, I have this fixed, I > am adding some test cases now and will send over a new patch. Right, well it's both because you can have a forward only updateable resultset use a cursor because it never has to refetch, but the scrollable case causes problems. Kris Jurka
I expect this should take care of that issue. -Andy On Apr 30, 2004, at 12:29 PM, Kris Jurka wrote: >> >> concurrency is updatable. Is this correct? If so, I have this fixed, I >> am adding some test cases now and will send over a new patch. > > Right, well it's both because you can have a forward only updateable > resultset use a cursor because it never has to refetch, but the > scrollable > case causes problems. > > Kris Jurka
Attachment
Please let me know if there is anything else which would prevent this from getting into CVS. Also, if all is good, could you please let me know when has been committed so I can update my local tree. -Andy On Apr 30, 2004, at 12:58 PM, Andy Zeneski wrote: > I expect this should take care of that issue. > > -Andy > > <pgjdbc-cursor.patch.gz> > > On Apr 30, 2004, at 12:29 PM, Kris Jurka wrote: > >>> >>> concurrency is updatable. Is this correct? If so, I have this fixed, >>> I >>> am adding some test cases now and will send over a new patch. >> >> Right, well it's both because you can have a forward only updateable >> resultset use a cursor because it never has to refetch, but the >> scrollable >> case causes problems. >> >> Kris Jurka
Attachment
Andy Zeneski wrote: > Please let me know if there is anything else which would prevent this > from getting into CVS. Also, if all is good, could you please let me > know when has been committed so I can update my local tree. A few comments from looking at the patch.. resultSetSize() does a number of FETCH ABSOLUTEs, throwing away the results, and then backtracking when it reaches a row that doesn't exist. A better strategy might be to use MOVE FORWARD ALL and examine the rowcount in the returned command tag. fetchAbsolute() and things that end up calling it doesn't seem to respect the fetchsize, i.e. you always end up with a single-row resultset in memory. How about executing "MOVE ABSOLUTE n; FETCH FORWARD fetchsize" instead of just "FETCH ABSOLUTE n"? How does the performance of iterating backwards through a resultset compare with the non-cursor case or the forward iteration case? It seems like with the patch it will end up doing a FETCH ABSOLUTE of a single row on each iteration. Really fetchAbsolute needs to do either a "MOVE ABSOLUTE n; FETCH FORWARD fetchsize" or "MOVE ABSOLUTE n; FETCH BACKWARD fetchsize" depending on the resultset's preferred fetch direction (see setFetchDirection) I have a number of code style comments too, let me know if you'd like the gory details :) -O
> > resultSetSize() does a number of FETCH ABSOLUTEs, throwing away the > results, and then backtracking when it reaches a row that doesn't > exist. A better strategy might be to use MOVE FORWARD ALL and examine > the rowcount in the returned command tag. > Nice. I didn't know about MOVE. I have re-coded the size method to use this pattern. Much better. I wish I had know about that before, I only use SQL92 standard in my code but this is perfect for here. Thanks! > fetchAbsolute() and things that end up calling it doesn't seem to > respect the fetchsize, i.e. you always end up with a single-row > resultset in memory. How about executing "MOVE ABSOLUTE n; FETCH > FORWARD fetchsize" instead of just "FETCH ABSOLUTE n"? > Now that I have code in place for MOVE, this will be simple to implement. > How does the performance of iterating backwards through a resultset > compare with the non-cursor case or the forward iteration case? It > seems like with the patch it will end up doing a FETCH ABSOLUTE of a > single row on each iteration. Really fetchAbsolute needs to do either > a "MOVE ABSOLUTE n; FETCH FORWARD fetchsize" or "MOVE ABSOLUTE n; > FETCH BACKWARD fetchsize" depending on the resultset's preferred fetch > direction (see setFetchDirection) > Okay, now I must ask for some help. In the case that the direction is reverse, does that mean that the pointer should position itself at the last record at the beginning? What about unknown, should that default to forward? Also, when in reverse mode should next() still go forward, or should everything be reversed? Meaning, next() would go backwards and previous() would go forwards? > I have a number of code style comments too, let me know if you'd like > the gory details :) > I usually stick to the Sun standards with 4 spaces for tabs. I noticed that in this code there were a lot of different styles being used, probably from different people making changes. I assumed that a convention was not official. If there is a certain style that you would like me to use let me know, if you like I can let IDEA just reformat everything based on Sun standards, but that would be a big patch. -Andy
Attachment
Andy Zeneski wrote: >> fetchAbsolute() and things that end up calling it doesn't seem to >> respect the fetchsize, i.e. you always end up with a single-row >> resultset in memory. How about executing "MOVE ABSOLUTE n; FETCH >> FORWARD fetchsize" instead of just "FETCH ABSOLUTE n"? >> > > Now that I have code in place for MOVE, this will be simple to implement. I realized there's an off-by-one error in my example above (I think!) -- as the FETCH will start with the next row after 'n'. Something to watch for. >> How does the performance of iterating backwards through a resultset >> compare with the non-cursor case or the forward iteration case? It >> seems like with the patch it will end up doing a FETCH ABSOLUTE of a >> single row on each iteration. Really fetchAbsolute needs to do either >> a "MOVE ABSOLUTE n; FETCH FORWARD fetchsize" or "MOVE ABSOLUTE n; >> FETCH BACKWARD fetchsize" depending on the resultset's preferred fetch >> direction (see setFetchDirection) >> > > Okay, now I must ask for some help. In the case that the direction is > reverse, does that mean that the pointer should position itself at the > last record at the beginning? What about unknown, should that default to > forward? What I mean is that if you've set the fetch direction to backwards, the driver should probably fetch a block *ending* at the row that it wants to fetch but doesn't currently have in memory, rather than a block starting at that row. FETCH BACKWARD is one way of doing this (you'll need to reverse the order of rows returned though). i.e. with fetchsize 5 and FETCH_FORWARD we fetch this block: 10 <= desired row 11 12 13 14 With FETCH_REVERSE we should instead fetch this block: 6 7 8 9 10 <= desired row After thinking about this a bit it's probably simpler to use a FETCH FORWARD to get a block of 'fetchsize' rows starting at max(1, row - fetchsize + 1), i.e. fetch forward from row 6 in the above example. > Also, when in reverse mode should next() still go forward, or should > everything be reversed? Meaning, next() would go backwards and > previous() would go forwards? setFetchDirection() lets the application provide a hint about the likely access pattern; the actual meaning of all the resultset positioning operations are unchanged. it's just there to help the driver load rows efficiently. The javadoc says: > public static final int FETCH_REVERSE > > The constant indicating that the rows in a result set will be processed > in a reverse direction; last-to-first. This constant is used by the > method setFetchDirection as a hint to the driver, which the driver may > ignore. -O