Thread: get row count from a "cursor resultset"
Hello! I'm querying my database and sometimes get a very big resultset back, so I needed to use a resultset with a cursor to avoid a OutOfMemoryError. Now, because I have to use a TYPE_FORWARD_ONLY resultset for that, I can not use the result.last() function anymore to retrieve the amount of rows the query returns. The query is a little bit complicated and that for can take some time, so I don't want to do the "SELECT COUNT(*) FROM (SELECT...)" afterward, what would result in doing the query twice and there for would double the time needed. I didn't find a solution for this, but there must be, because the pgAdmin III - tool does this. After starting the query in the SQL-Dialog it asks wether to retrieve just the first 100 results or all results displaying the corect amount of found rows. Or is this just not possible with jdbc. Thanks for any help Maik
Maik Wiege wrote: > Hello! > I'm querying my database and sometimes get a very big resultset back, so > I needed to use a resultset with a cursor to avoid a OutOfMemoryError. > > Now, because I have to use a TYPE_FORWARD_ONLY resultset for that, I can > not use the result.last() function anymore to retrieve the amount of > rows the query returns. > > The query is a little bit complicated and that for can take some time, > so I don't want to do the "SELECT COUNT(*) FROM (SELECT...)" afterward, > what would result in doing the query twice and there for would double > the time needed. Have you tried using COUNT directly in your original query (not as a separate query)? I wouldn't expect that to affect total execution time much if at all; the main difference will be that the full query has to complete before you get any results, while without the COUNT you might start getting results immediately. There's not really any other option that the current driver code will do for you. You could DECLARE CURSOR / MOVE / FETCH yourself perhaps; I'd expect this to have similar startup characteristics to the COUNT(*) case though. Even if the driver supported cursor-based random-access resultsets, it'd be using MOVE / FETCH under the covers in the same way.. > I didn't find a solution for this, but there must be, because the pgAdmin III - tool does this. After starting the queryin the SQL-Dialog it asks wether to retrieve just the first 100 results or all results displaying the corect amountof found rows. Or is this just not possible with jdbc. I'd point out that this does not require knowing exactly how many rows there are -- only that there are more than 100. You can do the equivalent in JDBC easily by setting fetchsize to 100 or so and calling next() 100 times.. if you haven't hit the end of the resultset by that point, it has more than 100 rows. -O
Oliver Jowett wrote: > Maik Wiege wrote: > >>I didn't find a solution for this, but there must be, because the pgAdmin III - tool does this. After starting the queryin the SQL-Dialog it asks wether to retrieve just the first 100 results or all results displaying the corect amountof found rows. Or is this just not possible with jdbc. > > > I'd point out that this does not require knowing exactly how many rows > there are -- only that there are more than 100. You can do the > equivalent in JDBC easily by setting fetchsize to 100 or so and calling > next() 100 times.. if you haven't hit the end of the resultset by that > point, it has more than 100 rows. Oh, sorry, my mistake -- I misread your description. Perhaps they are using either COUNT(*) in the same query or MOVE to get the number of rows. -O
Oliver Jowett wrote: > Oliver Jowett wrote: > >>Maik Wiege wrote: >> >> >>>I didn't find a solution for this, but there must be, because the pgAdmin III - tool does this. After starting the queryin the SQL-Dialog it asks wether to retrieve just the first 100 results or all results displaying the corect amountof found rows. Or is this just not possible with jdbc. >> >> >>I'd point out that this does not require knowing exactly how many rows >>there are -- only that there are more than 100. You can do the >>equivalent in JDBC easily by setting fetchsize to 100 or so and calling >>next() 100 times.. if you haven't hit the end of the resultset by that >>point, it has more than 100 rows. > > > Oh, sorry, my mistake -- I misread your description. Perhaps they are > using either COUNT(*) in the same query or MOVE to get the number of rows. Ok, some digging in pgAdmin's svn web interface seems to indicate that pgadmin is not actually doing anything special here -- it does the full query via libpq's PQexec(), retrieving all the tuples (and hence can count them) but only does the work to construct a user interface displaying them all after that "lots of rows, retrieve anyway?" dialog has been answered. You could check this by looking at pgadmin's memory footprint while retrieving a big resultset. -O
Oliver Jowett wrote: > Ok, some digging in pgAdmin's svn web interface seems to indicate that > pgadmin is not actually doing anything special here -- it does the full > query via libpq's PQexec(), retrieving all the tuples (and hence can > count them) but only does the work to construct a user interface > displaying them all after that "lots of rows, retrieve anyway?" dialog > has been answered. > > You could check this by looking at pgadmin's memory footprint while > retrieving a big resultset. Thank you for your answers! Yes, this explains it, it is realy using much memory. But what can I do about my problem? Isn't this a very common problem? Can't imagine any user querying a database and does not want to be informed how many results he got... Thanks for any help Maik
Maik Wiege wrote: > But what can I do about my problem? Isn't this a very > common problem? Can't imagine any user querying a database and does not > want to be informed how many results he got... Well, as I suggested before, use DECLARE CURSOR / MOVE / FETCH, or include a COUNT(*) column in your original query. -O
Oliver Jowett schrieb: > Maik Wiege wrote: > > >>But what can I do about my problem? Isn't this a very >>common problem? Can't imagine any user querying a database and does not >>want to be informed how many results he got... > > > Well, as I suggested before, use DECLARE CURSOR / MOVE / FETCH, or > include a COUNT(*) column in your original query. > Hi! Could you give me an example how to do it? I can't fugure it out. If I put the COUNT(*) in my query: SELECT DISTINCT REGISTER.pd.pdkz, COUNT(*) FROM REGISTER.pd JOIN REGISTER.vornamen ON REGISTER.vornamen.pdkz = REGISTER.pd.pdkz the sql error comes up: ERROR: column "pd.pdkz" must appear in the GROUP BY clause or be used in an aggregate function Thanks for any help Maik
Oliver Jowett schrieb: > Try COUNT(distinct REGISTER.pd.pdkz) perhaps Hi! SELECT COUNT(DISTINCT REGISTER.pd.pdkz) FROM REGISTER.pd JOIN REGISTER.vornamen ON REGISTER.vornamen.pdkz = REGISTER.pd.pdkz works, but with that of cource I don't get the data, just the amount of rows, so I would have to call it another time without COUNT to get the data. A query like that SELECT COUNT(DISTINCT REGISTER.pd.pdkz), DISTINCT REGISTER.pd.pdkz FROM REGISTER.pd JOIN REGISTER.vornamen ON REGISTER.vornamen.pdkz = REGISTER.pd.pdkz gives the same mentioned error: column "pd.pdkz" must appear in the GROUP BY clause or be used in an aggregate function Any ideas anyone?
Maik, Your only choice may be to do two separate queries. This isn't quite as bad as you think, as the count will load the buffers with your query If you need exactly the right amount, make sure you do this in a transaction. Dave On 7-Jul-05, at 7:29 AM, Maik Wiege wrote: > Oliver Jowett schrieb: > >> Try COUNT(distinct REGISTER.pd.pdkz) perhaps >> > Hi! > SELECT COUNT(DISTINCT REGISTER.pd.pdkz) > FROM REGISTER.pd JOIN > REGISTER.vornamen ON REGISTER.vornamen.pdkz = REGISTER.pd.pdkz > > works, but with that of cource I don't get the data, just the > amount of rows, so I would have to call it another time without > COUNT to get the data. > > A query like that > SELECT COUNT(DISTINCT REGISTER.pd.pdkz), DISTINCT REGISTER.pd.pdkz > FROM REGISTER.pd JOIN > REGISTER.vornamen ON REGISTER.vornamen.pdkz = REGISTER.pd.pdkz > gives the same mentioned error: column "pd.pdkz" must appear in the > GROUP BY clause or be used in an aggregate function > > Any ideas anyone? > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > >