Thread: not fetching all query results

not fetching all query results

From
Radim Kolar
Date:
Any plans to work on these todo issues:
  • [] Allow scrollable ResultSets to not fetch all results in one batch.
  • [] Allow refcursor ResultSets to not fetch all results in one batch.

This problems are there for a very long time (5+ years), it should not be that difficult to get it fixed. I am not interested in workarounds.

Re: not fetching all query results

From
Craig Ringer
Date:
On 07/17/2012 01:21 AM, Radim Kolar wrote:
Any plans to work on these todo issues:
  • [] Allow scrollable ResultSets to not fetch all results in one batch.
  • [] Allow refcursor ResultSets to not fetch all results in one batch.

This problems are there for a very long time (5+ years), it should not be that difficult to get it fixed. I am not interested in workarounds.


Are you willing to step up and help improve PgJDBC, or to fund someone else so they can take time away from paying work to do so?

--
Craig Ringer

Re: not fetching all query results

From
Radim Kolar
Date:
> Are you willing to step up and help improve PgJDBC, or to fund someone
> else so they can take time away from paying work to do so?
How much it will cost me?

Re: not fetching all query results

From
Dave Cramer
Date:
Radim,

What exactly are you looking to be done ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jul 17, 2012 at 7:45 AM, Radim Kolar <hsn@filez.com> wrote:
>
>> Are you willing to step up and help improve PgJDBC, or to fund someone
>> else so they can take time away from paying work to do so?
>
> How much it will cost me?
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc

Re: not fetching all query results

From
Radim Kolar
Date:
Dne 17.7.2012 17:09, Dave Cramer napsal(a):
Radim,

What exactly are you looking to be done ?
working setFetchSize() with some sensible default

Re: not fetching all query results

From
Dave Cramer
Date:
Radim,

Reading the code it would appear that setFetchSize is implemented
other than the sensible default.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jul 17, 2012 at 11:33 AM, Radim Kolar <hsn@filez.com> wrote:
> Dne 17.7.2012 17:09, Dave Cramer napsal(a):
>
> Radim,
>
> What exactly are you looking to be done ?
>
> working setFetchSize() with some sensible default

Re: not fetching all query results

From
Radim Kolar
Date:
setFetchSize is defined there:

https://github.com/pgjdbc/pgjdbc/blob/master/org/postgresql/jdbc2/AbstractJdbc2Statement.java

but i havent checked if this value is really used for something. But
probably not because its OOM here if result set is over 30k rows.

Re: not fetching all query results

From
Dave Cramer
Date:
Well the default is 0 which means that it is off, if you want to
activate it then set it to something.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jul 17, 2012 at 2:14 PM, Radim Kolar <hsn@filez.com> wrote:
>
> setFetchSize is defined there:
>
> https://github.com/pgjdbc/pgjdbc/blob/master/org/postgresql/jdbc2/AbstractJdbc2Statement.java
>
> but i havent checked if this value is really used for something. But
> probably not because its OOM here if result set is over 30k rows.
>
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc

Re: not fetching all query results

From
Dave Cramer
Date:
Further more the spec actually says the default is 0

Gives the JDBC driver a hint as to the number of rows that should be
fetched from the database when more rows are needed for ResultSet
objects genrated by this Statement. If the value specified is zero,
then the hint is ignored. The default value is zero.


dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jul 17, 2012 at 2:28 PM, Dave Cramer <pg@fastcrypt.com> wrote:
> Well the default is 0 which means that it is off, if you want to
> activate it then set it to something.
>
> Dave Cramer
>
> dave.cramer(at)credativ(dot)ca
> http://www.credativ.ca
>
>
> On Tue, Jul 17, 2012 at 2:14 PM, Radim Kolar <hsn@filez.com> wrote:
>>
>> setFetchSize is defined there:
>>
>> https://github.com/pgjdbc/pgjdbc/blob/master/org/postgresql/jdbc2/AbstractJdbc2Statement.java
>>
>> but i havent checked if this value is really used for something. But
>> probably not because its OOM here if result set is over 30k rows.
>>
>>
>> --
>> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-jdbc

Re: not fetching all query results

From
Radim Kolar
Date:
with 5m result set you can set fetchsize to any value it has no effect

@GrabConfig(systemClassLoader = true)
@Grab(group='postgresql', module='postgresql', version='8.3-603.jdbc4')
@Grab(group='commons-lang', module='commons-lang', version='2.6')

import groovy.sql.Sql
import org.apache.commons.lang.math.RandomUtils

def sql=Sql.newInstance("jdbc:postgresql://localhost/grails",
"postgres", "postgres")
sql.setCacheStatements(true)

/*
sql.withTransaction {
for (i in 1..5000000 ) {
   sql.executeUpdate("insert into public.test (a) values (?)",
[RandomUtils.nextLong()])
}
}
*/

def c = sql.getConnection()
def s = c.createStatement()
s.setFetchSize(100000)
def r = s.executeQuery("select * from public.test")
def cnt = 0
while (r.next()) {
  cnt++
}

println "${cnt} rows."

sql.close()



Re: not fetching all query results

From
Dave Cramer
Date:
Radim,

As per our docs
http://jdbc.postgresql.org/documentation/91/query.html#fetchsize-example

You need to do the select inside a transaction

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jul 17, 2012 at 3:49 PM, Radim Kolar <hsn@filez.com> wrote:
> with 5m result set you can set fetchsize to any value it has no effect
>
> @GrabConfig(systemClassLoader = true)
> @Grab(group='postgresql', module='postgresql', version='8.3-603.jdbc4')
> @Grab(group='commons-lang', module='commons-lang', version='2.6')
>
> import groovy.sql.Sql
> import org.apache.commons.lang.math.RandomUtils
>
> def sql=Sql.newInstance("jdbc:postgresql://localhost/grails", "postgres",
> "postgres")
> sql.setCacheStatements(true)
>
> /*
> sql.withTransaction {
> for (i in 1..5000000 ) {
>   sql.executeUpdate("insert into public.test (a) values (?)",
> [RandomUtils.nextLong()])
> }
> }
> */
>
> def c = sql.getConnection()
> def s = c.createStatement()
> s.setFetchSize(100000)
> def r = s.executeQuery("select * from public.test")
> def cnt = 0
> while (r.next()) {
>  cnt++
> }
>
> println "${cnt} rows."
>
> sql.close()
>
>

Re: not fetching all query results

From
Radim Kolar
Date:
Dne 17.7.2012 22:05, Dave Cramer napsal(a):
> Radim,
>
> As per our docs
> http://jdbc.postgresql.org/documentation/91/query.html#fetchsize-example
>
> You need to do the select inside a transaction
In transaction that simple example really works. Can it be fixed in
driver to work in autocommit mode? I will try scrollable cursor next.

Re: not fetching all query results

From
Craig Ringer
Date:
On 07/18/2012 05:00 AM, Radim Kolar wrote:
Dne 17.7.2012 22:05, Dave Cramer napsal(a):
Radim,

As per our docs
http://jdbc.postgresql.org/documentation/91/query.html#fetchsize-example

You need to do the select inside a transaction
In transaction that simple example really works. Can it be fixed in driver to work in autocommit mode? I will try scrollable cursor next.

I haven't looked into how the fetch size code works to see whether it's possible for it to operate in autocommit. A quick look at QueryExecutorImpl.sendOneQuery in  org/postgresql/core/v3/QueryExecutorImpl.java finds the line:

  boolean usePortal = (flags & QueryExecutor.QUERY_FORWARD_CURSOR) != 0 && !noResults && !noMeta && fetchSize > 0 && !describeOnly;

given that and the subsequent code I don't understand why an explicit transaction is required. I've only done a tiny bit with the JDBC code, though, and only related to the authentication system and client certificates.



My rather blunt comment to you at the start of this thread was to point out that JDBC driver development doesn't just happen. It's a volunteer effort; people put their own time into this. Something like

    "It should not be that difficult to get it fixed. I am not interested in workarounds"

is pretty dismissive of the work you're already getting for free and without having contributed to it in any way. If you don't like it, one of your options is to do something about it yourself. The PgJDBC driver is open source, so rather than just asserting it can't be that difficult, you have the opportunity to prove your assertion by producing a patch that fixes your problem. Demanding that others spend their time fixing a problem you care about is a bit rude.

You can check the sources out here:

    http://jdbc.postgresql.org/development/git.html

and there's developer info here:

    http://jdbc.postgresql.org/development/intro.html

If you approach this with a the attitude of "this is causing a problem for me and I would love some help fixing it, any suggestions where I should start" you will probably get better results.

--
Craig Ringer

Re: not fetching all query results

From
Kris Jurka
Date:

On 7/19/2012 10:16 PM, Craig Ringer wrote:
> I haven't looked into how the fetch size code works to see whether it's
> possible for it to operate in autocommit.
>
> given that and the subsequent code I don't understand why an explicit
> transaction is required. I've only done a tiny bit with the JDBC code,
> though, and only related to the authentication system and client
> certificates.

A transaction is required to keep the portal (protocol level cursor)
open.  In autocommit mode, each batch fetched runs in its own
transaction.  So after the first batch is fetched, the autocommit
transaction ends and the cursor is destroyed.  So the next batch fetch
cannot work.

For multiple fetches to work in autocommit mode, you need a WITH HOLD
cursor.  This has a performance impact and you cannot create a WITH HOLD
portal at the protocol level.  So making this work is decidedly
non-trivial.  You would need to rework the query processor to rewrite
the SQL to be a CREATE CURSOR statement.

A similar problem occurs when trying to make fetchsize work for
scrollable resultsets.  You can't create a scrollable portal at the
protocol level and you cannot move any direction other than forwards at
the protocol level.

So the JDBC team has been sort of waiting for the frontend/backend
protocol to support these features before supporting them in the drive.

The one thing that could be fixed is making refcursors respect fetchsize
with the same restrictions as the current normal query process.

Kris Jurka

Re: not fetching all query results

From
"ml-tb"
Date:
Am Freitag, 20. Juli 2012 schrieb Craig Ringer:
> On 07/18/2012 05:00 AM, Radim Kolar wrote:
> > Dne 17.7.2012 22:05, Dave Cramer napsal(a):
> >> Radim,
> >>
> >> As per our docs
> >> http://jdbc.postgresql.org/documentation/91/query.html#fetchsize-e
> >> xample
> >>
> >> You need to do the select inside a transaction
> >
> > In transaction that simple example really works. Can it be fixed in
> > driver to work in autocommit mode? I will try scrollable cursor
> > next.
>
> I haven't looked into how the fetch size code works to see whether
> it's possible for it to operate in autocommit. A quick look at
> QueryExecutorImpl.sendOneQuery in
> org/postgresql/core/v3/QueryExecutorImpl.java finds the line:
>
>    boolean usePortal = (flags & QueryExecutor.QUERY_FORWARD_CURSOR)
> != 0 && !noResults && !noMeta && fetchSize > 0 && !describeOnly;
>
> given that and the subsequent code I don't understand why an explicit
> transaction is required. I've only done a tiny bit with the JDBC
> code, though, and only related to the authentication system and
> client certificates.
Perhaps the the original poster is helped with a modified URL, that set
the fetch size default != 0. We have the same issue here, plus that the
used framework resets connections to default values (why ever ...). So
we modified the possibilities of the driver URL:

fixDefaults (true/false; default: false)
defaultFetchSize (int, default: 0)
defaultAutoCommit (true/false; default: true)

> My rather blunt comment to you at the start of this thread was to
> point out that JDBC driver development doesn't just happen. It's a
> volunteer effort; people put their own time into this. Something
> like
>
>      "It should not be that difficult to get it fixed. I am not
> interested in workarounds"
>
> is pretty dismissive of the work you're already getting for free and
> without having contributed to it in any way. If you don't like it,
> one of your options is to do something about it yourself. The PgJDBC
> driver is open source, so rather than just asserting it can't be
> that difficult, you have the opportunity to prove your assertion by
> producing a patch that fixes your problem. Demanding that others
> spend their time fixing a problem you care about is a bit rude.
>
> You can check the sources out here:
>
> http://jdbc.postgresql.org/development/git.html
>
> and there's developer info here:
>
> http://jdbc.postgresql.org/development/intro.html
>
> If you approach this with a the attitude of "this is causing a
> problem for me and I would love some help fixing it, any suggestions
> where I should start" you will probably get better results.
This is obviously true :-)

Bye Thomas

Re: not fetching all query results

From
Radim Kolar
Date:
> A transaction is required to keep the portal (protocol level cursor)
> open.  In autocommit mode, each batch fetched runs in its own
> transaction.  So after the first batch is fetched, the autocommit
> transaction ends and the cursor is destroyed.  So the next batch fetch
> cannot work.
what about to create transaction for each Statement in autocommit mode.
after calling statement.close() close automatically created transaction.

Re: not fetching all query results

From
Radim Kolar
Date:

"It should not be that difficult to get it fixed. I am not interested in workarounds"

is pretty dismissive of the work you're already getting for free and without having contributed to it in any way. If you don't like it, one of your options is to do something about it yourself. The PgJDBC driver is open source, so rather than just asserting it can't be that difficult, you have the opportunity to prove your assertion by producing a patch that fixes your problem. Demanding that others spend their time fixing a problem you care about is a bit rude.
I offered you an opportunity to improve your product. To improve compatibility with JDBC standard and get paid for this. You guys decided that it is too difficult and you will wait years for PostgreSQL protocol change.

If you approach this with a the attitude of "this is causing a problem for me and I would love some help fixing it, any suggestions where I should start" you will probably get better results.
I already have results I wanted. I had 2000 EUR allocated to it and only 650 were needed to get it done, its about 150 lines patch and it took 2 days to code and test it.

Let me ask this one: why you do not have JDBC4 Postgresql Driver, Version 9.1-902
in maven repository? Unwilling to devote few minutes to upload it there?

Re: not fetching all query results

From
Dave Cramer
Date:
On Mon, Jul 30, 2012 at 10:00 AM, Radim Kolar <hsn@filez.com> wrote:
>
> "It should not be that difficult to get it fixed. I am not interested in
> workarounds"
>
> is pretty dismissive of the work you're already getting for free and without
> having contributed to it in any way. If you don't like it, one of your
> options is to do something about it yourself. The PgJDBC driver is open
> source, so rather than just asserting it can't be that difficult, you have
> the opportunity to prove your assertion by producing a patch that fixes your
> problem. Demanding that others spend their time fixing a problem you care
> about is a bit rude.
>
> I offered you an opportunity to improve your product. To improve
> compatibility with JDBC standard and get paid for this. You guys decided
> that it is too difficult and you will wait years for PostgreSQL protocol
> change.
>
>
> If you approach this with a the attitude of "this is causing a problem for
> me and I would love some help fixing it, any suggestions where I should
> start" you will probably get better results.
>
> I already have results I wanted. I had 2000 EUR allocated to it and only 650
> were needed to get it done, its about 150 lines patch and it took 2 days to
> code and test it.

So are you contributing that code to JDBC ?


>
> Let me ask this one: why you do not have JDBC4 Postgresql Driver, Version
> 9.1-902
> in maven repository? Unwilling to devote few minutes to upload it there?

By all means if it is that simple feel free to upload it


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

Re: not fetching all query results

From
Radim Kolar
Date:
> Perhaps the the original poster is helped with a modified URL, that set
> the fetch size default != 0.
Reporting software I use do not use transactions, making fetchsize
working only in transaction unusable.

Re: not fetching all query results

From
Radim Kolar
Date:
> So are you contributing that code to JDBC ?
In what ways it will benefit me? You keep telling me: if you need it do
it yourself. I did. Currently i have statement timeout implemented as
well, it was trivial.

 > By all means if it is that simple feel free to upload it.
submission validation is easier for maven repository maintainers if it
is uploaded by some project member. Come on, show some interest, its
your project, isn't it?

There are nearly no commits in 2012 there, just something in january -
http://pgfoundry.org/pipermail/jdbc-commits/ No point in contributing to
dead project. If i will see some jdbc-commits activity especially if
work will be targeted at jdbc4 compatibility, then you can have my patches.

Re: not fetching all query results

From
Dave Cramer
Date:
Radim,

To start with you would be better looking here
https://github.com/pgjdbc/pgjdbc for the code. And the number of
commits.

When we suggested that you "do it yourself" this is in the spirit of
open source where you give back.

You certainly aren't compelled to by any means but you certainly
leveraged a huge amount of work to your benefit. But it's your karma.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

Re: not fetching all query results

From
Atri Sharma
Date:
On Wed, Aug 1, 2012 at 6:30 PM, Dave Cramer <pg@fastcrypt.com> wrote:
> Radim,
>
> To start with you would be better looking here
> https://github.com/pgjdbc/pgjdbc for the code. And the number of
> commits.
>
> When we suggested that you "do it yourself" this is in the spirit of
> open source where you give back.
>
> You certainly aren't compelled to by any means but you certainly
> leveraged a huge amount of work to your benefit. But it's your karma.
>
> Dave Cramer

I completely agree with whatever the community has said in the entire
mail trail,and especially with Dave's mail above.I feel that you
should consider uploading the changed code to the project's
repository.Again,I completely agree with Dave on the fact that the
project has been really helpful to you and you should consider helping
increase it as well.

PostgreSQL JDBC project is *NOT* a dead project at all.Please have a
good look at the github repository(as pointed out by Dave).

Another point I wish to make is that the spirit of open source
software has been misinterpreted(IMNSHO).We(the community) are working
to make one of the finest database systems in the world even better
day-by-day.We expect people to look at it the same way.Open Source is
'free',as 'free speech',not as 'free beer'.

Please do not expect us to be providing you with full time and
extensive support on every issue/feature you want.It is 'do-ers' world
in open source.If you want something,best way is to do it yourself and
if you feel it is good,please consider *giving back*.

Atri

--
Regards,

Atri
l'apprenant

Re: not fetching all query results

From
Maciek Sakrejda
Date:
On Tue, Jul 31, 2012 at 7:38 AM, Radim Kolar <hsn@filez.com> wrote:
>
>> So are you contributing that code to JDBC ?
>
> In what ways it will benefit me? You keep telling me: if you need it do it
> yourself. I did. Currently i have statement timeout implemented as well, it
> was trivial.

Unless this is some secret sauce for a billion dollar startup, why
would you want to maintain this code yourself? Why make the choice
between having to rebase your patches every new release or falling
behind community fixes and features? Why track your packaging
separately from official versions? This sounds like a lot of work for
not much practical benefit.

Re: not fetching all query results

From
Till Toenges
Date:
On 2012-08-01 14:47, Radim Kolar wrote:
>> So are you contributing that code to JDBC ?
> In what ways it will benefit me? You keep telling me: if you need it do
> it yourself. I did. Currently i have statement timeout implemented as
> well, it was trivial.

It's the economic choice. If you don't contribute the source, you will
forever pay to patch your secret sauce into every new revision of the
driver. If you contribute, you get it for free from now on.

Think about it: You paid a fixed sum for your patch, and that money is
gone -- keeping the patch to yourself will not gain you anything. But
you will almost certainly need to update the driver one day, and it
would be much cheaper and simpler if your patch is already included.


Till

--
Kyon, Till Toenges, tt@kyon.de, http://kyon.de
Obergplatz 14, 47804 Krefeld, +49-2151-3620334

Re: not fetching all query results

From
Radim Kolar
Date:
> To start with you would be better looking here
> https://github.com/pgjdbc/pgjdbc for the code. And the number of
> commits.
Last commit is 3 months old, and then about 3 commits per month.
https://github.com/pgjdbc/pgjdbc/commits/master

For example of live project look here:
https://github.com/apache/lucene-solr/commits/trunk


Re: not fetching all query results

From
Craig Ringer
Date:
On 08/30/2012 08:46 PM, Radim Kolar wrote:
>
>> To start with you would be better looking here
>> https://github.com/pgjdbc/pgjdbc for the code. And the number of
>> commits.
> Last commit is 3 months old, and then about 3 commits per month.
> https://github.com/pgjdbc/pgjdbc/commits/master

PgJDBC only has one job: talk to PostgreSQL.

The JDBC spec doesn't change fast, or much. The PostgreSQL server does -
but the wire protocol doesn't, and most enhancements to the server
require no changes to the JDBC driver. A rapid pace of change is not
required.

That said, there's plenty to do on the JDBC driver to improve
functionality, usability, and spec compliance. The issue is finding
people with the time. It's not exciting and fu, and it needs to be done
_very_ carefully.

I've just signed up to help out on PgJDBC, though my timing is poor
since I'm going on holiday for two weeks tomorrow. Like Dave and the
others the work I do will be purely in my own time, unpaid, and take
away from other things I would like to do such as
sailing/bushwalking/cycling. Keep that in mind when you complain.

My personal TODO for PgJDBC includes:

- Implementing statement timeouts using timers, in a way that's
   compatible with both J2SE and Java EE.

- Adding automatic publishing to Maven Central using
   maven-ant-tasks

- Patch review, testing and merging

- Merging the client certificate helper classes

- Support for JDBC 4.2 refcursors

- various compliance work

--
Craig Ringer


Re: not fetching all query results

From
Dave Cramer
Date:
Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Thu, Aug 30, 2012 at 9:24 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> On 08/30/2012 08:46 PM, Radim Kolar wrote:
>>
>>
>>> To start with you would be better looking here
>>> https://github.com/pgjdbc/pgjdbc for the code. And the number of
>>> commits.
>>
>> Last commit is 3 months old, and then about 3 commits per month.
>> https://github.com/pgjdbc/pgjdbc/commits/master
>
>
> PgJDBC only has one job: talk to PostgreSQL.
>
> The JDBC spec doesn't change fast, or much. The PostgreSQL server does - but
> the wire protocol doesn't, and most enhancements to the server require no
> changes to the JDBC driver. A rapid pace of change is not required.
>
> That said, there's plenty to do on the JDBC driver to improve functionality,
> usability, and spec compliance. The issue is finding people with the time.
> It's not exciting and fu, and it needs to be done _very_ carefully.
>
> I've just signed up to help out on PgJDBC, though my timing is poor since
> I'm going on holiday for two weeks tomorrow. Like Dave and the others the
> work I do will be purely in my own time, unpaid, and take away from other
> things I would like to do such as sailing/bushwalking/cycling. Keep that in
> mind when you complain.
>
> My personal TODO for PgJDBC includes:
>
> - Implementing statement timeouts using timers, in a way that's
>   compatible with both J2SE and Java EE.

Craig, What is currently incompatible ?

Dave
>
> - Adding automatic publishing to Maven Central using
>   maven-ant-tasks
>
> - Patch review, testing and merging
>
> - Merging the client certificate helper classes
>
> - Support for JDBC 4.2 refcursors
>
> - various compliance work
>
> --
> Craig Ringer
>
>
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc