Thread: ResultSetMetaData.getColumnDisplaySize returns 2147483647 ?
Selecting a constant string or concatenated columns from postgres using jisql (xigole), I get a heap overflow.

select 'X' AS XXX from . . . .

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
        at java.lang.StringBuffer.<init>(StringBuffer.java:91)
        at com.xigole.util.sql.Jisql.formatLabel(Jisql.java:720)
        at com.xigole.util.sql.Jisql.doIsql(Jisql.java:352)
        at com.xigole.util.sql.Jisql.run(Jisql.java:225)
        at com.xigole.util.sql.Jisql.main(Jisql.java:207)

This discussion seems to say this is a postgres JDBC bug:

http://forums.hotjoe.com/posts/list/315.page

I tried going back to the last rev 7 JAR, and these queries work, albeit with 65K column widths. Something is seriously wrong in this method if it cannot size a three-byte label or a one-byte constant string.
On Wed, 25 Feb 2009, DGPickett wrote:

> Selecting a constant string or concatenated columns from postgres
> using jisql (xigole), I get a heap overflow.
>
> select 'X' AS XXX from . . . .

While in this case it's obvious that 'X' has length 1, the server isn't telling the driver that. Even if it did in this case, there are clearly cases where the server cannot know the output length. Consider:

SELECT repeat('X', (random() * 100)::int);

So regardless of the simple case you've shown, we must be able to handle unknown-length data. Previously we did that by returning -1, which broke the rowset implementation, which insisted that the value be >= 0. So we moved to Integer.MAX_VALUE, which seems to mostly work for people.

This thread details how we intend to solve the similar problem of getPrecision for Crystal Reports users, so we could likely apply the same sort of changes for display size as well.

http://archives.postgresql.org/pgsql-jdbc/2008-12/msg00018.php

> This discussion seems to say this is a postgres JDBC bug:
>
> http://forums.hotjoe.com/posts/list/315.page

That's not how I read the discussion. The thread says that h2 does the same thing and that the author has provided a workaround allowing the user to supply a custom formatter to handle data types with problems.

Kris Jurka
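Until the driver changes, a client can defend itself by clamping the reported display size before sizing any buffers. A minimal sketch of that idea, assuming an arbitrary illustrative cap of 256 characters (the class and method names are hypothetical, not part of jisql or the driver):

```java
public class DisplaySizeClamp {
    // Largest column width a console formatter should ever pad to.
    // 256 is an arbitrary illustrative cap, not a driver constant.
    static final int MAX_DISPLAY_WIDTH = 256;

    // Clamp a value from ResultSetMetaData.getColumnDisplaySize():
    // drivers may report Integer.MAX_VALUE (or historically -1) for
    // unknown-length columns, so bound it before allocating buffers.
    static int clampDisplaySize(int reported) {
        if (reported <= 0 || reported > MAX_DISPLAY_WIDTH) {
            return MAX_DISPLAY_WIDTH;
        }
        return reported;
    }

    public static void main(String[] args) {
        System.out.println(clampDisplaySize(Integer.MAX_VALUE)); // 256
        System.out.println(clampDisplaySize(-1));                // 256
        System.out.println(clampDisplaySize(20));                // 20
    }
}
```

With this guard, a `select 'X' AS XXX` whose metadata reports Integer.MAX_VALUE costs at most a 256-character buffer instead of exhausting the heap.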
I feel it runs contrary to the purpose of JDBC to make clients carry postgres-specific code for this response. The common JDBC specification does not support returning int max (or anything else) for "I don't know." It seems unreasonable that this implementation cannot deal with these trivial cases. I am not sure whether the fault lies with the JDBC implementation or the underlying RDBMS server, that we cannot determine that 'X' is 1 byte wide, or a||b is (width of a) + (width of b) bytes wide. Perhaps the requirement to support result metadata did not percolate far enough down into the server or TCP interface code. Can someone illuminate us on this?

Best, David
On Thu, 26 Feb 2009, DGPickett wrote:

> I feel it runs contrary to the purpose of JDBC to make clients carry
> postgres-specific code for this response. The common JDBC
> specification does not support returning int max (or anything else)
> for "I don't know."

Exactly. Since the spec doesn't say what to do, what do you want the JDBC driver to do? Throw an exception? That's not helpful at all. The JDBC team has come up with a solution that works for most clients. If you have a concrete suggestion for how this case should be handled that will work for everyone, we'd be glad to hear it. Just wishing this case didn't exist isn't helpful.

> It seems unreasonable that this implementation cannot deal with these
> trivial cases. I am not sure whether the fault lies with the JDBC
> implementation or the underlying RDBMS server, that we cannot
> determine that 'X' is 1 byte wide...

The server is responsible for determining the type and length of data, not the JDBC driver. The server tells the JDBC driver that it is unknown, and the driver has to do something with it. You are still ignoring the fact that there must be cases where the server cannot possibly know the length of a piece of data even if it were capable of handling the simpler cases.

Kris Jurka
On Feb 26, 1:18 pm, bo...@ejurka.com (Kris Jurka) wrote:

> The server is responsible for determining the type and length of data,
> not the JDBC driver. The server tells the JDBC driver that it is
> unknown and the driver has to do something with it. You are still
> ignoring the fact that there must be cases where the server cannot
> possibly know the length of a piece of data even if it was capable of
> handling the simpler cases.
>
> Kris Jurka
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-j...@postgresql.org)
> To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-jdbc

Well, can the JDBC team raise this issue with the postgres server team, that many relatively obvious cases are being marked 'width unknown'?
On Thu, 26 Feb 2009, DGPickett wrote:

> Well, can the JDBC team raise this issue with the postgres server
> team, that many relatively obvious cases are being marked 'width
> unknown'?

They are aware. See this previous discussion regarding the same problem with numeric types:

http://archives.postgresql.org/pgsql-jdbc/2008-05/msg00021.php

You can take it up on pgsql-hackers if you want, but unless you have something concrete to add, I wouldn't bother.

Kris Jurka
On Feb 26, 3:54 pm, bo...@ejurka.com (Kris Jurka) wrote:

> On Thu, 26 Feb 2009, DGPickett wrote:
> > Well, can the JDBC team raise this issue with the postgres server
> > team, that many relatively obvious cases are being marked 'width
> > unknown'?
>
> They are aware. See this previous discussion regarding the same
> problem with numeric types:
>
> http://archives.postgresql.org/pgsql-jdbc/2008-05/msg00021.php
>
> You can take it up on pgsql-hackers if you want, but unless you have
> something concrete to add, I wouldn't bother.
>
> Kris Jurka

Yes, I see it comes up over and over:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00428.php

The discussions about char versus byte were nice, but the method name says "Display" and the API doc says "characters", so if you have nvarchar(20) in a 16-bit Unicode context, the storage needs 40 bytes (plus a length) and the right return is 20. Nobody expects or wants the gratuitous expense of dynamic sizing, just the obvious max so they can typeset text output. Java hides the machinery for wide characters, and we do not want to bring it back into the light.

Somehow, doing what the real-money RDBMSs do, the de facto standard, is hard for the postgres dudes, or they have more important concerns. It makes the product seem shabby by comparison. Usually, that is enough to motivate frantic coding, to make open source appear just as good.
DGPickett wrote:

> Yes, I see it comes up over and over:
>
> http://archives.postgresql.org/pgsql-hackers/2007-06/msg00428.php
>
> The discussions about char versus byte were nice, but the method name
> says "Display" and the API doc says "characters", so if you have
> nvarchar(20) in a 16-bit Unicode context, the storage needs 40 bytes
> (plus a length) and the right return is 20. Nobody expects or wants
> the gratuitous expense of dynamic sizing, just the obvious max so they
> can typeset text output. Java hides the machinery for wide characters,
> and we do not want to bring it back into the light.

Except that the backend is not written in Java, and neither are all the other frontends. To some extent this is a case of the JDBC interface not being flexible enough about the metadata (or often not even documenting what it's meant to mean!), so we are stuck trying to provide the best fit we can from the data the server gives us. Asking the server to conform to JDBC's particular idea of how the metadata should look, at the expense of other interfaces, seems a bit unreasonable.

If you have a better idea about how the JDBC driver can map the information coming from the backend to the JDBC metadata interfaces, a patch would be great. Or you could change the calling code to bounds-check the metadata results before it goes off and tries to allocate the entire heap for one column in one row ..

-O
On Mar 2, 4:10 am, oli...@opencloud.com (Oliver Jowett) wrote:

> Or you could change the calling code to
> bounds-check the metadata results before it goes off and tries to
> allocate the entire heap for one column in one row ..

If you have to change calling code to access Postgres, it violates the original promise of JDBC: use me and be portable. If you want an 'I don't know' response built into this API method, sell it to the JDBC community, but until then, it is a requirement. Most RDBMSs have not stumbled when delivering on the requirement, so selling it as a hardship seems difficult. As the real estate zoning lawyers say, "This is, at best, a self-imposed hardship."

Re: character versus byte, I know that in addition to 1-, 2-, and 4-byte-wide characters of several families, we have variable-width creatures like UTF-8. RDBMS developers can hardly be happy about the complications of this development, but to stay on the road to being more and more application-facing, they should size storage in characters for any sort of character, even if the underlying storage must be overly generous or variable to accommodate this. Java and JDBC are just leading the way. When this call is made, they should be able to say how many characters are configured for the result set column.

Just as smallint + int is typed up to int (bigint would be safer) to hold the result and all the smallint values are cast up, when 'X' || zzz (a varchar(10) column) is parsed, the result should be typed varchar(11) or char(11). The RDBMS should provide this metadata from its parsing process to JDBC.
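The width propagation David argues for can be sketched in a few lines. This is a hypothetical illustration of the rule (result width of a concatenation = sum of operand widths, with unknown poisoning the result), not code from the server or any driver; the class, method, and the -1 sentinel are all assumptions for the example:

```java
public class ConcatWidth {
    // Sentinel for "width unknown", mirroring the driver's old -1 report.
    static final int UNKNOWN = -1;

    // Width of (left || right): the sum of the operand widths.
    // If either operand's width is unknown, the result is unknown too,
    // which is exactly the case the server cannot avoid in general.
    static int concatWidth(int left, int right) {
        if (left == UNKNOWN || right == UNKNOWN) return UNKNOWN;
        long sum = (long) left + (long) right; // avoid int overflow
        return sum > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) sum;
    }

    public static void main(String[] args) {
        // 'X' || zzz varchar(10)  ->  width 11, as David suggests
        System.out.println(concatWidth(1, 10));      // 11
        // repeat('X', (random() * 100)::int) has no static width
        System.out.println(concatWidth(1, UNKNOWN)); // -1
    }
}
```

The second case shows why the rule cannot eliminate "unknown": as soon as any operand is dynamically sized, the unknown propagates to the whole expression.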
On Fri, 2009-02-27 at 09:41 -0800, DGPickett wrote:

> Somehow, doing what the real-money RDBMSs do, the de facto standard,

Could you describe what they do, especially for cases of really unknown sizes?

One solution would be to fetch all data into a buffer and actually measure the longest field size.

> is hard for the postgres dudes, or they have more important concerns.
> It makes the product seem shabby by comparison. Usually, that is
> enough to motivate frantic coding, to make open source appear just as
> good.

You are most welcome to "code frantically" :)

PostgreSQL "dudes" in general are not in the business of appearing just as good. Rather, we are trying to produce the most advanced database available, at least for values of "most advanced" that matter to us.

--
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability Services, Consulting and Training
DGPickett wrote:

> If you have to change calling code to access Postgres, it violates the
> original promise of JDBC: use me and be portable.

JDBC in general failed that requirement a long time ago, I'm afraid. Whenever I'm writing a JDBC client, I invariably end up with a strategy object that's customized for each RDBMS target. About the only thing that is truly portable is the overall infrastructure of Connection/PreparedStatement/ResultSet etc.

Anyway, if JDBC is meant to be a portable adaptor to existing RDBMSes, then it should provide a superset of existing capabilities that degrades nicely when some are not available on a particular RDBMS. It demonstrably doesn't do that in this case. You seem to consider JDBC an end in itself, rather than just another interface target to adapt an existing DB to.

I don't think it's reasonable or expected for the existing core server to jump through hoops and change behavior solely for the sake of what the JDBC interface happens to want. Obviously, if there's some extra data that's easy for the server to provide, that's a good thing - and in fact we got a lot of extra metadata in the v3 protocol specifically to help out JDBC. But it does not seem to be easy (or even universally possible) for the server to give us the metadata in the limited form that the JDBC interface expects here.

Again, I ask: do you have a better suggestion for how the JDBC driver should handle this case, given the data it gets from the server?

-O
On Tue, 3 Mar 2009, Hannu Krosing wrote:

> Could you describe what they do, especially for cases of really
> unknown sizes?
>
> One solution would be to fetch all data into a buffer and actually
> measure the longest field size.

That's not always an option. Consider the case of PreparedStatement.getMetaData(): it must be able to return metadata without executing the statement.

Kris Jurka
2009/3/2 DGPickett <DGPickett@aol.com>:

> On Mar 2, 4:10 am, oli...@opencloud.com (Oliver Jowett) wrote:
> > Or you could change the calling code to
> > bounds-check the metadata results before it goes off and tries to
> > allocate the entire heap for one column in one row ..
>
> If you have to change calling code to access Postgres, it violates the
> original promise of JDBC: use me and be portable. If you want an 'I
> don't know' response built into this API method, sell it to the JDBC
> community, but until then, it is a requirement. Most RDBMSs have not
> stumbled when delivering on the requirement, so selling it as a
> hardship seems difficult. As the real estate zoning lawyers say,
> "This is, at best, a self-imposed hardship."

I'd say it works fairly well here. It is asked for a maximum; it doesn't know, so it reports the theoretical maximum possible, which is not precise but is a correct value.

And jisql does have a bug: it tries to allocate memory equal to the column's maximum width. Imagine a varchar(1000000000) column I might use to store long descriptions without limits (the equivalent of PostgreSQL's "text" type). My values are never larger than, say, 4KB, yet jisql would try to allocate a 1000000000-byte buffer, which is foolish and leads to problems. It should apply some maximum limit (say, 1MB), because even a varchar(many) column usually does not hold such long values.
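The capped-buffer fix suggested here might look like the following on the client side. This is an illustrative sketch only, assuming the 1MB cap proposed above; it is not jisql's actual formatLabel (the method in the stack trace at the top of the thread), and the names are hypothetical:

```java
public class LabelFormatter {
    // Cap on any single column's display width, as suggested: even a
    // varchar(1000000000) or an Integer.MAX_VALUE report is bounded.
    static final int MAX_WIDTH = 1024 * 1024; // 1MB

    // Pad a column label out to the clamped display width, so a
    // pathological metadata value can no longer exhaust the heap.
    static String formatLabel(String label, int displaySize) {
        int width = Math.min(Math.max(displaySize, label.length()), MAX_WIDTH);
        StringBuilder sb = new StringBuilder(width);
        sb.append(label);
        while (sb.length() < width) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Normal case: pad "XXX" to a 5-character column.
        System.out.println('[' + formatLabel("XXX", 5) + ']'); // [XXX  ]
        // Unknown-width case: clamped to 1MB instead of an OOM.
        System.out.println(formatLabel("XXX", Integer.MAX_VALUE).length());
    }
}
```

The key point is that the allocation is driven by min(reported width, cap), so the worst case is a bounded 1MB buffer rather than a StringBuffer sized from Integer.MAX_VALUE.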