Thread: DatabaseMetaData - getImportedKeys

DatabaseMetaData - getImportedKeys

From
Aleksey
Date:
Hello,



I have the following problem working with DatabaseMetaData. I have a
database whose table and column names are in Russian. The database cluster
was initialized with the appropriate ru_RU.KOI8-R locale, and all the
databases were created with KOI8-R encoding. I have had no problems
accessing table data through JDBC.

The database has foreign key constraints that I am trying to read with
DatabaseMetaData methods. Both getTables and getPrimaryKeys work fine;
all the results have the correct encoding and values.

The following code fragment throws an exception:

rs = meta.getImportedKeys(null, null, tableName);
while (rs.next()) {

    String pkTable = rs.getString("PKTABLE_NAME");
    String pkColumn = rs.getString("PKCOLUMN_NAME");  /* here */

    String fkTable = rs.getString("FKTABLE_NAME");
    String fkColumn = rs.getString("FKCOLUMN_NAME");  /* and here */

}

The PKTABLE_NAME and FKTABLE_NAME fields are fetched correctly. Both
marked lines throw an exception with this stack trace:

at org.postgresql.core.Encoding.decodeUTF8(Encoding.java:270)
at org.postgresql.core.Encoding.decode(Encoding.java:165)
at org.postgresql.core.Encoding.decode(Encoding.java:181)
at org.postgresql.jdbc1.AbstractJdbc1ResultSet.getString(AbstractJdbc1ResultSet.java:97)
at org.postgresql.jdbc1.AbstractJdbc1ResultSet.getString(AbstractJdbc1ResultSet.java:337)

Error message is: "Invalid character data was found.  This is most
likely caused by stored data containing characters that are invalid for
the character set the database was created in.  The most common example
of this is storing 8bit data in a SQL_ASCII database.",

but the database is not SQL_ASCII (it is KOI8-R), and all the characters
in the column names are taken from this codepage. Other DatabaseMetaData
methods work fine with these characters.

I tested the same methods against the same database using tables with
Latin names, and everything worked fine. However, renaming all the columns
would mean a huge amount of extra work on the database and applications.


I use PostgreSQL 7.3.4 compiled from source and the JDBC driver from
http://jdbc.postgresql.org/download/pg73jdbc3.jar, on Linux with J2SDK 1.4.1_02.


I would appreciate any help with this.

Thank you.


Sincerely yours,
Aleksey.




Re: DatabaseMetaData - getImportedKeys

From
Kris Jurka
Date:

This is particularly odd because the DatabaseMetaData function has already
parsed the data as valid Unicode and then set up a "fake" in-memory result
set to work with, and it is that result set which is failing.  Could you
send me a pg_dump file of something that will make this fail?

Kris Jurka


Re: DatabaseMetaData - getImportedKeys

From
Kris Jurka
Date:

On Mon, 3 Nov 2003, Aleksey wrote:

> I have the following problem working with DatabaseMetaData.
>
> [ retrieving foreign key column names with KOI8-R characters fails
> when trying to decodeUTF8 ]

Many DatabaseMetaData methods work by running a query to retrieve the
necessary data, then iterating over it, reformatting it, and storing it in
an in-memory ResultSet that is returned to the user.  The in-memory
ResultSet is implemented with byte arrays, so all String data has
.getBytes() called on it to turn it into a byte array.  That call encodes
the String with the JVM's default charset, which may not be the UTF-8 we
need.  This is why the subsequent decoding from UTF-8 fails: the bytes are
not actually UTF-8 data.
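
As a rough illustration of that mismatch (a standalone sketch, not the
driver's code), the class below encodes a Cyrillic name with one charset
and then decodes the bytes strictly as UTF-8.  The class and method names
are hypothetical, the column name is made up, and KOI8-R merely stands in
for a non-UTF-8 JVM default charset; it assumes the JVM ships that charset
and that java.nio.charset.StandardCharsets (Java 7 or later) is available.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the PostgreSQL JDBC driver.
final class EncodingMismatchSketch {

    // Strict UTF-8 decode: reports malformed input instead of
    // silently substituting U+FFFD as new String(bytes, UTF_8) would.
    static String decodeStrictUtf8(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // True if text survives "encode with storageCharset, decode as UTF-8".
    static boolean roundTrips(String text, Charset storageCharset) {
        try {
            return text.equals(decodeStrictUtf8(text.getBytes(storageCharset)));
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Three Cyrillic letters, written as escapes so the source stays ASCII.
        String columnName = "\u0438\u043c\u044f";
        // Assumption: KOI8-R plays the role of the JVM default charset.
        Charset koi8r = Charset.forName("KOI8-R");

        System.out.println("KOI8-R bytes read as UTF-8 ok: "
                + roundTrips(columnName, koi8r));
        System.out.println("UTF-8 bytes read as UTF-8 ok:  "
                + roundTrips(columnName, StandardCharsets.UTF_8));
        System.out.println("ASCII name via KOI8-R ok:      "
                + roundTrips("name", koi8r));
    }
}
```

The ASCII case succeeds because KOI8-R and UTF-8 agree on the ASCII range,
which would be consistent with Latin column names working.  Passing an
explicit charset to getBytes() instead of relying on the default avoids
the mismatch in this sketch.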

The attached patch encodes the data into the format that the subsequent
decoder expects.  Aleksey, could you try out this patch or the pre-built
jar file that includes it at http://www.ejurka.com/pgsql/ and confirm that
this fixes your problem?

Kris Jurka

Attachment

Re: DatabaseMetaData - getImportedKeys

From
Kris Jurka
Date:

On Tue, 4 Nov 2003, Kris Jurka wrote:

> [ explanation of the .getBytes() default charset problem and the
> first patch ]

Attached is a corrected patch.  The original failed to compile after doing
a clean, but somehow I was able to build it earlier.  Ant's dependency
tracking could apparently use some work.

Kris Jurka

Attachment