On 13 Jan 2004, Csaba Nagy wrote:
> Hi,
>
> Some thoughts on converting files between different encodings (my search
> for this was long enough to share the results).
> If you're on Linux, you might use the "iconv" tool to convert the file
> encoding (for details: man iconv).
> On all platforms you could use the "native2ascii" tool (it's part of the
> JDK), in a 2 step process: first convert from iso-8859-1 to the
> ASCII/unicode encoded format, then convert that to UTF-8 using the
> "-reverse" option. For all the options of native2ascii, consult the docs
> for your platform, possibly:
> http://java.sun.com/j2se/1.3/docs/tooldocs/tools.html#intl
These are standalone applications, so they aren't much help in the case at
hand: we need a programmatic means of doing this conversion. Java
provides Reader and Writer classes (InputStreamReader and
OutputStreamWriter, for example) for doing reads and writes in specific
encodings. Further, the JDBC driver internally has an Encoding class which
is used to convert between the JVM's character set and the database's.
Using these two, I believe I've got the encoding issue handled.
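To make the Reader/Writer part concrete, here is a minimal sketch of
converting a stream from one encoding to another in plain Java. It is not
the driver's code and does not use the driver's internal Encoding class;
the class name Transcode and the method transcode are made up for
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class Transcode {
    /**
     * Hypothetical helper: decodes bytes from 'in' using fromEnc and
     * re-encodes the resulting characters to 'out' using toEnc.
     */
    public static void transcode(InputStream in, String fromEnc,
                                 OutputStream out, String toEnc)
            throws IOException {
        Reader reader = new InputStreamReader(in, fromEnc);   // bytes -> chars
        Writer writer = new OutputStreamWriter(out, toEnc);   // chars -> bytes
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            writer.write(buf, 0, n);
        }
        writer.flush(); // push any buffered bytes through to 'out'
    }

    public static void main(String[] args) throws IOException {
        // "café" in ISO-8859-1: the e-acute is the single byte 0xE9
        byte[] latin1 = { 'c', 'a', 'f', (byte) 0xE9 };
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(latin1), "ISO-8859-1", utf8, "UTF-8");
        // In UTF-8 the e-acute takes two bytes (0xC3 0xA9), so this prints 5
        System.out.println(utf8.size());
    }
}
```

The same loop works for any pair of encodings the JVM supports; an
unsupported name surfaces as an UnsupportedEncodingException, which is an
IOException.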
Right now I've got an API roughly like this for moving data into the server.
There is an abstract CopyDataProvider which is implemented by three
concrete providers. The first pulls data from an InputStream. This is the
fastest method, as it just pushes the data as fast as it can read it. The
second pulls from a Reader and will do encoding translation, but the
interface is still a rather opaque stream. The third gives you control to
write the column-level entries in a friendly way. This provider takes a
CopySource object to draw the actual data from. When streaming the data to
the server it will do something like:
    CopySource source;
    while (source.next()) {
        source.writeRow(out);   // out is an SQLOutput
        // Send data for that row to the server
    }
CopySource could, for example, wrap a Vector of objects representing a
table's rows. When source.writeRow is called, it picks the next row and
calls writeRow on that object. This row-equivalent object can then
persist itself to the SQLOutput object using methods like so:
    public void writeRow(SQLOutput out) throws SQLException {
        out.writeInt(this.userid);
        out.writeString(this.username);
        out.writeTimestamp(this.lastlogin);
    }
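Putting the loop and the row object together, a rough self-contained sketch
might look like the following. Everything here is illustrative rather than
the actual patch: the CopySource shape is only inferred from the loop above,
RowOutput is a cut-down stand-in for java.sql.SQLOutput (which has many more
methods), and UserRow, ListCopySource, TextRowOutput and drain are made-up
names. The tab-separated output only approximates COPY's text format and
does no escaping.

```java
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CopySketch {
    /** Hypothetical stand-in for the three SQLOutput methods used above. */
    interface RowOutput {
        void writeInt(int v);
        void writeString(String v);
        void writeTimestamp(Timestamp v);
    }

    /** Assumed shape of CopySource, inferred from the streaming loop. */
    interface CopySource {
        boolean next();
        void writeRow(RowOutput out);
    }

    /** A row object that knows how to write its own columns. */
    static class UserRow {
        final int userid;
        final String username;
        final Timestamp lastlogin;

        UserRow(int userid, String username, Timestamp lastlogin) {
            this.userid = userid;
            this.username = username;
            this.lastlogin = lastlogin;
        }

        void writeRow(RowOutput out) {
            out.writeInt(userid);
            out.writeString(username);
            out.writeTimestamp(lastlogin);
        }
    }

    /** CopySource backed by an in-memory list of rows. */
    static class ListCopySource implements CopySource {
        private final Iterator<UserRow> it;
        private UserRow current;

        ListCopySource(List<UserRow> rows) { this.it = rows.iterator(); }

        public boolean next() {
            if (!it.hasNext()) { current = null; return false; }
            current = it.next();
            return true;
        }

        public void writeRow(RowOutput out) { current.writeRow(out); }
    }

    /** Collects one row as tab-separated text (no escaping, no NULL handling). */
    static class TextRowOutput implements RowOutput {
        private final StringBuilder sb = new StringBuilder();
        private boolean first = true;

        private void sep() { if (!first) sb.append('\t'); first = false; }
        public void writeInt(int v) { sep(); sb.append(v); }
        public void writeString(String v) { sep(); sb.append(v); }
        public void writeTimestamp(Timestamp v) { sep(); sb.append(v); }

        /** Returns the finished row and resets the buffer for the next one. */
        String finishRow() {
            String s = sb.toString();
            sb.setLength(0);
            first = true;
            return s;
        }
    }

    /** The streaming loop; collects lines here instead of sending them. */
    static List<String> drain(CopySource source) {
        List<String> lines = new ArrayList<>();
        TextRowOutput out = new TextRowOutput();
        while (source.next()) {
            source.writeRow(out);
            lines.add(out.finishRow());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<UserRow> rows = new ArrayList<>();
        rows.add(new UserRow(1, "kris", Timestamp.valueOf("2004-01-13 10:00:00")));
        rows.add(new UserRow(2, "csaba", Timestamp.valueOf("2004-01-12 09:30:00")));
        for (String line : drain(new ListCopySource(rows))) {
            System.out.println(line);
        }
    }
}
```

The point of the split is that the row class only has to know its own
columns, while the provider owns the wire format and the network.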
As you can imagine, there are three equivalent implementations of
CopyDataReceiver for extracting data from the server.
Kris Jurka