Re: Unicode confusion - Mailing list pgsql-general

From Ian Barwick
Subject Re: Unicode confusion
Msg-id 200305130813.28878.barwick@gmx.net
In response to Re: Unicode confusion  ("Chris Palmer" <chris.palmer@geneed.com>)
List pgsql-general
On Tuesday 13 May 2003 00:35, Chris Palmer wrote:
(...)
> ===
> ps = new PrintStream(System.out, true, "UTF-8");
> ...
> // this line might look strange to you if your mailer shows it differently
> // than mine does:
> s.executeUpdate("INSERT INTO test (chug) VALUES ('¤ä´©¬O¬°¤FÅý')");
> s.executeUpdate("INSERT INTO test (chug) VALUES ('testing')");
> s.executeUpdate("INSERT INTO test (chug) VALUES ('\u262f\u0b87')");
> ...
> ps.println(rs.getString("chug"));
> ===
>
> I'm no Java expert, so if that's not a good way to get UTF-8-encoded
> output, please let me know. When I try it, I get:
>
> ===
>
> > java Noodle > goo
> > cat goo
>
> ¤ä´©¬O¬°¤FÃ
> ý
> testing
> â¯à®
> ===
>
> I installed KDE on our Linux machine (the one running Java and Pg) and got
> similar results using konsole. (Fwiw I am using PuTTY on Windows to
> connect to Linux).
>
> ===
> ¤ä´©¬O¬°¤FÃý
> testing
> â¯à®
> ===
>
> Note the lack of the newline in the middle of the first result.
>
> In either case, konsole or PuTTY, I am not getting back what I put in (the
> first s.executeUpdate(...), above).

Err, yes you are, just encoded differently: the output is UTF-8, whereas
Java strings are UTF-16 internally. The UTF-8 bytes are being dumped to the
display, but the display does not know that they are UTF-8, so it presumably
shows each byte as a separate character. Before starting konsole you may
need to set your locale. (No idea whether PuTTY is Unicode capable.)
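
To illustrate the point, here is a minimal sketch of what happens when UTF-8
bytes are read as a single-byte charset. It assumes the display treats the
bytes as ISO-8859-1, which is only a guess about the terminal in question;
the class and method names (MojibakeDemo, misreadAsLatin1, reread, hex) are
made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode s as UTF-8, then decode those bytes as ISO-8859-1,
    // mimicking a display that does not know the bytes are UTF-8.
    // (ISO-8859-1 is an assumption about the terminal, not a known fact.)
    public static String misreadAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the misreading: recover the raw bytes and decode them as UTF-8.
    public static String reread(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Space-separated lowercase hex dump of a byte array.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u262f\u0b87";   // the two characters from the third INSERT
        System.out.println(hex(original.getBytes(StandardCharsets.UTF_8)));
        // prints: e2 98 af e0 ae 87
        String garbled = misreadAsLatin1(original);
        System.out.println(garbled.length());                 // prints: 6
        System.out.println(reread(garbled).equals(original)); // prints: true
    }
}
```

Under that assumption, the bytes e2, af, e0 and ae come out as â, ¯, à and ®,
while 98 and 87 are unprintable control codes, which would be consistent with
the "â¯à®" shown above. The round trip back to the original string suggests
that nothing was lost, only misdisplayed.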

> In psql, the result of "select * from test" looks the same as it does when
> output by the Noodle Java program.
>
> Fwiw, I do have the encoding of this database set to UNICODE:

This is expected behaviour. Have you looked to see what encoding
Postgres uses to store Unicode?

Anyway, the obvious question is: have you tried printing the strings
directly, without passing them through Postgres?
( ps.println("\u262f\u0b87"); ?) Do they appear any differently?
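
A rough sketch of that check, bypassing the database entirely. It reuses the
PrintStream setup from the quoted program; the class and helper names
(DirectPrint, utf8Stream, printedBytes) are hypothetical and only serve this
example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class DirectPrint {
    // Wrap a stream in an auto-flushing PrintStream that writes UTF-8,
    // as in the quoted program.
    public static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    // Capture the exact bytes the UTF-8 PrintStream emits for s.
    public static byte[] printedBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = utf8Stream(buf);
        ps.print(s);
        ps.flush();
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // Print straight to the terminal, with no database in between.
        PrintStream ps = utf8Stream(System.out);
        ps.println("\u262f\u0b87");
        // Two characters become six UTF-8 bytes.
        System.out.println(printedBytes("\u262f\u0b87").length); // prints: 6
    }
}
```

If the first line of output looks just as garbled as the value read back from
the table, the database is probably not the culprit and the terminal settings
are the place to look.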


Ian Barwick
barwick@gmx.net


