Thread: Encoding issue (utf8): different strings received from java than from PGAdmin



Hello,

I have a strange issue, probably with the JDBC client:

JDBC Version: postgresql-8.2-504
PG version:   8.2


The PG database, the PG client and Java all work in a UTF-8 context.

Steps to reproduce:

1)

Create this Postgres function, which just writes its input parameter to the log in /opt/<pgdata>/pg_log/:

CREATE OR REPLACE FUNCTION public.raisecode(v character varying)
  RETURNS integer AS
$BODY$

BEGIN
   RAISE WARNING 'raisecode output: %',v;
   return 0;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
 

2) call "select raisecode('$§Kü§$')"
   from a postgres client like PGAdmin

  (also check the client encoding: select pg_client_encoding())

3) call the same statements from within a java application:

try {
    conn.createStatement().execute("select raisecode(cast(pg_client_encoding() as varchar))");
    conn.createStatement().execute("select raisecode('$§Kü§$')");
} catch (SQLException e) {
    throw new ApplicationException("Could not select raisecode('$§Kü§$')", e);
}
       


Compare the output in pg_log:

2 => $§Kü§$
3 => $Ã§KÃ¼Ã§$

   Note: in a Postgres client, you can also try this:
   select convert ('§' using utf8_to_iso_8859_1) => §  

When called from Java, it seems that the character Ã is added before the special characters §(§) and ¼(ü).
This probably applies to all characters whose code is > 127...

Thanks for any hint,

Marc

> The PG Database, the PG Client and java allworks in UTF8 context.
>
> create this postgres function that just write the input
> parameter in /opt/<pgdata>/pg_log/:
>
> CREATE OR REPLACE FUNCTION public.raisecode(v character varying)
[...]
>    RAISE WARNING 'raisecode output: %',v;
[...]
>
> 2) call "select raisecode('$§Kü§$')"
>    from a postgres client like PGAdmin
>
> 3) call the same statements from within a java application:
>    conn.createStatement().execute("select raisecode('$§Kü§$')");
>
> compare the output in pg_log:
>
> 2 => $§Kü§$
> 3 => $Ã§KÃ¼Ã§$

I tried, but cannot reproduce your problem.

Try to examine all the strings involved with 'od -c' and see where
your results differ from mine:

'od -c' on my Test.java and Test.class contain:

   s   e   l   e   c   t       r   a   i   s   e   c   o   d   e
   (   '   $ 302 247   K 303 274 302 247   $   '   )

'od -c' on my log file contains:

   r   a   i   s   e   c   o   d   e       o   u   t   p   u   t
   :       $ 302 247   K 303 274 302 247   $  \n

Is it the same for you?

Yours,
Laurenz Albe
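
If 'od' is not at hand, the same byte-level check can be approximated from Java itself. The following is only a minimal sketch: the class name OctalDump and the helper dump() are made up for illustration and are not part of the thread or of the JDBC driver. It prints the UTF-8 bytes of a string in a form similar to 'od -c', with printable ASCII shown as characters and every other byte as three-digit octal. The test string is written with Unicode escapes so that the example does not depend on how the source file is encoded.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class OctalDump {

    // Renders the UTF-8 bytes of s roughly like 'od -c':
    // printable ASCII as the character itself, any other byte as 3-digit octal.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;                 // treat the byte as unsigned
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (v >= 0x20 && v < 0x7F) {
                sb.append((char) v);
            } else {
                sb.append(String.format("%03o", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "$§Kü§$" written with escapes: U+00A7 is the section sign, U+00FC is u-umlaut.
        System.out.println(dump("$\u00A7K\u00FC\u00A7$"));
        // prints: $ 302 247 K 303 274 302 247 $
    }
}
```

If the string literal as seen by the running program dumps to more bytes than shown in the output above, the literal was already altered before it reached the JDBC driver.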

Thank you for your tip,
This was an issue with our Java compiler, which did not interpret the
source code as UTF-8.

Marc Mamin
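
The effect described here can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not the poster's actual setup: the thread does not say which encoding the compiler assumed, so ISO-8859-1 is used as a plausible stand-in, and the class name MisreadSource and the method misread() are hypothetical. It takes the UTF-8 bytes of the test string and decodes them with the wrong charset, which is what a compiler does to a string literal when it reads a UTF-8 source file as a single-byte encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class, for illustration only.
class MisreadSource {

    // Simulates a tool that reads UTF-8 bytes but decodes them as ISO-8859-1
    // (the wrong charset is an assumption; the thread does not name it).
    static String misread(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "$§Kü§$" written with escapes so this file is encoding-independent.
        String intended = "$\u00A7K\u00FC\u00A7$";
        String seen = misread(intended);

        System.out.println(intended.length() + " chars intended, "
                + seen.length() + " chars after misreading");

        // Print the code points the compiler would have put into the literal.
        for (char c : seen.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Each character above 127 occupies two bytes in UTF-8, so it turns into two characters after the misreading, and only the ASCII characters survive unchanged. Which extra character is finally displayed depends on the encoding that was wrongly assumed and on the viewer. Two common remedies are to pass the source encoding to the compiler explicitly with javac's -encoding option (for example, javac -encoding UTF-8 Test.java) or to write non-ASCII characters in literals as Unicode escapes, as done in the sketch.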



-----Original Message-----
From: Albe Laurenz [mailto:all@adv.magwien.gv.at]
Sent: Friday, April 06, 2007 9:39 AM
To: Marc Mamin; pgsql-jdbc@postgresql.org
Subject: RE: [JDBC] Encoding issue (utf8): different strings received
from java than from PGAdmin
[...]