Convert string to UNICODE & migration FROM 7.4 to 9.1 - Mailing list pgsql-general

From Philippe Lang
Subject Convert string to UNICODE & migration FROM 7.4 to 9.1
Date
Msg-id 21F9A2B0E25A7F4497D7B87397AD9D7C84264263@SBS1.attiksystem.local
Whole thread Raw
Responses Re: Convert string to UNICODE & migration FROM 7.4 to 9.1  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi,

I'm migrating a PG 7.4 database (encoded in LATIN1) to PG 9.1. The upgrade is just fine, except a problem with a
conversionto UNICODE for which I was not able to find a solution yet: 

I have a function under PG 7.4 that returns an xml structure, with the content encoded in UNICODE.

The function is like:

---------------------------------------------
CREATE OR REPLACE FUNCTION get_xml()
  RETURNS text AS
$$
  DECLARE
    output    text;
  BEGIN
    -- We fill the output variable with xml
    ...
    -- We return it in unicode
    RETURN convert(output, 'LATIN1', 'UTF8');
  END;
$$
  LANGUAGE 'plpgsql';
---------------------------------------------

After migrating to PG 9.1, I noticed that convert now requires a bytea, and not a text. I tried different things:

- Change the output variable to a bytea
- Use RETURN convert(convert_to(output, 'LATIN1'), 'LATIN1', 'UTF8');
- Encode the database in UTF8 instead of LATIN1

... but no output is similar to what I had under PG 7.4.

More precisely, I had under PG 7.4 something like (notice the Name Hélène converted into Hélène)

"<?xml version="1.0" encoding="UTF-8"?>
<DispatchAuftrag>
  <Versicherungsnehmer>
    <Name>Martelli</Name>
    <Vorname>Hélène</Vorname>
    <Strasse>rue des Comptes</Strasse>
    <Land>Suisse</Land>
    <PLZ>123456</PLZ>
    <Ort>Fribourg</Ort>
    <Email>.</Email>
    <TelMobil>.</TelMobil>
    <TelPrivat>.</TelPrivat>
    <TelGeschaeft>.</TelGeschaeft>
    <Fax>.</Fax>
  </Versicherungsnehmer>
</DispatchAuftrag>
"

And now I get something like:

"<?xml version="1.0" encoding="UTF-8"?>\012<DispatchAuftrag>\012  </Versicherungsnehmer>\012  <Geschaedigter>\012
<Name>Etatdu Valais</Name>\012    <Vorname></Vorname>\012    <Strasse>Indivis / Centre entretien
Autoroute</Strasse>\012   <Land>Suisse</Land>\012    <PLZ>1906</PLZ>\012    <Ort>Charrat</Ort>\012
<Email></Email>\012   <TelMobil></TelMobil>\012    <TelPrivat>027 747 61 00</TelPrivat>\012
<TelGeschaeft>.</TelGeschaeft>\012   <Fax></Fax>\012  </Geschaedigter>\012  <Schadendaten>\012
<SchadenDatum>2005-01-23</SchadenDatum>\012   <SchadenNr>JR/41123-208/JPS</SchadenNr>\012
<GeschaetzteSchadenhoehe></GeschaetzteSchadenhoehe>\012   <SchadenAmFz>Dommages aux installations
routi\303\250res</SchadenAmFz>\012   <Bemerkung></Bemerkung>\012  </Schadendaten>\012  <Fahrzeugstandort>\012
<Name></Name>\012   <Vorname></Vorname>\012    <Strasse></Strasse>\012    <Land></Land>\012    <PLZ></PLZ>\012
<Ort></Ort>\012   <Telefon></Telefon>\012    <Fax></Fax>\012    <Email></Email>\012
</Fahrzeugstandort>\012</DispatchAuftrag>\012"

Newlines don't seem to be handled properly, and I'm unable to find out how to change that. UTF8 encoding is not good
either.

Any idea how to correct that?

Thanks!

Philippe

-------------------------------------------------------------
Attik System              web    : http://www.attiksystem.ch
Philippe Lang             phone  : +41 26 422 13 75
rte de la Fonderie 2      gsm    : +41 79 351 49 94
1700 Fribourg             twitter: @philippelang
                          pgp    : http://keyserver.pgp.com



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Reassign value of IN parameter in 9.1.1
Next
From: Tom Lane
Date:
Subject: Re: "CREATE TABLE table_name AS EXECUTE name WITH DATA" becomes syntax error.