Thread: XML Encoding problem

XML Encoding problem

From
rsmogura
Date:
 Hi,

 I have a test database with UTF-8 encoding. I stored the XML
 <a>ЁĄ¡</a> (U+0401, U+0104, U+00A1) in it. After changing the client
 encoding to iso8859-2, a select returned
 ERROR: character 0xd081 of encoding "UTF8" has no equivalent in
 "LATIN2"
 SQL state: 22P05.

 I should have gotten a result with character entities (&#...;) in
 place of the characters that cannot be represented.

 Kind regards,
 Radosław Smogura
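
The error in the report can be traced by hand. The following is a minimal Python sketch, not PostgreSQL code: it assumes Python's "iso8859_2" codec is a fair stand-in for the server's LATIN2 conversion, and the helper name `representable` is made up for illustration. It shows that 0xd081 is the UTF-8 byte sequence for U+0401, and that two of the three characters have no ISO 8859-2 equivalent.

```python
# Sketch only: Python's "iso8859_2" codec stands in for PostgreSQL's LATIN2.
sample = "ЁĄ¡"  # U+0401, U+0104, U+00A1 from the report

# The bytes named in the error message are the UTF-8 form of U+0401.
utf8_first = "Ё".encode("utf-8")
print(utf8_first.hex())


def representable(ch, encoding="iso8859_2"):
    """Hypothetical helper: True if ch can be encoded in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Map each character to whether ISO 8859-2 can represent it.
report = {ch: representable(ch) for ch in sample}
for ch, ok in report.items():
    print(f"U+{ord(ch):04X} representable in ISO 8859-2: {ok}")
```

Only U+0104 survives the conversion, so a strict conversion fails on the first character, U+0401, which matches the reported error.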

Re: XML Encoding problem

From
Peter Eisentraut
Date:
On mån, 2011-02-07 at 12:44 +0100, rsmogura wrote:
>  I have a test database with UTF-8 encoding. I stored the XML
>  <a>ЁĄ¡</a> (U+0401, U+0104, U+00A1) in it. After changing the client
>  encoding to iso8859-2, a select returned
>  ERROR: character 0xd081 of encoding "UTF8" has no equivalent in
>  "LATIN2"
>  SQL state: 22P05.
>
>  I should have gotten a result with character entities (&#...;) in
>  place of the characters that cannot be represented.

Hehe, interesting idea, but it's not implemented that way.  We don't
alter the XML data, except for the XML declaration.


Re: XML Encoding problem

From
Radosław Smogura
Date:
I may write a patch. Text mode will not be affected, because it's text
mode, and the patch will fail if the client encoding is "richer" than the
server encoding (one possibility in this situation is to XML-encode to the
client encoding, then text-reencode to the server encoding).

But looking at the code, the same thing could occur with binary recv. I saw
a text-based XML conversion there (it alters the XML in some way). According
to the docs, I can store XML in any encoding using binary mode.

I think that if the text conversion fails, an XML rewrite should occur, and
all characters that cannot be represented should be converted to XML
entities...

After all, it's XML, not varchar with parsing :)
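
The fallback proposed here (try the plain conversion first, and rewrite with character references only if it fails) can be sketched in Python. This is an illustration of the idea, not the PostgreSQL implementation and not a proposed patch: `encode_xml_for_client` is a hypothetical name, and Python's built-in `xmlcharrefreplace` error handler stands in for the XML rewrite.

```python
def encode_xml_for_client(xml_text, client_encoding):
    """Hypothetical helper: encode XML text for a client encoding.

    Tries a strict conversion first. If a character has no equivalent,
    re-encodes with numeric character references (&#NNNN;) instead.
    """
    try:
        return xml_text.encode(client_encoding)
    except UnicodeEncodeError:
        # Assumption: the offending characters sit in character data or
        # attribute values, where character references are legal.
        return xml_text.encode(client_encoding, errors="xmlcharrefreplace")


result = encode_xml_for_client("<a>ЁĄ¡</a>", "iso8859_2")
print(result)
```

A rewrite at this level is only safe for text content and attribute values. Character references are not recognized inside CDATA sections or comments and cannot appear in element or attribute names, so a real implementation would likely need to work on parsed XML rather than on the serialized string.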

Peter Eisentraut <peter_e@gmx.net> Wednesday 09 February 2011 23:29:29
> On mån, 2011-02-07 at 12:44 +0100, rsmogura wrote:
> >  I have a test database with UTF-8 encoding. I stored the XML
> >  <a>ЁĄ¡</a> (U+0401, U+0104, U+00A1) in it. After changing the client
> >  encoding to iso8859-2, a select returned
> >  ERROR: character 0xd081 of encoding "UTF8" has no equivalent in
> >  "LATIN2"
> >  SQL state: 22P05.
> >
> >  I should have gotten a result with character entities (&#...;) in
> >  place of the characters that cannot be represented.
>
> Hehe, interesting idea, but it's not implemented that way.  We don't
> alter the XML data, except for the XML declaration.