Hi, Hannu,
Hannu Krosing wrote:
>>> Are you sure it's UCS-4 ? I've always thought that XML is what is given
>>> in <xml > tag, and utf-8 if no charset is given.
>> You have to distinguish between the supported charset, and the document
>> encoding.
> UCS-4 and UTF-8 are both encodings for UNICODE
> see: http://en.wikipedia.org/wiki/UTF-32
Yes, I know.
The point I wanted to make is that the document encoding is independent
of the allowed charset (except that the characters the encoding can
represent directly have to be a subset of it).
That is what XML character references and entities were defined for.
So even in a document that uses LATIN-1 as its encoding, the charset is
still Unicode, which lets us write non-Latin-1 characters as &entities;
(for example, the numeric character reference &#x20AC; for the euro
sign).
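
To illustrate the point, here is a minimal sketch in Python (my choice of
language for the example, not something from this thread). It assumes the
standard library's xml.etree.ElementTree parser and a made-up sample
document. The document bytes are LATIN-1, yet the parsed text contains the
euro sign, which LATIN-1 cannot encode directly:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: declared and encoded as ISO-8859-1 (LATIN-1).
# The e-acute is stored as the single LATIN-1 byte 0xE9; the euro sign is
# written as a numeric character reference because LATIN-1 has no byte for it.
source = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<p>caf\u00e9 costs 3 &#x20AC;</p>'
)
doc = source.encode("iso-8859-1")

# The parser decodes the bytes and resolves the reference to U+20AC.
root = ET.fromstring(doc)
text = root.text

# Check that the euro sign really is outside the LATIN-1 repertoire.
try:
    "\u20ac".encode("iso-8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

print(repr(text))
print("euro sign encodable in LATIN-1:", euro_fits_latin1)
```

The parsed string is ordinary Unicode text, regardless of which encoding
the document was stored in.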
HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS
Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org