Andrew Dunstan wrote:
> If we want to quote references, we should quote the XML standard. For
> example, see here to see the exact charset supported by XML:
> http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets.
The actual cause of the processing problems we have been seeing is the
character set definitions in the SGML declarations of the respective
document types.
For DocBook SGML 4.2:
    CHARSET
        BASESET "ISO 646:1983//CHARSET International Reference Version (IRV)//ESC 2/5 4/0"
        DESCSET
              0    9   UNUSED
              9    2   9
             11    2   UNUSED
             13    1   13
             14   18   UNUSED
             32   95   32
            127    1   UNUSED
        BASESET "ISO Registration Number 100//CHARSET ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
        DESCSET
            128   32   UNUSED
            160   96   32
For XML:
    CHARSET
        BASESET "ISO Registration Number 177//CHARSET ISO/IEC 10646-1:1993 UCS-4 with
                 implementation level 3//ESC 2/5 2/15 4/6"
        DESCSET
                0         9   UNUSED
                9         2   9
               11         2   UNUSED
               13         1   13
               14        18   UNUSED
               32        95   32
              127         1   UNUSED
              128        32   UNUSED
              160     55136   160
            55296      2048   UNUSED   -- surrogates --
            57344      8190   57344
            65534         2   UNUSED   -- FFFE and FFFF --
            65536   1048576   65536    -- 16 planes outside BMP --
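
To make the difference between the two declarations easier to see, here is a
rough Python sketch that reads each DESCSET entry as a (first code, count,
base) triple, with UNUSED modelled as None. The names DOCBOOK_42, XML_DECL and
is_declared are made up for illustration, and this is only a simplified model
of the tables above, not how any SGML parser actually implements the check.

```python
# Each entry: (first code point, number of code points, base-set number).
# None stands for UNUSED. Values are copied from the declarations above.
DOCBOOK_42 = [
    (0, 9, None), (9, 2, 9), (11, 2, None), (13, 1, 13),
    (14, 18, None), (32, 95, 32), (127, 1, None),
    (128, 32, None), (160, 96, 32),
]

XML_DECL = [
    (0, 9, None), (9, 2, 9), (11, 2, None), (13, 1, 13),
    (14, 18, None), (32, 95, 32), (127, 1, None), (128, 32, None),
    (160, 55136, 160),
    (55296, 2048, None),       # surrogates
    (57344, 8190, 57344),
    (65534, 2, None),          # FFFE and FFFF
    (65536, 1048576, 65536),   # 16 planes outside BMP
]


def is_declared(code, descset):
    """Return True if the code point falls in a range that is not UNUSED."""
    for start, count, base in descset:
        if start <= code < start + count:
            return base is not None
    # Code points not covered by any entry are not part of the character set.
    return False


if __name__ == "__main__":
    for cp in (0x09, 0x85, 0xE9, 0x2014, 0xD800, 0x10FFFF):
        print(f"U+{cp:04X}",
              "docbook:", is_declared(cp, DOCBOOK_42),
              "xml:", is_declared(cp, XML_DECL))
```

Under this reading, the DocBook SGML 4.2 declaration stops at code 255, so
something like U+2014 is simply outside its character set, while the XML
declaration covers it; both leave 128 through 159 unused.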
--
Peter Eisentraut
http://developer.postgresql.org/~petere/