Greg Stark <stark@enterprisedb.com> writes:
> Is it really true that canonical encodings never contain any composed
> characters in them? I thought there were some glyphs which could only
> be represented by composed characters.

AFAIK that's not true. However, in my original comment I was thinking
about UTF-16 surrogates, which are something else entirely --- so I
withdraw that. I'm still dubious that it is our job to deal with
non-normalized characters, though.

> The original post seemed to be a contrived attempt to say "you should
> use ICU".

Indeed. The OP should go read all the previous arguments about ICU
in our archives.

regards, tom lane