On Sun, Dec 10, 2006 at 12:30:12PM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > On Sat, Dec 09, 2006 at 10:50:05PM -0700, Michael Fuhr wrote:
> >> Should the code distinguish between combining characters and
> >> zero-width control characters so the former display correctly?
>
> > Probably, any idea how to tell the difference?
>
> I'm no expert, but isn't there a specific range of Unicode code points
> defined for combining characters?

Yes, several ranges, with other combining characters scattered about.
Could we use the Unicode general category (Mn = Mark, nonspacing;
Me = Mark, enclosing)? ucs_wcwidth() in src/backend/utils/mb/wchar.c
already contains some of that knowledge, doesn't it? The combining[]
list looks incomplete but otherwise close to what we'd need.
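
To illustrate the general-category idea, here is a rough sketch in
Python rather than backend C, since Python's standard unicodedata
module exposes the category directly. The function name
classify_zero_width() is made up for this example, and treating Cf/Cc
as the "control" bucket is my assumption, not something wchar.c does
today:

```python
import unicodedata

def classify_zero_width(ch):
    """Classify one character by its Unicode general category.

    Hypothetical helper for illustration only.
    """
    cat = unicodedata.category(ch)
    if cat in ("Mn", "Me"):
        # Nonspacing or enclosing mark: attaches to the preceding base
        # character and should be displayed with it.
        return "combining"
    if cat in ("Cf", "Cc"):
        # Format or control character (assumed "control" bucket).
        return "control"
    return "other"

# U+0301 COMBINING ACUTE ACCENT is Mn; U+200B ZERO WIDTH SPACE is Cf.
print(classify_zero_width("\u0301"), classify_zero_width("\u200b"))
```

In the backend the same split would presumably need a second range
table alongside combining[], generated from UnicodeData.txt.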
--
Michael Fuhr