Re: [HACKERS] Unicode combining characters - Mailing list pgsql-patches

From	Patrice Hédé
Subject	Re: [HACKERS] Unicode combining characters
Date	October 10, 2001 18:08:53
Msg-id	20011010192819.J14587@idf.net
In response to	Re: [HACKERS] Unicode combining characters (Tatsuo Ishii <t-ishii@sra.co.jp>)
List	pgsql-patches

> > 1) we support these supplementary characters, knowing that they won't
> >    work with regexes,
> >
> > 2) I back out the change, but then anyone using these characters will
> >    get something weird, since the decoding would be faulty (they would
> >    be handled as 3-byte UTF-8 chars, and then the fourth byte would
> >    become a "faulty char"... not very good, as the 3-byte version is
> >    still not a valid UTF-8 code!),
> >
> > 3) we fix the regex engine within the next 24 hours, before the beta
> >    deadline is activated :/
> >
> > What do you think?
>
> I think 2) is not very good, and we should reject these 4-byte UTF-8
> strings. After all, we are not ready for them.

If we still recognise them as 4-byte UTF-8 chars (in order to parse
the next char correctly) and reject them as invalid chars, that should
be OK :)
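
To make the idea concrete, here is a rough sketch of "recognise but
reject". It is not the code from the patch: utf8_seq_len() and
scan_utf8() are hypothetical names made up for this illustration, and
the check is deliberately minimal (it does not detect overlong forms or
surrogates). The point is only that a 4-byte sequence is consumed as one
unit and counted as rejected, so the character after it is still parsed
from the right byte.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): length of a UTF-8 sequence
 * as announced by its lead byte, or 0 if the byte cannot start one.
 */
int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx: recognised, rejected below */
    return 0;                   /* continuation byte or invalid lead */
}

/*
 * Hypothetical scanner: walks the buffer and counts accepted and
 * rejected characters. A well-formed 4-byte sequence is skipped as a
 * whole and counted as ONE rejected character. A malformed or truncated
 * sequence is counted as rejected and skipped one byte at a time.
 */
void scan_utf8(const unsigned char *s, size_t len,
               size_t *accepted, size_t *rejected)
{
    size_t i = 0;

    *accepted = 0;
    *rejected = 0;

    while (i < len)
    {
        int n = utf8_seq_len(s[i]);
        int ok = (n != 0 && i + (size_t) n <= len);
        int k;

        /* every trailing byte must look like 10xxxxxx */
        for (k = 1; ok && k < n; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                ok = 0;
        }

        if (!ok)
        {
            (*rejected)++;
            i += 1;
            continue;
        }

        if (n == 4)
            (*rejected)++;      /* valid shape, but not supported yet */
        else
            (*accepted)++;

        i += (size_t) n;
    }
}
```

With the input 'a', 0xF0 0x90 0x80 0x80, 'b', this counts two accepted
characters and one rejected one, instead of treating the first three
bytes as a character and the fourth as a stray byte.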

> BTW, the other parts of your patch look good. Peter, what do you think?

Nice to hear :)

Patrice

--
Patrice Hédé
email: patrice hede à islande org
www  : http://www.islande.org/
