Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Before 7.4, UTF-8 was converted to ISO 10646 in order to be handled by
> the regex routines. Those routines had a limitation: they could not
> handle multibyte characters wider than 2 bytes. In other words, only
> 16-bit UCS-2 was supported. That's why ISO 10646 code points >= 0x10000
> are rejected.
> I'm not sure whether the regex routines included in 7.4 or later have
> this restriction. If not, we could probably remove the check (at the
> cost of losing data compatibility).
It looks to me like the regex routines now use pg_wchar, so I don't
think we need the restriction any longer.
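
To make the 16-bit limit concrete, here is a minimal sketch in C. It is
not the backend's code: utf8_decode() and fits_in_16_bits() are
hypothetical helpers written only for illustration, and the decoder
skips overlong-form and surrogate checks. It decodes one UTF-8 sequence
to its ISO 10646 code point and tests whether that code point would fit
in a 16-bit (UCS-2) character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a PostgreSQL function): decode one UTF-8
 * sequence of 1 to 4 bytes starting at s into *cp.  Returns the number
 * of bytes consumed, or 0 if the input is truncated or malformed.
 * Overlong forms and surrogates are NOT rejected in this sketch.
 */
size_t
utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t      n;
    size_t      i;
    uint32_t    c;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;

    c = s[0];
    if (c < 0x80)
    {
        *cp = c;                /* plain ASCII, one byte */
        return 1;
    }
    else if ((c & 0xE0) == 0xC0)
    {
        n = 2;
        c &= 0x1F;
    }
    else if ((c & 0xF0) == 0xE0)
    {
        n = 3;
        c &= 0x0F;
    }
    else if ((c & 0xF8) == 0xF0)
    {
        n = 4;
        c &= 0x07;
    }
    else
        return 0;               /* stray continuation or invalid lead byte */

    if (len < n)
        return 0;               /* truncated sequence */

    for (i = 1; i < n; i++)
    {
        if ((s[i] & 0xC0) != 0x80)
            return 0;           /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    *cp = c;
    return n;
}

/* Hypothetical helper: true if the code point fits in 16-bit UCS-2. */
int
fits_in_16_bits(uint32_t cp)
{
    return cp <= 0xFFFF;
}
```

A 3-byte sequence such as E2 82 AC (U+20AC) passes the 16-bit test,
while a 4-byte sequence such as F0 90 8D 88 (U+10348) does not, which
is the case the old check rejects.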
regards, tom lane