My apologies for not reading the code properly.
Attached is a patch that uses pg_utf_mblen() instead of an indexed table.
It now also does bounds checks.
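
For illustration only, here is a minimal sketch of the kind of bounds-checked validation described above. It is not the attached patch. utf8_seq_len() is a hypothetical stand-in for pg_utf_mblen(), and utf8_is_valid() is likewise an assumed name. The sketch checks only sequence length and continuation bytes; it does not reject overlong encodings or surrogates.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_utf_mblen(): derive the sequence length
 * from the lead byte alone. Reads exactly one byte.
 */
static int
utf8_seq_len(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((*s & 0xe0) == 0xc0)
        return 2;               /* 110xxxxx */
    if ((*s & 0xf0) == 0xe0)
        return 3;               /* 1110xxxx */
    if ((*s & 0xf8) == 0xf0)
        return 4;               /* 11110xxx: code points >= 0x10000 */
    return 1;                   /* invalid lead byte; caller rejects it */
}

/*
 * Return true if the len bytes at s form well-formed UTF-8 sequences.
 * Never reads at or beyond s + len.
 */
static bool
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        int         l = utf8_seq_len(s + i);

        if (l == 1)
        {
            /* A one-byte sequence must be plain ASCII. */
            if (s[i] & 0x80)
                return false;
        }
        else
        {
            /* Bounds check: the whole sequence must fit in the input. */
            if ((size_t) l > len - i)
                return false;

            /* Every trailing byte must look like 10xxxxxx. */
            for (int k = 1; k < l; k++)
            {
                if ((s[i + k] & 0xc0) != 0x80)
                    return false;
            }
        }
        i += (size_t) l;
    }
    return true;
}
```

The point of the sketch is that the length comes from a single function, and that the remaining input is compared against that length before any trailing byte is read.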
Regards,
John Hansen
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Saturday, August 07, 2004 4:37 AM
To: John Hansen
Cc: Hackers; Patches
Subject: Re: [HACKERS] UNICODE characters above 0x10000
"John Hansen" <john@geeknet.com.au> writes:
> Attached, as promised, small patch removing the limitation, adding
> correct utf8 validation.
Surely this is badly broken --- it will happily access data outside the
bounds of the given string. Also, doesn't pg_mblen already know the
length rules for UTF8? Why are you duplicating that knowledge?
regards, tom lane