Re: UNICODE characters above 0x10000 - Mailing list pgsql-hackers

From John Hansen
Subject Re: UNICODE characters above 0x10000
Date
Msg-id 5066E5A966339E42AA04BA10BA706AE56085@rodrick.geeknet.com.au
In response to UNICODE characters above 0x10000  ("John Hansen" <john@geeknet.com.au>)
Responses Re: UNICODE characters above 0x10000  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
My apologies for not reading the code properly.

Attached is a patch that uses pg_utf_mblen() instead of an indexed table.
It now also does bounds checks.
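
For readers without the attachment, the idea can be sketched roughly as follows. This is an illustration only, not the attached patch: utf8_seq_len() is a hypothetical stand-in for a pg_utf_mblen()-style helper, and utf8_is_valid() is an assumed name. It takes the sequence length from the lead byte, refuses to read past the end of the buffer, and checks that the trailing bytes are continuation bytes. It does not reject overlong encodings, surrogates, or values above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a pg_utf_mblen()-style helper: return the
 * sequence length implied by the lead byte, or 0 if the byte cannot
 * start a sequence (e.g. a stray continuation byte).
 */
static int
utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* code points at or above 0x10000 */
    return 0;
}

/*
 * Return true if the len bytes at s form structurally well-formed
 * UTF-8. The length is passed explicitly so that a truncated sequence
 * at the end of the buffer is rejected instead of being read past.
 */
static bool
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        int         n = utf8_seq_len(s[i]);

        /* Bounds check: the whole sequence must fit in what remains. */
        if (n == 0 || (size_t) n > len - i)
            return false;

        /* Every trailing byte must look like 10xxxxxx. */
        for (int k = 1; k < n; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
        }
        i += (size_t) n;
    }
    return true;
}
```

With this sketch, the four bytes F0 90 80 80 (U+10000) are accepted, while the same sequence cut to three bytes is rejected by the bounds check rather than read past the end of the string.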

Regards,

John Hansen

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Saturday, August 07, 2004 4:37 AM
To: John Hansen
Cc: Hackers; Patches
Subject: Re: [HACKERS] UNICODE characters above 0x10000

"John Hansen" <john@geeknet.com.au> writes:
> Attached, as promised, small patch removing the limitation, adding
> correct utf8 validation.

Surely this is badly broken --- it will happily access data outside the
bounds of the given string.  Also, doesn't pg_mblen already know the
length rules for UTF8?  Why are you duplicating that knowledge?

            regards, tom lane



