Unicode escapes for extended strings.
On 4/16/09, Marko Kreen <markokr@gmail.com> wrote:
> Reasons:
>
> - More people are familiar with \u escaping, as it's standard
> in Java/C#/Python, probably more..
> - U& strings will not work when stdstr=off.
>
> Syntax:
>
> \uXXXX - 16-bit value
> \UXXXXXXXX - 32-bit value
>
> Additionally, both \u and \U can be used to specify UTF-16 surrogate
> pairs to encode characters with value > 0xFFFF. This is exact behaviour
> used by Java/C#/Python. (except that Java does not have \U)
v3 of the patch:
- convert to new reentrant lexer API
- add lexer targets to avoid fallback to default
- completely disallow \U\u without proper number of hex values
- fix logic bug in surrogate pair handling
--
marko