Re: patch: utf8_to_unicode (trivial) - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: patch: utf8_to_unicode (trivial)
Date	August 13, 2010 15:11:59
Msg-id	AANLkTimw2HhW3z8GL2WJzOAHYWN4KKoxvKgO2Kk-QEUN@mail.gmail.com
In response to	Re: patch: utf8_to_unicode (trivial) (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: patch: utf8_to_unicode (trivial)
List	pgsql-hackers

On Fri, Aug 13, 2010 at 1:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Excerpts from Robert Haas's message of vie ago 13 12:50:13 -0400 2010:
>>> Oh, hey, look at that.  Any thought on what to do about the fact that our
>>> two existing copies of utf2ucs() don't match?  (one tests against 0xf8
>>> where the other against 0xf0)
>
>> I'm not sure why it's masking 0xf8 instead of 0xf0.
>
> Because it wants to verify that this is in fact a 4-byte UTF8 code.
> Compare the code (and comments) for pg_utf_mblen.
>
> AFAICS the version in mbprint.c is flat out wrong, and the only reason
> nobody's noticed is that it should never get passed a more-than-4-byte
> sequence anyway.

Should we fix it, then, and if so how far should we go back?
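
For anyone following along, here is a minimal sketch of the difference
between the two masks. The function names are hypothetical and this is not
the code in either source file; it only illustrates why testing the lead
byte against 0xf8 is the stricter check. A 4-byte UTF-8 lead byte has the
form 11110xxx, so masking with 0xf8 and comparing to 0xf0 verifies the zero
bit after the four ones. Masking with 0xf0 ignores that bit, so a lead byte
such as 0xf8 (the start of an obsolete 5-byte form) would be accepted.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not the PostgreSQL functions. */

/* Strict test: lead byte must be exactly 11110xxx. */
static int
lead_is_4byte_strict(unsigned char b)
{
	return (b & 0xf8) == 0xf0;
}

/* Loose test: only checks the top four bits, so 11111xxx also passes. */
static int
lead_is_4byte_loose(unsigned char b)
{
	return (b & 0xf0) == 0xf0;
}

/*
 * Decode one UTF-8 sequence of up to 4 bytes into a code point.
 * Assumes the continuation bytes are present and well-formed.
 * Returns 0xffffffff if the lead byte is not a valid 1-4 byte lead.
 */
static uint32_t
utf8_to_ucs_sketch(const unsigned char *c)
{
	if ((c[0] & 0x80) == 0)
		return (uint32_t) c[0];
	else if ((c[0] & 0xe0) == 0xc0)
		return (uint32_t) (((c[0] & 0x1f) << 6) |
						   (c[1] & 0x3f));
	else if ((c[0] & 0xf0) == 0xe0)
		return (uint32_t) (((c[0] & 0x0f) << 12) |
						   ((c[1] & 0x3f) << 6) |
						   (c[2] & 0x3f));
	else if (lead_is_4byte_strict(c[0]))
		return (uint32_t) (((c[0] & 0x07) << 18) |
						   ((c[1] & 0x3f) << 12) |
						   ((c[2] & 0x3f) << 6) |
						   (c[3] & 0x3f));
	else
		return 0xffffffff;		/* not a valid lead byte */
}
```

With the strict test, a sequence starting with 0xf8 falls through to the
error return; with the loose test in its place it would be decoded as if it
were a 4-byte character.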

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company