Re: UTF8 encoding and non-text data types - Mailing list pgsql-sql

From: Tom Lane
Subject: Re: UTF8 encoding and non-text data types
Date:
Msg-id: 8278.1200353073@sss.pgh.pa.us
In response to Re: UTF8 encoding and non-text data types  (Joe <dev@freedomcircle.net>)
Responses Re: UTF8 encoding and non-text data types  (Joe <dev@freedomcircle.net>)
List pgsql-sql
Joe <dev@freedomcircle.net> writes:
> Tom Lane wrote:
>> Well, you've got two problems there.  The first and biggest is that
>> &#NNN; is an HTML notation, not a SQL notation; no SQL database is going
>> to think that that string in its input is a representation of a single
>> Unicode character.  The other problem is that even if this did happen,
>> code points 1777 and nearby are not digits; they're something or other
>> in Arabic, apparently.
>> 
> Precisely. 1777 through 1780 decimal equate to code points U+06F1 
> through U+06F4, which correspond to the Arabic numerals 1 through 4.

Oh?  Interesting.  But even if we wanted to teach Postgres about that,
wouldn't there be a pretty strong risk of getting confused by Arabic's
right-to-left writing direction?  Wouldn't be real helpful if the entry
came out as 4321 when the user wanted 1234.  Definitely seems like
something that had better be left to the application side, where there's
more context about what the string means.
        regards, tom lane
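
A minimal sketch of what "left to the application side" could look like,
assuming the client receives form input containing HTML numeric character
references such as &#1777;. The function name decode_numeric_field is
hypothetical and not part of Postgres or of any library. The sketch decodes
the references with Python's html.unescape, maps each Unicode decimal digit
to its ASCII value with unicodedata.decimal, and rejects anything else, so
that only a plain integer would be sent to the database. It assumes the
characters arrive in logical order, most significant digit first.

```python
import html
import unicodedata


def decode_numeric_field(raw: str) -> int:
    """Decode HTML character references, then parse the text as a decimal integer.

    Hypothetical helper for illustration; it accepts any Unicode decimal
    digits (for example U+06F1..U+06F4) and raises ValueError otherwise.
    """
    text = html.unescape(raw)  # "&#1777;" becomes the single character U+06F1

    ascii_digits = []
    for ch in text:
        # unicodedata.decimal returns the digit value, or the default (None)
        # when the character is not a decimal digit.
        value = unicodedata.decimal(ch, None)
        if value is None:
            raise ValueError(f"not a decimal digit: {ch!r}")
        ascii_digits.append(str(value))

    # int("") raises ValueError, so empty input is rejected as well.
    return int("".join(ascii_digits))


if __name__ == "__main__":
    print(decode_numeric_field("&#1777;&#1778;&#1779;&#1780;"))
```

With a conversion like this done in the client, the value bound to an integer
column is an ordinary number, and the database never has to interpret either
the HTML notation or the script the digits were typed in.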

