Re: [rfc] unicode escapes for extended strings - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: [rfc] unicode escapes for extended strings
Date
Msg-id 4ABCB99E.6@dunslane.net
In response to Re: [rfc] unicode escapes for extended strings  (Marko Kreen <markokr@gmail.com>)
List pgsql-hackers

Marko Kreen wrote:
> On 9/25/09, tomas@tuxteam.de <tomas@tuxteam.de> wrote:
>   
>>  On Thu, Sep 24, 2009 at 09:42:32PM +0300, Peter Eisentraut wrote:
>>  > Good idea.  This could also check for other invalid things like
>>  > byte-order marks in UTF-8.
>>
>> But watch out. Microsoft apps do like to insert a BOM at the beginning
>>  of the text. Not that I think it's a good idea, but the Unicode folks
>>  seem to think it's OK [1] :-(
>>     
>
> As a BOM does not actively break transport layers, it's less clear-cut
> whether to reject it.  It could be said that a BOM at the start of a
> string is OK.  A BOM in the middle of a string is more rejectable.  But
> it will only confuse some high-level character counters, not low-level
> encoders.
>

It seems pretty clear from the URL that Tomas posted that we should not 
treat a BOM specially at all, and just treat it as another Unicode char.
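
To make that concrete, here is a minimal sketch in Python (not PostgreSQL
code; the helper name is_valid_utf8 is made up for illustration). It assumes
a validator that only checks for well-formed UTF-8 and does not special-case
any code point, so U+FEFF passes at the start or in the middle of a string
and simply counts as one more character:

```python
# Illustrative sketch only: is_valid_utf8 is a hypothetical helper,
# not a PostgreSQL function.
BOM = "\ufeff"  # U+FEFF, used as the byte-order mark


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8.

    No code point is treated specially, so a BOM is accepted
    wherever it appears.
    """
    try:
        data.decode("utf-8")  # plain "utf-8" codec does not strip a BOM
    except UnicodeDecodeError:
        return False
    return True


bom_bytes = BOM.encode("utf-8")      # the three bytes EF BB BF
leading = bom_bytes + b"abc"         # BOM at the start of the string
middle = b"ab" + bom_bytes + b"c"    # BOM in the middle of the string

print(is_valid_utf8(leading), is_valid_utf8(middle))
print(len(leading.decode("utf-8")))  # the BOM counts as a character
```

The only visible effect is the one Marko mentions: a character count of the
decoded string includes the BOM.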

cheers

andrew

