Marko Kreen wrote:
> On 9/25/09, tomas@tuxteam.de <tomas@tuxteam.de> wrote:
>
>> On Thu, Sep 24, 2009 at 09:42:32PM +0300, Peter Eisentraut wrote:
>> > Good idea. This could also check for other invalid things like
>> > byte-order marks in UTF-8.
>>
>> But watch out. Microsoft apps do like to insert a BOM at the beginning
>> of the text. Not that I think it's a good idea, but the Unicode folks
>> seem to think it's OK [1] :-(
>>
>
> As a BOM does not actively break transport layers, it's less clear-cut
> whether to reject it. It could be said that a BOM at the start of a string
> is OK. A BOM in the middle of a string is more rejectable. But it will
> only confuse some high-level character counters, not low-level encoders.
>
>
It seems pretty clear from the URL that Tomas posted that we should not
treat a BOM specially at all, and just treat it as another Unicode char.
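
To make that concrete, here is a small sketch in Python (purely
illustrative; it is not PostgreSQL's validation code, and the variable
names are made up for the example). It shows that U+FEFF encoded as UTF-8
is an ordinary, valid three-byte sequence that a plain UTF-8 decoder
accepts and counts as one character, wherever it appears in the string.

```python
# U+FEFF: ZERO WIDTH NO-BREAK SPACE, used as the byte-order mark.
BOM = "\ufeff"

# In UTF-8 the BOM is just a normal three-byte sequence.
bom_bytes = BOM.encode("utf-8")

# Hypothetical inputs: one with a leading BOM, one with a BOM in the middle.
leading = (BOM + "abc").encode("utf-8")
middle = ("ab" + BOM + "c").encode("utf-8")

# A plain UTF-8 decoder accepts both and keeps U+FEFF as a character.
leading_text = leading.decode("utf-8")
middle_text = middle.decode("utf-8")

# Python's separate "utf-8-sig" codec is an example of special-casing:
# it strips a BOM only when it is the very first character.
stripped = leading.decode("utf-8-sig")
unchanged = middle.decode("utf-8-sig")

print(bom_bytes.hex(), len(leading_text), len(middle_text))
```

Treating the BOM as just another character corresponds to the plain
"utf-8" behaviour above: nothing is rejected and nothing is stripped.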

cheers

andrew