Re: Support UTF-8 files with BOM in COPY FROM - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Support UTF-8 files with BOM in COPY FROM
Date
Msg-id CABUevExwxVivjbyxdv4=R_JQP=POs=BoMgzsrL1OJ3CTMjaVDw@mail.gmail.com
In response to Support UTF-8 files with BOM in COPY FROM  (Itagaki Takahiro <itagaki.takahiro@gmail.com>)
Responses Re: Support UTF-8 files with BOM in COPY FROM
Re: Support UTF-8 files with BOM in COPY FROM
List pgsql-hackers
On Mon, Sep 26, 2011 at 06:58, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> Hi,
>
> I'd like to support UTF-8 text or csv files that have a BOM (byte order mark)
> in the COPY FROM command. The BOM will be automatically detected and ignored
> if the file encoding is UTF-8. WIP patch attached.
>
> I'm thinking about only COPY FROM for reads, but if someone wants to add
> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
>
> Comments welcome.

I like it in general. But if we're looking at the BOM, shouldn't we
also check for and *reject* the file if it carries a BOM for a non-UTF-8
encoding? Say, if the BOM claims the file is UTF-16?
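
To make the suggestion concrete, here is a rough sketch of the check in
Python. It is not the WIP patch and not how COPY is implemented; the
function name strip_or_reject_bom and the choice of ValueError are
assumptions for illustration only. It skips a UTF-8 BOM and raises an
error when the data starts with a UTF-16 or UTF-32 BOM.

```python
import codecs

# Known BOMs, longest first, so that the UTF-32LE BOM (FF FE 00 00)
# is not mistaken for the shorter UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def strip_or_reject_bom(data: bytes) -> bytes:
    """Hypothetical helper: drop a UTF-8 BOM, reject any other BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            if name == "UTF-8":
                # Expected encoding: silently skip the three BOM bytes.
                return data[len(bom):]
            # The BOM claims a different encoding than the one expected.
            raise ValueError(f"input starts with a {name} BOM, expected UTF-8")
    # No BOM at all: pass the data through unchanged.
    return data
```

One caveat with this approach: the byte sequences are only a hint, so a
file in a single-byte encoding that happens to start with FF FE would be
rejected as well.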

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

