Re: UTF8 with BOM support in psql - Mailing list pgsql-hackers

From Itagaki Takahiro
Subject Re: UTF8 with BOM support in psql
Date
Msg-id 20091117093151.14F2.52131E4D@oss.ntt.co.jp
In response to Re: UTF8 with BOM support in psql  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: UTF8 with BOM support in psql
Re: UTF8 with BOM support in psql
Re: UTF8 with BOM support in psql
List pgsql-hackers
Peter Eisentraut <peter_e@gmx.net> wrote:

> OK, I think the consensus here is:
> - Eat BOM at beginning of file (as you implemented)
> - Only when client encoding is UTF-8 --> please fix that

Are those two conditions ANDed together? If so, this patch will be useless.
Please remember that \encoding or SET client_encoding appears
*after* the BOM at the beginning of the file. I'll agree if the rule is
"Eat the BOM at the beginning of the file and <<set client encoding to UTF-8>>",
as in:
Defining Python Source Code Encodings:   http://www.python.org/dev/peps/pep-0263/
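
A minimal sketch of the rule proposed above, assuming the script is available as raw bytes before any \encoding or SET client_encoding line has been processed. The helper name eat_bom and the encoding label "UTF8" are hypothetical illustrations, not psql code:

```python
from typing import Tuple

# UTF-8 encoding of U+FEFF, the byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def eat_bom(data: bytes, client_encoding: str) -> Tuple[bytes, str]:
    """Hypothetical helper: drop a leading UTF-8 BOM and switch to UTF-8.

    The BOM is the first thing in the file, so it is seen before any
    encoding command inside the script. Under the proposed rule the BOM
    itself selects the encoding instead of being conditional on it.
    """
    if data.startswith(UTF8_BOM):
        # "Eat" the BOM and set the client encoding to UTF-8.
        return data[len(UTF8_BOM):], "UTF8"
    # No BOM: leave both the data and the current encoding untouched.
    return data, client_encoding
```

With this shape, a file that starts with a BOM is read as UTF-8 even when the session started with another client encoding, which is the case the AND condition would miss.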

> I'm not sure if replacing a BOM by three spaces is a good way to
> implement "eating", because it might throw off a column indicator
> somewhere, say, but I couldn't reproduce a problem.  Note that the
> U+FEFF character is defined as *zero-width* non-breaking space.

I assumed psql discards whitespace automatically, but I see it is
more robust to remove the BOM bytes explicitly. I'll fix it.
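
To illustrate the column-indicator concern, here is a rough sketch comparing the two ways of "eating" the BOM. The function names blank_bom, strip_bom, and column_of are hypothetical and only model the idea; they do not reflect how psql tracks positions:

```python
# UTF-8 encoding of U+FEFF, the byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def blank_bom(line: bytes) -> bytes:
    """Replace a leading BOM with three spaces (one per BOM byte)."""
    if line.startswith(UTF8_BOM):
        return b"   " + line[len(UTF8_BOM):]
    return line


def strip_bom(line: bytes) -> bytes:
    """Remove a leading BOM entirely."""
    if line.startswith(UTF8_BOM):
        return line[len(UTF8_BOM):]
    return line


def column_of(line: bytes, token: bytes) -> int:
    """1-based character column at which token first appears in line."""
    return line.decode("utf-8").index(token.decode("utf-8")) + 1


line = UTF8_BOM + b"SELEC 1;"
shifted = column_of(blank_bom(line), b"SELEC")  # spaces occupy three columns
exact = column_of(strip_bom(line), b"SELEC")    # token starts the line
```

Because U+FEFF is zero-width, a reader sees the token in the first column; only the stripped variant reports that position, while the blanked variant is off by three.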

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



