Re: UTF8 with BOM support in psql - Mailing list pgsql-hackers

From	Peter Eisentraut
Subject	Re: UTF8 with BOM support in psql
Date	November 21, 2009 20:01:46
Msg-id	1258847958.30675.9.camel@vanquo.pezone.net
In response to	Re: UTF8 with BOM support in psql (Peter Eisentraut <peter_e@gmx.net>)
List	pgsql-hackers

On Mon, 2009-11-16 at 22:37 +0200, Peter Eisentraut wrote:
> On Wed, 2009-10-21 at 13:11 +0900, Itagaki Takahiro wrote:
> > Sure. The client encoding is declared in the body of a file, but the
> > BOM is at the head of the file. So, we should always ignore a BOM
> > sequence at the file head no matter what client encoding is used.
> > 
> > The attached patch replaces the BOM with white spaces, but it does not
> > change the client encoding automatically. I think we can always ignore
> > the client encoding for the replacement because an SQL command cannot
> > start with a BOM sequence. If we don't ignore the sequence, execution
> > of the script must fail with a syntax error.
> 
> OK, I think the consensus here is:
> 
> - Eat BOM at beginning of file (as you implemented)
> 
> - Only when client encoding is UTF-8 --> please fix that
> 
> I'm not sure if replacing a BOM by three spaces is a good way to
> implement "eating", because it might throw off a column indicator
> somewhere, say, but I couldn't reproduce a problem.  Note that the
> U+FEFF character is defined as *zero-width* non-breaking space.

I have committed a change that implements the above.
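
For readers who want the agreed behaviour spelled out, here is a rough
sketch in Python. It is not the committed psql code, which is not shown in
this message. The function name strip_leading_bom and the way the encoding
name is normalized are assumptions made for illustration only. The sketch
removes the three BOM bytes instead of overwriting them with spaces, which
is one possible way to avoid the column-offset concern raised above.

```python
import codecs


def strip_leading_bom(first_line: bytes, client_encoding: str) -> bytes:
    """Drop a UTF-8 BOM from the first line of a script file.

    Hypothetical helper: the BOM is eaten only when the client encoding
    is UTF-8, as agreed in the thread. Other encodings are left alone.
    """
    # Accept common spellings such as "UTF8", "utf-8" and "UTF_8".
    normalized = client_encoding.replace("-", "").replace("_", "").upper()

    if normalized == "UTF8" and first_line.startswith(codecs.BOM_UTF8):
        # codecs.BOM_UTF8 is the byte sequence EF BB BF (U+FEFF in UTF-8).
        # Removing it, rather than replacing it with three spaces, keeps
        # later column positions unchanged.
        return first_line[len(codecs.BOM_UTF8):]

    return first_line
```

Only the first line read from the file would be passed through such a
helper; a BOM sequence anywhere else in the script is left untouched and
would be reported as a syntax error as before.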
