On 02/15/2013 12:45 AM, Peter Eisentraut wrote:
> On 2/11/13 10:22 PM, Greg Stark wrote:
>> On Sun, Feb 10, 2013 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> If we knew that postgresql.conf was stored in, say, UTF8, then it would
>>> probably be possible to perform encoding conversion to get string
>>> variables into the database encoding. Perhaps we should allow some
>>> magic syntax to tell us the encoding of a config file?
>>>
>>> file_encoding = 'utf8' # must precede any non-ASCII in the file
>> If we're going to do that we might as well use the Emacs standard
>> -*-coding: latin-1;-*-
> Yes, or more generally perhaps what Python does:
> http://docs.python.org/2.7/reference/lexical_analysis.html#encoding-declarations
>
> (In Python 2, the default is ASCII, in Python 3, the default is UTF8.)
Note that Python also honors a BOM at the start of a source file, treating
it as a declaration that the file is UTF-8. The documentation says:
"In addition, if the first bytes of the file are the UTF-8 byte-order
mark ('\xef\xbb\xbf'), the declared file encoding is UTF-8."
IMO we should do the same. If there's no explicit encoding declaration,
treat it as UTF-8 if there's a BOM and as the platform's local character
encoding otherwise.
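
To make the proposed rule concrete, here is a rough sketch in Python of the
detection order I have in mind. It is only an illustration, not PostgreSQL
code: the function name detect_config_encoding is made up, and the
file_encoding = '...' syntax is just the hypothetical declaration from
Tom's message upthread. The sketch assumes an explicit declaration wins
over a BOM, and that the declaration must appear before any non-ASCII byte.

```python
import codecs
import locale
import re

# Hypothetical declaration syntax taken from the proposal in this thread:
#   file_encoding = 'utf8'   # must precede any non-ASCII in the file
DECL_RE = re.compile(rb"^\s*file_encoding\s*=\s*'([^']+)'")


def detect_config_encoding(raw: bytes) -> str:
    """Guess the encoding of a config file from its raw bytes.

    Order of precedence (an assumption of this sketch):
      1. an explicit file_encoding declaration in the ASCII-only prefix
      2. a UTF-8 byte-order mark at the start of the file
      3. the platform's locale encoding
    """
    has_bom = raw.startswith(codecs.BOM_UTF8)
    body = raw[len(codecs.BOM_UTF8):] if has_bom else raw

    for line in body.splitlines():
        # Stop scanning at the first line containing a non-ASCII byte;
        # a declaration after that point would come too late.
        if any(byte > 0x7F for byte in line):
            break
        match = DECL_RE.match(line)
        if match:
            # codecs.lookup() normalizes aliases such as 'utf8' -> 'utf-8'
            # and raises LookupError for an unknown encoding name.
            return codecs.lookup(match.group(1).decode("ascii")).name

    if has_bom:
        return "utf-8"

    # No declaration and no BOM: fall back to the local character encoding.
    return codecs.lookup(locale.getpreferredencoding(False)).name
```

A caller would read the file in binary mode, pass the bytes to
detect_config_encoding(), and then decode with the returned name. Whether a
BOM that conflicts with an explicit declaration should be an error (as it is
in Python) is left open here.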
-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services