Encoding and multibyte support - Mailing list pgsql-docs

From Iain
Subject Encoding and multibyte support
Msg-id 003501c3e545$13858f10$7201a8c0@mst1x5r347kymb
List pgsql-docs
Hi All,
 
I recently had a slight problem with a development database because I used the default encoding of SQL_ASCII. When I tried to load the database into an EUC_JP database, of course there were some problems with invalid EUC_JP characters. Fortunately they were easy to find and fix.
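
For anyone hitting the same thing, here is a rough sketch of how the offending lines in a dump could be located before loading it. It is only an illustration: the function name `find_invalid_lines` is made up, and it assumes that Python's `euc_jp` codec rejects roughly the same byte sequences that an EUC_JP database would, which may not hold in every case.

```python
def find_invalid_lines(data: bytes, encoding: str = "euc_jp"):
    """Return (line_number, byte_offset) pairs for lines that fail to decode.

    Assumption: Python's codec is used as a stand-in for the server's own
    EUC_JP validation; the two are not guaranteed to agree exactly.
    """
    problems = []
    # Splitting on b"\n" is safe for EUC_JP because every byte of a
    # multibyte character has the high bit set, so 0x0A is always a newline.
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError as err:
            problems.append((line_number, err.start))
    return problems


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_dump.py dump.sql
    with open(sys.argv[1], "rb") as handle:
        for line_number, offset in find_invalid_lines(handle.read()):
            print(f"line {line_number}: invalid byte at offset {offset}")
```

Reading the dump as raw bytes matters here, since an SQL_ASCII database may hold bytes from more than one encoding and no single text decoding is guaranteed to succeed.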
 
Anyway, my search on "encoding" or "multibyte" showed up nothing in the 7.4 documentation. Eventually I found a page written by Tatsuo Ishii in the 7.2 documentation.
 
I think that it's an important area, and a potential trap for new players, so I'd like to see the documentation updated.
 
The following came out of a discussion with Tom Lane. I submitted it as a comment in the interactive documentation. I think it would be a good idea to check the details and update the doc:
------
The default encoding SQL_ASCII effectively disables any encoding conversion. This means that your db will accept any kind of data. It's a potential problem as you may end up with different kinds of encoding being used in both your data and metadata.
 
It would seem that unless you specifically need to store data in various encodings, you should select a specific encoding when creating a new database. Use initdb -E to set the default for all new DBs. This can be overridden when creating a new DB.
------
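
To make the comment above concrete, here is a small sketch of the two command lines involved, built as argument lists. The helper names `initdb_command` and `createdb_command`, the data directory and the database names are all made up for illustration; only the `-E` (encoding) and `-D` (data directory) options come from the tools themselves.

```python
import shlex


def initdb_command(data_dir: str, encoding: str = "EUC_JP"):
    """Arguments for initdb, setting the default encoding for all new DBs."""
    return ["initdb", "-E", encoding, "-D", data_dir]


def createdb_command(db_name: str, encoding=None):
    """Arguments for createdb; passing an encoding overrides the default."""
    command = ["createdb"]
    if encoding is not None:
        command += ["-E", encoding]
    command.append(db_name)
    return command


if __name__ == "__main__":
    # Hypothetical paths and names, printed rather than executed.
    print(shlex.join(initdb_command("/usr/local/pgsql/data")))
    print(shlex.join(createdb_command("appdb")))            # inherits the default
    print(shlex.join(createdb_command("legacy", "LATIN1")))  # per-database override
```

The commands are only printed here; running them needs a PostgreSQL installation and a suitable user, so treat this as a reminder of the options rather than a setup script.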
 
Also, the documentation for installation (chapter 14), creating database clusters (16.2) and creating databases (18.2) doesn't mention encoding at all. Maybe it should. Also, 16.2 should link to the documentation for initdb (Server Applications, section III). I think that would be a good idea.
 
regards
Iain
