Thread: Problems with a custom LOCALE
Having got the 9.2 instance running in parallel with the previous version I am now encountering this difficulty: Couldn't create database for {"adapter"=>"postgresql", "collate"=>"en_US.UTF-8", "ctype"=>"en_US.UTF-8", "encoding"=>"UTF8", "database"=>"hll_th_forex_development", "host"=>"inet01.hamilton.harte-lyne.ca", "pool"=>"5", "password"=>"hll_theheart_devl_password", "sslmode"=>"require", "template"=>"template1", "username"=>"hll_theheart_db_devl"} PG::Error: ERROR: encoding "UTF8" does not match locale "en_CA@yyyy-mmm-dd.UTF-8" DETAIL: The chosen LC_CTYPE setting requires encoding "LATIN1". : CREATE DATABASE "hll_th_forex_test" ENCODING = 'UTF8' TEMPLATE = "template1" I am constructing this remotely using a Rake task borrowed from Ruby on Rails 3.2.9 and employing ActiveRecord-3.2.9. We use a custom local on all our servers "en_CA@yyyy-mmm-dd.UTF-8" which differs from en_CA only in the presentation of the date. However, regardless of the system setting, postgresql.conf was configured thus: # These settings are initialized by initdb, but they can be changed. lc_messages = 'en_US.UTF-8' # locale for system error message # strings lc_monetary = 'en_US.UTF-8' # locale for monetary formatting lc_numeric = 'en_US.UTF-8' # locale for number formatting lc_time = 'en_US.UTF-8' # locale for time formatting So, what is going wrong here? -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:ByrneJB@Harte-Lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
"James B. Byrne" <byrnejb@harte-lyne.ca> writes: > PG::Error: ERROR: encoding "UTF8" does not match locale > "en_CA@yyyy-mmm-dd.UTF-8" > DETAIL: The chosen LC_CTYPE setting requires encoding "LATIN1". > : CREATE DATABASE "hll_th_forex_test" ENCODING = 'UTF8' TEMPLATE = > "template1" > I am constructing this remotely using a Rake task borrowed from Ruby > on Rails 3.2.9 and employing ActiveRecord-3.2.9. We use a custom > local on all our servers "en_CA@yyyy-mmm-dd.UTF-8" which differs from > en_CA only in the presentation of the date. Sorta looks like you based that locale on an ISO 8859-1 locale, not a UTF8 locale. AFAICT from looking at the code, PG is seeing that nl_langinfo(CODESET) returns "ISO-8859-1" or some variant spelling, so it complains. regards, tom lane
On Fri, December 14, 2012 17:40, Tom Lane wrote: > "James B. Byrne" <byrnejb@harte-lyne.ca> writes: >> PG::Error: ERROR: encoding "UTF8" does not match locale >> "en_CA@yyyy-mmm-dd.UTF-8" >> DETAIL: The chosen LC_CTYPE setting requires encoding "LATIN1". >> : CREATE DATABASE "hll_th_forex_test" ENCODING = 'UTF8' TEMPLATE = >> "template1" > > > Sorta looks like you based that locale on an ISO 8859-1 locale, not a > UTF8 locale. AFAICT from looking at the code, PG is seeing that > nl_langinfo(CODESET) returns "ISO-8859-1" or some variant spelling, > so it complains. > This is what I see on the host running postgresql-9.2 # LC_ALL=en_CA@yyyy-mmm-dd.utf8 locale charmap UTF-8 # However, on that system, because of PostgreSQL, we have left the locale unmodified from default provided in CentOS-5.8: # locale LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8" LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_PAPER="en_US.UTF-8" LC_NAME="en_US.UTF-8" LC_ADDRESS="en_US.UTF-8" LC_TELEPHONE="en_US.UTF-8" LC_MEASUREMENT="en_US.UTF-8" LC_IDENTIFICATION="en_US.UTF-8" LC_ALL= # locale charmap UTF-8 Running locale against the base en_CA@yyyy-mmm-dd on the PG host shows this. LC_ALL=en_CA@yyyy-mmm-dd locale charmap ISO-8859-1 On the calling system I see this: $ locale LANG=en_CA.UTF-8 LC_CTYPE="en_CA.UTF-8" LC_NUMERIC="en_CA.UTF-8" LC_TIME="en_CA.UTF-8" LC_COLLATE="en_CA.UTF-8" LC_MONETARY="en_CA.UTF-8" LC_MESSAGES="en_CA.UTF-8" LC_PAPER="en_CA.UTF-8" LC_NAME="en_CA.UTF-8" LC_ADDRESS="en_CA.UTF-8" LC_TELEPHONE="en_CA.UTF-8" LC_MEASUREMENT="en_CA.UTF-8" LC_IDENTIFICATION="en_CA.UTF-8" LC_ALL= $ locale charmap UTF-8 Previously we were running PG 8.4 on that host and did not encounter this problem. -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:ByrneJB@Harte-Lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
"James B. Byrne" <byrnejb@harte-lyne.ca> writes: >>> PG::Error: ERROR: encoding "UTF8" does not match locale >>> "en_CA@yyyy-mmm-dd.UTF-8" >>> DETAIL: The chosen LC_CTYPE setting requires encoding "LATIN1". > This is what I see on the host running postgresql-9.2 > # LC_ALL=en_CA@yyyy-mmm-dd.utf8 locale charmap > UTF-8 > Running locale against the base en_CA@yyyy-mmm-dd on the PG host shows > this. > LC_ALL=en_CA@yyyy-mmm-dd locale charmap > ISO-8859-1 You're showing us three different spellings of the locale name above. Are you really sure they're all equivalent? Beyond that, you probably need to find a locale guru. I see no reason to think there is anything wrong with the Postgres code for this, and every reason to think there's something wrong with your locale definition. But I don't know enough about custom locales to help you identify exactly what. regards, tom lane
On Sun, December 16, 2012 15:13, Tom Lane wrote: > "James B. Byrne" <byrnejb@harte-lyne.ca> writes: >>>> PG::Error: ERROR: encoding "UTF8" does not match locale >>>> "en_CA@yyyy-mmm-dd.UTF-8" >>>> DETAIL: The chosen LC_CTYPE setting requires encoding "LATIN1". > >> This is what I see on the host running postgresql-9.2 >> # LC_ALL=en_CA@yyyy-mmm-dd.utf8 locale charmap >> UTF-8 > >> Running locale against the base en_CA@yyyy-mmm-dd on the PG host >> shows >> this. >> LC_ALL=en_CA@yyyy-mmm-dd locale charmap >> ISO-8859-1 > > You're showing us three different spellings of the locale name above. > Are you really sure they're all equivalent? > > Beyond that, you probably need to find a locale guru. I see no reason > to think there is anything wrong with the Postgres code for this, and > every reason to think there's something wrong with your locale > definition. But I don't know enough about custom locales to help you > identify exactly what. > > regards, tom lane > Well, I did eventually track this down and fix it. It was a number of issues which all sort of came together to drive me insane. First, the RDBMS was a remote host. On CentOS/RHEL ssh is configured to export the local session's LC environment to remote hosts. Problem, en_CA@yyyy-mmm-dd did not exist on the remote host. Running initdb in a ssh shell with the locale set to a non-existent local locale probably affected the database structure in some obscure way, although the postgresql.conf file said 'en_US.UTF-8'. However, perhaps the config file is ignored by the init.d script on RHEL in this respect as well. Of course, postgresql.conf does not yet exist when one runs initdb. Hmm. I will leave that one for the existentialists among us. In any case, installing the custom locale on the remote host did not solve the problem because when localedef could not find the UTF-8 character map it used ISO-8859-1 instead and kept that piece of information to itself because, I did not specify the -v option. Of course, if you do specify the -v option and there are any warnings then the existing character map is not replaced even when it does find the utf-8 character map. That is, unless you also specify the -c option. And finally, you need to archive the base character map (.utf8) and then create aliases for the rest (.utf-8, UTF8 and UTF-8). And the way one does that is to use the self-explanatory --no-archive option together with the -c option on the aliases but use --replace option on the localedef call to create the archived map. Really, localedef is perhaps the most Byzantine and under documented single function utility that I have run across since I stopped writing assembler. Anyway, after rebuilding the remote 9.2 database whilst having the remote session locale set to en_US.UTF-8, resetting the remote and local host's locale databases and aliases for en_CA@yyyy-mmm-dd, and then trying everything once more on the development host I ran into this error: PG::Error: ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (LATIN1) HINT: Use the same encoding as in the template database, or use template0 as template. : CREATE DATABASE "hll_th_forex_development" ENCODING = 'UTF-8' So, I now must specify a template in the database connection, where I did not need to before. Furthermore, the template I must use is template0 because regardless of what locale initdb is run in template1 evidently is always created with LATIN1 encoding. At lease if there is a way to force template1 to utf8 during initdb I now lack the energy to discover it. However, specifying template0 in the db connection does allow the remote db creation to succeed. -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:ByrneJB@Harte-Lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
"James B. Byrne" <byrnejb@harte-lyne.ca> writes: > PG::Error: ERROR: new encoding (UTF8) is incompatible with the > encoding of the template database (LATIN1) > HINT: Use the same encoding as in the template database, or use > template0 as template. > : CREATE DATABASE "hll_th_forex_development" ENCODING = 'UTF-8' > So, I now must specify a template in the database connection, where I > did not need to before. Furthermore, the template I must use is > template0 because regardless of what locale initdb is run in template1 > evidently is always created with LATIN1 encoding. At lease if there > is a way to force template1 to utf8 during initdb I now lack the > energy to discover it. You could use initdb's -E option, but really it ought to default to the charmap of the locale that prevails from its environment. I suspect there's still something wonky about your custom locale. (But man, your description of localedef is scary.) regards, tom lane