Re: buildfarm / handling (undefined) locales - Mailing list pgsql-hackers

From: Christoph Berg
Subject: Re: buildfarm / handling (undefined) locales
Msg-id: 20140513204013.GA21331@msgid.df7cb.de
In response to: Re: buildfarm / handling (undefined) locales (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Re: Tom Lane 2014-05-13 <27525.1400012096@sss.pgh.pa.us>
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> > On 05/13/2014 09:58 PM, Tom Lane wrote:
> >> ... If so the issue is presumably
> >> that the environment variable(s) were set to incorrect values.  While
> >> we *could* abort in that situation, I've never heard of any program
> >> that did; the normal response is to silently ignore the environment
> >> variables and use C locale.  We're not being exactly silent about it
> >> but I think the outcome is the expected one.
> 
> > Initdb isn't like most programs. The locale given to initdb is memorized 
> > in the data directory, and if you later notice that it was wrong, you'll 
> > have to dump and reload. There is a strong argument for initdb to be 
> > more strict than, say, your average text editor.
> 
> Hm, well, if that's the behavior we want then it's certainly an easy
> change.

It should definitely fail. If you have some LC_ variables set, you
want that locale and its charset stored in your database. If the DB
silently ends up using C, that's not helpful. (It's probably even
worse, as SQL_ASCII will accept binary garbage without checking
anything, so you'll only notice when it's too late.)

Bad locales are the #1 reason for initdb problems at install time for
Debian packages. While pg_createcluster catches some of these itself
before invoking initdb, making initdb's behavior more deterministic
would be a good thing.
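
To illustrate the kind of check a wrapper can do before invoking
initdb, here is a minimal sketch in Python. It is not what
pg_createcluster actually does; the function name is_valid_locale is
made up for this example. It simply asks the C library whether it
accepts a locale name and restores the previous setting afterwards.
An empty name means "use whatever the environment says".

```python
import locale


def is_valid_locale(name: str, category: int = locale.LC_ALL) -> bool:
    """Return True if the C library accepts `name` for `category`.

    Hypothetical helper for illustration only. An empty `name` asks
    setlocale() to take the setting from the environment.
    """
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        # setlocale() rejected the name (unknown locale or charset)
        return False
    finally:
        # Always put the process locale back the way it was.
        locale.setlocale(category, saved)
```

A strict caller would stop when is_valid_locale("") returns False
instead of carrying on with C. Note that the result depends on the
platform's C library; some implementations accept almost any name.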

> But independently of whether it's a fatal error or not: when there's
> no relevant command-line argument then we print the
> 
>     invalid locale name ""
> 
> message which is surely pretty unhelpful.  It'd be better if we could
> finger the incorrect environment setting.  Unfortunately, we don't know
> for sure which environment variable(s) setlocale was looking at --- I
> believe it's somewhat platform specific.  We could probably print
> something like this instead:
> 
>     environment locale settings are invalid

Definitely a good plan. The current behavior is just not helpful:

$ LANG=de_DE.utf-9 /usr/lib/postgresql/9.4/bin/initdb -D /tmp/bar
The files belonging to this database system will be owned by user "cbe".
This user must also own the server process.

initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
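
For the message itself, a rough sketch of how the offending variable
could be named, assuming the POSIX precedence (LC_ALL first, then the
category's own variable, then LANG). As Tom says, what setlocale
really consults is somewhat platform specific, so this is a guess and
not a statement about any particular libc. The function names and the
message wording are invented for this example.

```python
import os
from typing import Mapping, Optional, Tuple


def locale_env_source(
    category: str, env: Optional[Mapping[str, str]] = None
) -> Optional[Tuple[str, str]]:
    """Return (variable, value) deciding `category`, or None if nothing is set.

    Assumes POSIX precedence: LC_ALL, then e.g. LC_COLLATE, then LANG.
    """
    if env is None:
        env = os.environ
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # POSIX treats an empty value like an unset variable
            return var, value
    return None


def describe_bad_setting(
    category: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Build a hypothetical, more helpful error message for `category`."""
    source = locale_env_source(category, env)
    if source is None:
        return f"no locale variable is set for {category}"
    var, value = source
    return (
        f'invalid locale name "{value}" for {category} '
        f"(from environment variable {var})"
    )
```

With the environment from the run above, describe_bad_setting would
point at LANG and its value rather than printing an empty name.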

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/


