I've worked out a scheme that should adequately detect encoding
mismatches in initdb. Please comment on the following behavior.

The locale is still taken from the environment or the command line; no
change.

If the locale is C or POSIX, then we set the encoding to SQL_ASCII or
whatever was specified on the command line, and do nothing further.
(No useful matching can be done in this case.)

If the locale is not C or POSIX:

    If the encoding was specified, check it for compatibility with the
    locale. If it is not compatible, print a warning. Continue in any
    case.

    If the encoding was not specified, pick a matching one, print it out,
    and continue. (This is probably the most usual case.)

    If the encoding was not specified and no matching encoding could be
    found, print an error message asking the user to set one explicitly,
    and exit.
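
For illustration, the rules above can be sketched roughly as follows. This is not the initdb code, only a minimal model of the decision logic. The function name choose_encoding, the exception NoMatchingEncoding, and the small codeset table are hypothetical, and the caller is assumed to pass in the codeset name that the locale uses rather than have the sketch look it up.

```python
# Hypothetical, deliberately tiny table mapping a locale's codeset name to a
# database encoding name. The real mapping is not given in this message.
CODESET_TO_ENCODING = {
    "ISO-8859-1": "LATIN1",
    "ISO-8859-15": "LATIN9",
    "UTF-8": "UNICODE",
}


class NoMatchingEncoding(Exception):
    """Raised when no encoding was requested and none can be derived."""


def choose_encoding(locale_name, locale_codeset, requested=None):
    """Return (encoding, warning), where warning is None or a short string.

    locale_codeset is the codeset the locale uses (assumed to be supplied
    by the caller); requested is the encoding given on the command line.
    """
    # C/POSIX: no useful matching can be done, so take the request as is.
    if locale_name in ("C", "POSIX"):
        return (requested or "SQL_ASCII", None)

    matching = CODESET_TO_ENCODING.get(locale_codeset)

    # Encoding specified: check compatibility, warn on mismatch, continue.
    if requested is not None:
        if matching != requested:
            return (requested, "encoding mismatch")
        return (requested, None)

    # Encoding not specified: pick the matching one, or give up.
    if matching is None:
        raise NoMatchingEncoding(
            'could not find suitable encoding for locale "%s"' % locale_name
        )
    return (matching, None)
```

With this table, choose_encoding("de_DE@euro", "ISO-8859-15") picks LATIN9, and adding requested="UNICODE" keeps UNICODE but returns the mismatch warning. A codeset that is absent from the table and comes with no requested encoding raises NoMatchingEncoding, which corresponds to the error case.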

Here are some "screenshots":

$ initdb -D pg-install/var/data --locale=de_DE@euro
[...]
The database cluster will be initialized with locale de_DE@euro.
The default database encoding has accordingly been set to LATIN9.

$ initdb -D pg-install/var/data --locale=de_DE@euro --encoding=UNICODE
[...]
The database cluster will be initialized with locale de_DE@euro.
initdb: warning: encoding mismatch
The encoding you selected (UNICODE) and the encoding that the selected
locale uses (ISO-8859-15) are not known to match. This may lead to
misbehavior in various character string processing functions. To fix
this situation, rerun initdb and either do not specify an encoding
explicitly, or choose a matching combination.
[continues...]

$ initdb -D pg-install/var/data --locale=japanese.sjis
[...]
The database cluster will be initialized with locale japanese.sjis.
initdb: could not find suitable encoding for locale "japanese.sjis"
Rerun initdb with the -E option.
Try "initdb --help" for more information.
[exit 1]

--
Peter Eisentraut
http://developer.postgresql.org/~petere/