Thread: Rough sketch for locale by default
I've mentioned a while ago that I wanted to make the --enable-locale switch the default (and remove the switch), and make the choice of locale-awareness a run-time choice. Here is how that might work. I've already explained how I plan to get around the performance problems, so this will just focus on the user interface.

We currently have two kinds of locale categories: Those that must be fixed at initdb-time and those that may be changed at run-time. I suggest that initdb always defaults its locales to C and that we provide command line options to set a different locale. E.g.,

    initdb --lc-collate=en_US

This makes the change transparent for those who like the C locale. It is also much clearer than figuring out which of LANG, LC_COLLATE, LC_ALL will get in your way.

Personally, I also find it better to separate the locale settings in your login account meant for interactive use from those meant for PostgreSQL servers. In other words, if I'm the "postgres" account and administering a bunch of databases I'd still like to set LC_ALL=de_DE so all the shell commands print their things formatted right, and I don't want to change this every time I start a server from within that account.

In particular, I'd like the following set of options:

    --lc-collate
    --lc-ctype
    --locale    (allows specifying all in one, but may be overridden by specific options)

It might actually work to say

    initdb --locale=''

to force inheriting the settings from the environment.

In the post-initdb stage, we'd add a bunch of GUC variables, such as

    lc_numeric
    lc_monetary
    lc_time
    locale

These all default to "C". For a start we'd make them fixed for the lifetime of the postmaster, but we could evaluate other options later.

This again makes this change hidden for users that didn't use locale support. Also, it prevents accidentally changing the locale when you (or someone else) fiddle with your environment variables.

Note that you get the same kind of command line options as in initdb: --lc-numeric, --locale, etc. You can also run SHOW lc_numeric to see what's going on.

Comments?

-- 
Peter Eisentraut   peter_e@gmx.net

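[Editorial sketch] The option precedence proposed for initdb above can be illustrated roughly as follows. This is not code from PostgreSQL; the function names and the dictionary-based option format are hypothetical. It assumes that an empty value means "inherit from the environment" for any of the options (the message only suggests this for --locale), and that inheriting follows the usual POSIX lookup order of LC_ALL, then the category's own variable, then LANG.

```python
INITDB_CATEGORIES = ("lc_collate", "lc_ctype")


def inherit_from_environment(category, environ):
    """Hypothetical helper: POSIX lookup order, empty values count as unset."""
    for name in ("LC_ALL", category.upper(), "LANG"):
        value = environ.get(name)
        if value:
            return value
    return "C"


def resolve_initdb_locales(options, environ):
    """Resolve the initdb-time categories.

    options maps option names without dashes ("locale", "lc_collate",
    "lc_ctype") to values; an option that was not given is simply absent.
    """
    resolved = {}
    for category in INITDB_CATEGORIES:
        if category in options:        # a specific option overrides --locale
            value = options[category]
        elif "locale" in options:      # --locale sets all categories at once
            value = options["locale"]
        else:                          # proposed default: always "C"
            value = "C"
        if value == "":                # assumption: '' means inherit
            value = inherit_from_environment(category, environ)
        resolved[category] = value
    return resolved
```

With no options the result is "C" for every category regardless of the environment, which is the property the proposal is after: LANG or LC_ALL in the login account cannot leak into the cluster unless --locale='' asks for it.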
Peter Eisentraut <peter_e@gmx.net> writes:
> [ good stuff snipped ]
> ... Also, it prevents accidentally changing the locale when you
> (or someone else) fiddle with your environment variables.

If I follow this correctly, the behavior would be that PG would not pay attention to *any* LC_xxx environment variables? Although I agree with that principle in the abstract, it bothers me that PG will be out of step with every single other locale-using program in the Unix world. We ought to think twice about whether that's really a good idea.

> Note that you get the same kind of command line options as in initdb:
> --lc-numeric, --locale, etc. You can also run SHOW lc_numeric to see
> what's going on.

Probably you thought of this already: please also support SHOW for the initdb-time variables (lc_collate, etc), so that one can find out the active locale settings without having to resort to contrib/pg_controldata.

regards, tom lane

On Wed, 2002-03-27 at 19:26, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > [ good stuff snipped ]
>
> > ... Also, it prevents accidentally changing the locale when you
> > (or someone else) fiddle with your environment variables.
>
> If I follow this correctly, the behavior would be that PG would not pay
> attention to *any* LC_xxx environment variables? Although I agree with
> that principle in the abstract, it bothers me that PG will be out of
> step with every single other locale-using program in the Unix world.

IIRC Oracle uses NLS_LANG and not any LC_* (even on Unix ;)

It is set to something like

    NLS_LANG=ESTONIAN_ESTONIA.WE8ISO8859P15

> We ought to think twice about whether that's really a good idea.
>
> > Note that you get the same kind of command line options as in initdb:
> > --lc-numeric, --locale, etc. You can also run SHOW lc_numeric to see
> > what's going on.
>
> Probably you thought of this already: please also support SHOW for the
> initdb-time variables (lc_collate, etc), so that one can find out the
> active locale settings without having to resort to
> contrib/pg_controldata.

------------
Hannu

On Wed, 2002-03-27 at 19:05, Peter Eisentraut wrote:
> I've mentioned a while ago that I wanted to make the --enable-locale
> switch the default (and remove the switch), and make the choice of
> locale-awareness a run-time choice. Here is how that might work. I've
> already explained how I plan to get around the performance problems, so
> this will just focus on the user interface.
>
> We currently have two kinds of locale categories: Those that must be
> fixed at initdb-time and those that may be changed at run-time.

As a more radical idea, we should get rid of those which are fixed at initdb time (except the database's storage charset) and do proper NCHAR types for anything not in the C locale.

-----------
Hannu

Tom Lane writes:

> If I follow this correctly, the behavior would be that PG would not pay
> attention to *any* LC_xxx environment variables? Although I agree with
> that principle in the abstract, it bothers me that PG will be out of
> step with every single other locale-using program in the Unix world.

During earlier discussions people had objected to enabling locale support by default on the grounds that it is very hard to follow which locale is getting activated when. Especially from Japan I heard that a lot of people have some locale settings in their environment, but that most locales are unsuitable ("broken") for use in the PostgreSQL server. So this approach would keep the behavior backward compatible with the --disable-locale case.

Here's a possible compromise for the postmaster: We let initdb figure out what locales the user wants and then not only initialize pg_control appropriately, but also write the run-time changeable categories into the postgresql.conf file. That way, the postmaster executable could still consult the LC_* variables, but in the common case it would just be overridden when the postgresql.conf file is read.

This way we also hide the details of what locale category gets what treatment from users that only want one locale for all categories and don't want to change it. Furthermore, it all but eliminates the problem I'm concerned about, namely that the locale may accidentally be changed when the postmaster is restarted.

How does initdb figure out what locale is wanted? I agree it makes sense to use the setting in the environment, because in many cases the database will want to use the same locale as everything else on the system. We could provide a flag --no-locale, which sets all locale categories to "C", as a clear and simple way to turn this off.

-- 
Peter Eisentraut   peter_e@gmx.net
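
[Editorial sketch] The compromise described in this message can be modelled roughly as below. This is only an illustration under stated assumptions, not PostgreSQL code: all function names are hypothetical, the run-time categories are taken to be lc_numeric, lc_monetary and lc_time as listed earlier in the thread, and the environment is read in the usual POSIX order (LC_ALL, then the category's own variable, then LANG). The no_locale parameter stands in for the proposed --no-locale flag.

```python
RUNTIME_CATEGORIES = ("lc_numeric", "lc_monetary", "lc_time")


def env_locale(category, environ):
    """POSIX lookup order; empty values are treated as unset."""
    for name in ("LC_ALL", category.upper(), "LANG"):
        if environ.get(name):
            return environ[name]
    return "C"


def initdb_conf_lines(environ, no_locale=False):
    """Lines a hypothetical initdb would write into postgresql.conf."""
    lines = []
    for category in RUNTIME_CATEGORIES:
        # --no-locale forces "C"; otherwise snapshot the environment once.
        value = "C" if no_locale else env_locale(category, environ)
        lines.append(f"{category} = '{value}'")
    return lines


def postmaster_locales(conf_lines, environ):
    """Effective settings at postmaster start.

    The environment is consulted first, then any entry found in
    postgresql.conf overrides it.
    """
    effective = {c: env_locale(c, environ) for c in RUNTIME_CATEGORIES}
    for line in conf_lines:
        name, _, value = line.partition("=")
        name = name.strip()
        if name in effective:
            effective[name] = value.strip().strip("'")
    return effective
```

Because initdb records the values once, restarting the postmaster from a shell with different LC_* settings leaves the effective locale unchanged; the environment only matters for categories that have been removed from postgresql.conf.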