Thread: Default Locale in initdb

Default Locale in initdb

From
pgsql@mohawksoft.com
Date:
Is it me or has the default locale of created databases changed at some point?

Currently, on Linux, if one does not specify a locale, the locale is taken
from the system environment and it is not "C."

While I can see both sides of a discussion, I think that choosing a "locale"
without one being specified is a bad idea, even if it is the locale of the
machine. The reason why it is a bad idea is that certain features of the
database which only work correctly with a locale of "C" will not work by
default.




Re: Default Locale in initdb

From
Andrew Dunstan
Date:
pgsql@mohawksoft.com wrote:

>Is it me or has the default locale of created databases changed at some point?
>
>Currently, on Linux, if one does not specify a locale, the locale is taken
>from the system environment and it is not "C."
>
>While I can see both sides of a discussion, I think that choosing a "locale"
>without one being specified is a bad idea, even if it is the locale of the
>machine. The reason why it is a bad idea is that certain features of the
>database which only work correctly with a locale of "C" will not work by
>default.
>
>  
>

This is not new behaviour.

(Why are you the only person who posts here who is nameless?)

cheers

andrew


Re: Default Locale in initdb

From
Paul Ramsey
Date:
Just because it is not new does not mean that it is good.

When this new behavior was introduced, and I migrated our databases to 
the new PgSQL version (dump/restore), the locale of all my databases 
was silently changed from C to en_US. This broke one application in a 
very subtle way because of slightly different sort behavior in the 
new locale. Tracking it down was quite tricky.

PgSQL was just a little too helpful in this case.
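
To make the kind of breakage described above concrete, here is a minimal Python sketch of how the same data can sort differently. Python's default string comparison is by code point, which matches "C" locale ordering for ASCII text. The second function is only a rough stand-in for a linguistic collation (an assumption for illustration, not the actual en_US rules of any C library), and the sample names are made up.

```python
# Illustrative sample data (hypothetical, not from the thread).
names = ["banana", "Apple", "cherry", "apple", "Banana"]


def c_locale_sort(values):
    """Sort by code point, which matches "C" locale ordering for ASCII text."""
    return sorted(values)


def dictionary_like_sort(values):
    """Rough stand-in for a linguistic collation (an assumption, not real
    en_US rules): compare case-insensitively, lowercase first on ties."""
    return sorted(values, key=lambda s: (s.casefold(), s.swapcase()))


print(c_locale_sort(names))
print(dictionary_like_sort(names))
```

An application that relies on all uppercase names sorting ahead of lowercase ones would behave differently after such a change, with no error to point at the cause.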



--
Paul Ramsey
Refractions Research
Email: pramsey@refractions.net
Phone: (250) 885-0632


Re: Default Locale in initdb

From
Stephan Szabo
Date:
On Wed, 2 Jun 2004 pgsql@mohawksoft.com wrote:

> Is it me or has the default locale of created databases changed at some point?
>
> Currently, on Linux, if one does not specify a locale, the locale is taken
> from the system environment and it is not "C."
>
> While I can see both sides of a discussion, I think that choosing a "locale"
> without one being specified is a bad idea, even if it is the locale of the
> machine. The reason why it is a bad idea is that certain features of the
> database which only work correctly with a locale of "C" will not work by
> default.

The same is true of not taking the locale from the environment.  Other Unix
applications will sort "correctly" without additional work, but PostgreSQL
will not. The LIKE optimization can be "fixed" in recent versions by adding
an index and leaving the locale as it is, but getting correct sorting is
going to require a re-initdb.
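
As background on why the LIKE optimization depends on the locale: under byte-wise ("C") ordering, all strings that share a prefix sit next to each other in sorted order, so a prefix match can be answered as a range scan over an ordinary index. Under a linguistic collation that is not guaranteed. The Python sketch below emulates the range-scan idea on a sorted list; the function names are hypothetical, and this is an illustration of the principle rather than PostgreSQL's actual planner logic. The extra index mentioned above presumably refers to one built with a pattern operator class such as text_pattern_ops, which compares character by character regardless of locale.

```python
from bisect import bisect_left


def prefix_upper_bound(prefix):
    """Smallest string greater than every string starting with `prefix`
    under code-point ordering: bump the last character by one.
    Assumes a non-empty prefix whose last character is not the maximum
    code point."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def like_prefix_range_scan(sorted_keys, prefix):
    """Emulate LIKE 'prefix%' as the range prefix <= key < upper bound."""
    lo = bisect_left(sorted_keys, prefix)
    hi = bisect_left(sorted_keys, prefix_upper_bound(prefix))
    return sorted_keys[lo:hi]


# A sorted list stands in for a B-tree index built in code-point order.
index = sorted(["Abc", "ab", "abc", "abcd", "abd", "abz", "b"])
print(like_prefix_range_scan(index, "abc"))
```

The range scan returns the same rows as a full scan with a prefix test, but only because the ordering keeps the matches contiguous.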


Re: Default Locale in initdb

From
Andrew Dunstan
Date:
Paul Ramsey wrote:

> Just because it is not new does not mean that it is good.



Sure. I've been caught by it too. Once. :-)

>
> When this new behavior was introduced, and I migrated our databases to 
> the new PgSQL version (dump/restore), the locale of all my databases 
> was silently changed from C to en_US. This broke one application in a 
> very subtle way because of slightly different sort behavior in the 
> new locale. Tracking it down was quite tricky.
>
> PgSQL was just a little too helpful in this case.


It doesn't happen silently - initdb tells you what it is doing.

Ignoring the current environment and using a default value of "C" would 
be a very simple change to make, if that's what people want.

cheers

andrew





Re: Default Locale in initdb

From
Christopher Kings-Lynne
Date:
> When this new behavior was introduced, and I migrated our databases to 
> the new PgSQL version (dump/restore), the locale of all my databases 
> was silently changed from C to en_US. This broke one application in a 
> very subtle way because of slightly different sort behavior in the 
> new locale. Tracking it down was quite tricky.
> 
> PgSQL was just a little too helpful in this case.

Seems like a pretty nasty thing to do.  I would so vote for making -E, -W 
and --locale required flags to initdb.  Oh, the amount of time I've spent 
with people in IRC...

Chris



Re: Default Locale in initdb

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > When this new behavior was introduced, and I migrated our databases to 
> > the new PgSQL version (dump/restore), the locale of all my databases 
> > was silently changed from C to en_US. This broke one application in a 
> > very subtle way because of slightly different sort behavior in the 
> > new locale. Tracking it down was quite tricky.
> > 
> > PgSQL was just a little too helpful in this case.
> 
> Seems like a pretty nasty thing to do.  I would so vote for making -E, -W 
> and --locale required flags to initdb.  Oh, the amount of time I've spent 
> with people in IRC...

What about folks who don't use locales?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Default Locale in initdb

From
pgsql@mohawksoft.com
Date:
> Christopher Kings-Lynne wrote:
>> > When this new behavior was introduced, and I migrated our databases to
>> > the new PgSQL version (dump/restore), the locale of all my databases
>> > was silently changed from C to en_US. This broke one application in a
>> > very subtle way because of slightly different sort behavior in the
>> > new locale. Tracking it down was quite tricky.
>> >
>> > PgSQL was just a little too helpful in this case.
>>
>> Seems like a pretty nasty thing to do.  I would so vote for making -E, -W
>> and --locale required flags to initdb.  Oh, the amount of time I've spent
>> with people in IRC...
>
> What about folks who don't use locales?

This has bitten me a couple times. In what version did it change?

My feeling, and I'd like to see what everyone else thinks, is that if you
do not specify a locale, you get "C."

That way things work as you'd expect in most cases.




Re: Default Locale in initdb

From
Christopher Kings-Lynne
Date:
> This has bitten me a couple times. In what version did it change?
> 
> My feeling, and I'd like to see what everyone else thinks, is that if you
> do not specify a locale, you get "C."

I think that initdb should default to something, and do the following:

* Print an explicit warning if no locale is specified, stating what it is 
defaulting to

* Same for encoding.  NO-ONE knows about the -E option when they first 
use postgres.  Trust me on this.

* Same for -W.  NO-ONE knows this exists.  Then they change their trust 
entries to md5 and they can't log in to their postgres account anymore.
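
The three points above could be sketched as a small wrapper that builds an initdb command line and reports every setting left to a default. This is a hypothetical helper written for illustration, not part of initdb; it only assembles the argument list and does not run anything. The flags used (-D, --locale, -E, -W) are real initdb options, and "UTF8" and the data directory path are example values.

```python
def build_initdb_args(datadir, locale=None, encoding=None, pwprompt=False):
    """Return (argv, notes): the initdb argument list plus one note for
    every setting the caller left to a default. Hypothetical helper."""
    args = ["initdb", "-D", datadir]
    notes = []

    if locale is None:
        notes.append("no --locale given; initdb will take it from the environment")
    else:
        args.append(f"--locale={locale}")

    if encoding is None:
        notes.append("no -E given; initdb will pick a default encoding")
    else:
        args += ["-E", encoding]

    if pwprompt:
        args.append("-W")
    else:
        notes.append("no -W given; the superuser will have no password")

    return args, notes


args, notes = build_initdb_args("/tmp/pgdata", locale="C", encoding="UTF8", pwprompt=True)
print(args)
print(notes)
```

Calling the helper with only a data directory returns three notes, one per defaulted setting, which is roughly the warning behaviour being asked for.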


Re: Default Locale in initdb

From
Andrew Dunstan
Date:

Christopher Kings-Lynne wrote:

>> This has bitten me a couple times. In what version did it change?
>>
>> My feeling, and I'd like to see what everyone else thinks, is that if 
>> you
>> do not specify a locale, you get "C."
>
>
> I think that initdb should default to something, and do the following:
>
> * Print an explicit warning if no locale is specified, stating what it is 
> defaulting to
>
> * Same for encoding.  NO-ONE knows about the -E option when they first 
> use postgres.  Trust me on this.
>
> * Same for -W.  NO-ONE knows this exists.  Then they change their 
> trust entries to md5 and they can't log in to their postgres account anymore.
>

Of these, encoding can be overridden when you create a db, and the 
password issue can be recovered from very quickly. Only the lc-ctype and 
lc-collate settings are written in stone by initdb. So I think we can 
split up the cases.

ISTM there's a good case for defaulting at least lc-collate and lc-ctype 
to "C" rather than whatever the environment says (the other locale 
settings can be reset in the config file anyway).

cheers

andrew
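
The split proposed above can be sketched as a small resolution rule: an explicit option always wins; the two categories that initdb fixes permanently fall back to "C"; everything else still follows the environment. The Python below is a sketch of that proposal under those assumptions, not initdb's actual code, and the function names are hypothetical. The environment lookup follows the usual POSIX precedence of LC_ALL, then the category variable, then LANG.

```python
import os

# Categories that, per the discussion above, cannot be changed after initdb.
FROZEN_AT_INITDB = ("LC_COLLATE", "LC_CTYPE")


def env_locale(category, env):
    """POSIX lookup order: LC_ALL, then the category variable, then LANG.
    Falls back to "C" when none of them is set."""
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


def resolve_locale(category, explicit=None, env=None):
    """Pick the locale for one category under the proposed rule."""
    if env is None:
        env = os.environ
    if explicit is not None:
        return explicit          # the user asked for it, so use it
    if category in FROZEN_AT_INITDB:
        return "C"               # proposed safe default for frozen settings
    return env_locale(category, env)


example_env = {"LANG": "en_US.UTF-8", "LC_MESSAGES": "de_DE.UTF-8"}
for cat in ("LC_COLLATE", "LC_CTYPE", "LC_MESSAGES", "LC_MONETARY"):
    print(cat, resolve_locale(cat, env=example_env))
```

Under this rule a user who wants environment-based collation has to say so explicitly, while the settings that can be changed later in the config file keep their current behaviour.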