Thread: per-database locale: createdb switches
Hi, I just noticed that the interface for choosing a different locale at db creation time is createdb --lc-collate=X --lc-ctype=X. Is there a reason for having these two separate switches? It seems awkward; why can't we just have a single --locale switch that selects both settings at once? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Hi, > > I just noticed that the interface for choosing a different locale at db > creation time is > createdb --lc-collate=X --lc-ctype=X. Is there a reason for having > these two separate switches? It seems awkward; why can't we just have a > single --locale switch that selects both settings at once? > Sometimes it's needed to use C-collate with non-C-ctype. But for most users it's enough just a locale switch. What about [--locale=X|--lc-collate=X --lc-ctype=X] option?
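One possible reading of the bracket notation in the proposal above is that --locale and the pair of individual switches are alternatives. The sketch below models that reading in Python; it is not the createdb source, and the function name, the program name, the error text, and the locale names used with it are all hypothetical.

```python
import argparse


def parse_locale_switches(argv):
    """Resolve --locale / --lc-collate / --lc-ctype into (lc_collate, lc_ctype).

    Hypothetical model of the proposal, not the real createdb option parser.
    """
    parser = argparse.ArgumentParser(prog="createdb-sketch")
    parser.add_argument("--locale")
    parser.add_argument("--lc-collate")
    parser.add_argument("--lc-ctype")
    opts = parser.parse_args(argv)

    if opts.locale is not None:
        # Assumption: the shorthand and the individual switches are
        # alternatives, so combining them is rejected.
        if opts.lc_collate is not None or opts.lc_ctype is not None:
            raise ValueError(
                "--locale cannot be combined with --lc-collate or --lc-ctype"
            )
        # The shorthand selects both settings at once.
        return opts.locale, opts.locale

    # None means "not specified"; what happens then is outside this sketch.
    return opts.lc_collate, opts.lc_ctype
```

With this model, `parse_locale_switches(["--locale=ru_RU.UTF-8"])` yields the same value for both settings, while `parse_locale_switches(["--lc-collate=C", "--lc-ctype=ru_RU.UTF-8"])` keeps the C-collate with non-C-ctype combination available.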
Teodor Sigaev <teodor@sigaev.ru> writes: > Alvaro Herrera wrote: >> It seems awkward; why can't we just have a >> single --locale switch that selects both settings at once? > Sometimes it's needed to use C-collate with non-C-ctype. But for most > users it's enough just a locale switch. What about > [--locale=X|--lc-collate=X --lc-ctype=X] option? Seems to me there's one there already. regards, tom lane
Tom Lane wrote: > Teodor Sigaev <teodor@sigaev.ru> writes: > > Alvaro Herrera wrote: > >> It seems awkward; why can't we just have a > >> single --locale switch that selects both settings at once? > > > Sometimes it's needed to use C-collate with non-C-ctype. But for most > > users it's enough just a locale switch. What about > > [--locale=X|--lc-collate=X --lc-ctype=X] option? > > Seems to me there's one there already. You're thinking of initdb maybe? I'm talking about createdb.
$ LC_ALL=C createdb --version
createdb (PostgreSQL) 8.4devel
$ LC_ALL=C createdb --help
createdb creates a PostgreSQL database.
Usage:
  createdb [OPTION]... [DBNAME] [DESCRIPTION]
Options:
  -D, --tablespace=TABLESPACE  default tablespace for the database
  -E, --encoding=ENCODING      encoding for the database
      --lc-collate=LOCALE      LC_COLLATE setting for the database
      --lc-ctype=LOCALE        LC_CTYPE setting for the database
  -O, --owner=OWNER            database user to own the new database
  -T, --template=TEMPLATE      template database to copy
  -e, --echo                   show the commands being sent to the server
      --help                   show this help, then exit
      --version                output version information, then exit
Connection options:
  -h, --host=HOSTNAME          database server host or socket directory
  -p, --port=PORT              database server port
  -U, --username=USERNAME      user name to connect as
  -W, --password               force password prompt
By default, a database with the same name as the current user is created.
Report bugs to <pgsql-bugs@postgresql.org>.
-- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> Seems to me there's one there already. > You're thinking of initdb maybe? I'm talking about createdb. Oh, okay. But how often is someone going to be changing locales during createdb? I think the most common case might well be like Teodor said, where you need to tweak them individually anyway. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane wrote: > >> Seems to me there's one there already. > > > You're thinking of initdb maybe? I'm talking about createdb. > > Oh, okay. But how often is someone going to be changing locales during > createdb? I think the most common case might well be like Teodor said, > where you need to tweak them individually anyway. Frequently, I think. In fact I think creating a database in a different language is going to be more frequent than tweaking the settings individually. I like Teodor's proposal; I'll see about implementing that. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > I like Teodor's proposal; I'll see about implementing that. Attached. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Alvaro Herrera wrote: >> I like Teodor's proposal; I'll see about implementing that. > Attached. You missed updating the sgml docs, and personally I'd be inclined to list -l before the individual --lc switches; otherwise it looks fine. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Alvaro Herrera wrote: > >> I like Teodor's proposal; I'll see about implementing that. > > > Attached. > > You missed updating the sgml docs, and personally I'd be inclined to > list -l before the individual --lc switches; otherwise it looks fine. Thanks, committed that way. I noticed that --lc-ctype and --lc-collate were forgotten in SGML docs, so I added them too. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Tom Lane wrote: >> Alvaro Herrera <alvherre@commandprompt.com> writes: >>> Alvaro Herrera wrote: >>>> I like Teodor's proposal; I'll see about implementing that. >>> Attached. >> You missed updating the sgml docs, and personally I'd be inclined to >> list -l before the individual --lc switches; otherwise it looks fine. > > Thanks, committed that way. I noticed that --lc-ctype and --lc-collate > were forgotten in SGML docs, so I added them too. Should we have a shorthand CREATE DATABASE option like that as well? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Alvaro Herrera wrote: > > Tom Lane wrote: > >> Alvaro Herrera <alvherre@commandprompt.com> writes: > >>> Alvaro Herrera wrote: > >>>> I like Teodor's proposal; I'll see about implementing that. > >>> Attached. > >> You missed updating the sgml docs, and personally I'd be inclined to > >> list -l before the individual --lc switches; otherwise it looks fine. > > > > Thanks, committed that way. I noticed that --lc-ctype and --lc-collate > > were forgotten in SGML docs, so I added them too. > > Should we have a shorthand CREATE DATABASE option like that as well? createdb is really about convenience; not sure it is warranted for CREATE DATABASE. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Heikki Linnakangas wrote: >> Alvaro Herrera wrote: >>> Tom Lane wrote: >>>> Alvaro Herrera <alvherre@commandprompt.com> writes: >>>>> Alvaro Herrera wrote: >>>>>> I like Teodor's proposal; I'll see about implementing that. >>>>> Attached. >>>> You missed updating the sgml docs, and personally I'd be inclined to >>>> list -l before the individual --lc switches; otherwise it looks fine. >>> Thanks, committed that way. I noticed that --lc-ctype and --lc-collate >>> were forgotten in SGML docs, so I added them too. >> Should we have a shorthand CREATE DATABASE option like that as well? > > createdb is really about convenience; not sure it is warranted for > CREATE DATABASE. I think unless you are doing something completely funny, you would usually want to have COLLATE and CTYPE equal. The fact that you now have to enter both to get that result could be pretty annoying in practice, I would think.
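To illustrate the annoyance described above, here is a minimal client-side sketch that spells out both clauses from a single locale value. The helper names are hypothetical, the clause keywords COLLATE and CTYPE are the names used in this thread (the spelling accepted by a given server may differ), and the quoting is deliberately simplified.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Single-quote an SQL string literal, doubling embedded single quotes.

    Simplified: backslash handling is ignored in this sketch.
    """
    return "'" + value.replace("'", "''") + "'"


def create_database_sql(dbname, locale):
    """Build a CREATE DATABASE statement with COLLATE and CTYPE set equal.

    Hypothetical helper: it only builds the text and does not talk to a server.
    """
    return "CREATE DATABASE {} COLLATE = {} CTYPE = {}".format(
        quote_ident(dbname), quote_literal(locale), quote_literal(locale)
    )
```

The caller supplies the locale once, and the generated statement repeats it for both settings, which is the repetition a shorthand SQL option would remove.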
Peter Eisentraut wrote: > Bruce Momjian wrote: > > Heikki Linnakangas wrote: > >> Alvaro Herrera wrote: > >>> Tom Lane wrote: > >>>> Alvaro Herrera <alvherre@commandprompt.com> writes: > >>>>> Alvaro Herrera wrote: > >>>>>> I like Teodor's proposal; I'll see about implementing that. > >>>>> Attached. > >>>> You missed updating the sgml docs, and personally I'd be inclined to > >>>> list -l before the individual --lc switches; otherwise it looks fine. > >>> Thanks, committed that way. I noticed that --lc-ctype and --lc-collate > >>> were forgotten in SGML docs, so I added them too. > >> Should we have a shorthand CREATE DATABASE option like that as well? > > > > createdb is really about convenience; not sure it is warranted for > > CREATE DATABASE. > > I think unless you are doing something completely funny, you would > usually want to have COLLATE and CTYPE equal. The fact that you now > have to enter both to get that result could be pretty annoying in > practice, I would think. I agree but I can't think of many cases where we offer one option which controls two other options; can you? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: >>>>>> You missed updating the sgml docs, and personally I'd be inclined to >>>>>> list -l before the individual --lc switches; otherwise it looks fine. >>>>> Thanks, committed that way. I noticed that --lc-ctype and --lc-collate >>>>> were forgotten in SGML docs, so I added them too. >>>> Should we have a shorthand CREATE DATABASE option like that as well? >>> createdb is really about convenience; not sure it is warranted for >>> CREATE DATABASE. >> I think unless you are doing something completely funny, you would >> usually want to have COLLATE and CTYPE equal. The fact that you now >> have to enter both to get that result could be pretty annoying in >> practice, I would think. > > I agree but I can't think of many cases where we offer one option which > controls two other options; can you?
We have cases like that:
  initdb --locale
  createdb --locale
It looks to me, however, that there is possible confusion about what createdb --locale (as well as any possible option to be added to CREATE DATABASE) really affects:
initdb --locale controls --lc-ctype, --lc-collate, --lc-messages, --lc-monetary, --lc-numeric, --lc-time.
createdb --locale only controls --lc-ctype and --lc-collate. The functionality to have database-specific settings of the other locale categories already exists, so why shouldn't those be set as well?
Which raises yet another question, why CTYPE and COLLATE have to be hardcoded settings and catalog columns instead of being stored in datconfig as database-startup-only settings?
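The difference in scope described above can be written down as a small table. This is only a restatement of the two lists in the message as Python data; the constant and function names are hypothetical, and the category names are the switch names with "--" dropped and "-" replaced by "_".

```python
# Categories set by "initdb --locale", as listed in the message above.
INITDB_LOCALE_SETS = (
    "lc_collate",
    "lc_ctype",
    "lc_messages",
    "lc_monetary",
    "lc_numeric",
    "lc_time",
)

# Categories set by "createdb --locale", as listed in the message above.
CREATEDB_LOCALE_SETS = ("lc_collate", "lc_ctype")


def expand_locale(categories, locale):
    """Map every category in `categories` to the single locale value."""
    return {category: locale for category in categories}


def left_untouched(categories):
    """Return the initdb-controlled categories that `categories` does not cover."""
    return [c for c in INITDB_LOCALE_SETS if c not in categories]
```

Calling `left_untouched(CREATEDB_LOCALE_SETS)` lists the four categories that the createdb shorthand leaves alone, which is the gap the message asks about.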
Peter Eisentraut wrote: > Which raises yet another question, why CTYPE and COLLATE have to be > hardcoded settings and catalog columns instead of being stored in > datconfig as database-startup-only settings? Because changing CTYPE or COLLATE in an existing database would render indexes broken. Perhaps we could've put them in datconfig, and forbidden changing them after CREATE DATABASE. Then again, encoding is a similar setting too, and that's stored in a catalog column. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
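The point about broken indexes can be shown with a toy model: a sorted list stands in for an index, and a binary search stands in for an index lookup. This is not how PostgreSQL implements its indexes; the two key functions are simplified stand-ins for a C-like byte ordering and a case-insensitive ordering, and all names are hypothetical.

```python
def c_key(s):
    """Stand-in for C collation: compare by raw bytes."""
    return s.encode("utf-8")


def ci_key(s):
    """Stand-in for a case-insensitive collation."""
    return s.casefold()


def index_lookup(index, value, key):
    """Binary search that assumes `index` is sorted according to `key`."""
    lo, hi = 0, len(index)
    target = key(value)
    while lo < hi:
        mid = (lo + hi) // 2
        probe = key(index[mid])
        if probe == target:
            return True
        if probe < target:
            lo = mid + 1
        else:
            hi = mid
    return False


# Build the "index" under the byte ordering: uppercase sorts before lowercase.
index = sorted(["apple", "Banana", "cherry"], key=c_key)

# Lookups with the ordering the index was built under find every entry.
found_with_original = [index_lookup(index, w, c_key) for w in index]

# After "changing the collation", the same search misses an existing entry.
found_banana_after_change = index_lookup(index, "Banana", ci_key)
```

The entry is still stored, but the search walks the structure using comparisons that no longer match the order it was built in, so it looks in the wrong half and gives up.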
Heikki Linnakangas wrote: > Peter Eisentraut wrote: >> Which raises yet another question, why CTYPE and COLLATE have to be >> hardcoded settings and catalog columns instead of being stored in >> datconfig as database-startup-only settings? > > Because changing CTYPE or COLLATE in an existing database would render > indexes broken. > > Perhaps we could've put them in datconfig, and forbidden changing them > after CREATE DATABASE. Then again, encoding is a similar setting too, > and that's stored in a catalog column. Yeah, it's a tricky case somewhere in between all the facilities that we already have. I notice in the documentation that the createdb --lc-ctype sets the lc_ctype setting for the database, but the corresponding parameter for CREATE DATABASE is CTYPE, but the global GUC setting is lc_ctype. Should that be more consistent?
Peter Eisentraut wrote: > Heikki Linnakangas wrote: >> Peter Eisentraut wrote: >>> Which raises yet another question, why CTYPE and COLLATE have to be >>> hardcoded settings and catalog columns instead of being stored in >>> datconfig as database-startup-only settings? >> >> Because changing CTYPE or COLLATE in an existing database would render >> indexes broken. >> >> Perhaps we could've put them in datconfig, and forbidden changing them >> after CREATE DATABASE. Then again, encoding is a similar setting too, >> and that's stored in a catalog column. > > Yeah, it's a tricky case somewhere in between all the facilities that we > already have. > > I notice in the documentation that the createdb --lc-ctype sets the > lc_ctype setting for the database, but the corresponding parameter for > CREATE DATABASE is CTYPE, but the global GUC setting is lc_ctype. Should > that be more consistent? Hmm, I remember I pondered for a long time if it should be COLLATE and CTYPE or LC_COLLATE and LC_CTYPE. I think the rationale in the end was that a) COLLATE/CTYPE looks nicer and b) if we add support for ICU or some other collation implementation, the association with LC_* environment variables becomes misleading. Being consistent would be nice, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Peter Eisentraut wrote: >> I notice in the documentation that the createdb --lc-ctype sets the >> lc_ctype setting for the database, but the corresponding parameter for >> CREATE DATABASE is CTYPE, but the global GUC setting is lc_ctype. >> Should that be more consistent? > > Hmm, I remember I pondered for a long time if it should be COLLATE and > CTYPE or LC_COLLATE and LC_CTYPE. I think the rationale in the end was > that a) COLLATE/CTYPE looks nicer and b) if we add support for ICU or > some other collation implementation, the association with LC_* > environment variables becomes misleading. > > Being consistent would be nice, though. I think consistency could be reached by renaming the GUC setting to ctype. We could add a "lc_ctype" synonym for backwards compatibility (like sort_mem) -- or maybe not. Since the createdb setting is new as of 8.4, we should just rename that to ctype as well. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Heikki Linnakangas wrote: >> Hmm, I remember I pondered for a long time if it should be COLLATE and >> CTYPE or LC_COLLATE and LC_CTYPE. I think the rationale in the end was >> that a) COLLATE/CTYPE looks nicer and b) if we add support for ICU or >> some other collation implementation, the association with LC_* >> environment variables becomes misleading. >> >> Being consistent would be nice, though. > I think consistency could be reached by renaming the GUC setting to > ctype. I think this is a bad idea, particularly if you also rename the other GUC to COLLATE (which is a reserved word that we're going to have to implement someday). People know what LC_CTYPE and LC_COLLATE do, at least if they've heard of Unix locale support at all (and if not they can google those names successfully). If we want consistency then the right answer is to rename the *new* things to lc_xxx, not break compatibility on the names of the existing things. regards, tom lane
Peter Eisentraut wrote: > Bruce Momjian wrote: > >>>>>> You missed updating the sgml docs, and personally I'd be inclined to > >>>>>> list -l before the individual --lc switches; otherwise it looks fine. > >>>>> Thanks, committed that way. I noticed that --lc-ctype and --lc-collate > >>>>> were forgotten in SGML docs, so I added them too. > >>>> Should we have a shorthand CREATE DATABASE option like that as well? > >>> createdb is really about convenience; not sure it is warranted for > >>> CREATE DATABASE. > >> I think unless you are doing something completely funny, you would > >> usually want to have COLLATE and CTYPE equal. The fact that you now > >> have to enter both to get that result could be pretty annoying in > >> practice, I would think. > > > > I agree but I can't think of many cases where we offer one option which > > controls two other options; can you? > > We have cases like that: > > initdb --locale > createdb --locale > > It looks to me, however, that there is possible confusion about what > createdb --locale (as well as any possible option to be added to CREATE > DATABASE) really affects: > > initdb --locale controls --lc-ctype, --lc-collate, --lc-messages, > --lc-monetary, --lc-numeric, --lc-time. > > createdb --locale only controls --lc-ctype and --lc-collate. The > functionality to have database-specific settings of the other locale > categories already exists, so why shouldn't those be set as well? > > Which raises yet another question, why CTYPE and COLLATE have to be > hardcoded settings and catalog columns instead of being stored in > datconfig as database-startup-only settings? I was asking for cases where _SQL_ commands have one parameter that controls two others, not command-line examples. Can you think of any? FYI, I am fine adding the SQL-level option, I was just asking. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Heikki Linnakangas wrote: > >> Hmm, I remember I pondered for a long time if it should be COLLATE and > >> CTYPE or LC_COLLATE and LC_CTYPE. I think the rationale in the end was > >> that a) COLLATE/CTYPE looks nicer and b) if we add support for ICU or > >> some other collation implementation, the association with LC_* > >> environment variables becomes misleading. > >> > >> Being consistent would be nice, though. > > > I think consistency could be reached by renaming the GUC setting to > > ctype. > > I think this is a bad idea, particularly if you also rename the other > GUC to COLLATE (which is a reserved word that we're going to have to > implement someday). People know what LC_CTYPE and LC_COLLATE do, > at least if they've heard of Unix locale support at all (and if not > they can google those names successfully). > > If we want consistency then the right answer is to rename the *new* > things to lc_xxx, not break compatibility on the names of the > existing things. Is anyone working on resolving this? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Tom Lane wrote: >> Alvaro Herrera <alvherre@commandprompt.com> writes: >>> Heikki Linnakangas wrote: >>>> Hmm, I remember I pondered for a long time if it should be COLLATE and >>>> CTYPE or LC_COLLATE and LC_CTYPE. I think the rationale in the end was >>>> that a) COLLATE/CTYPE looks nicer and b) if we add support for ICU or >>>> some other collation implementation, the association with LC_* >>>> environment variables becomes misleading. >>>> >>>> Being consistent would be nice, though. >>> I think consistency could be reached by renaming the GUC setting to >>> ctype. >> I think this is a bad idea, particularly if you also rename the other >> GUC to COLLATE (which is a reserved word that we're going to have to >> implement someday). People know what LC_CTYPE and LC_COLLATE do, >> at least if they've heard of Unix locale support at all (and if not >> they can google those names successfully). >> >> If we want consistency then the right answer is to rename the *new* >> things to lc_xxx, not break compatibility on the names of the >> existing things. > > Is anyone working on resolving this? I think we can just leave it for now.