Thread: Changing DB Encodings
Why have I been told that I need to do a re initdb to change the char encoding? The man says i can just createdb foodb -E utf8 so why would i need to dump/initdb/create/restore? cant i just dump/create/restore? It'd save all the messing around with changing the data dirs etc.
Naz Gassiep <naz@mira.net> writes: > Why have I been told that I need to do a re initdb to change the char > encoding? The man says i can just createdb foodb -E utf8 so why > would i need to dump/initdb/create/restore? cant i just dump/create/restore? Are you clear on the difference between encoding and locale? You can make new databases with whatever encoding you say, but the server's lc_collate and lc_ctype are frozen at initdb, and it will not work well to select an encoding that is incompatible with the locale setting. In practice this means that you can only use multiple database encodings when you initdb'd in "C" locale; all other locale settings imply a particular encoding. Yes, this is pretty annoying. No, it's not easy to fix. regards, tom lane
Tom Lane wrote: > Are you clear on the difference between encoding and locale? > I confidently reply with "maybe". > You can make new databases with whatever encoding you say, but the > server's lc_collate and lc_ctype are frozen at initdb, and it will > not work well to select an encoding that is incompatible with the > locale setting. In practice this means that you can only use multiple > database encodings when you initdb'd in "C" locale; all other locale > settings imply a particular encoding. > > Yes, this is pretty annoying. No, it's not easy to fix. > OK I understand now, I think. How do I check what locale is currently set? I've successfully created a utf8 database, does that imply that because I was able to create a DB with a different encoding to the ones all the others use (SQL_ASCII) that my locale is set to "C" ? Regards, - Naz.
Naz Gassiep <naz@mira.net> writes: > OK I understand now, I think. How do I check what locale is currently > set? "show lc_collate" (maybe check the other lc_ settings too for luck). > I've successfully created a utf8 database, does that imply that > because I was able to create a DB with a different encoding to the ones > all the others use (SQL_ASCII) that my locale is set to "C" ? No, that implies a lack of error checking. regards, tom lane
Tom Lane wrote: >> I've successfully created a utf8 database, does that imply that >> because I was able to create a DB with a different encoding to the ones >> all the others use (SQL_ASCII) that my locale is set to "C" ? >> > > No, that implies a lack of error checking. Surely, then, that's a bug? Shouldn't postmaster check if a DB is in an encoding that differs from the selected locale? lc_collate and lc_type both report C as the locale. Does that mean I can safely bulk load a pg_dump into it? When doing the dump, do I have to use --encoding=utf8 as a param? Please excuse me if I sound ignorant of all of this, I'm currently forced to address an issue that I don't yet feel I have a sufficient grasp of.
Naz Gassiep <naz@mira.net> writes: > Tom Lane wrote: >> No, that implies a lack of error checking. > Surely, then, that's a bug? Shouldn't postmaster check if a DB is in an > encoding that differs from the selected locale? Yeah, it should. Whether it can is a different question. Part of the problem here is the lack of a reliable way to tell *which* encoding is implied by a locale. On some systems you can get a poorly-standardized string name for the locale's encoding; on others you can't get anything. There's been some experimental code in initdb for awhile now that tries to guess encoding from locale. I have not heard reports of it failing lately, so maybe we could promote it into a hard error check, or at least a backend-side warning at CREATE DATABASE time. It still won't help on old systems without nl_langinfo(CODESET), though. (But how many of those are left? That call is specified by the Single Unix Spec. Anybody know if it works on Windows?) regards, tom lane