Thread: Re: [COMMITTERS] pgsql: Re-allow UTF8 encodings on win32.
mha@postgresql.org (Magnus Hagander) writes: > Re-allow UTF8 encodings on win32. Since UTF8 is converted to > UTF16 before being used, all (valid) locales will work for this. So where do we stand on the Windows locale/encoding business --- are we happy with the behavior now, or does it still need work? regards, tom lane
Tom Lane wrote: > mha@postgresql.org (Magnus Hagander) writes: >> Re-allow UTF8 encodings on win32. Since UTF8 is converted to >> UTF16 before being used, all (valid) locales will work for this. > > So where do we stand on the Windows locale/encoding business --- are > we happy with the behavior now, or does it still need work? I think we're good. But I'd like to hear some verification from somebody else. Specifically, I'd like to hear a signoff from someone who can actually do "real tests" on a locale that's not US and not Swedish. Also, I'd like to hear from the Japanese people (Hiroshi? Can you do this?) that we didn't break it for them. I don't think we did, but I want to be sure :) Hiroshi, and whomever else can help to test, this is only testing the backend, not the installer. The installer may need a few minor tweaks still once the backend is considered fixed. And what needs to be tested is CVS HEAD as of today. //Magnus
Hi. Um, it seems that it only passed the strict check of chklocale.c. It may allow a mistaken selection... However, I will clarify the problem by testing. Regards, Hiroshi Saito From: "Magnus Hagander" <magnus@hagander.net> > Tom Lane wrote: >> mha@postgresql.org (Magnus Hagander) writes: >>> Re-allow UTF8 encodings on win32. Since UTF8 is converted to >>> UTF16 before being used, all (valid) locales will work for this. >> >> So where do we stand on the Windows locale/encoding business --- are >> we happy with the behavior now, or does it still need work? > > I think we're good. But I'd like to hear some verification from somebody > else. Specifically, I'd like to hear a signoff from someone who can > actually do "real tests" on a locale that's not US and not Swedish. > Also, I'd like to hear from the Japanese people (Hiroshi? Can you do > this?) that we didn't break it for them. I don't think we did, but I > want to be sure :) > > Hiroshi, and whomever else can help to test, this is only testing the > backend, not the installer. The installer may need a few minor tweaks > still once the backend is considered fixed. And what needs to be tested > is CVS HEAD as of today. > > //Magnus
2007/10/16, Magnus Hagander <magnus@hagander.net>: > Tom Lane wrote: > > mha@postgresql.org (Magnus Hagander) writes: > >> Re-allow UTF8 encodings on win32. Since UTF8 is converted to > >> UTF16 before being used, all (valid) locales will work for this. > > > > So where do we stand on the Windows locale/encoding business --- are > > we happy with the behavior now, or does it still need work? > > I think we're good. But I'd like to hear some verification from somebody > else. Specifically, I'd like to hear a signoff from someone who can > actually do "real tests" on a locale that's not US and not Swedish. > Also, I'd like to hear from the Japanese people (Hiroshi? Can you do > this?) that we didn't break it for them. I don't think we did, but I > want to be sure :) > > Hiroshi, and whomever else can help to test, this is only testing the > backend, not the installer. The installer may need a few minor tweaks > still once the backend is considered fixed. And what needs to be tested > is CVS HEAD as of today. > > //Magnus > I can test it with czech locale. Can I download binaries anywhere? Pavel
Hi. > I can test it with czech locale. Can I download binaries anywhere? http://winpg.jp/~saito/pg83/postgresql-8.3beta-cvs.tgz It is the build after the regression test. (MinGW+gcc) Regards, Hiroshi Saito
Hi. > Um, it seems that it only passed the strict check of chklocale.c. It may > allow a mistaken selection... However, I will clarify the problem by testing. First, there is one problem.... http://winpg.jp/~saito/pg83/pg83b1-err.txt And the test continues....
Hi. Second, it is a big problem.... http://winpg.jp/~saito/pg83/pg83b1-err2.txt It is a text search config error. However, it passes initdb. (locale=Japanese_Japan.932 ... This is the ShiftJIS locale) And the test continues.... Regards, Hiroshi Saito
Hiroshi Saito wrote: > Hi. > >> Um, it seems that it only passed the strict check of chklocale.c. >> It may allow a mistaken selection... However, I will clarify the >> problem by testing. > > First, there is one problem.... > http://winpg.jp/~saito/pg83/pg83b1-err.txt > > And the test continues.... But SJIS isn't supposed to work, no? //Magnus
Hiroshi Saito wrote: > Hi. > > Second, it is a big problem.... > http://winpg.jp/~saito/pg83/pg83b1-err2.txt > It is a text search config error. > However, it passes initdb. (locale=Japanese_Japan.932 ... This is > the ShiftJIS locale) > > And the test continues.... What text search config would you expect? //Magnus
Hi. > Hiroshi Saito wrote: >> Hi. >> >> Second, it is a big problem.... >> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >> It is a text search config error. >> However, it passes initdb. (locale=Japanese_Japan.932 ... This is >> the ShiftJIS locale) >> >> And the test continues.... > > What text search config would you expect? The problem here is that initdb accepts the locale Japanese_Japan.932. Regards, Hiroshi Saito
Hiroshi Saito wrote: > Hi. > > Second, it is a big problem.... > http://winpg.jp/~saito/pg83/pg83b1-err2.txt > It is a text search config error. > However, it passes initdb. (locale=Japanese_Japan.932 ... This is > the ShiftJIS locale) > > And the test continues.... The changes that were made were only to re-enable UTF-8. SJIS wasn't ever supported as a server encoding (http://www.postgresql.org/docs/8.2/interactive/multibyte.html). The fact that initdb continues if you use Japanese_Japan.932 is an inconsistency I reported previously but has yet to be fixed. /D
From: "Dave Page" <dpage@postgresql.org> > Hiroshi Saito wrote: >> Hi. >> >> Second, it is a big problem.... >> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >> It is a text search config error. >> However, it passes initdb. (locale=Japanese_Japan.932 ... This is >> the ShiftJIS locale) >> >> And the test continues.... > > The changes that were made were only to re-enable UTF-8. Yes, Please see, http://winpg.jp/~saito/pg83/pg83b1-err2.txt Is that initdb is successful a problem as for this? > > SJIS wasn't ever supported as a server encoding > (http://www.postgresql.org/docs/8.2/interactive/multibyte.html). The > fact that initdb continues if you use Japanese_Japan.932 is an > inconsistency I reported previously but has yet to be fixed. Yes. However, encoding and locale are not equivalent. Regards, Hiroshi Saito
Hiroshi Saito wrote: > From: "Dave Page" <dpage@postgresql.org> > >> Hiroshi Saito wrote: >>> Hi. >>> >>> Second, it is a big problem.... >>> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >>> It is a text search config error. >>> However, it passes initdb. (locale=Japanese_Japan.932 ... This is >>> the ShiftJIS locale) >>> >>> And the test continues.... >> >> The changes that were made were only to re-enable UTF-8. > > Yes, Please see, > http://winpg.jp/~saito/pg83/pg83b1-err2.txt > Is that initdb is successful a problem as for this? Oh, sorry - misread that. I chatted with Magnus about that. It is correct, but misleading. pg_control will say Japanese_Japan.932 as well iirc, even though it is really Japanese_Japan.65001. Regards, Dave
Dave Page wrote: > Hiroshi Saito wrote: >> Hi. >> >> Second, it is a big problem.... >> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >> It is a text search config error. >> However, it passes initdb. (locale=Japanese_Japan.932 ... This is >> the ShiftJIS locale) >> >> And the test continues.... > > The changes that were made were only to re-enable UTF-8. > > SJIS wasn't ever supported as a server encoding > (http://www.postgresql.org/docs/8.2/interactive/multibyte.html). The > fact that initdb continues if you use Japanese_Japan.932 is an > inconsistency I reported previously but has yet to be fixed. That is a good point, if unrelated to this very discussion. Do we want to change that thing to an exit instead of complain-and-continue? I think yes? //Magnus
Dave Page wrote: > Hiroshi Saito wrote: >> From: "Dave Page" <dpage@postgresql.org> >> >>> Hiroshi Saito wrote: >>>> Hi. >>>> >>>> Second, it is a big problem.... >>>> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >>>> It is a text search config error. >>>> However, it passes initdb. (locale=Japanese_Japan.932 ... This is >>>> the ShiftJIS locale) >>>> >>>> And the test continues.... >>> The changes that were made were only to re-enable UTF-8. >> Yes, Please see, >> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >> Is that initdb is successful a problem as for this? > > Oh, sorry - misread that. I chatted with Magnus about that. It is > correct, but misleading. pg_control will say Japanese_Japan.932 as well > iirc, even though it is really Japanese_Japan.65001. Not so. The locale is Japanese_Japan, really. That's the only part that's relevant for UTF16 encodings, which is what we use to do UTF8. We specifically *don't* try to use Japanese_Japan.65001. //Magnus
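The conversion step the thread keeps referring to (UTF8 turned into UTF16 before it is used) can be illustrated with a small sketch. This is only an illustration in Python, not the backend's code; the function name utf8_to_utf16_units is made up for the example. It shows why the ANSI codepage suffix of the locale name plays no role once the text is in UTF-16: the code units depend only on the UTF-8 input.

```python
from typing import List


def utf8_to_utf16_units(data: bytes) -> List[int]:
    """Decode UTF-8 bytes and return the equivalent UTF-16 code units.

    Hypothetical helper for illustration only; invalid UTF-8 raises
    UnicodeDecodeError.
    """
    text = data.decode("utf-8")
    raw = text.encode("utf-16-le")  # little-endian, no byte order mark
    # Each UTF-16 code unit is two bytes.
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


if __name__ == "__main__":
    # Two kanji, each a single UTF-16 code unit.
    print([hex(u) for u in utf8_to_utf16_units("日本".encode("utf-8"))])
```

Characters outside the Basic Multilingual Plane come out as a surrogate pair, that is, two code units for one character.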
Hi. From: "Dave Page" <dpage@postgresql.org> >> Yes, Please see, >> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >> Is that initdb is successful a problem as for this? > > Oh, sorry - misread that. I chatted with Magnus about that. It is > correct, but misleading. pg_control will say Japanese_Japan.932 as well > iirc, even though it is really Japanese_Japan.65001. But, Please see. http://winpg.jp/~saito/pg83/pg83b1-err3.txt Japanese_Japan.65001 is error... Japanese_Japan is true. Regards, Hiroshi Saito
Magnus Hagander wrote: > Not so. The locale is Japanese_Japan, really. That's the only part > that's relevant for UTF16 encodings, which is what we use to do UTF8. We > specifically *don't* try to use Japanese_Japan.65001. That's not what I mean. From a *usability* perspective, Hiroshi should see Japanese_Japan.65001 because he's selected UTF-8 in Japanese_Japan. He shouldn't see Japanese_Japan.932 because that definitely isn't what he selected. /D
Hiroshi Saito wrote: > Hi. > > From: "Dave Page" <dpage@postgresql.org> >>> Yes, Please see, >>> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >>> Is that initdb is successful a problem as for this? >> >> Oh, sorry - misread that. I chatted with Magnus about that. It is >> correct, but misleading. pg_control will say Japanese_Japan.932 as well >> iirc, even though it is really Japanese_Japan.65001. > > But, Please see. > http://winpg.jp/~saito/pg83/pg83b1-err3.txt > Japanese_Japan.65001 is error... > Japanese_Japan is true. Yes, that is expected. If you explicitly ask for the .65001 locale it will try the one that doesn't have the proper NLS files, and that shouldn't work. If you just put in Japanese_Japan, it will use the UTF16 locale. //Magnus
> But, Please see. > http://winpg.jp/~saito/pg83/pg83b1-err3.txt > Japanese_Japan.65001 is error... > Japanese_Japan is true. However, testing of this state will continue. But, sorry, I am off to bed now... Regards, Hiroshi Saito
Dave Page wrote: > Magnus Hagander wrote: >> Not so. The locale is Japanese_Japan, really. That's the only part >> that's relevant for UTF16 encodings, which is what we use to do UTF8. We >> specifically *don't* try to use Japanese_Japan.65001. > > That's not what I mean. From a *usability* perspective, Hiroshi should > see Japanese_Japan.65001 because he's selected UTF-8 in Japanese_Japan. > He shouldn't see Japanese_Japan.932 because that definitely isn't what > he selected. I'll grant you that from a usability perspective, he should see Japanese_Japan. Not the .65001 part, though. //Magnus
Hiroshi Saito wrote: > Hi. > > From: "Dave Page" <dpage@postgresql.org> >>> Yes, Please see, >>> http://winpg.jp/~saito/pg83/pg83b1-err2.txt >>> Is that initdb is successful a problem as for this? >> >> Oh, sorry - misread that. I chatted with Magnus about that. It is >> correct, but misleading. pg_control will say Japanese_Japan.932 as well >> iirc, even though it is really Japanese_Japan.65001. > > But, Please see. > http://winpg.jp/~saito/pg83/pg83b1-err3.txt > Japanese_Japan.65001 is error... > Japanese_Japan is true. Yes, we're faking utf-8 support using utf-16. Specifying it as you have there bypasses the workaround and tries to use the 65001 codepage which then fails because LC_CTYPE cannot be set to .65001 in any locale. /D
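The behaviour described in the last few messages can be modelled roughly as follows. This is a sketch under stated assumptions, not the initdb or backend logic: the function names are invented, and the ValueError merely stands in for the failure the thread reports when LC_CTYPE is set to a .65001 locale. For a UTF8 database the model treats only the Language_Country part of the name as relevant.

```python
from typing import Optional, Tuple

UTF8_CODEPAGE = "65001"  # Windows codepage number for UTF-8


def split_windows_locale(name: str) -> Tuple[str, Optional[str]]:
    """Split 'Language_Country.codepage' into (base, codepage).

    Hypothetical helper; returns (name, None) when no numeric suffix exists.
    """
    base, dot, suffix = name.rpartition(".")
    if dot and suffix.isdigit():
        return base, suffix
    return name, None


def collation_locale_for_utf8(requested: str) -> str:
    """Return the part of the locale name that matters for a UTF8 database.

    Assumption for illustration: an explicit .65001 suffix is rejected,
    mirroring the error reported in the thread; any other codepage suffix
    is ignored because comparisons happen on UTF-16 text.
    """
    base, codepage = split_windows_locale(requested)
    if codepage == UTF8_CODEPAGE:
        raise ValueError(f'locale "{requested}" cannot be used for LC_CTYPE')
    return base


if __name__ == "__main__":
    for name in ("Japanese_Japan", "Japanese_Japan.932"):
        print(name, "->", collation_locale_for_utf8(name))
```

Under this model both Japanese_Japan and Japanese_Japan.932 resolve to Japanese_Japan, which matches the disagreement above about what should be displayed to the user.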
Magnus Hagander wrote: > Dave Page wrote: >> Magnus Hagander wrote: >>> Not so. The locale is Japanese_Japan, really. That's the only part >>> that's relevant for UTF16 encodings, which is what we use to do UTF8. We >>> specifically *don't* try to use Japanese_Japan.65001. >> That's not what I mean. From a *usability* perspective, Hiroshi should >> see Japanese_Japan.65001 because he's selected UTF-8 in Japanese_Japan. >> He shouldn't see Japanese_Japan.932 because that definitely isn't what >> he selected. > > I'll grant you that from a usability perspective, he should see > Japanese_Japan. Not the .65001 part, though. Well, that depends on whether we care that we're actually faking the utf-8 support and/or we want to keep the message consistent with what you'd see in other locales. /D
2007/10/16, Hiroshi Saito <z-saito@guitar.ocn.ne.jp>: > Hi. > > > I can test it with czech locale. Can I download binaries anywhere? > http://winpg.jp/~saito/pg83/postgresql-8.3beta-cvs.tgz > It is the build after the regression test. (MinGW+gcc) > I have a problem: libintl-2.dll is missing. Pavel
Magnus Hagander <magnus@hagander.net> writes: > Dave Page wrote: >> SJIS wasn't ever supported as a server encoding >> (http://www.postgresql.org/docs/8.2/interactive/multibyte.html). The >> fact that initdb continues if you use Japanese_Japan.932 is an >> inconsistency I reported previously but has yet to be fixed. > That is a good point, if unrelated to this very discussion. Do we want > to change that thing to an exit instead of complain-and-continue? I > think yes? Yeah, I thought we'd agreed to that a few days ago. regards, tom lane
Hi. From: "Pavel Stehule" <pavel.stehule@gmail.com> >> > I can test it with czech locale. Can I download binaries anywhere? >> http://winpg.jp/~saito/pg83/postgresql-8.3beta-cvs.tgz >> It is the build after the regression test. (MinGW+gcc) >> > > I have a problem: libintl-2.dll is missing. Oops, sorry, that one is the full build. Please use this minimal set instead: http://winpg.jp/~saito/pg83/postgresql-8.3beta-cvs-minbin.tgz Thanks. Regards, Hiroshi Saito
Hi. From: "Magnus Hagander" <magnus@hagander.net> >> But, Please see. >> http://winpg.jp/~saito/pg83/pg83b1-err3.txt >> Japanese_Japan.65001 is error... >> Japanese_Japan is true. > > Yes, that is expected. If you explicitly ask for the .65001 locale it > will try the one that doesn't have the proper NLS files, and that > shouldn't work. If you just put in Japanese_Japan, it will use the UTF16 > locale. Umm, As for result ... initdb -E UTF8 --locale=Japanese_Japan -D../data http://winpg.jp/~saito/pg83/pg83b1-err4.txt It seems that it is only complemented. Regards, Hiroshi Saito
Hiroshi Saito wrote: > Hi. > > From: "Magnus Hagander" <magnus@hagander.net> > >>> But, Please see. >>> http://winpg.jp/~saito/pg83/pg83b1-err3.txt >>> Japanese_Japan.65001 is error... >>> Japanese_Japan is true. >> >> Yes, that is expected. If you explicitly ask for the .65001 locale it >> will try the one that doesn't have the proper NLS files, and that >> shouldn't work. If you just put in Japanese_Japan, it will use the UTF16 >> locale. > > Umm, As for result ... initdb -E UTF8 --locale=Japanese_Japan -D../data > http://winpg.jp/~saito/pg83/pg83b1-err4.txt > It seems that it is only complemented. Yes, that is expected, though not entirely to my tastes. The cluster should still actually be in utf-8 however. /D
I did some tests, but without success. I have a Win2003 Server with Czech locale support.

I:\PGSQL\BIN>initdb -D ../data -L i:\pgsql\share
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale Czech_Czech Republic.1250.
could not determine encoding for locale "Czech_Czech Republic.1250": codeset is "CP1250"
INITDB: could not find suitable encoding for locale Czech_Czech Republic.1250
Rerun INITDB with the -E option.
Try "INITDB --help" for more information.

I:\PGSQL\BIN>
I:\PGSQL\BIN>initdb -E UTF-8 -D ../data -L i:\pgsql\share
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale Czech_Czech Republic.1250.
could not determine encoding for locale "Czech_Czech Republic.1250": codeset is "CP1250"
INITDB: could not find suitable text search configuration for locale Czech_Czech Republic.1250
The default text search configuration will be set to "simple".

fixing permissions on existing directory ../data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers/max_fsm_pages ... 400kB/20000
creating configuration files ... ok
creating template1 database in ../data/base/1 ... FATAL: could not select a suitable default timezone
DETAIL: It appears that your GMT time zone uses leap seconds. PostgreSQL does not support leap seconds.
child process exited with exit code 1
INITDB: removing contents of data directory "../data"

I:\PGSQL\BIN>initdb -E win1250 --locale="Czech_Czech Republic.1250" -D ../data -L i:\pgsql\share
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale Czech_Czech Republic.1250.
could not determine encoding for locale "Czech_Czech Republic.1250": codeset is "CP1250"
INITDB: could not find suitable text search configuration for locale Czech_Czech Republic.1250
The default text search configuration will be set to "simple".

fixing permissions on existing directory ../data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers/max_fsm_pages ... 400kB/20000
creating configuration files ... ok
creating template1 database in ../data/base/1 ... FATAL: could not select a suitable default timezone
DETAIL: It appears that your GMT time zone uses leap seconds. PostgreSQL does not support leap seconds.
child process exited with exit code 1
INITDB: removing contents of data directory "../data"

Pavel
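The "could not determine encoding ... codeset is "CP1250"" message in the runs above is the kind of result a codeset-to-encoding lookup gives when the table has no row for the reported codeset. The sketch below is a hypothetical model in Python: the table contents and the function name are assumptions for illustration and are not copied from chklocale.c. Only the codeset string CP1250 comes from the output above; WIN1250 is the PostgreSQL encoding name for that codepage.

```python
from typing import Dict, Optional

# Illustrative table only; the real list of entries is an assumption here.
CODESET_TO_ENCODING: Dict[str, str] = {
    "CP1251": "WIN1251",
    "CP1252": "WIN1252",
    "CP65001": "UTF8",
}


def encoding_for_codeset(codeset: str,
                         table: Dict[str, str] = CODESET_TO_ENCODING) -> Optional[str]:
    """Return the encoding name for a codeset, or None if the table lacks it."""
    return table.get(codeset.upper())


if __name__ == "__main__":
    # With no CP1250 row, the lookup fails, as in the first initdb run.
    print(encoding_for_codeset("CP1250"))

    # Adding the missing row makes the same lookup succeed.
    fixed = dict(CODESET_TO_ENCODING, CP1250="WIN1250")
    print(encoding_for_codeset("CP1250", fixed))
```

Note that in the second and third runs the encoding was given with -E, so the lookup failure was only a warning there; the FATAL error in those runs concerns the timezone and is a separate issue.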
"Pavel Stehule" <pavel.stehule@gmail.com> writes: > could not determine encoding for locale "Czech_Czech Republic.1250": codeset is > "CP1250" Hm, we seem to have missed an entry for PG_WIN1250. Fixed. regards, tom lane