Thread: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Hiroshi Inoue
Date:
(2011/04/16 2:56), Heikki Linnakangas wrote:
> setlocale() on Windows doesn't work correctly if the locale name contains
> apostrophes or dots.

As for apostrophes, isn't the cause that initdb loses the single quote
of locale? ([BUGS] BUG #5818: initdb lose the single quote of locale)

As the bug reporter mentions, initdb loses the single quote in reality.
Concretely speaking, scanstr() called from bootscanner.l loses it.
I'm not sure if it's suitable for the bootstrap code to call scanstr().

regards,
Hiroshi Inoue

> There isn't much hope of Microsoft fixing it any time
> soon, it's been like that for ages, so we better work around it. So, map a
> few common Windows locale names known to cause problems to aliases that work.
>
> Branch
> ------
> master
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/d5a7bf8c11c8b66c822bbb1a6c90e1a14425bd6e
>
> Modified Files
> --------------
> src/bin/initdb/initdb.c | 89 +++++++++++++++++++++++++++++++++++++++++++----
> 1 files changed, 82 insertions(+), 7 deletions(-)
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Tom Lane
Date:
Hiroshi Inoue <inoue@tpf.co.jp> writes:
> (2011/04/16 2:56), Heikki Linnakangas wrote:
>> setlocale() on Windows doesn't work correctly if the locale name contains
>> apostrophes or dots.
>
> As for apostrophes, isn't the cause that initdb loses the single quote
> of locale? ([BUGS] BUG #5818: initdb lose the single quote of locale)
>
> As the bug reporter mentions, initdb loses the single quote in reality.
> Concretely speaking, scanstr() called from bootscanner.l loses it.
> I'm not sure if it's suitable for the bootstrap code to call scanstr().

Huh? Bootstrap mode just deals with the data found in
src/include/catalog/*.h. The locale names found by initdb.c are stuck
in there afterwards, using regular SQL commands.

I don't know where the problem really comes from, but I doubt the
connection you're trying to make above.

regards, tom lane
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Hiroshi Inoue
Date:
(2011/04/20 9:22), Tom Lane wrote:
> Hiroshi Inoue<inoue@tpf.co.jp> writes:
>> (2011/04/16 2:56), Heikki Linnakangas wrote:
>>> setlocale() on Windows doesn't work correctly if the locale name contains
>>> apostrophes or dots.
>
>> As for apostrophes, isn't the cause that initdb loses the single quote
>> of locale? ([BUGS] BUG #5818: initdb lose the single quote of locale)
>
>> As the bug reporter mentions, initdb loses the single quote in reality.
>> Concretely speaking, scanstr() called from bootscanner.l loses it.
>> I'm not sure if it's suitable for the bootstrap code to call scanstr().
>
> Huh? Bootstrap mode just deals with the data found in
> src/include/catalog/*.h. The locale names found by initdb.c are stuck
> in there afterwards, using regular SQL commands.

bootstrap_template1() in initdb runs the BKI script in bootstrap
mode to create template1. Some symbols (LC_COLLATE, LC_CTYPE in
pg_database etc) in the BKI script are substituted by actual values
using replace_token(). Isn't it correct?
ISTM replace_token() takes care of nothing about single quotes
in its input values but the comment in scanstr() says

    /*
     * Note: if scanner is working right, unescaped quotes can only
     * appear in pairs, so there should be another character.
     */

regards,
Hiroshi Inoue

> I don't know where the
> problem really comes from, but I doubt the connection you're trying to
> make above.
>
> regards, tom lane
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Andrew Dunstan
Date:
On 04/19/2011 09:42 PM, Hiroshi Inoue wrote:
>
> bootstrap_template1() in initdb runs the BKI script in bootstrap
> mode to create template1. Some symbols (LC_COLLATE, LC_CTYPE in
> pg_database etc) in the BKI script are substituted by actual values
> using replace_token(). Isn't it correct?
> ISTM replace_token() takes care of nothing about single quotes
> in its input values but the comment in scanstr() says
>     /*
>      * Note: if scanner is working right, unescaped quotes can only
>      * appear in pairs, so there should be another character.
>      */
>

That's perfectly true, but only one of the replaced locale names
contains a single quote mark. So clearly there's more going on here than
just the bug you're referring to. Heikki's commit message specifically
refers to dots in locale names, which shouldn't cause a problem of that
type, I believe.

cheers

andrew
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Hiroshi Inoue
Date:
(2011/04/20 12:25), Andrew Dunstan wrote:
>
> On 04/19/2011 09:42 PM, Hiroshi Inoue wrote:
>>
>> bootstrap_template1() in initdb runs the BKI script in bootstrap
>> mode to create template1. Some symbols (LC_COLLATE, LC_CTYPE in
>> pg_database etc) in the BKI script are substituted by actual values
>> using replace_token(). Isn't it correct?
>> ISTM replace_token() takes care of nothing about single quotes
>> in its input values but the comment in scanstr() says
>>     /*
>>      * Note: if scanner is working right, unescaped quotes can only
>>      * appear in pairs, so there should be another character.
>>      */
>>
>
> That's perfectly true, but only one of the replaced locale names
> contains a single quote mark. So clearly there's more going on here than
> just the bug you're referring to. Heikki's commit message specifically
> refers to dots in locale names, which shouldn't cause a problem of that
> type, I believe.

Yes it's completely another issue as for dots.
I can find no concrete reference to problems about locale
names containing dots. Is the following an example?

In my environment (Windows Vista using VC8)

    setlocale(LC_XXXX, "Chinese (Traditional)_MCO.950");
works and
    setlocale(LC_XXXX, NULL);
returns
    Chinese (Traditional)_Macao S.A.R..950
but
    setlocale(LC_XXXX, "Chinese (Traditional)_Macao S.A.R..950");
fails.

regards,
Hiroshi Inoue
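The round trip described in this message (set a locale, read back the name setlocale() reports, then try to set that reported name) can be probed with a small standalone helper. This is only a sketch: `locale_roundtrips` and `copy_string` are hypothetical names, not PostgreSQL functions, and only standard C library calls are used.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Copy a string into freshly allocated memory; returns NULL on failure. */
static char *
copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/*
 * Hypothetical helper, not part of PostgreSQL.
 * Returns  1 if `name` is accepted and the name setlocale() reports
 *            afterwards is accepted as well,
 *          0 if `name` is accepted but the reported name is rejected,
 *         -1 if `name` itself is rejected or memory runs out.
 * The previous locale of the category is restored on a best-effort basis.
 */
int
locale_roundtrips(int category, const char *name)
{
    const char *cur = setlocale(category, NULL);
    char       *orig;
    char       *canon;
    int         result;

    if (cur == NULL)
        return -1;
    /* setlocale() may reuse its result buffer, so keep our own copy */
    orig = copy_string(cur);
    if (orig == NULL)
        return -1;

    cur = setlocale(category, name);
    if (cur == NULL)
    {
        free(orig);
        return -1;
    }
    canon = copy_string(cur);
    if (canon == NULL)
    {
        setlocale(category, orig);
        free(orig);
        return -1;
    }

    /* Feed setlocale() its own output back, as a save/restore would. */
    result = (setlocale(category, canon) != NULL) ? 1 : 0;

    setlocale(category, orig);
    free(canon);
    free(orig);
    return result;
}
```

If the behaviour reported above holds, `locale_roundtrips(LC_CTYPE, "Chinese (Traditional)_MCO.950")` would return 0 on that Windows Vista / VC8 setup; that result is inferred from the report, not verified here.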
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Heikki Linnakangas
Date:
On 20.04.2011 06:48, Hiroshi Inoue wrote:
> I can find no concrete reference to problems about locale
> names containing dots. Is the following an example?

Yes.

> In my environment (Windows Vista using VC8)
>
>     setlocale(LC_XXXX, "Chinese (Traditional)_MCO.950");
> works and
>     setlocale(LC_XXXX, NULL);
> returns
>     Chinese (Traditional)_Macao S.A.R..950

Interesting. According to Microsoft's documentation, the codes are
three-letter country codes specified by ISO-3166
(http://msdn.microsoft.com/en-us/library/cdax410z%28v=VS.100%29.aspx).
However, according to Wikipedia, MCO stands for Monaco, not Macau
(https://secure.wikimedia.org/wikipedia/en/wiki/ISO_3166-1_alpha-3).

So according to bug #5818, the problem with "People's Republic of China"
was different from "Hong Kong S.A.R.", "Macau S.A.R.", and "U.A.E.".
setlocale() handles apostrophe fine, but it's not escaped correctly in
the BKI file.

I'll remove the "People's Republic of China" -> "China" mapping I
committed, and fix the escaping instead.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
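One way to picture the escaping fix mentioned here is a helper that doubles every single quote before the value is substituted into the BKI script. This is a sketch under assumptions: `double_single_quotes` is a hypothetical name, and treating a doubled quote as one literal quote is inferred from the scanstr() comment quoted earlier in the thread. It is not necessarily what the actual commit does.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from initdb.c.
 * Returns a malloc'd copy of `src` in which every single quote is
 * doubled, or NULL if memory runs out. The caller frees the result.
 * Assumption: the consumer reads '' as one literal quote.
 */
char *
double_single_quotes(const char *src)
{
    /* Worst case: every character is a quote and gets doubled. */
    char *result = malloc(2 * strlen(src) + 1);
    char *out = result;

    if (result == NULL)
        return NULL;

    for (; *src != '\0'; src++)
    {
        if (*src == '\'')
            *out++ = '\'';      /* emit the extra quote first */
        *out++ = *src;
    }
    *out = '\0';
    return result;
}
```

Applied to a value containing "People's Republic of China", the helper yields "People''s Republic of China", so the quotes the scanner sees come in pairs.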
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Tom Lane
Date:
Hiroshi Inoue <inoue@tpf.co.jp> writes:
> In my environment (Windows Vista using VC8)
>
>     setlocale(LC_XXXX, "Chinese (Traditional)_MCO.950");
> works and
>     setlocale(LC_XXXX, NULL);
> returns
>     Chinese (Traditional)_Macao S.A.R..950
> but
>     setlocale(LC_XXXX, "Chinese (Traditional)_Macao S.A.R..950");
> fails.

Interesting. This example suggests that maybe Windows' setlocale can
only cope with dot as introducing a codepage number. Are there any
cases where a dot works as part of the basic locale name?

regards, tom lane
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Hiroshi Inoue
Date:
(2011/04/20 15:30), Heikki Linnakangas wrote:
> On 20.04.2011 06:48, Hiroshi Inoue wrote:
>> I can find no concrete reference to problems about locale
>> names containing dots. Is the following an example?
>
> Yes.
>
>> In my environment (Windows Vista using VC8)
>>
>>     setlocale(LC_XXXX, "Chinese (Traditional)_MCO.950");
>> works and
>>     setlocale(LC_XXXX, NULL);
>> returns
>>     Chinese (Traditional)_Macao S.A.R..950

but
    setlocale(LC_XXXX, "Chinese (Traditional)_Macao S.A.R..950");
fails.

I see another issue for the behavior.
For example, the following code in src/backend/utils/adt/pg_locale.c
won't work as expected in case the current locale is Hong Kong, Macao or
UAE because the last setlocale() in the code would fail. I can
find such save & restore operations of locales in several places.

    bool
    check_locale(int category, const char *value)
    {
        char   *save;
        bool    ret;

        save = setlocale(category, NULL);
        if (!save)
            return false;           /* won't happen, we hope */

        /* save may be pointing at a modifiable scratch variable, see above */
        save = pstrdup(save);

        /* set the locale with setlocale, to see if it accepts it. */
        ret = (setlocale(category, value) != NULL);

        setlocale(category, save);  /* assume this won't fail */
        pfree(save);

        return ret;
    }

regards,
Hiroshi Inoue
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Tom Lane
Date:
Hiroshi Inoue <inoue@tpf.co.jp> writes:
> I see another issue for the behavior.
> For example, the following code in src/backend/utils/adt/pg_locale.c
> won't work as expected in case the current locale is Hong Kong, Macao or
> UAE because the last setlocale() in the code would fail. I can
> find such save & restore operations of locales in several places.

Well, if Windows' setlocale is too brain-dead to accept its own output,
there's nothing to be done about it except to file a bug with Microsoft.
There isn't anything in the POSIX API that would let us avoid using
setlocale with a previous result value to restore the previous setting.

regards, tom lane
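Since the save and restore pattern cannot be avoided, one modest option is to at least detect a failed restore instead of assuming success. The sketch below is a standalone variant of the check_locale() function quoted in the thread: it uses malloc/free in place of pstrdup/pfree, and the `restored` output parameter and the name `check_locale_reporting` are hypothetical additions for illustration.

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant of check_locale(), not PostgreSQL code.
 * Returns true if setlocale() accepts `value` for `category`.
 * *restored is set to true only if the previous locale was saved and
 * then successfully set again afterwards.
 */
bool
check_locale_reporting(int category, const char *value, bool *restored)
{
    const char *cur = setlocale(category, NULL);
    char       *save;
    bool        ret;

    *restored = false;
    if (cur == NULL)
        return false;

    /* the returned pointer may be overwritten by later calls: copy it */
    save = malloc(strlen(cur) + 1);
    if (save == NULL)
        return false;
    strcpy(save, cur);

    /* set the locale to see whether setlocale() accepts it */
    ret = (setlocale(category, value) != NULL);

    /* restore, and record whether setlocale() accepted its own output */
    *restored = (setlocale(category, save) != NULL);
    free(save);

    return ret;
}
```

A caller could log a warning or raise an error when `*restored` comes back false. What the right reaction is inside the backend is left open here.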
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Hiroshi Inoue
Date:
(2011/04/20 22:08), Tom Lane wrote:
> Hiroshi Inoue<inoue@tpf.co.jp> writes:
>> In my environment (Windows Vista using VC8)
>
>>     setlocale(LC_XXXX, "Chinese (Traditional)_MCO.950");
>> works and
>>     setlocale(LC_XXXX, NULL);
>> returns
>>     Chinese (Traditional)_Macao S.A.R..950
>> but
>>     setlocale(LC_XXXX, "Chinese (Traditional)_Macao S.A.R..950");
>> fails.
>
> Interesting. This example suggests that maybe Windows' setlocale can
> only cope with dot as introducing a codepage number.

ACP or OCP as well as codepage number seem to be allowed.

> Are there any
> cases where a dot works as part of the basic locale name?

Unfortunately I don't know any explanation how dots are allowed.

regards,
Hiroshi Inoue
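The hypothesis in this exchange (a dot is only safe when it introduces a codepage number, ACP or OCP) suggests a simple heuristic for flagging suspicious names: count the dots that are not the codepage separator. The sketch below encodes that guess only. The function names are hypothetical, the exact-case match on "ACP"/"OCP" is an assumption, and nothing here is confirmed Windows behaviour.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if `s` looks like a codepage suffix: digits, "ACP" or "OCP". */
static int
is_codepage_suffix(const char *s)
{
    const char *p;

    if (*s == '\0')
        return 0;
    if (strcmp(s, "ACP") == 0 || strcmp(s, "OCP") == 0)
        return 1;
    for (p = s; *p != '\0'; p++)
    {
        if (!isdigit((unsigned char) *p))
            return 0;
    }
    return 1;
}

/*
 * Hypothetical heuristic: number of dots in the base locale name, that
 * is, every dot except a final one that introduces a codepage suffix.
 * A non-zero result marks a name the thread suspects setlocale() on
 * Windows may reject.
 */
int
count_base_name_dots(const char *name)
{
    const char *last = strrchr(name, '.');
    const char *end;
    const char *p;
    int         dots = 0;

    /* the base name stops at the codepage separator, if there is one */
    if (last != NULL && is_codepage_suffix(last + 1))
        end = last;
    else
        end = name + strlen(name);

    for (p = name; p < end; p++)
    {
        if (*p == '.')
            dots++;
    }
    return dots;
}
```

For the two names from the report, "Chinese (Traditional)_MCO.950" has no base-name dots, while "Chinese (Traditional)_Macao S.A.R..950" has three.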
Re: Re: [COMMITTERS] pgsql: setlocale() on Windows doesn't work correctly if the locale name
From
Hiroshi Inoue
Date:
(2011/04/20 15:30), Heikki Linnakangas wrote:
> On 20.04.2011 06:48, Hiroshi Inoue wrote:
>> I can find no concrete reference to problems about locale
>> names containing dots. Is the following an example?
>
> Yes.
>
>> In my environment (Windows Vista using VC8)
>>
>>     setlocale(LC_XXXX, "Chinese (Traditional)_MCO.950");
>> works and
>>     setlocale(LC_XXXX, NULL);
>> returns
>>     Chinese (Traditional)_Macao S.A.R..950
>
> Interesting. According to Microsoft's documentation, the codes are
> three-letter country codes specified by ISO-3166
> (http://msdn.microsoft.com/en-us/library/cdax410z%28v=VS.100%29.aspx).
> However, according to Wikipedia, MCO stands for Monaco, not Macau
> (https://secure.wikimedia.org/wikipedia/en/wiki/ISO_3166-1_alpha-3).

Hmm Windows locale system seems to have an inconsistency and
the same country code (MCO) corresponds to different countries.
ZHM_MCO corresponds to Chinese (Traditional)_Macao S.A.R..950
whereas FRM_MCO corresponds to French_Principality of Monaco.

regards,
Hiroshi Inoue
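The workaround from the original commit message (map problematic locale names to aliases that work) can be sketched as a small substitution table. This is an illustration, not the code in initdb.c: `map_locale_name` and the table are hypothetical. Only the Macao to MCO pairing comes from the observations in this thread. HKG and ARE are the ISO 3166-1 alpha-3 codes for Hong Kong and the United Arab Emirates, and it is an assumption that Windows accepts them.

```c
#include <stdio.h>
#include <string.h>

struct locale_alias
{
    const char *problem;        /* substring setlocale() may reject */
    const char *alias;          /* replacement without dots */
};

/* Hypothetical table; only the Macao entries are backed by the thread. */
static const struct locale_alias locale_aliases[] = {
    {"Macao S.A.R.", "MCO"},
    {"Macau S.A.R.", "MCO"},
    {"Hong Kong S.A.R.", "HKG"},    /* assumption: ISO code accepted */
    {"U.A.E.", "ARE"}               /* assumption: ISO code accepted */
};

/*
 * Copies `name` into `buf`, replacing the first known problematic
 * substring with its alias. Returns 0 on success, -1 if `buf` is too
 * small for the result.
 */
int
map_locale_name(const char *name, char *buf, size_t buflen)
{
    size_t  count = sizeof(locale_aliases) / sizeof(locale_aliases[0]);
    size_t  i;
    int     n;

    for (i = 0; i < count; i++)
    {
        const char *hit = strstr(name, locale_aliases[i].problem);

        if (hit != NULL)
        {
            /* prefix + alias + whatever follows the matched substring */
            n = snprintf(buf, buflen, "%.*s%s%s",
                         (int) (hit - name), name,
                         locale_aliases[i].alias,
                         hit + strlen(locale_aliases[i].problem));
            return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
        }
    }

    /* nothing to replace: pass the name through unchanged */
    n = snprintf(buf, buflen, "%s", name);
    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Because MCO is ambiguous, as noted above, the substitution is only meaningful together with the language part of the name: "Chinese (Traditional)_MCO" was observed to select Macao, while a French name with MCO selects Monaco.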