Thread: Thread-safe nl_langinfo() and localeconv()
Hi, Over on the discussion thread about remaining setlocale() work[1], I wrote out some long boring theories about $SUBJECT. Here are some draft patches to try those theories out, and make a commitfest entry. nl_langinfo_l() is a trivial drop-in replacement, and pg_localeconv_r() has 4 different implementation strategies: 1. Windows, with ugly _configthreadlocale() and thread-local result. 2. Glibc, with nice nl_langinfo_l() extensions. 3. macOS/*BSD, with nice localeconv_l(). 4. Baseline POSIX: uselocale() + localeconv() + honking great lock. In reality it'd just be Solaris running #4 (and AIX if it comes back). Whether they truly implement it as pessimally as the standard allows, who knows... you could drop the lock if you somehow knew that they returned a pointer to thread-local storage or a member of the locale_t object. [1] https://www.postgresql.org/message-id/flat/4c5da86af36a0d5e430eee3f60ce5e06f1b5cd34.camel%40j-davis.com
Attachment
On 13/08/2024 08:45, Thomas Munro wrote: > Hi, > > Over on the discussion thread about remaining setlocale() work[1], I > wrote out some long boring theories about $SUBJECT. Here are some > draft patches to try those theories out, and make a commitfest entry. > nl_langinfo_l() is a trivial drop-in replacement, and > pg_localeconv_r() has 4 different implementation strategies: > > 1. Windows, with ugly _configthreadlocale() and thread-local result. > 2. Glibc, with nice nl_langinfo_l() extensions. > 3. macOS/*BSD, with nice localeconv_l(). > 4. Baseline POSIX: uselocale() + localeconv() + honking great lock. > > In reality it'd just be Solaris running #4 (and AIX if it comes back). > Whether they truly implement it as pessimally as the standard allows, > who knows... you could drop the lock if you somehow knew that they > returned a pointer to thread-local storage or a member of the locale_t > object. Patches 1 and 2 look good to me. Patch 3 makes sense too, some comments on the details: The #ifdefs and the LCONV_MEMBER stuff makes it a bit hard to follow what happens in each implementation strategy. I wonder if it would be more clear to duplicate more code. There's a comment at the top of pg_locale.c ("!!! NOW HEAR THIS !!!") that needs to be removed or adjusted now. > * The POSIX standard explicitly says that it is undefined what happens if > * LC_MONETARY or LC_NUMERIC imply an encoding (codeset) different from > * that implied by LC_CTYPE. In practice, all Unix-ish platforms seem to > * believe that localeconv() should return strings that are encoded in the > * codeset implied by the LC_MONETARY or LC_NUMERIC locale name. Hence, > * once we have successfully collected the localeconv() results, we will > * convert them from that codeset to the desired server encoding. The patch loses this comment, leaving just a much shorter comment in the WIN32 implementation. But it still seems like a relevant comment for the !WIN32 implementation too. > This gets rid of some setlocale() calls and makes the returned value > unclobberable with a defined lifetime. The remaining call to > setlocale() is only a query of the name of the current local (in a typo: local -> locale > multi-threaded future this would have to be changed, perhaps to use a > per-database or per-backend locale_t instead of LC_GLOBAL_LOCALE). > > All known non-Windows targets have nl_langinfo_l(), from POSIX 2018. I think that's supposed to be POSIX 2008 -- Heikki Linnakangas Neon (https://neon.tech)
On Tue, Aug 13, 2024 at 6:23 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 13/08/2024 08:45, Thomas Munro wrote: > Patches 1 and 2 look good to me. Thanks. I went ahead and pushed these ones. A couple of Macs in the build farm are failing, as if they didn't include <xlocale.h> and haven't seen the type locale_t. CI's macOS build is OK, and my own local Mac is building master OK, and animal "indri" is OK... hmm, those are all using MacPorts, but I don't yet see why that would be it...
On Tue, Aug 13, 2024 at 11:25 PM Thomas Munro <thomas.munro@gmail.com> wrote: > On Tue, Aug 13, 2024 at 6:23 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 13/08/2024 08:45, Thomas Munro wrote: > > Patches 1 and 2 look good to me. > > Thanks. I went ahead and pushed these ones. A couple of Macs in the > build farm are failing, as if they didn't include <xlocale.h> and > haven't seen the type locale_t. CI's macOS build is OK, and my own > local Mac is building master OK, and animal "indri" is OK... hmm, > those are all using MacPorts, but I don't yet see why that would be > it... Ah, got it. It was OK under meson but not autoconf for my Mac, so I guess it must be transitive headers coming from somewhere making it work for some systems. I just have a typo in an #ifdef macro. Will fix.
Here's another mystery from Windows + MinGW. Although "fairywren" is green, that is because it lacks ICU, which would activate extra tests. CI is green too, but the optional CI task "Windows - Server 2019, MinGW64 - Meson" has ICU and it is now failing if you trigger it[1] after commit 35eeea62, in initdb/001_initdb: [05:43:49.764] | 146/305 - options --locale-provider=icu --locale=und --lc-*=C: no stderr FAIL ... because it logs a warning to stderr: WARNING: no usable system locales were found I can only assume there was some extra dependency on setlocale() global state changes in the removed code. I don't quite get it, but whatever the reason, it's less than helpful to have different compilers taking different code paths on our weirdest OS that most of us don't use, so I propose to push this change to take the regular MSVC code path for MinGW too, when looking up code pages. Somehow, this fixes that, though it'd probably take someone with a local MinGW setup to dig into what exactly is happening there. (There are plenty more places where we do something different for MinGW. I suspect they are all obsolete problems. We should probably just harmonise everything and see what breaks now that we have a CI system, but that can be for another day.) That warning is from pg_import_system_locales(), which is new-ish (v16) on that OS. It was recently discovered to trigger a pre-existing problem[2]: the simple setlocale() save/restore pattern doesn't work in general on Windows, because some local names are non-ASCII, and the restore can fail (abort in the system library due to bad encoding, because the intermediate setlocale() changed the expected encoding of the locale name itself). So it's good that we aren't doing that anymore in this location; I'm just thinking out loud about whether that phenomenon could also be somehow connected to this failure, though I don't see it. Another realisation is that my pg_localeconv_r() patch, which can't avoid a thread-safe setlocale() save-and-restore on that OS (and might finish up being the only one left in the tree by the time we're done?), had better use wsetlocale() instead to defend itself against that particular madness. [1] https://cirrus-ci.com/task/5928104793735168 [2] https://www.postgresql.org/message-id/CA%2BhUKG%2BFxeRLURZ%3Dn8NPyLwgjFds_SqU_cQvE40ks6RQKUGbGg%40mail.gmail.com
Attachment
On Tue, Aug 13, 2024 at 6:23 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Patch 3 makes sense too, some comments on the details: > The #ifdefs and the LCONV_MEMBER stuff makes it a bit hard to follow > what happens in each implementation strategy. I wonder if it would be > more clear to duplicate more code. I tried to make it easier to follow. > There's a comment at the top of pg_locale.c ("!!! NOW HEAR THIS !!!") > that needs to be removed or adjusted now. Yeah. We can remove that PSA if we also fix up the equivalent code for LC_TIME. First attempt at that attached. > > * The POSIX standard explicitly says that it is undefined what happens if > > * LC_MONETARY or LC_NUMERIC imply an encoding (codeset) different from > > * that implied by LC_CTYPE. In practice, all Unix-ish platforms seem to > > * believe that localeconv() should return strings that are encoded in the > > * codeset implied by the LC_MONETARY or LC_NUMERIC locale name. Hence, > > * once we have successfully collected the localeconv() results, we will > > * convert them from that codeset to the desired server encoding. > > The patch loses this comment, leaving just a much shorter comment in the > WIN32 implementation. But it still seems like a relevant comment for the > !WIN32 implementation too. New version makes it much clearer, and also is much more careful about what exactly happens if you have mismatched encodings. (Over in CF #3772 I was exploring the idea of banning the use of locales that are not compatible with the database encoding. As far as I can guess, that idea must have come from the time when Windows didn't have native UTF-8 support. Now it does. There I was mostly interested in killing all the whcar_t conversion code, but maybe we could also delete a few lines of transcoding around here too?)
Attachment
On 15/08/2024 11:03, Thomas Munro wrote: > Here's a new patch to add to this pile, this time for check_locale(). > I also improved the error handling and comments in the other patches. There's a similar function in initdb, check_locale_name. I wonder if that could reuse the same code. I wonder if it would be more clear to split this into three functions: /* * Get the name of the locale in "native environment", * like setlocale(category, NULL) does */ char *get_native_locale(int category); /* * Return true if 'locale' is valid locale name for 'category */ bool check_locale(int category, const char *locale); /* * Return a canonical name of given locale */ char *canonicalize_locale(int category, const char *locale); > result = malloc(strlen(canonical) + 1); > if (!result) > goto exit; > strcpy(result, canonical); Could use "result = strdup(canonical)" here. Or even better, could we use palloc()/pstrdup() here, and save the caller the trouble to copy it? -- Heikki Linnakangas Neon (https://neon.tech)
On Tue, Aug 13, 2024 at 5:45 PM Thomas Munro <thomas.munro@gmail.com> wrote: > Over on the discussion thread about remaining setlocale() work[1], I > wrote out some long boring theories about $SUBJECT. Just as an FYI/curiosity, I converted my frustrations with localeconv() into a request[1] that POSIX consider standardising one of the interfaces that doesn't suck, and the reaction seems so far positive. Of course that doesn't really have any actionable consequences on any relevant time frame, even if eventually successful, and we can already use the saner interfaces on the systems most of us really care about, but still... it's nice to think that the pessimistic fallback code (really only used by Solaris and maybe AIX) could eventually be redundant if it goes somewhere... [1] https://www.mail-archive.com/austin-group-l@opengroup.org/msg12850.html
On 16.08.24 02:48, Thomas Munro wrote: > 2. A similar argument applies to Windows canonicalisation. CREATE > COLLATION isn't doing it. CREATE DATABASE is, but again, what is the > point? See previous. I don't really know about Windows locales. But we are doing canonicalization of ICU locale names now. So there seems to be a desire to do canonicalization in general. (Obviously, if we're doing it poorly, then we don't have to keep it that way indefinitely.)
On 19.08.24 08:29, Thomas Munro wrote: > Here is a slightly better version of patch 0003. I removed some > unnecessary refactoring, making the patch smaller. > > FTR I wrote a small program[1] for CI to test the assumptions about > Windows in 0001. I printed out the addresses of the objects, to > confirm that different threads were looking at different objects once > the thread local mode was activated, and also assert that the struct > contents were as expected while 8 threads switched locales in a tight > loop, and the output[2] looked OK to me. > > [1] https://github.com/macdice/hello-windows/blob/793eb2fe3e6738c200781f681a22a7e6358f39e5/test.c > [2] https://cirrus-ci.com/task/4650412253380608 Review of the patch v5-0002-Use-thread-safe-strftime_l-instead-of-strftime.patch: This all looks very sensible. My only comment on the code is that for handling error returns from newlocale() and _create_locale() we already have report_newlocale_failure(), which handles various special cases. (But it doesn't do the _dosmaperr() that your patch does.) It would be best if you used that as well (and maybe improve as needed). A couple of comments on the commit message: > Use thread-safe strftime_l() instead of strftime(). I don't think this is about thread-safety of either function? It's more that the latter requires a non-thread-safe code structure around it. I would frame this more around the use-locale_t-everywhere theme than the thread-safety theme. > While here, adjust error message for strftime_l() failure: it does not > set errno, so no %m. Although POSIX says that strftime() and strftime_l() should change errno, experimentation shows that they do not. So this is fine. But I thought also that the previous code was problematic because errno could be overwritten since the failing call, so you wouldn't get a very accurate error message anyway.
On 14.02.25 15:13, Peter Eisentraut wrote: > On 09.02.25 15:52, Peter Eisentraut wrote: >> This patch set is still desirable. Here is a rebased version of the >> v5 patches. I haven't applied any corrections or review comments. > > Here is the same patch set with some review comments. > > Patch 0001 looks okay to me. I'm just offering some cosmetic > improvements in patch 0004. I have committed this patch. The original patch had a typo that prevented the BSD-ish branch (using localeconv_l()) from being taken, so it actually used the fallback uselocale() branch, which then failed on CI on NetBSD. (But conversely, this gave some testing on CI for the uselocale() branch.) > Patch 0002 also looks okay, except that the error handling could be > unified with existing code, as I had previously pointed out. Patch 0005 > fixes that. I plan to commit this one next, after the above had a bit of time to stew. > About patch 0003: > > I had previously pointed out that the canonicalization might have been > intentional, and that we have canonicalization of ICU locale names. But > we don't have to keep the setlocale()-based locale checking > implementation just for that, I think. (If this was meant to be a real > feature offered by libc, there should be a get_locale_name(locale_t) > function.) POSIX 2024 actually has getlocalename_l(), but it doesn't appear to be widely implemented. Anyway, this patch fails tests on CI on NetBSD, so it will need some further investigation.
On 27.03.25 11:16, Peter Eisentraut wrote: >> Patch 0002 also looks okay, except that the error handling could be >> unified with existing code, as I had previously pointed out. Patch >> 0005 fixes that. > > I plan to commit this one next, after the above had a bit of time to stew. also done >> About patch 0003: >> >> I had previously pointed out that the canonicalization might have been >> intentional, and that we have canonicalization of ICU locale names. >> But we don't have to keep the setlocale()-based locale checking >> implementation just for that, I think. (If this was meant to be a >> real feature offered by libc, there should be a >> get_locale_name(locale_t) function.) > Anyway, this patch fails tests on CI on NetBSD, so it will need some > further investigation. It turns out the problem was that nl_langinfo_l() returns a codeset name of "646" for the C locale, which we didn't have in our mapping table. If we add that, then everything passes there as well. (But the use of nl_langinfo_l() wasn't added by this patch, it just exposes it to the tests. So this is apparently a pre-existing problem.) Attached are the remaining patches in this series.
Attachment
Hi, On 2025-03-28 09:13:54 +0100, Peter Eisentraut wrote: > On 27.03.25 11:16, Peter Eisentraut wrote: > > > Patch 0002 also looks okay, except that the error handling could be > > > unified with existing code, as I had previously pointed out. Patch > > > 0005 fixes that. > > > > I plan to commit this one next, after the above had a bit of time to stew. > > also done My compiler complains with: [20/1982 42 1%] Compiling C object src/port/libpgport_shlib.a.p/pg_localeconv_r.c.o ../../../../../home/andres/src/postgresql/src/port/pg_localeconv_r.c:63:1: warning: ‘static’ is not at beginning of declaration[-Wold-style-declaration] 63 | const static struct lconv_member_info table[] = { | ^~~~~ This is the only such warning in the postgres tree. I'll go and fix that. Greetings, Andres Freund
On 28.03.25 09:13, Peter Eisentraut wrote: >>> About patch 0003: >>> >>> I had previously pointed out that the canonicalization might have >>> been intentional, and that we have canonicalization of ICU locale >>> names. But we don't have to keep the setlocale()-based locale >>> checking implementation just for that, I think. (If this was meant >>> to be a real feature offered by libc, there should be a >>> get_locale_name(locale_t) function.) > >> Anyway, this patch fails tests on CI on NetBSD, so it will need some >> further investigation. > > It turns out the problem was that nl_langinfo_l() returns a codeset name > of "646" for the C locale, which we didn't have in our mapping table. If > we add that, then everything passes there as well. (But the use of > nl_langinfo_l() wasn't added by this patch, it just exposes it to the > tests. So this is apparently a pre-existing problem.) Further analysis: (But I have not tested any of this.) It appears that the new implementation of check_locale() provided by this patch, using newlocale() instead of setlocale(), works differently on NetBSD. Specifically, it apparently does not catch garbage locale names. Instead, it just assumes they are C locale. Then, in pg_get_encoding_from_locale(), we have special cases mapping "C" and "POSIX" to SQL_ASCII. But as discussed, with this patch, we no longer do canonicalization of the passed-in locale name, so if you pass a garbage locale name, it will not match "C" or "POSIX". Then, you fall through to the mapping table, and that's where we get the error about the missing "646" entry. That's why this was not a problem before, even though we've used nl_langinfo_l() for a few months and nl_langinfo() before that. I'm not sure what to do with this. If setlocale() and newlocale() indeed behave differently in what set of locale names they accept, then technically we ought to test both of them, since we do use both of them later on. Or maybe we push on with the effort to get rid of setlocale() calls and then just worry about testing newlocale() (as this patch does). But right now, if newlocale() is more permissive, then we could accept locale names that will later fail setlocale() calls, which might be a problem.
Peter Eisentraut <peter@eisentraut.org> writes: > I'm not sure what to do with this. If setlocale() and newlocale() > indeed behave differently in what set of locale names they accept, then > technically we ought to test both of them, since we do use both of them > later on. Or maybe we push on with the effort to get rid of setlocale() > calls and then just worry about testing newlocale() (as this patch > does). But right now, if newlocale() is more permissive, then we could > accept locale names that will later fail setlocale() calls, which might > be a problem. I think the clear answer is "let's stop using setlocale(), and then not have to worry about any behavioral differences". regards, tom lane
On 31.03.25 15:52, Tom Lane wrote: > Peter Eisentraut <peter@eisentraut.org> writes: >> I'm not sure what to do with this. If setlocale() and newlocale() >> indeed behave differently in what set of locale names they accept, then >> technically we ought to test both of them, since we do use both of them >> later on. Or maybe we push on with the effort to get rid of setlocale() >> calls and then just worry about testing newlocale() (as this patch >> does). But right now, if newlocale() is more permissive, then we could >> accept locale names that will later fail setlocale() calls, which might >> be a problem. > > I think the clear answer is "let's stop using setlocale(), and then > not have to worry about any behavioral differences". Right. That effort is woven into various other ongoing work related to locales, collations, etc. So for now, I'm going to close this commitfest entry as done, since $subject was achieved. The rest can be picked up later, when the required progress in the other work has been made.