On 28.03.25 09:13, Peter Eisentraut wrote:
>>> About patch 0003:
>>>
>>> I had previously pointed out that the canonicalization might have
>>> been intentional, and that we have canonicalization of ICU locale
>>> names. But we don't have to keep the setlocale()-based locale
>>> checking implementation just for that, I think. (If this was meant
>>> to be a real feature offered by libc, there should be a
>>> get_locale_name(locale_t) function.)
>
>> Anyway, this patch fails tests on CI on NetBSD, so it will need some
>> further investigation.
>
> It turns out the problem was that nl_langinfo_l() returns a codeset name
> of "646" for the C locale, which we didn't have in our mapping table. If
> we add that, then everything passes there as well. (But the use of
> nl_langinfo_l() wasn't added by this patch, it just exposes it to the
> tests. So this is apparently a pre-existing problem.)
Further analysis:
(But I have not tested any of this.)
It appears that the new implementation of check_locale() provided by
this patch, using newlocale() instead of setlocale(), works differently
on NetBSD. Specifically, it apparently does not catch garbage locale
names. Instead, it just assumes they are C locale. Then, in
pg_get_encoding_from_locale(), we have special cases mapping "C" and
"POSIX" to SQL_ASCII. But as discussed, with this patch, we no longer
do canonicalization of the passed-in locale name, so if you pass a
garbage locale name, it will not match "C" or "POSIX". Then, you fall
through to the mapping table, and that's where we get the error about
the missing "646" entry. That's why this was not a problem before, even
though we've used nl_langinfo_l() for a few months and nl_langinfo()
before that.
I'm not sure what to do with this. If setlocale() and newlocale()
indeed behave differently in what set of locale names they accept, then
technically we ought to test both of them, since we do use both of them
later on. Or maybe we push on with the effort to get rid of setlocale()
calls and then just worry about testing newlocale() (as this patch
does). But right now, if newlocale() is more permissive, then we could
accept locale names that will later fail setlocale() calls, which might
be a problem.