Re: Remaining dependency on setlocale() - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Remaining dependency on setlocale()
Msg-id CA+hUKGLZ8A_j+UQidb2yKeUQm_FOYz3HgZRES21YOOZBT88pZQ@mail.gmail.com
In response to Re: Remaining dependency on setlocale()  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Remaining dependency on setlocale()
List pgsql-hackers
On Thu, Aug 8, 2024 at 5:16 AM Jeff Davis <pgsql@j-davis.com> wrote:
> On Wed, 2024-08-07 at 19:07 +1200, Thomas Munro wrote:
> > How far can we get by using more _l() functions?
>
> There are a ton of calls to, for example, isspace(), used mostly for
> parsing.
>
> I wouldn't expect a lot of differences in behavior from locale to
> locale, like might be the case with iswspace(), but behavior can be
> different at least in theory.
>
> So I guess we're stuck with setlocale()/uselocale() for a while, unless
> we're able to move most of those call sites over to an ascii-only
> variant.

Here are two more cases that I don't think I've seen discussed.

1.  The nl_langinfo() call in pg_get_encoding_from_locale() can
probably be changed to nl_langinfo_l() (it is everywhere we currently
care about except Windows, which has a different already-thread-safe
alternative; AIX seems to lack the _l version, but someone writing a
patch to re-add support for that OS could supply the configure goo for
a uselocale() save/restore implementation).  One problem is that it
has callers that pass it NULL meaning the backend default, but we'd
perhaps use LC_GLOBAL_LOCALE for now and have to think about where we get
the database default locale_t in the future.
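
To make case 1 concrete, here is a minimal sketch of a codeset lookup
that goes through nl_langinfo_l() and never touches the global or
per-thread locale.  The function name codeset_for_locale_name() is
hypothetical and this is not the actual pg_get_encoding_from_locale()
code; it also assumes a platform with newlocale() and nl_langinfo_l(),
and does not handle the NULL "backend default" case mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

/*
 * Hypothetical helper, for illustration only: return a malloc'd copy of
 * the codeset name for the named locale, or NULL on failure.  The caller
 * frees the result.  No setlocale()/uselocale() call is made.
 */
char *
codeset_for_locale_name(const char *name)
{
    locale_t    loc;
    const char *codeset;
    char       *result = NULL;

    assert(name != NULL);

    /* Build a private locale object; only LC_CTYPE matters for CODESET. */
    loc = newlocale(LC_CTYPE_MASK, name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return NULL;

    /* Copy before freelocale(), which may invalidate the returned string. */
    codeset = nl_langinfo_l(CODESET, loc);
    if (codeset != NULL)
        result = strdup(codeset);

    freelocale(loc);
    return result;
}
```

The exact string returned for a given locale differs between libcs, so
callers would still need the existing name-to-encoding mapping.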

2.  localeconv() is *doubly* non-thread-safe: it depends on the
current locale, and it also returns an object whose storage might be
clobbered by any other call to localeconv(), setlocale(), or even,
according to POSIX, uselocale() (!!!).  I think that effectively
closes off that escape hatch.  On some OSes (macOS, BSDs) you find
localeconv_l() and then I think they give you a more workable
lifetime: as long as the locale_t lives, which makes perfect sense.  I
am surprised that no one has invented localeconv_r() where you supply
the output storage, and you could wrap that in uselocale()
save/restore to deal with the other problem, or localeconv_r_l() or
something.  I can't understand why this is so bad.  The glibc
documentation calls it "a masterpiece of poor design".  Ahh, so it
seems like we need to delete our use of localeconv() completely,
because we should be able to get all the information we need from
nl_langinfo_l() instead:

https://www.gnu.org/software/libc/manual/html_node/Locale-Information.html
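
As a rough sketch of what that replacement could look like, the
following fetches the numeric separators through nl_langinfo_l() and
copies them into caller-supplied storage, in the spirit of the
localeconv_r() that doesn't exist.  The struct and function names are
hypothetical.  Only the POSIX items RADIXCHAR and THOUSEP are used
here; I'm assuming the monetary fields would need the additional items
described on the glibc page above, which may not be available
everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

/* Hypothetical output struct; the caller owns the storage. */
struct numeric_seps
{
    char        decimal_point[16];
    char        thousands_sep[16];
};

/* Returns 0 on success, -1 if the named locale cannot be opened. */
int
get_numeric_seps(const char *locale_name, struct numeric_seps *out)
{
    locale_t    loc;

    assert(locale_name != NULL && out != NULL);

    loc = newlocale(LC_NUMERIC_MASK | LC_CTYPE_MASK, locale_name,
                    (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;

    /*
     * Copy each string immediately: a later nl_langinfo_l() call or
     * freelocale() may overwrite or invalidate the previous result.
     */
    snprintf(out->decimal_point, sizeof(out->decimal_point), "%s",
             nl_langinfo_l(RADIXCHAR, loc));
    snprintf(out->thousands_sep, sizeof(out->thousands_sep), "%s",
             nl_langinfo_l(THOUSEP, loc));

    freelocale(loc);
    return 0;
}
```

Because the result lives in the caller's struct, its lifetime no
longer depends on any later locale call, which is the part of
localeconv() that is hardest to work around.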
