Re: Remaining dependency on setlocale() - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Remaining dependency on setlocale()
Msg-id CA+hUKGKcnC_nc1WH5sQfD1-ADgRKAoVHrkDLY9omgoAubrbu-w@mail.gmail.com
In response to Re: Remaining dependency on setlocale()  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Thu, Aug 15, 2024 at 11:00 AM Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2024-08-15 at 10:43 +1200, Thomas Munro wrote:
> > So I think the solution could perhaps be something like: in some early
> > startup phase before there are any threads, we nail down all the
> > locale categories to "C" (or whatever we decide on for the permanent
> > global locale), and also query the "" categories and make a copy of
> > them in case anyone wants them later, and then never call setlocale()
> > again.
>
> +1.

We currently nail down these categories:

    /* We keep these set to "C" always.  See pg_locale.c for explanation. */
    init_locale("LC_MONETARY", LC_MONETARY, "C");
    init_locale("LC_NUMERIC", LC_NUMERIC, "C");
    init_locale("LC_TIME", LC_TIME, "C");

CF #5170 has patches that stop us changing them even transiently: the
caches of information we need for those categories are filled through
locale_t interfaces instead, so the categories really do stay nailed
down.
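
To make the locale_t idea concrete, here is a minimal sketch, not taken
from those patches, of reading one LC_NUMERIC fact through a private
locale object while the global locale stays at "C".  The helper name
get_decimal_point() is made up for illustration, and I'm assuming a
POSIX.1-2008 system such as glibc (macOS would also need <xlocale.h>):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper (not from the CF patches): copy the decimal-point
 * string of the named locale into buf without touching the global locale.
 * Returns 0 on success, -1 if the locale is unknown or buf is too small.
 */
static int
get_decimal_point(const char *locale_name, char *buf, size_t buflen)
{
    locale_t    loc;
    const char *radix;
    int         rc = -1;

    assert(locale_name != NULL && buf != NULL);

    /* Private locale object; only LC_NUMERIC is taken from locale_name. */
    loc = newlocale(LC_NUMERIC_MASK, locale_name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;

    /* nl_langinfo_l() consults loc, not the process-wide locale. */
    radix = nl_langinfo_l(RADIXCHAR, loc);
    if (strlen(radix) < buflen)
    {
        strcpy(buf, radix);
        rc = 0;
    }

    freelocale(loc);
    return rc;
}
```

A cache could be filled once from a helper like this, and
setlocale(LC_NUMERIC, NULL) would still report "C" afterwards.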

It sounds like someone needs to investigate doing the same thing for
these two, from CheckMyDatabase():

    if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
        ereport(FATAL,
                (errmsg("database locale is incompatible with operating system"),
                 errdetail("The database was initialized with LC_COLLATE \"%s\", "
                           " which is not recognized by setlocale().", collate),
                 errhint("Recreate the database with another locale or install the missing locale.")));

    if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
        ereport(FATAL,
                (errmsg("database locale is incompatible with operating system"),
                 errdetail("The database was initialized with LC_CTYPE \"%s\", "
                           " which is not recognized by setlocale().", ctype),
                 errhint("Recreate the database with another locale or install the missing locale.")));

How should that work?  Maybe we could imagine something like
MyDatabaseLocale, a locale_t with LC_COLLATE and LC_CTYPE categories
set appropriately.  Or should it be a pg_locale_t instead (if your
database default provider is ICU, then you don't even need a locale_t,
right?).
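
For the plain locale_t variant, a rough sketch of what I have in mind
might look like the following.  Only the name MyDatabaseLocale comes
from the paragraph above; init_database_locale() and database_strcoll()
are hypothetical, and error reporting is reduced to a return code where
the real code would ereport(FATAL).  This again assumes POSIX.1-2008
newlocale() and strcoll_l():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical global: nothing with this name exists in the tree today. */
static locale_t MyDatabaseLocale = (locale_t) 0;

/*
 * Build MyDatabaseLocale from the database's LC_COLLATE and LC_CTYPE names
 * without calling setlocale().  Returns 0 on success, or -1 if the operating
 * system does not recognize one of the names.
 */
static int
init_database_locale(const char *collate, const char *ctype)
{
    locale_t    with_collate;
    locale_t    with_both;

    assert(collate != NULL && ctype != NULL);

    with_collate = newlocale(LC_COLLATE_MASK, collate, (locale_t) 0);
    if (with_collate == (locale_t) 0)
        return -1;

    /* On success newlocale() consumes the base and returns the new object. */
    with_both = newlocale(LC_CTYPE_MASK, ctype, with_collate);
    if (with_both == (locale_t) 0)
    {
        /* On failure the base is left untouched, so we still own it. */
        freelocale(with_collate);
        return -1;
    }

    if (MyDatabaseLocale != (locale_t) 0)
        freelocale(MyDatabaseLocale);
    MyDatabaseLocale = with_both;
    return 0;
}

/* Compare two strings under the database locale, not the global one. */
static int
database_strcoll(const char *a, const char *b)
{
    assert(MyDatabaseLocale != (locale_t) 0);
    return strcoll_l(a, b, MyDatabaseLocale);
}
```

With ICU as the default provider none of this would be needed, which is
part of why a pg_locale_t might be the better home for it.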

Then I think there is one quite gnarly category, from
assign_locale_messages() (a GUC assignment function):

    (void) pg_perm_setlocale(LC_MESSAGES, newval);

I have never really studied gettext(), but I know it was just
standardised in POSIX.1-2024, and the standardised interface has _l()
variants of all functions.  Current implementations don't have them
yet.  Clearly we couldn't call pg_perm_setlocale() after early
startup, but if gettext() relies on the current locale to affect far
away code, then maybe this is one place where we'd just have to use
uselocale().  Perhaps we could plan some transitional
strategy where NetBSD users lose the ability to change the GUC without
restarting the server and it has to be the same for all sessions, or
something like that, until they produce either gettext_l() or
uselocale(), but I haven't thought hard about this part at all yet...
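
Just to illustrate the uselocale() option, here is an untested-in-anger
sketch of switching LC_MESSAGES for the calling thread only around a
gettext() call.  translate_in_locale() is a made-up name, and I'm
assuming the gettext() implementation honours the thread locale set by
uselocale() (I believe glibc's does, but I haven't checked others):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper: look up msgid with LC_MESSAGES taken from
 * messages_locale, changing only the calling thread's locale.  Falls back
 * to returning msgid itself if the locale name is not recognized.
 */
static const char *
translate_in_locale(const char *messages_locale, const char *msgid)
{
    locale_t    loc;
    locale_t    prev;
    const char *result;

    assert(messages_locale != NULL && msgid != NULL);

    loc = newlocale(LC_MESSAGES_MASK, messages_locale, (locale_t) 0);
    if (loc == (locale_t) 0)
        return msgid;

    prev = uselocale(loc);      /* affects this thread only */
    result = gettext(msgid);    /* returns msgid if no translation is found */
    uselocale(prev);            /* restore before freeing loc */
    freelocale(loc);

    return result;
}
```

A real version would build the locale_t once when the GUC is assigned
and keep it installed, rather than creating and freeing it per message.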


