Re: [PATCH] Fix severe performance regression with gettext 0.20+ on Windows - Mailing list pgsql-hackers

From Bryan Green
Subject Re: [PATCH] Fix severe performance regression with gettext 0.20+ on Windows
Date
Msg-id 9b5ddea1-fd3a-4eb2-b2fa-b1b0bcdd27c5@gmail.com
In response to Re: [PATCH] Fix severe performance regression with gettext 0.20+ on Windows  (Peter Eisentraut <peter@eisentraut.org>)
List pgsql-hackers
On 12/11/2025 8:43 AM, Peter Eisentraut wrote:
> On 10.12.25 01:45, Bryan Green wrote:
>> The attached patch takes a pragmatic approach: for gettext 0.20.1+, we
>> avoid triggering the bug by using Windows locale format instead of
>> calling IsoLocaleName(). This works because gettext 0.20.1+ internally
>> converts the Windows format back to POSIX for catalog lookups, whereas
>> 0.19.8 and earlier need POSIX format directly.
> 
> I can confirm that this patch fixes the performance deviation from
> activating --enable-nls on Windows (tested with MSYS2/UCRT64).
> 
> I wonder, this change that gettext did with the locale naming, does that
> also affect what guidance we need to provide to users about how to
> configure locale names?  For example, on a Unix-ish system, a user can
> do something like initdb ... --lc-messages=de_DE.  What locale name
> format do you need to use on Windows to get the translations to
> activate?  Does this also depend on the gettext version?
> 
If the language catalog is installed, they will get translated
messages as expected.  The downside is that because they are passing a
POSIX locale name, gettext will still do the locale enumeration every
time, which is what causes the performance hit.  The good news is that
gettext has accepted my cache patch for their next release.  If a
Windows system is configured with lc_messages="de_DE" but has the next
release of gettext, they should be fine.  If they don't have the next
release of gettext, they will notice the performance issue, but that
can be fixed by changing "de_DE" to the correct Windows locale name.


Walk-through:


1. LCID Lookup: get_lcid("de_DE")
   - Enumerates Windows locales looking for "de_DE"
   - Fails: Windows locales are named "German_Germany", not "de_DE"
   - Returns: 0
   - BUG: Doesn't cache the failure, so it repeats on every call
     (patched in the next gettext release)

2. Catalog Search: _nl_make_l10nflist()
   - Tries: locale/de_DE/LC_MESSAGES/postgres-19.mo (not found)
   - Tries: locale/de/LC_MESSAGES/postgres-19.mo (found!)
   - Loads German translations
   - Success!

So, the user gets German messages (the catalog fallback works), but
performance is poor because gettext doesn't cache the failed locale
search, so the LCID lookup repeats on every call.
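
To make the caching point concrete, here is a rough sketch of negative
caching in Python.  It is not gettext's code: LcidCache, fake_enumerate,
and the one-entry locale table are hypothetical names made up for
illustration.  The only point is that a failed lookup (0) is remembered
just like a successful one, so the expensive enumeration runs once per
name.

```python
# Illustrative sketch only: names here are hypothetical, not gettext's API.
_MISSING = object()  # sentinel so that a cached 0 (failure) is still a cache hit


class LcidCache:
    def __init__(self, enumerate_locales):
        self._enumerate = enumerate_locales  # the expensive locale scan
        self._cache = {}
        self.scans = 0  # how many times the expensive scan actually ran

    def get_lcid(self, name):
        cached = self._cache.get(name, _MISSING)
        if cached is not _MISSING:
            return cached
        self.scans += 1
        lcid = self._enumerate(name)  # returns 0 when nothing matches
        self._cache[name] = lcid      # remember failures (0) as well
        return lcid


def fake_enumerate(name):
    # Stand-in for enumerating the Windows locales; the table is made up
    # for this example and only knows the Windows-style name.
    table = {"German_Germany": 0x0407}
    return table.get(name, 0)


cache = LcidCache(fake_enumerate)
first = cache.get_lcid("de_DE")   # scan runs, fails, failure is cached
second = cache.get_lcid("de_DE")  # served from the cache, no second scan
```

Without the cached failure, every message lookup would pay for a full
scan again, which is the behavior described above.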



More detailed information for the curious:

Even though get_lcid() returned 0, gettext continues with catalog lookup:

  Function: _nl_find_domain() and _nl_make_l10nflist()
  Location: gettext-runtime/intl/dcigettext.c and l10nflist.c

  Process:
    1. Parse "de_DE" into components:
       language = "de"
       territory = "DE"
       codeset = NULL
       modifier = NULL

    2. Try catalog paths in order (most specific to least specific):

       Try #1: language + territory + codeset + modifier
         Path: /share/locale/de_DE.UTF-8@euro/LC_MESSAGES/postgres-19.mo
         stat(): File not found

       Try #2: language + territory + codeset
         Path: /share/locale/de_DE.UTF-8/LC_MESSAGES/postgres-19.mo
         stat(): File not found

       Try #3: language + territory
         Path: /share/locale/de_DE/LC_MESSAGES/postgres-19.mo
         stat(): File not found (PostgreSQL doesn't ship de_DE)

       Try #4: language + codeset
         Path: /share/locale/de.UTF-8/LC_MESSAGES/postgres-19.mo
         stat(): File not found

       Try #5: language only
         Path: /share/locale/de/LC_MESSAGES/postgres-19.mo
         stat(): SUCCESS! File exists ✓

    3. Load catalog: _nl_load_domain()
       Parse .mo file, load German translations

    4. Look up message: _nl_find_msg()
       Binary search for "division by zero"
       Find translation: "Teilung durch Null"

    5. Return translated message
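
The parse-and-fall-back steps above can be sketched roughly as follows.
This is a simplified Python illustration, not the logic in l10nflist.c:
parse_locale and catalog_candidates are hypothetical helpers, the exact
ordering of combinations is an assumption, and only components that are
actually present in the locale name are used to build candidates.

```python
from itertools import product


def parse_locale(name):
    # Split "language[_territory][.codeset][@modifier]" into its parts.
    rest, _, modifier = name.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    return language, territory or None, codeset or None, modifier or None


def catalog_candidates(name, domain="postgres-19", root="/share/locale"):
    # Build candidate .mo paths from most specific to least specific.
    # The ordering here is an assumption made for illustration.
    language, territory, codeset, modifier = parse_locale(name)
    paths = []
    for use_t, use_c, use_m in product((True, False), repeat=3):
        # Skip combinations that need a component the name does not have.
        if (use_t and not territory) or (use_c and not codeset) \
                or (use_m and not modifier):
            continue
        locale = language
        if use_t:
            locale += "_" + territory
        if use_c:
            locale += "." + codeset
        if use_m:
            locale += "@" + modifier
        paths.append(f"{root}/{locale}/LC_MESSAGES/{domain}.mo")
    return paths


def first_existing(paths, exists):
    # Return the first candidate for which exists(path) is true, else None.
    for path in paths:
        if exists(path):
            return path
    return None


candidates = catalog_candidates("de_DE")
# Pretend only the language-only catalog is installed.
installed = {"/share/locale/de/LC_MESSAGES/postgres-19.mo"}
chosen = first_existing(candidates, installed.__contains__)
```

For "de_DE" this yields the de_DE path followed by the de path, and the
second one is picked because it is the only one that exists.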


You might be wondering what happens if the "de" catalog doesn't exist.
It depends on whether the user has set the LANGUAGE environment variable
to an ordered list of preferred languages.  On Windows this can also be
set in the registry, and gettext picks it up from there.  If LANGUAGE is
not set on Windows, gettext uses GetUserDefaultUILanguage() to determine
what locale to use.  If everything fails, you get back the msgid you
passed in, i.e. the untranslated message.
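
The "you get the msgid back" behavior is easy to see with Python's
standard-library gettext module.  That module is a separate
implementation from GNU libintl, so take this only as an analogy for the
last-resort fallback, not as a demonstration of the Windows code path.
The translate() helper and the empty temporary directory are just
scaffolding for the example.

```python
import gettext
import tempfile


def translate(msgid, languages, localedir, domain="postgres-19"):
    # fallback=True makes translation() return a NullTranslations object
    # when no matching .mo file is found, instead of raising an error.
    catalog = gettext.translation(
        domain, localedir=localedir, languages=languages, fallback=True
    )
    return catalog.gettext(msgid)


# An empty directory stands in for a system with no catalogs installed.
with tempfile.TemporaryDirectory() as empty_dir:
    result = translate("division by zero", ["de_DE"], empty_dir)
```

With no catalog available, result is the original English msgid.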

-- 
Bryan Green
EDB: https://www.enterprisedb.com


