Re: tiny step toward threading: reduce dependence on setlocale() - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: tiny step toward threading: reduce dependence on setlocale()
Date
Msg-id cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org
Whole thread Raw
In response to Re: tiny step toward threading: reduce dependence on setlocale()  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On 15.06.24 01:35, Jeff Davis wrote:
> On Thu, 2024-06-06 at 11:37 -0700, Jeff Davis wrote:
>>
>> I think this patch series is a nice cleanup, as well, making libc
>> more
>> like the other providers and not dependent on global state.
> 
> New rebased series attached with additional cleanup. Now that
> pg_locale_t is never NULL, we can simplify the way the collation cache
> works, eliminating ~100 lines.

Overall, this is great.  I have wished for a simplification like this
for a long time.  It is the logical continuation of relying on
locale_t stuff rather than process-global settings.


* v2-0001-Make-database-default-collation-internal-to-pg_lo.patch

+void
+pg_init_database_collation()

The function argument should be (void).

The prefix pg_ makes it sound like it's a user-facing function of some
sort.  Is there a reason for it?

Maybe add a small comment before the function.


* v2-0002-Make-database-collation-pg_locale_t-always-non-NU.patch

There is quite a bit of code duplication from
pg_newlocale_from_collation().  Can we do this better?


* v2-0003-ts_locale.c-do-not-use-NULL-to-mean-the-database-.patch

The "TODO" markers should be left, because what they refer to is that
these functions just use the default collation rather than something
passed in from the expression machinery.  This is not addressed by
this change.  (Obviously, the comments could have been clearer about
this.)


* v2-0004-Remove-support-for-null-pg_locale_t.patch

I found a few more places that should be adjusted in a similar way.

- In src/backend/regex/regc_pg_locale.c, the whole business with
   pg_regex_locale, in particular in pg_set_regex_collation().

- In src/backend/utils/adt/formatting.c, various places such as
   str_tolower().

- In src/backend/utils/adt/pg_locale.c, wchar2char() and char2wchar().
   (Also, for wchar2char() there is one caller that explicitly passes
   NULL.)

The changes at the call sites of pg_locale_deterministic() are
unfortunate, because they kind of go into the opposite direction: They
add checks for NULL locales instead of removing them.  (For a minute I
was thinking I was reading your patch backwards.)  We should think of
a way to write this clearer.

Looking for example at hashtext() after 0001..0004 applied, it is

     if (!lc_collate_is_c(collid))
         mylocale = pg_newlocale_from_collation(collid);

     if (!mylocale || pg_locale_deterministic(mylocale))
     {

But then after your 0006 patch, lc_locale_is_c() internally also calls
pg_newlocale_from_collation(), so this code just becomes redundant.
Better might be if pg_locale_deterministic() itself checked if collate
is C, and then hashtext() would just need to write:

     mylocale = pg_newlocale_from_collation(collid);

     if (pg_locale_deterministic(mylocale))
     {

The patch sequencing might be a bit tricky here.  Maybe it's ok if
patch 0004 stays as is in this respect if 0006 were to fix it back.


* v2-0005-Avoid-setlocale-in-lc_collate_is_c-and-lc_ctype_i.patch

Nothing uses this, AFAICT, so why?

Also, this creates more duplication between
pg_init_database_collation() and pg_newlocale_from_collation(), as
mentioned at patch 0002.


* v2-0006-Simplify-collation-cache.patch

ok




pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Conflict Detection and Resolution
Next
From: Ashutosh Bapat
Date:
Subject: Re: Conflict Detection and Resolution