Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG
Date
Msg-id 933780cc-2265-a47a-9f39-dd9208bd2c64@iki.fi
Whole thread Raw
In response to Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG  (Joe Conway <mail@joeconway.com>)
Responses Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG
List pgsql-hackers
On 21/06/2023 01:02, Joe Conway wrote:
> On 6/19/23 19:30, Heikki Linnakangas wrote:
>> On 18/06/2023 21:27, Joe Conway wrote:
>> With the patch you're proposing, do we now have a coding rule that you
>> must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to
>> setlocale()? If so, you missed a few spots: pg_perm_setlocale,
>> pg_bind_textdomain_codeset, and cache_locale_time.
> 
> Well I was not proposing such a rule (trying to stay narrowly focused on
> the demonstrated issue) but I suppose it might make sense. Anywhere we
> use setlocale() we are depending on subsequent locale operations to use
> the global locale. And uselocale(LC_GLOBAL_LOCALE) itself looks like it
> ought to be pretty cheap.
> 
>> The current locale affects a lot of other things than localeconv()
>> calls. For example, LC_MESSAGES affects all strerror() calls. Do we need
>> to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror()
>> calls too?
> 
> That seems heavy handed

Yet I think that's exactly where this is heading. See this case (for 
gettext() rather than strerror()):

postgres=# set lc_messages ='sv_SE.UTF8';
SET
postgres=# this prints syntax error in Swedish;
FEL:  syntaxfel vid eller nära "this"
LINE 1: this prints syntax error in Swedish;
         ^
postgres=# load 'plperl';
LOAD
postgres=# set lc_messages ='en_GB.utf8';
SET
postgres=# this *should* print syntax error in English;
FEL:  syntaxfel vid eller nära "this"
LINE 1: this *should* print syntax error in English;
         ^

>> I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after
>> returning from the perl interpreter, instead of before setlocale()
>> calls, if we want all Postgres code to run with the global locale. Not
>> sure how much performance overhead that would have.
> 
> I don't see how that is practical, or at least it does not really
> address the issue. I think any loaded shared library could cause the
> same problem by running newlocale() + uselocale() on init. Perhaps I
> should go test that theory though.

Any shared library could do that, that's true. Any shared library could 
also call 'chdir'. But most shared libraries don't. I think it's the 
responsibility of the extension that loads the shared library, plperl in 
this case, to make sure it doesn't mess up the environment for the 
postgres backend.

>> Testing the patch, I bumped into this:
>>
>> postgres=# create or replace function finnish_to_number() returns
>> numeric as $$ select to_number('1,23', '9D99'); $$ language sql set
>> lc_numeric to 'fi_FI.utf8';
>> CREATE FUNCTION
>> postgres=# DO LANGUAGE 'plperlu' $$
>> use POSIX qw(setlocale LC_NUMERIC);
>> use locale;
>>
>> setlocale LC_NUMERIC, "fi_FI.utf8";
>>
>> $n = 5/2;   # Assign numeric 2.5 to $n
>>
>> spi_exec_query('SELECT finnish_to_number()');
>>
>> $a = " $n"; # Locale-dependent conversion to string
>> elog(NOTICE, "half five is $n");       # Locale-dependent output
>> $$;
>> NOTICE:  half five is 2,5
>> DO
>> postgres=# select to_char(now(), 'Day');
>> WARNING:  could not determine encoding for locale "en_GB.UTF-8": codeset
>> is "ANSI_X3.4-1968"
>>      to_char
>> -----------
>>     Tuesday
>> (1 row)
> 
> Do you think that is because uselocale(LC_GLOBAL_LOCALE) pulls out the
> rug from under perl?

libperl is fine in this case. But cache_locale_time() also calls 
setlocale(), and your patch didn't add the "uselocale(LC_GLOBAL_LOCALE)" 
there.

It's a valid concern that "uselocale(LC_GLOBAL_LOCALE)" could pull the 
rug from under perl. I tried to find issues like that, by calling 
locale-dependent functions in plperl, with SQL functions that call 
"uselocale(LC_GLOBAL_LOCALE)" via PGLC_localeconv() in between. But I 
couldn't find any case where the perl code would misbehave. I guess 
libperl calls uselocale() before any locale-dependent function, but I 
didn't look very closely.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Allow pg_archivecleanup to remove backup history files