Thread: Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG
(moving to hackers) On 6/12/23 05:13, Heikki Linnakangas wrote: > On 10/06/2023 22:28, Joe Conway wrote: >> On 6/10/23 15:07, Joe Conway wrote: >>> On 6/10/23 14:42, Tom Lane wrote: >>>> Joe Conway <mail@joeconway.com> writes: >>>>> 5/ The attached fixes the issue for me on pg10 and passes check-world. >>>>> Comments? >>>> >>>> The call in PGLC_localeconv seems *very* oddly placed. Why not >>>> do that before it does any other locale calls? Otherwise you don't >>>> really have reason to believe you're saving the appropriate >>>> values to restore later. >>> >>> >>> As far as I can tell it really only affects localeconv(), so I tried to >>> place it close to those. But I am fine with moving it up. >> >> This version is against pg16 (rather than pg10), moves up that hunk, >> mentions localeconv() in the comment as the reason for the call, and >> fixes some whitespace sloppiness. I will plan to apply to all supported >> branches. >> >> Better? > > The man page for uselocale(LC_GLOBAL_LOCALE) says: "The calling thread's > current locale is set to the global locale determined by setlocale(3)." > Does that undo the effect of calling uselocale() previously, so if you > later call setlocale(), the new locale takes effect in the thread too? > Or is it equivalent to "uselocale(LC_ALL, setlocale(NULL))", so that it > sets the thread's locale to the current global locale, but later > setlocale() calls have no effect on it? setlocale() changes the global locale, but uselocale() changes the locale that is currently active, as I understand it. Also note that uselocale man page says "Unlike setlocale(3), uselocale() does not allow selective replacement of individual locale categories. To employ a locale that differs in only a few categories from the current locale, use calls to duplocale(3) and newlocale(3) to obtain a locale object equivalent to the current locale and modify the desired categories in that object." > In any case, this still doesn't feel like the right place. We have many > more setlocale() calls. Shouldn't we sprinkle them all with > uselocale(LC_GLOBAL_LOCALE)? cache_locale_time() for example. Or rather, > all the places where we use any functions that depend on the current locale. > > How about we replace all setlocale() calls with uselocale()? I don't see us backpatching something that invasive. It might be the right thing to do for pg17, or even pg16, but I think that is a different discussion > Shouldn't we restore the old thread-specific locale after the calls? I'm > not sure why libperl calls uselocale(), but we are now overwriting the > locale that it sets. That is a good question. Though arguably perl is doing the wrong thing by not resetting the global locale when it is being used embedded. > We have a few other uselocale() calls in pg_locale.c, and we take > care to restore the old locale in those. I think as long as we are relying on setlocale rather than uselocale in general (see above), the global locale is where we want things left. > There are a few uselocale() calls in ecpg, and they are protected by > HAVE_USELOCALE. Interestingly, the calls in pg_locale.c are not, but > they are protected by HAVE_LOCALE_T. Seems a little inconsistent. Possibly something we should clean up, but I think that is separate from this fix. In general I think we have 2 or possibly three distinct things here: 1/ how do we fix the misbehavior reported due to libperl in existing stable branches 2/ what makes most sense going forward (and does that mean pg16 or pg17) 3/ misc code cleanups I was mostly trying to concentrate on #1, but 2 & 3 are worthy of discussion. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 6/12/23 10:44, Joe Conway wrote: > 1/ how do we fix the misbehavior reported due to libperl in existing > stable branches <snip> > I was mostly trying to concentrate on #1, but 2 & 3 are worthy of > discussion. Hmm, browsing through the perl source I came across a reference to this (from https://perldoc.perl.org/perllocale): --------------- PERL_SKIP_LOCALE_INIT This environment variable, available starting in Perl v5.20, if set (to any value), tells Perl to not use the rest of the environment variables to initialize with. Instead, Perl uses whatever the current locale settings are. This is particularly useful in embedded environments, see "Using embedded Perl with POSIX locales" in perlembed. --------------- Seems we ought to be using that. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 6/12/23 17:28, Joe Conway wrote: > On 6/12/23 10:44, Joe Conway wrote: >> 1/ how do we fix the misbehavior reported due to libperl in existing >> stable branches > > <snip> > >> I was mostly trying to concentrate on #1, but 2 & 3 are worthy of >> discussion. > > Hmm, browsing through the perl source I came across a reference to this > (from https://perldoc.perl.org/perllocale): > > --------------- > PERL_SKIP_LOCALE_INIT > > This environment variable, available starting in Perl v5.20, if set > (to any value), tells Perl to not use the rest of the environment > variables to initialize with. Instead, Perl uses whatever the current > locale settings are. This is particularly useful in embedded > environments, see "Using embedded Perl with POSIX locales" in perlembed. > --------------- > > Seems we ought to be using that. Turns out that that does nothing useful as far as I can tell. So I am back to proposing the attached against pg16beta1, to be backpatched to pg11. Since much of the discussion happened on pgsql-bugs, the background summary for hackers is this: When plperl is first loaded, the init function eventually works its way to calling Perl_init_i18nl10n(). In versions of perl >= 5.20, that ends up at S_emulate_setlocale() which does a series of uselocale() calls. For reference, RHEL 7 is perl 5.16.3 while RHEL 9 is perl 5.32.1. Older versions of perl do not have this behavior. The problem with uselocale() is that it changes the active locale away from the default global locale. Subsequent uses of setlocale() affect the global locale, but if that is not the active locale, it does not control the results of locale dependent functions such as localeconv(), which is what we depend on in PGLC_localeconv(). The result is illustrated in this example: 8<------------ psql test psql (16beta1) Type "help" for help. test=# show lc_monetary; lc_monetary ------------- en_GB.UTF-8 (1 row) test=# SELECT 12.34::money AS price; price -------- £12.34 (1 row) test=# \q 8<------------ psql test psql (16beta1) Type "help" for help. test=# load 'plperl'; LOAD test=# show lc_monetary; lc_monetary ------------- en_GB.UTF-8 (1 row) test=# SELECT 12.34::money AS price; price -------- $12.34 (1 row) 8<------------ Notice that merely loading plperl makes the currency symbol wrong. I have proposed a targeted fix that I believe is safe to backpatch -- attached. IIUC, Tom was +1, but Heikki was looking for a more general solution. My issue with the more general solution is that it will likely be too invasive to backpatch, and at the moment at least, there are no other confirmed bugs related to all of this (even if the current code is more fragile than we would prefer). I would like to commit this to all supported branches in the next few days, unless there are other suggestions or objections. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On 18/06/2023 21:27, Joe Conway wrote: > I have proposed a targeted fix that I believe is safe to backpatch -- > attached. > > IIUC, Tom was +1, but Heikki was looking for a more general solution. > > My issue with the more general solution is that it will likely be too > invasive to backpatch, and at the moment at least, there are no other > confirmed bugs related to all of this (even if the current code is more > fragile than we would prefer). Ok, I agree switching to uselocale() everywhere is too much to backpatch. We should consider it for master though. With the patch you're proposing, do we now have a coding rule that you must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to setlocale()? If so, you missed a few spots: pg_perm_setlocale, pg_bind_textdomain_codeset, and cache_locale_time. The current locale affects a lot of other things than localeconv() calls. For example, LC_MESSAGES affects all strerror() calls. Do we need to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror() calls too? I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after returning from the perl interpreter, instead of before setlocale() calls, if we want all Postgres code to run with the global locale. Not sure how much performance overhead that would have. I just found out about perl's "switch_to_global_locale" function (https://perldoc.perl.org/perlapi#switch_to_global_locale). Should we use that? Testing the patch, I bumped into this: postgres=# create or replace function finnish_to_number() returns numeric as $$ select to_number('1,23', '9D99'); $$ language sql set lc_numeric to 'fi_FI.utf8'; CREATE FUNCTION postgres=# DO LANGUAGE 'plperlu' $$ use POSIX qw(setlocale LC_NUMERIC); use locale; setlocale LC_NUMERIC, "fi_FI.utf8"; $n = 5/2; # Assign numeric 2.5 to $n spi_exec_query('SELECT finnish_to_number()'); $a = " $n"; # Locale-dependent conversion to string elog(NOTICE, "half five is $n"); # Locale-dependent output $$; NOTICE: half five is 2,5 DO postgres=# select to_char(now(), 'Day'); WARNING: could not determine encoding for locale "en_GB.UTF-8": codeset is "ANSI_X3.4-1968" to_char ----------- Tuesday (1 row) -- Heikki Linnakangas Neon (https://neon.tech)
On 6/19/23 19:30, Heikki Linnakangas wrote: > On 18/06/2023 21:27, Joe Conway wrote: >> I have proposed a targeted fix that I believe is safe to backpatch -- >> attached. >> >> IIUC, Tom was +1, but Heikki was looking for a more general solution. >> >> My issue with the more general solution is that it will likely be too >> invasive to backpatch, and at the moment at least, there are no other >> confirmed bugs related to all of this (even if the current code is more >> fragile than we would prefer). > > Ok, I agree switching to uselocale() everywhere is too much to > backpatch. We should consider it for master though. Makes sense > With the patch you're proposing, do we now have a coding rule that you > must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to > setlocale()? If so, you missed a few spots: pg_perm_setlocale, > pg_bind_textdomain_codeset, and cache_locale_time. Well I was not proposing such a rule (trying to stay narrowly focused on the demonstrated issue) but I suppose it might make sense. Anywhere we use setlocale() we are depending on subsequent locale operations to use the global locale. And uselocale(LC_GLOBAL_LOCALE) itself looks like it ought to be pretty cheap. > The current locale affects a lot of other things than localeconv() > calls. For example, LC_MESSAGES affects all strerror() calls. Do we need > to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror() > calls too? That seems heavy handed > I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after > returning from the perl interpreter, instead of before setlocale() > calls, if we want all Postgres code to run with the global locale. Not > sure how much performance overhead that would have. I don't see how that is practical, or at least it does not really address the issue. I think any loaded shared library could cause the same problem by running newlocale() + uselocale() on init. Perhaps I should go test that theory though. > I just found out about perl's "switch_to_global_locale" function > (https://perldoc.perl.org/perlapi#switch_to_global_locale). Should we > use that? Maybe, although it does not seem to exist on the older perl version on RHEL7. And same comment as above -- while it might solve the problem with libperl, it doesn't address similar problems with other loaded shared libraries. > Testing the patch, I bumped into this: > > postgres=# create or replace function finnish_to_number() returns > numeric as $$ select to_number('1,23', '9D99'); $$ language sql set > lc_numeric to 'fi_FI.utf8'; > CREATE FUNCTION > postgres=# DO LANGUAGE 'plperlu' $$ > use POSIX qw(setlocale LC_NUMERIC); > use locale; > > setlocale LC_NUMERIC, "fi_FI.utf8"; > > $n = 5/2; # Assign numeric 2.5 to $n > > spi_exec_query('SELECT finnish_to_number()'); > > $a = " $n"; # Locale-dependent conversion to string > elog(NOTICE, "half five is $n"); # Locale-dependent output > $$; > NOTICE: half five is 2,5 > DO > postgres=# select to_char(now(), 'Day'); > WARNING: could not determine encoding for locale "en_GB.UTF-8": codeset > is "ANSI_X3.4-1968" > to_char > ----------- > Tuesday > (1 row) Do you think that is because uselocale(LC_GLOBAL_LOCALE) pulls out the rug from under perl? -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 21/06/2023 01:02, Joe Conway wrote: > On 6/19/23 19:30, Heikki Linnakangas wrote: >> On 18/06/2023 21:27, Joe Conway wrote: >> With the patch you're proposing, do we now have a coding rule that you >> must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to >> setlocale()? If so, you missed a few spots: pg_perm_setlocale, >> pg_bind_textdomain_codeset, and cache_locale_time. > > Well I was not proposing such a rule (trying to stay narrowly focused on > the demonstrated issue) but I suppose it might make sense. Anywhere we > use setlocale() we are depending on subsequent locale operations to use > the global locale. And uselocale(LC_GLOBAL_LOCALE) itself looks like it > ought to be pretty cheap. > >> The current locale affects a lot of other things than localeconv() >> calls. For example, LC_MESSAGES affects all strerror() calls. Do we need >> to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror() >> calls too? > > That seems heavy handed Yet I think that's exactly where this is heading. See this case (for gettext() rather than strerror()): postgres=# set lc_messages ='sv_SE.UTF8'; SET postgres=# this prints syntax error in Swedish; FEL: syntaxfel vid eller nära "this" LINE 1: this prints syntax error in Swedish; ^ postgres=# load 'plperl'; LOAD postgres=# set lc_messages ='en_GB.utf8'; SET postgres=# this *should* print syntax error in English; FEL: syntaxfel vid eller nära "this" LINE 1: this *should* print syntax error in English; ^ >> I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after >> returning from the perl interpreter, instead of before setlocale() >> calls, if we want all Postgres code to run with the global locale. Not >> sure how much performance overhead that would have. > > I don't see how that is practical, or at least it does not really > address the issue. I think any loaded shared library could cause the > same problem by running newlocale() + uselocale() on init. Perhaps I > should go test that theory though. Any shared library could do that, that's true. Any shared library could also call 'chdir'. But most shared libraries don't. I think it's the responsibility of the extension that loads the shared library, plperl in this case, to make sure it doesn't mess up the environment for the postgres backend. >> Testing the patch, I bumped into this: >> >> postgres=# create or replace function finnish_to_number() returns >> numeric as $$ select to_number('1,23', '9D99'); $$ language sql set >> lc_numeric to 'fi_FI.utf8'; >> CREATE FUNCTION >> postgres=# DO LANGUAGE 'plperlu' $$ >> use POSIX qw(setlocale LC_NUMERIC); >> use locale; >> >> setlocale LC_NUMERIC, "fi_FI.utf8"; >> >> $n = 5/2; # Assign numeric 2.5 to $n >> >> spi_exec_query('SELECT finnish_to_number()'); >> >> $a = " $n"; # Locale-dependent conversion to string >> elog(NOTICE, "half five is $n"); # Locale-dependent output >> $$; >> NOTICE: half five is 2,5 >> DO >> postgres=# select to_char(now(), 'Day'); >> WARNING: could not determine encoding for locale "en_GB.UTF-8": codeset >> is "ANSI_X3.4-1968" >> to_char >> ----------- >> Tuesday >> (1 row) > > Do you think that is because uselocale(LC_GLOBAL_LOCALE) pulls out the > rug from under perl? libperl is fine in this case. But cache_locale_time() also calls setlocale(), and your patch didn't add the "uselocale(LC_GLOBAL_LOCALE)" there. It's a valid concern that "uselocale(LC_GLOBAL_LOCALE)" could pull the rug from under perl. I tried to find issues like that, by calling locale-dependent functions in plperl, with SQL functions that call "uselocale(LC_GLOBAL_LOCALE)" via PGLC_localeconv() in between. But I couldn't find any case where the perl code would misbehave. I guess libperl calls uselocale() before any locale-dependent function, but I didn't look very closely. -- Heikki Linnakangas Neon (https://neon.tech)
On 6/22/23 03:26, Heikki Linnakangas wrote: > On 21/06/2023 01:02, Joe Conway wrote: >> On 6/19/23 19:30, Heikki Linnakangas wrote: >>> I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after >>> returning from the perl interpreter, instead of before setlocale() >>> calls, if we want all Postgres code to run with the global locale. Not >>> sure how much performance overhead that would have. >> >> I don't see how that is practical, or at least it does not really >> address the issue. I think any loaded shared library could cause the >> same problem by running newlocale() + uselocale() on init. Perhaps I >> should go test that theory though. > > Any shared library could do that, that's true. Any shared library could > also call 'chdir'. But most shared libraries don't. I think it's the > responsibility of the extension that loads the shared library, plperl in > this case, to make sure it doesn't mess up the environment for the > postgres backend. Ok, fair enough. The attached fixes all of the issues raised on this thread by specifically patching plperl. 8<------------ create or replace function finnish_to_number() returns numeric as $$ select to_number('1,23', '9D99') $$ language sql set lc_numeric to 'fi_FI.utf8'; pl_regression=# show lc_monetary; lc_monetary ------------- C (1 row) DO LANGUAGE 'plperlu' $$ use POSIX qw(setlocale LC_NUMERIC); use locale; setlocale LC_NUMERIC, "fi_FI.utf8"; $n = 5/2; # Assign numeric 2.5 to $n spi_exec_query('SELECT finnish_to_number()'); # Locale-dependent conversion to string $a = " $n"; # Locale-dependent output elog(NOTICE, "half five is $n"); $$; NOTICE: half five is 2,5 DO set lc_messages ='sv_SE.UTF8'; this prints syntax error in Swedish; FEL: syntaxfel vid eller nära "this" LINE 1: this prints syntax error in Swedish; ^ set lc_messages ='en_GB.utf8'; this *should* print syntax error in English; ERROR: syntax error at or near "this" LINE 1: this *should* print syntax error in English; ^ set lc_monetary ='sv_SE.UTF8'; SELECT 12.34::money AS price; price ---------- 12,34 kr (1 row) set lc_monetary ='en_GB.UTF8'; SELECT 12.34::money AS price; price -------- £12.34 (1 row) set lc_monetary ='en_US.UTF8'; SELECT 12.34::money AS price; price -------- $12.34 (1 row) 8<------------ This works correctly from what I can see -- tested against pg16beta1 on Linux Mint with perl v5.34.0 as well as against pg15.2 on RHEL 7 with perl v5.16.3. Although I have not looked yet, presumably we could have similar problems with plpython. I would like to get agreement on this approach against plperl before diving into that though. Thoughts? -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote: > Although I have not looked yet, presumably we could have similar > problems with plpython. I would like to get agreement on this approach > against plperl before diving into that though. > > Thoughts? I don't see anything immediately wrong with this. I think doing a similar thing for plpython would make sense. Might make sense to CC any other pl* maintainers too. -- Tristan Partin Neon (https://neon.tech)
On 7/3/23 12:25, Tristan Partin wrote: > On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote: >> Although I have not looked yet, presumably we could have similar >> problems with plpython. I would like to get agreement on this approach >> against plperl before diving into that though. >> >> Thoughts? > > I don't see anything immediately wrong with this. I think doing a > similar thing for plpython would make sense. Might make sense to CC any > other pl* maintainers too. In our tree there are only plperl and plpython to worry about. "other pl* maintainers" is a fuzzy concept since other pl's are scattered far and wide. I think it is reasonable to expect such maintainers to be paying attention to hackers and pick up on it themselves (I say that as a pl maintainer myself -- plr) -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 7/3/23 12:25, Tristan Partin wrote: > On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote: >> Although I have not looked yet, presumably we could have similar >> problems with plpython. I would like to get agreement on this approach >> against plperl before diving into that though. >> >> Thoughts? > > I don't see anything immediately wrong with this. Any further comments on the posted patch[1]? I would like to apply/push this prior to the beta and minor releases next week. Joe [1] https://www.postgresql.org/message-id/ec6fa20d-e691-198a-4a13-e761771b9dec%40joeconway.com -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue Aug 1, 2023 at 8:48 AM CDT, Joe Conway wrote: > On 7/3/23 12:25, Tristan Partin wrote: > > On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote: > >> Although I have not looked yet, presumably we could have similar > >> problems with plpython. I would like to get agreement on this approach > >> against plperl before diving into that though. > >> > >> Thoughts? > > > > I don't see anything immediately wrong with this. > > Any further comments on the posted patch[1]? I would like to apply/push > this prior to the beta and minor releases next week. > > Joe > > [1] > https://www.postgresql.org/message-id/ec6fa20d-e691-198a-4a13-e761771b9dec%40joeconway.com None from my end. -- Tristan Partin Neon (https://neon.tech)
On 01/08/2023 16:48, Joe Conway wrote: > Any further comments on the posted patch[1]? I would like to apply/push > this prior to the beta and minor releases next week. I'm not sure about the placement of the uselocale() calls. In plperl_spi_exec(), for example, I think we should switch to the global locale right after the check_spi_usage_allowed() call. Otherwise, if an error happens in BeginInternalSubTransaction() or in pg_verifymbstr(), the error would be processed with the perl locale. Maybe that's harmless, error processing hardly cares about LC_MONETARY, but seems wrong in principle. Hmm, come to think of it, if BeginInternalSubTransaction() throws an error, we just jump out of the perl interpreter? That doesn't seem cool. But that's not new with this patch. If I'm reading correctly, compile_plperl_function() calls select_perl_context(), which calls plperl_trusted_init(), which calls uselocale(). So it leaves locale set to the perl locale. Who sets it back? How about adding a small wrapper around eval_pl() that sets and unsets the locale(), just when we enter the interpreter? It's easier to see that we are doing the calls in right places, if we make them as close as possible to entering/exiting the interpreter. Are there other functions in addition to eval_pl() that need to be called with the perl locale? > /* > * plperl_xact_callback --- cleanup at main-transaction end. > */ > static void > plperl_xact_callback(XactEvent event, void *arg) > { > /* ensure global locale is the current locale */ > if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE) > perl_locale_obj = uselocale(LC_GLOBAL_LOCALE); > } So the assumption is that the if current locale is not LC_GLOBAL_LOCALE, then it was the perl locale. Seems true today, but this could confusion if anything else calls uselocale(). In particular, if another PL implementation copies this, and you use plperl and the other PL at the same time, they would get mixed up. I think we need another "bool perl_locale_obj_in_use" variable to track explicitly whether the perl locale is currently active. If we are careful to put the uselocale() calls in the right places so that we never ereport() while in perl locale, this callback isn't needed. Maybe it's still a good idea, though, to be extra sure that things get reset to a sane state if something unexpected happens. If multiple interpreters are used, is the single perl_locale_obj variable still enough? Each interpreter can have their own locale I believe. PS. please pgindent -- Heikki Linnakangas Neon (https://neon.tech)
On 8/15/23 10:40, Heikki Linnakangas wrote: > On 01/08/2023 16:48, Joe Conway wrote: >> Any further comments on the posted patch[1]? I would like to apply/push >> this prior to the beta and minor releases next week. > > I'm not sure about the placement of the uselocale() calls. In > plperl_spi_exec(), for example, I think we should switch to the global > locale right after the check_spi_usage_allowed() call. Otherwise, if an > error happens in BeginInternalSubTransaction() or in pg_verifymbstr(), > the error would be processed with the perl locale. Maybe that's > harmless, error processing hardly cares about LC_MONETARY, but seems > wrong in principle. I guess you could probably argue that we should flip this around, and only enter the perl locale when calling into libperl, and exit the perl locale every time we reemerge under plperl.c control. That seems pretty drastic and potentially messy though. > Hmm, come to think of it, if BeginInternalSubTransaction() throws an > error, we just jump out of the perl interpreter? That doesn't seem cool. > But that's not new with this patch. Hmm, true enough I guess. > If I'm reading correctly, compile_plperl_function() calls > select_perl_context(), which calls plperl_trusted_init(), which calls > uselocale(). So it leaves locale set to the perl locale. Who sets it back? No one does it seems, at least not currently > How about adding a small wrapper around eval_pl() that sets and unsets > the locale(), just when we enter the interpreter? It's easier to see > that we are doing the calls in right places, if we make them as close as > possible to entering/exiting the interpreter. Are there other functions > in addition to eval_pl() that need to be called with the perl locale? I can see that as a better strategy, but "other functions in addition to eval_pv()" (I assume you mean eval_pv rather than eval_pl) is a tricky one to answer. I ran the attached script like so (from cwd src/pl/plperl) like so: ``` symbols-used.sh /lib/x86_64-linux-gnu/libperl.so.5.34 plperl.so ``` and get a fairly long list of exported libperl functions that get linked into plperl.so: ``` Matched symbols: boot_DynaLoader perl_alloc Perl_av_extend Perl_av_fetch Perl_av_len Perl_av_push *Perl_call_list *Perl_call_pv *Perl_call_sv perl_construct Perl_croak Perl_croak_nocontext Perl_croak_sv Perl_croak_xs_usage Perl_die *Perl_eval_pv Perl_free_tmps Perl_get_sv Perl_gv_add_by_type Perl_gv_stashpv Perl_hv_clear Perl_hv_common Perl_hv_common_key_len Perl_hv_iterinit Perl_hv_iternext Perl_hv_iternext_flags Perl_hv_iternextsv Perl_hv_ksplit Perl_looks_like_number Perl_markstack_grow Perl_mg_get Perl_newRV Perl_newRV_noinc Perl_newSV Perl_newSViv Perl_newSVpv Perl_newSVpvn Perl_newSVpvn_flags Perl_newSVsv Perl_newSVsv_flags Perl_newSV_type Perl_newSVuv Perl_newXS Perl_newXS_flags *perl_parse Perl_pop_scope Perl_push_scope *perl_run Perl_save_item Perl_savetmps Perl_stack_grow Perl_sv_2bool Perl_sv_2bool_flags Perl_sv_2iv Perl_sv_2iv_flags Perl_sv_2mortal Perl_sv_2pv Perl_sv_2pvbyte Perl_sv_2pvbyte_flags Perl_sv_2pv_flags Perl_sv_2pvutf8 Perl_sv_2pvutf8_flags Perl_sv_bless Perl_sv_free Perl_sv_free2 Perl_sv_isa Perl_sv_newmortal Perl_sv_setiv Perl_sv_setiv_mg Perl_sv_setsv Perl_sv_setsv_flags Perl_sys_init Perl_sys_init3 Perl_xs_boot_epilog Perl_xs_handshake ``` I marked the ones that look like perhaps we should care about in the above list with an asterisk: *Perl_call_list *Perl_call_pv *Perl_call_sv *Perl_eval_pv *perl_run but perhaps there are others? >> /* >> * plperl_xact_callback --- cleanup at main-transaction end. >> */ >> static void >> plperl_xact_callback(XactEvent event, void *arg) >> { >> /* ensure global locale is the current locale */ >> if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE) >> perl_locale_obj = uselocale(LC_GLOBAL_LOCALE); >> } > > So the assumption is that the if current locale is not LC_GLOBAL_LOCALE, > then it was the perl locale. Seems true today, but this could confusion > if anything else calls uselocale(). In particular, if another PL > implementation copies this, and you use plperl and the other PL at the > same time, they would get mixed up. I think we need another "bool > perl_locale_obj_in_use" variable to track explicitly whether the perl > locale is currently active. Or perhaps don't assume that we want the global locale and swap between pg_locale_obj (whatever it is) and perl_locale_obj? > If we are careful to put the uselocale() calls in the right places so > that we never ereport() while in perl locale, this callback isn't > needed. Maybe it's still a good idea, though, to be extra sure that > things get reset to a sane state if something unexpected happens. I feel more comfortable that we have a "belt and suspenders" method to restore the locale that was in use by Postgres before entering perl. > If multiple interpreters are used, is the single perl_locale_obj > variable still enough? Each interpreter can have their own locale I believe. So in other words plperl and plperlu both used in the same query? I don't see how we could get from one to the other without going through the outer "postgres" locale first. Or are you thinking something else? > PS. please pgindent ok -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On 27/08/2023 16:41, Joe Conway wrote: > On 8/15/23 10:40, Heikki Linnakangas wrote: >> If multiple interpreters are used, is the single perl_locale_obj >> variable still enough? Each interpreter can have their own locale I believe. > > So in other words plperl and plperlu both used in the same query? I > don't see how we could get from one to the other without going through > the outer "postgres" locale first. Or are you thinking something else? I think you got that it backwards. 'perl_locale_obj' is set to the perl interpreter's locale, whenever we are *outside* the interpreter. This crashes with the patch: postgres=# DO LANGUAGE plperlu $function$ use POSIX qw(setlocale LC_NUMERIC); use locale; setlocale LC_NUMERIC, "sv_SE.utf8"; $function$; DO postgres=# do language plperl $$ $$; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. I was going to test using plperl and plperl in the same session and expected the interpreters to mix up the locales they use. Maybe the crash is because of something like that, although I didn't expect a crash, just weird confusion on which locale is used. -- Heikki Linnakangas Neon (https://neon.tech)
On Sun, Aug 27, 2023 at 4:25 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I think you got that it backwards. 'perl_locale_obj' is set to the perl > interpreter's locale, whenever we are *outside* the interpreter. This thread has had no update for more than 4 months, so I'm marking the CF entry RwF for now. It can always be reopened, if Joe or Tristan or Heikki or someone else picks it up again. -- Robert Haas EDB: http://www.enterprisedb.com
On 1/5/24 12:56, Robert Haas wrote: > On Sun, Aug 27, 2023 at 4:25 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> I think you got that it backwards. 'perl_locale_obj' is set to the perl >> interpreter's locale, whenever we are *outside* the interpreter. > > This thread has had no update for more than 4 months, so I'm marking > the CF entry RwF for now. > > It can always be reopened, if Joe or Tristan or Heikki or someone else > picks it up again. It is definitely a bug, so I do plan to get back to it at some point, hopefully sooner rather than later... -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com