Thread: plpgsql is not translate-aware
Hi, In reviewing Volkan Yazici's (sorry for the dots) patch to improve plpgsql's error messages, I noticed that we have no PO files for plpgsql at all! It doesn't seem hard to add; I just had to create a nls.mk file and things seem ready to go. Obviously, we'll need to add plpgsql to the pgtranslation files in pgfoundry. There are 141 new strings to translate, and from spanish I get 71 fuzzies, so it seems an easy project. Should I go ahead and commit the initial files? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > It doesn't seem hard to add; I just had to create a nls.mk file and > things seem ready to go. Obviously, we'll need to add plpgsql to the > pgtranslation files in pgfoundry. Actually this is wrong -- since the library is going to run with "postgres" text domain, we need to add the files to the backend's nls.mk: Index: nls.mk =================================================================== RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/nls.mk,v retrieving revision 1.22 diff -c -p -u -r1.22 nls.mk --- nls.mk 24 Mar 2008 18:08:47 -0000 1.22 +++ nls.mk 5 Sep 2008 16:00:18 -0000 @@ -7,7 +7,7 @@ GETTEXT_FILES := + gettext-filesGETTEXT_TRIGGERS:= _ errmsg errdetail errdetail_log errhint errcontextwrite_stderr yyerrorgettext-files: distprep - find $(srcdir)/ $(srcdir)/../port/ -name '*.c' -print >$@ + find $(srcdir)/ $(srcdir)/../port/ $(srcdir)/../pl/ -name '*.c' -print >$@my-maintainer-clean: rm -f gettext-files -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > In reviewing Volkan Yazici's (sorry for the dots) patch to improve > plpgsql's error messages, I noticed that we have no PO files for plpgsql > at all! Ugh. Yeah, we should fix that. Does it actually just work, seeing that plpgsql is a loadable library? regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > In reviewing Volkan Yazici's (sorry for the dots) patch to improve > > plpgsql's error messages, I noticed that we have no PO files for plpgsql > > at all! > > Ugh. Yeah, we should fix that. Does it actually just work, seeing > that plpgsql is a loadable library? Well, it didn't, but I just tested what I posted in the followup and it does work: alvherre=# create function aa (internal) returns int language plpgsql as $$ begin; select 1; end; $$; ERROR: las funciones plpgsql no pueden tener el tipo internal como argumento The vices in the error message are not the translator's fault: missing quotes and "plpgsql" instead of "PL/pgSQL": alvherre=# set lc_messages to 'C'; SET alvherre=# create function aa (internal) returns int language plpgsql as $$ begin; select 1; end; $$; ERROR: plpgsql functions cannot take type internal I'd even go a bit further and say that the original should not include the language name in the string, so that (say) plpython and plperl can use the same translation: "%s functions cannot take type \"%s\"", "PL/pgSQL", type_name -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Actually this is wrong -- since the library is going to run with > "postgres" text domain, we need to add the files to the backend's > nls.mk: Can't we give it its own text domain? It seems fundamentally wrong for a plug-in language to require core support for its messages. (Now that I think about it, that may have been the reason we don't have localization for it already.) I suppose this must be possible, since e.g. glibc manages to have its own messages separate from whatever app it's linked with. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Actually this is wrong -- since the library is going to run with > > "postgres" text domain, we need to add the files to the backend's > > nls.mk: > > Can't we give it its own text domain? It seems fundamentally wrong > for a plug-in language to require core support for its messages. > (Now that I think about it, that may have been the reason we don't have > localization for it already.) I suppose this must be possible, > since e.g. glibc manages to have its own messages separate from > whatever app it's linked with. I'm not sure how this'd work. I think this would require plpgsql using dgettext (passing a domain) instead of plain gettext(), but since it uses ereport() just like the backend, I don't see a good way to make that work. Maybe another idea would be to call textdomain() just before calling anything that would raise an error, and reset it on exit. But since backend errors can happen at any time too, this doesn't seem possible. Not sure how glibc does it. Maybe they just use dgettext(). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > The vices in the error message are not the translator's fault: missing > quotes and "plpgsql" instead of "PL/pgSQL": It's been message style policy for quite some time to not quote the output of format_type. I think this is because format_type sometimes puts quotes into its output, and it'd look weird. > I'd even go a bit further and say that the original should not include > the language name in the string, so that (say) plpython and plperl can > use the same translation: That'd only be useful if they all share a common message catalog, which does not seem like a good design direction to me. How would a non-core PL hope to get localized if it can't have its own catalog? regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Actually this is wrong -- since the library is going to run with > > "postgres" text domain, we need to add the files to the backend's > > nls.mk: > > Can't we give it its own text domain? It seems fundamentally wrong > for a plug-in language to require core support for its messages. > (Now that I think about it, that may have been the reason we don't have > localization for it already.) I suppose this must be possible, > since e.g. glibc manages to have its own messages separate from > whatever app it's linked with. What glibc does is use dgettext() instead of plain gettext(), so they are able to pass the domain along the message. See here, at the bottom: http://github.com/bneumeier/glibc/tree/master/include/libintl.h where _libc_intl_domainname is defined as "libc" elsewhere. I guess one way to go about this is to add some way to pass the domain from the caller, i.e. ereport(ERROR, (errdomain("plpgsql"), errmsg(" ... "))) but it seems a bit fragile, if only because it's easy to forget to add it to new code, and easy to overlook a message that should have it. Another idea would be to redefine errmsg() and friends within the plpgsql sources, but that seems worse because every other library will need to do the same. Refinement of the previous idea: maybe we can add a compiler flag, to be set in Makefiles, that defines the domain and passes it down to EVALUATE_MESSAGE. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> Can't we give it its own text domain? It seems fundamentally wrong >> for a plug-in language to require core support for its messages. > Another idea would be to redefine errmsg() and friends within the > plpgsql sources, but that seems worse because every other library will > need to do the same. > Refinement of the previous idea: maybe we can add a compiler flag, to be > set in Makefiles, that defines the domain and passes it down to > EVALUATE_MESSAGE. It seems like something like that could work. I'd be inclined to change the ereport() macro itself, not errmsg/errdetail/etc. The domain name could be passed in at ereport and then used when evaluating the messages. A possible problem is that some places use _() (ie, gettext) directly instead of leaving it to the elog.c code to apply translation. But I suppose we might be able to redefine _() too. regards, tom lane
Tom Lane wrote: > It seems like something like that could work. I'd be inclined to change > the ereport() macro itself, not errmsg/errdetail/etc. The domain name > could be passed in at ereport and then used when evaluating the > messages. This seems to work fine. Modules would provide their message domain by overriding the ereport() definition to pass a different domain to ereport_domain(), like the attached plpgsql_i18n.h. Now, the obvious big problem I have with this patch is that I have to pass plpgsql's locale path to bindtextdomain(), but I'm not seeing any decent way to do that ... ideas? Note that I didn't bother changing the elog() macros to provide a message domain ... I'm sure that can be fixed but it's very low priority, given that most of the time those messages do not get translated. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > Now, the obvious big problem I have with this patch is that I have to > pass plpgsql's locale path to bindtextdomain(), but I'm not seeing any > decent way to do that ... ideas? Shouldn't it just use the same locale path as the backend? Or are you stuck getting hold of my_exec_path so you can call get_locale_path? If the latter, it would probably be reasonable for postmaster/backend startup to expose the exec path as a global variable. Two minor nits about the patch itself: the domain field of ErrorData should be const char * (it's like funcname, not like message; in fact, this coding ought to be giving you a warning about casting away const) and there seems some gratuitous inconsistency between the ordering of function arguments, field locations, and statements copying one to the other. (Yeah, I know, that last is *really* anal-retentive, but it's just easier to see that things match up and you didn't miss anything when you keep consistent ordering.) > Note that I didn't bother changing the elog() macros to provide a > message domain ... I'm sure that can be fixed but it's very low > priority, given that most of the time those messages do not get > translated. None of the time do elog messages get translated, so yes, that would be pointless. However --- > ! if (!errstart(elevel, edata->filename, edata->lineno, NULL, edata->funcname)) --- this looks like it could result in passing a NULL to dgettext, somewhere along the line. Probably safer to pass "postgres". regards, tom lane
Alvaro Herrera wrote: > Now, the obvious big problem I have with this patch is that I have to > pass plpgsql's locale path to bindtextdomain(), but I'm not seeing any > decent way to do that ... ideas? I think the best way to do this is to stash the library path being loaded where _PG_init can find it. It is a bit ugly but not beyond belief. Would anybody object to this? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Now, the obvious big problem I have with this patch is that I have to > > pass plpgsql's locale path to bindtextdomain(), but I'm not seeing any > > decent way to do that ... ideas? > > Shouldn't it just use the same locale path as the backend? Or are you > stuck getting hold of my_exec_path so you can call get_locale_path? > If the latter, it would probably be reasonable for postmaster/backend > startup to expose the exec path as a global variable. That would work too. I'm not sure I prefer this, or the hack to have dfmgr.c expose it to _PG_init, per my v2 patch. > Two minor nits about the patch itself: the domain field of ErrorData > should be const char * (it's like funcname, not like message; in fact, > this coding ought to be giving you a warning about casting away const) Yeah, I had added it in v2. I had failed to see the warning, but it was there. > and there seems some gratuitous inconsistency between the ordering of > function arguments, field locations, and statements copying one to the > other. True, fixed. > However --- > > > ! if (!errstart(elevel, edata->filename, edata->lineno, NULL, edata->funcname)) > > --- this looks like it could result in passing a NULL to dgettext, > somewhere along the line. Probably safer to pass "postgres". Hmm, I was trusting that dgettext is documented to accept a NULL as meaning "use the domain previously set with textdomain", but then it is possible that elog() will be called before textdomain is set, so you might be right. Fixed in this new version. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > Would anybody object to this? I don't think this will work desirably at all ... get_locale_path assumes that it's given a path to $installdir/bin/someexecutable, not wherever shared libraries might have come from. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Would anybody object to this? > > I don't think this will work desirably at all ... get_locale_path > assumes that it's given a path to $installdir/bin/someexecutable, > not wherever shared libraries might have come from. Hmm, OK, I'll see about doing it the other way. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> --- this looks like it could result in passing a NULL to dgettext, >> somewhere along the line. Probably safer to pass "postgres". > Hmm, I was trusting that dgettext is documented to accept a NULL as > meaning "use the domain previously set with textdomain", but then it is > possible that elog() will be called before textdomain is set, so you > might be right. Fixed in this new version. Another way, which would save some amount of string constant space, is to have both elog_finish and the ereport macro pass NULL, and let errstart insert the default: > + edata->domain = domain ? domain : "postgres"; Otherwise we'll have at least one copy of "postgres" per backend .o file ... regards, tom lane
Tom Lane wrote: > Another way, which would save some amount of string constant space, > is to have both elog_finish and the ereport macro pass NULL, and let > errstart insert the default: > > > + edata->domain = domain ? domain : "postgres"; > > Otherwise we'll have at least one copy of "postgres" per backend .o > file ... Hmm, true. I think this means we need to redefine ereport(), not just TEXTDOMAIN, in each module (in fact it makes TEXTDOMAIN goes away as a symbol). Same number of lines on each module though, so it's not a considerable difference. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
... and here it is with equivalent definitions added for plperl, plpython and pltcl. (Notably missing in this patch are the necessary nls.mk files). I note that this opens the door to contrib modules (and others) wanting to provide translations, but I'm not going there for now. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> Another way, which would save some amount of string constant space, >> is to have both elog_finish and the ereport macro pass NULL, and let >> errstart insert the default: > Hmm, true. I think this means we need to redefine ereport(), not just > TEXTDOMAIN, in each module (in fact it makes TEXTDOMAIN goes away as a > symbol). Same number of lines on each module though, so it's not a > considerable difference. No, you could have TEXTDOMAIN be defined as NULL by default, and let modules redefine it as "foo". regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Hmm, true. I think this means we need to redefine ereport(), not just > > TEXTDOMAIN, in each module (in fact it makes TEXTDOMAIN goes away as a > > symbol). Same number of lines on each module though, so it's not a > > considerable difference. > > No, you could have TEXTDOMAIN be defined as NULL by default, and let > modules redefine it as "foo". Doh, right. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Tom Lane wrote: > > No, you could have TEXTDOMAIN be defined as NULL by default, and let > > modules redefine it as "foo". > > Doh, right. So this'd seem to be the version ready to be applied (plus the needed nls.mk files). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > So this'd seem to be the version ready to be applied (plus the needed > nls.mk files). I'm down to two seriously trivial nitpicks: * "preferably" has one r. > + const char *domain; /* message domain, NULL if default */ This comment is incorrect, since the field value is not actually intended to ever be NULL. But you need Peter's advice before going further, because I don't actually know a darn thing about the NLS infrastructure. regards, tom lane
Alvaro Herrera wrote: > Alvaro Herrera wrote: >> Tom Lane wrote: > >>> No, you could have TEXTDOMAIN be defined as NULL by default, and let >>> modules redefine it as "foo". >> Doh, right. > > So this'd seem to be the version ready to be applied (plus the needed > nls.mk files). Perhaps repeated code like the following should be refactored to a common function offered by the backend? > > diff -c -p -r1.40 pl_handler.c > *** src/pl/plpgsql/src/pl_handler.c 29 Aug 2008 13:02:33 -0000 1.40 > --- src/pl/plpgsql/src/pl_handler.c 9 Oct 2008 00:51:22 -0000 > *************** _PG_init(void) > *** 42,47 **** > --- 42,57 ---- > if (inited) > return; > > + #ifdef ENABLE_NLS > + if (my_exec_path[0] != '\0') > + { > + char locale_path[MAXPGPATH]; > + > + get_locale_path(my_exec_path, locale_path); > + bindtextdomain(TEXTDOMAIN, locale_path); > + } > + #endif > + > plpgsql_HashTableInit(); > RegisterXactCallback(plpgsql_xact_cb, NULL); > RegisterSubXactCallback(plpgsql_subxact_cb, NULL);
Peter Eisentraut wrote: > Alvaro Herrera wrote: >> So this'd seem to be the version ready to be applied (plus the needed >> nls.mk files). > > Perhaps repeated code like the following should be refactored to a > common function offered by the backend? True; committed that way, with a new function in miscinit.c. Now we need the new catalogs to be added to the pgtranslations project files; the ball is in your (Peter's) court. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > True; committed that way, with a new function in miscinit.c. So apparently I broke buildfarm member grebe: /opt/prod/gcc-4.1.1/bin/gcc -maix64 -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels-fno-strict-aliasing -fwrapv -g -I../../../../src/include -I/opt/prod/tcl8.4.13_64_20060901/include -I/opt/freeware/include -c -o elog.o elog.c -MMD -MP -MF .deps/elog.Po elog.c: In function 'errmsg': elog.c:663: warning: implicit declaration of function 'dgettext' elog.c:663: warning: incompatible implicit declaration of built-in function 'dgettext' elog.c: In function 'errmsg_internal': elog.c:689: warning: incompatible implicit declaration of built-in function 'dgettext' elog.c: In function 'errdetail': elog.c:710: warning: incompatible implicit declaration of built-in function 'dgettext' elog.c: In function 'errdetail_log': elog.c:731: warning: incompatible implicit declaration of built-in function 'dgettext' elog.c: In function 'errhint': elog.c:752: warning: incompatible implicit declaration of built-in function 'dgettext' elog.c: In function 'errcontext': elog.c:777: warning: incompatible implicit declaration of built-in function 'dgettext' elog.c: In function 'elog_finish': elog.c:996: warning: incompatible implicit declaration of built-in function 'dgettext' ... /opt/prod/gcc-4.1.1/bin/gcc -maix64 -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels-fno-strict-aliasing -fwrapv -g -L../../src/port -Wl,-bbigtoc -L/opt/prod/tcl8.4.13_64_20060901/lib -L/opt/freeware/lib -Wl,-blibpath:/opt/rg/data_dba/build-farm/HEAD/inst/lib:/opt/prod/tcl8.4.13_64_20060901/lib:/opt/freeware/lib:/usr/lib:/lib access/common/heaptuple.oaccess/common/indextuple.o access/common/printtup.o access/common/reloptions.o [...] ld: 0711-317 ERROR: Undefined symbol: .dgettext ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. collect2: ld returned 8 exit status gmake[2]: *** [postgres] Error 1 gmake[2]: Leaving directory `/opt/rg/data_dba/build-farm/HEAD/pgsql.1126634/src/backend' gmake[1]: *** [all] Error 2 gmake[1]: Leaving directory `/opt/rg/data_/build-farm/HEAD/pgsql.1126634/src' gmake: *** [all] Error 2 -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Alvaro Herrera wrote: > > > True; committed that way, with a new function in miscinit.c. > > So apparently I broke buildfarm member grebe: Hmm, see bug #2608, http://archives.postgresql.org/pgsql-bugs/2006-09/msg00029.php I don't understand why gettext works and dgettext doesn't ... -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support