Thread: gettext, plural form and translation
Hi, all.

GNU gettext has support for correct plural form translation (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html), but PostgreSQL does not use it. Why not? Maybe it causes problems on some supported OS? If not, can it be implemented? Maybe someone is already doing this?

PS: I tried to translate the psql message "(1 row)/(3 rows)" but can't do it correctly without plural form support. Implementing it needs some work on the source and on the xgettext parameters used to extract messages for http://babel.postgresql.org/

Thanks for comments!

--
Sergey Burladyan
Sergey Burladyan wrote:
> Hi, all.
>
> gnu gettext have support for correct plural form translation
> (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> but postgresql does not use it. why not ?
> maybe it have some problem in some supported OS ? if not, can it implemented ?
> maybe someone already doing this ?
>
> ps: i try to translate psql message "(1 row)/(3 rows)" but can't do this
> correctly without plural form support.

You don't need plural forms in this example. We have three separate messages, one for "(No rows)", another one for the singular "(1 row)" and a third one for the plural "(N rows)".

We avoid mixing plurals and singulars. Is this still a problem for you somewhere?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote:
> Sergey Burladyan wrote:
> > Hi, all.
> >
> > gnu gettext have support for correct plural form translation
> > (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> > but postgresql does not use it. why not ?

> You don't need plural forms in this example. We have three separate
> messages, one for "(No rows)", another one for the singular "(1 row)"
> and a third one for the plural "(N rows)".

After reading the cited page it is clear that we need to improve our use of gettext.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wednesday 18 March 2009 11:21:03 Sergey Burladyan wrote:
> gnu gettext have support for correct plural form translation
> (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> but postgresql does not use it. why not ?
> maybe it have some problem in some supported OS ?

Yes, the main reason is that it is not clear whether this is supported on all OS, or moreover that I believe it is not. So some allowances for that will probably have to be made.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Sergey Burladyan wrote:
> > gnu gettext have support for correct plural form translation
> > (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> > but postgresql does not use it. why not ?
> > maybe it have some problem in some supported OS ? if not, can it implemented ?
> > maybe someone already doing this ?
> >
> > ps: i try to translate psql message "(1 row)/(3 rows)" but can't do this
> > correctly without plural form support.

> You don't need plural forms in this example. We have three separate
> messages, one for "(No rows)", another one for the singular "(1 row)"
> and a third one for the plural "(N rows)".

A single plural message is not enough for Russian, for example. Russian has three plural forms:

2 rows  | 2 zapisy
3 rows  | 3 zapisy
5 rows  | 5 zapisey
11 rows | 11 zapisey
21 rows | 21 zapis
etc.

> We avoid mixing plurals and singulars. Is this still a problem for you
> somewhere?

I don't know :) I see this untranslated message (N rows) every day; I tried to translate it and found this issue.

Peter Eisentraut <peter_e@gmx.net> writes:
> On Wednesday 18 March 2009 11:21:03 Sergey Burladyan wrote:
> > gnu gettext have support for correct plural form translation
> > (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> > but postgresql does not use it. why not ?
> > maybe it have some problem in some supported OS ?

> Yes, the main reason is that it is not clear whether this is supported on all
> OS, or moreover that I believe it is not. So some allowances for that will
> probably have to be made.

Maybe the build farm can help to test it?

I thought about the "(N rows)" message today and found another solution. I do not strictly need plural support for this message, because if I swap the position of the word and the number in the translated message, it reads correctly, something like:

(rows: N) | (zapisey: N) | (записей: N)

Is it correct to add ':' in the translated message? I think the colon is needed here because the meaningful part of the message no longer comes first...

PS: but changing the order looks like a hack, as if to say "this program does not support correct spelling and uses only one plural form" :) and the original order with different plural forms is closer to the original English text. Also, what about other languages? IMHO not all of them have a simple solution like changing the word order...

I think support for ngettext() should still be implemented. If some problem with it is found, it can be rejected, can't it? =)

--
Sergey Burladyan
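The three Russian forms in the examples above follow a fixed arithmetic rule. As a minimal sketch only (the names `ru_plural_index`, `ru_forms` and `format_ru_rows` are illustrative and not PostgreSQL or gettext code), the selection that the GNU gettext manual gives for Russian with `nplurals=3` could be written like this:

```c
#include <stdio.h>

/* Hypothetical helper, not part of PostgreSQL or gettext: returns the
 * index (0, 1 or 2) of the Russian plural form for n, following the rule
 * listed for Russian in the GNU gettext manual. */
int
ru_plural_index(unsigned long n)
{
    if (n % 10 == 1 && n % 100 != 11)
        return 0;               /* 1, 21, 31, ...: "zapis" */
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;               /* 2-4, 22-24, ...: "zapisy" */
    return 2;                   /* 0, 5-20, 25-30, ...: "zapisey" */
}

/* Transliterated forms taken from the examples in the message above. */
static const char *const ru_forms[] = {"zapis", "zapisy", "zapisey"};

/* Formats the count and the matching form into buf, e.g. "21 zapis". */
void
format_ru_rows(char *buf, size_t len, unsigned long n)
{
    snprintf(buf, len, "%lu %s", n, ru_forms[ru_plural_index(n)]);
}
```

With real plural support this arithmetic would live in the catalog's `Plural-Forms` header and be evaluated by `ngettext()`, so the program itself would never hard-code a language rule as this sketch does.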
Sergey Burladyan wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Yes, the main reason is that it is not clear whether this is supported on all
> > OS, or moreover that I believe it is not. So some allowances for that will
> > probably have to be made.
>
> maybe build farm can help to test it ?

Yes, I think we should implement it and see what happens with the buildfarm. If we stand still and do nothing, we won't be any wiser.

Care to submit a patch?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Sergey Burladyan wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
>
> > > Yes, the main reason is that it is not clear whether this is supported on all
> > > OS, or moreover that I believe it is not. So some allowances for that will
> > > probably have to be made.
> >
> > maybe build farm can help to test it ?
>
> Yes, I think we should implement it and see what happens with the
> buildfarm. If we stand still and do nothing, we won't be any wiser.
>
> Care to submit a patch?

I will try.

--
Sergey Burladyan
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Sergey Burladyan wrote:
>> maybe build farm can help to test it ?

> Yes, I think we should implement it and see what happens with the
> buildfarm. If we stand still and do nothing, we won't be any wiser.

The buildfarm is irrelevant to the fact that some platforms don't have ngettext.

If the patch is designed to use ngettext where available, and to be no worse than what we have where it isn't, then we could consider it.

regards, tom lane
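One way to read this requirement, as a rough sketch rather than any actual patch: choose the message through `ngettext()` when it exists, and otherwise keep the current behaviour of picking between two separately translated strings. `ENABLE_NLS` is PostgreSQL's existing build switch; `HAVE_NGETTEXT`, `plural_msg` and `format_row_footer` are names assumed here purely for illustration.

```c
#include <stdio.h>

#if defined(ENABLE_NLS) && defined(HAVE_NGETTEXT)
#include <libintl.h>
/* Full plural support: the message catalog decides which form to use. */
#define plural_msg(s, p, n) ngettext((s), (p), (n))
#elif defined(ENABLE_NLS)
#include <libintl.h>
/* No ngettext: as before, two independently translated messages. */
#define plural_msg(s, p, n) gettext((n) == 1 ? (s) : (p))
#else
/* NLS disabled: plain English singular/plural selection. */
#define plural_msg(s, p, n) ((n) == 1 ? (s) : (p))
#endif

/* Hypothetical caller: builds a psql-style row-count footer in buf. */
void
format_row_footer(char *buf, size_t len, unsigned long n)
{
    snprintf(buf, len, plural_msg("(%lu row)", "(%lu rows)", n), n);
}
```

The middle branch is what makes the fallback "no worse": languages with two forms behave exactly as they do today, and only languages that need more than two forms miss out on platforms without `ngettext()`.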
If the "(n rows)" is the *only* message that needs it then I think it would be simpler to just make it "(Rows: n)" instead. But I wouldn't be surprised if there were other messages with similar issues. -- greg
Greg Stark wrote:
> If the "(n rows)" is the *only* message that needs it then I think it
> would be simpler to just make it "(Rows: n)" instead. But I wouldn't
> be surprised if there were other messages with similar issues.

There are a few more, e.g.,

%d index pages have been deleted
%d connections
Identifier must be less than %d characters.
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Sergey Burladyan wrote:
>>> maybe build farm can help to test it ?
>
>> Yes, I think we should implement it and see what happens with the
>> buildfarm. If we stand still and do nothing, we won't be any wiser.
>
> The buildfarm is irrelevant to the fact that some platforms don't
> have ngettext.
>
> If the patch is designed to use ngettext where available, and to be no
> worse than what we have where it isn't, then we could consider it.

It depends also on what we *want* to target. I originally omitted the plural support because it was a GNU extension, and I wanted to support "standard" gettext implementations as well. (There was also a licensing consideration.) Solaris is the "original" implementation of this API, so it can serve as a reference point.

But it is open to debate whether that decision is more useful than the tradeoff it imposes.

What I read now, however, is that Solaris 9 introduced the GNU-compatible ngettext extension, which changes the above argument considerably.

Given the current information available to me, I would also be satisfied to require ngettext() and reject NLS support on platforms that don't provide it.
Peter Eisentraut <peter_e@gmx.net> writes:
> Greg Stark wrote:
>> If the "(n rows)" is the *only* message that needs it then I think it
>> would be simpler to just make it "(Rows: n)" instead. But I wouldn't
>> be surprised if there were other messages with similar issues.

> There are a few more, e.g.,

> %d index pages have been deleted
> %d connections
> Identifier must be less than %d characters.

What's supposed to happen when a message contains more than one number (for example, most of the vacuum activity messages)?

regards, tom lane
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Greg Stark wrote:
>>> If the "(n rows)" is the *only* message that needs it then I think it
>>> would be simpler to just make it "(Rows: n)" instead. But I wouldn't
>>> be surprised if there were other messages with similar issues.
>
>> There are a few more, e.g.,
>
>> %d index pages have been deleted
>> %d connections
>> Identifier must be less than %d characters.
>
> What's supposed to happen when a message contains more than one
> number (for example, most of the vacuum activity messages)?

Heh. Good point. That is not supported. It would obviously explode the API. But I agree it's a problem.

Btw., you can find out how much of a problem by using

    for x in $(find . -name "*.pot"); do msggrep -K -E -e '%[diu] [[:alpha:]].*%[diu] [[:alpha:]]' $x; done

and manually hand-filtering the rest. I count about 16 problem messages. They are mostly vacuum messages as well as messages of the kind "expected %d things but received only %d items". There are a number of additional messages that circumvent the problem by writing "expected %d things but received only %d", but that is not possible in all cases.
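The circumvention mentioned at the end can be sketched as follows: the plural choice is keyed on one number only, and the second number is left without a noun, so it needs no plural form of its own. `pick_plural` is a hypothetical, English-only stand-in for `ngettext()` so that the sketch compiles without libintl; `format_count_mismatch` and the exact message wording are likewise assumptions adapted from the example above.

```c
#include <stdio.h>

/* Hypothetical stand-in for ngettext(): applies the English rule only. */
const char *
pick_plural(const char *singular, const char *plural, unsigned long n)
{
    return (n == 1) ? singular : plural;
}

/* Only "expected" drives the plural form; "received" carries no noun,
 * so a single count is enough to select the message. */
void
format_count_mismatch(char *buf, size_t len, int expected, int received)
{
    snprintf(buf, len,
             pick_plural("expected %d item but received only %d",
                         "expected %d items but received only %d",
                         (unsigned long) expected),
             expected, received);
}
```

As the message notes, this rewording is not possible in all cases; a sentence that genuinely needs two counted nouns has no clean expression through a single-count interface.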
* Peter Eisentraut <peter_e@gmx.net> [090319 04:21]:

> It depends also on what we *want* to target. I originally omitted the
> plural support because it was a GNU extension, and I wanted to support
> "standard" gettext implementations as well. (There was also a licensing
> consideration.) Solaris is the "original" implementation of this API,
> so it can serve as a reference point.
>
> But it is open to debate whether that decision is more useful than the
> tradeoff it imposes.

Of course, ngettext can easily be "ported"; at its most basic level (equivalent to GNU's C locale implementation) it's:

    #ifdef NEED_NGETTEXT
    #define ngettext(s,p,n) gettext((n)==1?(s):(p))
    #endif

Or a real port function:

    const char *
    ngettext(const char *s, const char *p, unsigned long n)
    {
        return gettext(n == 1 ? s : p);
    }

a.

--
Aidan Van Dyk            Create like a god,
aidan@highrise.ca        command like a king,
http://www.highrise.ca/  work like a slave.
Aidan Van Dyk wrote:
> * Peter Eisentraut <peter_e@gmx.net> [090319 04:21]:
>
>> It depends also on what we *want* to target. I originally omitted the
>> plural support because it was a GNU extension, and I wanted to support
>> "standard" gettext implementations as well. (There was also a licensing
>> consideration.) Solaris is the "original" implementation of this API,
>> so it can serve as a reference point.
>>
>> But it is open to debate whether that decision is more useful than the
>> tradeoff it imposes.
>
> Of course, ngettext can easily be "ported", at it's most basic level
> (equivalent to GNU's C locale implentation) it's:
>
> #ifdef NEED_NGETTEXT
> #define ngettext(s,p,n) gettext((n)==1?(s):(p))
> #endif

The more interesting question is whether such a msgfmt will correctly process message catalogs containing things like

    #: print.c:2351
    #, c-format
    msgid "(1 row)"
    msgid_plural "(%lu rows)"
    msgstr[0] "(1 Zeile)"
    msgstr[1] "(%lu Zeilen)"

My guess is not.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Care to submit a patch? this is it, i divide it into two, first is change source and second is change ru.po file for psql. changelog: gettext-plural-test.patch - check ngettext in configure (HAVE_NGETTEXT), show warning if not. must be error, i agree with Peter, i think gettext without support of plural form can't compile .po file with it :(, but not sure, so for test it is only warning - new macros _P(s,p,n) for ngettext - HAVE_NGETTEXT always 1 in pg_config.h.win32 - psql, remove "(1 row)", switch this string into _P(...) macros gettext-plural-ru-test.patch: - correct translation for "1 rows" message *** a/config/programs.m4 --- b/config/programs.m4 *************** *** 193,198 **** AC_DEFUN([PGAC_CHECK_GETTEXT], --- 193,202 ---- [ AC_SEARCH_LIBS(bind_textdomain_codeset, intl, [], [AC_MSG_ERROR([a gettext implementation is required for NLS])]) + AC_SEARCH_LIBS(ngettext, intl, + [AC_DEFINE(HAVE_NGETTEXT, 1, + [Define to 1 if you have the ngettext function.])], + [AC_MSG_WARN([NLS broken, plural forms support required for compile .po files])]) AC_CHECK_HEADER([libintl.h], [], [AC_MSG_ERROR([header file <libintl.h> is required for NLS])]) AC_CHECK_PROGS(MSGFMT, msgfmt) *** a/configure --- b/configure *************** *** 26022,26027 **** echo "$as_me: error: a gettext implementation is required for NLS" >&2;} --- 26022,26117 ---- { (exit 1); exit 1; }; } fi + { echo "$as_me:$LINENO: checking for library containing ngettext" >&5 + echo $ECHO_N "checking for library containing ngettext... $ECHO_C" >&6; } + if test "${ac_cv_search_ngettext+set}" = set; then + echo $ECHO_N "(cached) $ECHO_C" >&6 + else + ac_func_search_save_LIBS=$LIBS + cat >conftest.$ac_ext <<_ACEOF + /* confdefs.h. */ + _ACEOF + cat confdefs.h >>conftest.$ac_ext + cat >>conftest.$ac_ext <<_ACEOF + /* end confdefs.h. */ + + /* Override any GCC internal prototype to avoid an error. 
+ Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ + #ifdef __cplusplus + extern "C" + #endif + char ngettext (); + int + main () + { + return ngettext (); + ; + return 0; + } + _ACEOF + for ac_lib in '' intl; do + if test -z "$ac_lib"; then + ac_res="none required" + else + ac_res=-l$ac_lib + LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + rm -f conftest.$ac_objext conftest$ac_exeext + if { (ac_try="$ac_link" + case "(($ac_try" in + *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; + *) ac_try_echo=$ac_try;; + esac + eval "echo \"\$as_me:$LINENO: $ac_try_echo\"") >&5 + (eval "$ac_link") 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && { + test -z "$ac_c_werror_flag" || + test ! -s conftest.err + } && test -s conftest$ac_exeext && + $as_test_x conftest$ac_exeext; then + ac_cv_search_ngettext=$ac_res + else + echo "$as_me: failed program was:" >&5 + sed 's/^/| /' conftest.$ac_ext >&5 + + + fi + + rm -f core conftest.err conftest.$ac_objext conftest_ipa8_conftest.oo \ + conftest$ac_exeext + if test "${ac_cv_search_ngettext+set}" = set; then + break + fi + done + if test "${ac_cv_search_ngettext+set}" = set; then + : + else + ac_cv_search_ngettext=no + fi + rm conftest.$ac_ext + LIBS=$ac_func_search_save_LIBS + fi + { echo "$as_me:$LINENO: result: $ac_cv_search_ngettext" >&5 + echo "${ECHO_T}$ac_cv_search_ngettext" >&6; } + ac_res=$ac_cv_search_ngettext + if test "$ac_res" != no; then + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + + cat >>confdefs.h <<\_ACEOF + #define HAVE_NGETTEXT 1 + _ACEOF + + else + { echo "$as_me:$LINENO: WARNING: NLS broken, plural forms support required for compile .po files" >&5 + echo "$as_me: WARNING: NLS broken, plural forms support required for compile .po files" >&2;} + fi + if test "${ac_cv_header_libintl_h+set}" = 
set; then { echo "$as_me:$LINENO: checking for libintl.h" >&5 echo $ECHO_N "checking for libintl.h... $ECHO_C" >&6; } *** a/src/bin/psql/print.c --- b/src/bin/psql/print.c *************** *** 2348,2357 **** printQuery(const PGresult *result, const printQueryOpt *opt, FILE *fout, FILE *f char default_footer[100]; total_records = opt->topt.prior_records + cont.nrows; ! if (total_records == 1) ! snprintf(default_footer, 100, _("(1 row)")); ! else ! snprintf(default_footer, 100, _("(%lu rows)"), total_records); printTableAddFooter(&cont, default_footer); } --- 2348,2354 ---- char default_footer[100]; total_records = opt->topt.prior_records + cont.nrows; ! snprintf(default_footer, 100, _P("(%lu row)", "(%lu rows)", total_records), total_records); printTableAddFooter(&cont, default_footer); } *** a/src/include/c.h --- b/src/include/c.h *************** *** 91,102 **** --- 91,108 ---- #include <locale.h> #define _(x) gettext(x) + #ifdef HAVE_NGETTEXT + #define _P(s,p,n) ngettext(s,p,n) + #else + #define _P(s,p,n) ((n) == 1 ? (s) : (p)) + #endif #ifdef ENABLE_NLS #include <libintl.h> #else #define gettext(x) (x) #define dgettext(d,x) (x) + #define ngettext(s,p,n) ((n) == 1 ? (s) : (p)) #endif /* *** a/src/include/pg_config.h.in --- b/src/include/pg_config.h.in *************** *** 321,326 **** --- 321,329 ---- /* Define to 1 if you have the <netinet/tcp.h> header file. */ #undef HAVE_NETINET_TCP_H + /* Define to 1 if you have the ngettext function. */ + #undef HAVE_NGETTEXT + /* Define to 1 if you have the `on_exit' function. */ #undef HAVE_ON_EXIT *** a/src/include/pg_config.h.win32 --- b/src/include/pg_config.h.win32 *************** *** 267,272 **** --- 267,275 ---- /* Define to 1 if you have the <netinet/tcp.h> header file. */ /* #undef HAVE_NETINET_TCP_H */ + /* Define to 1 if you have the 'ngettext' function. */ + #define HAVE_NGETTEXT 1 + /* Define to 1 if you have the `on_exit' function. 
*/ /* #undef HAVE_ON_EXIT */ *** a/src/bin/psql/po/ru.po --- b/src/bin/psql/po/ru.po *************** *** 19,25 **** msgid "" msgstr "" "Project-Id-Version: PostgreSQL 8.0\n" "POT-Creation-Date: 2005-01-17 19:06+0000\n" ! "PO-Revision-Date: 2005-01-17 15:36-0500\n" "Last-Translator: Serguei A. Mokhov <mokhov@cs.concordia.ca>\n" "Language-Team: pgsql-ru-general <pgsql-ru-general@postgresql.org>\n" "MIME-Version: 1.0\n" --- 19,25 ---- msgstr "" "Project-Id-Version: PostgreSQL 8.0\n" "POT-Creation-Date: 2005-01-17 19:06+0000\n" ! "PO-Revision-Date: 2009-03-20 05:19+0300\n" "Last-Translator: Serguei A. Mokhov <mokhov@cs.concordia.ca>\n" "Language-Team: pgsql-ru-general <pgsql-ru-general@postgresql.org>\n" "MIME-Version: 1.0\n" *************** *** 27,32 **** msgstr "" --- 27,34 ---- "Content-Transfer-Encoding: 8bit\n" "X-Poedit-Language: Russian\n" "X-Poedit-Country: RUSSIAN FEDERATION\n" + "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%" + "10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n" #: command.c:116 msgid "Warning: This syntax is deprecated.\n" *************** *** 930,943 **** msgstr " msgid "(No rows)\n" msgstr "(Нет записей)\n" ! #: print.c:1200 ! msgid "(1 row)" ! msgstr "(1 запись)" ! ! #: print.c:1202 #, c-format ! msgid "(%d rows)" ! msgstr "(записей: %d)" #: startup.c:138 #, c-format --- 932,944 ---- msgid "(No rows)\n" msgstr "(Нет записей)\n" ! #: print.c:2351 #, c-format ! msgid "(%lu row)" ! msgid_plural "(%lu rows)" ! msgstr[0] "(%lu строка)" ! msgstr[1] "(%lu строки)" ! msgstr[2] "(%lu строк)" #: startup.c:138 #, c-format -- Sergey Burladyan
Sergey Burladyan <eshkinkot@gmail.com> writes: > gettext-plural-ru-test.patch: > - correct translation for "1 rows" message hmmm... encoding is broken... i post it again in gzip -- Sergey Burladyan
Attachment
On Saturday 21 March 2009 01:01:57 Sergey Burladyan wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Care to submit a patch?
>
> this is it, i divide it into two, first is change source and second is
> change ru.po file for psql.

I have now committed a more extensive pluralization, but your case is included there.

As for the ru.po file, please see http://babel.postgresql.org/.
Peter Eisentraut <peter_e@gmx.net> writes:
> I have now committed a more extensive pluralization, but your case is included
> there.
>
> As for the ru.po file, please see http://babel.postgresql.org/.

Great! I am translating the 8.3 messages now. After this, I will move on to HEAD.

PS: By the way, when will 8.4 be released? Do I have time to translate HEAD before the release? I tried to find the 8.4 release date, but it says 1st March 2009 %)

--
Sergey Burladyan
On Monday 30 March 2009 at 15:21:38, Sergey Burladyan wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > I have now committed a more extensive pluralization, but your case is
> > included there.
> >
> > As for the ru.po file, please see http://babel.postgresql.org/.
>
> Great! I am translating 8.3 messages now. After this, i will go to HEAD.
>
> ps:
> By the way, when 8.4 will be released ? Have i time for translate HEAD
> before release will be ? I try to find 8.4 release date but it is 1st March
> 2009 %)
>

You have *some* time because 8.4 beta is not even out yet.

I suppose I should work on the translation too...

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com