Thread: gettext, plural form and translation

From
Sergey Burladyan
Date:
Hi, all.

GNU gettext supports correct plural-form translation
(http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
but PostgreSQL does not use it. Why not?
Maybe it is a problem on some supported OS? If not, can it be implemented?
Maybe someone is already working on this?

PS: I tried to translate the psql message "(1 row)/(3 rows)" but cannot do
this correctly without plural-form support.

Implementing it needs some work on the source and on the xgettext parameters
used to extract messages for http://babel.postgresql.org/

Thanks for any comments!

-- 
Sergey Burladyan


Re: gettext, plural form and translation

From
Alvaro Herrera
Date:
Sergey Burladyan escribió:
> Hi, all.
> 
> gnu gettext have support for correct plural form translation
> (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> but postgresql does not use it. why not ?
> maybe it have some problem in some supported OS ? if not, can it implemented ?
> maybe someone already doing this ?
> 
> ps: i try to translate psql message "(1 row)/(3 rows)" but can't do this
> correctly without plural form support.

You don't need plural forms in this example.  We have three separate
messages, one for "(No rows)", another one for the singular "(1 row)"
and a third one for the plural "(N rows)".

We avoid mixing plurals and singulars.  Is this still a problem for you
somewhere?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: gettext, plural form and translation

From
Alvaro Herrera
Date:
Alvaro Herrera escribió:
> Sergey Burladyan escribió:
> > Hi, all.
> > 
> > gnu gettext have support for correct plural form translation
> > (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> > but postgresql does not use it. why not ?

> You don't need plural forms in this example.  We have three separate
> messages, one for "(No rows)", another one for the singular "(1 row)"
> and a third one for the plural "(N rows)".

After reading the cited page it is clear that we need to improve our use
of gettext.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: gettext, plural form and translation

From
Peter Eisentraut
Date:
On Wednesday 18 March 2009 11:21:03 Sergey Burladyan wrote:
> gnu gettext have support for correct plural form translation
> (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> but postgresql does not use it. why not ?
> maybe it have some problem in some supported OS ?

Yes, the main reason is that it is not clear whether this is supported on all 
OS, or moreover that I believe it is not.  So some allowances for that will 
probably have to be made.


Re: gettext, plural form and translation

From
Sergey Burladyan
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:

> Sergey Burladyan escribió:
> > gnu gettext have support for correct plural form translation
> > (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> > but postgresql does not use it. why not ?
> > maybe it have some problem in some supported OS ? if not, can it implemented ?
> > maybe someone already doing this ?
> >
> > ps: i try to translate psql message "(1 row)/(3 rows)" but can't do this
> > correctly without plural form support.

> You don't need plural forms in this example.  We have three separate
> messages, one for "(No rows)", another one for the singular "(1 row)"
> and a third one for the plural "(N rows)".

A single plural message is not enough for Russian, for example. Russian
has three plural forms:

2 rows  | 2 zapisy
3 rows  | 3 zapisy
5 rows  | 5 zapisey
11 rows | 11 zapisey
21 rows | 21 zapis
etc
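
The table above can be sketched in C. This is a minimal illustration, not PostgreSQL or gettext code: the function names are hypothetical, the selection rule is the one commonly used in the Plural-Forms header of Russian catalogs, and the word forms are the transliterations from the table.

```c
#include <assert.h>
#include <stdio.h>

/* Index (0, 1 or 2) of the Russian plural form to use for n.
 * Assumption: the rule commonly found in Russian Plural-Forms headers. */
int
russian_plural_index(unsigned long n)
{
    if (n % 10 == 1 && n % 100 != 11)
        return 0;               /* 1, 21, 31, ... */
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;               /* 2-4, 22-24, ... */
    return 2;                   /* 0, 5-20, 25-30, ... */
}

/* Transliterated word forms from the table, indexed by plural form. */
static const char *const russian_row_forms[3] = {
    "zapis", "zapisy", "zapisey"
};

/* Writes e.g. "21 zapis" into buf; returns what snprintf returns. */
int
format_row_count(char *buf, size_t size, unsigned long n)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lu %s",
                    n, russian_row_forms[russian_plural_index(n)]);
}
```

With only a singular and a plural message, 5 and 21 would have to share one string, which is exactly the case the table shows to be wrong.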

> We avoid mixing plurals and singulars.  Is this still a problem for you
> somewhere?

I don't know :) I see this untranslated message (N rows) every day, tried
to translate it, and ran into this issue.

Peter Eisentraut <peter_e@gmx.net> writes:

> On Wednesday 18 March 2009 11:21:03 Sergey Burladyan wrote:
> > gnu gettext have support for correct plural form translation
> > (http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html),
> > but postgresql does not use it. why not ?
> > maybe it have some problem in some supported OS ?

> Yes, the main reason is that it is not clear whether this is supported on all
> OS, or moreover that I believe it is not.  So some allowances for that will
> probably have to be made.

Maybe the build farm can help to test it?


Thinking about the "(N rows)" message today, I found another solution. I do
not strictly need plural support for this message: if I swap the position of
the word and the number in the translated message, it reads correctly,
something like:
(rows: N) | (zapisey: N) | (записей: N)

Is it correct to add ':' to the translated message? I think the colon is
needed here because the meaningful part of the message no longer comes first...

PS: But this reordering looks like a hack, and it looks like "this program
does not support correct spelling and uses only one plural form" :)
The original order with different plural forms is also closer to the original
English text.

Also, what about other languages? IMHO not all of them have a simple solution
like changing the word order...

I think support for ngettext() should still be implemented. If some problem
with it is found, it can be rejected, can't it? =)

--
Sergey Burladyan


Re: gettext, plural form and translation

From
Alvaro Herrera
Date:
Sergey Burladyan escribió:
> Alvaro Herrera <alvherre@commandprompt.com> writes:

> > Yes, the main reason is that it is not clear whether this is supported on all 
> > OS, or moreover that I believe it is not.  So some allowances for that will 
> > probably have to be made.
> 
> maybe build farm can help to test it ?

Yes, I think we should implement it and see what happens with the
buildfarm.  If we stand still and do nothing, we won't be any wiser.

Care to submit a patch?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: gettext, plural form and translation

From
Sergey Burladyan
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:

> Sergey Burladyan escribió:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
>
> > > Yes, the main reason is that it is not clear whether this is supported on all
> > > OS, or moreover that I believe it is not.  So some allowances for that will
> > > probably have to be made.
> >
> > maybe build farm can help to test it ?
>
> Yes, I think we should implement it and see what happens with the
> buildfarm.  If we stand still and do nothing, we won't be any wiser.
>
> Care to submit a patch?

I will try.

--
Sergey Burladyan


Re: gettext, plural form and translation

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Sergey Burladyan escribió:
>> maybe build farm can help to test it ?

> Yes, I think we should implement it and see what happens with the
> buildfarm.  If we stand still and do nothing, we won't be any wiser.

The buildfarm is irrelevant to the fact that some platforms don't
have ngettext.

If the patch is designed to use ngettext where available, and to be no
worse than what we have where it isn't, then we could consider it.
        regards, tom lane


Re: gettext, plural form and translation

From
Greg Stark
Date:
If the "(n rows)" is the *only* message that needs it then I think it
would be simpler to just make it "(Rows: n)" instead. But I wouldn't
be surprised if there were other messages with similar issues.

-- 
greg


Re: gettext, plural form and translation

From
Peter Eisentraut
Date:
Greg Stark wrote:
> If the "(n rows)" is the *only* message that needs it then I think it
> would be simpler to just make it "(Rows: n)" instead. But I wouldn't
> be surprised if there were other messages with similar issues.

There are a few more, e.g.,

%d index pages have been deleted
%d connections
Identifier must be less than %d characters.


Re: gettext, plural form and translation

From
Peter Eisentraut
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Sergey Burladyan escribió:
>>> maybe build farm can help to test it ?
> 
>> Yes, I think we should implement it and see what happens with the
>> buildfarm.  If we stand still and do nothing, we won't be any wiser.
> 
> The buildfarm is irrelevant to the fact that some platforms don't
> have ngettext.
> 
> If the patch is designed to use ngettext where available, and to be no
> worse than what we have where it isn't, then we could consider it.

It depends also on what we *want* to target.  I originally omitted the 
plural support because it was a GNU extension, and I wanted to support 
"standard" gettext implementations as well.  (There was also a licensing 
consideration.)  Solaris is the "original" implementation of this API, 
so it can serve as a reference point.

But it is open to debate whether that decision is more useful than the 
tradeoff it imposes.

What I read now, however, is that Solaris 9 introduced the 
GNU-compatible ngettext extension, which changes the above argument 
considerably.

Given the current information available to me, I would also be satisfied 
to require ngettext() and reject NLS support on platforms that don't 
provide it.


Re: gettext, plural form and translation

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Greg Stark wrote:
>> If the "(n rows)" is the *only* message that needs it then I think it
>> would be simpler to just make it "(Rows: n)" instead. But I wouldn't
>> be surprised if there were other messages with similar issues.

> There are a few more, e.g.,

> %d index pages have been deleted
> %d connections
> Identifier must be less than %d characters.

What's supposed to happen when a message contains more than one
number (for example, most of the vacuum activity messages)?
        regards, tom lane


Re: gettext, plural form and translation

From
Peter Eisentraut
Date:
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Greg Stark wrote:
>>> If the "(n rows)" is the *only* message that needs it then I think it
>>> would be simpler to just make it "(Rows: n)" instead. But I wouldn't
>>> be surprised if there were other messages with similar issues.
> 
>> There are a few more, e.g.,
> 
>> %d index pages have been deleted
>> %d connections
>> Identifier must be less than %d characters.
> 
> What's supposed to happen when a message contains more than one
> number (for example, most of the vacuum activity messages)?

Heh.  Good point.  That is not supported.  It would obviously explode 
the API.  But I agree it's a problem.

Btw., you can find out how much of a problem by using

for x in $(find . -name "*.pot"); do
    msggrep -K -E -e '%[diu] [[:alpha:]].*%[diu] [[:alpha:]]' "$x"
done

and manually hand-filtering the rest.  I count about 16 problem 
messages.  They are mostly vacuum messages as well as messages of the 
kind "expected %d things but received only %d items".  There are a 
number of additional messages that circumvent the problem by writing 
"expected %d things but received only %d", but that is not possible in 
all cases.
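
The workaround described above, where only one of the two numbers carries a noun, can be sketched in C as follows. This is an illustration under stated assumptions: pick_plural() is a hypothetical stand-in for ngettext() that applies only the English rule and does no catalog lookup, format_shortfall() is likewise a made-up name, and both counts are assumed to be non-negative.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for ngettext(): English rule, no translation. */
const char *
pick_plural(const char *singular, const char *plural, unsigned long n)
{
    return n == 1 ? singular : plural;
}

/* Only "expected" is followed by a noun, so a single plural selection
 * on that count is enough; "received" is printed as a bare number. */
int
format_shortfall(char *buf, size_t size, int expected, int received)
{
    const char *fmt;

    assert(buf != NULL && size > 0);
    fmt = pick_plural("expected %d item but received only %d",
                      "expected %d items but received only %d",
                      (unsigned long) expected);
    return snprintf(buf, size, fmt, expected, received);
}
```

A message such as "expected %d things but received only %d items" has no equivalent of this, because a single selection cannot pick forms for two independent counts.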


Re: gettext, plural form and translation

From
Aidan Van Dyk
Date:
* Peter Eisentraut <peter_e@gmx.net> [090319 04:21]:

> It depends also on what we *want* to target.  I originally omitted the  
> plural support because it was a GNU extension, and I wanted to support  
> "standard" gettext implementations as well.  (There was also a licensing  
> consideration.)  Solaris is the "original" implementation of this API,  
> so it can serve as a reference point.
>
> But it is open to debate whether that decision is more useful than the  
> tradeoff it imposes.

Of course, ngettext can easily be "ported"; at its most basic level
(equivalent to GNU's C locale implementation) it's:

    #ifdef NEED_NGETTEXT
    #define ngettext(s,p,n)    gettext((n)==1?(s):(p))
    #endif

Or a real port function:

    const char *
    ngettext(const char *s, const char *p, int n)
    {
        return gettext(n == 1 ? s : p);
    }

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: gettext, plural form and translation

From
Peter Eisentraut
Date:
Aidan Van Dyk wrote:
> * Peter Eisentraut <peter_e@gmx.net> [090319 04:21]:
> 
>> It depends also on what we *want* to target.  I originally omitted the  
>> plural support because it was a GNU extension, and I wanted to support  
>> "standard" gettext implementations as well.  (There was also a licensing  
>> consideration.)  Solaris is the "original" implementation of this API,  
>> so it can serve as a reference point.
>>
>> But it is open to debate whether that decision is more useful than the  
>> tradeoff it imposes.
> 
> Of course, ngettext can easily be "ported", at it's most basic level
> (equivalent to GNU's C locale implentation) it's:
> 
>     #ifdef NEED_NGETTEXT
>     #define ngettext(s,p,n)    gettext((n)==1?(s):(p))
>     #endif

The more interesting question is whether such a msgfmt will correctly 
process message catalogs containing things like

#: print.c:2351
#, c-format
msgid "(1 row)"
msgid_plural "(%lu rows)"
msgstr[0] "(1 Zeile)"
msgstr[1] "(%lu Zeilen)"

My guess is not.
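
For reference, the way an entry like the one above is meant to be resolved at run time can be modelled in a few lines of C. This is only a sketch: struct plural_entry and the function names are hypothetical and do not reflect gettext's real catalog layout, and the German rule "nplurals=2; plural=(n != 1)" is assumed rather than taken from this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical in-memory model of one plural entry. */
struct plural_entry
{
    const char *msgid;
    const char *msgid_plural;
    const char *msgstr[2];      /* msgstr[0] and msgstr[1] */
};

/* Assumed German rule: form 0 for n == 1, form 1 otherwise. */
int
german_plural_index(unsigned long n)
{
    return n != 1;
}

/* Returns the translated form, or the English msgid if untranslated. */
const char *
lookup_plural(const struct plural_entry *e, unsigned long n)
{
    const char *translated;

    assert(e != NULL);
    translated = e->msgstr[german_plural_index(n)];
    if (translated != NULL)
        return translated;
    return n == 1 ? e->msgid : e->msgid_plural;
}

/* Formats the psql-style footer using the entry quoted above. */
int
format_footer(char *buf, size_t size, unsigned long n)
{
    static const struct plural_entry rows = {
        "(1 row)", "(%lu rows)", {"(1 Zeile)", "(%lu Zeilen)"}
    };

    /* The singular form has no conversion; the extra argument is ignored. */
    return snprintf(buf, size, lookup_plural(&rows, n), n);
}
```

A msgfmt that does not understand msgid_plural and msgstr[N] cannot produce the array this lookup depends on, which is the concern raised above.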


Re: gettext, plural form and translation

From
Sergey Burladyan
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:

> Care to submit a patch?

Here it is. I split it in two: the first patch changes the source and the
second changes the ru.po file for psql.

changelog:

 gettext-plural-test.patch
 - check for ngettext in configure (HAVE_NGETTEXT) and show a warning if it is
 missing. It should be an error: I agree with Peter, and I think a gettext
 without plural-form support cannot compile a .po file that uses plural forms
 :(, but I am not sure, so for testing it is only a warning
 - new macro _P(s,p,n) for ngettext
 - HAVE_NGETTEXT is always 1 in pg_config.h.win32
 - psql: remove "(1 row)" and switch this string to the _P(...) macro

 gettext-plural-ru-test.patch:
 - correct translation for the "1 rows" message

*** a/config/programs.m4
--- b/config/programs.m4
***************
*** 193,198 **** AC_DEFUN([PGAC_CHECK_GETTEXT],
--- 193,202 ----
  [
    AC_SEARCH_LIBS(bind_textdomain_codeset, intl, [],
                   [AC_MSG_ERROR([a gettext implementation is required for NLS])])
+   AC_SEARCH_LIBS(ngettext, intl,
+                  [AC_DEFINE(HAVE_NGETTEXT, 1,
+                             [Define to 1 if you have the ngettext function.])],
+                  [AC_MSG_WARN([NLS broken, plural forms support required for compile .po files])])
    AC_CHECK_HEADER([libintl.h], [],
                    [AC_MSG_ERROR([header file <libintl.h> is required for NLS])])
    AC_CHECK_PROGS(MSGFMT, msgfmt)
*** a/configure
--- b/configure
***************
*** 26022,26027 **** echo "$as_me: error: a gettext implementation is required for NLS" >&2;}
--- 26022,26117 ----
     { (exit 1); exit 1; }; }
  fi

+   { echo "$as_me:$LINENO: checking for library containing ngettext" >&5
+ echo $ECHO_N "checking for library containing ngettext... $ECHO_C" >&6; }
+ if test "${ac_cv_search_ngettext+set}" = set; then
+   echo $ECHO_N "(cached) $ECHO_C" >&6
+ else
+   ac_func_search_save_LIBS=$LIBS
+ cat >conftest.$ac_ext <<_ACEOF
+ /* confdefs.h.  */
+ _ACEOF
+ cat confdefs.h >>conftest.$ac_ext
+ cat >>conftest.$ac_ext <<_ACEOF
+ /* end confdefs.h.  */
+
+ /* Override any GCC internal prototype to avoid an error.
+    Use char because int might match the return type of a GCC
+    builtin and then its argument prototype would still apply.  */
+ #ifdef __cplusplus
+ extern "C"
+ #endif
+ char ngettext ();
+ int
+ main ()
+ {
+ return ngettext ();
+   ;
+   return 0;
+ }
+ _ACEOF
+ for ac_lib in '' intl; do
+   if test -z "$ac_lib"; then
+     ac_res="none required"
+   else
+     ac_res=-l$ac_lib
+     LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+   fi
+   rm -f conftest.$ac_objext conftest$ac_exeext
+ if { (ac_try="$ac_link"
+ case "(($ac_try" in
+   *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
+   *) ac_try_echo=$ac_try;;
+ esac
+ eval "echo \"\$as_me:$LINENO: $ac_try_echo\"") >&5
+   (eval "$ac_link") 2>conftest.er1
+   ac_status=$?
+   grep -v '^ *+' conftest.er1 >conftest.err
+   rm -f conftest.er1
+   cat conftest.err >&5
+   echo "$as_me:$LINENO: \$? = $ac_status" >&5
+   (exit $ac_status); } && {
+      test -z "$ac_c_werror_flag" ||
+      test ! -s conftest.err
+        } && test -s conftest$ac_exeext &&
+        $as_test_x conftest$ac_exeext; then
+   ac_cv_search_ngettext=$ac_res
+ else
+   echo "$as_me: failed program was:" >&5
+ sed 's/^/| /' conftest.$ac_ext >&5
+
+
+ fi
+
+ rm -f core conftest.err conftest.$ac_objext conftest_ipa8_conftest.oo \
+       conftest$ac_exeext
+   if test "${ac_cv_search_ngettext+set}" = set; then
+   break
+ fi
+ done
+ if test "${ac_cv_search_ngettext+set}" = set; then
+   :
+ else
+   ac_cv_search_ngettext=no
+ fi
+ rm conftest.$ac_ext
+ LIBS=$ac_func_search_save_LIBS
+ fi
+ { echo "$as_me:$LINENO: result: $ac_cv_search_ngettext" >&5
+ echo "${ECHO_T}$ac_cv_search_ngettext" >&6; }
+ ac_res=$ac_cv_search_ngettext
+ if test "$ac_res" != no; then
+   test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+ cat >>confdefs.h <<\_ACEOF
+ #define HAVE_NGETTEXT 1
+ _ACEOF
+
+ else
+   { echo "$as_me:$LINENO: WARNING: NLS broken, plural forms support required for compile .po files" >&5
+ echo "$as_me: WARNING: NLS broken, plural forms support required for compile .po files" >&2;}
+ fi
+
    if test "${ac_cv_header_libintl_h+set}" = set; then
    { echo "$as_me:$LINENO: checking for libintl.h" >&5
  echo $ECHO_N "checking for libintl.h... $ECHO_C" >&6; }
*** a/src/bin/psql/print.c
--- b/src/bin/psql/print.c
***************
*** 2348,2357 **** printQuery(const PGresult *result, const printQueryOpt *opt, FILE *fout, FILE *f
          char        default_footer[100];

          total_records = opt->topt.prior_records + cont.nrows;
!         if (total_records == 1)
!             snprintf(default_footer, 100, _("(1 row)"));
!         else
!             snprintf(default_footer, 100, _("(%lu rows)"), total_records);

          printTableAddFooter(&cont, default_footer);
      }
--- 2348,2354 ----
          char        default_footer[100];

          total_records = opt->topt.prior_records + cont.nrows;
!         snprintf(default_footer, 100, _P("(%lu row)", "(%lu rows)", total_records), total_records);

          printTableAddFooter(&cont, default_footer);
      }
*** a/src/include/c.h
--- b/src/include/c.h
***************
*** 91,102 ****
--- 91,108 ----
  #include <locale.h>

  #define _(x) gettext(x)
+ #ifdef HAVE_NGETTEXT
+ #define _P(s,p,n) ngettext(s,p,n)
+ #else
+ #define _P(s,p,n) ((n) == 1 ? (s) : (p))
+ #endif

  #ifdef ENABLE_NLS
  #include <libintl.h>
  #else
  #define gettext(x) (x)
  #define dgettext(d,x) (x)
+ #define ngettext(s,p,n) ((n) == 1 ? (s) : (p))
  #endif

  /*
*** a/src/include/pg_config.h.in
--- b/src/include/pg_config.h.in
***************
*** 321,326 ****
--- 321,329 ----
  /* Define to 1 if you have the <netinet/tcp.h> header file. */
  #undef HAVE_NETINET_TCP_H

+ /* Define to 1 if you have the ngettext function. */
+ #undef HAVE_NGETTEXT
+
  /* Define to 1 if you have the `on_exit' function. */
  #undef HAVE_ON_EXIT

*** a/src/include/pg_config.h.win32
--- b/src/include/pg_config.h.win32
***************
*** 267,272 ****
--- 267,275 ----
  /* Define to 1 if you have the <netinet/tcp.h> header file. */
  /* #undef HAVE_NETINET_TCP_H */

+ /* Define to 1 if you have the 'ngettext' function. */
+ #define HAVE_NGETTEXT 1
+
  /* Define to 1 if you have the `on_exit' function. */
  /* #undef HAVE_ON_EXIT */

*** a/src/bin/psql/po/ru.po
--- b/src/bin/psql/po/ru.po
***************
*** 19,25 **** msgid ""
  msgstr ""
  "Project-Id-Version: PostgreSQL 8.0\n"
  "POT-Creation-Date: 2005-01-17 19:06+0000\n"
! "PO-Revision-Date: 2005-01-17 15:36-0500\n"
  "Last-Translator: Serguei A. Mokhov <mokhov@cs.concordia.ca>\n"
  "Language-Team: pgsql-ru-general <pgsql-ru-general@postgresql.org>\n"
  "MIME-Version: 1.0\n"
--- 19,25 ----
  msgstr ""
  "Project-Id-Version: PostgreSQL 8.0\n"
  "POT-Creation-Date: 2005-01-17 19:06+0000\n"
! "PO-Revision-Date: 2009-03-20 05:19+0300\n"
  "Last-Translator: Serguei A. Mokhov <mokhov@cs.concordia.ca>\n"
  "Language-Team: pgsql-ru-general <pgsql-ru-general@postgresql.org>\n"
  "MIME-Version: 1.0\n"
***************
*** 27,32 **** msgstr ""
--- 27,34 ----
  "Content-Transfer-Encoding: 8bit\n"
  "X-Poedit-Language: Russian\n"
  "X-Poedit-Country: RUSSIAN FEDERATION\n"
+ "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%"
+ "10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"

  #: command.c:116
  msgid "Warning: This syntax is deprecated.\n"
***************
*** 930,943 **** msgstr "
  msgid "(No rows)\n"
  msgstr "(Нет записей)\n"

! #: print.c:1200
! msgid "(1 row)"
! msgstr "(1 запись)"
!
! #: print.c:1202
  #, c-format
! msgid "(%d rows)"
! msgstr "(записей: %d)"

  #: startup.c:138
  #, c-format
--- 932,944 ----
  msgid "(No rows)\n"
  msgstr "(Нет записей)\n"

! #: print.c:2351
  #, c-format
! msgid "(%lu row)"
! msgid_plural "(%lu rows)"
! msgstr[0] "(%lu строка)"
! msgstr[1] "(%lu строки)"
! msgstr[2] "(%lu строк)"

  #: startup.c:138
  #, c-format

--
Sergey Burladyan

Re: gettext, plural form and translation

From
Sergey Burladyan
Date:
Sergey Burladyan <eshkinkot@gmail.com> writes:

>  gettext-plural-ru-test.patch:
>  - correct translation for "1 rows" message

Hmmm... the encoding is broken... I am posting it again, gzipped.

--
Sergey Burladyan

Attachment

Re: gettext, plural form and translation

From
Peter Eisentraut
Date:
On Saturday 21 March 2009 01:01:57 Sergey Burladyan wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Care to submit a patch?
>
> this is it, i divide it into two, first is change source and second is
> change ru.po file for psql.

I have now committed a more extensive pluralization, but your case is included 
there.

As for the ru.po file, please see http://babel.postgresql.org/.



Re: gettext, plural form and translation

From
Sergey Burladyan
Date:
Peter Eisentraut <peter_e@gmx.net> writes:

> I have now committed a more extensive pluralization, but your case is included 
> there.
> 
> As for the ru.po file, please see http://babel.postgresql.org/.

Great! I am translating the 8.3 messages now. After this, I will move on to HEAD.

PS:
By the way, when will 8.4 be released? Do I have time to translate HEAD before
the release? I tried to find the 8.4 release date, but it says 1st March 2009 %)

-- 
Sergey Burladyan


Re: gettext, plural form and translation

From
Guillaume Lelarge
Date:
Le lundi 30 mars 2009 à 15:21:38, Sergey Burladyan a écrit :
> Peter Eisentraut <peter_e@gmx.net> writes:
> > I have now committed a more extensive pluralization, but your case is
> > included there.
> >
> > As for the ru.po file, please see http://babel.postgresql.org/.
>
> Great! I am translating 8.3 messages now. After this, i will go to HEAD.
>
> ps:
> By the way, when 8.4 will be released ? Have i time for translate HEAD
> before release will be ? I try to find 8.4 release date but it is 1st March
> 2009 %)
>

You have *some* time because 8.4 beta is not even out yet. I suppose I should
work on the translation too...


--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com