Thread: Improving the ngettext() patch
After looking through the current uses of ngettext(), I think that it wouldn't be too difficult to modify the patch to address the concerns I had about it. What I propose doing is to add an additional elog.h function

    errmsg_plural(const char *fmt_singular, const char *fmt_plural,
                  unsigned long n, ...)

and replace the current errmsg(ngettext(...)) calls with this. Similarly add errdetail_plural to replace errdetail(ngettext(...)). (We could also add errhint_plural and so on, but right offhand these seem unlikely to be useful.) The advantage of doing this is that we avoid double translation and eliminate the current kluge whereby usages in PL code have to be different from usages anywhere else.

I don't feel a need to touch the usages in client programs (pg_dump and so on). In principle the double-translation risk still exists there, but it seems much less likely to be a real hazard because any one client program has a *far* smaller pool of translatable messages than the backend does. Also, there's only one active text domain in a client program, so the problem of needing to use dngettext in special cases doesn't exist.

There are a few usages of ngettext() in the backend that are not tied to ereport calls, but I think they can be left as-is. There's no double-translation risk, and with so few of them I don't see much of a risk of wrongly copying the usage in PL code, either.

Also: one thought that came to me while looking at the existing usages is that there are several places that are plural-ized that seem completely pointless; why are we making our translators work harder on them? For example

    ereport(ERROR,
            (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
             errmsg(ngettext("functions cannot have more than %d argument",
                             "functions cannot have more than %d arguments",
                             FUNC_MAX_ARGS),
                    FUNC_MAX_ARGS)));

It seems extremely far-fetched that FUNC_MAX_ARGS would ever be small enough that it would make any language's special cases kick in.

Or how about this one:

    #if 0
        write_msg(modulename, ngettext("read %lu byte into lookahead buffer\n",
                                       "read %lu bytes into lookahead buffer\n",
                                       AH->lookaheadLen),
                  (unsigned long) AH->lookaheadLen);
    #endif

I'm not sure why this debug support is still there at all, but surely it's a crummy candidate for making translators sweat over.

So I'd like to revert these. Comments, objections?

regards, tom lane
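
[Editorial sketch, not part of the original message.] The proposal above comes down to handing both untranslated format strings and the count to one helper, so that the catalog lookup happens exactly once inside it. The following is a minimal illustration of that shape only. The name plural_message() and its buffer-based interface are hypothetical stand-ins, not the proposed elog.h function, and a backend version would presumably need dngettext() with the caller's text domain rather than plain ngettext().

```c
#include <libintl.h>            /* ngettext() */
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the proposed errmsg_plural(): the caller passes
 * the raw singular and plural format strings plus the count, and the single
 * catalog lookup happens here.  Returns what vsnprintf() returns.
 */
int
plural_message(char *buf, size_t buflen,
               const char *fmt_singular, const char *fmt_plural,
               unsigned long n, ...)
{
    va_list     args;
    int         len;

    /* One lookup; with no catalog loaded this is the n == 1 choice. */
    const char *fmt = ngettext(fmt_singular, fmt_plural, n);

    va_start(args, n);
    len = vsnprintf(buf, buflen, fmt, args);
    va_end(args);

    return len;
}
```

A call would look like plural_message(buf, sizeof(buf), "functions cannot have more than %d argument", "functions cannot have more than %d arguments", FUNC_MAX_ARGS, FUNC_MAX_ARGS), with the count passed once to select the form and once more as the %d argument.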
Tom Lane <tgl@sss.pgh.pa.us> writes:
> ereport(ERROR,
>         (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
>          errmsg(ngettext("functions cannot have more than %d argument",
>                          "functions cannot have more than %d arguments",
>                          FUNC_MAX_ARGS),
>                 FUNC_MAX_ARGS)));
>
> It seems extremely far-fetched that FUNC_MAX_ARGS would ever be small
> enough that it would make any language's special cases kick in.

Russian plural forms for 100, 101, 102 etc. is different, as for 0, 1, 2.

-- 
Sergey Burladyan
> Russian plural forms for 100, 101, 102 etc. is different, as for 0, 1, 2.

True. The rule IIRC is that except for 11-14 and for collective numerals, declination follows the last digit.

It would be possible to generalize declination via a language-specific message-selector function, especially if the number of numerical complements were limited to 1.

How awkward would it be to re-word the style of messages to avoid declination? For example, the Russian equivalent of "X rows" could be something like "#rows -- X".

David Hudson
pg@thetdh.com writes:
>> Russian plural forms for 100, 101, 102 etc. is different, as for 0, 1, 2.

> True. The rule IIRC is that except for 11-14 and for collective numerals, declination follows the last digit.

Wow. So how does anyone represent that in the .po files? AFAICT the notation the gettext machinery provides isn't really powerful enough for this.

regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [090604 10:22]:
> pg@thetdh.com writes:
> >> Russian plural forms for 100, 101, 102 etc. is different, as for 0, 1, 2.
>
> > True. The rule IIRC is that except for 11-14 and for collective numerals, declination follows the last digit.
>
> Wow. So how does anyone represent that in the .po files? AFAICT the
> notation the gettext machinery provides isn't really powerful enough
> for this.

Well, the C/English "template" one includes just the msgid and msgid_plural strings.

When the Russian translators get to it, they make a Russian .po which contains (something like) the following in the msgid "" header:

"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"

And then they provide msgstr[0], msgstr[1], and msgstr[2] to fill the 3 slots that the above Plural-Forms can use when translating plural-form strings.

It's all encapsulated in the gettext tools and libraries, and the C (non-translated) base just always uses ngettext(single, plural, n). ngettext will use whatever plural forms the compiled catalog specifies, or fall back to the simple n == 1 ? singular : plural type of choice when no translated catalog is available.

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
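
[Editorial sketch, not part of the original message.] To make the mapping concrete, the Plural-Forms expression quoted above can be written out as an ordinary C function that returns the msgstr[] slot for a given n. The function names here are illustrative only; gettext itself parses the header expression at run time rather than compiling anything like this.

```c
/*
 * Slot selection for Russian, transcribed from the quoted header:
 *   plural = n%10==1 && n%100!=11 ? 0
 *          : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
 */
unsigned long
russian_plural_index(unsigned long n)
{
    /* 1, 21, 31, ..., 101, ... but not 11, 111, ... */
    if (n % 10 == 1 && n % 100 != 11)
        return 0;

    /* 2-4, 22-24, ..., 102-104, ... but not 12-14, 112-114, ... */
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;

    /* everything else: 0, 5-20, 25-30, ..., 100, 111-114, ... */
    return 2;
}

/*
 * The fallback described above when no translated catalog is available:
 * slot 0 is the singular string, slot 1 the plural string.
 */
unsigned long
default_plural_index(unsigned long n)
{
    return n == 1 ? 0 : 1;
}
```

Under this rule 100, 101 and 102 land in slots 2, 0 and 1 respectively, which is the point made earlier in the thread about FUNC_MAX_ARGS-sized values.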
Aidan Van Dyk <aidan@highrise.ca> writes:
> When the Russian translators get to it, they make a Russian .po which
> contains (something like) the following in the msgid "" header:
> "Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"

Oh, I see. I didn't realize there was a mapping mechanism available to the translator.

Okay, so the bottom line there is that there is some value in pluralizing the messages about FUNC_MAX_ARGS --- I withdraw the suggestion to undo that. Anyone wish to defend the ones that are ifdef'd out?

regards, tom lane
(Grrr, declension, not declination.)

> "Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"

Thanks. The above (ignoring backslash-EOL) is the form recommended for Russian (inter alia(s)) in the Texinfo manual for gettext ("info gettext"). FWIW this might be an alternative:

"Plural-Forms: nplurals=3; plural=((n - 1) % 10) >= (5-1) || (((n - 1) % 100) <= (14-1) && ((n - 1) % 100) >= (11 - 1)) ? 2 : ((n - 1) % 10) == (1 - 1) ? 0 : 1;\n"

David Hudson
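
[Editorial sketch, not part of the original message.] One way to sanity-check the alternative expression is to transcribe both formulas into C and compare them over a range of n. The sketch below assumes n is evaluated as unsigned long, matching the type of ngettext()'s count argument. That matters for n = 0, where n - 1 wraps around instead of going negative; with signed arithmetic the alternative would pick slot 1 for zero, while the manual's form picks slot 2. The function names are illustrative.

```c
/* The form quoted from the gettext manual. */
unsigned long
plural_standard(unsigned long n)
{
    return n % 10 == 1 && n % 100 != 11 ? 0
        : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1
        : 2;
}

/* The proposed alternative, transcribed term by term. */
unsigned long
plural_alternative(unsigned long n)
{
    return ((n - 1) % 10) >= (5 - 1) ||
           (((n - 1) % 100) <= (14 - 1) && ((n - 1) % 100) >= (11 - 1))
        ? 2
        : ((n - 1) % 10) == (1 - 1) ? 0 : 1;
}

/* Count the values in [0, limit) for which the two formulas disagree. */
unsigned long
count_mismatches(unsigned long limit)
{
    unsigned long n;
    unsigned long mismatches = 0;

    for (n = 0; n < limit; n++)
    {
        if (plural_standard(n) != plural_alternative(n))
            mismatches++;
    }
    return mismatches;
}
```

Since both expressions depend only on n modulo 100 once n is at least 1, a limit of a few hundred already covers every distinct case apart from the wraparound at zero.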