Improving the ngettext() patch
After looking through the current uses of ngettext(), I think that it
wouldn't be too difficult to modify the patch to address the concerns
I had about it. What I propose doing is to add an additional elog.h
function
    errmsg_plural(const char *fmt_singular, const char *fmt_plural,
                  unsigned long n, ...)
and replace the current errmsg(ngettext(...)) calls with this.
Similarly add errdetail_plural to replace errdetail(ngettext(...)).
(We could also add errhint_plural and so on, but right offhand these
seem unlikely to be useful.) The advantage of doing this is that
we avoid double translation and eliminate the current kluge whereby
usages in PL code have to be different from usages anywhere else.
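For illustration only, here is a minimal standalone sketch of the idea. The name format_plural() and its buffer arguments are hypothetical stand-ins invented for this example; the proposed errmsg_plural() would live in the backend and feed the ereport machinery rather than a caller-supplied buffer. The point it shows is that the caller passes untranslated format strings, and the plural selection and the single translation step both happen inside the helper via one ngettext() call.

```c
#include <assert.h>
#include <libintl.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the proposed errmsg_plural(): formats into a
 * caller-supplied buffer instead of an error report.  The caller passes the
 * raw (untranslated) singular and plural formats, so the strings are looked
 * up exactly once, here.
 */
int
format_plural(char *buf, size_t buflen,
              const char *fmt_singular, const char *fmt_plural,
              unsigned long n, ...)
{
    va_list     args;
    int         len;

    /* Selects the right plural form and translates it in one step. */
    const char *fmt = ngettext(fmt_singular, fmt_plural, n);

    va_start(args, n);
    len = vsnprintf(buf, buflen, fmt, args);
    va_end(args);

    return len;
}
```

A caller would then write something like format_plural(buf, sizeof(buf), "%lu row", "%lu rows", n, n) with no ngettext() at the call site. A backend version would presumably have to look the strings up in the right text domain (dngettext) instead of the default one, which is exactly the detail PL code should not need to know about.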
I don't feel a need to touch the usages in client programs (pg_dump and
so on). In principle the double-translation risk still exists there,
but it seems much less likely to be a real hazard because any one client
program has a *far* smaller pool of translatable messages than the
backend does. Also, there's only one active text domain in a client
program, so the problem of needing to use dngettext in special cases
doesn't exist.
There are a few usages of ngettext() in the backend that are not tied
to ereport calls, but I think they can be left as-is. There's no
double-translation risk, and with so few of them I don't see much of
a risk of wrongly copying the usage in PL code, either.
Also: one thought that came to me while looking at the existing usages
is that there are several places that are plural-ized that seem
completely pointless; why are we making our translators work
harder on them? For example
    ereport(ERROR,
            (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
             errmsg(ngettext("functions cannot have more than %d argument",
                             "functions cannot have more than %d arguments",
                             FUNC_MAX_ARGS),
                    FUNC_MAX_ARGS)));
It seems extremely far-fetched that FUNC_MAX_ARGS would ever be small
enough that it would make any language's special cases kick in. Or
how about this one:
#if 0
    write_msg(modulename, ngettext("read %lu byte into lookahead buffer\n",
                                   "read %lu bytes into lookahead buffer\n",
                                   AH->lookaheadLen),
              (unsigned long) AH->lookaheadLen);
#endif
I'm not sure why this debug support is still there at all, but surely
it's a crummy candidate for making translators sweat over. So I'd like
to revert these.
Comments, objections?
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
    ereport(ERROR,
            (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
             errmsg(ngettext("functions cannot have more than %d argument",
                             "functions cannot have more than %d arguments",
                             FUNC_MAX_ARGS),
                    FUNC_MAX_ARGS)));
It seems extremely far-fetched that FUNC_MAX_ARGS would ever be small
enough that it would make any language's special cases kick in.
Russian plural forms for 100, 101, 102, etc. are different, just as for 0, 1, 2.
--
Sergey Burladyan
Russian plural forms for 100, 101, 102, etc. are different, just as for 0, 1, 2.
True. The rule IIRC is that except for 11-14 and for collective numerals, declination follows the last digit.
It would be possible to generalize declination via a language-specific message-selector function, especially if the number of numerical complements were limited to 1.
How awkward would it be to re-word the style of messages to avoid declination? For example, the Russian equivalent of "X rows" could be something like "#rows -- X".
David Hudson
pg@thetdh.com writes:
Russian plural forms for 100, 101, 102, etc. are different, just as for 0, 1, 2.
True. The rule IIRC is that except for 11-14 and for collective numerals, declination follows the last digit.
Wow. So how does anyone represent that in the .po files? AFAICT the
notation the gettext machinery provides isn't really powerful enough
for this.
regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [090604 10:22]:
pg@thetdh.com writes:
Russian plural forms for 100, 101, 102, etc. are different, just as for 0, 1, 2.
True. The rule IIRC is that except for 11-14 and for collective numerals, declination follows the last digit.
Wow. So how does anyone represent that in the .po files? AFAICT the
notation the gettext machinery provides isn't really powerful enough
for this.
Well, the C/English "template" one includes just the msgid and
msgid_plural strings.
When the Russian translators get to it, they make a Russian .po which
contains (something like) the following in the msgid "" header:
"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"
And then they provide msgstr[0], msgstr[1], and msgstr[2] to fill the 3
slots that the above Plural-Forms can use when translating plural-form
strings.
It's all encapsulated in the gettext tools and libraries, and the C
(non-translated) base just always uses ngettext(singular, plural, n), and
ngettext will (if the compiled catalog has different plural-forms) use
whatever the catalog specifies, or fall back to the simple
n == 1 ? singular : plural
type choice when no translated catalog is available.
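As a rough illustration of the above, the Russian Plural-Forms expression can be transcribed directly into C, next to the untranslated fallback. The function names and the pick_msgstr() helper are made up for this sketch; in reality gettext parses the expression from the catalog header and does the indexing itself.

```c
#include <assert.h>
#include <string.h>

/* Index into msgstr[0..2], transcribed from the Plural-Forms line above. */
unsigned long
russian_plural_index(unsigned long n)
{
    return n % 10 == 1 && n % 100 != 11 ? 0 :
           n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2;
}

/* Choice made when no translated catalog is available: 0 = singular. */
unsigned long
untranslated_plural_index(unsigned long n)
{
    return n == 1 ? 0 : 1;
}

/* Hypothetical helper: pick one of the translator's three msgstr[] slots. */
const char *
pick_msgstr(const char *const *msgstr, unsigned long n)
{
    return msgstr[russian_plural_index(n)];
}
```

Note that 100, 101 and 102 land in slots 2, 0 and 1 respectively, so even a large constant can select any of the three forms.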
a.
--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.
Aidan Van Dyk <aidan@highrise.ca> writes:
When the Russian translators get to it, they make a Russian .po which
contains (something like) the following in the msgid "" header:
"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"
Oh, I see. I didn't realize there was a mapping mechanism available
to the translator.
Okay, so the bottom line there is that there is some value in
pluralizing the messages about FUNC_MAX_ARGS --- I withdraw the
suggestion to undo that. Anyone wish to defend the ones that
are ifdef'd out?
regards, tom lane
(Grrr, declension, not declination.)
"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"
Thanks. The above (ignoring backslash-EOL) is the form recommended for Russian (inter alia(s)) in the Texinfo manual for gettext ("info gettext"). FWIW this might be an alternative:
"Plural-Forms: nplurals=3; plural=((n - 1) % 10) >= (5-1) || (((n - 1) % 100) <= (14-1) && ((n - 1) % 100) >= (11 - 1)) ? 2 : ((n - 1) % 10) == (1 - 1) ? 0 : 1;\n"
David Hudson
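
One way to gain some confidence that the two Plural-Forms expressions quoted in this message select the same slot is to transcribe both into C and compare them over a range of n. This is only a brute-force sketch with invented function names, not a proof. It starts at n = 1 because the alternative computes n - 1, which for n = 0 depends on unsigned wraparound.

```c
#include <assert.h>

/* The form recommended for Russian in the gettext manual. */
unsigned long
plural_manual(unsigned long n)
{
    return n % 10 == 1 && n % 100 != 11 ? 0 :
           n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2;
}

/* The alternative expression, transcribed as written. */
unsigned long
plural_alternative(unsigned long n)
{
    return ((n - 1) % 10) >= (5 - 1) ||
           (((n - 1) % 100) <= (14 - 1) && ((n - 1) % 100) >= (11 - 1)) ? 2 :
           ((n - 1) % 10) == (1 - 1) ? 0 : 1;
}

/* Count the values in 1..limit for which the two expressions differ. */
unsigned long
count_disagreements(unsigned long limit)
{
    unsigned long n;
    unsigned long mismatches = 0;

    for (n = 1; n <= limit; n++)
    {
        if (plural_manual(n) != plural_alternative(n))
            mismatches++;
    }
    return mismatches;
}
```

Since both expressions depend only on n modulo 100 (for n >= 1), checking a few hundred consecutive values already covers every case.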