[jbosstools-dev] Handling plurals properly
Sean Flanigan
sflaniga at redhat.com
Fri Jun 19 04:24:47 EDT 2009
Hi,
Handling plurals in an internationalized way is something of an unsolved
problem in Java, at least in Eclipse circles.
Here's some code I found in DefaultWizardDataValidator.java [1].
    message = DefaultCreateHandler.title(parent, true) +
        " can contain only " + max +
        ((max == 1) ? " child " : " children ") +
        "with entity " + entity + ".";
I'm not sure whether the XModel needs i18n, but that's a subject for my
other email. Here, it's just an example.
Now I can use MessageFormat to construct the strings like this:
    message = ((max == 1) ?
        MessageFormat.format(
            "{0} can contain only {1} child with entity {2}.",
            DefaultCreateHandler.title(parent, true), max, entity)
        : MessageFormat.format(
            "{0} can contain only {1} children with entity {2}.",
            DefaultCreateHandler.title(parent, true), max, entity));
but that tends to work only for Germanic languages, since different
languages have different rules for plurals, and even different numbers
of plural forms [2]:
http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html
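To make the difference concrete, here is a minimal sketch of the
plural-form selector expressions that the Gettext manual [2] lists for a
few languages, transcribed into Java. The class and method names are
made up for illustration; only the expressions come from the manual.
Each method returns the index of the plural form to use for a count n.

```java
/** Illustrative only: gettext-style plural selectors (names are hypothetical). */
final class PluralIndexSketch {
    private PluralIndexSketch() {}

    /** English, German: two forms, singular only for exactly 1. */
    static int germanic(long n) {
        return (n != 1) ? 1 : 0;
    }

    /** French: two forms, but 0 and 1 both take the singular. */
    static int french(long n) {
        return (n > 1) ? 1 : 0;
    }

    /** Russian: three forms, chosen by the last one or two digits. */
    static int russian(long n) {
        if (n % 10 == 1 && n % 100 != 11) {
            return 0;
        }
        if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) {
            return 1;
        }
        return 2;
    }

    /** Japanese: a single form, whatever the count. */
    static int japanese(long n) {
        return 0;
    }

    public static void main(String[] args) {
        // Print the selected form index per language for a few counts.
        for (long n : new long[] {0, 1, 2, 5, 11, 21, 22}) {
            System.out.println(n + ": germanic=" + germanic(n)
                    + " french=" + french(n)
                    + " russian=" + russian(n)
                    + " japanese=" + japanese(n));
        }
    }
}
```

A hard-coded (max == 1) test is equivalent to germanic() above, which is
why it gives the wrong form for, say, 0 in French or 21 in Russian.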
There are a few ways we can deal with this sort of thing:
1. Use a clunky workaround like:
"{0} can contain only {1} child(ren) with entity {2}."
2. Reformulate [3] the sentence, perhaps like this:
"Parent has too many children with entity. Parent: {0}, Number of
children: {1}, Entity: {2}"
3. Use ICU's PluralFormat [4] support:
com.ibm.icu.text.MessageFormat.format(
"{1, plural, " +
"one {{0} can contain only # child with entity {2}.} " +
"other {{0} can contain only # children with entity {2}.}}",
new Object[] {parent, max, entity});
4. Use Gettext's plural support[5], by using
gnu.gettext.GettextResource.ngettext() instead of
java.util.ResourceBundle.getString(). Actually, we can't really use
libgettext as is [6], but we could borrow the plural handling (it's
LGPL) and use it with Eclipse's Messages classes or ResourceBundles.
I18n.ngettext(Messages.class, "TOO_MANY_CHILDREN", a, b, c);
where the English messages.properties might be
TOO_MANY_CHILDREN_0={0} can contain only one child with entity {2}.
TOO_MANY_CHILDREN_1={0} can contain only {1} children with entity {2}.
or
I18n.ngettext(Messages.TOO_MANY_CHILDREN, a, b, c);
where the English messages.properties might be
TOO_MANY_CHILDREN={0} can contain only one child with entity {2}.\
|{0} can contain only {1} children with entity {2}.
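For the delimited variant of option 4, a rough sketch of what such a
helper might look like follows. Everything here is an assumption: no
I18n class exists yet, the explicit count parameter n is my addition,
and the plural rule is hard-coded to the English one where a real
implementation would take it from the translation.

```java
import java.text.MessageFormat;
import java.util.Locale;

/** Hypothetical helper: picks one of several '|'-delimited plural forms. */
final class DelimitedPluralSketch {
    private DelimitedPluralSketch() {}

    /**
     * @param forms all plural forms of one message, separated by '|'
     * @param n     the count that decides which form is used
     * @param args  the MessageFormat arguments
     */
    static String ngettext(String forms, long n, Object... args) {
        String[] variants = forms.split("\\|");
        // Assumption: English rule. A real version would use a per-language rule.
        int index = (n != 1) ? 1 : 0;
        // Fall back to the last form if the translation supplies fewer forms.
        if (index >= variants.length) {
            index = variants.length - 1;
        }
        // Fixed locale so that number formatting is predictable in this sketch.
        return new MessageFormat(variants[index], Locale.ENGLISH).format(args);
    }

    public static void main(String[] args) {
        String forms = "{0} can contain only one child with entity {2}."
                + "|{0} can contain only {1} children with entity {2}.";
        System.out.println(ngettext(forms, 1, "Page", 1, "Header"));
        System.out.println(ngettext(forms, 3, "Page", 3, "Header"));
    }
}
```

Note that a message which legitimately contains '|' would be split in
the wrong place, so some escaping convention would be needed.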
Option 1 is commonly used, but ugly across all languages.
I think option 2 deserves consideration. Yes, it's a workaround, and
reformulating the sentences may be difficult, but I think the result
will often be clearer than the original, even in English. (If the above
example isn't clear, it's probably because I didn't understand the
original in context.) Using this workaround may avoid the need for
options 3 and 4.
Option 3 is pretty strong in some ways, since ICU comes with Galileo,
and it seems to be very complete. ICU even ships with predefined plural
rules for a number of languages. However, the translation strings end
up being almost unreadable, which makes them almost untranslatable.
Option 4 can work, but will require solving some technical issues. For
instance, either we encode 1-4 plural forms into a single resource
bundle string (with some sort of delimiter, which leads to edge cases)
or we generate resource keys at runtime to look up the plural we want
(which means Eclipse's wizards think those resource keys are unused).
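To illustrate the second alternative, here is a minimal sketch of a
lookup that builds the resource key at runtime. Again, the class name,
the method signature and the "_" + index key suffix are assumptions for
illustration, and the English plural rule is hard-coded. An in-memory
ListResourceBundle stands in for messages.properties.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

/** Hypothetical helper: looks up KEY_0, KEY_1, ... depending on the count. */
final class KeyedPluralSketch {
    private KeyedPluralSketch() {}

    /** Stand-in for an English messages.properties file. */
    static final ResourceBundle DEMO = new ListResourceBundle() {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"TOO_MANY_CHILDREN_0",
                    "{0} can contain only one child with entity {2}."},
                {"TOO_MANY_CHILDREN_1",
                    "{0} can contain only {1} children with entity {2}."},
            };
        }
    };

    static String ngettext(ResourceBundle bundle, String baseKey, long n,
            Object... args) {
        // Assumption: English rule. A real version would use a per-language rule.
        int index = (n != 1) ? 1 : 0;
        // The full key exists only at runtime, never as a literal in the source.
        String pattern = bundle.getString(baseKey + "_" + index);
        return new MessageFormat(pattern, Locale.ENGLISH).format(args);
    }

    public static void main(String[] args) {
        System.out.println(ngettext(DEMO, "TOO_MANY_CHILDREN", 1, "Page", 1, "Header"));
        System.out.println(ngettext(DEMO, "TOO_MANY_CHILDREN", 4, "Page", 4, "Header"));
    }
}
```

Because "TOO_MANY_CHILDREN_1" never appears literally in the calling
code, a tool that searches the source for key names has no way to see
that the key is in use.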
It may also be appropriate to use option 2 for some messages, and option
3 or 4 in cases where option 2 is not good enough.
Opinions?
Sean.
[1]
http://anonsvn.jboss.org/repos/jbosstools/trunk/common/plugins/org.jboss.tools.common.model/src/org/jboss/tools/common/meta/action/impl/DefaultWizardDataValidator.java
[2] http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html
[3]
http://globalizer.wordpress.com/2007/01/30/how-to-use-choiceformat-then-lost-in-translation-part-v/
[4]
http://bugs.icu-project.org/apiref/icu4j/com/ibm/icu/text/PluralFormat.html
[5]
http://www.gnu.org/software/gettext/manual/html_node/javadoc2/gnu/gettext/GettextResource.html#ngettext(java.util.ResourceBundle,%20java.lang.String,%20java.lang.String,%20long)
[6] The Gettext API expects the ResourceBundle keys to be the actual
English text, not a made-up key. This doesn't match the Eclipse
approach with Messages classes where key is a field name.
--
Sean Flanigan
Senior Software Engineer
Engineering - Internationalisation
Red Hat