[jbosstools-dev] Handling plurals properly

Sean Flanigan sflaniga at redhat.com
Fri Jun 19 04:24:47 EDT 2009


Hi,

Handling plurals in an internationalized way is something of an unsolved 
problem in Java, at least in Eclipse circles.

Here's some code I found in DefaultWizardDataValidator.java [1].

message = DefaultCreateHandler.title(parent, true) +
     " can contain only " + max +
     ((max == 1) ? " child " : " children ") +
     "with entity " + entity + ".";

I'm not sure whether the XModel needs i18n, but that's a subject for my 
other email.  Here, it's just an example.

Now I can use MessageFormat to construct the strings like this:

message = ((max == 1) ?
   MessageFormat.format(
     "{0} can contain only {1} child with entity {2}.",
     DefaultCreateHandler.title(parent, true), max, entity)
   : MessageFormat.format(
     "{0} can contain only {1} children with entity {2}.",
     DefaultCreateHandler.title(parent, true), max, entity));

but that only works for languages whose plural rule matches English's 
(mostly the Germanic languages), since languages differ in their rules 
for plurals, and even in the number of plural forms they have [2].

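To make the problem concrete, here is a minimal sketch of the plural 
rules for a few languages, transcribed from the formulas in the Gettext 
manual [2].  The class and method names (PluralIndex, of) are made up 
for illustration; real code would load the rule per locale rather than 
hard-coding it.

```java
/** Sketch only: hard-coded plural rules, following the formulas in the gettext manual. */
final class PluralIndex {
    private PluralIndex() {}

    /** Returns the zero-based plural form index for a non-negative count n. */
    static int of(String language, long n) {
        switch (language) {
            case "en": // two forms: singular only for exactly 1
            case "de":
                return (n == 1) ? 0 : 1;
            case "fr": // two forms: singular for 0 and 1
                return (n > 1) ? 1 : 0;
            case "ru": // three forms, chosen by the last one or two digits
                if (n % 10 == 1 && n % 100 != 11) {
                    return 0;
                }
                if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) {
                    return 1;
                }
                return 2;
            case "ja": // one form: no singular/plural distinction
                return 0;
            default:
                throw new IllegalArgumentException("No plural rule for: " + language);
        }
    }
}
```

Note that a (max == 1) test gives the wrong form for French when max 
is 0, and for Russian when max is 21.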

There are a few ways we can deal with this sort of thing:

1. Use a clunky workaround like:
      "{0} can contain only {1} child(ren) with entity {2}."

2. Reformulate [3] the sentence, perhaps like this:
      "Parent has too many children with entity.  Parent: {0}, Number of
      children: {1}, Entity: {2}"

3. Use ICU's PluralFormat [4] support:
   com.ibm.icu.text.MessageFormat.format(
     "{1, plural, " +
     "one {{0} can contain only # child with entity {2}.} " +
     "other {{0} can contain only # children with entity {2}.}}",
     new Object[] {parent, max, entity});

4. Use Gettext's plural support[5], by using 
gnu.gettext.GettextResource.ngettext() instead of 
java.util.ResourceBundle.getString().  Actually, we can't really use 
libgettext as is [6], but we could borrow the plural handling (it's 
LGPL) and use it with Eclipse's Messages classes or ResourceBundles.
     I18n.ngettext(Messages.class, "TOO_MANY_CHILDREN", a, b, c);

where the English messages.properties might be
   TOO_MANY_CHILDREN_0={0} can contain only one child with entity {2}.
   TOO_MANY_CHILDREN_1={0} can contain only {1} children with entity {2}.

or
     I18n.ngettext(Messages.TOO_MANY_CHILDREN, a, b, c);

where the English messages.properties might be
   TOO_MANY_CHILDREN={0} can contain only one child with entity {2}.\
|{0} can contain only {1} children with entity {2}.
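
For the second variant, a rough sketch of what the lookup might do with 
the delimited string is below.  DelimitedPlurals and its format method 
are hypothetical names, the count is passed explicitly, and the English 
rule is hard-coded where a real implementation would pick the rule for 
the bundle's locale.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.regex.Pattern;

/** Sketch only: picks one of several '|'-delimited patterns based on a count. */
final class DelimitedPlurals {
    private DelimitedPlurals() {}

    /**
     * @param delimited all plural forms in one string, separated by '|'
     * @param n         the count that selects the plural form
     * @param args      the MessageFormat arguments ({0}, {1}, ...)
     */
    static String format(String delimited, long n, Object... args) {
        String[] forms = delimited.split(Pattern.quote("|"));
        // Assumption: English rule (index 0 for exactly one, else index 1).
        int index = (n == 1) ? 0 : 1;
        // Fall back to the last form if the translation supplies fewer forms.
        if (index >= forms.length) {
            index = forms.length - 1;
        }
        // Fixed locale so that number formatting is predictable in this sketch.
        return new MessageFormat(forms[index], Locale.ENGLISH).format(args);
    }
}
```

This breaks as soon as a message needs a literal '|', which is the kind 
of edge case a delimiter brings with it.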


Option 1 is commonly used, but ugly across all languages.

I think option 2 deserves consideration.  Yes, it's a workaround, and 
reformulating the sentences may be difficult, but I think the result 
will often be clearer than the original, even in English.  (If the above 
example isn't clear, it's probably because I didn't understand the 
original in context.)  Using this workaround may avoid the need for 
options 3 and 4.

Option 3 is pretty strong in some ways, since ICU comes with Galileo, 
and it seems to be very complete.  ICU even ships with predefined plural 
rules for a number of languages.  However, the translation strings end 
up being almost unreadable, which makes them almost untranslatable.

Option 4 can work, but will require solving some technical issues.  For 
instance, either we encode 1-4 plural forms into a single resource 
bundle string (with some sort of delimiter, which leads to edge cases) 
or we generate resource keys at runtime to look up the plural we want 
(which means Eclipse's wizards think those resource keys are unused).
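
The runtime-key alternative could look roughly like the sketch below.  
SuffixedPlurals, its format method and the demo bundle are all made up 
for illustration, and the English rule is again hard-coded.

```java
import java.io.IOException;
import java.io.StringReader;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

/** Sketch only: builds the resource key at runtime as baseKey + "_" + pluralIndex. */
final class SuffixedPlurals {
    private SuffixedPlurals() {}

    static String format(ResourceBundle bundle, String baseKey, long n, Object... args) {
        // Assumption: English rule (index 0 for exactly one, else index 1).
        int index = (n == 1) ? 0 : 1;
        // The key is computed here, so tooling that scans for key usages cannot see it.
        String pattern = bundle.getString(baseKey + "_" + index);
        return new MessageFormat(pattern, Locale.ENGLISH).format(args);
    }

    /** In-memory stand-in for an English messages.properties file. */
    static ResourceBundle englishDemoBundle() {
        String props =
            "TOO_MANY_CHILDREN_0={0} can contain only one child with entity {2}.\n"
          + "TOO_MANY_CHILDREN_1={0} can contain only {1} children with entity {2}.\n";
        try {
            return new PropertyResourceBundle(new StringReader(props));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the key only exists as a concatenation at runtime, a static 
search for TOO_MANY_CHILDREN_0 in the source finds nothing, hence the 
"unused key" warnings.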


It may also be appropriate to use option 2 for some messages, and option 
3 or 4 in cases where option 2 is not good enough.


Opinions?


Sean.


[1] 
http://anonsvn.jboss.org/repos/jbosstools/trunk/common/plugins/org.jboss.tools.common.model/src/org/jboss/tools/common/meta/action/impl/DefaultWizardDataValidator.java
[2] http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html
[3] 
http://globalizer.wordpress.com/2007/01/30/how-to-use-choiceformat-then-lost-in-translation-part-v/
[4] 
http://bugs.icu-project.org/apiref/icu4j/com/ibm/icu/text/PluralFormat.html
[5] 
http://www.gnu.org/software/gettext/manual/html_node/javadoc2/gnu/gettext/GettextResource.html#ngettext(java.util.ResourceBundle,%20java.lang.String,%20java.lang.String,%20long)
[6] The Gettext API expects the ResourceBundle keys to be the actual 
English text, not a made-up key.  This doesn't match the Eclipse 
approach with Messages classes where key is a field name.

-- 
Sean Flanigan

Senior Software Engineer
Engineering - Internationalisation
Red Hat
