[jbosstools-dev] Handling plurals properly

Sean Flanigan sflaniga at redhat.com
Sun Jun 21 22:00:51 EDT 2009


On 06/19/2009 08:11 PM, Max Rydahl Andersen wrote:
>>
>> There's a few ways we can deal with this sort of thing:
>>
>> 1. Use a clunky workaround like:
>>       "{0} can contain only {1} child(ren) with entity {2}."
> Clunky, but it at least states the reason why the error/warning is
> happening.

In English it might state the reason, *if* you understand the context.
I wouldn't bet on it stating anything useful after translation, since 
translators won't have the context unless they chase it up with the 
programmer.

It's just as well Slava said XModel doesn't need translation, for the 
most part!

>> 2. Reformulate [3] the sentence, perhaps like this:
>>       "Parent has too many children with entity.  Parent: {0}, Number
>> of children: {1}, Entity {2}"
> I don't think this is a clear message, compared to #1.

Well, that's why I said "perhaps"!  Another option:
"Parent {0} has too many children with entity {2}.  Maximum children: {1}"

I still don't really know what it means (children with entity?!), but 
it's closer to the original.  I think the main thing is to get the 
conditional plural out of the sentence.

But you're right, it's pretty difficult to remove the plural from this 
sentence without changing it for the worse (*in English*).  But it may 
be worth paying this price if it lets us skip options 3 and 4, at least 
for now.
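
To make the reformulated option concrete, here is a minimal sketch using 
plain java.text.MessageFormat.  The class name and the hard-coded pattern 
are hypothetical (in real code the pattern would come from a Messages 
class or a ResourceBundle).  The point is only that the same string reads 
correctly whether {1} is 1 or 5, so no plural selection is needed.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class TooManyChildrenMessage {
    // Reformulated message: the number sits outside the sentence,
    // so the wording does not depend on singular vs. plural.
    static final String PATTERN =
        "Parent {0} has too many children with entity {2}.  Maximum children: {1}";

    static String format(String parent, int max, String entity) {
        // Locale.ROOT keeps the number formatting predictable in this sketch.
        MessageFormat mf = new MessageFormat(PATTERN, Locale.ROOT);
        return mf.format(new Object[] {parent, max, entity});
    }

    public static void main(String[] args) {
        System.out.println(format("Folder", 1, "File"));
        System.out.println(format("Folder", 5, "File"));
    }
}
```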

>> 3. Use ICU's PluralFormat [4] support:
>>    com.ibm.icu.text.MessageFormat.format(
>>      "{1, plural, " +
>>      "one {{0} can contain only # child with entity {2}.} " +
>>      "other {{0} can contain only # children with entity {2}.}}",
>>      new Object[] {parent, max, entity});
> they encode the plural rules into one string ?

Yep, it's a very general solution, but hard to read, even for a 
programmer, let alone a translator.  You should read the example in the 
Javadoc! 
http://bugs.icu-project.org/apiref/icu4j/com/ibm/icu/text/PluralFormat.html

I do think putting everything into one string is the most practical 
solution in Java, since otherwise you have different numbers of strings 
for your different locales.  Gets complicated, especially if you like 
Eclipse's i18n wizards.

I'd be all for ICU, if it just allowed simpler strings for the 
translators to deal with.  And if I could find any documentation other 
than the javadoc above.  I haven't even found a proper list of the 
supported plural forms for French, for example.

>> 4. Use Gettext's plural support[5], by using
>> gnu.gettext.GettextResource.ngettext() instead of
>> java.util.ResourceBundle.getString().  Actually, we can't really use
>> libgettext as is [6], but we could borrow the plural handling (it's
>> LGPL) and use it with Eclipse's Messages classes or ResourceBundles.
>>      I18n.ngettext(Messages.class, "TOO_MANY_CHILDREN", a, b, c);
> How much code is that plural handling ?

A lot more than I realised.  GettextResource.java is pretty short:
http://cvs.savannah.gnu.org/viewvc/gettext/gettext-runtime/intl-java/gnu/gettext/GettextResource.java?revision=1.3&root=gettext&view=markup
but now that I dig into the code, I realise that it really depends on 
the generated code from write-java.c:
http://cvs.savannah.gnu.org/viewvc/gettext/gettext-tools/src/write-java.c?revision=1.33&root=gettext&view=markup

It actually generates Java code from the plural rules for each locale. 
I was expecting to find a simple plural rule interpreter I could easily 
borrow, but I can see why they might want to use compiled code.  That 
won't work for us unless we start generating & compiling our 
ResourceBundles too...
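
To give a feel for what those per-locale plural rules amount to, here is a 
hand-written sketch.  It is NOT the output of write-java.c, and the class 
and method names are made up.  The expressions are the Plural-Forms rules 
for English, French and Russian as I understand them from the gettext 
manual.  Each method returns the index of the plural form to use.

```java
// Hypothetical, hand-written equivalents of gettext Plural-Forms rules.
final class PluralRuleSketch {
    // English: nplurals=2; plural=(n != 1)
    static int english(long n) {
        return n != 1 ? 1 : 0;
    }

    // French: nplurals=2; plural=(n > 1), so 0 takes the singular form
    static int french(long n) {
        return n > 1 ? 1 : 0;
    }

    // Russian: nplurals=3, selected by the last one or two digits
    static int russian(long n) {
        if (n % 10 == 1 && n % 100 != 11) {
            return 0;
        }
        if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) {
            return 1;
        }
        return 2;
    }
}
```

A caller would use the returned index to pick one of the translated 
strings for that locale, which is why the number of strings differs from 
locale to locale.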

So it might actually be less trouble to convert from a simple string 
format (for translators) into the ICU plural format.  But that's not to 
say it would be easy.  It might be very easy to get it wrong...
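
A rough sketch of what such a conversion might look like: translators 
supply one plain string per plural category, and a small helper assembles 
the ICU-style pattern.  Everything here (class name, method, the idea of 
passing a Map) is hypothetical.  It only builds the string and does not 
call ICU.  It also does no escaping of apostrophes or braces in the 
translator's text, which is exactly the sort of thing that would be easy 
to get wrong.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds an ICU-style plural pattern from simple strings.
final class IcuPluralPatternBuilder {
    // argIndex is the message argument that holds the number.
    // forms maps a plural category ("one", "other", ...) to its message.
    static String build(int argIndex, Map<String, String> forms) {
        if (!forms.containsKey("other")) {
            // ICU's plural syntax requires an "other" case.
            throw new IllegalArgumentException("missing 'other' form");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('{').append(argIndex).append(", plural,");
        for (Map.Entry<String, String> e : forms.entrySet()) {
            sb.append(' ').append(e.getKey())
              .append(" {").append(e.getValue()).append('}');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> forms = new LinkedHashMap<>();
        forms.put("one", "{0} can contain only # child with entity {2}.");
        forms.put("other", "{0} can contain only # children with entity {2}.");
        System.out.println(build(1, forms));
    }
}
```

With the two English forms above, this should produce the same pattern 
string as the one in option 3.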

> Is it worth the trouble when we can't ensure all the plugins we use will
> be able to use it ? (3rd party plugins would at most just use #3)

Well, no, we can't.  It's just for JBT code.  And it could be a lot of 
trouble.  I'm just listing all the options, but I like option 2.

Based on a quick grep of the Babel langpacks, none of the Eclipse 
projects have used ICU's plural support.  I suspect (a) they haven't 
considered the problem of non-Germanic plurals, (b) they rolled their 
own solution, or (c) they went with something like option 2.

> And how many places do we have to actually handle plural ?

I have no way of knowing until I find them.  I've only found a few so 
far, but there are a lot more strings I haven't looked at.

I'm hoping there won't be many.  I think you would get many more full 
grammatical sentences in, say, a business application than in a 
development tool.  Development tools tend to use terser messages.

>> I think option 2 deserves consideration.  Yes, it's a workaround, and
>> reformulating the sentences may be difficult, but I think the result
>> will often be clearer than the original, even in English.  (If the
>> above example isn't clear, it's probably because I didn't understand
>> the original in context.)  Using this workaround may avoid the need
>> for options 3 and 4.
> Agreed, just need a better message ;)

Exactly.  Now what on earth does the original message mean?

I'll make sure I refer any other de-pluralizations I'm not sure about. 
They should be rare enough that it makes sense to give them individual 
JIRAs.  I hope.

-- 
Sean Flanigan

Senior Software Engineer
Engineering - Internationalisation
Red Hat


