UTF-8 is a little more complex than that. Characters 0x00-0x7F are represented by a
single byte with the same value. Beyond that, characters are represented with 2 to 4
bytes. This means that every character needs comparisons to find its range, and every
multi-byte character needs shifts and masks to spread its bits across a lead byte and
one or more continuation bytes, each with fixed marker bits set.
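
To make the per-character work concrete, here is a rough sketch of a hand-written UTF-8
encoder. The class and method names (Utf8Sketch, writeUtf8, encode) are made up for
illustration, and it assumes well-formed input with no unpaired surrogates. It is not
how the JDK does it internally, just the same bit layout spelled out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8Sketch {
    // Hypothetical helper: encodes one Unicode code point as 1 to 4 UTF-8 bytes.
    static void writeUtf8(int cp, ByteArrayOutputStream out) {
        if (cp <= 0x7F) {
            // 1 byte: 0xxxxxxx
            out.write(cp);
        } else if (cp <= 0x7FF) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out.write(0xC0 | (cp >> 6));
            out.write(0x80 | (cp & 0x3F));
        } else if (cp <= 0xFFFF) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.write(0xE0 | (cp >> 12));
            out.write(0x80 | ((cp >> 6) & 0x3F));
            out.write(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.write(0xF0 | (cp >> 18));
            out.write(0x80 | ((cp >> 12) & 0x3F));
            out.write(0x80 | ((cp >> 6) & 0x3F));
            out.write(0x80 | (cp & 0x3F));
        }
    }

    // Walks the string by code point so surrogate pairs become one 4-byte sequence.
    // Assumes the string contains no unpaired surrogates.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            writeUtf8(cp, out);
            i += Character.charCount(cp);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 'a', e-acute, euro sign, and one supplementary character (a surrogate pair)
        String s = "a\u00e9\u20ac\uD83D\uDE00";
        boolean same = Arrays.equals(encode(s), s.getBytes(StandardCharsets.UTF_8));
        System.out.println("matches JDK encoder: " + same);
    }
}
```

Every character goes through at least one range comparison, and anything outside ASCII
pays for a shift and a mask per output byte.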
UTF-16, on the other hand, being the native encoding for Java strings, is written one
char at a time without transcoding: no shifts, no comparisons, no bitmasks. It's just a
straight write of chars. You can't possibly do better than that in terms of
processing speed.
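
For comparison, a minimal sketch of the UTF-16 side, assuming big-endian output with no
byte order mark. The class and method names (Utf16Sketch, encodeUtf16) are made up for
illustration; the java.nio calls are standard JDK ones.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16Sketch {
    // Hypothetical helper: copies the string's chars straight into a byte buffer.
    // ByteBuffer is big-endian by default, so the result is UTF-16BE without a BOM.
    static byte[] encodeUtf16(String s) {
        ByteBuffer bytes = ByteBuffer.allocate(s.length() * 2);
        // The char view writes each 16-bit char as-is; there is no per-character logic here.
        bytes.asCharBuffer().put(s);
        return bytes.array();
    }

    public static void main(String[] args) {
        String s = "a\u00e9\u20ac\uD83D\uDE00";
        boolean same = Arrays.equals(encodeUtf16(s), s.getBytes(StandardCharsets.UTF_16BE));
        System.out.println("matches JDK encoder: " + same);
    }
}
```

The output is always exactly two bytes per char, so the buffer size is known up front
and nothing in the loop depends on the character's value.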