[jboss-jira] [JBoss JIRA] (JGRP-2062) TYPE_STRING does not handle unicode
Bela Ban (JIRA)
issues at jboss.org
Tue Jul 26 10:18:01 EDT 2016
[ https://issues.jboss.org/browse/JGRP-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bela Ban updated JGRP-2062:
---------------------------
Steps to Reproduce:
{code:java}
channel.send(new org.jgroups.Message(null, "\u1F601"));
{code}
Expected: Ironically, Jira fails on expected output. See http://apps.timwhitlock.info/emoji/tables/unicode
Actual: garbage data
was:
{{
channel.send(new org.jgroups.Message(null, "\u1F601"));
}}
Expected: Ironically, Jira fails on expected output. See http://apps.timwhitlock.info/emoji/tables/unicode
Actual: garbage data
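A note on the reproduction literal, offered as a sketch rather than a correction of the report: in Java source a Unicode escape takes exactly four hex digits, so "\u1F601" is read as U+1F60 followed by the character '1', not as the emoji U+1F601. The string is still non-ASCII, so it should still trigger the reported behavior. The class and constant names below are hypothetical and exist only for illustration.

```java
class UnicodeEscapeSketch {
    // As written in the report: the escape covers only the four hex digits 1F60,
    // so this string is U+1F60 followed by the ASCII character '1'.
    static final String AS_WRITTEN = "\u1F601";

    // U+1F601 spelled as a UTF-16 surrogate pair.
    static final String SURROGATE_PAIR = "\uD83D\uDE01";

    // The same code point, built without computing the surrogates by hand.
    static final String FROM_CODE_POINT = new String(Character.toChars(0x1F601));

    public static void main(String[] args) {
        System.out.println(AS_WRITTEN.length());                      // 2
        System.out.println(AS_WRITTEN.charAt(0) == 0x1F60);           // true
        System.out.println(AS_WRITTEN.charAt(1));                     // 1
        System.out.println(SURROGATE_PAIR.equals(FROM_CODE_POINT));   // true
        System.out.println(SURROGATE_PAIR.codePointAt(0) == 0x1F601); // true
    }
}
```

To send the actual emoji, either of the last two forms could be passed to the Message constructor in place of the original literal.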
> TYPE_STRING does not handle unicode
> -----------------------------------
>
> Key: JGRP-2062
> URL: https://issues.jboss.org/browse/JGRP-2062
> Project: JGroups
> Issue Type: Bug
> Reporter: Cody Ebberson
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 4.0
>
>
> In several places throughout the org.jgroups.util.Util class, it is assumed that Strings are one byte per character.
> For example, see objectToByteBuffer lines 561-567:
> https://github.com/belaban/JGroups/blob/master/src/org/jgroups/util/Util.java#L561-L567
> {code:java}
> case TYPE_STRING:
>     String str=(String)obj;
>     int len=str.length();
>     ByteBuffer retval=ByteBuffer.allocate(Global.BYTE_SIZE + len).put(TYPE_STRING);
>     for(int i=0; i < len; i++)
>         retval.put((byte)str.charAt(i));
>     return retval.array();
> {code}
> This code incorrectly encodes any String that contains non-ASCII characters: the (byte) cast keeps only the low 8 bits of each 16-bit char.
> There are several ways to fix this. You could use str.getBytes(StandardCharsets.UTF_8) to get a proper byte encoding, or you could use the existing TYPE_SERIALIZABLE code path.
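The difference between the quoted loop and the str.getBytes(StandardCharsets.UTF_8) option can be sketched in isolation. This is a minimal illustration, not the JGroups implementation: the class name, the TYPE_STRING value, and both decoders are assumptions made for the example, and Global.BYTE_SIZE is assumed to be 1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class StringEncodingSketch {
    // Hypothetical marker value; the real constant lives in org.jgroups.util.Util.
    static final byte TYPE_STRING = 18;

    // Mirrors the quoted code: one byte per char, keeping only the low 8 bits.
    static byte[] encodeNarrowing(String str) {
        int len = str.length();
        ByteBuffer retval = ByteBuffer.allocate(1 + len).put(TYPE_STRING);
        for (int i = 0; i < len; i++)
            retval.put((byte) str.charAt(i));
        return retval.array();
    }

    // Assumed counterpart decoder: widens each payload byte back to a char.
    static String decodeNarrowing(byte[] buf) {
        StringBuilder sb = new StringBuilder(buf.length - 1);
        for (int i = 1; i < buf.length; i++)
            sb.append((char) (buf[i] & 0xFF));
        return sb.toString();
    }

    // Suggested alternative: the type byte followed by the UTF-8 bytes of the string.
    static byte[] encodeUtf8(String str) {
        byte[] utf8 = str.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + utf8.length).put(TYPE_STRING).put(utf8).array();
    }

    // Decoder for encodeUtf8: skips the type byte and decodes the rest as UTF-8.
    static String decodeUtf8(byte[] buf) {
        return new String(buf, 1, buf.length - 1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String emoji = new String(Character.toChars(0x1F601)); // two UTF-16 chars
        System.out.println(emoji.equals(decodeNarrowing(encodeNarrowing(emoji)))); // false
        System.out.println(emoji.equals(decodeUtf8(encodeUtf8(emoji))));           // true
    }
}
```

Note that the UTF-8 payload length is no longer str.length(), so a real fix would size the buffer from the encoded byte array, and the reading side would have to decode with the same charset.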
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)