[jboss-jira] [JBoss JIRA] (JGRP-2061) TYPE_STRING does not handle unicode

Bela Ban (JIRA) issues at jboss.org
Fri Aug 18 01:54:00 EDT 2017


    [ https://issues.jboss.org/browse/JGRP-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451095#comment-13451095 ] 

Bela Ban commented on JGRP-2061:
--------------------------------

I think this has been fixed (but I forgot to close the issue):

{code:java}
    case TYPE_STRING:
        String str=(String)obj;
        if(Util.isAsciiString(str)) {
            int len=str.length();
            ByteBuffer retval=ByteBuffer.allocate(Global.BYTE_SIZE + len).put(TYPE_STRING);
            for(int i=0; i < len; i++)
                retval.put((byte)str.charAt(i));
            return retval.array();
        }
        else {
            ByteArrayDataOutputStream out=new ByteArrayDataOutputStream(str.length()*2 +3);
            out.write(TYPE_UTF_STRING);
            out.writeUTF(str);
            byte[] ret=new byte[out.position()];
            System.arraycopy(out.buffer(), 0, ret, 0, ret.length);
            return ret;
        }
{code}

Can you confirm this?
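
For anyone who wants to check the behaviour outside JGroups, here is a minimal, self-contained sketch of the same two-branch idea using only java.io. It is not the JGroups code: the class name StringCodecSketch, the tag values, the isAscii() helper (a stand-in for Util.isAsciiString()) and the decode() method are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class StringCodecSketch {
    // Hypothetical tag values; the real JGroups constants may differ.
    static final byte TYPE_STRING = 1;
    static final byte TYPE_UTF_STRING = 2;

    // Stand-in for Util.isAsciiString(): true if every char fits in 7 bits.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++)
            if (s.charAt(i) > 127)
                return false;
        return true;
    }

    // ASCII strings: tag byte plus one byte per char.
    // Everything else: tag byte plus DataOutputStream.writeUTF() output.
    static byte[] encode(String str) {
        if (isAscii(str)) {
            byte[] out = new byte[1 + str.length()];
            out[0] = TYPE_STRING;
            for (int i = 0; i < str.length(); i++)
                out[i + 1] = (byte) str.charAt(i);
            return out;
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(TYPE_UTF_STRING);
            out.writeUTF(str); // 2-byte length prefix + modified UTF-8
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed counterpart of encode(); dispatches on the tag byte.
    static String decode(byte[] buf) {
        if (buf[0] == TYPE_STRING) {
            char[] chars = new char[buf.length - 1];
            for (int i = 0; i < chars.length; i++)
                chars[i] = (char) (buf[i + 1] & 0xff);
            return new String(chars);
        }
        try {
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buf, 1, buf.length - 1));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s = "\u4f60\u597d"; // two CJK characters
        System.out.println(s.equals(decode(encode(s))));
    }
}
```

One thing to keep in mind with this approach: DataOutputStream.writeUTF() rejects strings whose encoded form exceeds 65535 bytes, so very long non-ASCII strings would need a different path.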

> TYPE_STRING does not handle unicode
> -----------------------------------
>
>                 Key: JGRP-2061
>                 URL: https://issues.jboss.org/browse/JGRP-2061
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Cody Ebberson
>            Assignee: Bela Ban
>             Fix For: 4.0.6
>
>
> In several places throughout the org.jgroups.util.Util class, it is assumed that Strings are one byte per character.
> For example, see objectToByteBuffer lines 561-567:
> https://github.com/belaban/JGroups/blob/master/src/org/jgroups/util/Util.java#L561-L567
> {{
> case TYPE_STRING:
>     String str=(String)obj;
>     int len=str.length();
>     ByteBuffer retval=ByteBuffer.allocate(Global.BYTE_SIZE + len).put(TYPE_STRING);
>     for(int i=0; i < len; i++)
>         retval.put((byte)str.charAt(i));
>     return retval.array();
> }}
> This code incorrectly encodes any String that contains non-ASCII characters, because the cast to byte keeps only the low 8 bits of each char.
> There are several ways to fix this.  You could use str.getBytes(StandardCharsets.UTF_8) to get a proper byte encoding, or you could use the existing TYPE_SERIALIZABLE code path.
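
The truncation described in the report can be reproduced with a few lines of plain Java. This is only an illustrative sketch: the class TruncationDemo and its methods are hypothetical, and decodeByCast() assumes a reader that turns each byte back into one char, which may not match the actual JGroups decoder.

```java
import java.nio.charset.StandardCharsets;

final class TruncationDemo {
    // Mirrors the loop from the report: each char is cast to a byte,
    // which drops its upper 8 bits.
    static byte[] encodeByCast(String str) {
        byte[] out = new byte[str.length()];
        for (int i = 0; i < str.length(); i++)
            out[i] = (byte) str.charAt(i);
        return out;
    }

    // Assumed matching decoder: one byte becomes one char.
    static String decodeByCast(byte[] buf) {
        char[] chars = new char[buf.length];
        for (int i = 0; i < buf.length; i++)
            chars[i] = (char) (buf[i] & 0xff);
        return new String(chars);
    }

    // The UTF-8 alternative suggested in the report.
    static String roundTripUtf8(String str) {
        byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String nonAscii = "\u4f60\u597d"; // two CJK characters
        System.out.println(ascii.equals(decodeByCast(encodeByCast(ascii))));       // true
        System.out.println(nonAscii.equals(decodeByCast(encodeByCast(nonAscii)))); // false
        System.out.println(nonAscii.equals(roundTripUtf8(nonAscii)));              // true
    }
}
```

In the second case U+4F60 and U+597D are cut down to the bytes 0x60 and 0x7D, so the original characters cannot be recovered.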




