[jboss-jira] [JBoss JIRA] (JGRP-1772) Optimize marshalling of strings

Bela Ban (JIRA) issues at jboss.org
Fri Feb 14 12:27:28 EST 2014


    [ https://issues.jboss.org/browse/JGRP-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12944806#comment-12944806 ] 

Bela Ban edited comment on JGRP-1772 at 2/14/14 12:25 PM:
----------------------------------------------------------

java.lang.String has 2 fields:
{code:title=String.java|borderStyle=solid}
public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0
...
{code}

The {{value}} field uses 2 bytes for each {{char}} element, which is wasteful for ASCII-only strings, and the cached {{hash}} is not needed either.

We could create a SimpleString or AsciiString class as follows:
{code:title=AsciiString.java|borderStyle=solid}
public class AsciiString {
   protected final byte[] value;

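   public AsciiString(String s) {
      // populate value from s (losing the 8 higher-order bits of each char)
      int len = s.length();
      value = new byte[len];
      for (int i = 0; i < len; i++)
         value[i] = (byte) s.charAt(i);
   }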
}
{code}

This would contain comparison and equality methods, and marshalling code.

The benefits would be:
* Uses less space in memory
* Smaller marshalling footprint *if* the string contains chars which {{writeUTF()}} encodes with more than 1 byte (otherwise it is the same as {{writeUTF()}}).

This could be used instead of String by {{TpHeader}} for example.
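
A minimal sketch of what such a class might look like, including the comparison, equality and marshalling methods mentioned above. The class name {{AsciiStringSketch}}, the {{writeTo()}} / {{readFrom()}} method names and the 2-byte length prefix are assumptions made for illustration only; they are not taken from the JGroups code base.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: a string backed by 1 byte per character.
final class AsciiStringSketch implements Comparable<AsciiStringSketch> {
    protected final byte[] value;

    AsciiStringSketch(String s) {
        value = new byte[s.length()];
        for (int i = 0; i < value.length; i++)
            value[i] = (byte) s.charAt(i); // drops the 8 higher-order bits
    }

    private AsciiStringSketch(byte[] value) {
        this.value = value;
    }

    @Override
    public int compareTo(AsciiStringSketch other) {
        int len = Math.min(value.length, other.value.length);
        for (int i = 0; i < len; i++) {
            // compare as unsigned bytes so that the order matches char order
            int diff = (value[i] & 0xFF) - (other.value[i] & 0xFF);
            if (diff != 0)
                return diff;
        }
        return value.length - other.value.length;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof AsciiStringSketch
            && Arrays.equals(value, ((AsciiStringSketch) obj).value);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(value); // computed on demand, not cached
    }

    // Assumed wire format: 2-byte length followed by the raw bytes.
    // Assumption: the string is never longer than 65535 chars.
    void writeTo(DataOutput out) throws IOException {
        out.writeShort(value.length);
        out.write(value);
    }

    static AsciiStringSketch readFrom(DataInput in) throws IOException {
        byte[] buf = new byte[in.readUnsignedShort()];
        in.readFully(buf);
        return new AsciiStringSketch(buf);
    }

    @Override
    public String toString() {
        return new String(value, StandardCharsets.ISO_8859_1);
    }
}
```

With this layout, a 5-char string takes 7 bytes on the wire (2 for the length plus 5 for the chars), regardless of which chars the original String contained.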
                
> Optimize marshalling of strings
> -------------------------------
>
>                 Key: JGRP-1772
>                 URL: https://issues.jboss.org/browse/JGRP-1772
>             Project: JGroups
>          Issue Type: Enhancement
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.5
>
>
> Currently, we use DataOutput.writeUTF() for all sorts of strings. This is implemented inefficiently and potentially uses more than 1 byte per char.
> Add another method writeString() which converts double-byte chars to single-byte chars, so that only ASCII is supported. This can be used by a lot of internal code which never uses chars above 127.
> For external code, such as {{JChannel.connect(String cluster_name)}}, we need to see whether this is ok. Since cluster names are mainly used to differentiate clusters, perhaps it is ok to mangle the names to chars below 128, although this would change cluster names which use multi-byte chars.
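
A rough sketch of the {{writeString()}} idea from the description, compared with {{writeUTF()}}. The class name {{StringMarshallingSketch}}, the {{readString()}} counterpart and the 2-byte length prefix are assumptions for illustration, not the actual JGroups implementation. It relies on {{DataOutput.writeBytes(String)}}, which writes one byte per char and discards the 8 higher-order bits, so chars above 255 are mangled.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical helper methods; names and wire format are assumptions.
final class StringMarshallingSketch {

    // Writes a 2-byte length followed by exactly 1 byte per char.
    static void writeString(String s, DataOutput out) throws IOException {
        int len = s.length();
        if (len > 0xFFFF)
            throw new IllegalArgumentException("string too long: " + len);
        out.writeShort(len);
        out.writeBytes(s); // low-order byte of each char only
    }

    // Reads a string written by writeString().
    static String readString(DataInput in) throws IOException {
        int len = in.readUnsignedShort();
        byte[] buf = new byte[len];
        in.readFully(buf);
        char[] chars = new char[len];
        for (int i = 0; i < len; i++)
            chars[i] = (char) (buf[i] & 0xFF); // widen without sign extension
        return new String(chars);
    }
}
```

For a pure ASCII name such as "cluster", both this sketch and {{writeUTF()}} produce 9 bytes (2 for the length plus 7 for the chars). For "Münster", {{writeUTF()}} needs 10 bytes because 'ü' is encoded in 2 bytes, whereas the sketch still writes 9.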
