[gatein-issues] [JBoss JIRA] (GTNCOMMON-24) FastURLDecoder cannot decode surrogate pair characters
Peter Palaga (JIRA)
issues at jboss.org
Fri Feb 27 06:08:49 EST 2015
[ https://issues.jboss.org/browse/GTNCOMMON-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044621#comment-13044621 ]
Peter Palaga edited comment on GTNCOMMON-24 at 2/27/15 6:08 AM:
----------------------------------------------------------------
[~tkonishi], the idea behind the attached test case is to compare the output of {{java.net.URLEncoder}} with the output of {{org.gatein.common.text.FastURLDecoder}}, right? If so, the test seems to do something different: it compares {{URLEncoder.encode(s, "UTF8")}} with {{FastURLDecoder.getUTF8Instance().encode(URLEncoder.encode(s, "UTF8"), out)}}. Note that the second expression encodes the input twice. Did you perhaps want something like this?
{code}
public void testEncodeSurrogatePair() throws Exception
{
   FastURLDecoder encoder = FastURLDecoder.getUTF8Instance();
   CharBuffer out = new CharBuffer();
   StringBuilder sb = new StringBuilder( 2 );
   sb.append( ( char ) 0xD840 );
   sb.append( ( char ) 0xDC0B );
   String hanU2000B = sb.toString(); // U+2000B
   String encodedWithURLEncoder = URLEncoder.encode(hanU2000B, "UTF8");
   encoder.encode(hanU2000B, out);
   assertEquals(encodedWithURLEncoder, out.asString());
}
{code}
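For reference, here is a minimal JDK-only sketch of what the reference side of that comparison should produce for U+2000B, and of how the string changes when it is encoded a second time. The class and helper names ({{SurrogatePairEncodingSketch}}, {{fromCodePoint}}, {{utf8Hex}}, {{urlEncodeUtf8}}) are hypothetical and only serve the illustration; nothing here touches {{FastURLDecoder}} itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of GateIn Common.
class SurrogatePairEncodingSketch {

    // Builds the UTF-16 string for a code point; supplementary code points
    // such as U+2000B become a high/low surrogate pair (two chars).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Returns the UTF-8 bytes of s as uppercase hex, separated by spaces.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Wraps java.net.URLEncoder so callers need not handle the checked exception.
    static String urlEncodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String hanU2000B = fromCodePoint(0x2000B);

        // The pair is 0xD840 0xDC0B, the same chars appended in the test above.
        System.out.println(Integer.toHexString(hanU2000B.charAt(0)));
        System.out.println(Integer.toHexString(hanU2000B.charAt(1)));

        // One code point, four UTF-8 bytes.
        System.out.println(utf8Hex(hanU2000B));

        String once = urlEncodeUtf8(hanU2000B);
        String twice = urlEncodeUtf8(once); // every '%' is escaped again as "%25"
        System.out.println(once);
        System.out.println(twice);
    }
}
```

The second UTF-8 byte of U+2000B is {{A0}}, the same value that shows up in the exception message quoted in the issue description.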
> FastURLDecoder cannot decode surrogate pair characters
> ------------------------------------------------------
>
> Key: GTNCOMMON-24
> URL: https://issues.jboss.org/browse/GTNCOMMON-24
> Project: GateIn Common
> Issue Type: Bug
> Reporter: Takayuki Konishi
> Attachments: surrogatepairtest.patch
>
>
> FastURLDecoder cannot decode surrogate pair characters.
> When I decoded [U+2000B|http://www.fileformat.info/info/unicode/char/2000B/index.htm], I got a MalformedInputException:
> {code}
> org.gatein.common.text.MalformedInputException: Cannot decode char 'A0'
> 	at org.gatein.common.text.FastURLDecoder.safeEncode(FastURLDecoder.java:217)
> 	at org.gatein.common.text.AbstractCharEncoder.encode(AbstractCharEncoder.java:45)
> 	at org.gatein.common.text.AbstractCharEncoder.encode(AbstractCharEncoder.java:62)
> 	at org.gatein.common.text.FastURLDecoderTestCase.testEncodeSurrogatePair(FastURLDecoderTestCase.java:159)
> {code}
> I have also attached a patch containing a test case.