[
https://jira.jboss.org/jira/browse/JBWS-1716?page=com.atlassian.jira.plug...
]
song andy commented on JBWS-1716:
---------------------------------
Hello All,
I found another way to resolve this issue.
Hijack one class of JBossWS.
I dug into the JBossWS code and found that the issue is caused by JAXB marshalling/unmarshalling.
JBossWS uses the Sun implementation of JAXB by default, which takes UTF-8 as its default encoding,
whereas the Sun default stream API takes the machine-level encoding. This garbles the string results
after marshalling/unmarshalling. My approach is to set the JAXB encoding to the machine-level
encoding rather than the default encoding, much as a negative number multiplied by a negative
number gives a positive number. I tried Japanese, Chinese and French, and all of them were
passed successfully to the R&R service.
1. Benefit: it does not affect the MKV system globally, and the risk is manageable.
2. Drawback: the class must be hijacked carefully, and the scope of the encoding
configuration must be controlled.
Class name: org.jboss.ws.core.jaxws.JAXBSerializer (JAXBSerializer.java)
Jars containing that class: jbossws-core.jar, jbossws-client.jar
Erroneous UTF-8 character encoding when marshalling on machines with
non-UTF-8 default encoding
-----------------------------------------------------------------------------------------------
Key: JBWS-1716
URL:
https://jira.jboss.org/jira/browse/JBWS-1716
Project: JBoss Web Services
Issue Type: Bug
Security Level: Public(Everyone can see)
Components: jbossws-jaxrpc, jbossws-native
Affects Versions: jbossws-1.2.1
Reporter: floe fliep
Attachments: a-jbossws1.2.1GA-utf8-patch.jar, JAXBSerializer.java--afterchange
When sending a client request which includes a non-ASCII character such as the
"ç" in "Français" on a machine whose default character encoding is something
other than UTF-8, the character is encoded incorrectly. For example,
the "ç" above is marshalled onto the network stream as 0xC3 0x83
0xC2 0xA7 instead of the legal UTF-8 sequence 0xC3 0xA7 when the machine's
default character set is MS1252 (Windows).
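The byte sequence quoted above can be reproduced outside JBossWS. The following is only a minimal sketch of the double-encoding effect, not JBossWS code: the class and method names are made up for illustration, and it assumes that the JDK charset "windows-1252" corresponds to the MS1252 platform default mentioned in the report.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {
    // Assumption: windows-1252 stands in for the "MS1252" platform default.
    static final Charset PLATFORM = Charset.forName("windows-1252");

    /** Decodes UTF-8 bytes with the wrong charset, then re-encodes the result as UTF-8. */
    public static byte[] misdecodeAndReencode(byte[] utf8Bytes) {
        String misread = new String(utf8Bytes, PLATFORM);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    /** Formats bytes as space-separated hex values, e.g. "0xC3 0xA7". */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u00E7" is the character c-cedilla from "Français".
        byte[] correct = "\u00E7".getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(correct));                       // 0xC3 0xA7
        System.out.println(hex(misdecodeAndReencode(correct))); // 0xC3 0x83 0xC2 0xA7
    }
}
```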
A workaround is to set the system property file.encoding=utf-8, but this causes as
many problems elsewhere as it fixes (especially in the case of legacy platform-specific
file reading).
A forum post very likely describes the same phenomenon:
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4030510#...
After some good hours of stepping through the JBossWS code, I discovered what I guess
must be the culprit in the method XMLFragment.writeSourceInternal(Writer writer):
    ....
    if (reader == null)
        reader = new InputStreamReader(streamSource.getInputStream());
Here streamSource.getInputStream() is an already UTF-8 encoded stream. However, when a
new instance of InputStreamReader is created around it, it will be set to the
machine's default character encoding, thus effectively interpreting bytes from the
UTF-8 stream in a different encoding scheme, resulting in corrupted data.
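For comparison, a reader that does not depend on the platform default can be built by passing the charset explicitly. This is only a sketch of that general technique, assuming the stream really is UTF-8; it is not the actual JBossWS patch, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetReaderDemo {
    /** Reads the whole stream as text, decoding it as UTF-8 whatever the platform default is. */
    public static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        // The two-argument constructor pins the charset instead of using the default.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA7}; // UTF-8 bytes of c-cedilla
        String text = readUtf8(new ByteArrayInputStream(utf8));
        System.out.println(text.equals("\u00E7")); // true
    }
}
```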
Each time data passes through the marshalling, more corruption is added, so the
character count grows increasingly wrong as data is passed back and forth.
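The growth can be illustrated with a small standalone sketch that repeats the wrong decode and re-encode step. It assumes "windows-1252" as the platform default and uses hypothetical names; it does not call any JBossWS code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CorruptionGrowthDemo {
    // Assumption: windows-1252 plays the role of the machine's default charset.
    static final Charset PLATFORM = Charset.forName("windows-1252");

    /** Returns the UTF-8 byte count after the given number of faulty round trips. */
    public static int byteCountAfterPasses(String text, int passes) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < passes; i++) {
            // Misread the UTF-8 bytes in the platform charset, then write UTF-8 again.
            bytes = new String(bytes, PLATFORM).getBytes(StandardCharsets.UTF_8);
        }
        return bytes.length;
    }

    public static void main(String[] args) {
        for (int passes = 0; passes <= 2; passes++) {
            System.out.println(passes + " passes: "
                    + byteCountAfterPasses("\u00E7", passes) + " bytes");
        }
        // For the single character c-cedilla this prints 2, 4 and 8 bytes.
    }
}
```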
I would suggest attaching a reader to the StreamSource source instance var so that it
keeps track of its encoding, but that might break things elsewhere ...