Alessio Soldano updated JBWS-3957:
----------------------------------
Assignee: Alessio Soldano (was: Jim Ma)
publishWsdlImports writes WSDL-XML files with wrong characterset
under Windows
------------------------------------------------------------------------------
Key: JBWS-3957
URL: https://issues.jboss.org/browse/JBWS-3957
Project: JBoss Web Services
Issue Type: Bug
Environment: WildFly 9, Windows 7 with default Java file.encoding Cp1252.
Reporter: Martin Both
Assignee: Alessio Soldano
Fix For: jbossws-cxf-5.2.0.Final
I would like to deploy a web service, but I get an exception. The WSDL XSD file contains a
German character 'ä'. See:
Here is the part of the XSD:
<xs:simpleType name="Teilnehmerart">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Privat"></xs:enumeration>
    <xs:enumeration value="Geschäftlich"></xs:enumeration>
  </xs:restriction>
</xs:simpleType>
The JBoss class org.jboss.ws.common.utils.AbstractWSDLFilePublisher
writes the imported WSDL file using a FileWriter at lines 166 and 167:
FileWriter fw = new FileWriter(targetFile);
wsdlWriter.writeWSDL(subdef, fw);
In a later step this file is read back, which results in an exception:
Invalid byte 2 of 3-byte UTF-8 sequence.
This happens because the file is written in Cp1252 encoding while its XML prolog declares
UTF-8 encoding. The content and the declaration contradict each other, so parsing cannot work.
A workaround would be to set the Java file.encoding to UTF-8, but that is not what I
want.
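The mismatch can be reproduced without JBossWS or WSDL4J. The following is only a minimal sketch: the class name EncodingMismatchDemo and the helper isValidUtf8 are made up for illustration. It encodes an XML fragment containing 'ä' as windows-1252 (the charset behind Cp1252) and checks whether the resulting bytes are well-formed UTF-8. In Cp1252 'ä' is the single byte 0xE4, which a UTF-8 decoder treats as the first byte of a 3-byte sequence; the following 'f' is not a valid continuation byte, which matches the "byte 2 of 3-byte" wording of the exception.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JBossWS or WSDL4J.
class EncodingMismatchDemo {

    // Returns true if the bytes decode as well-formed UTF-8, false otherwise.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The prolog claims UTF-8; the bytes depend on the charset used to encode.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<enumeration value=\"Gesch\u00e4ftlich\"/>";

        byte[] asCp1252 = xml.getBytes(Charset.forName("windows-1252"));
        byte[] asUtf8 = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("Cp1252 bytes valid as UTF-8: " + isValidUtf8(asCp1252));
        System.out.println("UTF-8 bytes valid as UTF-8:  " + isValidUtf8(asUtf8));
    }
}
```

A file consisting only of ASCII characters would pass both checks, which is presumably why the problem only shows up once the schema contains a character such as 'ä'.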
There are two possible solutions:
I think the better one is to create a binary OutputStream instead of a FileWriter.
The wsdlWriter offers two methods. The method with an OutputStream argument would always
write XML files using UTF-8 encoding.
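To illustrate why writing to a byte stream avoids the problem, here is a small round trip using only JDK classes. This is a sketch under assumptions, not the JBossWS fix itself: the JAXP Transformer stands in for the WSDL4J writer, and the class name Utf8XmlRoundTrip and its methods are invented for the example. The point is that the serializer, not the platform default encoding, decides how characters become bytes, so the prolog and the bytes agree.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; JAXP is used here in place of WSDL4J.
class Utf8XmlRoundTrip {

    // Serializes <enumeration value="..."/> to bytes through an OutputStream.
    static byte[] writeXml(String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element element = doc.createElement("enumeration");
            element.setAttribute("value", value);
            doc.appendChild(element);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // The serializer encodes the characters itself, independent of file.encoding.
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the bytes back and returns the "value" attribute of the root element.
    static String readValue(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getAttribute("value");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "Gesch\u00e4ftlich";
        String roundTripped = readValue(writeXml(original));
        System.out.println("Round trip preserved value: " + original.equals(roundTripped));
    }
}
```

In AbstractWSDLFilePublisher the analogous change would presumably be to open a FileOutputStream on targetFile and pass it to wsdlWriter.writeWSDL in place of the FileWriter, closing the stream afterwards.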
Another way to fix this would be a bug fix in WSDL4J 1.6.3.
The WSDL4J method com.ibm.wsdl.xml.WSDLWriterImpl.writeWSDL(Definition wsdlDef, Writer
sink) maps the default Java file.encoding to "UTF-8", which is sometimes wrong.
The class com.ibm.wsdl.util.xml.DOM2Writer should have an XmlEncodingMapping from
"Cp1252" to "Windows-1252". That would also fix the problem, but I
think it would be better to always use UTF-8 via the OutputStream method.
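For the mapping idea, the JDK already knows that "Cp1252" is an alias of the charset whose canonical java.nio name is "windows-1252". The sketch below only shows that lookup; it is not how DOM2Writer is implemented, and the class name XmlEncodingName and the method canonicalName are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of WSDL4J.
class XmlEncodingName {

    // Resolves a Java charset name or alias to its canonical java.nio name.
    static String canonicalName(String javaEncoding) {
        return Charset.forName(javaEncoding).name();
    }

    public static void main(String[] args) {
        System.out.println("Cp1252 -> " + canonicalName("Cp1252"));
        System.out.println("UTF8   -> " + canonicalName("UTF8"));
    }
}
```

Declaring the resolved name in the XML prolog would make the declaration match the bytes a FileWriter produces, but the files would then differ in encoding from one platform to another, which is a reason to prefer always writing UTF-8.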
Best regards
Martin Both