[jbossws-issues] [JBoss JIRA] (JBWS-3957) publishWsdlImports writes WSDL-XML files with wrong characterset under Windows

Jim Ma (JIRA) issues at jboss.org
Thu Oct 22 01:05:00 EDT 2015


    [ https://issues.jboss.org/browse/JBWS-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120770#comment-13120770 ] 

Jim Ma commented on JBWS-3957:
------------------------------

Hi Martin,
Yes. As you said, wsdl4j doesn't support the Cp1252 encoding:
{code:java}
  static
  {
    xmlEncodingMap.put(null, Constants.XML_DECL_DEFAULT);
    xmlEncodingMap.put(System.getProperty("file.encoding"),
                       Constants.XML_DECL_DEFAULT);
    xmlEncodingMap.put("UTF8", "UTF-8");
    xmlEncodingMap.put("UTF-16", "UTF-16");
    xmlEncodingMap.put("UnicodeBig", "UTF-16");
    xmlEncodingMap.put("UnicodeLittle", "UTF-16");
    xmlEncodingMap.put("ASCII", "US-ASCII");
    xmlEncodingMap.put("ISO8859_1", "ISO-8859-1");
    xmlEncodingMap.put("ISO8859_2", "ISO-8859-2");
    xmlEncodingMap.put("ISO8859_3", "ISO-8859-3");
    xmlEncodingMap.put("ISO8859_4", "ISO-8859-4");
    xmlEncodingMap.put("ISO8859_5", "ISO-8859-5");
    xmlEncodingMap.put("ISO8859_6", "ISO-8859-6");
    xmlEncodingMap.put("ISO8859_7", "ISO-8859-7");
    xmlEncodingMap.put("ISO8859_8", "ISO-8859-8");
    xmlEncodingMap.put("ISO8859_9", "ISO-8859-9");
    xmlEncodingMap.put("ISO8859_13", "ISO-8859-13");
    xmlEncodingMap.put("ISO8859_15_FDIS", "ISO-8859-15");
    xmlEncodingMap.put("GBK", "GBK");
    xmlEncodingMap.put("Big5", "Big5");
  }  
{code}
What code do you use to read this WSDL? If the encoding in the XML prolog is set to "UTF-8", the underlying encoding should also be set to "UTF-8" when reading it.
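
To illustrate the mismatch, here is a minimal sketch that uses only the JDK. The class and method names are made up for this example and are not part of wsdl4j or JBossWS. It encodes the value from the issue in windows-1252 (Cp1252) and then decodes the bytes strictly as UTF-8. In Cp1252 the character 'ä' is the single byte 0xE4, which a UTF-8 decoder treats as the start of a 3-byte sequence, so the following 'f' is rejected. That is consistent with the "Invalid byte 2 of 3-byte UTF-8 sequence" error in the report.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of wsdl4j or JBossWS.
class EncodingMismatchDemo {

    // Decode strictly: malformed input raises an exception instead of
    // being silently replaced with U+FFFD.
    static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // True if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            decodeStrict(bytes, StandardCharsets.UTF_8);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00e4 is 'ä'; the escape keeps this source file encoding-independent.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<v>Gesch\u00e4ftlich</v>";

        // What a FileWriter produces when file.encoding is Cp1252.
        byte[] cp1252Bytes = xml.getBytes(Charset.forName("windows-1252"));
        // What the prolog promises.
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("Cp1252 bytes valid as UTF-8: " + isValidUtf8(cp1252Bytes));
        System.out.println("UTF-8 bytes valid as UTF-8:  " + isValidUtf8(utf8Bytes));
    }
}
```

The first check reports false and the second true, so a reader that trusts the prolog can only succeed if the bytes on disk were really written as UTF-8.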




> publishWsdlImports writes WSDL-XML files with wrong characterset under Windows
> ------------------------------------------------------------------------------
>
>                 Key: JBWS-3957
>                 URL: https://issues.jboss.org/browse/JBWS-3957
>             Project: JBoss Web Services
>          Issue Type: Bug
>         Environment: WildFly 9, Windows 7 with default Java file.encoding Cp1252.
>            Reporter: Martin Both
>            Assignee: Jim Ma
>             Fix For: jbossws-cxf-5.2.0.Final
>
>
> I would like to deploy a WebService but I get an exception. The WSDL XSD file contains a German character 'ä'.
> Here is the relevant part of the XSD:
>   <xs:simpleType name="Teilnehmerart">
>     <xs:restriction base="xs:string">
>       <xs:enumeration value="Privat"></xs:enumeration>
>       <xs:enumeration value="Geschäftlich"></xs:enumeration>
>     </xs:restriction>
>   </xs:simpleType>
> The JBoss class org.jboss.ws.common.utils.AbstractWSDLFilePublisher
> writes the imported WSDL file using a FileWriter at lines 166 and 167:
>                 FileWriter fw = new FileWriter(targetFile);
>                 wsdlWriter.writeWSDL(subdef, fw);
> In the next step this file is read again, which results in an exception:
>  Invalid byte 2 of 3-byte UTF-8 sequence.
> This is because the file is written in Cp1252 encoding but with an XML prolog
> that declares UTF-8 encoding. That is crazy and cannot work.
> A workaround would be to set the Java file.encoding to UTF-8 but that is not what I want.
> There are two solutions:
> I think it is better to create a binary OutputStream instead of creating a FileWriter.
> The wsdlWriter offers two methods! The method with OutputStream argument would always write XML files using a UTF-8 encoding.
> Another way to fix this would be a bug fix in WSDL4J 1.6.3.
> The WSDL4J method com.ibm.wsdl.xml.WSDLWriterImpl.writeWSDL(Definition wsdlDef, Writer sink) maps the default Java file.encoding to "UTF-8". That is sometimes wrong.
> The class com.ibm.wsdl.util.xml.DOM2Writer should have an XML encoding mapping from "Cp1252" to "Windows-1252". That would also fix the problem. But I think it would be better to always use UTF-8 with the OutputStream method.
> Best regards
> Martin Both
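
The first option in the description quoted above (write through a byte stream with an explicit charset instead of a FileWriter) could look roughly like the sketch below. It is only an illustration under stated assumptions: writeDocument is a hypothetical stand-in for wsdlWriter.writeWSDL(subdef, writer), and the class and method names are invented here, not taken from AbstractWSDLFilePublisher.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; names are not from JBossWS or wsdl4j.
class Utf8PublishSketch {

    // Stand-in for wsdlWriter.writeWSDL(subdef, writer): emits a prolog
    // that declares UTF-8, followed by the document body.
    static void writeDocument(String body, Writer sink) throws IOException {
        sink.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sink.write(body);
    }

    // Writes the file with an explicit UTF-8 encoder, so the bytes on disk
    // match the prolog regardless of the platform's file.encoding.
    static void publish(String body, Path targetFile) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                Files.newOutputStream(targetFile), StandardCharsets.UTF_8)) {
            writeDocument(body, writer);
        }
    }

    // Round trip through a temporary file, for demonstration only.
    static byte[] publishToTempFile(String body) {
        try {
            Path tmp = Files.createTempFile("publish-sketch", ".xsd");
            try {
                publish(body, tmp);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e4 is 'ä'.
        byte[] bytes = publishToTempFile("<v>Gesch\u00e4ftlich</v>");
        String readBack = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(readBack.contains("Gesch\u00e4ftlich"));
    }
}
```

With real wsdl4j code, handing the FileOutputStream directly to the writeWSDL overload that takes an OutputStream, as suggested in the description, would avoid constructing the Writer by hand.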



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)


