[jbossws-issues] [JBoss JIRA] (JBWS-3957) publishWsdlImports writes WSDL-XML files with wrong characterset under Windows

Jim Ma (JIRA) issues at jboss.org
Wed Oct 28 02:51:00 EDT 2015


    [ https://issues.jboss.org/browse/JBWS-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122743#comment-13122743 ] 

Jim Ma commented on JBWS-3957:
------------------------------

Hi Martin, 
Thanks for your quick response. Cp1252 is a standard XML encoding, so I think WSDL4J should support it. As for writing and then reading the WSDL, this behavior is specified in JSR 109 (Web Services for Java EE), which requires the WSDL to be published after endpoint deployment. Since WSDL4J is updated slowly (the latest release was two years ago), we can look at what we can do in the jbossws code to fix this issue. I don't have a Windows machine with Cp1252 at hand; can you please paste the stack trace that contains the error "Invalid byte 2 of 3-byte UTF-8 sequence"? I'll see whether we can set something manually to fix the read.
Thanks,
Jim

> publishWsdlImports writes WSDL-XML files with wrong characterset under Windows
> ------------------------------------------------------------------------------
>
>                 Key: JBWS-3957
>                 URL: https://issues.jboss.org/browse/JBWS-3957
>             Project: JBoss Web Services
>          Issue Type: Bug
>         Environment: WildFly 9, Windows 7 with default Java file.encoding Cp1252.
>            Reporter: Martin Both
>            Assignee: Jim Ma
>             Fix For: jbossws-cxf-5.2.0.Final
>
>
> I would like to deploy a WebService but I get an exception. The WSDL XSD file contains a German character 'ä'.
> Here is the relevant part of the XSD:
>   <xs:simpleType name="Teilnehmerart">
>     <xs:restriction base="xs:string">
>       <xs:enumeration value="Privat"></xs:enumeration>
>       <xs:enumeration value="Geschäftlich"></xs:enumeration>
>     </xs:restriction>
>   </xs:simpleType>
> The JBoss class org.jboss.ws.common.utils.AbstractWSDLFilePublisher
> writes the imported WSDL file using a FileWriter at lines 166 and 167:
>                 FileWriter fw = new FileWriter(targetFile);
>                 wsdlWriter.writeWSDL(subdef, fw);
> In the next step this file is read again, which results in an exception:
>  Invalid byte 2 of 3-byte UTF-8 sequence.
> This is because the file is written in Cp1252 encoding but carries an XML prolog
> declaring UTF-8 encoding. That is crazy and cannot work.
> A workaround would be to set the Java file.encoding to UTF-8 but that is not what I want.
> There are two solutions:
> I think it is better to create a binary OutputStream instead of creating a FileWriter.
> The wsdlWriter offers two methods! The method with the OutputStream argument would always write XML files using UTF-8 encoding.
> Another way to fix this would be a bug fix in WSDL4J 1.6.3.
> The WSDL4J method com.ibm.wsdl.xml.WSDLWriterImpl.writeWSDL(Definition wsdlDef, Writer sink) maps the default Java file.encoding to "UTF-8". That is sometimes wrong.
> The class com.ibm.wsdl.util.xml.DOM2Writer should have an XmlEncodingMapping from "Cp1252" to "Windows-1252". That would also fix the problem, but I think it would be better to always use UTF-8 with the OutputStream method.
> Best regards
> Martin Both
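
The byte-level cause of the error quoted above can be illustrated with a small sketch. It is not taken from jbossws or WSDL4J; the class name EncodingMismatchDemo and the helper isValidUtf8 are hypothetical names chosen for illustration. In Cp1252 the character 'ä' is the single byte 0xE4, which a UTF-8 decoder treats as the lead byte of a 3-byte sequence. The next byte ('f') is not a continuation byte, so decoding fails at byte 2 of that sequence.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of jbossws or WSDL4J.
public class EncodingMismatchDemo {

    /** Returns true if the given bytes form well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "Geschäftlich" from the XSD; the escape avoids source-encoding issues.
        String value = "Gesch\u00e4ftlich";

        // What a FileWriter produces when file.encoding is Cp1252.
        byte[] cp1252 = value.getBytes(Charset.forName("windows-1252"));
        // What the XML prolog (encoding="UTF-8") promises.
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);

        System.out.printf("Cp1252 byte for the umlaut: 0x%02X%n", cp1252[5] & 0xFF);
        System.out.println("Cp1252 bytes valid as UTF-8: " + isValidUtf8(cp1252));
        System.out.println("UTF-8 bytes valid as UTF-8:  " + isValidUtf8(utf8));
    }
}
```

The strict decoder is used here only to make the mismatch visible; an XML parser performs an equivalent check when it reads a file whose prolog declares UTF-8.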

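The OutputStream-based approach proposed in the issue description can be sketched with JDK classes only. This is a minimal illustration, not the actual jbossws patch: the class Utf8XmlWriteSketch, its methods, and the one-element XML document are hypothetical stand-ins for the WSDL that wsdlWriter would produce. The point is that the charset is fixed explicitly when the bytes are written, so the result no longer depends on the platform file.encoding.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical sketch class, not part of jbossws or WSDL4J.
public class Utf8XmlWriteSketch {

    // Stand-in for the published WSDL: the prolog declares UTF-8.
    public static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<enumeration value=\"Gesch\u00e4ftlich\"/>";

    /** Writes text with an explicit charset, independent of file.encoding. */
    public static void write(File target, String text, Charset charset) throws IOException {
        try (Writer w = new OutputStreamWriter(new FileOutputStream(target), charset)) {
            w.write(text);
        }
    }

    /** Parses the file and returns the root element's "value" attribute. */
    public static String readValue(File source) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        return doc.getDocumentElement().getAttribute("value");
    }

    public static void main(String[] args) throws Exception {
        // Bytes match the prolog: the file can be read back.
        File ok = File.createTempFile("wsdl-utf8", ".xml");
        ok.deleteOnExit();
        write(ok, XML, StandardCharsets.UTF_8);
        System.out.println("Read back: " + readValue(ok));

        // Simulates FileWriter on a Cp1252 platform: bytes contradict the prolog.
        File broken = File.createTempFile("wsdl-cp1252", ".xml");
        broken.deleteOnExit();
        write(broken, XML, Charset.forName("windows-1252"));
        try {
            readValue(broken);
        } catch (Exception e) {
            System.out.println("Parse failed: " + e.getMessage());
        }
    }
}
```

In AbstractWSDLFilePublisher the corresponding change would presumably be to pass a FileOutputStream to the OutputStream overload of writeWSDL instead of a FileWriter, on the assumption (stated in the report) that this overload always emits UTF-8.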


--
This message was sent by Atlassian JIRA
(v6.4.11#64026)