[gatein-dev] Setting URIEncoding to UTF-8 in server.xml

Julien Viet julien at julienviet.com
Wed Mar 17 15:20:57 EDT 2010


Hi Matt

In WCI there is code that decodes query parameters using the UTF-8 charset:

http://anonsvn.jboss.org/repos/gatein/components/wci/trunk/wci/src/main/java/org/gatein/wci/util/RequestDecoder.java

It delegates most of the work to:
org.gatein.common.http.QueryStringParser
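
To illustrate the general idea of decoding query parameters in application code rather than in the servlet container, here is a minimal sketch. It is not the WCI RequestDecoder or QueryStringParser code; the class name Utf8QueryDecoder is hypothetical. It assumes the raw query string is available, for example from HttpServletRequest.getQueryString(), which returns the value without container decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the WCI implementation.
final class Utf8QueryDecoder {

    // Parses a raw (still percent-encoded) query string and decodes every
    // name and value as UTF-8, regardless of the container configuration.
    // If a name is repeated, the last value wins in this simplified sketch.
    static Map<String, String> decode(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decodeUtf8(name), decodeUtf8(value));
        }
        return params;
    }

    private static String decodeUtf8(String s) {
        try {
            // URLDecoder also turns '+' into a space, as expected for query strings.
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> p = decode("name=th%C3%BAy&tab=a+b");
        System.out.println("th\u00fay".equals(p.get("name"))); // true
        System.out.println("a b".equals(p.get("tab")));        // true
    }
}
```

Because the charset is fixed in the code, the result does not depend on the URIEncoding setting of the container.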

(In some form, the WCI code, WebRequest and WebResponse, should be used
in GateIn in a later feature release, as it provides value-added
services on top of the HTTP request/response.)

On Mar 17, 2010, at 7:36 PM, Matthew Wringe wrote:

> On Wed, 2010-03-17 at 18:59 +0100, Julien Viet wrote:
>> Actually, it is servlet-container specific, so we want to avoid it.
>
> ok, so that means we want to handle it in the portal code completely
> then?
>
>> Where does the encoding issue occur?
>
> It occurs when the URL is encoded with one charset but decoded with
> another. When the servlet container decodes something from the URL
> (pathInfo, query string parameters, etc.), it uses whatever encoding
> it has been configured to use (ISO-8859-1 by default, or another
> value configured in the server.xml).
>
> So we have the situation where we set something like 'thúy' in the
> URL as UTF-8, where the ú is encoded as two bytes. The servlet
> container decodes those bytes as ISO-8859-1, treating each byte as a
> separate character, and the string value we get back is 'thÃºy'
> instead of 'thúy'.
>
> If we set the URL encoding to UTF-8 in the server.xml, then we don't
> get these encoding issues, since everything is handled in the same
> encoding.
>
> For the portal naming issue, I found a way to get around that by using
> values not decoded by the servlet container. For the dashboard naming
> issue, it might be a bit more complex to handle.
>
> I really wish the javax.servlet classes/spec were written to handle
> encoding issues more nicely. Some methods allow you to get the
> non-decoded values, others don't. The character encoding setting only
> applies to non-query-string properties, etc.
>
>
>> On Mar 17, 2010, at 4:46 PM, Matthew Wringe wrote:
>>
>>> There have been a few issues brought up regarding portal and tab
>>> names not working properly if they contain non-ASCII characters.
>>>
>>> https://jira.jboss.org/jira/browse/GTNPORTAL-337
>>> https://jira.jboss.org/jira/browse/GTNPORTAL-596
>>>
>>> These are due to the portal using utf-8, while tomcat/jboss is
>>> configured to use iso-8859-1 encoding for the url.
>>>
>>> In both cases, if the URIEncoding is set to UTF-8 in the server.xml
>>> (or if useBodyEncodingForURI is set to true), then it works.
>>>
>>> For the portal naming issue it was possible to get around it in the
>>> code; for the tab issue it's going to be a bit more complex.
>>>
>>> Can we just change the URIEncoding in the server.xml for the bundles
>>> we create? Or would it be better to find workarounds for this issue
>>> in the portal code itself?
>>>
>>
>
>
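
To make the mismatch described in the quoted thread concrete, here is a minimal sketch that percent-encodes 'thúy' as UTF-8 and then decodes it as ISO-8859-1, which is what happens when the container default is left in place. The class name EncodingMismatchDemo and its methods are hypothetical. The repair method shows one commonly used workaround (re-interpreting the misdecoded characters as bytes); it is not necessarily the approach taken in the portal code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class for illustration only.
final class EncodingMismatchDemo {

    // Percent-encodes text with one charset and decodes it with another.
    static String roundTrip(String text, String encodeCharset, String decodeCharset) {
        try {
            String escaped = URLEncoder.encode(text, encodeCharset);
            return URLDecoder.decode(escaped, decodeCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recovers UTF-8 text that was wrongly decoded as ISO-8859-1. This works
    // because ISO-8859-1 maps every byte to exactly one character, so the
    // original bytes can be recovered without loss.
    static String repair(String misdecoded) {
        try {
            return new String(misdecoded.getBytes("ISO-8859-1"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "th\u00fay"; // 'thúy', written with an escape to avoid source-encoding issues
        String garbled = roundTrip(original, "UTF-8", "ISO-8859-1");

        // The single character U+00FA comes back as the two characters U+00C3 U+00BA.
        System.out.println(garbled.equals("th\u00c3\u00bay")); // true
        System.out.println(repair(garbled).equals(original));  // true

        // With matching charsets on both sides there is nothing to repair.
        System.out.println(roundTrip(original, "UTF-8", "UTF-8").equals(original)); // true
    }
}
```

The repair step is only safe when the container really did decode with ISO-8859-1; applying it to a string that was already decoded as UTF-8 would corrupt it.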



