[gatein-dev] Setting URIEncoding to UTF-8 in server.xml
Julien Viet
julien at julienviet.com
Wed Mar 17 15:20:57 EDT 2010
Hi Matt,
in WCI there is code that decodes query parameters using the UTF-8 charset:
http://anonsvn.jboss.org/repos/gatein/components/wci/trunk/wci/src/main/java/org/gatein/wci/util/RequestDecoder.java
It delegates most of the job to:
org.gatein.common.http.QueryStringParser
(At some point the WCI code, WebRequest and WebResponse, should be used
in GateIn in a later feature release, as it provides value-added services
on top of HTTP request/response.)
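
To illustrate the idea (this is only a sketch, not the actual RequestDecoder or QueryStringParser code): the raw query string, as returned undecoded by HttpServletRequest.getQueryString(), can be split and percent-decoded with an explicit UTF-8 charset, so the result no longer depends on how the container is configured. The class and method names below are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; this is NOT the WCI
// RequestDecoder or QueryStringParser implementation.
class Utf8QueryParser {

    // Percent-decodes one name or value, always as UTF-8.
    static String decode(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Parses a raw (still percent-encoded) query string into name -> values.
    static Map<String, List<String>> parse(String rawQuery) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = eq < 0 ? "" : decode(pair.substring(eq + 1));
            List<String> values = params.get(name);
            if (values == null) {
                values = new ArrayList<>();
                params.put(name, values);
            }
            values.add(value);
        }
        return params;
    }
}
```

With this, a raw query string such as "name=th%C3%BAy" yields the value 'thúy' whatever URIEncoding the container uses.
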
On Mar 17, 2010, at 7:36 PM, Matthew Wringe wrote:
> On Wed, 2010-03-17 at 18:59 +0100, Julien Viet wrote:
>> actually it is servlet container specific so we want to avoid it.
>
> ok, so that means we want to handle it in the portal code completely
> then?
>
>> where does the encoding issue occur ?
>
> It occurs when the url is encoded using one encoding, but we are
> expecting another. So when the servlet container handles decoding
> something from the url (pathInfo, properties from query strings, etc...)
> it will decode it using whatever encoding it has been configured to use
> (iso-8859-1 by default, or another value configured in the server.xml).
>
> So we have the situation where we set something like 'thúy' in the url
> (where the ú is encoded as two bytes in utf-8), the servlet container
> takes those bytes and decodes each of them as a separate iso-8859-1
> character, and then the string value we get back is 'thÃºy'.
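
The mismatch described above can be reproduced outside a servlet container. The following is a minimal sketch with a made-up class name: it encodes the name as UTF-8 (as the portal does for the url) and decodes the same bytes as ISO-8859-1 (as a container with default settings would), then shows that the damage is reversible as long as the raw bytes are intact.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of GateIn or WCI.
class UriEncodingMismatch {

    // Simulates a container that decodes UTF-8 url bytes as ISO-8859-1.
    static String decodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "th\u00FAy"; // 'thuy' with u-acute, written as an escape
        String garbled = decodeAsLatin1(name);
        // The two UTF-8 bytes of u-acute have become two separate characters.
        System.out.println(name.length() + " -> " + garbled.length());
        System.out.println(repair(garbled).equals(name));
    }
}
```

The repair step works only because ISO-8859-1 maps every byte to a character; it is shown to explain the symptom, not as a recommended fix.
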
>
> If we set the url encoding to be UTF-8 in the server.xml, then we don't
> get these encoding issues since everything is handled in the same
> encoding.
>
> For the portal naming issue, I found a way to get around that by using
> values not decoded by the servlet container. For the dashboard naming
> issue, it might be a bit more complex to handle.
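
As a rough sketch of that kind of workaround (the actual GateIn change is not shown in this thread, and the names below are hypothetical): take a value the container has not decoded, for example the result of HttpServletRequest.getRequestURI(), and percent-decode it explicitly as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper; the real portal workaround may look quite different.
class RawPathDecoder {

    // rawUri is expected to be an undecoded request URI, such as the value
    // returned by HttpServletRequest.getRequestURI().
    static String decodeUtf8(String rawUri) {
        try {
            // URLDecoder turns '+' into a space (form encoding). A '+' has no
            // special meaning in a path, so protect literal plus signs first.
            return URLDecoder.decode(rawUri.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }
}
```

Note that decoding the whole path at once also turns an encoded %2F into a '/', so code that cares about segment boundaries would need to split the raw path before decoding.
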
>
> I really wish the javax.servlet classes/spec were written better to
> handle encoding issues more nicely. Some methods allow you to get the
> non-decoded values, others don't. Character encoding only works on
> non-querystring properties, etc...
>
>
>> On Mar 17, 2010, at 4:46 PM, Matthew Wringe wrote:
>>
>>> There have been a few issues brought up regarding portal and tab
>>> names not working properly if they contain non-ascii characters.
>>>
>>> https://jira.jboss.org/jira/browse/GTNPORTAL-337
>>> https://jira.jboss.org/jira/browse/GTNPORTAL-596
>>>
>>> These are due to the portal using utf-8, while tomcat/jboss is
>>> configured to use iso-8859-1 encoding for the url.
>>>
>>> In both cases if the URIEncoding is set to UTF-8 in the server.xml
>>> (or if useBodyEncodingForURI is set to true) then it works.
>>>
>>> For the portal naming issue it was possible to get around it in the
>>> code, for the tab issue it's going to be a bit more complex.
>>>
>>> Can we just change the URIEncoding in the server.xml for the bundles
>>> we create? Or would it be better to find workarounds for this issue
>>> in the portal code itself?
>>>
>>
>
>