[gatein-dev] Setting URIEncoding to UTF-8 in server.xml

Matthew Wringe mwringe at redhat.com
Wed Mar 17 15:48:16 EDT 2010


On Wed, 2010-03-17 at 20:20 +0100, Julien Viet wrote:
> Hi Matt
> 
> in WCI there is code to decode query parameters using UTF-8 charset:
> 
> http://anonsvn.jboss.org/repos/gatein/components/wci/trunk/wci/src/main/java/org/gatein/wci/util/RequestDecoder.java
> 
> that delegates most of the job to:
> org.gatein.common.http.QueryStringParser

hmm, ok, so it handles the querystring parameters directly from the url
and not from the request properties. Which means we need to bypass the
request parameters themselves to add this feature. We can do this in the
PortalRequestContext [or maybe we should wait and add this feature at
a later date, with wci?]
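
To make the idea of bypassing the container's parameter decoding concrete, here is a minimal sketch of parsing the raw query string as UTF-8. It is not the WCI RequestDecoder or QueryStringParser code. The class name Utf8QueryParser and its parse method are hypothetical, and the sketch assumes the raw string comes from HttpServletRequest.getQueryString(), which the servlet API documents as not decoded by the container. It needs Java 10 or later for URLDecoder.decode(String, Charset).

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of GateIn or WCI.
final class Utf8QueryParser {

    // Parses a raw (still percent-encoded) query string, always decoding as UTF-8.
    // Throws IllegalArgumentException if a percent-escape is malformed.
    static Map<String, List<String>> parse(String rawQuery) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            // URLDecoder also turns '+' into a space, as expected in query strings.
            String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    public static void main(String[] args) {
        // In a servlet this argument would be request.getQueryString().
        System.out.println(parse("name=th%C3%BAy&tab=a+b"));
    }
}
```

Because the raw string never passes through the container's decoder, the result does not depend on the URIEncoding setting in server.xml.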

> (somehow the code of WCI: WebRequest and WebResponse should be used in
> GateIn in a later feature release, as it provides value-added services
> on top of HTTP request/response)

yeah, it probably should, that code isn't being used right now

Thanks

> 
> On Mar 17, 2010, at 7:36 PM, Matthew Wringe wrote:
> 
> > On Wed, 2010-03-17 at 18:59 +0100, Julien Viet wrote:
> >> actually it is servlet container specific so we want to avoid it.
> >
> > ok, so that means we want to handle it in the portal code completely
> > then?
> >
> >> where does the encoding issue occur ?
> >
> > It occurs when the url is encoded with one charset, but we are
> > expecting another. So when the servlet container handles decoding
> > something from the url (pathInfo, properties from query strings, etc...)
> > it will decode it using whatever encoding it has been configured to use
> > (iso-8859-1 by default, or another value configured in the server.xml).
> >
> > So we have the situation where we set something like 'thúy' in the
> > url as utf-8 (where the ú is encoded as two bytes), the servlet
> > container decodes those bytes as iso-8859-1 (one character per byte),
> > and the string value we get back is 'thÃºy' (the two bytes of the ú
> > come back as two separate characters).
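
The mismatch described above can be reproduced outside a servlet container. The following is a minimal sketch, assuming the URL bytes are UTF-8 and the container decodes them as ISO-8859-1. The class and method names are hypothetical, and Unicode escapes are used so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of GateIn.
final class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes the bytes as ISO-8859-1,
    // which is what a container with the default URI encoding would do.
    static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage: ISO-8859-1 maps every byte to exactly one
    // character, so the original bytes can be recovered and re-decoded.
    static String repair(String misdecoded) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "th\u00fay";                      // 'thúy', 4 characters
        String broken = decodeUtf8BytesAsLatin1(name);  // 5 characters
        System.out.println(name.length() + " -> " + broken.length());
        System.out.println(repair(broken).equals(name));
    }
}
```

The repair step only works when the wrong decoder was ISO-8859-1 and the original bytes were valid UTF-8, so it is a workaround rather than a substitute for decoding with the right charset in the first place.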
> >
> > If we set the url encoding to be UTF-8 in the server.xml, then we don't
> > get these encoding issues since everything is handled in the same
> > encoding.
> >
> > For the portal naming issue, I found a way to get around that by using
> > values not decoded by the servlet container. For the dashboard naming
> > issue, it might be a bit more complex to handle.
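
The message does not say which non-decoded value was used for the portal name. As one possible illustration, the sketch below assumes the raw path comes from HttpServletRequest.getRequestURI(), which the servlet API documents as not decoded by the container, and decodes it as UTF-8 with java.net.URI. The class RawPathDecoder is hypothetical.

```java
import java.net.URI;

// Hypothetical helper, not part of GateIn.
final class RawPathDecoder {

    // Decodes percent-escapes in a raw request path as UTF-8.
    // URI.getPath() always interprets escaped octets as UTF-8, regardless
    // of how the servlet container is configured, and leaves '+' alone.
    // URI.create throws IllegalArgumentException for a malformed URI.
    static String decodePath(String rawRequestUri) {
        return URI.create(rawRequestUri).getPath();
    }

    public static void main(String[] args) {
        // In a servlet this argument would be request.getRequestURI().
        String decoded = decodePath("/portal/private/th%C3%BAy");
        System.out.println(decoded.endsWith("th\u00fay"));
    }
}
```

URLDecoder is deliberately not used here, because it converts '+' to a space, which is correct for query strings but not for path segments.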
> >
> > I really wish the javax.servlet classes/spec were written better to
> > handle encoding issues more nicely. Some methods allow you to get the
> > non-decoded values, others don't. Setting the character encoding only
> > works on non-querystring properties, etc...
> >
> >
> >> On Mar 17, 2010, at 4:46 PM, Matthew Wringe wrote:
> >>
> >>> There have been a few issues which have been brought up in regards to
> >>> portal and tab names not working properly if they contain non-ASCII
> >>> characters.
> >>>
> >>> https://jira.jboss.org/jira/browse/GTNPORTAL-337
> >>> https://jira.jboss.org/jira/browse/GTNPORTAL-596
> >>>
> >>> These are due to the portal using utf-8, while tomcat/jboss is
> >>> configured to use iso-8859-1 encoding for the url.
> >>>
> >>> In both cases if the URIEncoding is set to UTF-8 in the server.xml (or
> >>> if useBodyEncodingForURI is set to true) then it works.
> >>>
> >>> For the portal naming issue it was possible to get around it in the
> >>> code, for the tab issue it's going to be a bit more complex.
> >>>
> >>> Can we just change the URIEncoding in the server.xml for the bundles we
> >>> create? Or would it be better to find workarounds for this issue in the
> >>> portal code itself?
> >>>
> >>> _______________________________________________
> >>> gatein-dev mailing list
> >>> gatein-dev at lists.jboss.org
> >>> https://lists.jboss.org/mailman/listinfo/gatein-dev
> >>
> >
> >
> 



