Thanks for your fast response.

It fails in Edge and Firefox too, so unfortunately it is not just Chrome.

I agree with you that the IIS server is violating the spec: it should never send UTF-8 encoded characters in a header, but it does so anyway. It works with Chrome, Firefox and Edge when using HTTP/1.1 (they must autodetect the UTF-8 encoding), but with HTTP/2 it fails somewhere at the transport level.

I guess that Undertow should not really try to guess a codepage and encode the value differently, so there is probably nothing else to do. I just wanted to check whether another bug could have caused the protocol violation error, whether you know of the HTTP/2 spec mentioning anything about this, or whether Undertow does something special with non-ASCII characters.

I'll try encoding the header if I detect non-ASCII characters in it and see how it goes...
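
One way such a detect-and-encode step could look in Java is sketched here. It is not Undertow code: the class and method names are hypothetical, and it assumes the proxy sees the header value as a String in which each char holds one raw byte of the UTF-8 data sent by the backend. It rewrites the filename into the `filename*=UTF-8''...` form defined for Content-Disposition (RFC 6266, RFC 5987), percent-encoding conservatively.

```java
import java.nio.charset.StandardCharsets;

final class HeaderEncodingSketch {

    /** Returns true if the value contains any character outside US-ASCII. */
    static boolean hasNonAscii(String value) {
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) > 0x7F) {
                return true;
            }
        }
        return false;
    }

    /** Percent-encodes every byte except ASCII letters, digits, '-', '.', '_' and '~'. */
    static String percentEncode(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int c = b & 0xFF;
            boolean safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
            if (safe) {
                sb.append((char) c);
            } else {
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    /**
     * Hypothetical helper: builds a Content-Disposition value for a filename.
     * Assumption: each char of rawFilename carries one raw byte from the wire,
     * so getBytes(ISO_8859_1) recovers the original UTF-8 byte sequence.
     */
    static String contentDisposition(String rawFilename) {
        if (!hasNonAscii(rawFilename)) {
            // Pure ASCII: a quoted-string is enough; escape backslash and quote.
            String escaped = rawFilename.replace("\\", "\\\\").replace("\"", "\\\"");
            return "attachment; filename=\"" + escaped + "\"";
        }
        byte[] utf8Bytes = rawFilename.getBytes(StandardCharsets.ISO_8859_1);
        return "attachment; filename*=UTF-8''" + percentEncode(utf8Bytes);
    }

    public static void main(String[] args) {
        // "\u00C3\u00A5" is the UTF-8 byte pair C3 A5 (the letter å) seen as two chars.
        System.out.println(contentDisposition("bl\u00C3\u00A5.txt"));
    }
}
```

With the sample input, the two odd characters come out as `%C3%A5`, so the forwarded header contains only ASCII. If the assumption about how the bytes reach the String does not hold, the ISO-8859-1 step would have to change.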

Below is the information from a trace in my proxy. Sorry for the image, but it doesn't cut and paste well. The odd characters in the Content-Disposition field are the UTF-8 encoded version of the Danish letter "å" (&aring; in HTML).




2018-07-05 3:10 GMT+02:00 Stuart Douglas <sdouglas@redhat.com>:


On Wed, Jul 4, 2018 at 6:15 PM Kim Rasmussen <kr@asseco.dk> wrote:
Hi,

I have a setup where I have my own variant of a ProxyHandler within Undertow.
In one case, I proxy requests towards an IIS server running MailEnable. If I try to download a webmail attachment whose filename contains non-ASCII characters, MailEnable sends the filename as UTF-8 characters in the HTTP header.

I guess this is kind of a violation of the HTTP protocol, but that's how it is.

When I run my Undertow proxy using HTTP/1.1 between the browser and Undertow, everything works as expected: the browser detects and supports UTF-8 characters in the filename in the HTTP headers.
But if I run HTTP/2 between the browser and Undertow, using Chrome, I get an SPDY_PROTOCOL_ERROR displayed in Chrome.

Does it work with other browsers? It's possible that we have a bug in how we handle this, but I think it is more likely that Chrome is just being more strict with HTTP/2 and enforcing the spec.
 

So I guess that it is because Chrome chokes on the UTF-8 characters in the HTTP/2 headers. I tried digging into the spec, but I cannot really find anything there regarding restrictions on header content, just on header naming.


"   Historically, HTTP has allowed field content with text in the
   ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
   through use of [RFC2047] encoding.  In practice, most HTTP header
   field values use only a subset of the US-ASCII charset [USASCII].
   Newly defined header fields SHOULD limit their field values to
   US-ASCII octets.  A recipient SHOULD treat other octets in field
   content (obs-text) as opaque data."
 

Any suggestions? I could of course strip the invalid characters from the response headers before forwarding them, but I wanted to check if there is a better way first...

Maybe you could use RFC 2047 encoding. I don't think it is particularly widely used, but I guess Chrome probably supports it.
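
For reference, an RFC 2047 encoded-word has the shape `=?charset?encoding?encoded-text?=`. A minimal Java sketch of the Base64 ("B") variant follows. The class and method names are hypothetical, it assumes the filename is already available as a correctly decoded String, and it ignores the 75-character limit that RFC 2047 places on a single encoded-word. Whether a given browser decodes this form in Content-Disposition would still need to be tested.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Rfc2047Sketch {

    /** Wraps the text in a single Base64 ("B") encoded-word using UTF-8. */
    static String encodeWord(String text) {
        String base64 = Base64.getEncoder()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        return "=?UTF-8?B?" + base64 + "?=";
    }

    public static void main(String[] args) {
        // "\u00E5" is the Danish letter å.
        System.out.println(encodeWord("bl\u00E5.txt"));
    }
}
```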

Looking at our HTTP/2 encoder, it does not attempt to deal with UTF-8 at all; it just casts each character to a byte, so we would not be encoding the characters properly anyway. I am not sure if it matters, though, as I don't think the browser would treat them as UTF-8 anyway.
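
The effect of a char-to-byte cast can be illustrated with a small Java sketch. This is not the actual Undertow encoder, only an assumed equivalent of "cast each character to a byte", with hypothetical names. A properly decoded "å" (U+00E5) becomes the single byte E5, which is its ISO-8859-1 form and not UTF-8, while a String whose chars already hold the raw UTF-8 bytes C3 A5 passes through unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CastToByteSketch {

    /** Narrows each char to a byte, discarding the upper 8 bits of the char. */
    static byte[] castEncode(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u00E5".getBytes(StandardCharsets.UTF_8);   // bytes C3 A5
        byte[] cast = castEncode("\u00E5");                        // single byte E5
        byte[] passThrough = castEncode("\u00C3\u00A5");           // bytes C3 A5

        System.out.println(Arrays.equals(utf8, cast));
        System.out.println(Arrays.equals(utf8, passThrough));
    }
}
```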

Stuart
 

--
Med venlig hilsen / Best regards

Kim Rasmussen
Partner, IT Architect

Asseco Denmark A/S
Kronprinsessegade 54
DK-1306 Copenhagen K
Mobile: +45 26 16 40 23
Ph.: +45 33 36 46 60
Fax: +45 33 36 46 61

https://ceptor.io
https://asseco.dk


_______________________________________________
undertow-dev mailing list
undertow-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/undertow-dev


