Ike,

   I got a chance to dig into the issue this weekend and I believe I have found the cause (or causes, as the case may be).  First, the code currently in the HTTP Server does not assume that there is only one body part in the multipart POST (it assumes there could be a number of fields with one of them being the actual deployment being uploaded).  This was something that I did not include originally, but added later after Jason and I discussed how the upload API may need to be used in the future (i.e. what would happen if the form post needs to provide other information in addition to the deployment as part of the upload).  Therefore, the code keeps reading header/body pairs from the stream until it finds one of content-type "application/octet-stream".  However, upon further investigation (and using different browsers), it appears that different browsers provide different content types for the input type="file" input element in a "multipart/form-data" form (for instance, Firefox 3.x reports application/octet-stream, while IE 9 reports application/x-zip-compressed for a .war file and I suspect Safari is doing something different as well).  There is definitely a defect in the logic that reads through the stream where it could never exit the outer while loop in the extractPostContent method if it never finds a part of type application/octet-stream.  I have posted a patch to fix the issue here:  https://github.com/jdpgrailsdev/jboss-as/commit/83f0d293e018a550562cd03819edfe071dafa0df.  Note that this patch does NOT fix the issue with the different content-types -- it merely fixes the OOM and returns a HTTP 500 if we do not find a part of type application/octet-stream in the POST (see below for more thoughts on what to do with this).  I have added a comment to this effect on JBAS-9268.

On a related note, it appears that your patch does not handle the case described above (mutliple parts in the form post that need to be inspected to find the actual deployment -- please correct me if I am reading the code wrong!).  From what I can tell, the proposed patch consumes the header (i.e. the part that contains the content-disposition and content-type declarations) and then reads the body of the part, writing it to the output stream point to the temporary file.  It repeats this process until there are no more parts to read (this means that in the case of multiple parts, the bodies will be concatenated together in the output file).  It *seems* as though the new patch assumes only one part for the deployment or that the deployment itself will be split across multiple boundaries.  This is not the assumption that the original code was produced under (I will defer to Jason on this as regards to whether or not we need the ability to handle upload requests with multiple parts with only one of those parts being the actual deployment).

I guess this leads us to a decision point as to how to modify the HTTP server:

1) Patch the existing code to a) not produce OOM when no deployment is found in the incoming POST and b) modify the code to look for the appropriate content-types that we need to support (instead of just application/octet-stream).  This would mean that we want to keep the logic to handle inspecting each part of the multipart form post to find the deployment.
2) Use Ike's patch as-is.
3) Modify Ike's patch to include the logic for handling inspecting each part of the multipart form post (hybrid of 1 and 2, basically).

Options 1 and 3 assume that we still need the logic to "find" the deployment in the upload POST request by interpreting the content-type of each part.  Thinking about this some more, it may make sense to standardize on the name of the field instead of the type, as we can enforce it from an API perspective what people have to name the part for the API to deploy it, instead of relying on all browsers to use the same content-type for a deployment.

Please let me know what you think and/or if you need me to make any of the changes!

--Jonathan

On Fri, Apr 15, 2011 at 7:50 AM, Heiko Braun <hbraun@redhat.com> wrote:



I've run into a OOM issue when uploading content through the HTTP API again.
(https://issues.jboss.org/browse/JBAS-9268)


I did take a look at the current implementation and propose that we change it in the following way:

- read multipart contents from mime boundary
- replace the custom multipart stream implementation with a mature one (leans on apache commons fileupload)

I've got this changes  tested and verified in a custom branch:
https://github.com/heiko-braun/jboss-as/commits/out_of_memory

However before going ahead, I would like get some feedback from

a) the original author (Jonathan Pearlin. Welcome onboard btw)
b) Jason wrt the Apache License sources (See org.jboss.as.domain.http.server.multipart.asf.*)


Regards, Ike




_______________________________________________
jboss-as7-dev mailing list
jboss-as7-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/jboss-as7-dev