Thanks Jonathan.
I just did a big mass change today that should fix all the issues, as
well as converting to use direct streams. Feel free to beat it up if you
like.
On 4/17/11 4:32 PM, Jonathan Pearlin wrote:
Ike,
I got a chance to dig into the issue this weekend and I believe I
have found the cause (or causes, as the case may be). First, the code
currently in the HTTP Server does not assume that there is only one body
part in the multipart POST (it assumes there could be a number of fields
with one of them being the actual deployment being uploaded). This was
something that I did not include originally, but added later after Jason
and I discussed how the upload API may need to be used in the future
(i.e. what would happen if the form post needs to provide other
information in addition to the deployment as part of the upload).
Therefore, the code keeps reading header/body pairs from the stream
until it finds one of content-type "application/octet-stream". However,
upon further investigation (and using different browsers), it appears
that different browsers provide different content types for the input
type="file" input element in a "multipart/form-data" form (for
instance,
Firefox 3.x reports application/octet-stream, while IE 9 reports
application/x-zip-compressed for a .war file and I suspect Safari is
doing something different as well). There is definitely a defect in the
logic that reads through the stream where it could never exit the outer
while loop in the extractPostContent method if it never finds a part of
type application/octet-stream. I have posted a patch to fix the issue
here:
https://github.com/jdpgrailsdev/jboss-as/commit/83f0d293e018a550562cd0381....
Note that this patch does NOT fix the issue with the different
content-types -- it merely fixes the OOM and returns a HTTP 500 if we do
not find a part of type application/octet-stream in the POST (see below
for more thoughts on what to do with this). I have added a comment to
this effect on JBAS-9268.
On a related note, it appears that your patch does not handle the case
described above (mutliple parts in the form post that need to be
inspected to find the actual deployment -- please correct me if I am
reading the code wrong!). From what I can tell, the proposed patch
consumes the header (i.e. the part that contains the content-disposition
and content-type declarations) and then reads the body of the part,
writing it to the output stream point to the temporary file. It repeats
this process until there are no more parts to read (this means that in
the case of multiple parts, the bodies will be concatenated together in
the output file). It *seems* as though the new patch assumes only one
part for the deployment or that the deployment itself will be split
across multiple boundaries. This is not the assumption that the
original code was produced under (I will defer to Jason on this as
regards to whether or not we need the ability to handle upload requests
with multiple parts with only one of those parts being the actual
deployment).
I guess this leads us to a decision point as to how to modify the HTTP
server:
1) Patch the existing code to a) not produce OOM when no deployment is
found in the incoming POST and b) modify the code to look for the
appropriate content-types that we need to support (instead of just
application/octet-stream). This would mean that we want to keep the
logic to handle inspecting each part of the multipart form post to find
the deployment.
2) Use Ike's patch as-is.
3) Modify Ike's patch to include the logic for handling inspecting each
part of the multipart form post (hybrid of 1 and 2, basically).
Options 1 and 3 assume that we still need the logic to "find" the
deployment in the upload POST request by interpreting the content-type
of each part. Thinking about this some more, it may make sense to
standardize on the name of the field instead of the type, as we can
enforce it from an API perspective what people have to name the part for
the API to deploy it, instead of relying on all browsers to use the same
content-type for a deployment.
Please let me know what you think and/or if you need me to make any of
the changes!
--Jonathan
On Fri, Apr 15, 2011 at 7:50 AM, Heiko Braun <hbraun(a)redhat.com
<mailto:hbraun@redhat.com>> wrote:
I've run into a OOM issue when uploading content through the HTTP
API again.
(
https://issues.jboss.org/browse/JBAS-9268)
I did take a look at the current implementation and propose that we
change it in the following way:
- read multipart contents from mime boundary
- replace the custom multipart stream implementation with a mature
one (leans on apache commons fileupload)
I've got this changes tested and verified in a custom branch:
https://github.com/heiko-braun/jboss-as/commits/out_of_memory
However before going ahead, I would like get some feedback from
a) the original author (Jonathan Pearlin. Welcome onboard btw)
b) Jason wrt the Apache License sources (See
org.jboss.as.domain.http.server.multipart.asf.*)
Regards, Ike
_______________________________________________
jboss-as7-dev mailing list
jboss-as7-dev(a)lists.jboss.org <mailto:jboss-as7-dev@lists.jboss.org>
https://lists.jboss.org/mailman/listinfo/jboss-as7-dev
--
Jason T. Greene
JBoss, a division of Red Hat