[mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-349) mod_manager truncates ENABLE-APP messages

Michal Babacek (JIRA) jira-events at lists.jboss.org
Tue Sep 17 17:51:03 EDT 2013


    [ https://issues.jboss.org/browse/MODCLUSTER-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805228#comment-12805228 ] 

Michal Babacek commented on MODCLUSTER-349:
-------------------------------------------

Hi guys, I'm sorry, this patch looks wrong to me. It appears to be doing what it should, but IMHO it doesn't allocate enough space for the incoming message, because {{MAXMESSSIZE}} is not enough.

I've used a fake client (not the one attached to this JIRA) that sends these messages: *CONFIG*, *ENABLE-APP* with 500 aliases and finally *STATUS*.
Note that the [mod_cluster.conf|https://gist.github.com/Karm/343bfd9b2ccc921bd458#file-mod_cluster-conf] had been left unchanged during both tests.

This first test corresponds to the following scenario:
 # start httpd 2.2.22 with mod_cluster 1.0.10.GA_CP04 from EAP 5.2 distribution
 # GET mod_cluster-manager console, check for HTTP 200
 # send *CONFIG* messages
 # read response
 # send *ENABLE-APP* messages
 # read response
 # send *STATUS* messages
 # read responses
 # retrieve and archive mod_cluster-manager console content
 # stop httpd
 # backup mod_manager.so
 # replace mod_manager.so with the patched one from [JBPAPP-10851]
 # repeat steps 1 - 10 (with a new mod_manager.so)
 # retrieve mod_manager.so from backup and repeat steps 1 - 10

You might observe in the [first scenario record|https://gist.github.com/Karm/343bfd9b2ccc921bd458/raw/504567e14448c2908d94d922342e3b3c14deb607/manager-console_WRONG] that with the first run, only 361 aliases were correctly loaded, with the 362nd one being corrupted and the rest missing whatsoever. If you follow that record further, you might see that the patched version managed to display 376 aliases correctly. That's better, yet, not really satisfying. Furthermore, at the end of the record, the original version did again 361 aliases. This third iteration serves merely as a sanity check. Here is the corresponding [error_log|https://gist.github.com/Karm/343bfd9b2ccc921bd458/raw/43e165ea87ad22cc82ab1aa5de6fe87b0f81e62c/error_log_WRONG].

 The second test corresponds to the aforementioned scenario but for the mod_cluster versions. Instead of 1.0.10.GA_CP04, 1.2.4 was used during the first and third iteration. For the second iteration, mod_manager.so was replaced with the one from mod_cluster 1.2.6. The results are dramatically different:

 You might observe in the [second scenario record|https://gist.github.com/Karm/343bfd9b2ccc921bd458/raw/243f5a00a4ab2b6e542149111bb1f0da311a997c/manager-console_COOL] that with the first iteration, again only 361 aliases were correctly loaded, whereas for the second iteration, the one with mod_manager from 1.2.6, it was the correct 500 aliases value. At the end of the record, the third iteration brought the same number of aliases as the first one: 361.
 Here is the corresponding [error_log|https://gist.github.com/Karm/343bfd9b2ccc921bd458/raw/97dbbd5ca6d19aa46d6226035d49054a4810ac96/error_log_COOL]. (i) It is noteworthy, that in the middle of that log, *ENABLE-APP* message ends with {{alias-372.e}} by the time mod_cluster-console reported correctly 500 loaded aliases. It looks like {{buff}} does not contain what it should or is not displayed correctly:
 {code}
     ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
                "manager_handler %s (%s) processing: \"%s\"", r->method, r->filename, buff);
 {code}

Anyhow, I think that that this patch:
{code}
-    status = ap_get_brigade(r->input_filters, input_brigade, AP_MODE_READBYTES, APR_BLOCK_READ, MAXMESSSIZE);
+
+    while ((status = ap_get_brigade(r->input_filters, input_brigade, AP_MODE_READBYTES, APR_BLOCK_READ, len)) == APR_SUCCESS) {
+        apr_brigade_flatten(input_brigade, buff + bufsiz, &len);
+        apr_brigade_cleanup(input_brigade);
+        bufsiz += len;
+        if (bufsiz >= MAXMESSSIZE || len == 0) break;
+        len = MAXMESSSIZE - bufsiz;
+    }
+
{code}
is not correct and that calculating the buffer size from all the relevant elements as in mod_cluster 1.2.6 is better:
{code}
    /* Use a buffer to read the message */
    if (mconf->maxmesssize)
       maxbufsiz = mconf->maxmesssize;
    else {
       /* we calculate it */
       maxbufsiz = 9 + JVMROUTESZ;
       maxbufsiz = bufsiz + (mconf->maxhost * HOSTALIASZ) + 7;
       maxbufsiz = bufsiz + (mconf->maxcontext * CONTEXTSZ) + 8;
    }
    if (maxbufsiz< MAXMESSSIZE)
       maxbufsiz = MAXMESSSIZE;
    buff = apr_pcalloc(r->pool, maxbufsiz);
    input_brigade = apr_brigade_create(r->pool, r->connection->bucket_alloc);
    len = maxbufsiz;
    while ((status = ap_get_brigade(r->input_filters, input_brigade, AP_MODE_READBYTES, APR_BLOCK_READ, len)) == APR_SUCCESS) {
        apr_brigade_flatten(input_brigade, buff + bufsiz, &len);
        apr_brigade_cleanup(input_brigade);
        bufsiz += len;
        if (bufsiz >= maxbufsiz || len == 0) break;
        len = maxbufsiz - bufsiz;
    }
{code}

What do you think about the results?
                
> mod_manager truncates ENABLE-APP messages
> -----------------------------------------
>
>                 Key: MODCLUSTER-349
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-349
>             Project: mod_cluster
>          Issue Type: Bug
>    Affects Versions: MOD_CLUSTER_1_0_10_GA_CP04, 1.2.4.Final
>         Environment: Windows Server 2008
> JBoss EAP 5.1.0
> EWS 1.0.1
>            Reporter: Gregory Lardiere
>            Assignee: Jean-Frederic Clere
>             Fix For: 1.2.5.Final
>
>         Attachments: ModCluster349.java, mod_cluster.patch
>
>
> With lots of aliases, ENABLE-APP messages can be big enough to span several IP packets.
> Sometimes, mod_manager will read only the first packet.
> The message is therefore incomplete and mod_cluster configuration ends up corrupted : some aliases are missing and the context-root is replaced by "/".
> This is caused by a wrong usage of ap_get_brigade() : it's called only once whereas it should be called until there's nothing left to read (see the attached patch).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


More information about the mod_cluster-issues mailing list