September 2014 - mod_cluster-issues

[JBoss JIRA] (MODCLUSTER-399) Draining pending requests fails with Oracle JDK8

by Michal Babacek (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-399?page=com.atlassian.jira.pl... ] Michal Babacek updated MODCLUSTER-399: -------------------------------------- Attachment: clusterbench.war JDK7-JDK8.log mod_cluster.conf simplecontext-666.war standalone-ha.xml Flood.py > Draining pending requests fails with Oracle JDK8 > ------------------------------------------------ > > Key: MODCLUSTER-399 > URL: https://issues.jboss.org/browse/MODCLUSTER-399 > Project: mod_cluster > Issue Type: Bug > Affects Versions: 1.2.8.Final > Environment: RHEL7 x86_64, Apache/2.2.26 (Red Hat Enterprise Web Server), Oracle JDK 8 > Reporter: Michal Babacek > Assignee: Radoslav Husar > Attachments: 1-access_log.zip, 1-error_log.zip, 1-server.log, 2-access_log.zip, 2-error_log.zip, 2-server.log, 3-access_log.zip, 3-error_log.zip, 3-server.log, 4-access_log.zip, 4-error_log.zip, 4-server.log, 5-access_log.zip, 5-error_log.zip, 5-server.log, clusterbench.war, Flood.py, JDK7-JDK8.log, mod_cluster.conf, simplecontext-666.war, standalone-ha.xml > > > I've noticed several tests that run O.K. with Oracle JDK7 and fail with Oracle JDK8. One can reproduce these failures anytime, they are not just random glitches. > Notes on the setup: > * {{simplecontext-666}} is a simple JSP web app, non-distributable > * {{clusterbench}} is a distributable webapp > * all the tests use 1 worker node and 1 balancer > * many clients throwing many requests at the {{simplecontext-666}} context > h3. Stop Context Timeout test > First, there is the setup with {{stop-context-timeout="20"}}: > {code} > <subsystem xmlns="urn:jboss:domain:modcluster:1.2"> > <mod-cluster-config advertise-socket="modcluster" stop-context-timeout="20" connector="ajp"> > <dynamic-load-provider> > <load-metric type="busyness"/> > </dynamic-load-provider> > </mod-cluster-config> > </subsystem> > {code} > On *JDK7*, one may observe exactly the desired behavior. On a clean shutdown, message {{Failed to drain 294 remaining pending requests from default-host:/simplecontext-666 within 20.0 seconds}} is displayed. See the whole log here: [^1-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 22) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.as.server.deployment] (MSC service thread 1-8) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 284ms > [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 20.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000022: Failed to drain 294 remaining pending requests from default-host:/simplecontext-666 within 20.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-8) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-8) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as] (MSC service thread 1-11) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 20035ms > {noformat} > On the other hand, with *JDK8*, this utterly fails. No waiting for pending requests, no message. See the whole log here: [^2-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 18) JBAS018224: Unregister web context: /simplecontext-666 > [org.apache.tomcat.util] (http-/10.16.88.17:8080-Acceptor) JBWEB003022: Channel processing failed > [org.jboss.modcluster] (ServerService Thread Pool -- 16) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as.server.deployment] (MSC service thread 1-10) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 310ms > [org.jboss.as] (MSC service thread 1-2) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 348ms > {noformat} > These are the Apache HTTP Server logs, showing exactly what was the test doing: > Note that thousands of {{"GET /simplecontext-666/ HTTP/1.1"}} messages were omitted from these snippets. > *JDK7*, access_log, (See the complete log here [^1-access_log.zip]) > {noformat} > [06:47:04] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:47:12] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "STATUS / HTTP/1.1" 200 60 "-" "ClusterListener/1.0" > [06:47:13] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:47:19] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:19] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:20] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [06:47:40] "STOP-APP / HTTP/1.1" 200 86 "-" "ClusterListener/1.0" > [06:47:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > {noformat} > *JDK8*, access_log, (See the complete log here [^2-access_log.zip]) > {noformat} > [06:43:10] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:43:17] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "STATUS / HTTP/1.1" 200 62 "-" "ClusterListener/1.0" > [06:43:19] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:43:25] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:25] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [06:43:25] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:25] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:25] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > {noformat} > In the aforementioned snippets, it is noteworthy that the *JDK8* run sent only one {{STOP-APP}} message. This is confirmed in error logs: > *JDK7*, error_log, {{grep STOP-APP}} (See the complete log here [^1-error_log.zip]) > {noformat} > [06:47:20] mod_manager.c(1901): manager_trans STOP-APP (/) > [06:47:20] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [06:47:20] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [06:47:20] mod_manager.c(2666): manager_handler STOP-APP OK > [06:47:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [06:47:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fsimplecontext-666" > [06:47:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 294 > [06:47:40] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > *JDK8*, error_log, {{grep STOP-APP}} (See the complete log here [^2-error_log.zip]) > {noformat} > [06:43:25] mod_manager.c(1901): manager_trans STOP-APP (/) > [06:43:25] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [06:43:25] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [06:43:25] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > One could argue that there was a test error and the *JDK8* test run actually didn't access {{simplecontext-666}} application. Let me dissolve this argument right away: > {noformat} > grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK8/httpd/logs/access_log > 27250 > grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK7/httpd/logs/access_log > 27927 > {noformat} > h3. Default settings, the same problem... > Trial with the default settings reveals the same problem with some additional findings in the Apache's error log. > {code} > <subsystem xmlns="urn:jboss:domain:modcluster:1.2"> > <mod-cluster-config advertise-socket="modcluster" connector="ajp"> > <dynamic-load-provider> > <load-metric type="busyness"/> > </dynamic-load-provider> > </mod-cluster-config> > </subsystem> > {code} > On *JDK7*, one may observe exactly the desired behavior, this time with the default 10 seconds timeout. See the whole log here: [^3-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 3) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 263ms > [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 10.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000022: Failed to drain 340 remaining pending requests from default-host:/simplecontext-666 within 10.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as] (MSC service thread 1-9) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 10082ms > {noformat} > *JDK8* fails. No waiting for pending requests, no message. This time even with ISE :(. See the whole log here: [^4-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 16) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.modcluster] (ServerService Thread Pool -- 19) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.0 seconds > [org.jboss.as.txn] (ajp-/10.16.88.17:8009-21) JBAS010151: Unable to get transaction state: java.lang.IllegalStateException > at org.jboss.msc.value.InjectedValue.getValue(InjectedValue.java:47) > at org.jboss.as.txn.deployment.TransactionRollbackSetupAction.checkTransactionStatus(TransactionRollbackSetupAction.java:112) > at org.jboss.as.txn.deployment.TransactionRollbackSetupAction.teardown(TransactionRollbackSetupAction.java:66) > at org.jboss.as.web.ThreadSetupBindingListener.unbind(ThreadSetupBindingListener.java:61) [jboss-as-web-7.4.0.Final-redhat-6.jar:7.4.0.Final-redhat-6] > at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:184) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:340) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:490) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:420) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:926) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at java.lang.Thread.run(Thread.java:744) [rt.jar:1.8.0] > +++ > [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 301ms > [org.jboss.as] (MSC service thread 1-15) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 313ms > {noformat} > In the access logs, it is noteworthy that both {{clusterbench}} and {{simplecontext-666}} sent {{ENABLE-APP}}, but there was only one {{STOP-APP}} message. Furthermore, who and why used {{OPTIONS}} method? > *JDK7*, access_log, (See the complete log here [^3-access_log.zip]) > Note that thousands of {{"GET /simplecontext-666/ HTTP/1.1"}} messages were omitted from these snippets. > {noformat} > [07:31:49] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:31:57] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "STATUS / HTTP/1.1" 200 63 "-" "ClusterListener/1.0" > [07:31:58] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:32:05] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:32:14] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > [07:32:15] "STOP-APP / HTTP/1.1" 200 86 "-" "ClusterListener/1.0" > [07:32:15] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:32:15] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:32:15] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > {noformat} > *JDK8*, access_log, (See the complete log here [^4-access_log.zip]) > {noformat} > [07:29:25] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:29:32] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "STATUS / HTTP/1.1" 200 63 "-" "ClusterListener/1.0" > [07:29:34] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:29:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [07:29:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > {noformat} > Error log shows what context was the {{STOP-APP}} message linked to: > *JDK7*, error_log, {{grep STOP-APP}} (See the complete log here [^3-error_log.zip]) > {noformat} > [07:32:15] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:32:15] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fsimplecontext-666" > [07:32:15] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 340 > [07:32:15] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > *JDK8*, error_log, {{grep STOP-APP}} (See the complete log here [^4-error_log.zip]) > {noformat} > [07:29:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:29:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [07:29:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [07:29:40] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > h3. Default settings, again > I had considered the {{JBAS010151: Unable to get transaction state: java.lang.IllegalStateException}} being probably an unrelated glitch, so I tried the default test run with *JDK8* again. Test failed, yet this time without any exception. Nothing has changed since the last run as far as environment and configuration are concerned. > *JDK8* fails. No waiting for pending requests on {{/simplecontext-666}}. See the whole log here: [^5-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 16) JBAS018224: Unregister web context: /clusterbench > [org.apache.catalina.core] (MSC service thread 1-6) JBWEB001079: Container org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/] has not been started > [org.apache.coyote.http11.Http11NioProtocol] (MSC service thread 1-14) JBWEB003075: Coyote HTTP/1.1 pausing on: http-/10.16.88.17:8080 > [org.apache.coyote.http11.Http11NioProtocol] (MSC service thread 1-14) JBWEB003077: Coyote HTTP/1.1 stopping on : http-/10.16.88.17:8080 > [org.jboss.web] (ServerService Thread Pool -- 3) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-15) JBAS010409: Unbound data source [java:jboss/datasources/ExampleDS] > [org.jboss.modcluster] (ServerService Thread Pool -- 16) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.1 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 14) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.1 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 14) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-1) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-1) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 9) JBAS010282: Stopped repl cache from web container > [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 16) JBAS010282: Stopped default-host/clusterbench cache from web container > [org.jboss.as.server.deployment] (MSC service thread 1-13) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 306ms > [org.jboss.as.server.deployment] (MSC service thread 1-5) JBAS015877: Stopped deployment clusterbench.war (runtime-name: clusterbench.war) in 310ms > [org.jboss.as] (MSC service thread 1-16) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 333ms > {noformat} > *JDK8*, access_log, (See the complete log here [^5-access_log.zip]) > {noformat} > [07:45:25] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:45:33] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "STATUS / HTTP/1.1" 200 62 "-" "ClusterListener/1.0" > [07:45:34] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:45:39] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > [07:45:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [07:45:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [07:45:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > [07:45:41] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > {noformat} > This time, there were two {{STOP-APP}} messages all right, but instead one of them being for {{clusterbench}} and the other one for {{simplecontext-666}}, they were both for {{clusterbench}}. I find this behavior being a rather disturbing one. > *JDK8*, error_log, {{grep STOP-APP}} (See the complete log here [^5-error_log.zip]) > {noformat} > [07:45:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:45:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [07:45:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [07:45:40] mod_manager.c(2666): manager_handler STOP-APP OK > [07:45:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:45:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [07:45:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [07:45:40] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > Again, there were plenty of requests sent to {{simplecontext-666/}}: > {noformat} > grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK8/httpd/logs/access_log > 27587 > {noformat} > WDYT? -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 3 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-399) Draining pending requests fails with Oracle JDK8

by Michal Babacek (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-399?page=com.atlassian.jira.pl... ] Michal Babacek commented on MODCLUSTER-399: ------------------------------------------- I've failed to find a rock-solid deterministic reproducer and I think it's clearly a race condition. The only conclusion I can give is to state that: * With JDK7, EAP always tried to drain sessions/requests for _simplecontext-666_ context, whereas * With JDK8, *sometimes* (in 4 out of 7 runs), EAP didn't drain _simplecontext-666_ context at all The *sometimes* is important as it points to the fact that it's not completely broken, as I might have somewhat overstated in the description of this Jira. It looks like JDK8 might be more prone to this race condition... :-( Please, take a look and share your thoughts on whether this is actually a bug after all. h3. System * RHEL 7 VM, 2 CPUs, 5G RAM * JDK7: Open JDK 1.7.0_65 * JDK8: Oracle JDK 1.8.0_20 h3. Setup * Apache HTTP Server 2.2.26, [^mod_cluster.conf] * EAP 6.3.0, [^standalone-ha.xml] * Distributable app [^clusterbench.war] * Non-Distributable app (and the one accessed in this test) [^simplecontext-666.war] * Test script [^Flood.py] h3. Test # JAVA_HOME, PATH, points to JDK8 # /path/to/httpd/sbin/apachectl restart;./standalone.sh -c standalone-ha.xml # As soon as {{MODCLUSTER000011: jboss.web will use jboss-eap-6.3 as jvm-route}} appears: # In a new terminal: {{python Flood.py -u http://192.168.122.172:8847/simplecontext-666 -r 1000 -s 5000}} # As soon as {{All sessions were established. Subsequent requests won't be reported...}} message appears, you can hit {{Ctrl+C}} in 3. (EAP instance). # {{Ctrl+C}} for Flood.py as well (EAP is down, no point in continuing...) # Examine EAP's output in terminal 3. With JDK8, I _almost always_ get something like: {noformat} 13:17:25,284 INFO [org.jboss.web] (ServerService Thread Pool -- 59) JBAS018224: Unregister web context: /simplecontext-666 13:17:25,282 INFO [org.jboss.web] (ServerService Thread Pool -- 60) JBAS018224: Unregister web context: /clusterbench 13:17:25,318 INFO [org.jboss.modcluster] (ServerService Thread Pool -- 58) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.0 seconds 13:17:25,326 INFO [org.jboss.modcluster] (ServerService Thread Pool -- 58) MODCLUSTER000002: Initiating mod_cluster shutdown 13:17:25,328 INFO [org.apache.coyote.ajp] (MSC service thread 1-3) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/192.168.122.172:8009 13:17:25,328 INFO [org.apache.coyote.ajp] (MSC service thread 1-3) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/192.168.122.172:8009 13:17:25,422 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 69) ISPN000029: Passivating all entries to disk 13:17:25,430 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 69) ISPN000030: Passivated 0 entries in 7 milliseconds 13:17:25,447 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 59) ISPN000029: Passivating all entries to disk 13:17:25,447 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 59) ISPN000030: Passivated 0 entries in 0 milliseconds 13:17:25,450 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 69) JBAS010282: Stopped default-host/clusterbench cache from web container 13:17:25,457 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 59) JBAS010282: Stopped repl cache from web container 13:17:25,470 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000080: Disconnecting and closing JGroups Channel 13:17:25,481 INFO [org.jboss.as.server.deployment] (MSC service thread 1-4) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 243ms 13:17:25,488 INFO [org.jboss.as.server.deployment] (MSC service thread 1-1) JBAS015877: Stopped deployment clusterbench.war (runtime-name: clusterbench.war) in 246ms 13:17:25,500 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000082: Stopping the RpcDispatcher 13:17:25,513 INFO [org.jboss.as] (MSC service thread 1-1) JBAS015950: JBoss EAP 6.3.0.GA (AS 7.4.0.Final-redhat-19) stopped in 270ms {noformat} There is not a word about draining anything from simplecontext-666. I would expect something like: {{All pending requests drained from default-host:/simplecontext-666 ...}} On the other hand, with JDK7, one always gets something along these lines: {noformat} 13:23:13,376 INFO [org.jboss.web] (ServerService Thread Pool -- 58) JBAS018224: Unregister web context: /clusterbench 13:23:13,384 INFO [org.jboss.web] (ServerService Thread Pool -- 60) JBAS018224: Unregister web context: /simplecontext-666 13:23:13,474 INFO [org.jboss.modcluster] (ServerService Thread Pool -- 57) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 0.1 seconds 13:23:13,479 INFO [org.jboss.modcluster] (ServerService Thread Pool -- 57) MODCLUSTER000021: All pending requests drained from default-host:/simplecontext-666 in 0.1 seconds 13:23:13,484 INFO [org.jboss.modcluster] (ServerService Thread Pool -- 57) MODCLUSTER000002: Initiating mod_cluster shutdown 13:23:13,487 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 70) ISPN000029: Passivating all entries to disk 13:23:13,495 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 70) ISPN000030: Passivated 0 entries in 7 milliseconds 13:23:13,497 INFO [org.apache.coyote.ajp] (MSC service thread 1-2) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/192.168.122.172:8009 13:23:13,498 INFO [org.apache.coyote.ajp] (MSC service thread 1-2) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/192.168.122.172:8009 13:23:13,510 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 70) JBAS010282: Stopped default-host/clusterbench cache from web container 13:23:13,529 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 71) ISPN000029: Passivating all entries to disk 13:23:13,529 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread Pool -- 71) ISPN000030: Passivated 0 entries in 0 milliseconds 13:23:13,535 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 71) JBAS010282: Stopped repl cache from web container 13:23:13,539 INFO [org.jboss.as.server.deployment] (MSC service thread 1-2) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 197ms 13:23:13,541 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000080: Disconnecting and closing JGroups Channel 13:23:13,548 INFO [org.jboss.as.server.deployment] (MSC service thread 1-3) JBAS015877: Stopped deployment clusterbench.war (runtime-name: clusterbench.war) in 205ms 13:23:13,554 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000082: Stopping the RpcDispatcher 13:23:13,576 INFO [org.jboss.as] (MSC service thread 1-2) JBAS015950: JBoss EAP 6.3.0.GA (AS 7.4.0.Final-redhat-19) stopped in 233ms {noformat} Although, the problem with drawing any definite conclusion is that even JDK8 sometimes manages to display the correct draining behavior. Please, see [^JDK7-JDK8.log]. Each record separated with a blank line is a test run like the aforementioned one. You may notice that with JDK7, in 7 out of 7 runs, it's possible to look up this expression {{.*drained from .*simplecontext}}, whereas with JDK8 it's only 3 our of 7 cases. Obviously, the number 7 is a magic/arbitrary one, hence no proper conclusion... > Draining pending requests fails with Oracle JDK8 > ------------------------------------------------ > > Key: MODCLUSTER-399 > URL: https://issues.jboss.org/browse/MODCLUSTER-399 > Project: mod_cluster > Issue Type: Bug > Affects Versions: 1.2.8.Final > Environment: RHEL7 x86_64, Apache/2.2.26 (Red Hat Enterprise Web Server), Oracle JDK 8 > Reporter: Michal Babacek > Assignee: Radoslav Husar > Attachments: 1-access_log.zip, 1-error_log.zip, 1-server.log, 2-access_log.zip, 2-error_log.zip, 2-server.log, 3-access_log.zip, 3-error_log.zip, 3-server.log, 4-access_log.zip, 4-error_log.zip, 4-server.log, 5-access_log.zip, 5-error_log.zip, 5-server.log > > > I've noticed several tests that run O.K. with Oracle JDK7 and fail with Oracle JDK8. One can reproduce these failures anytime, they are not just random glitches. > Notes on the setup: > * {{simplecontext-666}} is a simple JSP web app, non-distributable > * {{clusterbench}} is a distributable webapp > * all the tests use 1 worker node and 1 balancer > * many clients throwing many requests at the {{simplecontext-666}} context > h3. Stop Context Timeout test > First, there is the setup with {{stop-context-timeout="20"}}: > {code} > <subsystem xmlns="urn:jboss:domain:modcluster:1.2"> > <mod-cluster-config advertise-socket="modcluster" stop-context-timeout="20" connector="ajp"> > <dynamic-load-provider> > <load-metric type="busyness"/> > </dynamic-load-provider> > </mod-cluster-config> > </subsystem> > {code} > On *JDK7*, one may observe exactly the desired behavior. On a clean shutdown, message {{Failed to drain 294 remaining pending requests from default-host:/simplecontext-666 within 20.0 seconds}} is displayed. See the whole log here: [^1-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 22) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.as.server.deployment] (MSC service thread 1-8) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 284ms > [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 20.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000022: Failed to drain 294 remaining pending requests from default-host:/simplecontext-666 within 20.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-8) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-8) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as] (MSC service thread 1-11) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 20035ms > {noformat} > On the other hand, with *JDK8*, this utterly fails. No waiting for pending requests, no message. See the whole log here: [^2-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 18) JBAS018224: Unregister web context: /simplecontext-666 > [org.apache.tomcat.util] (http-/10.16.88.17:8080-Acceptor) JBWEB003022: Channel processing failed > [org.jboss.modcluster] (ServerService Thread Pool -- 16) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as.server.deployment] (MSC service thread 1-10) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 310ms > [org.jboss.as] (MSC service thread 1-2) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 348ms > {noformat} > These are the Apache HTTP Server logs, showing exactly what was the test doing: > Note that thousands of {{"GET /simplecontext-666/ HTTP/1.1"}} messages were omitted from these snippets. > *JDK7*, access_log, (See the complete log here [^1-access_log.zip]) > {noformat} > [06:47:04] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:47:12] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:12] "STATUS / HTTP/1.1" 200 60 "-" "ClusterListener/1.0" > [06:47:13] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:47:19] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:19] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:20] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [06:47:40] "STOP-APP / HTTP/1.1" 200 86 "-" "ClusterListener/1.0" > [06:47:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:47:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > {noformat} > *JDK8*, access_log, (See the complete log here [^2-access_log.zip]) > {noformat} > [06:43:10] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:43:17] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:17] "STATUS / HTTP/1.1" 200 62 "-" "ClusterListener/1.0" > [06:43:19] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [06:43:25] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:25] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [06:43:25] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:25] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [06:43:25] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > {noformat} > In the aforementioned snippets, it is noteworthy that the *JDK8* run sent only one {{STOP-APP}} message. This is confirmed in error logs: > *JDK7*, error_log, {{grep STOP-APP}} (See the complete log here [^1-error_log.zip]) > {noformat} > [06:47:20] mod_manager.c(1901): manager_trans STOP-APP (/) > [06:47:20] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [06:47:20] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [06:47:20] mod_manager.c(2666): manager_handler STOP-APP OK > [06:47:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [06:47:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fsimplecontext-666" > [06:47:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 294 > [06:47:40] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > *JDK8*, error_log, {{grep STOP-APP}} (See the complete log here [^2-error_log.zip]) > {noformat} > [06:43:25] mod_manager.c(1901): manager_trans STOP-APP (/) > [06:43:25] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [06:43:25] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [06:43:25] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > One could argue that there was a test error and the *JDK8* test run actually didn't access {{simplecontext-666}} application. Let me dissolve this argument right away: > {noformat} > grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK8/httpd/logs/access_log > 27250 > grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK7/httpd/logs/access_log > 27927 > {noformat} > h3. Default settings, the same problem... > Trial with the default settings reveals the same problem with some additional findings in the Apache's error log. > {code} > <subsystem xmlns="urn:jboss:domain:modcluster:1.2"> > <mod-cluster-config advertise-socket="modcluster" connector="ajp"> > <dynamic-load-provider> > <load-metric type="busyness"/> > </dynamic-load-provider> > </mod-cluster-config> > </subsystem> > {code} > On *JDK7*, one may observe exactly the desired behavior, this time with the default 10 seconds timeout. See the whole log here: [^3-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 3) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 263ms > [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 10.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000022: Failed to drain 340 remaining pending requests from default-host:/simplecontext-666 within 10.0 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as] (MSC service thread 1-9) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 10082ms > {noformat} > *JDK8* fails. No waiting for pending requests, no message. This time even with ISE :(. See the whole log here: [^4-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 16) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.modcluster] (ServerService Thread Pool -- 19) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.0 seconds > [org.jboss.as.txn] (ajp-/10.16.88.17:8009-21) JBAS010151: Unable to get transaction state: java.lang.IllegalStateException > at org.jboss.msc.value.InjectedValue.getValue(InjectedValue.java:47) > at org.jboss.as.txn.deployment.TransactionRollbackSetupAction.checkTransactionStatus(TransactionRollbackSetupAction.java:112) > at org.jboss.as.txn.deployment.TransactionRollbackSetupAction.teardown(TransactionRollbackSetupAction.java:66) > at org.jboss.as.web.ThreadSetupBindingListener.unbind(ThreadSetupBindingListener.java:61) [jboss-as-web-7.4.0.Final-redhat-6.jar:7.4.0.Final-redhat-6] > at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:184) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:340) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:490) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:420) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:926) [jbossweb-7.4.0.Final.jar:7.4.0.Final] > at java.lang.Thread.run(Thread.java:744) [rt.jar:1.8.0] > +++ > [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 301ms > [org.jboss.as] (MSC service thread 1-15) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 313ms > {noformat} > In the access logs, it is noteworthy that both {{clusterbench}} and {{simplecontext-666}} sent {{ENABLE-APP}}, but there was only one {{STOP-APP}} message. Furthermore, who and why used {{OPTIONS}} method? > *JDK7*, access_log, (See the complete log here [^3-access_log.zip]) > Note that thousands of {{"GET /simplecontext-666/ HTTP/1.1"}} messages were omitted from these snippets. > {noformat} > [07:31:49] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:31:57] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:31:57] "STATUS / HTTP/1.1" 200 63 "-" "ClusterListener/1.0" > [07:31:58] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:32:05] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:32:14] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > [07:32:15] "STOP-APP / HTTP/1.1" 200 86 "-" "ClusterListener/1.0" > [07:32:15] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:32:15] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:32:15] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > {noformat} > *JDK8*, access_log, (See the complete log here [^4-access_log.zip]) > {noformat} > [07:29:25] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:29:32] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:32] "STATUS / HTTP/1.1" 200 63 "-" "ClusterListener/1.0" > [07:29:34] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:29:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [07:29:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:29:40] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > {noformat} > Error log shows what context was the {{STOP-APP}} message linked to: > *JDK7*, error_log, {{grep STOP-APP}} (See the complete log here [^3-error_log.zip]) > {noformat} > [07:32:15] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:32:15] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fsimplecontext-666" > [07:32:15] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 340 > [07:32:15] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > *JDK8*, error_log, {{grep STOP-APP}} (See the complete log here [^4-error_log.zip]) > {noformat} > [07:29:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:29:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [07:29:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [07:29:40] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > h3. Default settings, again > I had considered the {{JBAS010151: Unable to get transaction state: java.lang.IllegalStateException}} being probably an unrelated glitch, so I tried the default test run with *JDK8* again. Test failed, yet this time without any exception. Nothing has changed since the last run as far as environment and configuration are concerned. > *JDK8* fails. No waiting for pending requests on {{/simplecontext-666}}. See the whole log here: [^5-server.log]. > {noformat} > [org.jboss.web] (ServerService Thread Pool -- 16) JBAS018224: Unregister web context: /clusterbench > [org.apache.catalina.core] (MSC service thread 1-6) JBWEB001079: Container org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/] has not been started > [org.apache.coyote.http11.Http11NioProtocol] (MSC service thread 1-14) JBWEB003075: Coyote HTTP/1.1 pausing on: http-/10.16.88.17:8080 > [org.apache.coyote.http11.Http11NioProtocol] (MSC service thread 1-14) JBWEB003077: Coyote HTTP/1.1 stopping on : http-/10.16.88.17:8080 > [org.jboss.web] (ServerService Thread Pool -- 3) JBAS018224: Unregister web context: /simplecontext-666 > [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-15) JBAS010409: Unbound data source [java:jboss/datasources/ExampleDS] > [org.jboss.modcluster] (ServerService Thread Pool -- 16) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.1 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 14) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.1 seconds > [org.jboss.modcluster] (ServerService Thread Pool -- 14) MODCLUSTER000002: Initiating mod_cluster shutdown > [org.apache.coyote.ajp] (MSC service thread 1-1) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.apache.coyote.ajp] (MSC service thread 1-1) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 > [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 9) JBAS010282: Stopped repl cache from web container > [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 16) JBAS010282: Stopped default-host/clusterbench cache from web container > [org.jboss.as.server.deployment] (MSC service thread 1-13) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 306ms > [org.jboss.as.server.deployment] (MSC service thread 1-5) JBAS015877: Stopped deployment clusterbench.war (runtime-name: clusterbench.war) in 310ms > [org.jboss.as] (MSC service thread 1-16) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 333ms > {noformat} > *JDK8*, access_log, (See the complete log here [^5-access_log.zip]) > {noformat} > [07:45:25] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:45:33] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:33] "STATUS / HTTP/1.1" 200 62 "-" "ClusterListener/1.0" > [07:45:34] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" > [07:45:39] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > [07:45:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [07:45:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" > [07:45:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" > [07:45:40] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > [07:45:41] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" > {noformat} > This time, there were two {{STOP-APP}} messages all right, but instead one of them being for {{clusterbench}} and the other one for {{simplecontext-666}}, they were both for {{clusterbench}}. I find this behavior being a rather disturbing one. > *JDK8*, error_log, {{grep STOP-APP}} (See the complete log here [^5-error_log.zip]) > {noformat} > [07:45:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:45:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [07:45:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [07:45:40] mod_manager.c(2666): manager_handler STOP-APP OK > [07:45:40] mod_manager.c(1901): manager_trans STOP-APP (/) > [07:45:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" > [07:45:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 > [07:45:40] mod_manager.c(2666): manager_handler STOP-APP OK > {noformat} > Again, there were plenty of requests sent to {{simplecontext-666/}}: > {noformat} > grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK8/httpd/logs/access_log > 27587 > {noformat} > WDYT? -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 3 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by RH Bugzilla Integration (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] RH Bugzilla Integration updated MODCLUSTER-427: ----------------------------------------------- Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1136976, https://bugzilla.redhat.com/show_bug.cgi?id=1139389 (was: https://bugzilla.redhat.com/show_bug.cgi?id=1136976) > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 3 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-414) Fix compilation warnings in the native part

by Radoslav Husar (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-414?page=com.atlassian.jira.pl... ] Radoslav Husar updated MODCLUSTER-414: -------------------------------------- Status: Resolved (was: Pull Request Sent) Resolution: Done Merged. > Fix compilation warnings in the native part > ------------------------------------------- > > Key: MODCLUSTER-414 > URL: https://issues.jboss.org/browse/MODCLUSTER-414 > Project: mod_cluster > Issue Type: Feature Request > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.3.0.Final > Reporter: Radoslav Husar > Assignee: Radoslav Husar > Fix For: 1.3.1.Final > > > There are currently 14 warnings: > {noformat} > [rhusar@x220 native]$ for x in advertise mod_cluster_slotmem mod_manager mod_proxy_cluster; do cd $x; ./buildconf; ./configure --with-apxs=/usr/bin/apxs; make clean; make; cd ..; done > Creating configure ... > checking for Apache httpd installation... APXS is /usr/bin/apxs > apxs_support is true > configure: creating ./config.status > config.status: creating Makefile > make[1]: Entering directory `/home/rhusar/git/mod_cluster/native/advertise' > rm -f *.o *.lo *.slo *.obj *.a *.la > rm -rf .libs > make[1]: Leaving directory `/home/rhusar/git/mod_cluster/native/advertise' > rm -f *.o *.lo *.slo *.so > rm -rf .libs > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c mod_advertise.c && touch mod_advertise.slo > mod_advertise.c: In function 'cmd_advertise_g': > mod_advertise.c:178:34: warning: comparison with string literal results in unspecified behavior [-Waddress] > mconf->ma_advertise_adrs != MA_DEFAULT_GROUP) > ^ > mod_advertise.c: In function 'post_config_hook': > mod_advertise.c:657:13: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'int' [-Wformat=] > ptr = apr_psprintf(pproc, "%s:%lu", ma_server_rec->server_hostname, port); > ^ > mod_advertise.c: In function 'advertise_info': > mod_advertise.c:734:24: warning: format '%d' expects argument of type 'int', but argument 6 has type 'long int' [-Wformat=] > ); > ^ > mod_advertise.c: At top level: > mod_advertise.c:89:26: warning: 'ma_listen_socket' defined but not used [-Wunused-variable] > static apr_socket_t *ma_listen_socket = NULL; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=link gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro,-z,now -o mod_advertise.la -rpath /usr/lib64/httpd/modules -module -avoid-version mod_advertise.lo > /usr/lib64/httpd/build/instdso.sh SH_LIBTOOL='/usr/lib64/apr-1/build/libtool --silent' mod_advertise.la `pwd` > /usr/lib64/apr-1/build/libtool --silent --mode=install install mod_advertise.la /home/rhusar/modc/native/advertise/ > libtool: install: warning: remember to run `libtool --finish /usr/lib64/httpd/modules' > Creating configure ... > checking for Apache httpd installation... APXS is /usr/bin/apxs > apxs_support is true > configure: creating ./config.status > config.status: creating Makefile > make[1]: Entering directory `/home/rhusar/git/mod_cluster/native/mod_cluster_slotmem' > rm -f *.o *.lo *.slo *.obj *.a *.la > rm -rf .libs > make[1]: Leaving directory `/home/rhusar/git/mod_cluster/native/mod_cluster_slotmem' > rm -f *.o *.lo *.slo *.so > rm -rf .libs > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c sharedmem_util.c && touch sharedmem_util.slo > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c mod_sharedmem.c && touch mod_sharedmem.slo > /usr/lib64/apr-1/build/libtool --silent --mode=link gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro,-z,now -o mod_cluster_slotmem.la -rpath /usr/lib64/httpd/modules -module -avoid-version mod_sharedmem.lo sharedmem_util.lo > /usr/lib64/httpd/build/instdso.sh SH_LIBTOOL='/usr/lib64/apr-1/build/libtool --silent' mod_cluster_slotmem.la `pwd` > /usr/lib64/apr-1/build/libtool --silent --mode=install install mod_cluster_slotmem.la /home/rhusar/modc/native/mod_cluster_slotmem/ > libtool: install: warning: remember to run `libtool --finish /usr/lib64/httpd/modules' > Creating configure ... > checking for Apache httpd installation... APXS is /usr/bin/apxs > apxs_support is true > configure: creating ./config.status > config.status: creating Makefile > make[1]: Entering directory `/home/rhusar/git/mod_cluster/native/mod_manager' > rm -f *.o *.lo *.slo *.obj *.a *.la > rm -rf .libs > make[1]: Leaving directory `/home/rhusar/git/mod_cluster/native/mod_manager' > rm -f *.o *.lo *.slo *.so > rm -rf .libs > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c mod_manager.c && touch mod_manager.slo > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c node.c && touch node.slo > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c context.c && touch context.slo > context.c: In function 'remove_context': > context.c:182:5: warning: 'rv' may be used uninitialized in this function [-Wmaybe-uninitialized] > return rv; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c host.c && touch host.slo > host.c: In function 'remove_host': > host.c:179:5: warning: 'rv' may be used uninitialized in this function [-Wmaybe-uninitialized] > return rv; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c balancer.c && touch balancer.slo > balancer.c: In function 'remove_balancer': > balancer.c:179:5: warning: 'rv' may be used uninitialized in this function [-Wmaybe-uninitialized] > return rv; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c sessionid.c && touch sessionid.slo > sessionid.c: In function 'remove_sessionid': > sessionid.c:179:5: warning: 'rv' may be used uninitialized in this function [-Wmaybe-uninitialized] > return rv; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c domain.c && touch domain.slo > domain.c: In function 'remove_domain': > domain.c:179:5: warning: 'rv' may be used uninitialized in this function [-Wmaybe-uninitialized] > return rv; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c jgroupsid.c && touch jgroupsid.slo > jgroupsid.c: In function 'remove_jgroupsid': > jgroupsid.c:175:5: warning: 'rv' may be used uninitialized in this function [-Wmaybe-uninitialized] > return rv; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=link gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro,-z,now -o mod_manager.la -rpath /usr/lib64/httpd/modules -module -avoid-version mod_manager.lo node.lo context.lo host.lo balancer.lo sessionid.lo domain.lo jgroupsid.lo > /usr/lib64/httpd/build/instdso.sh SH_LIBTOOL='/usr/lib64/apr-1/build/libtool --silent' mod_manager.la `pwd` > /usr/lib64/apr-1/build/libtool --silent --mode=install install mod_manager.la /home/rhusar/modc/native/mod_manager/ > libtool: install: warning: remember to run `libtool --finish /usr/lib64/httpd/modules' > Creating configure ... > checking for Apache httpd installation... APXS is /usr/bin/apxs > apxs_support is true > configure: creating ./config.status > config.status: creating Makefile > make[1]: Entering directory `/home/rhusar/git/mod_cluster/native/mod_proxy_cluster' > rm -f *.o *.lo *.slo *.obj *.a *.la > rm -rf .libs > make[1]: Leaving directory `/home/rhusar/git/mod_cluster/native/mod_proxy_cluster' > rm -f *.o *.lo *.slo *.so > rm -rf .libs > /usr/lib64/apr-1/build/libtool --silent --mode=compile gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/httpd -I. -I/usr/include/apr-1 -I../include -prefer-pic -c mod_proxy_cluster.c && touch mod_proxy_cluster.slo > mod_proxy_cluster.c: In function 'add_balancer_node': > mod_proxy_cluster.c:677:12: warning: unused variable 'sizew' [-Wunused-variable] > int sizew = conf->workers->elt_size; > ^ > mod_proxy_cluster.c: In function 'remove_workers_node': > mod_proxy_cluster.c:958:13: warning: unused variable 'sizeb' [-Wunused-variable] > int sizeb = conf->balancers->elt_size; > ^ > mod_proxy_cluster.c:957:15: warning: unused variable 'ptr' [-Wunused-variable] > char *ptr = conf->balancers->elts; > ^ > mod_proxy_cluster.c:956:15: warning: unused variable 'name' [-Wunused-variable] > char *name = apr_pstrcat(pool, "balancer://", node->mess.balancer, NULL); > ^ > mod_proxy_cluster.c: In function 'proxy_cluster_try_pingpong': > mod_proxy_cluster.c:1419:9: warning: format '%d' expects argument of type 'int', but argument 8 has type 'struct conn_rec *' [-Wformat=] > ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server, > ^ > mod_proxy_cluster.c: In function 'proxy_cluster_post_config': > mod_proxy_cluster.c:2669:17: warning: unused variable 'userdata_key' [-Wunused-variable] > const char *userdata_key = "mod_cluster_init"; > ^ > mod_proxy_cluster.c: In function 'find_node_context_host.isra.19': > mod_proxy_cluster.c:1709:21: warning: 'balancer' may be used uninitialized in this function [-Wmaybe-uninitialized] > proxy_balancer *balancer; > ^ > /usr/lib64/apr-1/build/libtool --silent --mode=link gcc -std=gnu99 -pthread -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro,-z,now -o mod_proxy_cluster.la -rpath /usr/lib64/httpd/modules -module -avoid-version mod_proxy_cluster.lo > /usr/lib64/httpd/build/instdso.sh SH_LIBTOOL='/usr/lib64/apr-1/build/libtool --silent' mod_proxy_cluster.la `pwd` > /usr/lib64/apr-1/build/libtool --silent --mode=install install mod_proxy_cluster.la /home/rhusar/modc/native/mod_proxy_cluster/ > libtool: install: warning: remember to run `libtool --finish /usr/lib64/httpd/modules' > {noformat} -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by Aaron Ogburn (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] Aaron Ogburn commented on MODCLUSTER-427: ----------------------------------------- Ah, thanks, JF, you are right. I hadn't looked into proxy_cluster_child_init() yet. But checking it out, I see currently that it creates the proxy_cluster_watchdog thread, but doesn't do anything itself to fill the balancer. The created proxy_cluster_watchdog thread will fill the balancer after its first two seconds of sleep though. Further testing around this matched up. If I waited 2+ seconds for a subsequent request, sticky sessions can work since the proxy_cluster_watchdog executed and filled the balancer in the new process. If the next request comes in less than 2 seconds later, proxy_cluster_watchdog didn't execute yet and so the balancer is not yet filled in the new process, and sticky sessions are not maintained as mentioned in my original description. I modified my PRs to address proxy_cluster_child_init() since it's the origin of the issue. I essentially added the update_workers_node calls here to ensure it is done within init rather than lazily 2 seconds later when the created proxy_cluster_watchdog first executes. This fixes my reproduction as well. What do you think? > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by Jean-Frederic Clere (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] Jean-Frederic Clere commented on MODCLUSTER-427: ------------------------------------------------ I am not sure the fix is at the right place, basically the balancer should be filled when the child is created (in proxy_cluster_child_init()), > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by RH Bugzilla Integration (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] RH Bugzilla Integration updated MODCLUSTER-427: ----------------------------------------------- Bugzilla Update: Perform Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1136976 > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by Aaron Ogburn (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] Aaron Ogburn edited comment on MODCLUSTER-427 at 9/3/14 1:46 PM: ----------------------------------------------------------------- PRs for how I was able to fix locally. I'm not sure if something deeper may be going on to cause the empty balancer, but doing update_workers_node before internal_find_best_byrequests and before the sticky session check populates the balancer in time so stickiness can be maintained. was (Author: aogburn): PRs for how I was able to fix locally. I'm not sure if something deeper may be going on to cause the empty balancer, but doing update_workers_node before internal_find_best_byrequests and before the sticky session check populates it in time so stickiness can be maintained. > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by Aaron Ogburn (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] Aaron Ogburn updated MODCLUSTER-427: ------------------------------------ Status: Pull Request Sent (was: Open) Git Pull Request: https://github.com/modcluster/mod_cluster/pull/106, https://github.com/modcluster/mod_cluster/pull/107 PRs for how I was able to fix locally. I'm not sure if something deeper may be going on to cause the empty balancer, but doing update_workers_node before internal_find_best_byrequests and before the sticky session check populates it in time so stickiness can be maintained. > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

[JBoss JIRA] (MODCLUSTER-427) mod_cluster can break stickiness for the first request on new child processes

by Radoslav Husar (JIRA)

[ https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl... ] Radoslav Husar updated MODCLUSTER-427: -------------------------------------- Fix Version/s: 1.2.10.Final 1.3.1.Final > mod_cluster can break stickiness for the first request on new child processes > ----------------------------------------------------------------------------- > > Key: MODCLUSTER-427 > URL: https://issues.jboss.org/browse/MODCLUSTER-427 > Project: mod_cluster > Issue Type: Bug > Security Level: Public(Everyone can see) > Components: Native (httpd modules) > Affects Versions: 1.2.9.Final, 1.3.1.Alpha1 > Environment: JBoss EAP 6.3.0 > Reporter: Aaron Ogburn > Assignee: Jean-Frederic Clere > Fix For: 1.2.10.Final, 1.3.1.Final > > > mod_cluster can break stickiness for the first request on new child processes. It looks like this occurs specifically when "CreateBalancers 1" is used. Prefork typically makes this much worse as well. > My debugging showed that the proxy_balancer would exist, but it would essentially be empty in the new child process. find_session_route/find_route_worker would be called as expected, but the for loop in find_route_worker wasn't even doing anything because balancer->workers->nelts was 0. The balancer would then finally be populated in the new child when the first request hits internal_find_best_byrequests and calls update_workers_node. -- This message was sent by Atlassian JIRA (v6.3.1#6329)

10 years, 4 months

1
0
0 / 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

mod_cluster-issues September 2014