[
https://issues.jboss.org/browse/AS7-6397?page=com.atlassian.jira.plugin.s...
]
Richard Achmatowicz commented on AS7-6397:
------------------------------------------
Manuel
An update. In trying to reproduce the error you are seeing, I set up an Apache load
balancer plus a two-node cluster with non-sticky sessions, firing thousands of requests at
the load balancer. This setup caused each request to be processed at a node different from
the one before it, and so forced the global locking mechanism to transfer lock ownership
from one node to the other. I did not see any problems with this configuration in any of my
test scenarios.
Where I did manage to recreate the problem was by setting a reply_timeout on the workers,
so that if the processing of a request took longer than the reply timeout, the load
balancer would not wait for completion of the request and would retry the same request on
an alternate host. This creates a situation where the original request (now considered
failed by the load balancer) and the failed-over request both need access to the same lock
at the same time, and the exception results.
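To make the timing of that race concrete, here is a minimal single-JVM sketch. It is only an analogy: a plain ReentrantLock stands in for the cluster-wide session lock, two threads stand in for the two nodes, and the class name, method name and millisecond values are all hypothetical. It does not use or reproduce the actual AS7 lock manager.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class FailoverLockDemo {

    /**
     * Simulates a first request that holds the session lock for processingMillis,
     * and a second request for the same session that arrives on the "other node"
     * after secondRequestDelayMillis and waits at most lockTimeoutMillis for the lock.
     * Returns true if the second request obtained the lock.
     */
    public static boolean secondRequestGetsLock(long processingMillis,
                                                long secondRequestDelayMillis,
                                                long lockTimeoutMillis)
            throws InterruptedException {
        // Stand-in for the cluster-wide session lock (an assumption, not the AS7 implementation).
        ReentrantLock sessionLock = new ReentrantLock();
        CountDownLatch originalHoldsLock = new CountDownLatch(1);

        // "Node A": the original request, holding the lock while it is processed.
        Thread original = new Thread(() -> {
            sessionLock.lock();
            try {
                originalHoldsLock.countDown();
                Thread.sleep(processingMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sessionLock.unlock();
            }
        });
        original.start();
        originalHoldsLock.await();

        // The second request shows up after the given delay. With a short reply
        // timeout this is the load balancer's retry; otherwise it is simply the
        // next request, sent after the first one has completed.
        Thread.sleep(secondRequestDelayMillis);

        // "Node B": try to take ownership of the session lock, giving up after the timeout.
        boolean acquired = sessionLock.tryLock(lockTimeoutMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            sessionLock.unlock();
        }
        original.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // Retry arrives while the original request is still running.
        System.out.println("retry during processing acquires lock: "
                + secondRequestGetsLock(500, 100, 100));
        // Next request arrives only after the original request has finished.
        System.out.println("request after completion acquires lock: "
                + secondRequestGetsLock(200, 400, 100));
    }
}
```

In the first call the second request arrives while the original one still holds the lock and gives up when its own wait expires, which is the analogue of the lock timeout in the stack trace. In the second call the original request has already released the lock, as in the plain non-sticky test where no problem appeared.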
I am still investigating what happens with the locking mechanism after the condition
occurs.
However, in the meantime, please check whether you have reply_timeout values set for your
workers. If they are set, increasing these values may give a long-running request time to
complete before it is failed over and restarted on another node. This is one possible
explanation for what you are seeing.
In fact, if you could let me know which load balancer you are using and its configuration,
that would be helpful.
Richard
Cluster Environment Web Session Locks
-------------------------------------
Key: AS7-6397
URL: https://issues.jboss.org/browse/AS7-6397
Project: Application Server 7
Issue Type: Bug
Components: Clustering
Affects Versions: 7.1.1.Final
Environment: Windows 7 64bits, 8 GB RAM
Reporter: Manuel Pinto
Assignee: Paul Ferraro
Attachments: AS7-4260.patch
Hi,
I found a problem with web session locks in a cluster environment. We have two Liferay
6.1.1 nodes (on JBoss 7.1.1 Final) using the standalone-ha.xml configuration with the
infinispan "web" cache-container, a replicated-cache and a file store. The load balancer is
configured in non-sticky session mode.
Problem: in some cases, when a node processes requests, it locks the session and never
unlocks it, preventing the other node from processing requests for that session. The
affected node never regains the locked session and keeps throwing the following exception
for all subsequent requests; it only recovers the session when the other node shuts down:
Note: we also tried invalidation-cache and distributed-cache and all locking modes but
without success.
17:39:00,174 ERROR [org.apache.catalina.connector.CoyoteAdapter] (http--172.16.250.105-8080-4) An exception or error occurred in the container during the request processing: java.lang.RuntimeException: JBAS018060: Exception acquiring ownership of Cvn-K+r-cBGesIBoDrakJhrO
    at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:528) [jboss-as-web-7.1.1.Final.jar:7.1.1.Final]
    at org.jboss.as.web.session.ClusteredSession.access(ClusteredSession.java:496) [jboss-as-web-7.1.1.Final.jar:7.1.1.Final]
    at org.apache.catalina.connector.Request.doGetSession(Request.java:2625) [jbossweb-7.0.13.Final.jar:]
    at org.apache.catalina.connector.Request.getSession(Request.java:2375) [jbossweb-7.0.13.Final.jar:]
    at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:81) [jboss-as-web-7.1.1.Final.jar:7.1.1.Final]
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155) [jbossweb-7.0.13.Final.jar:]
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb-7.0.13.Final.jar:]
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb-7.0.13.Final.jar:]
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368) [jbossweb-7.0.13.Final.jar:]
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877) [jbossweb-7.0.13.Final.jar:]
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:671) [jbossweb-7.0.13.Final.jar:]
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930) [jbossweb-7.0.13.Final.jar:]
    at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_32]
Caused by: org.jboss.as.clustering.lock.TimeoutException: JBAS010223: Cannot acquire lock //default-host//Cvn-K+r-cBGesIBoDrakJhrO from cluster
    at org.jboss.as.clustering.lock.SharedLocalYieldingClusterLockManager.lock(SharedLocalYieldingClusterLockManager.java:439)
    at org.jboss.as.clustering.web.infinispan.DistributedCacheManager.acquireSessionOwnership(DistributedCacheManager.java:372)
    at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:520) [jboss-as-web-7.1.1.Final.jar:7.1.1.Final]
    ... 12 more
The standalone-ha.xml "web" cache-container config is the following:
<cache-container name="web" aliases="standard-session-cache" default-cache="repl">
    <transport lock-timeout="60000"/>
    <replicated-cache name="repl" mode="SYNC" batching="true">
        <file-store/>
    </replicated-cache>
    <replicated-cache name="sso" mode="SYNC" batching="true"/>
    <distributed-cache name="dist" mode="ASYNC" batching="true">
        <file-store/>
    </distributed-cache>
</cache-container>
Thanks,
Manuel Pinto