[Clustering/JBoss] - Weird issue with clustered nodes
by mohitanchlia
I've posted the issue here: http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4138599#4138599
I am also pasting it here, in case the other thread is not being looked at:
I am seeing some weird behavior. We are running into a serious issue where nodes join the cluster and soon afterwards disappear from it. For example, if I have 5 nodes, they all join the cluster initially, and then after some time we see one of the following:
1. A dead member message for one of the nodes, even though the "dead" node is up and running. I can run the JGroups sender/receiver test from the dead node to the other nodes with no problems. I would assume that a dead member would try to communicate again after some time if there had been a temporary problem, but that doesn't seem to be happening.
2. As noted above in your discussion I get 2008-02-21 08:35:11,784 WARN [org.jgroups.protocols.pbcast.GMS] failed to collect all ACKs (3) for view [172.17.65.39:40883|5] [172.17.65.39:40883, 172.17.66.39:35267, 172.17.67.39:39896, 172.17.64.39:52927] after 5000ms, missing ACKs from [172.17.65.39:40883, 172.17.66.39:35267, 172.17.6 .....
I am not sure why that's happening or what it really means. All I can guess is that the node is not able to get the datagram. Also, I am assuming this is the coordinator.
3. In a cluster of 5, all 5 nodes initially join the cluster, and after some time nodes 1, 2 and 3 form one cluster while nodes 4 and 5 form another. All of them have the same UDP group, name and port, so I don't really understand how they can split, or why they don't merge back together if there was only a temporary issue.
Overall I am not able to understand this weirdness. I am planning to run some JGroups load tests. We've spoken to our network team and they don't see any issues on the switch. I've looked at the NIC and don't see any problems. IGMP is enabled on all the routers. Also, how can I tell which node is now the coordinator?
I also ran traceroute to make sure the TTL is not the problem.
It would be really helpful if you could let me know how I can debug this issue. Below is the UDP JGroups config:
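On the coordinator question: in JGroups the coordinator is, by convention, the first member listed in the current view, so it can be read straight off a GMS log line such as the one quoted in item 2. The following is a minimal sketch that extracts it from such a line; the class name ViewLogParser and its methods are made up for illustration and are not part of JGroups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a JGroups class: reads a view out of a GMS log line.
public class ViewLogParser {
    // Matches "[creator|id] [member, member, ...]" as printed in the GMS warning.
    private static final Pattern VIEW =
        Pattern.compile("\\[([^\\]|]+)\\|(\\d+)\\]\\s*\\[([^\\]]*)\\]");

    /** Returns the members of the first view found in the line, in view order. */
    public static List<String> members(String logLine) {
        List<String> result = new ArrayList<>();
        Matcher m = VIEW.matcher(logLine);
        if (m.find()) {
            for (String part : m.group(3).split(",")) {
                String address = part.trim();
                if (!address.isEmpty()) {
                    result.add(address);
                }
            }
        }
        return result;
    }

    /** Assumes the coordinator is the first member of the view; null if no view is found. */
    public static String coordinator(String logLine) {
        List<String> all = members(logLine);
        return all.isEmpty() ? null : all.get(0);
    }

    public static void main(String[] args) {
        String line = "failed to collect all ACKs (3) for view [172.17.65.39:40883|5] "
            + "[172.17.65.39:40883, 172.17.66.39:35267, 172.17.67.39:39896, "
            + "172.17.64.39:52927] after 5000ms";
        System.out.println("coordinator = " + coordinator(line));
        System.out.println("members     = " + members(line));
    }
}
```

For the line from item 2 this reports 172.17.65.39:40883 as the coordinator, which matches the guess that the node logging the warning is the coordinator, since that address also appears as the view's creator.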
30000
<!-- The JGroups protocol configuration -->
<!--
The default UDP stack:
- If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
appropriate NIC IP address, e.g bind_addr="192.168.0.2".
- On Windows machines, because of the media sense feature being broken with multicast
(even after disabling media sense) set the UDP protocol's loopback attribute to true
-->
<UDP mcast_addr="${efe.partition.udpGroup:228.1.2.3}"
mcast_port="${jboss.hapartition.mcast_port:45566}"
tos="8"
ucast_recv_buf_size="20000000"
ucast_send_buf_size="640000"
mcast_recv_buf_size="25000000"
mcast_send_buf_size="640000"
loopback="false"
discard_incompatible_packets="true"
enable_bundling="false"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
ip_ttl="${jgroups.udp.ip_ttl:2}"
down_thread="false" up_thread="false"/>
<PING timeout="2000"
down_thread="false" up_thread="false" num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<UNICAST timeout="300,600,1200,2400,3600"
down_thread="false" up_thread="false"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
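As a rough aid for reading the failure-detection values in the stack above: as far as I understand JGroups 2.x, FD suspects a member after max_tries unanswered heartbeats of timeout ms each, and VERIFY_SUSPECT then waits its own timeout before the suspicion is passed on. The sketch below only does that arithmetic for the posted values. The class and method names are made up, and FD_SOCK can report a failure sooner, so treat the result as an upper estimate rather than exact behaviour.

```java
// Hypothetical helper, not part of JGroups: arithmetic on the posted FD settings.
public class DetectionTiming {
    /** Time in ms before FD suspects a silent member: maxTries missed heartbeats. */
    public static long fdSuspectMillis(long timeoutMs, int maxTries) {
        return timeoutMs * maxTries;
    }

    /** Adds the VERIFY_SUSPECT wait to the FD suspicion time. */
    public static long exclusionMillis(long fdTimeoutMs, int fdMaxTries, long verifyTimeoutMs) {
        return fdSuspectMillis(fdTimeoutMs, fdMaxTries) + verifyTimeoutMs;
    }

    public static void main(String[] args) {
        // Values from the config: FD timeout="10000" max_tries="5", VERIFY_SUSPECT timeout="1500".
        System.out.println(fdSuspectMillis(10000, 5));       // prints 50000
        System.out.println(exclusionMillis(10000, 5, 1500)); // prints 51500
    }
}
```

With these settings a node that stops answering heartbeats for roughly 50 seconds, for example during a long garbage-collection pause or a multicast outage, would be suspected. MERGE2 looks for other subgroups at an interval between min_interval (20000 ms) and max_interval (100000 ms), so a split that never heals after a couple of minutes suggests that discovery traffic between the two halves is still not getting through.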
---------
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4191813#4191813
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4191813
17 years, 5 months
[JBoss Portal] - Portal event and ViewExpiredException
by jaro777
Hello,
I have the following issue.
Environment:
jboss-portal-2.7.0.GA (bundled version) -> JSF RI 1.2_08 and Jboss portlet bridge 1.0.0.B4
I have PortletA and PortletB, both JSF portlets. PortletA publishes an event and PortletB processes it. Sometimes, when the event is triggered from PortletA, a ViewExpiredException is thrown while restoring the view of PortletB:
| Caused by: javax.faces.application.ViewExpiredException: viewId:/scsearch/scsearchdetail.jsp - View /scsearch/scsearchdetail.jsp could not be restored.
| at com.sun.faces.lifecycle.RestoreViewPhase.execute(RestoreViewPhase.java:186)
| at com.sun.faces.lifecycle.Phase.doPhase(Phase.java:100)
| at com.sun.faces.lifecycle.RestoreViewPhase.doPhase(RestoreViewPhase.java:104)
| at com.sun.faces.lifecycle.LifecycleImpl.execute(LifecycleImpl.java:118)
| at org.jboss.portletbridge.AjaxPortletBridge.execute(AjaxPortletBridge.java:587)
| at org.jboss.portletbridge.AjaxPortletBridge.renderResponse(AjaxPortletBridge.java:441)
| at org.jboss.portletbridge.AjaxPortletBridge.doFacesRequest(AjaxPortletBridge.java:344)
As a workaround, the following partly works (the stack trace is no longer displayed, although the event is still not processed). Add it to the web.xml of PortletB's WAR:
| <context-param>
| <param-name>com.sun.faces.enableRestoreView11Compatibility</param-name>
| <param-value>true</param-value>
| </context-param>
|
Still not satisfied with this "solution", I debugged JSF and came to a strange conclusion: JSF adds a hidden field to the form (there is a form in the underlying JSP page of both PortletA and PortletB), like
| <INPUT id="javax.faces.ViewState" type="hidden" name="javax.faces.ViewState" value="j_id9" />
|
This value stores the version of the view kept in the session for a particular view. The problem is that the same version is used for both portlets. If PortletB does not have such a version (e.g. its HTML looks like)
| <INPUT id="javax.faces.ViewState" type="hidden" name="javax.faces.ViewState" value="j_id8" />
|
then no view is found and ViewExpiredException is thrown.
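The lookup failure described above can be illustrated with a toy model. This is only a sketch of the behaviour as I understand it from debugging, not the JSF RI code: the class ViewStateModel, its nested map and the use of IllegalStateException in place of ViewExpiredException are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical, not JSF RI): saved views keyed by viewId, then by state id.
public class ViewStateModel {
    // Stand-in for the session: viewId -> (state id such as "j_id9" -> saved view).
    private final Map<String, Map<String, String>> session = new HashMap<>();

    /** Stores a saved view under the given viewId and state id. */
    public void save(String viewId, String stateId, String tree) {
        session.computeIfAbsent(viewId, k -> new HashMap<>()).put(stateId, tree);
    }

    /**
     * Returns the saved view, or throws IllegalStateException where JSF
     * would raise ViewExpiredException.
     */
    public String restore(String viewId, String stateId) {
        Map<String, String> states = session.get(viewId);
        String tree = (states == null) ? null : states.get(stateId);
        if (tree == null) {
            throw new IllegalStateException("View " + viewId + " could not be restored.");
        }
        return tree;
    }

    public static void main(String[] args) {
        ViewStateModel model = new ViewStateModel();
        model.save("/portletA.jsp", "j_id9", "tree of PortletA");
        model.save("/scsearch/scsearchdetail.jsp", "j_id8", "tree of PortletB");
        // The event request carries PortletA's state id, so PortletB's lookup misses.
        try {
            model.restore("/scsearch/scsearchdetail.jsp", "j_id9");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, restoring PortletB's view with its own id (j_id8) succeeds, while restoring it with PortletA's id (j_id9) fails, which is the situation the hidden fields above suggest. The view id /portletA.jsp is invented for the example.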
Any idea how to solve this properly?
Thanks
Jaro
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4191811#4191811
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4191811
[JBoss jBPM] - ExceptionHandler / changes since 3.2GA
by mpet
Hello,
I used a global exception handler in my process definition, like the following:
<?xml version="1.0" encoding="UTF-8"?>
| <process-definition
| xmlns="urn:jbpm.org:jpdl-3.2"
| name="simple">
| <start-state name="start">
| <transition name="to_state" to="first">
| <action name="action" class="com.sample.action.NoHandlerAvailable">
| <message>Going to the first state!</message>
| </action>
| </transition>
| </start-state>
| <state name="first">
| <transition to="end"></transition>
| </state>
| <end-state name="end"></end-state>
| <exception-handler>
| <action class="com.sample.action.TestExceptionHandler">
| </action>
| </exception-handler>
| </process-definition>
In the above example, the action handler com.sample.action.NoHandlerAvailable is not available (or maybe throws an exception). When the exception handler class was itself not available (or threw an exception), 3.1.2 and 3.2GA stopped executing the process flow. After upgrading to 3.2.2 or above (I have not tried 3.2.1), if the exception handler class is not available or throws an exception, that exception seems to be handled by the same exception handler, causing an (infinite?) loop.
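To make the suspected loop concrete, here is a toy model in plain Java. It is not jBPM's dispatch code: the Handler interface, dispatchNaive and guarded are hypothetical names, and the retry limit exists only so the demonstration terminates. It shows a failing handler being re-invoked with its own exception, and a defensive wrapper that catches everything inside the handler so nothing propagates back.

```java
// Toy model (hypothetical, not jBPM): an exception handler whose own failure
// is routed back to the same handler.
public class GuardedHandlerDemo {
    /** Stand-in for an exception-handling action. */
    public interface Handler {
        void handle(Exception cause) throws Exception;
    }

    /**
     * Routes a failure inside the handler back to the same handler, as the
     * post describes. Stops after maxDepth calls; returns the number of calls made.
     */
    public static int dispatchNaive(Handler handler, Exception cause, int maxDepth) {
        int calls = 0;
        Exception current = cause;
        while (calls < maxDepth) {
            calls++;
            try {
                handler.handle(current);
                return calls; // handled without a new failure
            } catch (Exception e) {
                current = e; // the handler failed: feed its exception back in
            }
        }
        return calls;
    }

    /** Wraps a handler so its own failures are logged and never rethrown. */
    public static Handler guarded(Handler delegate) {
        return cause -> {
            try {
                delegate.handle(cause);
            } catch (Exception e) {
                System.err.println("exception handler failed: " + e);
            }
        };
    }

    public static void main(String[] args) {
        Handler broken = cause -> { throw new IllegalStateException("handler broken"); };
        System.out.println(dispatchNaive(broken, new Exception("original"), 10));
        System.out.println(dispatchNaive(guarded(broken), new Exception("original"), 10));
    }
}
```

In a real process definition the same idea would mean wrapping the body of the exception handler's action class in a try/catch. That only helps when the class loads at all, and swallowing the failure can hide real problems, so it is a stopgap rather than an answer to whether the new behaviour is intended.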
Might this be a bug or is the new behaviour intended? Or am I missing something?
Thanks for any help.
Marko
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4191799#4191799
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4191799
[EJB 3.0] - Re: client interceptor - ejb3-interceptors-aop.xml
by ALRubinger
There's not much we can do about Community AS 4.2.x.
In current trunk I see your problem:
10:47:58,654 ERROR [ProfileServiceBootstrap] Failed to load profile: Summary of incomplete deployments (SEE PREVIOUS ERRORS FOR DETAILS):
|
| *** CONTEXTS MISSING DEPENDENCIES: Name -> Dependency{Required State:Actual State}
|
| StatelessSessionClientInterceptors
| -> StatelessSessionClientInterceptors$4{Configured:Instantiated}
|
| StatelessSessionClientInterceptors$4
| -> com.alrubinger.Test{Configured:** NOT FOUND Depends on 'com.alrubinger.Test' **}
|
|
| *** CONTEXTS IN ERROR: Name -> Error
|
| com.alrubinger.Test -> ** NOT FOUND Depends on 'com.alrubinger.Test' **
This isn't really within the EJB3 domain, though: the file is processed by the AOP AspectManager, and I'll have to ask the AOP team why the classes declared on the stack must be available immediately.
Please raise an EJBTHREE JIRA and I'll dig around.
S,
ALR
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4191798#4191798
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4191798