Hello,
I have a problem when I remove a node from a cluster. The cluster consists of two or more
identically configured nodes, on which the application is deployed via farming.
Adding nodes to the cluster works fine. As soon as the deployment is complete,
JNDIView shows all registered beans and proxies:
HA-JNDI Namespace
| +- QueueConnectionFactory
| +- XAConnectionFactory
| +- HTTPXAConnectionFactory
| +- queue
| | +- D
| | +- DLQ
| | +- C
| | +- ex
| | +- B
| | +- A
| | +- testQueue
| +- HTTPConnectionFactory
| +- UIL2XAConnectionFactory[link -> XAConnectionFactory]
| +- kusssdemo
| | +- TermPeriodEM
| | | +- local (proxy: $Proxy1060 implements interface at.jku......)
| | +- StudyCodeBusinessLogicBean
| | | +- remote (proxy: $Proxy934 implements interface at.jku.......)
| | +- StudyMajorFieldBusinessLogicBean
|
Now, when I shut down a node in the cluster, the proxies are removed from all nodes and are
not restored afterwards (leaving every application inoperable):
HA-JNDI Namespace
| +- HTTPXAConnectionFactory
| +- XAConnectionFactory
| +- QueueConnectionFactory
| +- queue
| | +- D
| | +- C
| | +- DLQ
| | +- B
| | +- ex
| | +- A
| | +- testQueue
| +- HTTPConnectionFactory
| +- UIL2XAConnectionFactory[link -> XAConnectionFactory]
| +- kusssdemo
| | +- TermPeriodEM
| | +- StudyMajorFieldBusinessLogicBean
| | +- StudyCodeBusinessLogicBean
|
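To make the difference between the two listings easier to spot, here is a minimal sketch that reports which entries from the first JNDIView dump are missing from the second. The class and method names (JndiViewDiff, entryName, missingEntries) are my own and not part of JBoss, and it assumes each entry sits on its own line with the "| " and "+- " tree prefixes shown above. It compares entry names only and ignores nesting depth.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JndiViewDiff {

    /** Strips the tree-drawing prefix ("| ", "+- ") and surrounding whitespace from one listing line. */
    public static String entryName(String line) {
        return line.replaceAll("^[|\\s]*(\\+-)?\\s*", "").trim();
    }

    /** Returns the entries present in 'before' but absent from 'after', in their original order. */
    public static List<String> missingEntries(List<String> before, List<String> after) {
        Set<String> remaining = new HashSet<>();
        for (String line : after) {
            remaining.add(entryName(line));
        }
        List<String> missing = new ArrayList<>();
        for (String line : before) {
            String name = entryName(line);
            // Skip lines that contain only tree-drawing characters.
            if (!name.isEmpty() && !remaining.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Excerpts of the two listings above (before and after the node shutdown).
        List<String> before = Arrays.asList(
            "| +- kusssdemo",
            "| | +- TermPeriodEM",
            "| | | +- local (proxy: $Proxy1060 implements interface at.jku......)",
            "| | +- StudyCodeBusinessLogicBean",
            "| | | +- remote (proxy: $Proxy934 implements interface at.jku.......)",
            "|");
        List<String> after = Arrays.asList(
            "| +- kusssdemo",
            "| | +- TermPeriodEM",
            "| | +- StudyCodeBusinessLogicBean",
            "|");
        for (String name : missingEntries(before, after)) {
            System.out.println("missing: " + name);
        }
    }
}
```

For the excerpts used in main, only the two proxy entries (local and remote) are reported as missing, which matches what I see in JNDIView.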
The only suspicious, related thing I can see in the log of another node (one that is still
alive) is the "was NOT removed!!!" message:
INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] New cluster view for partition DefaultPartition: 13 ([192.168.1.104:1099, 192.168.1.106:1099] delta: -1)
DEBUG [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] dead members: [192.168.1.105:1099]
DEBUG [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] membership changed from 2 to 2
DEBUG [org.jgroups.protocols.pbcast.NAKACK] removing 192.168.1.105:7800 from received_msgs (not member anymore)
DEBUG [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] Begin notifyListeners, viewID: 13
DEBUG [org.jgroups.protocols.FD_SOCK] VIEW_CHANGE received: [192.168.1.104:7800, 192.168.1.106:7800]
INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] I am (192.168.1.106:1099) received membershipChanged event:
INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] Dead members: 1 ([192.168.1.105:1099])
INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] New Members : 0 ([])
INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] All Members : 2 ([192.168.1.104:1099, 192.168.1.106:1099])
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] purgeDeadMembers, [192.168.1.105:1099]
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] trying to remove deadMember 192.168.1.105:1099 for key DCacheBridge-DefaultJGBridge
DEBUG [org.jgroups.protocols.FD] suspected_mbrs: [], after adjustment: []
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] 192.168.1.105:1099 was NOT removed!!!
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] trying to remove deadMember 192.168.1.105:1099 for key jboss.ha:service=HASingletonDeployer
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] 192.168.1.105:1099 was NOT removed!!!
DEBUG [org.jgroups.protocols.FD_SOCK] determinePingDest()=192.168.1.104:7800, pingable_mbrs=[192.168.1.104:7800, 192.168.1.106:7800]
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] trying to remove deadMember 192.168.1.105:1099 for key HAJNDI
DEBUG [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] 192.168.1.105:1099 was NOT removed!!!
DEBUG [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] End notifyListeners, viewID: 13
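Since the relevant messages are interleaved with JGroups output, here is a rough sketch that pulls out the keys for which "was NOT removed!!!" was logged. It relies only on the two message formats visible in the log above. The class and method names (NotRemovedKeys, keysNotRemoved) are my own, and it assumes that each "was NOT removed!!!" message belongs to the most recent "trying to remove deadMember" message for the same member.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NotRemovedKeys {

    // Either a removal attempt (groups 1 = member, 2 = key) or a failure report (group 3 = member).
    private static final Pattern EVENT = Pattern.compile(
        "trying to remove deadMember (\\S+) for key (\\S+)|(\\S+) was NOT removed!!!");

    /** Returns the keys whose removal attempt was followed by "was NOT removed!!!" for the same member. */
    public static List<String> keysNotRemoved(String log) {
        // Undo line wrapping and drop a leading "| " prefix so each message is contiguous.
        String flat = log.replaceAll("\\s*\\n\\s*(\\|\\s*)?", " ");
        List<String> keys = new ArrayList<>();
        String pendingMember = null;
        String pendingKey = null;
        Matcher m = EVENT.matcher(flat);
        while (m.find()) {
            if (m.group(1) != null) {
                // Remember the latest removal attempt.
                pendingMember = m.group(1);
                pendingKey = m.group(2);
            } else if (pendingKey != null && m.group(3).equals(pendingMember)) {
                // Failure report for the pending attempt.
                keys.add(pendingKey);
                pendingKey = null;
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Excerpt of the log above, wrapped the way it was posted.
        String log =
              "| DEBUG\n"
            + "[org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] trying to\n"
            + "remove deadMember 192.168.1.105:1099 for key HAJNDI\n"
            + "| DEBUG\n"
            + "[org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition]\n"
            + "192.168.1.105:1099 was NOT removed!!!\n";
        System.out.println(keysNotRemoved(log)); // prints [HAJNDI]
    }
}
```

Run against the full log above, this would list the three keys DCacheBridge-DefaultJGBridge, jboss.ha:service=HASingletonDeployer and HAJNDI. It says nothing about why the removal was reported that way.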
As soon as I join any other node to the cluster again, everything continues to work fine.
Does anybody have an idea what might be wrong here?