[JBoss JIRA] Created: (JGRP-1326) Gossip Router dropping message for node that is in its routing table list
by vivek v (JIRA)
Gossip Router dropping message for node that is in its routing table list
-------------------------------------------------------------------------
Key: JGRP-1326
URL: https://issues.jboss.org/browse/JGRP-1326
Project: JGroups
Issue Type: Bug
Affects Versions: 2.10
Environment: Linux, Windows
Reporter: vivek v
Assignee: Bela Ban
We are using the Tunnel protocol with two Gossip Routers. For some reason we started seeing lots of suspect messages on all the nodes - there are 7 nodes in the group. Six of the nodes (including the coordinator) were suspecting node A (manager_172.27.75.11) and node A was suspecting the coordinator, but no new view was being created. After turning on trace logging on both gossip routers (GR1 and GR2) I see the following for every message that's sent to node A (manager_172.27.75.11):
{noformat}
2011-05-20 15:56:21,186 TRACE [gossip-handlers-6] GossipRouter - cannot find manager_172.27.75.11:4576 in the routing table,
routing table=
172.27.75.11_group: probe_172.27.75.13:4576, collector_172.27.75.12:4576, probe_172.27.75.15:4576, manager_172.27.75.11:4576, probe_172.27.75.16:4576, probe_172.27.75.14:4576
{noformat}
Now, the issue is that the routing table does indeed show an entry for "manager_172.27.75.11" - so why is the GR dropping messages for that node? I suspect that the Gossip Router has somehow kept an old entry which has not been cleaned up - a different UUID with the same logical address. I tried going through the GossipRouter.java code, but couldn't find how this would be possible.
As I understand it, a node randomly chooses a GR for its communication if there are multiple of them. Each GR would keep a separate list of physical addresses for each node - so is it possible that it somehow uses the physical address instead of the UUID for cleaning/retrieving the node list?
This is creating a big issue, and the only workaround is to restart the Gossip Routers.
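The suspected failure mode can be illustrated with a minimal sketch. It is not taken from GossipRouter.java: the `Addr` class and the map layout are hypothetical stand-ins, and the sketch assumes that entries are compared by UUID while the trace prints only the logical name. Under those assumptions a stale entry makes the printed table look correct even though the lookup misses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class StaleEntryDemo {
    // Hypothetical stand-in for a node address: equality is by UUID only,
    // the logical name is used just for printing.
    static final class Addr {
        final UUID uuid;
        final String logicalName;

        Addr(UUID uuid, String logicalName) {
            this.uuid = uuid;
            this.logicalName = logicalName;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Addr && ((Addr) o).uuid.equals(uuid);
        }

        @Override public int hashCode() { return uuid.hashCode(); }

        @Override public String toString() { return logicalName; }
    }

    // Returns true if the destination is present in the (hypothetical) routing table.
    static boolean canRoute(Map<Addr, String> routingTable, Addr dest) {
        return routingTable.containsKey(dest);
    }

    public static void main(String[] args) {
        Map<Addr, String> routingTable = new HashMap<>();

        // Entry registered by an earlier incarnation of the node and never removed.
        Addr stale = new Addr(UUID.randomUUID(), "manager_172.27.75.11:4576");
        routingTable.put(stale, "old-connection");

        // Same logical name, but a new UUID after the node reconnected.
        Addr current = new Addr(UUID.randomUUID(), "manager_172.27.75.11:4576");

        System.out.println("can route: " + canRoute(routingTable, current));
        System.out.println("routing table: " + routingTable.keySet());
    }
}
```

If the real router behaves like this, the log line would name the node as missing while listing an entry with the identical logical name, which matches the trace above. Whether that is what actually happens in 2.10 is not confirmed here.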
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (JGRP-1165) Out-of-sync views in the cluster causes NAKACK issues and invalid node list at application layer
by vivek v (JIRA)
Out-of-sync views in the cluster causes NAKACK issues and invalid node list at application layer
-------------------------------------------------------------------------------------------------
Key: JGRP-1165
URL: https://jira.jboss.org/jira/browse/JGRP-1165
Project: JGroups
Issue Type: Bug
Affects Versions: 2.9, 2.8
Reporter: vivek v
Assignee: Bela Ban
GMS contains logic (in the installView(..) method) that checks whether the node itself is in the view; if it is not, the view is simply discarded:
{code}
if(checkSelfInclusion(mbrs) == false) {
    if(log.isWarnEnabled()) log.warn(local_addr + ": not member of view " + new_view + "; discarding it");
    return;
}
{code}
Now, the problem with this logic is that the node will remain with the old view, and when it tries to send messages to the members of the old view, the messages will be discarded by NAKACK because this node is not in their new view. Here is an example:
1) 3 nodes all with same view - V1 {n1, n2, n3}
2) n1 (coordinator) suspects (due to missing heartbeat) n2 and publishes new view - V2 {n1, n3}
- n2 discards the suspect message from n1 as FD_SOCK is still connected
3) n2 receives this view, but discards it due to the logic in GMS
4) n2 still keeps the old view V1 and continues to send messages to n1 and n3. n1 and n3 will discard messages from n2 in NAKACK as it's not in their view (V2).
5) After a few minutes (could be 10-15 minutes or more) n1 will publish a merge view V3 {n1, n2, n3} - joining V1 and V2. Now all nodes have the same view.
The problem is that on n2 the application layer never learns that it can't talk to n1 and n3 - thus, RPC calls will fail for as long as the nodes have different views.
I would expect that if a node gets a view which doesn't include itself, it should drop all the nodes that are in that new view. So, basically, we would create two new subgroups. This way the nodes won't discard messages from each other. The application layer needs to know at all times which nodes it can talk to.
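The proposed behaviour can be sketched as follows. This is only an illustration of the idea, not GMS code: `reachableMembers` is a hypothetical helper, and member addresses are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ViewSplitSketch {
    /**
     * Returns the members the local node should keep after receiving newView.
     * If the node is part of newView, the new view is adopted as-is.
     * Otherwise the node drops every member listed in newView from its
     * current view, leaving only the nodes that were excluded with it.
     */
    static List<String> reachableMembers(String self, List<String> currentView, List<String> newView) {
        if (newView.contains(self)) {
            return new ArrayList<>(newView);
        }
        List<String> remaining = new ArrayList<>(currentView);
        remaining.removeAll(newView);
        return remaining;
    }

    public static void main(String[] args) {
        List<String> v1 = Arrays.asList("n1", "n2", "n3");
        List<String> v2 = Arrays.asList("n1", "n3");

        // n2 is not in V2, so it keeps only itself.
        System.out.println(reachableMembers("n2", v1, v2)); // prints [n2]
    }
}
```

With the example above, n2 would end up in a singleton subgroup {n2} while n1 and n3 stay in V2, so the application on n2 would see immediately that n1 and n3 are unreachable until the merge view arrives.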
[JBoss JIRA] Created: (AS7-1928) Allow deployments to get access to the META-INF folder of their dependency module
by Marius Bogoevici (JIRA)
Allow deployments to get access to the META-INF folder of their dependency module
---------------------------------------------------------------------------------
Key: AS7-1928
URL: https://issues.jboss.org/browse/AS7-1928
Project: Application Server 7
Issue Type: Feature Request
Affects Versions: 7.1.0.Alpha1
Reporter: Marius Bogoevici
Priority: Critical
Currently there is absolutely no mechanism (outside of manipulating dependencies programmatically through a subsystem) for deployments to get access to META-INF/xyz files from a module that they declare as a dependency, except for those in META-INF/services.
This is a serious impediment when considering the installation of third party libraries as shared libraries (modules) in JBoss AS. Some frameworks may rely upon locating internal descriptors in META-INF, and if the libraries are installed as modules, the descriptors are not accessible to the framework. Essentially, there is no way of accessing resources under META-INF unless they are either services, or the library is packaged in the deployment.
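For illustration, this is roughly how a framework might look up such an internal descriptor through the class loader. The class name `MetaInfLookup` and the resource name `META-INF/xyz` are hypothetical; only the standard `ClassLoader.getResources` call is used. When the library is installed as a module and its META-INF directory is not exported to the deployment, a lookup like this would be expected to come back empty.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class MetaInfLookup {
    // Collects every URL under which the given resource is visible to the loader.
    static List<URL> findDescriptors(ClassLoader loader, String resourceName) throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = MetaInfLookup.class.getClassLoader();

        // "META-INF/xyz" is a placeholder for a framework-specific descriptor.
        List<URL> descriptors = findDescriptors(loader, "META-INF/xyz");

        if (descriptors.isEmpty()) {
            System.out.println("descriptor not visible to this class loader");
        } else {
            for (URL url : descriptors) {
                System.out.println("found descriptor at " + url);
            }
        }
    }
}
```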
[JBoss JIRA] (AS7-2847) TS: Clustering tests cant be run for single test with -Dtest=
by Radoslav Husar (Created) (JIRA)
TS: Clustering tests cant be run for single test with -Dtest=
-------------------------------------------------------------
Key: AS7-2847
URL: https://issues.jboss.org/browse/AS7-2847
Project: Application Server 7
Issue Type: Bug
Components: Clustering, Documentation, Test Suite
Affects Versions: 7.1.0.Beta1
Reporter: Radoslav Husar
Assignee: Paul Ferraro
Fix For: 7.1.0.CR1
Maybe I missed something?
{code}./integration-tests.sh clean install -Dts.clust -Dts.noSmoke{code}
works okay, but when I do:
{code}./integration-tests.sh clean install -Dts.clust -Dts.noSmoke -Dtest=org.jboss.as.test.clustering.cluster.ClusteredWebTestCase.java{code}
then I get:
{code}
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running org.jboss.as.test.clustering.cluster.ClusteredWebTestCase
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.915 sec <<< FAILURE!
Results :
Tests in error:
org.jboss.as.test.clustering.cluster.ClusteredWebTestCase: DeploymentScenario contains targets not matching any defined Container in the registry. clustering-udp-1. Possible causes are: No Deployable Container found on Classpath or your have defined a @org.jboss.arquillian.container.test.api.Deployment with a @org.jboss.arquillian.container.test.api.TargetsContainer value that does not match any found/configured Containers (see arquillian.xml container@qualifier)
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
{code}
Thanks