[JBoss JIRA] (ISPN-2633) IllegalArgumentException: Address shall not be null and broken views upon JOIN
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2633?page=com.atlassian.jira.plugin.... ]
Mircea Markus commented on ISPN-2633:
-------------------------------------
[~rhusar] Can you please fill in the "Affects Version" field when creating a bug? This is important for us to be able to reproduce it.
Also, what functional impact does this have? Can the node not join? Is there a workaround (e.g. redoing the join operation)? This would help in assessing the priority.
> IllegalArgumentException: Address shall not be null and broken views upon JOIN
> ------------------------------------------------------------------------------
>
> Key: ISPN-2633
> URL: https://issues.jboss.org/browse/ISPN-2633
> Project: Infinispan
> Issue Type: Bug
> Environment: Windows 2008
> Reporter: Radoslav Husar
> Assignee: Dan Berindei
> Fix For: 5.2.0.CR1
>
> Attachments: configs.zip, serverlogs.zip
>
>
> Running our AS testsuite with master (which includes the fix for ISPN-2572), I am now seeing the following instead.
> The same node is present twice in the view. It looks like something is wrong with JOINs, as this happens when a node joins.
> {noformat}
> 15:51:41,376 WARN [org.jgroups.protocols.TP$ProtocolAdapter] (OOB-20,null) dropping unicast message to wrong destination node-1/ejb; my local_addr is node-1/ejb
> 15:51:41,821 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 19) ISPN000094: Received new cluster view: [node-0/ejb|12] [node-0/ejb, node-1/ejb, node-1/ejb]
> 15:51:41,825 WARN [org.jgroups.protocols.TCP] (FD_SOCK pinger,ejb,node-1/ejb) null: logical address cache didn't contain all physical address, sending up a discovery request
> 15:51:41,825 ERROR [org.jgroups.protocols.TCP] (FD_SOCK pinger,ejb,node-1/ejb) failed sending message to cluster (69 bytes): java.lang.NullPointerException, cause: null
> 15:51:42,185 ERROR [org.jgroups.JChannel] (ServerService Thread Pool -- 19) exception in channelConnected() callback: java.lang.IllegalStateException: JBAS010272: A node named node-1 already exists in this cluster. Perhaps there is already a server running on this host? If so, restart this server with a unique node name, via -Djboss.node.name=<node-name>
> at org.jboss.as.clustering.jgroups.subsystem.ChannelService.channelConnected(ChannelService.java:105)
> at org.jgroups.Channel.notifyChannelConnected(Channel.java:495)
> at org.jgroups.JChannel.connect(JChannel.java:286)
> at org.jgroups.JChannel.connect(JChannel.java:268)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:207)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:198)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_32]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_32]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_32]
> at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_32]
> at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:883)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:654)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:643)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:546)
> at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:225)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:681)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:653)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:549)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:563)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:107)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:98)
> at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)
> at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:82)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_32]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_32]
> at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_32]
> at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.0.0.GA.jar:2.0.0.GA]
> 15:51:42,204 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 19) MSC00001: Failed to start service jboss.infinispan.ejb.remote-connector-client-mappings: org.jboss.msc.service.StartException in service jboss.infinispan.ejb.remote-connector-client-mappings: org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
> at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:87)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_32]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_32]
> at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_32]
> at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.0.0.GA.jar:2.0.0.GA]
> Caused by: org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
> at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:247)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:681)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:653)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:549)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:563)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:107)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:98)
> at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)
> at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:82)
> ... 4 more
> Caused by: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
> at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:883)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:654)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:643)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:546)
> at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:225)
> ... 12 more
> Caused by: java.lang.IllegalArgumentException: Address shall not be null
> at org.infinispan.remoting.transport.jgroups.JGroupsAddress.<init>(JGroupsAddress.java:48)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.fromJGroupsAddress(JGroupsTransport.java:686)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:227)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:198)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_32]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_32]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_32]
> at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_32]
> at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
> ... 17 more
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-2501) State transfer optimizations
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2501?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2501:
--------------------------------
Fix Version/s: 5.2.0.Final (was: 5.2.0.CR1)
> State transfer optimizations
> ----------------------------
>
> Key: ISPN-2501
> URL: https://issues.jboss.org/browse/ISPN-2501
> Project: Infinispan
> Issue Type: Enhancement
> Components: State transfer
> Affects Versions: 5.2.0.Beta3
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 5.2.0.Final
>
>
> There are two obvious optimizations possible in the code that handles installation of a new topology.
> 1. Currently a new topology is not confirmed until the node successfully sends START_STATE_TRANSFER requests to all nodes it wants to fetch segments from. This does not wait for the actual data to arrive but it still blocks quite a lot. To fix this we need to confirm the topology right after transactions were received.
> 2. Fetching transactions from other members is done serially. To improve it we could split it into multiple concurrent tasks.
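The second optimization could be sketched roughly as follows. This is only an illustration under stated assumptions, not Infinispan code: the class `ConcurrentTxFetchSketch`, the method `fetchAll`, and the `fetchFrom` function (standing in for the remote call that returns a member's transactions) are hypothetical names, members and transactions are plain strings, and the code is written for Java 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: fetch transactions from all source members
// concurrently instead of one member after another.
class ConcurrentTxFetchSketch {

    // fetchFrom is an assumed stand-in for the remote "get transactions" call.
    static Map<String, List<String>> fetchAll(List<String> sources,
                                              Function<String, List<String>> fetchFrom,
                                              int parallelism)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // Submit one task per source so the requests overlap.
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (String source : sources) {
                pending.put(source, pool.submit(() -> fetchFrom.apply(source)));
            }
            // Collect the results; get() blocks until the task has finished.
            Map<String, List<String>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> e : pending.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass the list of members it needs transactions from, for example `fetchAll(members, m -> loadTransactions(m), 4)`, where `loadTransactions` is again a placeholder. The total wait is then bounded by the slowest member rather than by the sum of all requests.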
[JBoss JIRA] (ISPN-2550) NoSuchElementException in Hot Rod Encoder
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2550?page=com.atlassian.jira.plugin.... ]
Work on ISPN-2550 started by Dan Berindei.
> NoSuchElementException in Hot Rod Encoder
> -----------------------------------------
>
> Key: ISPN-2550
> URL: https://issues.jboss.org/browse/ISPN-2550
> Project: Infinispan
> Issue Type: Bug
> Components: Remote protocols
> Affects Versions: 5.2.0.Beta4
> Reporter: Michal Linhard
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 5.2.0.CR1
>
>
> Tomas noticed this a while ago in a specific functional test:
> https://bugzilla.redhat.com/show_bug.cgi?id=875151
> I'm creating a more general JIRA because I'm also seeing this in a resilience test.
> What I found with a quick debugging session is that here:
> https://github.com/infinispan/infinispan/blob/master/server/hotrod/src/ma...
> {code}
> for (segmentIdx <- 0 until numSegments) {
>   val denormalizedSegmentHashIds = allDenormalizedHashIds(segmentIdx)
>   val segmentOwners = ch.locateOwnersForSegment(segmentIdx)
>   for (ownerIdx <- 0 until segmentOwners.length) {
>     val address = segmentOwners(ownerIdx % segmentOwners.size)
>     val serverAddress = members(address)
>     val hashId = denormalizedSegmentHashIds(ownerIdx)
>     log.tracef("Writing hash id %d for %s:%s", hashId, serverAddress.host, serverAddress.port)
>     writeString(serverAddress.host, buf)
>     writeUnsignedShort(serverAddress.port, buf)
>     buf.writeInt(hashId)
>   }
> }
> {code}
> we're trying to obtain the serverAddress for a nonexistent address, and the resulting NoSuchElementException is not handled properly.
> It happens after I kill a node in a resilience test; the exception appears when the killed node is looked up in the members cache.
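One way to picture a defensive lookup is sketched below. It is not the Hot Rod encoder and not necessarily the fix that was applied: `OwnerLookupSketch` and `resolveOwners` are hypothetical names, addresses and endpoints are plain strings, and skipping a missing owner (rather than, say, retrying with a fresh topology) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: resolve segment owners through the members map and
// skip any owner that is no longer registered there.
class OwnerLookupSketch {

    static List<String> resolveOwners(List<String> segmentOwners,
                                      Map<String, String> members) {
        List<String> endpoints = new ArrayList<>();
        for (String owner : segmentOwners) {
            // Map.get returns null for a missing key instead of throwing.
            String endpoint = members.get(owner);
            if (endpoint == null) {
                continue; // the node has left; do not fail the whole response
            }
            endpoints.add(endpoint);
        }
        return endpoints;
    }
}
```

In the Scala snippet quoted above, `members(address)` throws NoSuchElementException when the key is absent; an `Option`-returning lookup such as `members.get(address)` would allow the same kind of guard.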
[JBoss JIRA] (ISPN-2588) Lock leak during state transfer (causing StaleLocksTransactionTest to fail)
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2588?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2588:
--------------------------------
Priority: Critical (was: Major)
> Lock leak during state transfer (causing StaleLocksTransactionTest to fail)
> ---------------------------------------------------------------------------
>
> Key: ISPN-2588
> URL: https://issues.jboss.org/browse/ISPN-2588
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.0.Beta5
> Reporter: Mircea Markus
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.0.CR1
>
> Attachments: StaleLocksTransactionTest.zip
>
>
> numOwners=1, pessimistic cache (same applies if A is the only node in cluster)
> 1. tx1 running on A with writes on k, lockOwner(k) == {A}
> 2. A.tx1.lock(k), this doesn't go remotely, and control returns in the InterceptorStack
> 3. at this point B is started and lockOwner(k) == {B}
> 4. the StateTransferInterceptor forwards the command to B which acquires the lock locally
> 5. this is followed by a tx.commit/rollback that does not send the message to B, so the lock on B remains held.
> The logic that determines whether the message is sent remotely is in DistributionInterceptor.visitCommitCommand, which invokes:
> {code:java}
> protected boolean shouldInvokeRemoteTxCommand(TxInvocationContext ctx) {
>    return ctx.isOriginLocal() && (ctx.hasModifications() ||
>          !((LocalTxInvocationContext) ctx).getRemoteLocksAcquired().isEmpty());
> }
> {code}
> The problem here is that, when forwarding, we don't register the remote node as locked. I think a more generic solution would also work, e.g. if the viewId of the tx is different from the viewId of the cluster at commit time, always go remotely.
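The generic rule suggested at the end of the description might look something like the sketch below. It uses plain parameters instead of Infinispan's context types: `TxCommitDecisionSketch`, `shouldInvokeRemotely`, `txViewId`, and `currentViewId` are hypothetical names, and how the two view ids would actually be obtained is left open.

```java
// Hypothetical sketch of the proposed rule; not Infinispan's API.
class TxCommitDecisionSketch {

    static boolean shouldInvokeRemotely(boolean originLocal,
                                        boolean hasModifications,
                                        boolean hasRemoteLocks,
                                        int txViewId,
                                        int currentViewId) {
        if (!originLocal) {
            return false; // only the originator sends commit/rollback
        }
        // Proposed extra rule: if the cluster view changed after the
        // transaction started, always go remote, because a lock may have
        // been forwarded to a new owner without being recorded.
        boolean viewChanged = txViewId != currentViewId;
        return viewChanged || hasModifications || hasRemoteLocks;
    }
}
```

With this rule, the scenario in steps 1 to 5 (no modifications, no recorded remote locks, but a view change between lock and commit) would result in a remote commit or rollback, at the cost of some extra messages after every view change.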
[JBoss JIRA] (ISPN-2580) Do not request segments from all nodes at once
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2580?page=com.atlassian.jira.plugin.... ]
Dan Berindei reassigned ISPN-2580:
----------------------------------
Assignee: Adrian Nistor (was: Dan Berindei)
> Do not request segments from all nodes at once
> ----------------------------------------------
>
> Key: ISPN-2580
> URL: https://issues.jboss.org/browse/ISPN-2580
> Project: Infinispan
> Issue Type: Enhancement
> Components: State transfer
> Affects Versions: 5.2.0.Beta5
> Reporter: Radim Vansa
> Assignee: Adrian Nistor
> Priority: Critical
> Fix For: 5.2.0.CR1
>
>
> When a new node joins a large cluster filled with data, it receives the new CH and the REBALANCE_START command, and requests data from all nodes at once (or almost all, given an even distribution of segments). It may not be able to handle this many parallel transfers, even at the JGroups level: data sent to the node is discarded at the receiver and then resent again and again. Under heavy congestion the node just buffers fragments of a message from one sender and never passes the message up.
> The number of StateRequestCommands (START_STATE_TRANSFER) issued at once should be limited so that the node is not congested.
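A bounded number of in-flight requests can be sketched with a semaphore, as below. This is an illustration under stated assumptions rather than the state transfer implementation: `ThrottledStateRequestsSketch`, `requestAll`, `maxInFlight`, and the `transferFrom` callback (standing in for sending a request and receiving the segments) are hypothetical, and the code is written for Java 8 or later.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: never have more than maxInFlight transfers running.
class ThrottledStateRequestsSketch {

    // Returns the highest number of transfers observed in flight at once.
    static int requestAll(List<String> sources, int maxInFlight,
                          Consumer<String> transferFrom) throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (String source : sources) {
                permits.acquire(); // blocks while maxInFlight transfers are running
                pool.execute(() -> {
                    try {
                        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        transferFrom.accept(source); // assumed: request + receive
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release(); // lets the next request start
                    }
                });
            }
        } finally {
            pool.shutdown();
        }
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }
}
```

The remaining sources are only contacted as earlier transfers complete, so the joiner's receive path sees at most `maxInFlight` senders at a time. What a suitable limit would be is not stated in the issue.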