[JBoss JIRA] (ISPN-4879) Log a clear error message when an incompatible node joins the cluster
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-4879?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4879:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> Log a clear error message when an incompatible node joins the cluster
> ---------------------------------------------------------------------
>
> Key: ISPN-4879
> URL: https://issues.jboss.org/browse/ISPN-4879
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 7.0.0.CR2
> Reporter: Dan Berindei
> Priority: Major
> Fix For: 9.4.5.Final
>
>
> We don't check the Infinispan version when a node joins the cluster. If the node has an incompatible version, it will most likely fail to join, but the error message is not at all straightforward. As an example:
> {noformat}
> Exception in thread "main" org.infinispan.commons.CacheException: Unable
> to invoke method public void
> org.infinispan.statetransfer.StateTransferManagerImpl.start() throws
> java.lang.Exception on object of type StateTransferManagerImpl
> at
> org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:170)
> at
> org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
> at
> org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
> at
> org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
> at
> org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
> at
> org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:216)
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:764)
> at
> org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:584)
> at
> org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:539)
> at
> org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:416)
> at ch.nexustelecom.lbd.engine.ImsiCache.init(ImsiCache.java:49)
> at
> ch.nexustelecom.dexclient.engine.DefaultDexClientEngine.init(DefaultDexClientEngine.java:120)
> at ch.nexustelecom.dexclient.DexClient.initClient(DexClient.java:169)
> at
> ch.nexustelecom.dexclient.tool.DexClientManager.startup(DexClientManager.java:196)
> at
> ch.nexustelecom.dexclient.tool.DexClientManager.main(DexClientManager.java:83)
> Caused by: org.infinispan.commons.CacheException:
> java.lang.ClassNotFoundException:
> org.infinispan.partionhandling.impl.AvailabilityMode
> at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
> at
> org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
> at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
> at
> org.infinispan.topology.LocalTopologyManagerImpl.executeOnCoordinator(LocalTopologyManagerImpl.java:388)
> at
> org.infinispan.topology.LocalTopologyManagerImpl.join(LocalTopologyManagerImpl.java:102)
> at
> org.infinispan.statetransfer.StateTransferManagerImpl.start(StateTransferManagerImpl.java:108)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at
> org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> ... 14 more
> Caused by: java.lang.ClassNotFoundException:
> org.infinispan.partionhandling.impl.AvailabilityMode
> at java.net.URLClassLoader$1.run(Unknown Source)
> at java.net.URLClassLoader$1.run(Unknown Source)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(Unknown Source)
> at java.lang.ClassLoader.loadClass(Unknown Source)
> at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
> at java.lang.ClassLoader.loadClass(Unknown Source)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Unknown Source)
> at
> org.jboss.marshalling.AbstractClassResolver.loadClass(AbstractClassResolver.java:131)
> at
> org.jboss.marshalling.AbstractClassResolver.resolveClass(AbstractClassResolver.java:112)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadClassDescriptor(RiverUnmarshaller.java:1002)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadNewObject(RiverUnmarshaller.java:1239)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:272)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
> at
> org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:41)
> at
> org.infinispan.topology.CacheStatusResponse$Externalizer.readObject(CacheStatusResponse.java:76)
> at
> org.infinispan.topology.CacheStatusResponse$Externalizer.readObject(CacheStatusResponse.java:62)
> at
> org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.readObject(ExternalizerTable.java:424)
> at
> org.infinispan.marshall.core.ExternalizerTable.readObject(ExternalizerTable.java:221)
> at
> org.infinispan.marshall.core.JBossMarshaller$ExternalizerTableProxy.readObject(JBossMarshaller.java:148)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:351)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
> at
> org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:41)
> at
> org.infinispan.remoting.responses.SuccessfulResponse$Externalizer.readObject(SuccessfulResponse.java:79)
> at
> org.infinispan.remoting.responses.SuccessfulResponse$Externalizer.readObject(SuccessfulResponse.java:64)
> at
> org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.readObject(ExternalizerTable.java:424)
> at
> org.infinispan.marshall.core.ExternalizerTable.readObject(ExternalizerTable.java:221)
> at
> org.infinispan.marshall.core.JBossMarshaller$ExternalizerTableProxy.readObject(JBossMarshaller.java:148)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:351)
> at
> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
> at
> org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:41)
> at
> org.infinispan.commons.marshall.jboss.AbstractJBossMarshaller.objectFromObjectStream(AbstractJBossMarshaller.java:135)
> at
> org.infinispan.marshall.core.VersionAwareMarshaller.objectFromByteBuffer(VersionAwareMarshaller.java:101)
> at
> org.infinispan.commons.marshall.AbstractDelegatingMarshaller.objectFromByteBuffer(AbstractDelegatingMarshaller.java:80)
> at
> org.infinispan.remoting.transport.jgroups.MarshallerAdapter.objectFromBuffer(MarshallerAdapter.java:28)
> at
> org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:250)
> at
> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:674)
> at org.jgroups.JChannel.up(JChannel.java:733)
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
> at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:146)
> at org.jgroups.protocols.RSVP.up(RSVP.java:190)
> at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:379)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:1042)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234)
> at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1034)
> at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:752)
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:399)
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:610)
> at org.jgroups.protocols.BARRIER.up(BARRIER.java:152)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:155)
> at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:200)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:297)
> at org.jgroups.protocols.MERGE3.up(MERGE3.java:288)
> at org.jgroups.protocols.Discovery.up(Discovery.java:277)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1568)
> at org.jgroups.protocols.TP$MyHandler.run(TP.java:1787)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> {noformat}
> Optionally, we could allow the user to configure an "application version" and prevent nodes with different application versions from joining the same cluster.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-4846) State transfer keeps trying to fetch transaction data after the cache was stopped
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-4846?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4846:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> State transfer keeps trying to fetch transaction data after the cache was stopped
> ---------------------------------------------------------------------------------
>
> Key: ISPN-4846
> URL: https://issues.jboss.org/browse/ISPN-4846
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.0.CR1
> Reporter: Dan Berindei
> Priority: Major
> Fix For: 9.4.5.Final
>
>
> StateConsumerImpl doesn't check if the cache is stopped while fetching transaction data, it only stops when it's no longer able to find providers for transactions.
> However, JGroupsTransport throws a generic CacheException when the channel is stopped. The state transfer thread can enter a busy-wait loop, retrying to get the transaction data and immediately getting the CacheException, filling the log with messages like this:
> {noformat}
> 19:32:28,237 WARN (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209: Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55, 54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416
> org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected
> at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290)
> at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766)
> at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685)
> at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
> {noformat}
> We should check is the cache is stopped before retrying in StateConsumerImpl.requestTransactions. I also think we should change the stop order - it would make sense to stop the remote executor threads and the RpcDispatcher before we stop the channel.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-4817) Cluster Listener replication of listener to remote dist nodes needs privilege action
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-4817?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4817:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> Cluster Listener replication of listener to remote dist nodes needs privilege action
> ------------------------------------------------------------------------------------
>
> Key: ISPN-4817
> URL: https://issues.jboss.org/browse/ISPN-4817
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 7.0.0.CR1
> Reporter: William Burns
> Priority: Critical
> Fix For: 9.4.5.Final
>
>
> The ClusterListenerReplicateCallable needs to add the listener in a privileged block so that it can work properly.
> I was also thinking if we need to change all of our messages to send along the current Subject so it can be properly applied to remote invocations as well. In this case we wouldn't need to wrap the listener invocation.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-4624) Allow custom partition handling strategy
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-4624?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4624:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> Allow custom partition handling strategy
> ----------------------------------------
>
> Key: ISPN-4624
> URL: https://issues.jboss.org/browse/ISPN-4624
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 7.0.0.Beta1
> Reporter: Dan Berindei
> Priority: Major
> Labels: partition_handling
> Fix For: 9.4.5.Final
>
>
> Users should be able to configure a custom PartitionHandlingStrategy. It should be able to specify a behaviour on merge as well.
> We might want to merge this with the RebalancePolicy, which was also supposed to be configurable.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-5515) Purge store if there is another node already running
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-5515?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-5515:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> Purge store if there is another node already running
> ----------------------------------------------------
>
> Key: ISPN-5515
> URL: https://issues.jboss.org/browse/ISPN-5515
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core, Loaders and Stores
> Affects Versions: 7.2.2.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 9.4.5.Final
>
>
> Preloading happens before communicating with other nodes that might already have the cache running. When joining the existing members, the cache then waits to receive the first CH in which it is a member, and then deletes only the entries in the segments that it doesn't own in that CH.
> The intention of this was to remove as little as possible from the existing data, e.g. if the first node to start up is not the one that was stopped last. But the preloaded entries are not replicated to the other nodes, so this can lead to inconsistencies.
> It would be better to delay preloading until we know we are the first node to start up, but failing that we could clear the data container and the store before receiving the initial state.
> Note that this will only allow preloading data from one node. Restoring data from more nodes is harder to do, and we will implement it as part of graceful restart.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-5513) State Transfer can miss entries that are concurrently activated
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-5513?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-5513:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> State Transfer can miss entries that are concurrently activated
> ---------------------------------------------------------------
>
> Key: ISPN-5513
> URL: https://issues.jboss.org/browse/ISPN-5513
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 8.0.0.Alpha1
> Reporter: William Burns
> Priority: Major
> Fix For: 9.4.5.Final
>
>
> Currently the OutboundTransferTask iterates upon the data container and then runs process for the state loader. However if an entry is activated during or after the data container iteration it is possible this entry is then not seen and subsequently is not present in the store when it is processed.
> EntryRetriever had this same issue and it was required to register a cache listener to listen for activations and then replay the data after finishing with the store.
> This can cause duplicate values as well, however replacing the same exact value is fine and if a non ST write occurs the state is ignored anyways.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-5510) Provide better Hot Rod client socket timeout and retry defaults
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-5510?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-5510:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> Provide better Hot Rod client socket timeout and retry defaults
> ---------------------------------------------------------------
>
> Key: ISPN-5510
> URL: https://issues.jboss.org/browse/ISPN-5510
> Project: Infinispan
> Issue Type: Enhancement
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Priority: Major
> Fix For: 9.4.5.Final
>
>
> The current defaults are:
> * Socket timeout = 60 seconds
> * Max retries = 10
> As a result of these defaults, if the server hangs an operation, it'd take 10 minutes (60 second timeout x 10 retries) for the operation to finally return an exception to the client, which is way too much.
> So, these default value should change to be more aggressive: 30 second socket timeout and 3 max retries.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-5506) Dist/ReplEmbeddedRestHotRodTest failing randomly with "java.net.SocketException: Socket closed"
by Galder Zamarreño (Jira)
[ https://issues.jboss.org/browse/ISPN-5506?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-5506:
-----------------------------------
Fix Version/s: 9.4.5.Final
(was: 9.4.4.Final)
> Dist/ReplEmbeddedRestHotRodTest failing randomly with "java.net.SocketException: Socket closed"
> -----------------------------------------------------------------------------------------------
>
> Key: ISPN-5506
> URL: https://issues.jboss.org/browse/ISPN-5506
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Protocols
> Affects Versions: 7.2.1.Final, 8.0.0.Alpha1
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Priority: Major
> Fix For: 9.4.5.Final
>
> Attachments: infinispan_5506.tgz
>
>
> {code}
> testEmbeddedPutRestHotRodGet(org.infinispan.it.compatibility.DistEmbeddedRestHotRodTest) Time elapsed: 0.002 sec <<< FAILURE!
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at java.net.Socket.connect(Socket.java:538)
> at java.net.Socket.<init>(Socket.java:434)
> at java.net.Socket.<init>(Socket.java:286)
> at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
> at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
> at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
> at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
> at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
> at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
> at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
> at org.infinispan.it.compatibility.ReplEmbeddedRestHotRodTest.testEmbeddedPutRestHotRodGet(ReplEmbeddedRestHotRodTest.java:80)
> {code}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months