[JBoss JIRA] (ISPN-11282) CLI: site command isn't working properly
by Pedro Ruivo (Jira)
[ https://issues.redhat.com/browse/ISPN-11282?page=com.atlassian.jira.plugi... ]
Pedro Ruivo updated ISPN-11282:
-------------------------------
Description:
* {{site status}}: the {{--site}} option isn't working properly. It returns the status of all backup sites even if you use a non-existent site:
{noformat}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
{
"NYC" : "online"
}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=NYC
{
"NYC" : "online"
}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=ajdhds
{
"NYC" : "online"
}
{noformat}
* The {{clear-push-state-status}} operation isn't registered
* The {{bring-online}} and {{take-offline}} operations seem to fail:
{noformat}
[pedro-laptop-3-35787@cluster//containers/default]> site take-offline --cache=xsiteCache --site=NYC
Not Found
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
{
"NYC" : "offline"
}
[pedro-laptop-3-35787@cluster//containers/default]> site bring-online --cache=xsiteCache --site=NYC
Not Found
{noformat}
was:
* {{site status}}: the {{--site}} option isn't working properly. It returns the status of all backup sites even if you use a non-existent site:
{noformat}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
{
"NYC" : "online"
}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=NYC
{
"NYC" : "online"
}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=ajdhds
{
"NYC" : "online"
}
{noformat}
* The {clear-push-state-status} operation isn't registered
* The {bring-online} and {take-offline} operations seem to fail:
{noformat}
[pedro-laptop-3-35787@cluster//containers/default]> site take-offline --cache=xsiteCache --site=NYC
Not Found
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
{
"NYC" : "offline"
}
[pedro-laptop-3-35787@cluster//containers/default]> site bring-online --cache=xsiteCache --site=NYC
Not Found
{noformat}
> CLI: site command isn't working properly
> ----------------------------------------
>
> Key: ISPN-11282
> URL: https://issues.redhat.com/browse/ISPN-11282
> Project: Infinispan
> Issue Type: Bug
> Components: CLI
> Affects Versions: 10.1.1.Final
> Reporter: Pedro Ruivo
> Assignee: Pedro Ruivo
> Priority: Major
>
> * {{site status}}: the {{--site}} option isn't working properly. It returns the status of all backup sites even if you use a non-existent site:
> {noformat}
> [pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
> {
> "NYC" : "online"
> }
> [pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=NYC
> {
> "NYC" : "online"
> }
> [pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=ajdhds
> {
> "NYC" : "online"
> }
> {noformat}
> * The {{clear-push-state-status}} operation isn't registered
> * The {{bring-online}} and {{take-offline}} operations seem to fail:
> {noformat}
> [pedro-laptop-3-35787@cluster//containers/default]> site take-offline --cache=xsiteCache --site=NYC
> Not Found
> [pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
> {
> "NYC" : "offline"
> }
> [pedro-laptop-3-35787@cluster//containers/default]> site bring-online --cache=xsiteCache --site=NYC
> Not Found
> {noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11282) CLI: site command isn't working properly
by Pedro Ruivo (Jira)
Pedro Ruivo created ISPN-11282:
----------------------------------
Summary: CLI: site command isn't working properly
Key: ISPN-11282
URL: https://issues.redhat.com/browse/ISPN-11282
Project: Infinispan
Issue Type: Bug
Components: CLI
Affects Versions: 10.1.1.Final
Reporter: Pedro Ruivo
Assignee: Pedro Ruivo
* {{site status}}: the {{--site}} option isn't working properly. It returns the status of all backup sites even if you use a non-existent site:
{noformat}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
{
"NYC" : "online"
}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=NYC
{
"NYC" : "online"
}
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache --site=ajdhds
{
"NYC" : "online"
}
{noformat}
* The {clear-push-state-status} operation isn't registered
* The {bring-online} and {take-offline} operations seem to fail:
{noformat}
[pedro-laptop-3-35787@cluster//containers/default]> site take-offline --cache=xsiteCache --site=NYC
Not Found
[pedro-laptop-3-35787@cluster//containers/default]> site status --cache=xsiteCache
{
"NYC" : "offline"
}
[pedro-laptop-3-35787@cluster//containers/default]> site bring-online --cache=xsiteCache --site=NYC
Not Found
{noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11266) Split CacheTopologyControlCommand into individual commands
by Ryan Emerson (Jira)
[ https://issues.redhat.com/browse/ISPN-11266?page=com.atlassian.jira.plugi... ]
Ryan Emerson updated ISPN-11266:
--------------------------------
Status: Open (was: New)
> Split CacheTopologyControlCommand into individual commands
> ----------------------------------------------------------
>
> Key: ISPN-11266
> URL: https://issues.redhat.com/browse/ISPN-11266
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 10.1.1.Final
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Priority: Major
> Fix For: 11.0.0.Alpha1
>
>
> Currently the {{CacheTopologyControlCommand}} uses a Type field and a switch statement to differentiate between the various topology actions. This worked well with the old Externalizer approach; however, it does not fit well with protobuf messages. Instead, the {{CacheTopologyControlCommand}} should be split into individual commands, e.g. a {{TopologyJoinCommand}}.
> This separates the logic of the command types, making it easier to maintain backwards compatibility in the long term. Each command will use a ProtoStream TypeId in the range 1000 -> 3999, so the two-byte cost is the same as the existing class ID plus the enum Type required by the single {{CacheTopologyControlCommand}}.
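> To make the proposal concrete, here is a minimal, hedged sketch of what one split-out command could look like; the class name, fields and the TypeId value are hypothetical and only illustrate a ProtoStream-annotated command using an id from the reserved 1000 -> 3999 range:
> {code:java}
> // Hypothetical sketch, not the actual command hierarchy or its fields.
> import org.infinispan.protostream.annotations.ProtoFactory;
> import org.infinispan.protostream.annotations.ProtoField;
> import org.infinispan.protostream.annotations.ProtoTypeId;
> @ProtoTypeId(1000) // illustrative id from the reserved 1000 -> 3999 range
> public class TopologyJoinCommand {
>    @ProtoField(number = 1)
>    final String cacheName;
>    @ProtoField(number = 2, defaultValue = "-1")
>    final int viewId;
>    @ProtoFactory
>    TopologyJoinCommand(String cacheName, int viewId) {
>       this.cacheName = cacheName;
>       this.viewId = viewId;
>    }
> }
> {code}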
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-4996) Problem with capacityFactor=0 and restart of all nodes with capacityFactor > 0
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-4996?page=com.atlassian.jira.plugin... ]
Dan Berindei commented on ISPN-4996:
------------------------------------
[~johnou] as a workaround, you can replace
{code:java}
globalConfigurationBuilder.zeroCapacityNode(true);
{code}
or
{code:java}
builder.clustering().hash().capacityFactor(0f);
{code}
with
{code:java}
builder.clustering().hash().capacityFactor(0.00001f);
{code}
That will make the node own all the segments when there is no other node with a capacity factor >= 1, and zero segments when such a node is present.
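For reference, a minimal sketch of the workaround in context; the cache mode and builder wiring are illustrative, only the {{capacityFactor}} value matters:
{code:java}
// Sketch of the workaround above; everything except capacityFactor is illustrative.
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.clustering()
      .cacheMode(CacheMode.DIST_SYNC)
      .hash()
         // Near-zero instead of 0: the node owns (almost) nothing while a node
         // with capacity factor >= 1 is present, but it can still own segments
         // when it is the only node left in the cluster.
         .capacityFactor(0.00001f);
{code}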
> Problem with capacityFactor=0 and restart of all nodes with capacityFactor > 0
> ------------------------------------------------------------------------------
>
> Key: ISPN-4996
> URL: https://issues.redhat.com/browse/ISPN-4996
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.0.2.Final
> Reporter: Enrico Olivelli
> Assignee: Dan Berindei
> Priority: Blocker
>
> I have only one DIST_SYNC cache. Most of the JVMs in the cluster are configured with capacityFactor = 0 (like the distributedlocalstorage=false property of Coherence) and some nodes are configured with capacityFactor > 0 (for instance 1000). We are talking about 100 nodes with capacityFactor = 0 and 4 nodes of the other kind; the whole cluster is inside one single "site/rack". Partition handling is off, numOwners is 1.
> When all the nodes with capacityFactor > 0 are down, the cluster enters a degraded state.
> The problem is that even when the nodes with capacityFactor > 0 come back up, the cluster does not recover; a full restart is needed.
> If I enable partition handling, AvailabilityExceptions start to be thrown, which I think is the expected behaviour (see the "Infinispan User Guide").
>
> I think this is the problem and it is a bug:
>
> {noformat}
> 14/11/17 09:27:25 WARN topology.CacheTopologyControlCommand: ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=shared, type=JOIN, sender=testserver1@xxxxxxx-22311, site-id=xxx, rack-id=xxx, machine-id=24 bytes, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.TopologyAwareConsistentHashFactory@78b791ef, hashFunction=MurmurHash3, numSegments=60, numOwners=1, timeout=120000, totalOrder=false, distributed=true}, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, throwable=null, viewId=3}
> java.lang.IllegalArgumentException: A cache topology's pending consistent hash must contain all the current consistent hash's members
> at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:48)
> at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:43)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:631)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:85)
> at org.infinispan.partionhandling.impl.PreferAvailabilityStrategy.onJoin(PreferAvailabilityStrategy.java:22)
> at org.infinispan.topology.ClusterCacheStatus.doJoin(ClusterCacheStatus.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:158)
> at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:140)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:278)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> After that error every "put" results in:
> {noformat}
> 14/11/17 09:27:27 ERROR interceptors.InvocationContextInterceptor: ISPN000136: Execution error
> org.infinispan.util.concurrent.TimeoutException: Timed out waiting for topology 1
> at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:93)
> at org.infinispan.interceptors.base.BaseStateTransferInterceptor.waitForTransactionData(BaseStateTransferInterceptor.java:96)
> at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:188)
> at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:95)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148)
> at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
> at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1576)
> at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1054)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1046)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1646)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:245)
> {noformat}
>
> This is the actual configuration:
>
> {code:java}
> GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
> .globalJmxStatistics()
> .allowDuplicateDomains(true)
> .cacheManagerName(instanceName)
> .transport()
> .defaultTransport()
> .clusterName(clustername)
> .addProperty("configurationFile", configurationFile) // UDP for my cluster, approx. 100 machines
> .machineId(instanceName)
> .siteId("site1")
> .rackId("rack1")
> .nodeName(serviceName + "@" + instanceName)
> .remoteCommandThreadPool().threadPoolFactory(CachedThreadPoolExecutorFactory.create())
> .build();
> Configuration wildcard = new ConfigurationBuilder()
> .locking().lockAcquisitionTimeout(lockAcquisitionTimeout)
> .concurrencyLevel(10000).isolationLevel(IsolationLevel.READ_COMMITTED).useLockStriping(true)
> .clustering()
> .cacheMode(CacheMode.DIST_SYNC)
> .l1().lifespan(l1ttl)
> .hash().numOwners(numOwners).capacityFactor(capacityFactor)
> .partitionHandling().enabled(false)
> .stateTransfer().awaitInitialTransfer(false).timeout(initialTransferTimeout).fetchInMemoryState(false)
> .storeAsBinary().enabled(true).storeKeysAsBinary(false).storeValuesAsBinary(true)
> .jmxStatistics().enable()
> .unsafe().unreliableReturnValues(true)
> .build();
> {code}
> One workaround is to set capacityFactor = 1 instead of 0, but I do not want "simple-nodes" (with less RAM) to become key owners.
> For me this is a showstopper problem.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11227) Cluster fails to startup due to initial state transfer timing out
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11227?page=com.atlassian.jira.plugi... ]
Dan Berindei closed ISPN-11227.
-------------------------------
Resolution: Duplicate Issue
> Cluster fails to startup due to initial state transfer timing out
> -----------------------------------------------------------------
>
> Key: ISPN-11227
> URL: https://issues.redhat.com/browse/ISPN-11227
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.1.1.Final
> Reporter: Johno Crawford
> Priority: Major
> Attachments: ISPN11227.zip
>
>
> If a zero-capacity node is part of a running cluster and all the other nodes are restarted, the nodes hang on startup.
> {code:java}
> "ForkJoinPool.commonPool-worker-2@11514" daemon prio=5 tid=0xa3 nid=NA waiting
> java.lang.Thread.State: WAITING
> at sun.misc.Unsafe.park(Unsafe.java:-1)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:270)
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1091)
> at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:513)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:693)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:632)
> at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:517)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:498)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:491)
> {code}
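> For context, a hedged sketch of how a zero-capacity member is typically configured (the variable names are illustrative; see ISPN-4996 for the related capacity-factor discussion):
> {code:java}
> // Sketch: starting a cache manager as a zero-capacity member.
> import org.infinispan.configuration.global.GlobalConfiguration;
> import org.infinispan.configuration.global.GlobalConfigurationBuilder;
> import org.infinispan.manager.DefaultCacheManager;
> GlobalConfiguration global = new GlobalConfigurationBuilder()
>       .zeroCapacityNode(true) // this member should own no data segments
>       .build();
> DefaultCacheManager cacheManager = new DefaultCacheManager(global);
> {code}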
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-4996) Problem with capacityFactor=0 and restart of all nodes with capacityFactor > 0
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-4996?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-4996:
-------------------------------
Description:
I have only one DIST_SYNC cache. Most of the JVMs in the cluster are configured with capacityFactor = 0 (like the distributedlocalstorage=false property of Coherence) and some nodes are configured with capacityFactor > 0 (for instance 1000). We are talking about 100 nodes with capacityFactor = 0 and 4 nodes of the other kind; the whole cluster is inside one single "site/rack". Partition handling is off, numOwners is 1.
When all the nodes with capacityFactor > 0 are down, the cluster enters a degraded state.
The problem is that even when the nodes with capacityFactor > 0 come back up, the cluster does not recover; a full restart is needed.
If I enable partition handling, AvailabilityExceptions start to be thrown, which I think is the expected behaviour (see the "Infinispan User Guide").
I think this is the problem and it is a bug:
{noformat}
14/11/17 09:27:25 WARN topology.CacheTopologyControlCommand: ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=shared, type=JOIN, sender=testserver1@xxxxxxx-22311, site-id=xxx, rack-id=xxx, machine-id=24 bytes, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.TopologyAwareConsistentHashFactory@78b791ef, hashFunction=MurmurHash3, numSegments=60, numOwners=1, timeout=120000, totalOrder=false, distributed=true}, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, throwable=null, viewId=3}
java.lang.IllegalArgumentException: A cache topology's pending consistent hash must contain all the current consistent hash's members
at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:48)
at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:43)
at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:631)
at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:85)
at org.infinispan.partionhandling.impl.PreferAvailabilityStrategy.onJoin(PreferAvailabilityStrategy.java:22)
at org.infinispan.topology.ClusterCacheStatus.doJoin(ClusterCacheStatus.java:540)
at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:158)
at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:140)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:278)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
After that error every "put" results in:
{noformat}
14/11/17 09:27:27 ERROR interceptors.InvocationContextInterceptor: ISPN000136: Execution error
org.infinispan.util.concurrent.TimeoutException: Timed out waiting for topology 1
at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:93)
at org.infinispan.interceptors.base.BaseStateTransferInterceptor.waitForTransactionData(BaseStateTransferInterceptor.java:96)
at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:188)
at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:95)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148)
at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102)
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1576)
at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1054)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1046)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1646)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:245)
{noformat}
This is the actual configuration:
{code:java}
GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
.globalJmxStatistics()
.allowDuplicateDomains(true)
.cacheManagerName(instanceName)
.transport()
.defaultTransport()
.clusterName(clustername)
.addProperty("configurationFile", configurationFile) // UDP for my cluster, approx. 100 machines
.machineId(instanceName)
.siteId("site1")
.rackId("rack1")
.nodeName(serviceName + "@" + instanceName)
.remoteCommandThreadPool().threadPoolFactory(CachedThreadPoolExecutorFactory.create())
.build();
Configuration wildcard = new ConfigurationBuilder()
.locking().lockAcquisitionTimeout(lockAcquisitionTimeout)
.concurrencyLevel(10000).isolationLevel(IsolationLevel.READ_COMMITTED).useLockStriping(true)
.clustering()
.cacheMode(CacheMode.DIST_SYNC)
.l1().lifespan(l1ttl)
.hash().numOwners(numOwners).capacityFactor(capacityFactor)
.partitionHandling().enabled(false)
.stateTransfer().awaitInitialTransfer(false).timeout(initialTransferTimeout).fetchInMemoryState(false)
.storeAsBinary().enabled(true).storeKeysAsBinary(false).storeValuesAsBinary(true)
.jmxStatistics().enable()
.unsafe().unreliableReturnValues(true)
.build();
{code}
One workaround is to set capacityFactor = 1 instead of 0, but I do not want "simple-nodes" (with less RAM) to become key owners.
For me this is a showstopper problem.
was:
I have only one DIST_SYNC cache. Most of the JVMs in the cluster are configured with capacityFactor = 0 (like the distributedlocalstorage=false property of Coherence) and some nodes are configured with capacityFactor > 0 (for instance 1000). We are talking about 100 nodes with capacityFactor = 0 and 4 nodes of the other kind; the whole cluster is inside one single "site/rack". Partition handling is off, numOwners is 1.
When all the nodes with capacityFactor > 0 are down, the cluster enters a degraded state.
The problem is that even when the nodes with capacityFactor > 0 come back up, the cluster does not recover; a full restart is needed.
If I enable partition handling, AvailabilityExceptions start to be thrown, which I think is the expected behaviour (see the "Infinispan User Guide").
I think this is the problem and it is a bug:
14/11/17 09:27:25 WARN topology.CacheTopologyControlCommand: ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=shared, type=JOIN, sender=testserver1@xxxxxxx-22311, site-id=xxx, rack-id=xxx, machine-id=24 bytes, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.TopologyAwareConsistentHashFactory@78b791ef, hashFunction=MurmurHash3, numSegments=60, numOwners=1, timeout=120000, totalOrder=false, distributed=true}, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, throwable=null, viewId=3}
java.lang.IllegalArgumentException: A cache topology's pending consistent hash must contain all the current consistent hash's members
at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:48)
at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:43)
at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:631)
at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:85)
at org.infinispan.partionhandling.impl.PreferAvailabilityStrategy.onJoin(PreferAvailabilityStrategy.java:22)
at org.infinispan.topology.ClusterCacheStatus.doJoin(ClusterCacheStatus.java:540)
at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:158)
at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:140)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:278)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
After that error every "put" results in:
14/11/17 09:27:27 ERROR interceptors.InvocationContextInterceptor: ISPN000136: Execution error
org.infinispan.util.concurrent.TimeoutException: Timed out waiting for topology 1
at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:93)
at org.infinispan.interceptors.base.BaseStateTransferInterceptor.waitForTransactionData(BaseStateTransferInterceptor.java:96)
at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:188)
at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:95)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148)
at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102)
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1576)
at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1054)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1046)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1646)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:245)
This is the actual configuration:
GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
.globalJmxStatistics()
.allowDuplicateDomains(true)
.cacheManagerName(instanceName)
.transport()
.defaultTransport()
.clusterName(clustername)
.addProperty("configurationFile", configurationFile) // UDP for my cluster, approx. 100 machines
.machineId(instanceName)
.siteId("site1")
.rackId("rack1")
.nodeName(serviceName + "@" + instanceName)
.remoteCommandThreadPool().threadPoolFactory(CachedThreadPoolExecutorFactory.create())
.build();
Configuration wildcard = new ConfigurationBuilder()
.locking().lockAcquisitionTimeout(lockAcquisitionTimeout)
.concurrencyLevel(10000).isolationLevel(IsolationLevel.READ_COMMITTED).useLockStriping(true)
.clustering()
.cacheMode(CacheMode.DIST_SYNC)
.l1().lifespan(l1ttl)
.hash().numOwners(numOwners).capacityFactor(capacityFactor)
.partitionHandling().enabled(false)
.stateTransfer().awaitInitialTransfer(false).timeout(initialTransferTimeout).fetchInMemoryState(false)
.storeAsBinary().enabled(true).storeKeysAsBinary(false).storeValuesAsBinary(true)
.jmxStatistics().enable()
.unsafe().unreliableReturnValues(true)
.build();
One workaround is to set capacityFactor = 1 instead of 0, but I do not want "simple-nodes" (with less RAM) to become key owners.
For me this is a showstopper problem.
> Problem with capacityFactor=0 and restart of all nodes with capacityFactor > 0
> ------------------------------------------------------------------------------
>
> Key: ISPN-4996
> URL: https://issues.redhat.com/browse/ISPN-4996
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.0.2.Final
> Reporter: Enrico Olivelli
> Assignee: Dan Berindei
> Priority: Blocker
>
> I have only one DIST_SYNC cache. Most of the JVMs in the cluster are configured with capacityFactor = 0 (like the distributedlocalstorage=false property of Coherence) and some nodes are configured with capacityFactor > 0 (for instance 1000). We are talking about 100 nodes with capacityFactor = 0 and 4 nodes of the other kind; the whole cluster is inside one single "site/rack". Partition handling is off, numOwners is 1.
> When all the nodes with capacityFactor > 0 are down, the cluster enters a degraded state.
> The problem is that even when the nodes with capacityFactor > 0 come back up, the cluster does not recover; a full restart is needed.
> If I enable partition handling, AvailabilityExceptions start to be thrown, which I think is the expected behaviour (see the "Infinispan User Guide").
>
> I think this is the problem and it is a bug:
>
> {noformat}
> 14/11/17 09:27:25 WARN topology.CacheTopologyControlCommand: ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=shared, type=JOIN, sender=testserver1@xxxxxxx-22311, site-id=xxx, rack-id=xxx, machine-id=24 bytes, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.TopologyAwareConsistentHashFactory@78b791ef, hashFunction=MurmurHash3, numSegments=60, numOwners=1, timeout=120000, totalOrder=false, distributed=true}, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, throwable=null, viewId=3}
> java.lang.IllegalArgumentException: A cache topology's pending consistent hash must contain all the current consistent hash's members
> at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:48)
> at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:43)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:631)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:85)
> at org.infinispan.partionhandling.impl.PreferAvailabilityStrategy.onJoin(PreferAvailabilityStrategy.java:22)
> at org.infinispan.topology.ClusterCacheStatus.doJoin(ClusterCacheStatus.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:158)
> at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:140)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:278)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> After that error every "put" results in:
> {noformat}
> 14/11/17 09:27:27 ERROR interceptors.InvocationContextInterceptor: ISPN000136: Execution error
> org.infinispan.util.concurrent.TimeoutException: Timed out waiting for topology 1
> at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:93)
> at org.infinispan.interceptors.base.BaseStateTransferInterceptor.waitForTransactionData(BaseStateTransferInterceptor.java:96)
> at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:188)
> at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:95)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148)
> at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
> at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1576)
> at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1054)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1046)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1646)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:245)
> {noformat}
>
> This is the actual configuration:
>
> {code:java}
> GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
> .globalJmxStatistics()
> .allowDuplicateDomains(true)
> .cacheManagerName(instanceName)
> .transport()
> .defaultTransport()
> .clusterName(clustername)
> .addProperty("configurationFile", configurationFile) // UDP for my cluster, approx. 100 machines
> .machineId(instanceName)
> .siteId("site1")
> .rackId("rack1")
> .nodeName(serviceName + "@" + instanceName)
> .remoteCommandThreadPool().threadPoolFactory(CachedThreadPoolExecutorFactory.create())
> .build();
> Configuration wildcard = new ConfigurationBuilder()
> .locking().lockAcquisitionTimeout(lockAcquisitionTimeout)
> .concurrencyLevel(10000).isolationLevel(IsolationLevel.READ_COMMITTED).useLockStriping(true)
> .clustering()
> .cacheMode(CacheMode.DIST_SYNC)
> .l1().lifespan(l1ttl)
> .hash().numOwners(numOwners).capacityFactor(capacityFactor)
> .partitionHandling().enabled(false)
> .stateTransfer().awaitInitialTransfer(false).timeout(initialTransferTimeout).fetchInMemoryState(false)
> .storeAsBinary().enabled(true).storeKeysAsBinary(false).storeValuesAsBinary(true)
> .jmxStatistics().enable()
> .unsafe().unreliableReturnValues(true)
> .build();
> {code}
> One workaround is to set capacityFactor = 1 instead of 0, but I do not want "simple-nodes" (with less RAM) to become key owners.
> For me this is a showstopper problem.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11231) Transcoder lookup is inefficient
by Ryan Emerson (Jira)
[ https://issues.redhat.com/browse/ISPN-11231?page=com.atlassian.jira.plugi... ]
Ryan Emerson resolved ISPN-11231.
---------------------------------
Fix Version/s: 10.1.2.Final
Resolution: Done
> Transcoder lookup is inefficient
> --------------------------------
>
> Key: ISPN-11231
> URL: https://issues.redhat.com/browse/ISPN-11231
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.17.Final, 10.1.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 9.4.18.Final, 10.1.2.Final, 11.0.0.Alpha1
>
>
> {{EncoderRegistryImpl}} stores transcoders in a set, without any lookup optimization because it assumes the number of transcoders is small.
> However, {{InternalCacheFactory.bootstrap()}} registers a "different" {{TranscoderMarshallerAdapter}} for each cache, even though they all delegate to the same global marshaller.
> I believe the global marshaller transcoder adapter should be registered only once in {{EncoderRegistryFactory}}, and ideally we should have a way of looking up transcoders by media types.
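> A possible direction, sketched only as an illustration: index transcoders by the (from, to) media-type pair so that a lookup is a map get rather than a scan. The class and method names below are hypothetical and not the actual {{EncoderRegistryImpl}} API:
> {code:java}
> // Hypothetical media-type-pair index; a sketch of the idea, not the real registry.
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import org.infinispan.commons.dataconversion.MediaType;
> import org.infinispan.commons.dataconversion.Transcoder;
> final class TranscoderIndex {
>    private final Map<String, Transcoder> byMediaTypes = new ConcurrentHashMap<>();
>    void register(Transcoder transcoder) {
>       // Index every supported (from, to) combination; the first registration wins.
>       for (MediaType from : transcoder.getSupportedMediaTypes()) {
>          for (MediaType to : transcoder.getSupportedMediaTypes()) {
>             byMediaTypes.putIfAbsent(key(from, to), transcoder);
>          }
>       }
>    }
>    Transcoder lookup(MediaType from, MediaType to) {
>       return byMediaTypes.get(key(from, to));
>    }
>    private static String key(MediaType from, MediaType to) {
>       return from + "->" + to;
>    }
> }
> {code}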
--
This message was sent by Atlassian Jira
(v7.13.8#713008)