August 2017 - infinispan-issues - Jboss List Archives

[JBoss JIRA] (ISPN-8182) Asynchronous commands should be retried if topology is outdated

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-8182?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño commented on ISPN-8182:
----------------------------------------

A couple of IRC discussions we've had so far:

{code}
[15:52:15] > pruivo: dberindei: it'd be interesting to hear your thoughts about ISPN-8182
[15:53:47] <dberindei> galderz: I don't think it's feasible, the originator forgets the command immediately after sending it to the owners
[15:54:29] <dberindei> galderz: OTOH I don't think the remote nodes should throw an OutdatedTopologyException if the command is async
[15:54:50] <pruivo> dberindei, galderz well, IMO I don't think we should do it. If you want to apply and update async, use the putAsync()
[15:55:29] > rvansa: FYI ^
[15:55:55] <dberindei> pruivo: but you agree that in DIST_ASYNC, remote nodes throwing OTE is a bug, right?
[15:56:05] > dberindei: we already supply custom interceptors for HB 2L, so we could try to do that: not bothering about outdated topologies
[15:57:14] > pruivo: i guess you mean that putAsync() would retry in case of outdated topology?
[15:57:16] <pruivo> dberindei, more or less. I think it should check if it is an owner or not. throwing the exception definitely isn't needed
[15:57:42] <rvansa> dberindei: I think that the topology check in DI does not care if the cache is async
[15:57:42] <pruivo> galderz, yes, if I'm not mistaken, it is a sync put in a separate threads (with the benefits of sync mode)
[15:57:51] <rvansa> dberindei: if it does not match, it simply throws
[15:58:00] <dberindei> rvansa: ok, that's a bug
[15:58:39] <rvansa> dberindei: it shouldn't ignore it either, IMO... everything below STI should be executed in the same topology, IMO
[15:59:37] > dberindei: what's the bug?
[16:00:32] <rvansa> dberindei: I think that the DI should rather send the command to proper primary owner in the recent topology
[16:00:55] <rvansa> dberindei: if this node is meant as primary
{code}

And:

{code}
<rvansa> galderz: I think that you shouldn't need to catch - the OTE does not have to be thrown at all
> rvansa: right, assuming we provide our own interceptor for timestamp and query, we can just simply ignore any topology checks...
<rvansa> galderz: not only for our own interceptor, it's a general infinispan issue
> rvansa: both for REPL and DIST?
<rvansa> galderz: yesterday before the meeting I've suggested that async commands should just execute if these are still on the owner
<rvansa> galderz: yes
<rvansa> galderz: in async mode, after you call cache.put(k, v2), all owners should eventually contain v2
<rvansa> galderz: unless 'error' happens
<rvansa> galderz: topology change (node joining) is not an error
> rvansa: makes sense
<rvansa> galderz: actually, throwing and retrying locally might be needed - I would prefer to fix topology for a given command below STI and retry if it changes in any place we need to consider it
<rvansa> galderz: that's a cozy invariant; not sure if it's really needed here
<rvansa> galderz: anyway, regrettably we don't have any plan so far how to make the 'eventually' happen
> rvansa: but we need something better than what we have now...
<rvansa> galderz: because if a new owner pops up and fetches data from node that did not get the update yet, its version would be stale
> dberindei: pruivo: we were interrupted yday discussing ISPN-8182
<jbossbot> jira [ISPN-8182] Asynchronous commands should be retried if topology is outdated [New (Unresolved) Enhancement, Major, Core, Unassigned] https://issues.jboss.org/browse/ISPN-8182
<rvansa> galderz: quick fix would be just not throwing
<rvansa> galderz: + a set of stress tests that will try out this with all combos of primary/backup/non-owner transitions to see if anything goes wrong
<rvansa> 'wrong' meaning NPEs and such, stale data should be expected in thos
<rvansa> those
*** First activity: dberindei joined 33 minutes 16 seconds ago.
<dberindei> galderz rvansa: indeed, if a new node joins OR a node leaves, some keys will have new owners, and those owners may or may not receive the updated value
> rvansa: if stale data is expected, then we're in the same scenario as now really
<dberindei> galderz: the fact that we currently check the topology id and throw an exception means the update will can be missed on owners that aren't new
> what do you mean by "aren't new"?
<dberindei> galderz: say in topology 1 k is owned by AB, and in topology 2 it's owned by CB
<dberindei> galderz: C would be a new owner, B would be "non-new" :)
> dberindei: got it
> dberindei: so, should remote nodes not throw that exception for async puts? or should still be thrown and then retried?
> dberindei: we're assuming we'd change core for this
<dberindei> galderz: throwing and catching would be nice because the only change would be in StateTransferInterceptor (I think)
> dberindei: ok
> dberindei: do we have any stress tests where we could add tests for seeing that it all works fine for repl async puts?
{code}

> Asynchronous commands should be retried if topology is outdated
> ---------------------------------------------------------------
>
> Key: ISPN-8182
> URL: https://issues.jboss.org/browse/ISPN-8182
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 9.1.0.Final
> Reporter: Galder Zamarreño
>
> If an asynchronous command fails at a remote node, it should be retried.
> I'm not sure how feasible this really is. One possible solution could be this: having NACK style implementation where by default the originator assumes an asynchronous command has been executed, but if the receiver tells it that the topology is outdated, the originator retries?
> This is related to ISPN-8027 where we've discovered that some updates are not applied when asynchronous commands to update the Hibernate 2L timestamp cache fail as a result of an outdated topology.

--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
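The NACK-style idea in the issue description can be sketched in plain Java. This is only an illustration under stated assumptions: `AsyncRetrySketch`, `Receiver`, `StaleTopologyException` and `sendWithRetry` are hypothetical names invented here, not Infinispan classes, and the real `OutdatedTopologyException` handling discussed on IRC may look quite different. The receiver rejects a write tagged with an older topology id, and the originator re-tags the command and sends it again.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Infinispan classes. */
final class AsyncRetrySketch {

    /** Stand-in for the rejection a receiver raises on a topology mismatch (the "NACK"). */
    static final class StaleTopologyException extends RuntimeException {
        final int currentTopologyId;

        StaleTopologyException(int currentTopologyId) {
            super("command topology is older than " + currentTopologyId);
            this.currentTopologyId = currentTopologyId;
        }
    }

    /** A receiver that only applies writes tagged with an up-to-date topology id. */
    static final class Receiver {
        final int topologyId;
        final List<String> applied = new ArrayList<>();

        Receiver(int topologyId) {
            this.topologyId = topologyId;
        }

        void apply(int commandTopologyId, String value) {
            if (commandTopologyId < topologyId) {
                throw new StaleTopologyException(topologyId);
            }
            applied.add(value);
        }
    }

    /**
     * Sends the value; when the receiver rejects it, re-tags the command with
     * the topology id the receiver reported and tries again.
     * Returns the number of attempts that were needed.
     */
    static int sendWithRetry(Receiver receiver, int topologyId, String value, int maxAttempts) {
        int id = topologyId;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                receiver.apply(id, value);
                return attempt;
            } catch (StaleTopologyException e) {
                id = e.currentTopologyId; // retry with the newer topology
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        Receiver receiver = new Receiver(2);                  // receiver already installed topology 2
        int attempts = sendWithRetry(receiver, 1, "v2", 3);   // originator still on topology 1
        System.out.println(attempts + " attempts, applied=" + receiver.applied);
        // prints "2 attempts, applied=[v2]"
    }
}
```

The sketch is synchronous for brevity. In a truly asynchronous mode the originator would have to keep the command around until a rejection could no longer arrive, which is the feasibility concern dberindei raises in the log.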


[JBoss JIRA] (ISPN-8182) Asynchronous commands should be retried if topology is outdated

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-8182?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño edited comment on ISPN-8182 at 8/8/17 5:05 AM:
----------------------------------------------------------------

A couple of IRC discussions we've had so far:

{code}
[15:52:15] > pruivo: dberindei: it'd be interesting to hear your thoughts about ISPN-8182
[15:53:47] <dberindei> galderz: I don't think it's feasible, the originator forgets the command immediately after sending it to the owners
[15:54:29] <dberindei> galderz: OTOH I don't think the remote nodes should throw an OutdatedTopologyException if the command is async
[15:54:50] <pruivo> dberindei, galderz well, IMO I don't think we should do it. If you want to apply and update async, use the putAsync()
[15:55:29] > rvansa: FYI ^
[15:55:55] <dberindei> pruivo: but you agree that in DIST_ASYNC, remote nodes throwing OTE is a bug, right?
[15:56:05] > dberindei: we already supply custom interceptors for HB 2L, so we could try to do that: not bothering about outdated topologies
[15:57:14] > pruivo: i guess you mean that putAsync() would retry in case of outdated topology?
[15:57:16] <pruivo> dberindei, more or less. I think it should check if it is an owner or not. throwing the exception definitely isn't needed
[15:57:42] <rvansa> dberindei: I think that the topology check in DI does not care if the cache is async
[15:57:42] <pruivo> galderz, yes, if I'm not mistaken, it is a sync put in a separate threads (with the benefits of sync mode)
[15:57:51] <rvansa> dberindei: if it does not match, it simply throws
[15:58:00] <dberindei> rvansa: ok, that's a bug
[15:58:39] <rvansa> dberindei: it shouldn't ignore it either, IMO... everything below STI should be executed in the same topology, IMO
[15:59:37] > dberindei: what's the bug?
[16:00:32] <rvansa> dberindei: I think that the DI should rather send the command to proper primary owner in the recent topology
[16:00:55] <rvansa> dberindei: if this node is meant as primary
{code}

And:

{code}
<rvansa> galderz: I think that you shouldn't need to catch - the OTE does not have to be thrown at all
> rvansa: right, assuming we provide our own interceptor for timestamp and query, we can just simply ignore any topology checks...
<rvansa> galderz: not only for our own interceptor, it's a general infinispan issue
> rvansa: both for REPL and DIST?
<rvansa> galderz: yesterday before the meeting I've suggested that async commands should just execute if these are still on the owner
<rvansa> galderz: yes
<rvansa> galderz: in async mode, after you call cache.put(k, v2), all owners should eventually contain v2
<rvansa> galderz: unless 'error' happens
<rvansa> galderz: topology change (node joining) is not an error
> rvansa: makes sense
<rvansa> galderz: actually, throwing and retrying locally might be needed - I would prefer to fix topology for a given command below STI and retry if it changes in any place we need to consider it
<rvansa> galderz: that's a cozy invariant; not sure if it's really needed here
<rvansa> galderz: anyway, regrettably we don't have any plan so far how to make the 'eventually' happen
> rvansa: but we need something better than what we have now...
<rvansa> galderz: because if a new owner pops up and fetches data from node that did not get the update yet, its version would be stale
> dberindei: pruivo: we were interrupted yday discussing ISPN-8182
<jbossbot> jira [ISPN-8182] Asynchronous commands should be retried if topology is outdated [New (Unresolved) Enhancement, Major, Core, Unassigned] https://issues.jboss.org/browse/ISPN-8182
<rvansa> galderz: quick fix would be just not throwing
<rvansa> galderz: + a set of stress tests that will try out this with all combos of primary/backup/non-owner transitions to see if anything goes wrong
<rvansa> 'wrong' meaning NPEs and such, stale data should be expected in thos
<rvansa> those
*** First activity: dberindei joined 33 minutes 16 seconds ago.
<dberindei> galderz rvansa: indeed, if a new node joins OR a node leaves, some keys will have new owners, and those owners may or may not receive the updated value
> rvansa: if stale data is expected, then we're in the same scenario as now really
<dberindei> galderz: the fact that we currently check the topology id and throw an exception means the update will can be missed on owners that aren't new
> what do you mean by "aren't new"?
<dberindei> galderz: say in topology 1 k is owned by AB, and in topology 2 it's owned by CB
<dberindei> galderz: C would be a new owner, B would be "non-new" :)
> dberindei: got it
> dberindei: so, should remote nodes not throw that exception for async puts? or should still be thrown and then retried?
> dberindei: we're assuming we'd change core for this
<dberindei> galderz: throwing and catching would be nice because the only change would be in StateTransferInterceptor (I think)
> dberindei: ok
> dberindei: do we have any stress tests where we could add tests for seeing that it all works fine for repl async puts?
{code}

> Asynchronous commands should be retried if topology is outdated
> ---------------------------------------------------------------
>
> Key: ISPN-8182
> URL: https://issues.jboss.org/browse/ISPN-8182
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 9.1.0.Final
> Reporter: Galder Zamarreño
>
> If an asynchronous command fails at a remote node, it should be retried.
> I'm not sure how feasible this really is. One possible solution could be this: having NACK style implementation where by default the originator assumes an asynchronous command has been executed, but if the receiver tells it that the topology is outdated, the originator retries?
> This is related to ISPN-8027 where we've discovered that some updates are not applied when asynchronous commands to update the Hibernate 2L timestamp cache fail as a result of an outdated topology.
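dberindei's example in the log (key k owned by AB in topology 1 and by CB in topology 2) can be made concrete with a small set computation. This is a minimal sketch, assuming owners are plain node names in a list; `OwnerDiffSketch` and its methods are hypothetical helpers, not part of Infinispan's consistent-hash API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that splits the owners of a key into "new" and "non-new". */
final class OwnerDiffSketch {

    /** Owners present after the topology change that were not owners before. */
    static Set<String> newOwners(List<String> before, List<String> after) {
        Set<String> result = new LinkedHashSet<>(after);
        result.removeAll(before);
        return result;
    }

    /** Owners that held the key both before and after the change ("non-new"). */
    static Set<String> retainedOwners(List<String> before, List<String> after) {
        Set<String> result = new LinkedHashSet<>(after);
        result.retainAll(before);
        return result;
    }

    public static void main(String[] args) {
        List<String> topology1 = List.of("A", "B"); // owners of k in topology 1
        List<String> topology2 = List.of("C", "B"); // owners of k in topology 2
        System.out.println("new=" + newOwners(topology1, topology2)
                + ", non-new=" + retainedOwners(topology1, topology2));
        // prints "new=[C], non-new=[B]"
    }
}
```

In these terms, the point made in the log is that rejecting the async write because of the topology id can lose the update on B, the non-new owner, not only on C.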


[JBoss JIRA] (ISPN-8178) SingleNodeJdbcStoreIT.testForcedShutdown failure

by Ryan Emerson (JIRA)

[ https://issues.jboss.org/browse/ISPN-8178?page=com.atlassian.jira.plugin.... ]

Ryan Emerson resolved ISPN-8178.
--------------------------------
Fix Version/s: 9.1.1.Final
Resolution: Done

> SingleNodeJdbcStoreIT.testForcedShutdown failure
> ------------------------------------------------
>
> Key: ISPN-8178
> URL: https://issues.jboss.org/browse/ISPN-8178
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.1.0.Final
> Reporter: Tristan Tarrant
> Assignee: Tristan Tarrant
> Labels: testsuite_stability
> Fix For: 9.1.1.Final
>
>
> Occasionally the test fails with:
> java.lang.AssertionError: null
>     at org.junit.Assert.fail(Assert.java:86)
>     at org.junit.Assert.assertTrue(Assert.java:41)
>     at org.junit.Assert.assertNotNull(Assert.java:621)
>     at org.junit.Assert.assertNotNull(Assert.java:631)
>     at org.infinispan.server.test.cs.jdbc.SingleNodeJdbcStoreIT.testRestartStringStoreAfter(SingleNodeJdbcStoreIT.java:195)
>     at org.infinispan.server.test.cs.jdbc.SingleNodeJdbcStoreIT.testForcedShutdown(SingleNodeJdbcStoreIT.java:151)


[JBoss JIRA] (ISPN-8168) IndexNotFoundException with topology changes

by Gustavo Fernandes (JIRA)

[ https://issues.jboss.org/browse/ISPN-8168?page=com.atlassian.jira.plugin.... ]

Gustavo Fernandes updated ISPN-8168:
------------------------------------
Summary: IndexNotFoundException with topology changes (was: Index corruption with topology changes)

> IndexNotFoundException with topology changes
> --------------------------------------------
>
> Key: ISPN-8168
> URL: https://issues.jboss.org/browse/ISPN-8168
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 9.1.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Labels: query, testsuite_stability
> Fix For: 9.1.1.Final
>
> Attachments: trace.zip
>
>
> This can be observed in the LiveRunningTest, that fails very often with
> {noformat}
> Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='emails'}: files: []
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:726)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
> {noformat}
> The cache entry that contains the list of files the lucene directory (FileListCacheValue) for some reason is empty, although the index is not. The missing value for FileListCacheValue causes the index reader to think the index is empty and thus the error


[JBoss JIRA] (ISPN-8179) Test in TimestampsRegionImplTest missing @Test

by Tristan Tarrant (JIRA)

[ https://issues.jboss.org/browse/ISPN-8179?page=com.atlassian.jira.plugin.... ]

Tristan Tarrant updated ISPN-8179:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done

> Test in TimestampsRegionImplTest missing @Test
> ----------------------------------------------
>
> Key: ISPN-8179
> URL: https://issues.jboss.org/browse/ISPN-8179
> Project: Infinispan
> Issue Type: Bug
> Components: Hibernate Cache
> Affects Versions: 9.1.0.Final
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 9.1.1.Final
>
>
> {{testClearTimestampsRegionInIsolated}} is missing {{@Test}} annotation in TimestampsRegionImplTest.
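Why a missing annotation matters can be shown with a toy discovery loop. This is a sketch under assumptions: `Marked` is a hypothetical stand-in for a test framework's `@Test` annotation, and `discover` only imitates, in a very reduced form, how annotation-driven runners typically pick test methods through reflection. It is not the code of any real runner.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of annotation-driven test discovery. */
final class AnnotationDiscoverySketch {

    /** Stand-in for a framework's @Test annotation; must be retained at runtime. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Marked {}

    /** Two test-like methods, one of which lacks the annotation. */
    static class SampleTests {
        @Marked
        public void testAnnotated() {}

        public void testForgotten() {} // compiles fine, but is never discovered
    }

    /** Returns the sorted names of the methods a runner of this kind would execute. */
    static List<String> discover(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isAnnotationPresent(Marked.class)) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(discover(SampleTests.class)); // prints "[testAnnotated]"
    }
}
```

The forgotten method produces no failure and no warning, which is why an omission like the one reported here can go unnoticed.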


[JBoss JIRA] (ISPN-8181) XSite tests fail randomly with java.net.BindException

by Tristan Tarrant (JIRA)

[ https://issues.jboss.org/browse/ISPN-8181?page=com.atlassian.jira.plugin.... ]

Tristan Tarrant updated ISPN-8181:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 9.1.1.Final
Resolution: Done

> XSite tests fail randomly with java.net.BindException
> -----------------------------------------------------
>
> Key: ISPN-8181
> URL: https://issues.jboss.org/browse/ISPN-8181
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.1.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Labels: testsuite_stability
> Fix For: 9.1.1.Final
>
>
> {noformat}
> [ERROR] createBeforeClass(org.infinispan.xsite.BackupWithSecurityTest) Time elapsed: 0.047 s <<< FAILURE!
> org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
>     at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:252)
>     at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:686)
>     at org.infinispan.manager.DefaultCacheManager.<init>(DefaultCacheManager.java:261)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.newDefaultCacheManager(TestCacheManagerFactory.java:394)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.newDefaultCacheManager(TestCacheManagerFactory.java:70)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.createClusteredCacheManager(TestCacheManagerFactory.java:198)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.createClusteredCacheManager(TestCacheManagerFactory.java:189)
>     at org.infinispan.xsite.AbstractXSiteTest$TestSite.addClusterEnabledCacheManager(AbstractXSiteTest.java:259)
>     at org.infinispan.xsite.AbstractXSiteTest$TestSite.createClusteredCaches(AbstractXSiteTest.java:233)
>     at org.infinispan.xsite.AbstractXSiteTest.createSite(AbstractXSiteTest.java:96)
>     at org.infinispan.xsite.BackupWithSecurityTest.access$201(BackupWithSecurityTest.java:23)
>     at org.infinispan.xsite.BackupWithSecurityTest.lambda$createSite$0(BackupWithSecurityTest.java:60)
>     at org.infinispan.security.Security.doAs(Security.java:118)
>     at org.infinispan.xsite.BackupWithSecurityTest.createSite(BackupWithSecurityTest.java:60)
>     at org.infinispan.xsite.AbstractMultipleSitesTest.createSites(AbstractMultipleSitesTest.java:87)
>     at org.infinispan.xsite.AbstractXSiteTest.createBeforeClass(AbstractXSiteTest.java:51)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
>     at org.testng.internal.Invoker.invokeConfigurationMethod(Invoker.java:564)
>     at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:213)
>     at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:138)
>     at org.testng.internal.TestMethodWorker.invokeBeforeClassMethods(TestMethodWorker.java:175)
>     at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:107)
>     at org.testng.TestRunner.privateRun(TestRunner.java:767)
>     at org.testng.TestRunner.run(TestRunner.java:617)
>     at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
>     at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
>     at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
>     at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
>     at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:95)
>     at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:83)
>     at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:88)
>     at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:165)
>     at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
>     at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:635)
>     at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:624)
>     at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:549)
>     at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:239)
>     ... 35 more
> Caused by: org.infinispan.commons.CacheException: Unable to start JGroups Channel
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:507)
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:437)
>     at sun.reflect.GeneratedMethodAccessor163.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:91)
>     ... 43 more
> Caused by: java.net.BindException: No available port to bind to in range [8400 .. 8409]
>     at org.jgroups.util.Util.createServerSocketChannel(Util.java:3539)
>     at org.jgroups.blocks.cs.NioServer.<init>(NioServer.java:71)
>     at org.jgroups.protocols.TCP_NIO2.start(TCP_NIO2.java:97)
>     at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:861)
>     at org.jgroups.JChannel.startStack(JChannel.java:1017)
>     at org.jgroups.JChannel._preConnect(JChannel.java:886)
>     at org.jgroups.JChannel.connect(JChannel.java:390)
>     at org.jgroups.JChannel.connect(JChannel.java:384)
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:505)
>     ... 48 more
> {noformat}
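The innermost cause in the trace, "No available port to bind to in range [8400 .. 8409]", is what a bounded port scan reports once every port in the range is taken. The following is a minimal sketch of that pattern using only `java.net`; `PortRangeSketch` and `bindInRange` are hypothetical names, and the code is not the JGroups implementation behind `Util.createServerSocketChannel`.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Hypothetical sketch of binding to the first free port in a fixed range. */
final class PortRangeSketch {

    /**
     * Tries each port from start to end (inclusive) and returns the first
     * socket that binds. Throws BindException when the whole range is in use.
     */
    static ServerSocket bindInRange(InetAddress address, int start, int end) throws IOException {
        for (int port = start; port <= end; port++) {
            try {
                return new ServerSocket(port, 50, address);
            } catch (BindException inUse) {
                // port already taken, move on to the next one
            }
        }
        throw new BindException(
                "No available port to bind to in range [" + start + " .. " + end + "]");
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Occupy one ephemeral port, then scan a range containing only that port.
        try (ServerSocket taken = new ServerSocket(0, 50, loopback)) {
            int port = taken.getLocalPort();
            try (ServerSocket second = bindInRange(loopback, port, port)) {
                System.out.println("unexpectedly bound " + second.getLocalPort());
            } catch (BindException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

With a ten-port range, test JVMs that run in parallel or sockets left open by earlier tests can use up every port, which would be consistent with the random failures described in the issue.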


[JBoss JIRA] (ISPN-8168) Index corruption with topology changes

by Gustavo Fernandes (JIRA)

[ https://issues.jboss.org/browse/ISPN-8168?page=com.atlassian.jira.plugin.... ]

Gustavo Fernandes updated ISPN-8168:
------------------------------------
Fix Version/s: 9.1.1.Final

> Index corruption with topology changes
> --------------------------------------
>
> Key: ISPN-8168
> URL: https://issues.jboss.org/browse/ISPN-8168
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 9.1.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Labels: query, testsuite_stability
> Fix For: 9.1.1.Final
>
> Attachments: trace.zip
>
>
> This can be observed in the LiveRunningTest, that fails very often with
> {noformat}
> Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='emails'}: files: []
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:726)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
> {noformat}
> The cache entry that contains the list of files the lucene directory (FileListCacheValue) for some reason is empty, although the index is not. The missing value for FileListCacheValue causes the index reader to think the index is empty and thus the error


[JBoss JIRA] (ISPN-8181) XSite tests fail randomly with java.net.BindException

by Gustavo Fernandes (JIRA)

[ https://issues.jboss.org/browse/ISPN-8181?page=com.atlassian.jira.plugin.... ]

Gustavo Fernandes updated ISPN-8181:
------------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/5361

> XSite tests fail randomly with java.net.BindException
> -----------------------------------------------------
>
> Key: ISPN-8181
> URL: https://issues.jboss.org/browse/ISPN-8181
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.1.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Labels: testsuite_stability
>
> {noformat}
> [ERROR] createBeforeClass(org.infinispan.xsite.BackupWithSecurityTest) Time elapsed: 0.047 s <<< FAILURE!
> org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
>     at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:252)
>     at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:686)
>     at org.infinispan.manager.DefaultCacheManager.<init>(DefaultCacheManager.java:261)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.newDefaultCacheManager(TestCacheManagerFactory.java:394)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.newDefaultCacheManager(TestCacheManagerFactory.java:70)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.createClusteredCacheManager(TestCacheManagerFactory.java:198)
>     at org.infinispan.test.fwk.TestCacheManagerFactory.createClusteredCacheManager(TestCacheManagerFactory.java:189)
>     at org.infinispan.xsite.AbstractXSiteTest$TestSite.addClusterEnabledCacheManager(AbstractXSiteTest.java:259)
>     at org.infinispan.xsite.AbstractXSiteTest$TestSite.createClusteredCaches(AbstractXSiteTest.java:233)
>     at org.infinispan.xsite.AbstractXSiteTest.createSite(AbstractXSiteTest.java:96)
>     at org.infinispan.xsite.BackupWithSecurityTest.access$201(BackupWithSecurityTest.java:23)
>     at org.infinispan.xsite.BackupWithSecurityTest.lambda$createSite$0(BackupWithSecurityTest.java:60)
>     at org.infinispan.security.Security.doAs(Security.java:118)
>     at org.infinispan.xsite.BackupWithSecurityTest.createSite(BackupWithSecurityTest.java:60)
>     at org.infinispan.xsite.AbstractMultipleSitesTest.createSites(AbstractMultipleSitesTest.java:87)
>     at org.infinispan.xsite.AbstractXSiteTest.createBeforeClass(AbstractXSiteTest.java:51)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
>     at org.testng.internal.Invoker.invokeConfigurationMethod(Invoker.java:564)
>     at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:213)
>     at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:138)
>     at org.testng.internal.TestMethodWorker.invokeBeforeClassMethods(TestMethodWorker.java:175)
>     at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:107)
>     at org.testng.TestRunner.privateRun(TestRunner.java:767)
>     at org.testng.TestRunner.run(TestRunner.java:617)
>     at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
>     at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
>     at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
>     at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport
>     at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:95)
>     at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:83)
>     at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:88)
>     at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:165)
>     at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
>     at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:635)
>     at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:624)
>     at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:549)
>     at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:239)
>     ... 35 more
> Caused by: org.infinispan.commons.CacheException: Unable to start JGroups Channel
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:507)
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:437)
>     at sun.reflect.GeneratedMethodAccessor163.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:91)
>     ... 43 more
> Caused by: java.net.BindException: No available port to bind to in range [8400 .. 8409]
>     at org.jgroups.util.Util.createServerSocketChannel(Util.java:3539)
>     at org.jgroups.blocks.cs.NioServer.<init>(NioServer.java:71)
>     at org.jgroups.protocols.TCP_NIO2.start(TCP_NIO2.java:97)
>     at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:861)
>     at org.jgroups.JChannel.startStack(JChannel.java:1017)
>     at org.jgroups.JChannel._preConnect(JChannel.java:886)
>     at org.jgroups.JChannel.connect(JChannel.java:390)
>     at org.jgroups.JChannel.connect(JChannel.java:384)
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:505)
>     ... 48 more
> {noformat}

8 years, 10 months


[JBoss JIRA] (ISPN-8182) Asynchronous commands should be retried if topology is outdated

by Dan Berindei (JIRA)

[ https://issues.jboss.org/browse/ISPN-8182?page=com.atlassian.jira.plugin.... ]

Dan Berindei commented on ISPN-8182:
------------------------------------

-1 to retry from the originator, because the whole point of asynchronous replication is to not keep track of commands after they were sent to the owners. In other words, the fact that updates can be lost is the main reason why {{cache.put(k, v)}}/*DIST_ASYNC* is faster than {{cache.putAsync(k, v)}}/*DIST_SYNC*.

OTOH it's not ok that the remote node throws an {{OutdatedTopologyException}} and then pretends it can send it back to the originator (at least in the log), and the originator could retry the command. The remote node should either not throw the {{OutdatedTopologyException}} at all, or it should catch it and retry locally.

> Asynchronous commands should be retried if topology is outdated
> ---------------------------------------------------------------
>
>                 Key: ISPN-8182
>                 URL: https://issues.jboss.org/browse/ISPN-8182
>             Project: Infinispan
>          Issue Type: Enhancement
>          Components: Core
>    Affects Versions: 9.1.0.Final
>            Reporter: Galder Zamarreño
>
> If an asynchronous command fails at a remote node, it should be retried.
> I'm not sure how feasible this really is. One possible solution could be this: having NACK style implementation where by default the originator assumes an asynchronous command has been executed, but if the receiver tells it that the topology is outdated, the originator retries?
> This is related to ISPN-8027 where we've discovered that some updates are not applied when asynchronous commands to update the Hibernate 2L timestamp cache fail as a result of an outdated topology.

--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
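The "catch it and retry locally" option from Dan Berindei's comment on ISPN-8182 can be illustrated with a small, self-contained sketch. Nothing here is Infinispan code: `LocalRetrySketch`, `RemoteNode`, `AsyncWrite`, `applyWithLocalRetry` and the nested `OutdatedTopologyException` are hypothetical stand-ins, and the integer topology id is an assumption made only to keep the example runnable. The point is simply that the receiving node handles the stale topology itself and the originator never hears about it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types are Infinispan classes.
class LocalRetrySketch {

    /** Stand-in for the exception named in the comment (not Infinispan's class). */
    static class OutdatedTopologyException extends RuntimeException {
        OutdatedTopologyException(String message) {
            super(message);
        }
    }

    /** An async write tagged with the topology id the originator saw when sending. */
    static final class AsyncWrite {
        final String key;
        final String value;
        final int topologyId;

        AsyncWrite(String key, String value, int topologyId) {
            this.key = key;
            this.value = value;
            this.topologyId = topologyId;
        }
    }

    /** Toy remote node holding its current topology id and its local data. */
    static final class RemoteNode {
        volatile int currentTopologyId;
        final Map<String, String> data = new ConcurrentHashMap<>();

        RemoteNode(int currentTopologyId) {
            this.currentTopologyId = currentTopologyId;
        }

        /** Applies the write only if it was issued in the node's current topology. */
        void apply(String key, String value, int topologyId) {
            if (topologyId != currentTopologyId) {
                throw new OutdatedTopologyException(
                        "command topology " + topologyId + ", node topology " + currentTopologyId);
            }
            data.put(key, value);
        }

        /**
         * Catches the stale-topology failure on the receiving node and retries
         * there with the node's current topology id. Nothing is reported back
         * to the originator, which has already forgotten the command.
         */
        boolean applyWithLocalRetry(AsyncWrite write, int maxAttempts) {
            int topologyId = write.topologyId;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    apply(write.key, write.value, topologyId);
                    return true;
                } catch (OutdatedTopologyException e) {
                    topologyId = currentTopologyId; // refresh and retry locally
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        RemoteNode node = new RemoteNode(5);
        // The originator sent the write while it still saw topology 4.
        boolean applied = node.applyWithLocalRetry(new AsyncWrite("k", "v2", 4), 3);
        System.out.println("applied=" + applied + " value=" + node.data.get("k"));
    }
}
```

A real fix would also have to decide what happens when the node is no longer an owner of the key in the new topology, which this sketch deliberately leaves out.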

8 years, 10 months


[JBoss JIRA] (ISPN-8114) Random failures in loading from Hibernate Cache

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-8114?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-8114:
-----------------------------------
        Status: Resolved  (was: Pull Request Sent)
    Resolution: Done

> Random failures in loading from Hibernate Cache
> -----------------------------------------------
>
>                 Key: ISPN-8114
>                 URL: https://issues.jboss.org/browse/ISPN-8114
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Hibernate Cache
>    Affects Versions: 9.1.0.Final
>            Reporter: Galder Zamarreño
>            Assignee: Galder Zamarreño
>              Labels: testsuite_stability
>             Fix For: 9.1.1.Final
>
>
> {{org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.testAll[read-only, INVALIDATION_SYNC]}}
> {{org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.testAll[transactional, INVALIDATION_SYNC]}}
> {code}
> java.lang.AssertionError: Citizen (1234) should have present in the cache
>     at org.junit.Assert.fail(Assert.java:88)
>     at org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.assertLoadedFromCache(NaturalIdInvalidationTest.java:144)
>     at org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.testAll(NaturalIdInvalidationTest.java:114)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>     at org.hibernate.testing.junit4.ExtendedFrameworkMethod.invokeExplosively(ExtendedFrameworkMethod.java:45)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}

--
This message was sent by Atlassian JIRA
(v7.2.3#72005)

8 years, 10 months

