[JBoss JIRA] (ISPN-5515) Purge store if there is another node already running
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5515?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5515:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> Purge store if there is another node already running
> ----------------------------------------------------
>
> Key: ISPN-5515
> URL: https://issues.jboss.org/browse/ISPN-5515
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core, Loaders and Stores
> Affects Versions: 7.2.2.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.1.0.Final
>
>
> Preloading happens before communicating with other nodes that might already have the cache running. When joining the existing members, the cache then waits to receive the first consistent hash (CH) in which it is a member, and then deletes only the entries in the segments that it doesn't own in that CH.
> The intention of this was to remove as little as possible from the existing data, e.g. if the first node to start up is not the one that was stopped last. But the preloaded entries are not replicated to the other nodes, so this can lead to inconsistencies.
> It would be better to delay preloading until we know we are the first node to start up, but failing that we could clear the data container and the store before receiving the initial state.
> Note that this will only allow preloading data from one node. Restoring data from more nodes is harder to do, and we will implement it as part of graceful restart.
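A minimal sketch contrasting the two behaviours described above: pruning only the segments the joiner does not own, versus clearing everything when other members are already running. All names here (`PreloadPurgeSketch`, `segmentOf`, `removeUnowned`, `purgeIfJoining`) are hypothetical, and the hash-modulo segment mapping is an assumption standing in for the real consistent hash.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class PreloadPurgeSketch {
    // Hypothetical segment mapping; a stand-in for the real consistent hash.
    public static int segmentOf(String key, int numSegments) {
        return Math.floorMod(key.hashCode(), numSegments);
    }

    // Behaviour described in the issue: after the first CH arrives,
    // delete only the preloaded entries in segments this node does not own.
    public static void removeUnowned(Map<String, String> preloaded,
                                     Set<Integer> ownedSegments,
                                     int numSegments) {
        preloaded.keySet().removeIf(
            k -> !ownedSegments.contains(segmentOf(k, numSegments)));
    }

    // Fallback proposed in the issue: if another node already runs the cache,
    // drop everything that was preloaded before receiving the initial state.
    public static void purgeIfJoining(Map<String, String> preloaded,
                                      boolean otherMembersRunning) {
        if (otherMembersRunning) {
            preloaded.clear();
        }
    }

    public static void main(String[] args) {
        Map<String, String> preloaded = new HashMap<>();
        preloaded.put("a", "1");
        preloaded.put("b", "2");
        preloaded.put("c", "3");

        purgeIfJoining(preloaded, true);
        System.out.println("entries left after purge: " + preloaded.size());
    }
}
```

With the purge, the joiner holds only what state transfer delivers, so stale preloaded entries cannot diverge from the running members.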
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
7 years, 1 month
[JBoss JIRA] (ISPN-5499) SizeTest.testPersistentDistributedCacheSize random failures
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5499?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5499:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> SizeTest.testPersistentDistributedCacheSize random failures
> -----------------------------------------------------------
>
> Key: ISPN-5499
> URL: https://issues.jboss.org/browse/ISPN-5499
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 9.1.0.Final
>
>
> {noformat}
> 16:04:28,678 ERROR (testng-SizeTest:) [UnitTestTestNGListener] Test testPersistentDistributedCacheSize(org.infinispan.client.hotrod.SizeTest) failed.
> java.lang.AssertionError: expected:<20> but was:<38>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252)
> at org.infinispan.client.hotrod.SizeTest.testPersistentDistributedCacheSize(SizeTest.java:59)
> {noformat}
> I have been able to make the test fail reliably by replacing the assertion on line 57 with this:
> {code}
> for (int i = 0; i < SIZE; i++) {
>    assertEquals(SIZE, clients.get(0).getCache(cacheName).size());
> }
> {code}
[JBoss JIRA] (ISPN-5510) Provide better Hot Rod client socket timeout and retry defaults
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5510?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5510:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> Provide better Hot Rod client socket timeout and retry defaults
> ---------------------------------------------------------------
>
> Key: ISPN-5510
> URL: https://issues.jboss.org/browse/ISPN-5510
> Project: Infinispan
> Issue Type: Enhancement
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 9.1.0.Final
>
>
> The current defaults are:
> * Socket timeout = 60 seconds
> * Max retries = 10
> As a result of these defaults, if the server hangs an operation, it'd take 10 minutes (60 second timeout x 10 retries) for the operation to finally return an exception to the client, which is way too much.
> So, these default values should change to be more aggressive: 30 second socket timeout and 3 max retries.
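The arithmetic above can be checked with a small sketch. It follows the issue's own formula (socket timeout multiplied by max retries); the class and method names are hypothetical, and how the client counts the initial attempt is not modelled here.

```java
import java.time.Duration;

public class HotRodRetryBudget {
    // Worst-case wait for a hung operation, using the issue's formula:
    // socket timeout x max retries.
    public static Duration worstCase(Duration socketTimeout, int maxRetries) {
        return socketTimeout.multipliedBy(maxRetries);
    }

    public static void main(String[] args) {
        Duration current = worstCase(Duration.ofSeconds(60), 10);
        Duration proposed = worstCase(Duration.ofSeconds(30), 3);
        System.out.println("current defaults:  " + current.getSeconds() + " s");
        System.out.println("proposed defaults: " + proposed.getSeconds() + " s");
    }
}
```

Under these assumptions the proposed defaults cut the worst case from 600 seconds to 90 seconds.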
[JBoss JIRA] (ISPN-5415) Expose protobuf entries to scripting
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5415?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5415:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> Expose protobuf entries to scripting
> ------------------------------------
>
> Key: ISPN-5415
> URL: https://issues.jboss.org/browse/ISPN-5415
> Project: Infinispan
> Issue Type: Feature Request
> Components: Remote Querying
> Affects Versions: 8.0.0.Final
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 9.1.0.Final
>
>
> We need an alternative API for Protostream marshalling that is easy to consume from scripting languages. The messages need to be unmarshalled into a map-like object that can be accessed easily from scripting languages. Users should not have to provide any marshaller implementation code or annotations.
[JBoss JIRA] (ISPN-5570) Cross-site: retry backup commands
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5570?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5570:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> Cross-site: retry backup commands
> ---------------------------------
>
> Key: ISPN-5570
> URL: https://issues.jboss.org/browse/ISPN-5570
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Cross-Site Replication
> Affects Versions: 7.2.3.Final
> Reporter: Dan Berindei
> Fix For: 9.1.0.Final
>
>
> There are 3 phases in a backup RPC, each with its own failure causes:
> 1. Sender -> Local site master: failures are caused by the site master shutting down or crashing, or by a network split.
> 2. Local site master -> Remote site master:
> 2.1. Local site master is no longer a site master, e.g. because it's shutting down or because it's no longer coordinator after a merge.
> 2.2. Remote site master is no longer a site master.
> 2.3. Link between local site and remote site is down.
> 3. Remote site master -> Backup targets
> Replication failures in phase 3 are handled by retrying (except for TimeoutExceptions), because {{BaseBackupReceiver}} uses regular cache methods to perform the updates.
> But replication failures in phases 1 and 2 are not handled in any way, except for causing the remote site to be taken offline after a certain number of replication failures (if backup is synchronous). We should instead retry backup RPCs when we get a {{SuspectException}} or {{UnreachableException}}, and perhaps even when we get no response (2.2?), and only stop when the timeout expires or when the backup is taken offline.
> Async backup probably needs retrying as well, and perhaps even a more sophisticated approach like I-RAC (ISPN-2634).
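A minimal sketch of the retry policy suggested above: retry on the two exception types, and stop when the timeout expires or the backup is taken offline. The nested exception classes are local stand-ins for the Infinispan exceptions named in the issue, and `invokeWithRetry`, `pauseMillis` and `backupOffline` are hypothetical names, not existing API.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

public class BackupRetrySketch {
    // Local stand-ins for the exceptions named in the issue (assumption).
    public static class SuspectException extends RuntimeException {}
    public static class UnreachableException extends RuntimeException {}

    // Retries the backup RPC on SuspectException/UnreachableException until
    // it succeeds, the timeout expires, or the backup is taken offline.
    public static <T> T invokeWithRetry(Supplier<T> rpc,
                                        long timeoutMillis,
                                        long pauseMillis,
                                        BooleanSupplier backupOffline) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                return rpc.get();
            } catch (SuspectException | UnreachableException e) {
                // Give up and surface the last failure to the caller.
                if (backupOffline.getAsBoolean()
                        || System.nanoTime() - deadline >= 0) {
                    throw e;
                }
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during retry", ie);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = invokeWithRetry(() -> {
            // Simulate two failed attempts before the site master responds.
            if (attempts[0]++ < 2) {
                throw new SuspectException();
            }
            return "backed up";
        }, 1_000, 10, () -> false);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Other exceptions propagate immediately, which mirrors the issue's note that only specific failures should trigger a retry.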
[JBoss JIRA] (ISPN-5572) Exposed JMX MBeans should be separate components
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5572?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5572:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> Exposed JMX MBeans should be separate components
> ------------------------------------------------
>
> Key: ISPN-5572
> URL: https://issues.jboss.org/browse/ISPN-5572
> Project: Infinispan
> Issue Type: Task
> Components: Core
> Affects Versions: 8.0.0.Alpha2, 7.2.3.Final
> Reporter: Dan Berindei
> Fix For: 9.1.0.Final
>
>
> We currently expose internal components as JMX MBeans, and that makes our JMX "API" very unstructured. The exposed MBeans should be separate components, and the only concern in their interfaces should be ease of use.
> One example of JMX getting in the way of refactoring is {{CacheMgmtInterceptor}}. The interceptor chain is dynamic, so it should be possible to insert the interceptor only when statistics are enabled. But because the {{statisticsEnabled}} attribute is on the interceptor itself, that becomes a lot trickier, and we had to introduce a separate configuration attribute that disables statistics permanently (ISPN-5542).
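One possible shape for this, as a rough sketch only: a separate management component owns the `statisticsEnabled` attribute and adds or removes the statistics interceptor from the chain. Every type below (`Interceptor`, `StatsInterceptor`, `CacheStats`) is hypothetical and much simpler than the real interceptor chain.

```java
import java.util.ArrayList;
import java.util.List;

public class StatsComponentSketch {
    public interface Interceptor {
        void visit(String command);
    }

    // Hypothetical statistics interceptor: only counts commands.
    public static class StatsInterceptor implements Interceptor {
        long commands;

        @Override
        public void visit(String command) {
            commands++;
        }
    }

    // Hypothetical management component that would be exposed over JMX.
    // It owns the statisticsEnabled attribute, so the interceptor is in
    // the chain only while statistics are enabled.
    public static class CacheStats {
        private final List<Interceptor> chain;
        private final StatsInterceptor interceptor = new StatsInterceptor();

        public CacheStats(List<Interceptor> chain) {
            this.chain = chain;
        }

        public boolean isStatisticsEnabled() {
            return chain.contains(interceptor);
        }

        public void setStatisticsEnabled(boolean enabled) {
            if (enabled && !chain.contains(interceptor)) {
                chain.add(0, interceptor);
            } else if (!enabled) {
                chain.remove(interceptor);
            }
        }

        public long getCommands() {
            return interceptor.commands;
        }
    }

    public static void main(String[] args) {
        List<Interceptor> chain = new ArrayList<>();
        CacheStats stats = new CacheStats(chain);
        stats.setStatisticsEnabled(true);
        chain.forEach(i -> i.visit("put"));
        System.out.println("commands seen: " + stats.getCommands());
    }
}
```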
[JBoss JIRA] (ISPN-5929) InfinispanQueryIT.testQueryOnFirstNode random failures
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5929?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5929:
----------------------------------
Fix Version/s: 9.1.0.Final
(was: 9.0.0.Final)
> InfinispanQueryIT.testQueryOnFirstNode random failures
> ------------------------------------------------------
>
> Key: ISPN-5929
> URL: https://issues.jboss.org/browse/ISPN-5929
> Project: Infinispan
> Issue Type: Bug
> Components: Integration, Test Suite - Query
> Affects Versions: 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Adrian Nistor
> Priority: Blocker
> Fix For: 9.1.0.Final
>
>
> {{InfinispanQueryIT.testQueryOnFirstNode()}} and {{InfinispanQueryIT.testQueryOnSecondNode()}} fail randomly in CI with this assertion:
> {noformat}
> java.lang.AssertionError: expected:<3> but was:<2>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at org.infinispan.test.integration.as.query.InfinispanQueryIT.testQueryOnFirstNode(InfinispanQueryIT.java:99)
> {noformat}
> Example: http://ci.infinispan.org/viewLog.html?buildId=31810&tab=buildResultsDiv&b...