[Red Hat JIRA] (ISPN-1407) Server-initiated cluster switch
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-1407?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-1407:
----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Server-initiated cluster switch
> -------------------------------
>
> Key: ISPN-1407
> URL: https://issues.redhat.com/browse/ISPN-1407
> Project: Infinispan
> Issue Type: Feature Request
> Components: Remote Protocols
> Reporter: Sanne Grinovero
> Assignee: Tristan Tarrant
> Priority: Major
> Fix For: 12.1.0.Final
>
>
> When starting a rolling upgrade, the server_lists need to be updated with new entries at the start of the upgrade, and some entries possibly removed when the upgrade is done.
> Aside from exposing this operation through the REST endpoint, the CLI should be extended with a {{migrate cluster clients}} command that tells all clients connected to the source cluster to switch to the target cluster.
--
This message was sent by Atlassian Jira
(v8.13.1#813001)
[Red Hat JIRA] (ISPN-5290) Better automatic merge for caches with enabled partition handling
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-5290?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-5290:
----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Better automatic merge for caches with enabled partition handling
> -----------------------------------------------------------------
>
> Key: ISPN-5290
> URL: https://issues.redhat.com/browse/ISPN-5290
> Project: Infinispan
> Issue Type: Feature Request
> Environment: JDG cluster with partitionHandling enabled
> Reporter: Wolf-Dieter Fink
> Assignee: Dan Berindei
> Priority: Major
> Labels: cluster, clustering, infinispan, partition_handling
> Fix For: 12.1.0.Final
>
>
> At the moment there is no detection of whether a node joining a cluster is one of the nodes known from the "last stable view".
> The drawback is that the cluster stays in DEGRADED mode if some nodes are restarted during the split-brain.
> Suppose the cluster split is caused by a power failure of some nodes: the surviving nodes are DEGRADED because >= numOwners nodes are lost.
> If the failed nodes are restarted, say by an application using library mode in EAP, these instances are identified as new nodes because their node IDs are different.
> When these nodes join the 'cluster', all nodes remain degraded: the restarted nodes are known as different nodes, not as the lost ones, so the cluster never heals and never returns to AVAILABLE.
> Some of these scenarios can be prevented by using server hinting to ensure that at least one owner survives.
> But in other cases it would be good to have a different strategy to bring the cluster back to AVAILABLE mode.
> During the split-brain there is no way to continue, since there is no way to know whether "the other" part is gone, or still alive but unreachable.
> With a shared persistence it might be possible, but synchronizing that with locking and version columns has a huge drawback for normal operation.
> If the node ID can be kept, I see the following enhancements:
> - with a shared persistence there should be no data loss; once all nodes are back in the cluster, it can go AVAILABLE and reload the missing entries
> - for a 'side' cache whose values are calculated or retrieved from other (slow) systems, the cluster can go AVAILABLE and reload the entries
> - in other cases there could be a WARNING/ERROR once all members are back from the split; some data may have been lost, and the cluster is set back to AVAILABLE automatically or manually
> It might be complicated to compute these modes, but a partition-handling configuration option could give the administrator the possibility to decide which behaviour is appropriate for a cache,
> i.e.
> <partition-handling enabled="true" healing="HEALING.MODE"/>
> where the modes are:
> AVAILABLE_NO_WARNING: back to AVAILABLE after all nodes from the "last stable view" are back
> AVAILABLE_WARNING_DATALOST: ditto, but log a warning that some data may have been lost
> WARNING_DATALOST: only log a warning and a hint on how to enable AVAILABLE manually
> NONE: same as the current behaviour (if necessary; WARNING_DATALOST may be similar or better)
--
[Red Hat JIRA] (ISPN-5557) Core threading redesign
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-5557?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-5557:
----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Core threading redesign
> -----------------------
>
> Key: ISPN-5557
> URL: https://issues.redhat.com/browse/ISPN-5557
> Project: Infinispan
> Issue Type: Task
> Components: Core
> Affects Versions: 7.2.2.Final
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Critical
> Fix For: 12.1.0.Final
>
>
> Infinispan needs a lot of threads, because everything is synchronous: locking, remote command invocations, cache writers. This causes various issues, from general context switching overhead to the thread pools getting full and causing deadlocks.
> We should redesign the core so that most blocking happens on the application threads, and the number of internal threads is kept to a minimum.
--
[Red Hat JIRA] (ISPN-9222) Custom clientListener filters without a need to deploy java code to Infinispan server
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-9222?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-9222:
----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Custom clientListener filters without a need to deploy java code to Infinispan server
> -------------------------------------------------------------------------------------
>
> Key: ISPN-9222
> URL: https://issues.redhat.com/browse/ISPN-9222
> Project: Infinispan
> Issue Type: Enhancement
> Components: Documentation, Hot Rod
> Affects Versions: 9.2.1.Final
> Reporter: Marek Posolda
> Assignee: Donald Naro
> Priority: Major
> Labels: redhat-summit-18
> Fix For: 12.1.0.Final
>
>
> Currently JDG has a way to register client listeners for remote Hot Rod events. There are also ways to filter the events, so that a client listener doesn't receive the filtered events it's not interested in. But it looks like filtering currently requires custom code with a CacheEventFilterFactory to be available on the JDG server side, as described in https://access.redhat.com/documentation/en-us/red_hat_jboss_data_grid/7.2... .
> I was wondering whether it's possible to have a custom filter that can filter on the fields of custom objects without deploying custom code to the Infinispan/JDG server, so that neither the object class nor the CacheEventFilterFactory is required on the JDG side. AFAIK a protobuf schema can be used to query custom objects on the JDG server side without the code of the objects being available there, so I was thinking about something similar.
> More details: Let's assume that on HotRod client side, I have entity like this:
> {code}
> public class UserEntity {
> private String username;
> private String email;
> private String country;
> }
> {code}
> I would be able to create a client listener like this (there is no need to deploy "protobuf-factory"; it would be available on JDG out of the box):
> {code}
> @org.infinispan.client.hotrod.annotation.ClientListener(filterFactoryName = "protobuf-factory")
> public class CustomLogListener {
> ...
> }
> {code}
> Then I would be able to register the client listener on the client side with examples like the following (just a sketch of what the filtering "pseudo-language" could look like):
> Interested only in users from the Czech Republic:
> {code}
> remoteCache.addClientListener(listener, new String[] { "country.equals('cs')" }, null);
> {code}
> Interested only in users from the Czech Republic with email addresses ending in "@redhat.com":
> {code}
> remoteCache.addClientListener(listener, new String[] { "country.equals('cs') && email.endsWith('@redhat.com')" }, null);
> {code}
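The server-side half of such a filter does not exist yet. As a self-contained illustration (not Infinispan API; the class and the one-form expression parser are hypothetical), the imagined "protobuf-factory" would have to evaluate the client-supplied expression against an entry's schema fields, which the server only knows as field names and values rather than as the UserEntity class:

```java
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: the server sees entries as schema fields (here a map),
// not as the client's UserEntity class, so a filter must work on field names.
public class FieldFilterSketch {

    // Parses a tiny "field.equals('value')" expression into a predicate.
    // Only this one form is supported; it stands in for a real expression language.
    static Predicate<Map<String, String>> parse(String expr) {
        int dot = expr.indexOf(".equals('");
        String field = expr.substring(0, dot);
        String value = expr.substring(dot + ".equals('".length(), expr.length() - 2);
        return fields -> value.equals(fields.get(field));
    }

    public static void main(String[] args) {
        Predicate<Map<String, String>> filter = parse("country.equals('cs')");
        System.out.println(filter.test(Map.of("username", "mposolda", "country", "cs"))); // true
        System.out.println(filter.test(Map.of("username", "other", "country", "de")));    // false
    }
}
```

The point of the sketch is only that the server can match such an expression without ever loading the entity class, which is what would remove the deployment requirement.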
--
[Red Hat JIRA] (ISPN-12261) Protocol server transport management
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-12261?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-12261:
-----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Protocol server transport management
> -------------------------------------
>
> Key: ISPN-12261
> URL: https://issues.redhat.com/browse/ISPN-12261
> Project: Infinispan
> Issue Type: Feature Request
> Components: CLI, JMX, reporting and management, Server
> Affects Versions: 12.0.0.Dev02
> Reporter: Tristan Tarrant
> Assignee: Tristan Tarrant
> Priority: Major
> Fix For: 12.1.0.Final
>
>
> The WildFly-based server had the ability to stop/start a transport via the CLI (ISPN-11240).
> The new server should have a similar capability.
> h4. Protocol management via CLI
> {noformat}
> $ cli.sh server connector ls
> $ cli.sh server connector describe endpoint-default
> $ cli.sh server connector stop endpoint-default
> $ cli.sh server connector start endpoint-default
> {noformat}
> Aside from start/stop, we should also leverage Netty's ipfilter handler, which allows filtering based on subnet, so that traffic can be blocked selectively.
> h4. IP Filtering
> {code:xml}
> <endpoints socket-binding="default" security-realm="default">
> <ip-filter>
> <reject from="172.16.0.0/16"/>
> <accept from="127.0.0.0/8"/>
> </ip-filter>
> <hotrod-connector/>
> <rest-connector/>
> </endpoints>
> {code}
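The accept/reject rules above boil down to one CIDR subnet match per rule. A JDK-only sketch of that match (independent of Netty's actual ipfilter classes; IPv4 only, class name hypothetical):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of the CIDR match underlying subnet-based IP filter rules (IPv4 only).
public class CidrMatch {

    // Returns true if the address falls inside the "a.b.c.d/prefix" subnet.
    static boolean matches(String cidr, String address) {
        try {
            String[] parts = cidr.split("/");
            int prefix = Integer.parseInt(parts[1]);
            int net = toInt(InetAddress.getByName(parts[0]).getAddress());
            int addr = toInt(InetAddress.getByName(address).getAddress());
            int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
            return (net & mask) == (addr & mask);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }

    private static int toInt(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16) | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(matches("172.16.0.0/16", "172.16.5.10")); // inside the rejected range: true
        System.out.println(matches("127.0.0.0/8", "10.0.0.1"));      // outside the accepted range: false
    }
}
```

A real implementation would evaluate the rules in document order and apply the first match, which is what the `<reject>`/`<accept>` ordering in the configuration implies.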
--
[Red Hat JIRA] (ISPN-11903) Overall process complete event for rebalance and conflict resolution
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11903?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11903:
-----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Overall process complete event for rebalance and conflict resolution
> --------------------------------------------------------------------
>
> Key: ISPN-11903
> URL: https://issues.redhat.com/browse/ISPN-11903
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 10.1.7.Final
> Reporter: Prakash Kolandaivelu
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 12.1.0.Final
>
>
> 09:51:44.207 [remote-thread--p3-t5] INFO org.infinispan.CLUSTER - [Context=transactional-type] ISPN100010: Finished rebalance with members [prakash-1092, prakash-63507], topology id 13
> 09:51:44.265 [persistence-thread--p6-t2] INFO org.infinispan.CLUSTER - [Context=devices] ISPN100013: Finished conflict resolution with members [prakash-1092, prakash-63507], topology id 8
> The completion of the overall rebalance and conflict resolution is clearly printed in the log. It would be really useful to have an event notifying about the same.
--
[Red Hat JIRA] (ISPN-10309) Convert Remaining Parts to Non Blocking & Reduce Thread Pools
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-10309?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-10309:
-----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> Convert Remaining Parts to Non Blocking & Reduce Thread Pools
> -------------------------------------------------------------
>
> Key: ISPN-10309
> URL: https://issues.redhat.com/browse/ISPN-10309
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 12.1.0.Final
>
>
> We would love to get our thread pools down to a single CPU thread pool (size = numCores) and a blocking thread pool (arbitrarily large). We may also require a scheduler pool for various operations as well (limited size, 1-2?).
> To do this we need to remove as much of our blocking code as possible. The likely sources of blocking are mostly locks and I/O operations.
> The persistence layer was completed with ISPN-9722, so that is not an issue.
> The requirement around locking can be relaxed if the locks are guaranteed to be small in scope and do not wrap other blocking operations; an example would be a lock such as the ones in CHM, as long as we don't hold them across large blocks in functional argument types.
> If code cannot be made non-blocking, we must offload the operation to the blocking thread pool. Care must be taken to ensure that once the blocking portion of the code completes, we switch back to the CPU thread pool as soon as possible. The listener API, for example, violates this and will run Infinispan code from whatever thread completes the listener, which could be a user thread.
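The offload-then-switch-back pattern described above can be sketched with plain JDK executors (pool names and sizes are illustrative, not Infinispan's actual pools):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of offloading a blocking call to a dedicated pool and hopping
// back to the CPU pool for the non-blocking continuation.
public class OffloadSketch {

    static final ExecutorService cpuPool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    static final ExecutorService blockingPool = Executors.newCachedThreadPool();

    static CompletableFuture<String> readEntry(String key) {
        return CompletableFuture
                // Blocking work (e.g. store I/O) runs only on the blocking pool...
                .supplyAsync(() -> blockingLoad(key), blockingPool)
                // ...and the continuation hops back to the CPU pool immediately.
                .thenApplyAsync(String::toUpperCase, cpuPool);
    }

    private static String blockingLoad(String key) {
        try {
            Thread.sleep(10); // stand-in for blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    public static void main(String[] args) {
        System.out.println(readEntry("k1").join()); // VALUE-OF-K1
        cpuPool.shutdown();
        blockingPool.shutdown();
    }
}
```

The listener-API violation mentioned above is exactly a missing `thenApplyAsync(..., cpuPool)` hop: the continuation runs on whichever thread completed the future.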
--
[Red Hat JIRA] (ISPN-5938) ClusterListenerReplInitialStateTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent fails randomly
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-5938?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-5938:
----------------------------------
Fix Version/s: 12.1.0.Final
(was: 12.0.0.Final)
> ClusterListenerReplInitialStateTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent fails randomly
> -------------------------------------------------------------------------------------------------
>
> Key: ISPN-5938
> URL: https://issues.redhat.com/browse/ISPN-5938
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 11.0.0.Alpha1
> Reporter: Roman Macor
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 12.1.0.Final
>
>
> ClusterListenerReplInitialStateTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent fails randomly with:
> Stacktrace
> java.util.concurrent.TimeoutException
> at java.util.concurrent.FutureTask.get(FutureTask.java:205)
> at org.infinispan.notifications.cachelistener.cluster.ClusterListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent(ClusterListenerReplTest.java:123)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
--