[JBoss JIRA] (ISPN-3731) Multicast messages can be replayed to new node
by Radim Vansa (JIRA)
Radim Vansa created ISPN-3731:
---------------------------------
Summary: Multicast messages can be replayed to new node
Key: ISPN-3731
URL: https://issues.jboss.org/browse/ISPN-3731
Project: Infinispan
Issue Type: Bug
Affects Versions: 6.0.0.Final
Reporter: Radim Vansa
Assignee: Mircea Markus
Priority: Critical
Messages that target all current members are sent as multicast messages.
However, retransmissions of these messages can be replayed on nodes that have just joined the cluster.
This can result, for example, in an already completed transaction being executed on the new node, which can cause data inconsistency for entries that the new node owns as a backup: the replayed transaction sequence authoritatively overwrites them.
The node should remember the first topologyId it has seen and should not execute any command that has a lower topologyId.
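The proposed check could look roughly like the sketch below. This is only an illustration of the idea, not Infinispan code: the class TopologyGuard and its methods are hypothetical names, and the sketch assumes topology ids are non-negative integers that only grow.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical guard (not part of Infinispan): remembers the first topology id
 * this node has seen and rejects commands stamped with an older one.
 * Assumes topology ids are non-negative.
 */
final class TopologyGuard {
    private static final int UNSET = -1;
    private final AtomicInteger firstTopologyId = new AtomicInteger(UNSET);

    /** Records the first topology id seen; later calls leave it unchanged. */
    void onTopologyInstalled(int topologyId) {
        firstTopologyId.compareAndSet(UNSET, topologyId);
    }

    /** Returns true if a command carrying this topology id may be executed here. */
    boolean shouldExecute(int commandTopologyId) {
        int first = firstTopologyId.get();
        // Until a topology is known nothing can be validated, so reject.
        return first != UNSET && commandTopologyId >= first;
    }
}
```

With such a guard, a replayed multicast command that was originally sent in an older topology would be dropped on the joiner instead of being executed.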
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 1 month
[JBoss JIRA] (ISPN-3702) Too many threads for cleaning up infinispan transactions
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3702?page=com.atlassian.jira.plugin.... ]
Mircea Markus commented on ISPN-3702:
-------------------------------------
Makes sense. It would be good to make the thread pool factory configurable, so that the threads can be shared for your use case.
> Too many threads for cleaning up infinispan transactions
> --------------------------------------------------------
>
> Key: ISPN-3702
> URL: https://issues.jboss.org/browse/ISPN-3702
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache, Transactions
> Affects Versions: 5.3.0.Final
> Environment: Mac and Linux
> Reporter: Prasanth Pallamreddy
> Assignee: Mircea Markus
>
> When using multiple transactional caches, we are seeing that each cache has a dedicated cleanup thread. While this is not an issue for a small number of caches, when the number of caches is high, as in our case (~100), we see around 100 threads dedicated to cleanup, like the following.
> "TxCleanupService,{default}_{XXX},user-mac-54275" daemon prio=5 tid=0x00007fa0f50d3800 nid=0x10f03 waiting on condition [0x00000001a5a5d000]
> "TxCleanupService,{default}_{XXX},user-mac-54275" daemon prio=5 tid=0x00007fa0f507e800 nid=0x10e03 waiting on condition [0x00000001a595a000]
> "TxCleanupService,{default}_{XXX},user-mac-54275" daemon prio=5 tid=0x00007fa0f507e000 nid=0x10d03 waiting on condition [0x00000001a5857000]
> "TxCleanupService,{default}_{XXX},user-mac-54275" daemon prio=5 tid=0x00007fa0f5817800 nid=0x10c03 waiting on condition [0x00000001a5754000]
> ...
> Looking at the source code for
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/o...
> if (!totalOrder) {
>    // Periodically run a task to cleanup the transaction table from completed transactions.
>    ThreadFactory tf = new ThreadFactory() {
>       @Override
>       public Thread newThread(Runnable r) {
>          String address = rpcManager != null ? rpcManager.getTransport().getAddress().toString() : "local";
>          Thread th = new Thread(r, "TxCleanupService," + cacheName + "," + address);
>          th.setDaemon(true);
>          return th;
>       }
>    };
>    executorService = Executors.newSingleThreadScheduledExecutor(tf);
> This code could benefit from drawing its threads from a dedicated, bounded pool.
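One way to picture the suggestion is a single bounded scheduler shared by all caches, as in the sketch below. It is an illustration only, not the fix that went into Infinispan: SharedTxCleanup, register and the thread name prefix are hypothetical, and only standard java.util.concurrent calls are used.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical shared scheduler: one bounded pool serves the cleanup tasks of every cache. */
final class SharedTxCleanup {
    private final ScheduledExecutorService pool;

    SharedTxCleanup(int maxThreads) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory tf = r -> {
            Thread th = new Thread(r, "TxCleanupService-shared-" + counter.incrementAndGet());
            th.setDaemon(true);
            return th;
        };
        // A scheduled pool never grows beyond its core size, so maxThreads is a hard bound.
        this.pool = Executors.newScheduledThreadPool(maxThreads, tf);
    }

    /** Each cache registers its periodic cleanup task instead of creating its own executor. */
    ScheduledFuture<?> register(Runnable cleanupTask, long periodMillis) {
        return pool.scheduleWithFixedDelay(cleanupTask, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

With around 100 caches, each cache would call register once, and the number of cleanup threads would stay at maxThreads regardless of how many caches exist. The trade-off is that a slow cleanup task can delay the tasks of other caches sharing the pool.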
11 years, 1 month
[JBoss JIRA] (ISPN-3729) Minimize the number of moved segments for SyncConsistentHashFactory
by Dan Berindei (JIRA)
Dan Berindei created ISPN-3729:
----------------------------------
Summary: Minimize the number of moved segments for SyncConsistentHashFactory
Key: ISPN-3729
URL: https://issues.jboss.org/browse/ISPN-3729
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 6.0.0.Final
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 7.0.0.Final
SyncConsistentHash uses an algorithm that's similar to consistent hashing, but when there is a collision (two nodes map to the same segment), the second node is moved to the next segment. Since the nodes are ordered by their UUID, that means it's possible for a joiner to change the mapping of existing nodes.
In order to make the load distribution more even, SyncConsistentHash also uses "virtual nodes": each node actually maps to multiple segments. This makes the number of collisions much higher (and implicitly, the number of extra moved segments).
Reading the original [consistent hashing paper|http://thor.cs.ucsb.edu/~ravenben/papers/coreos/kll%2B97.pdf], it looks like the collision handling should be done differently: a joiner should replace an existing node when it's "closer" to the segment boundary, but the existing node should never "move" to another segment (the property of monotonicity mentioned in the paper). We should investigate whether changing this would allow us to achieve better load balancing by using a much higher number of "virtual nodes" (without moving extra segments). If successful, we could even use SyncConsistentHashFactory as the default hash algorithm.
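The monotonicity property discussed above can be illustrated with a plain ring-style consistent hash, sketched below. This is not the SyncConsistentHashFactory algorithm and it ignores backup owners: MonotonicSegmentHash, its mixing function and the tie-break rule (the smaller node name keeps a contested position) are assumptions made for the example. The point it shows is that a node which loses a position is simply not used there and is never shifted to another segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: primary owner per segment from a ring of virtual nodes. */
final class MonotonicSegmentHash {
    private final int numSegments;
    private final int virtualNodes;
    private final TreeMap<Long, String> ring = new TreeMap<>();

    MonotonicSegmentHash(int numSegments, int virtualNodes) {
        this.numSegments = numSegments;
        this.virtualNodes = virtualNodes;
    }

    void addNode(String node) {
        for (int v = 0; v < virtualNodes; v++) {
            // On an exact position collision the smaller name keeps the spot;
            // the loser is dropped there rather than moved elsewhere.
            ring.merge(position(node, v), node, (a, b) -> a.compareTo(b) <= 0 ? a : b);
        }
    }

    /** Owner of a segment is the first virtual node at or after the segment start, wrapping around. */
    List<String> segmentOwners() {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes");
        }
        List<String> owners = new ArrayList<>(numSegments);
        long segmentSize = Long.MAX_VALUE / numSegments;
        for (int s = 0; s < numSegments; s++) {
            Map.Entry<Long, String> e = ring.ceilingEntry(s * segmentSize);
            if (e == null) {
                e = ring.firstEntry();
            }
            owners.add(e.getValue());
        }
        return owners;
    }

    /** Deterministic 64-bit mix of (node, replica index), made non-negative. */
    private static long position(String node, int v) {
        long h = (((long) node.hashCode()) << 32) ^ v;
        h += 0x9E3779B97F4A7C15L;
        h = (h ^ (h >>> 30)) * 0xBF58476D1CE4E5B9L;
        h = (h ^ (h >>> 27)) * 0x94D049BB133111EBL;
        h ^= h >>> 31;
        return h >>> 1;
    }
}
```

Under these assumptions, when a node joins, every segment either keeps its previous owner or is taken over by the joiner, so raising the number of virtual nodes does not cause extra movement between existing nodes.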
11 years, 1 month