DLD continued/optimizing modifications within a tx
by Mircea Markus
Hi,
Right now we support DLD in the following scenarios: local caches,
symmetric tx that create deadlocks over two replicated caches (symmetric
in the sense that A replicates to B and B replicates to A at the same
time).
In the case of distribution (and replication as well, especially the async
one) there might still be a deadlock if both A and B replicate to C
transactions that lock the same keys in different orders; the deadlock
then occurs on C.
Let's take an example, A wants to replicate Tx1 that affects (key1,
key2) in this sequence. B wants to replicate Tx2 that affects (key2,
key1) in this sequence. While replicating on C, Tx1 and Tx2 would result
in a deadlock (classic scenario).
Now a simple way of solving this (if I'm not missing something!!) is to
order the keys in the replicated transaction based on some criterion
(e.g. lexicographically) and apply them in that same sequence on every
node: Tx1(key1, key2) and Tx2(key1, key2). This will avoid deadlocks ->
increased throughput.
Now, at the core of this approach is the assumption that the order of
operations within the transaction is not relevant - which does not hold
in the current implementation. E.g. let's say we have a tx that does
(remove(key), put(key, 'otherVal')): if we change the order, the result
is totally different - not good! A way to avoid this (and also to reduce
the amount of replication, by compacting changes by key) is to keep the
modifications in a Map<key, Modification>. For each key we keep only the
last value set within the tx, so if we modify the same key 1000 times
within a tx we only replicate the last modification, versus the current
approach where all 1000 operations are replicated. If a key is deleted,
we still keep it in the map, with a marker for deletion. This way we only
replicate the delta of modifications, and the order of operations is no
longer relevant, so that we can apply the key ordering described above
and avoid the deadlock.
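Roughly what I have in mind, as a sketch (the class names are made up and
I'm assuming String keys just to illustrate the compaction + deterministic
ordering; the real ordering criterion would have to work for arbitrary keys):

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: collapse a tx's operations into one modification per
// key, then apply the delta in a deterministic (sorted) key order on the
// receiving node.
class TxModifications {

   // marker meaning "this key was removed within the tx"
   static final Object REMOVED = new Object();

   // only the last modification per key survives; 1000 puts on the same key
   // replicate as a single entry
   private final Map<String, Object> modifications = new LinkedHashMap<>();

   void put(String key, Object value) { modifications.put(key, value); }

   void remove(String key) { modifications.put(key, REMOVED); }

   // on the node replaying the tx (e.g. C): apply in sorted key order so that
   // Tx1 and Tx2 always acquire locks in the same sequence
   void applyTo(Map<String, Object> cache) {
      for (Map.Entry<String, Object> e : new TreeMap<>(modifications).entrySet()) {
         if (e.getValue() == REMOVED) {
            cache.remove(e.getKey());
         } else {
            cache.put(e.getKey(), e.getValue());
         }
      }
   }
}
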
The advantages of this would be higher throughput, by reducing the chance
of 'asymmetric' deadlocks (A and B encounter a deadlock while replicating
on C), and a possible reduction in the size of the transaction (if it has
multiple operations on the same key). The drawback is mainly that some
additional computation is needed to manage the map content (not a lot I
reckon: instead of a List.add we'll do a map.get() + map.put() for each
modification within the transaction).
Wdyt?
Cheers,
Mircea
Maven.assembly.useTransitiveDependencies true or false? WAS Re: More demo
by Galder Zamarreno
Hmmmmm, this has a domino effect. Because s3 module depends on
infinispan-core, it now pulls all its dependencies, so we need to
exclude them somehow, cos they're dependencies of the core module and
hence they should reside there rather than have them in each and every
module.
Personally, in spite of the repetition, I prefer explicit dependency
definition rather than letting maven bring all transitive dependencies
and then having to go and do exclusions.
More globally, I think this needs to be put to the infinispan-dev list.
During assembly, do we either:
- Leave useTransitiveDependencies as it is, which is false, and make
modules like s3 explicitly define the 3rd party dependencies that
jclouds depends on.
- Or set useTransitiveDependencies to true so that they're all brought
in, and exclude direct infinispan-core dependencies from getting to the
lib directories of submodules such as s3.
For the moment, and to make my life easier getting the demo working, I'm
using the latter option.
Thoughts?
On 06/30/2009 05:47 PM, Adrian Cole wrote:
> beware of the assemblies... they are extremely crufty compared to ant ;)
>
> On Tue, Jun 30, 2009 at 5:45 PM, Galder Zamarreno<
> galder.zamarreno(a)redhat.com> wrote:
>
>>
>> On 06/30/2009 05:22 PM, Adrian Cole wrote:
>>
>>> Well.. :) I setup the dependencies of jclouds, so it shouldn't bring
>>> anything crazy in! Besides, a missing dep might be symptomatic of a
>>> larger
>>> issue in the distribution assembly.
>>>
>> Ahh, maybe just found the reason why transitive dependencies are not
>> included:
>>
>> src/main/resources/assemblies/bin.xml:
>> <useTransitiveDependencies>false</useTransitiveDependencies>
>>
>> I'm trying out with that parameter set to true.
>>
>>
>>
>>> Anyway, let me know how it goes.
>>>
>>> on the SNAPSHOT bit, it was irrelevant to the topic. no worries on that.
>>>
>>> Thanks again for sorting this out, Galder.
>>> -a
>>>
>>> On Tue, Jun 30, 2009 at 5:11 PM, Galder Zamarreno<
>>> galder.zamarreno(a)redhat.com> wrote:
>>>
>>>
>>>> On 06/30/2009 05:00 PM, Galder Zamarreno wrote:
>>>>
>>>>
>>>>> On 06/30/2009 04:55 PM, Adrian Cole wrote:
>>>>>
>>>>> many thanks, Galder. It should be a transitive dep from jclouds-s3...
>>>>>> mvn
>>>>>> dependency:tree
>>>>>>
>>>>>> maybe the distribution isn't pulling in deps properly.
>>>>>>
>>>>>> Yeah, that looks to be the issue. At the moment I'm trying to redefine
>>>>> them in the s3 project but maybe there's another way to do this? Google
>>>>> to the rescue...
>>>>>
>>>>> Hmmm, even though it's repetition, I think it might be safer to
>>>> redefine
>>>> the transitive dependencies rather than let it bring them all. You never
>>>> know what Maven might bring with it...
>>>>
>>>>
>>>>
>>>> I will be cutting a new beta within the next couple days, so maybe you
>>>>>> can
>>>>>> switch temporarily to snapshot to ensure it will work?
>>>>>>
>>>>>> Hmmm, why would I need a snapshot of jclouds? I've simply been adding
>>>>> the transitive dependencies manually in s3/pom.xml
>>>>>
>>>>>
>>>>> Thanks,
>>>>>> -Adrian
>>>>>>
>>>>>> On Tue, Jun 30, 2009 at 4:40 PM, Galder Zamarreno<
>>>>>> galder.zamarreno(a)redhat.com> wrote:
>>>>>>
>>>>>> Hi Adrian,
>>>>>>
>>>>>>> I'm trying to make further progress with the demo and I'm
>>>>>>> encountering the
>>>>>>> following issue:
>>>>>>>
>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>> org.jclouds.logging.config.LoggingModule
>>>>>>>
>>>>>>> The list of jars that is distributed does not include
>>>>>>> jclouds-core-1.0-beta-1.jar and this looks to be cos there's no direct
>>>>>>> dependency on this jar in the pom.xml for the s3 cache store module.
>>>>>>>
>>>>>>> FYI: I'm adding this dependency. I think there's a similar issue with
>>>>>>> google guice...
>>>>>>>
>>>>>>> Anyway, just letting you know of issues as I find them.
>>>>>>>
>>>>>>> See ya,
>>>>>>> --
>>>>>>> Galder Zamarreño
>>>>>>> Sr. Software Engineer
>>>>>>> Infinispan, JBoss Cache
>>>>>>>
>>>>>>>
>>>>>>> --
>>>> Galder Zamarreño
>>>> Sr. Software Engineer
>>>> Infinispan, JBoss Cache
>>>>
>>>>
>> --
>> Galder Zamarreño
>> Sr. Software Engineer
>> Infinispan, JBoss Cache
>>
>
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
DeadlockDetection (DLD) - benchmarks and status
by Mircea Markus
Hi,
I've extended the original DLD design to also support deadlock detection
on local caches and updated the design forum [1].
This, together with the replicated deadlock detection, is implemented in
trunk (some minor stuff still to do: DLD for aggregation methods like
clear and addAll + unit tests).
I've also created a benchmark to compare the throughput (tx/min) of
caches running with and without DLD.
You can find the full test description within the test class:
http://anonsvn.jboss.org/repos/infinispan/trunk/core/src/test/java//org/i...
Local DLD does a good job (cca 5.5 times better) but replicated DLD does
extraordinarily well: cca 101 times better throughput (see attached).
I think DLD is cool stuff and differentiates us a bit from the
competition; afaik none of them have DLD.
One more thing that's worth mentioning: while running the DLD tests I've
noticed that if all txs acquire locks on keys in the same order, then no
deadlocks can occur. This is logical and might seem obvious, but it is
not stated anywhere, and the performance increase from doing this can be
very dramatic. I think we should document this as a best practice when it
comes to transactions - any suggestion where?
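Roughly the sort of snippet such a doc could contain (the plain Map
standing in for the cache and all the names are just illustrative; the
point is only the ordering):

import java.util.Map;
import java.util.TreeMap;

// Illustrative only: whatever keys a tx touches, touch them in one agreed
// order (here the keys' natural order) so two txs can never lock in
// opposite orders.
class OrderedTxUpdate {

   static void update(Map<String, String> cache, Map<String, String> changes) {
      // tm.begin();  // assuming the usual JTA TransactionManager is in scope
      for (Map.Entry<String, String> e : new TreeMap<>(changes).entrySet()) {
         cache.put(e.getKey(), e.getValue()); // locks acquired in ascending key order
      }
      // tm.commit();
   }
}
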
I also intend to blog about it shortly.
Cheers,
Mircea
[1] http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4244838#4244838
JPA and the query API
by Manik Surtani
Guys
I was just wondering whether it makes sense to make the Query API a
public one if we're going to support querying via the JPA interface.
Does it make more sense to only allow querying via the JPA API?
What do folks think? One less API to support?
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
Distribution, take 2
by Manik Surtani
So another good thing that came out of this week's travelling between
JUGs is that I was able to sit down with Adrian Cole in Berlin and,
over a record-breaking number of coffees in a single sitting, we were
able to rehash the distribution algorithm, to remove the weird
complexities and race conditions we have in the current setup. I've
made a note of this below - please have a read through and provide
feedback.
Cheers,
Manik
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
DIST, take 2!
Previous design failed due to two flaws:
1. Transaction logs maintained on the sender, for each recipient.
Plenty of scope for races, or heavy synchronization.
2. Consistent hash attempted to be overly fair in evenly dispersing
nodes across the hash space. This meant there was often a large and
unnecessary amount of rehashing to do, which exacerbated the problem
in 1.
So we have a new approach, based on the following premises.
1. Consistent hash (CH) based on fixed positions in the hash space
rather than relative ones.
1.1. Pros: limited and finite rehashing, particularly when
there is a leave.
1.1.1. If the leaver is L, only node (L - 1) and node (L + 1) will
have to push state, and only (L + 1) and (L + replCount) will have to
receive state.
1.2. Cons: uneven spread of load (mitigated with grid size)
1.3. Virtual nodes are a good way to reduce this, but leaving VN
out for now since it adds complexity. VN won't change the rest of
the algorithm anyway.
2. State is both pulled (on JOIN) and pushed (on LEAVE)
3. State transfers are implemented using RPC commands and byte
array payloads
3.1. Will later evolve into streaming scheme
4. Implementation notes:
4.1. Each node has a DistributionManager (DM)
4.2. Each DM has a RehashExecutor (with 1 low prio thread)
4.3. LeaveTask and JoinTask are 2 types of tasks handled by this
executor encapsulating the processing that takes place when there is a
leave or join.
4.4. InstallConsistentHashCommand - an RPC command that "installs" a
consistent hash instance on remote nodes.
4.5. GetConsistentHashCommand - an RPC command that "asks" a node to
serialize and transmit across its current CH impl.
4.6. PullStateCommand - an RPC command that "asks" for state.
Command should have an Address of the requestor. Return value for
this command is the state.
4.7. PushStateCommand - an RPC command that "pushes" state by
providing a payload of state.
4.8. Current RPC responses are either SuccessfulResponses or
UnsuccessfulResponses. Introducing a new type, UnsureResponse, to deal
with the eventual consistency semantics of concurrent operations during a
rehash. ClusteredGetResponseFilter should look for a quorum of
UnsureResponses.
5. JoinTask: This is a PULL based rehash. JoinTask is kicked off
on the JOINER.
5.1. Obtain OLD_CH from coordinator (using
GetConsistentHashCommand)
5.2. Generate TEMP_CH (which is a union of OLD_CH and NEW_CH)
5.3. Broadcast TEMP_CH across the cluster (using
InstallConsistentHashCommand)
5.4. Log all incoming writes/txs and respond with a positive ack.
5.5. Ignore incoming reads, forcing callers to check next owner of
data.
5.6. Ping each node in OLD_CH's view and ask for state
(PullStateCommand)
5.7. Apply state received from 5.6.
5.8. Drain tx log and apply, stop logging writes once drained.
5.9. Reverse 5.5.
5.10. Broadcast NEW_CH so this is applied (using
InstallConsistentHashCommand)
5.11. Loop through data container and unicast invalidations for keys
that "could" exist on OLD_CH and not in NEW_CH
6. When a leave is detected (view change with 1 less member):
6.1. Make a note of the Leaver (L)
6.2. All nodes switch to a NEW_CH, which does not include L
6.3. Nodes (L - replCount + 1), (L + 1) and (L + replCount) kick
off a LeaveTask.
6.3.1. Not everyone in the cluster need be involved in this
rehash thanks to fixed CH positions.
7. LeaveTask: This is PUSH based
7.1. If an existing LeaveTask is running, the existing task should
be cancelled first.
7.2. if is_receiver(): Log all writes/incoming txs
7.3. if is_receiver(): Incoming reads:
7.3.1. If the key was owned by L in OLD_CH and not by self, and in
NEW_CH is owned by self, return UnsureResponse. Else,
SuccessfulResponse.
7.4. if is_pusher():
7.4.1. Iterate over all keys in data container. If any key maps
to L in OLD_CH, that key needs to be rehashed. Call
addState(stateMap, k). See below for details of this algo.
7.4.2. Call PushStateCommand for each recipient in stateMap.
7.5. if is_receiver() and on receiving PushStateCommand
7.5.1. Drain tx log and apply, stop logging writes once drained.
7.5.2. Reverse 7.3.
8. A state map is a data structure representing state generated by
each node to be pushed.
8.1. StateMap{
Map<Address, Map<Object, InternalCacheEntry>> state
}
8.2. addState(stateMap, k):
        OL = OLD_CH.locate(k)
        if (OL.contains(L) and
            ((position(L, OL) == last and current_address == L - 1) or
             (position(L, OL) != last and current_address == L + 1))) {
          stateMap.add(NEW_CH.locate(k) - OL, k)
          break
        }
     (a rough Java sketch of 8, 8.2 and 9 follows after this list)
9. Functions to determine who pushes and who receives state during
a leave:
9.1. is_pusher(address): return address == (L + 1) or (L - 1)
9.2. is_receiver(address): return address == (L + 1) or (L +
replCount)
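To make 8, 8.2 and 9 a bit more concrete, here is a rough Java sketch.
All class and method names are illustrative, not actual Infinispan
classes, and it assumes the OLD view's member list (including L) sorted
by fixed hash position:

import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

interface ConsistentHash {
   // owners of k in hash order; the first is the node at/after k's fixed position
   List<Address> locate(Object k, int replCount);
}

class Address { /* placeholder for a node address */ }

class StateMap {
   // recipient -> (key -> entry); entries kept as Object here for brevity
   final Map<Address, Map<Object, Object>> state = new HashMap<>();

   void add(List<Address> recipients, Object key, Object entry) {
      for (Address a : recipients)
         state.computeIfAbsent(a, x -> new HashMap<>()).put(key, entry);
   }
}

class LeaveRehash {
   ConsistentHash oldCH, newCH;
   Address leaver, self;
   int replCount;
   List<Address> membersInHashOrder; // OLD view incl. leaver, sorted by fixed hash position

   // 8.2: decide whether *this* node pushes k, and to whom
   void addState(StateMap stateMap, Object key, Object entry) {
      List<Address> oldOwners = oldCH.locate(key, replCount);
      if (!oldOwners.contains(leaver)) return;          // leaver never owned k

      boolean leaverWasLast = oldOwners.indexOf(leaver) == oldOwners.size() - 1;
      boolean iPush = (leaverWasLast && self.equals(predecessorOf(leaver)))
            || (!leaverWasLast && self.equals(successorOf(leaver)));
      if (!iPush) return;

      // recipients = NEW_CH.locate(k) - OL
      List<Address> recipients = new LinkedList<>(newCH.locate(key, replCount));
      recipients.removeAll(oldOwners);
      stateMap.add(recipients, key, entry);
   }

   // 9.1 / 9.2
   boolean isPusher(Address a) {
      return a.equals(successorOf(leaver)) || a.equals(predecessorOf(leaver));
   }

   boolean isReceiver(Address a) {
      return a.equals(successorOf(leaver)) || a.equals(nthSuccessorOf(leaver, replCount));
   }

   Address successorOf(Address a)   { return nthSuccessorOf(a, 1); }
   Address predecessorOf(Address a) { return nthSuccessorOf(a, membersInHashOrder.size() - 1); }

   Address nthSuccessorOf(Address a, int n) {
      int i = membersInHashOrder.indexOf(a);
      return membersInHashOrder.get((i + n) % membersInHashOrder.size());
   }
}
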
Concurrency
1. Deal with concurrent JOINs
1.1. Concurrent JOINs in completely disjoint parts of the hash space
will have no overlap whatsoever.
1.2. Concurrent JOINs in the same hash space would work as well,
since JOINs are controlled by the joiner.
2. Deal with concurrent LEAVEs
2.1. Concurrent LEAVEs in completely disjoint parts of the hash
space will have no overlap whatsoever.
2.2. Concurrent LEAVEs that do not affect the same nodes involved in
providing or receiving state will not be a problem.
2.3. Concurrent LEAVES affecting the same nodes already providing or
receiving state will interrupt the LeaveTasks on those nodes. Leave
tasks will need to clean up appropriately, which may involve issuing a
"cleanup command" to ensure partially transmitted state is reconciled.
3. Deal with concurrent LEAVE and JOIN
3.1. Concurrent JOIN and LEAVEs in completely disjoint parts of the
hash space will have no overlap whatsoever.
3.2. For overlapping areas of the hash space, this needs further thought.
ISPN-105 and ISPN-106
by Klaus Friedel
Hi !
I recently submitted two RFEs to enhance the Jdbc-Store(s).
The first one suggests adding support for JNDI DataSources.
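For that one, I'm thinking of something along these lines (the class name
and the overall shape are just a sketch, not the actual cache store code;
only the JNDI lookup itself is the point):

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.SQLException;

// Sketch only: obtain connections from a container-managed DataSource looked
// up via JNDI instead of creating them from a JDBC URL + driver class.
class JndiConnectionFactory {

   private final DataSource dataSource;

   JndiConnectionFactory(String jndiName) throws NamingException {
      // e.g. jndiName = "java:/DefaultDS" on JBoss AS
      this.dataSource = (DataSource) new InitialContext().lookup(jndiName);
   }

   Connection getConnection() throws SQLException {
      return dataSource.getConnection();
   }
}
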
The second one suggests extending the backing JDBC table with a
"cacheName" column to make CacheManagers connected with a Jdbc-Store
usable for more than one Cache.
I could spend some time implementing these.
How should I contribute?
Klaus Friedel
Re: [infinispan-dev] [jboss-dev] Infinispan configuration - all in one solution
by Vladimir Blagojevic
On 7/17/09 5:16 PM, Bill Burke wrote:
> Wouldn't it be better to write a javadoc-like generator that
> introspected for JAXB and Inifispan annotations instead of writing your
> own XML parser and schema generator? Or maybe I'm misunderstanding your
> post.
>
> In other words, use JAXB for marshalling/unmarshalling/schema
> generation. Use inifinispan javadoc generator for documentation.
>
I looked around a bit more and saw that in JAXB 2.x we now have
annotations as well :) If we annotate our configuration beans with JAXB
annotations I suppose we can do exactly what you suggest - use JAXB for
marshalling/unmarshalling/schema generation and at the same time use
those annotations in a custom tool that creates the reference documentation.
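For example, something like this (the element/attribute names are just an
illustration, not the actual Infinispan configuration beans):

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlRootElement;

// Sketch of a JAXB-annotated configuration bean: the same annotations drive
// unmarshalling of the XML, schema generation (e.g. via schemagen), and could
// be introspected by a javadoc-like tool to produce reference documentation.
@XmlRootElement(name = "locking")
@XmlAccessorType(XmlAccessType.FIELD)
class LockingConfig {

   @XmlAttribute
   private long lockAcquisitionTimeout = 10000;

   @XmlAttribute
   private int concurrencyLevel = 32;

   public long getLockAcquisitionTimeout() { return lockAcquisitionTimeout; }

   public int getConcurrencyLevel() { return concurrencyLevel; }
}
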
> By creating your own XML parsing you've just complicated the maintenance
> problems for future maintainers of infinispan.
>
I agree.
Cheers,
Vladimir
Clustering mode attribute
by Vladimir Blagojevic
Hey dudes,
The ambiguity that we allow in the mode attribute of the <clustering>
element is not good IMHO. Normally, we use the allowedValues attribute of
the ConfigurationElement annotation to capture that a type, even though
it is a String or int or whatever, only allows a certain set of values.
We properly constrain locking, shutdownHook and what not. However, for
the mode attribute in <clustering> we allow all possible combinations of
substrings, upper case, lower case and what not.
I was wondering if we can constrain the allowed values to LOCAL,
INVALIDATION, REPL, and DIST. The presence of either an <async> or a
<sync> subelement of <clustering> determines whether it is synchronous or
not. Having these clear constraints means they will be automatically
documented in the configuration reference and reflected in the generated
schema. The downside is that current alpha customers who created their
own configuration files will have to change their values to the new ones.
WDYT?
Vladimir