Design change in Infinispan Query
by Sanne Grinovero
Hello all,
currently Infinispan Query is an interceptor registered on each
specific Cache instance which has indexing enabled; each such
interceptor does everything it needs to do solely within the scope of
the cache it was registered in.
If you enable indexing - for example - on 3 different caches, 3
different Hibernate Search engines will be started in the background,
and they are all unaware of each other.
After some design discussions with Ales for CapeDwarf, and also
returning to something that has bothered me for some time, I'd like
to evaluate the option of having a single Hibernate Search engine
registered in the CacheManager and shared across the indexed caches.
Current design limitations:
A- If they are all configured to use the same base directory to
store indexes, and happen to have same-named indexes, they'll share
the index without being aware of each other. This is going to break
unless the user configures some tricky parameters, and even then
performance won't be great: instances will lock each other out, or at
best write in alternating turns.
B- The search engine isn't particularly "heavy", but it would still
be nice to share some components and internal services.
C- Configuration details which need some care - like injecting a
JGroups channel for clustering - need to be done while properly
isolating each instance (so large parts of the configuration would be
quite similar, but not identical).
D- Incoming messages into a JGroups Receiver need to be routed not
only among indexes, but also among Engine instances. This prevents
Query from reusing code from Hibernate Search.
Problems with a unified Hibernate Search Engine:
1#- Isolation of types / indexes. If the same indexed class is
stored in different (indexed) caches, they'll share the same index.
Is that a problem? I'm tempted to consider this a good thing, but I
wonder if it would surprise some users. Would you expect that?
2#- Configuration format overhaul: indexing options won't be set in
the cache section but in the global section. I'm looking forward to
using the schema extensions anyway to provide a better configuration
experience than the current <properties />.
3#- Assuming 1# is fine, when a search hit is found I'd need to be
able to figure out from which cache the value should be loaded.
3#A we could have the cache name encoded in the index, as part
of the identifier: {PK,cacheName} (see the sketch after 3#B)
3#B we actually shard the index, keeping a physically separate
index per cache. This would mean searching on the joint index view
but extracting hits from specific indexes to keep track of "which
index"... I think we can do that, but it's definitely tricky.
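To make 3#A concrete, here is a minimal sketch of the idea (class and
method names are hypothetical, purely for illustration):

   import java.io.Serializable;
   import org.infinispan.manager.EmbeddedCacheManager;

   // Hypothetical sketch for 3#A: the index identifier carries both the
   // primary key and the name of the cache the value lives in.
   public final class CacheAwareKey implements Serializable {
      private final Object primaryKey;
      private final String cacheName;

      public CacheAwareKey(Object primaryKey, String cacheName) {
         this.primaryKey = primaryKey;
         this.cacheName = cacheName;
      }

      // On a search hit, route the load back to the originating cache.
      public Object loadValue(EmbeddedCacheManager cacheManager) {
         return cacheManager.getCache(cacheName).get(primaryKey);
      }
   }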
It's likely easier to keep indexed values from different caches in
different indexes. That would mean rejecting 1# and altering the user
defined index name, for example by appending the cache name to the
user defined string.
Any comment?
Cheers,
Sanne
Re: [infinispan-dev] Removing Infinispan dependency on the Hibernate-Infinispan module in 4.x
by Galder Zamarreño
Scott, what do you suggest doing instead then? Without the commands, evictAll invalidation won't work.
Are you suggesting that I revert back to using the cache as a notification bus so that regions are invalidated?
On Feb 8, 2012, at 4:13 PM, Scott Marlow wrote:
> http://lists.jboss.org/pipermail/infinispan-dev/2012-February/010125.html has more context.
>
> Since there are no easy/quick fixes that can be applied at this time, to remove the AS7 Infinispan dependency on the Hibernate-Infinispan module, I think we should avoid depending on the service loader way to supply the custom commands (in the Hibernate-Infinispan module), at least until this can be addressed elsewhere.
>
> I propose that the Hibernate-Infinispan second level cache should not use the Service Loader to pass custom commands into Infinispan. If we agree, I'll create a jira for this.
>
> Scott
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Map Reduce 2.0
by Vladimir Blagojevic
Hey guys,
Before moving forward with the next iteration of map reduce I wanted
to hear your thoughts about the following proposal. After we agree on
the general direction I will transcribe the agreed design onto a wiki
page and start the implementation.
Shortcomings of the current map reduce implementation
While our current map reduce implementation is more than a proof of
concept, there are several drawbacks preventing it from being an
industrial grade map reduce solution. The main drawback is the
inability of the current solution to deal with large-data (GB/TB
scale) map reduce problems. This shortcoming lies mainly in our
reduce phase execution. The reduce phase, as you might know, is
currently done on a single Infinispan master task node; the size of
the map reduce problems we can support is therefore limited to the
working memory of a single node.
Proposed solution
The proposed solution involves distributing the execution of reduce
phase tasks across the cluster, thus achieving higher reduce task
parallelization and at the same time removing the above mentioned
reduce phase restriction. By leveraging our consistent hashing
solution even further, we can parallelize the reduce phase and
elevate our map reduce solution to an industrial level. Here is how
we can achieve that.
Map phase
MapReduceTask, as it currently does, will hash task input keys and
group them by the execution node N they are hashed to. For each node
N and its grouped input KIn keys, MapReduceTask creates a
MapCombineCommand which is migrated to the execution target node N.
MapCombineCommand is similar to the current MapReduceCommand.
MapCombineCommand takes an instance of a Mapper and an instance of a
Reducer, which acts as a combiner [1].
Once loaded into the target execution node, MapCombineCommand takes
each local KIn key and executes the Mapper method void map(KIn key,
VIn value, Collector<KOut, VOut> collector). Results are collected
into a common Collector<KOut, VOut> and the combine phase is
initiated. A Combiner, if specified, takes the KOut keys and
immediately invokes the reduce phase on them. The result of the
mapping phase executed on each node is a <KOut, VOut> map M. There
will be one resulting map M per execution node N.
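For reference, here is what a Mapper with the signature above looks
like, using the usual word count example (the class is purely
illustrative, not part of the proposal):

   import org.infinispan.distexec.mapreduce.Collector;
   import org.infinispan.distexec.mapreduce.Mapper;

   // Word count mapper: emits (word, 1) for every word found in the value.
   public class WordCountMapper implements Mapper<String, String, String, Integer> {
      @Override
      public void map(String key, String value, Collector<String, Integer> collector) {
         for (String word : value.split("\\s+")) {
            collector.emit(word, 1);
         }
      }
   }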
At the end of the combine phase, instead of returning map M to the
master task node (as we currently do), we now hash each KOut in map M
and group the KOut keys by the execution node N they are hashed to.
Each group of KOut keys and its VOut values, hashed to the same node,
is wrapped with a new command, Migrate. The Migrate command, which is
very similar to PutKeyValueCommand, is executed on the Infinispan
target node N and essentially maintains a KOut -> List<VOut> mapping,
i.e. all KOut/VOut pairs from all executed MapCombineCommands will be
collocated on the node N where KOut is hashed to, and the value for
KOut will be a list of all VOut values. We essentially collect all
VOut values under each KOut for all executed MapCombineCommands.
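In rough pseudo-Java, the grouping of intermediate keys could look
like this (I'm assuming something like
DistributionManager#getPrimaryLocation is available; the exact API is
to be confirmed):

   // Sketch: group intermediate KOut keys by the node they hash to.
   Map<Address, List<KOut>> keysByNode = new HashMap<Address, List<KOut>>();
   for (KOut key : intermediateKeys) {
      Address owner = distributionManager.getPrimaryLocation(key); // assumed API
      List<KOut> keys = keysByNode.get(owner);
      if (keys == null) {
         keys = new ArrayList<KOut>();
         keysByNode.put(owner, keys);
      }
      keys.add(key);
   }
   // One Migrate command is then sent per target node with its group of keys.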
At this point MapCombineCommand has finished its execution; the list
of KOut keys is returned to the master node and its MapReduceTask. We
do not return the VOut values as we do not need them at the master
task node. MapReduceTask is ready to start the reduce phase.
Reduce phase
MapReduceTask initializes a ReduceCommand with the user specified
Reducer. The KOut keys collected from the map phase are grouped by
the execution node N they are hashed to. For each node N and its
grouped input KOut keys, MapReduceTask creates a ReduceCommand and
sends it to the node N where the KOut keys are hashed. Once loaded on
the target execution node, the ReduceCommand grabs the list of VOut
values for each KOut key and invokes:
VOut reduce(KOut reducedKey, Iterator<VOut> iter).
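Again for reference, a matching Reducer with that signature (word
count again, purely illustrative):

   import java.util.Iterator;
   import org.infinispan.distexec.mapreduce.Reducer;

   // Word count reducer: sums the partial counts collected under a word.
   public class WordCountReducer implements Reducer<String, Integer> {
      @Override
      public Integer reduce(String reducedKey, Iterator<Integer> iter) {
         int sum = 0;
         while (iter.hasNext()) {
            sum += iter.next();
         }
         return sum;
      }
   }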
The result of a ReduceCommand is a map M where each key is a KOut and
each value is a VOut. Each Infinispan execution node N returns one
map M where each key KOut is hashed to N and each VOut is KOut's
reduced value.
When all ReduceCommands return to the calling node, MapReduceTask
simply combines all these M maps and returns the final Map<KOut,
VOut> as the result of the MapReduceTask. All intermediate
KOut -> List<VOut> maps left on the Infinispan cluster are then
cleaned up.
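From the user's perspective nothing should change in how a task is
invoked; something like the following (a sketch using the current
public API, with the illustrative classes from above) would
transparently use the distributed reduce phase:

   import java.util.Map;
   import org.infinispan.Cache;
   import org.infinispan.distexec.mapreduce.MapReduceTask;

   Cache<String, String> cache = ...; // a clustered cache holding the input text
   MapReduceTask<String, String, String, Integer> task =
         new MapReduceTask<String, String, String, Integer>(cache);
   Map<String, Integer> counts = task.mappedWith(new WordCountMapper())
                                     .reducedWith(new WordCountReducer())
                                     .execute();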
[1] See section 4.3 of http://research.google.com/archive/mapreduce.html
Time for a tryLock() ?
by Galder Zamarreño
Looks like rolling back the transaction when a lock timeout is
encountered can be problematic: https://community.jboss.org/message/731307#731307
Maybe it's time to implement a tryLock() that attempts to acquire the
lock but does not roll back the transaction if it cannot acquire it?
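To make it concrete, a hypothetical signature next to the existing
AdvancedCache.lock() (the name and shape are just a suggestion):

   // Today: a lock timeout marks the transaction for rollback.
   // Hypothetical addition: report failure and leave the transaction
   // alive, letting the caller decide whether to retry or give up.
   boolean tryLock(K... keys);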
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Time for a tryPut()? :-)
by Thomas Fromm
Heyho,
Similar to the tryLock issue discussed in another thread, I have a
problem with put().
Cache.put(...) can fail for a lot of reasons, e.g.:
java.lang.RuntimeException: org.infinispan.CacheException: Member
ISNode-35671 no longer in cluster....
javax.transaction.HeuristicMixedException (different reasons)
...
Failing puts are often caused by normal cluster operations, e.g. a
new node joining or another one leaving. Under load, this causes lots
of transactions to fail.
For a single put, my solution was simply to retry (a limited number
of times) when an exception appears.
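Roughly what I do for the single-put case, as a sketch (the helper is
mine, not a proposed API; it assumes maxAttempts >= 1):

   import org.infinispan.Cache;

   // Bounded-retry workaround for a single put.
   public static <K, V> V putWithRetry(Cache<K, V> cache, K key, V value,
                                       int maxAttempts) {
      RuntimeException last = null;
      for (int attempt = 0; attempt < maxAttempts; attempt++) {
         try {
            return cache.put(key, value);
         } catch (RuntimeException e) {
            last = e; // e.g. CacheException because a member left the cluster
         }
      }
      throw last;
   }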
For transactions this is a bit more complex, since I need to "replay"
the operations of the whole failed transaction.
Do you have any best practice for these situations? Does the feature
request for an uncontended put make sense?
--tf
How the remote fetching for GET works in DIST mode
by Michal Linhard
Maybe this is an ancient topic discussed ages ago somewhere, but if
there's a quick answer, please save me from searching through tons of
archives.
Let's have a DIST mode cache.
When we're doing memcached/REST tests, for each request there's
roughly a numOwners/numNodes probability that we'll hit an owner;
with numNodes=4 and numOwners=2 this means roughly every second
request needs a remote fetch (assuming no L1 cache).
I'm just looking at DistributionManagerImpl.retrieveFromRemoteSource
and trying to understand how it works.
Does it really cause two more GET requests to be processed in the
cluster?
OK, even assuming that serving a GET request should be quick, I'd
like to see how it would perform if we only allowed getting from the
first owner.
I guess right now we don't allow configuring this. Are there any
catches with this?
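What I have in mind, as a very rough sketch (I haven't checked the
exact signatures, so treat the command construction as pseudo-code):

   // Rough sketch: ask only the first owner instead of all owners.
   List<Address> owners = distributionManager.locate(key);
   Address firstOwner = owners.get(0);
   ClusteredGetCommand get = commandsFactory.buildClusteredGetCommand(key); // simplified
   Map<Address, Response> responses = rpcManager.invokeRemotely(
         Collections.singleton(firstOwner), get, ResponseMode.SYNCHRONOUS, timeout);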
m.
--
Michal Linhard
Quality Assurance Engineer
JBoss Datagrid
Red Hat Czech s.r.o.
Purkynova 99 612 45 Brno, Czech Republic
phone: +420 532 294 320 ext. 62320
mobile: +420 728 626 363
Hot Rod Java client and NIO
by Manik Surtani
Mircea,
When you wrote the Hot Rod client, you abstracted the transport away so we could have alternate network implementations, right? The reason I ask is that at some point we should look at not only an NIO impl (I know you had one that you experimented with a while back) but also a JDK7 NIO2 one, and benchmark the three.
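For reference, the pluggability point as I remember it: the client
should pick up an alternate implementation via the transport factory
property (property name from memory; the NIO factory class below is
hypothetical):

   import java.util.Properties;
   import org.infinispan.client.hotrod.RemoteCacheManager;

   // Sketch: plugging an alternate transport implementation into the client.
   Properties props = new Properties();
   props.put("infinispan.client.hotrod.transport_factory",
             "org.example.NioTransportFactory"); // hypothetical NIO implementation
   RemoteCacheManager rcm = new RemoteCacheManager(props);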
WDYT?
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani
Lead, Infinispan
http://www.infinispan.org
PutMapCommand throws a NullPointerException in Distributed Mode
by Pedro Ruivo
Hi all,
I've spotted a bug in PutMapCommand. When the keys in the command map
to multiple nodes, the remote nodes (nodes that didn't create the
command) can throw the exception [1] when executing the perform()
method. I'm using a transactional cache.
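The shape of the failing call, for context (a transactional DIST-mode
cache; the key placement in the comments is what makes the command
span multiple nodes):

   import java.util.HashMap;
   import java.util.Map;

   // Keys that hash to different owners make the command span multiple nodes.
   Map<String, String> data = new HashMap<String, String>();
   data.put("key-on-node-A", "v1");
   data.put("key-on-node-B", "v2");
   cache.putAll(data); // a remote node then hits the NPE in PutMapCommand.perform()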
The test case in [2] reproduces the bug. If you want, I can open a
JIRA, and if you need more details let me know.
Cheers,
Pedro Ruivo
[1] Exception:
Caused by: java.lang.NullPointerException
   at org.infinispan.commands.write.PutMapCommand.perform(PutMapCommand.java:79)
   at org.infinispan.interceptors.CallInterceptor.handleDefault(CallInterceptor.java:83)
   at org.infinispan.commands.AbstractVisitor.visitPutMapCommand(AbstractVisitor.java:82)
   at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:67)
--
[2]
branch: https://github.com/pruivo/infinispan/tree/issue_2
test case:
https://github.com/pruivo/infinispan/blob/issue_2/core/src/test/java/org/...
Issue when preload from cache loader with write skew check
by Pedro Ruivo
Hi all,
I think I've spotted an issue when I use repeatable read with write
skew check and preload the cache.
I've made a test case to reproduce the bug. It can be found here [1].
The problem is that each preloaded key is put in the container with
version = null. When I try to commit a transaction, I get this exception:
java.lang.IllegalStateException: Entries cannot have null versions!
   at org.infinispan.container.entries.ClusteredRepeatableReadEntry.performWriteSkewCheck(ClusteredRepeatableReadEntry.java:44)
   at org.infinispan.transaction.WriteSkewHelper.performWriteSkewCheckAndReturnNewVersions(WriteSkewHelper.java:81)
   at org.infinispan.interceptors.locking.ClusteringDependentLogic$AllNodesLogic.createNewVersionsAndCheckForWriteSkews(ClusteringDependentLogic.java:133)
   at org.infinispan.interceptors.VersionedEntryWrappingInterceptor.visitPrepareCommand(VersionedEntryWrappingInterceptor.java:64)
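For context, a minimal configuration sketch that should hit this code
path, assuming the fluent builder API (method names from memory; the
test case [1] has the authoritative setup):

   import org.infinispan.configuration.cache.ConfigurationBuilder;
   import org.infinispan.configuration.cache.VersioningScheme;
   import org.infinispan.util.concurrent.IsolationLevel;

   // Repeatable read + write skew check + versioning, with a preloading loader.
   ConfigurationBuilder builder = new ConfigurationBuilder();
   builder.locking()
          .isolationLevel(IsolationLevel.REPEATABLE_READ)
          .writeSkewCheck(true);
   builder.versioning()
          .enable()
          .scheme(VersioningScheme.SIMPLE);
   builder.loaders()
          .preload(true); // preloaded entries end up with version = null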
I think all the info is in the test case, but if you need anything
else let me know.
Cheers,
Pedro
[1]
https://github.com/pruivo/infinispan/blob/issue_1/core/src/test/java/org/...