JGroups 2.7 and 2.8 work with Infinispan now
by Vladimir Blagojevic
Hi,
Give it a try as well! I suggest we move to 2.8 sooner rather than later.
Currently, with tcp and 2.7 I get 0 failures, and with 2.8 I get 9 failures.
Cheers,
Vladimir
15 years, 5 months
Distributed hashing - heterogeneous nodes
by Manik Surtani
This is based on a question that came up when discussing Infinispan
with the Red Hat MRG group. My current design treats nodes as equal,
and as such each node would get an equal number of keys mapped to it.
This is not always desirable, since nodes aren't always equal. Nodes
could be weighted. Inspired by Amazon's Dynamo paper [1], I'm
considering providing an alternate ConsistentHash implementation that
supports weights, with the use of "virtual nodes", or "tokens".
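For illustration, here is a rough, hypothetical sketch of the virtual-node idea (this is not the actual ConsistentHash interface; all class and method names below are made up):

import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class WeightedConsistentHash {

   // ring position -> node id
   private final TreeMap<Integer, String> ring = new TreeMap<Integer, String>();

   public WeightedConsistentHash(Map<String, Integer> nodeWeights) {
      for (Map.Entry<String, Integer> e : nodeWeights.entrySet()) {
         for (int v = 0; v < e.getValue(); v++) {
            // each virtual node ("token") gets its own position on the ring
            ring.put(hash(e.getKey() + "#" + v), e.getKey());
         }
      }
   }

   public String locate(Object key) {
      int h = hash(key);
      // walk clockwise to the first token at or after the key's position, wrapping around
      SortedMap<Integer, String> tail = ring.tailMap(h);
      Integer pos = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
      return ring.get(pos);
   }

   private int hash(Object o) {
      // spread hashCode() a little; a real implementation would use a stronger hash
      int h = o.hashCode();
      h ^= (h >>> 16);
      return h & 0x7FFFFFFF;
   }
}

With weights {a=1, b=2}, b gets twice as many positions on the ring and therefore, on average, twice as many keys as a; equal weights degenerate into the current behaviour.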
I suppose the purpose of this email is: do we still need a "simplistic" CH
implementation like the one I have right now? Should we not just ship a
weighted CH, with the default weights being equal?
Cheers
Manik
[1] http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 5 months
InternalCacheValue, Marshalling and Cache Stores
by Manik Surtani
As pointed out by genman, the use of InternalCacheEntries (ICEs) in
certain cache stores can lead to a key being unnecessarily marshalled
twice, impacting both the time to marshal and the space consumed by the
resulting byte stream.
Cache stores specifically affected are the JdbcStringBasedCacheStore,
BdbjeCacheStore, JdbmCacheStore and ClusteredCacheLoader.
The ClusteredGetCommand - used by both the ClusteredCacheLoader and
the DistributionInterceptor for remote lookups - is also affected.
There is no point in a remote host marshalling the key in the ICE as
part of the response stream, since the caller already has the key: it
was part of the request.
All the same, we cannot remove the key portion of the ICE or exclude it
from marshalling altogether, as it is needed for other purposes such as
state transfer. Bucket-based cache stores also require the key to be
stored with the entry.
To work around this, I have provided an additional interface:
InternalCacheValue (ICV). This is a representation of a value (plus any
information pertaining to expiry) - essentially everything in an ICE
minus the key.
An ICV is obtained by invoking ICE.toInternalCacheValue(). The cache
stores mentioned above - and ClusteredGetCommand - should make sure
they marshall ICVs rather than ICEs.
When loading entries, an ICE can be created by using
ICV.toInternalCacheEntry(Object key).
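To make the intended usage concrete, here is a rough sketch of what a store's marshalling helpers could look like, assuming the marshaller exposes objectToByteBuffer()/objectFromByteBuffer()-style methods and that the store keeps the key itself (the wrapper class and method names are made up):

import org.infinispan.container.entries.InternalCacheEntry;
import org.infinispan.container.entries.InternalCacheValue;
import org.infinispan.marshall.Marshaller;

public class IcvMarshallingSketch {

   // store side: marshall only the value portion; the store persists the key separately
   public static byte[] marshallValue(InternalCacheEntry ice, Marshaller m) throws Exception {
      InternalCacheValue icv = ice.toInternalCacheValue();   // value + expiry info, no key
      return m.objectToByteBuffer(icv);                      // key is not written a second time
   }

   // load side: rebuild the full entry from the store's own key plus the ICV
   public static InternalCacheEntry unmarshallValue(Object key, byte[] bytes, Marshaller m) throws Exception {
      InternalCacheValue icv = (InternalCacheValue) m.objectFromByteBuffer(bytes);
      return icv.toInternalCacheEntry(key);
   }
}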
Could the respective authors of the stores above please modify their
code accordingly? I've just checked the ICV changes into trunk.
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 5 months
Re: [hibernate-dev] Re: Pushing indexes through JGroups
by Emmanuel Bernard
Lukasz,
I have been discussing with Manik on #3 and we think that JBoss
Cache / Infinispan are probably a better fit than plain JGroups for
that as all the plumbing will be configured for you.
When you reach this problem, let's revive this discussion.
On May 25, 2009, at 11:07, Hardy Ferentschik wrote:
> Hi,
>
> I talked with Łukasz about this last week. Definitely, #1 and #3.
> #2 I don't like either.
>
> The benefit of #3 would also be that one could drop the requirement of
> having a shared file system (NFS, NAS, ...). #3 should be quite easy to
> implement - maybe easy to get started with.
>
> --Hardy
>
> On Mon, 25 May 2009 10:55:52 +0200, Emmanuel Bernard
> <emmanuel(a)hibernate.org> wrote:
>
>> Hello
>> I am not sure this is where we should go, or at least, it depends.
>> Here are three scenarios:
>>
>>
>> #1 JMS replacement
>> If you want to use JGroups as a replacement for the JMS backend,
>> then I think you should write a jgroups backend. Check
>> org.hibernate.search.backend.impl.jms
>> In this case all changes are sent via JGroups to a "master". The
>> master could be elected by the cluster, possibly dynamically, but
>> that's not necessary for the first version.
>>
>> #2 apply indexing on all nodes
>> JGroups could send the work queue to all nodes and each node could
>> apply the change.
>> For various reasons I am not a fan of this solution, as it creates
>> overhead in CPU / memory usage and does not scale very well from a
>> theoretical PoV.
>>
>> #3 Index copy
>> This is what you are describing: copying the index using JGroups
>> instead of my file system approach. This might have merits, especially
>> as we could reduce network traffic using multicast, but it also
>> requires rethinking the master / slave modus operandi.
>> Today the master copies a clean index to a shared directory on a
>> regular basis, and on a regular basis the slaves go and copy the clean
>> index from the shared directory.
>> In your approach, the master would send changes to the slaves and
>> slaves would have to apply them "right away" (on their passive
>> version)
>>
>> I think #1 is more interesting than #3; we should probably start
>> with that. #3 might be interesting too - thoughts?
>>
>> Emmanuel
>>
>> PS: refactoring is a fact of life, so feel free to do so. Just
>> don't break public contracts.
>>
>> On May 21, 2009, at 22:14, Łukasz Moreń wrote:
>>
>>> Hi,
>>>
>>> I have a few questions concerning the use of JGroups to copy index
>>> files. I am thinking of creating sender (for the master) and receiver
>>> (for the slave) directory providers.
>>> The sender class would be based mainly on the existing
>>> FSMasterDirectoryProvider: it would first create a local index copy
>>> and later send it to the slave nodes (or send it without copying, but
>>> that may lower performance?).
>>> To avoid code redundancy it would be good to refactor the
>>> FSMasterDirectoryProvider class a little, so that I can reuse the
>>> copying functionality in the new DirectoryProvider and add the
>>> sending part; or should I rather work around it?
>>>
>>> I do not completely understand how multithreaded access to the index
>>> files works. Does the FileChannel class ensure this while the index
>>> is being copied and new Lucene work is being pushed?
>
>
>
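For reference, the JMS-replacement option (#1) quoted above could translate into a JGroups backend along these rough lines. This is only a sketch: everything except the JGroups classes is made up, the "master" is naively taken to be the view coordinator, and error handling is omitted.

import org.jgroups.Address;
import org.jgroups.JChannel;
import org.jgroups.Message;
import org.jgroups.ReceiverAdapter;
import org.jgroups.View;
import java.io.Serializable;

public class JGroupsBackendSketch extends ReceiverAdapter {

   private final JChannel channel;
   private volatile Address master;

   public JGroupsBackendSketch(String clusterName) throws Exception {
      channel = new JChannel();        // default protocol stack
      channel.setReceiver(this);
      channel.connect(clusterName);
   }

   @Override
   public void viewAccepted(View newView) {
      // simplistic "election": the coordinator (first member of the view) is the master
      master = newView.getMembers().get(0);
   }

   // slave side: unicast the queue of index work to the current master
   public void sendWorkQueue(Serializable workQueue) throws Exception {
      channel.send(new Message(master, null, workQueue));
   }

   // master side: only the master receives these unicasts and applies them locally
   @Override
   public void receive(Message msg) {
      Object workQueue = msg.getObject();
      // apply the Lucene work against the local index writer (omitted in this sketch)
   }
}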
15 years, 5 months
JIRA mail for ISPN?
by Galder Zamarreno
Hi,
Does the ISPN JIRA project have all JIRA activity mail enabled? I can only
seem to find emails related to JIRAs I've created or am watching, but not
the rest, like in JBCACHE, etc.
Cheers,
--
Galder Zamarreño
Sr. Software Maintenance Engineer
JBoss, a division of Red Hat
15 years, 5 months
InboundInvocationHandlerImpl optimization
by Mircea Markus
Hi,
There are several ComponentRegistry.getComponent() lookups in this class:
perform():
  gcr.getNamedComponentRegistry(cacheName)
  cr.getComponent(Configuration.class)
  cr.getLocalComponent(CommandsFactory.class)
  cr.getComponent(ResponseGenerator.class)
applyState() and generateState() call (indirectly):
  cr.getComponent(StateTransferManager.class)
Now all these calls indirectly result in map lookups. Another approach
would be to cache them in class members - wdyt?
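A rough sketch of one way to do that caching, reusing the lookups listed above (the helper class and field names are made up; Infinispan imports omitted, and gcr is the registry field the class already has):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// resolve the per-cache components once and reuse them on subsequent invocations
class CachedComponents {
   final Configuration configuration;
   final CommandsFactory commandsFactory;
   final ResponseGenerator responseGenerator;

   CachedComponents(ComponentRegistry cr) {
      this.configuration = cr.getComponent(Configuration.class);
      this.commandsFactory = cr.getLocalComponent(CommandsFactory.class);
      this.responseGenerator = cr.getComponent(ResponseGenerator.class);
   }
}

// inside InboundInvocationHandlerImpl (sketch):
private final ConcurrentMap<String, CachedComponents> componentCache =
      new ConcurrentHashMap<String, CachedComponents>();

private CachedComponents componentsFor(String cacheName) {
   CachedComponents c = componentCache.get(cacheName);
   if (c == null) {
      ComponentRegistry cr = gcr.getNamedComponentRegistry(cacheName);
      c = new CachedComponents(cr);
      CachedComponents prev = componentCache.putIfAbsent(cacheName, c);
      if (prev != null) c = prev;   // another thread won the race; use its instance
   }
   return c;
}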
Cheers,
Mircea
15 years, 5 months
Object Stream Pooling design
by Galder Zamarreno
Hi,
Re: https://jira.jboss.org/jira/browse/ISPN-42
I can see two clear options when trying to implement this together with
JBoss Marshalling.
One would be to use some sort of blocking queue like Manik did for JBC
2.1.0.GA. The other option would be to simply use two thread-local
instances (one for the marshaller and one for the unmarshaller).
The main advantages of thread locals would be reduced contention, since
there is no shared data structure, and simpler configuration: no need to
define pooling parameters, etc.
The main disadvantage of thread locals would be any potential leaks
arising from the marshaller/unmarshaller. Having talked to David and having
checked the JBMAR code: whenever a message needs to be written, we'd be
calling start() and then finish() on the marshaller/unmarshaller, and as
long as finish() is called in a finally block, we're guaranteed not to be
leaking any user classes/instances via the marshaller/unmarshaller.
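A minimal sketch of the thread-local option, assuming the JBMAR MarshallerFactory/Marshaller API with start() and finish() (the wrapper class and method names are made up):

import org.jboss.marshalling.Marshaller;
import org.jboss.marshalling.MarshallerFactory;
import org.jboss.marshalling.Marshalling;
import org.jboss.marshalling.MarshallingConfiguration;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class ThreadLocalMarshallerPool {
   private final MarshallingConfiguration config = new MarshallingConfiguration();
   private final ThreadLocal<Marshaller> perThread;

   ThreadLocalMarshallerPool(final MarshallerFactory factory) {
      this.perThread = new ThreadLocal<Marshaller>() {
         @Override protected Marshaller initialValue() {
            try {
               return factory.createMarshaller(config);   // one reusable marshaller per thread
            } catch (IOException e) {
               throw new IllegalStateException(e);
            }
         }
      };
   }

   byte[] marshall(Object o) throws IOException {
      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      Marshaller m = perThread.get();
      m.start(Marshalling.createByteOutput(baos));   // bind the thread's marshaller to this stream
      try {
         m.writeObject(o);
      } finally {
         m.finish();   // releases references to user classes/instances, avoiding leaks
      }
      return baos.toByteArray();
   }
}

An unmarshaller counterpart would follow the same start()/finish()-in-finally pattern.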
Most environments these days run on some sort of pooled-thread strategy,
and hence thread locals would certainly improve performance.
So my current preference would be to simply use thread locals for this.
Thoughts?
--
Galder Zamarreño
Sr. Software Maintenance Engineer
JBoss, a division of Red Hat
15 years, 5 months
Re: [jbosscache-dev] JBoss Cache Lucene Directory
by Manik Surtani
Sanne,
Agreed. Could all involved please make sure we post to both hibernate-dev
and infinispan-dev (rather than jbosscache-dev) when discussing anything
to do with such integration work, as there are parallel efforts which can
be brought together.
Cheers
Manik
On 25 May 2009, at 10:53, Sanne Grinovero wrote:
> Hello,
> I'm forwarding this email to Emmanuel and Hibernate Search dev, as I
> believe we should join the discussion.
> Could we keep both dev-lists (jbosscache-dev(a)lists.jboss.org,
> hibernate-dev(a)lists.jboss.org ) on CC ?
>
> Sanne
>
> 2009/4/29 Manik Surtani <manik(a)jboss.org>:
>>
>> On 27 Apr 2009, at 05:18, Andrew Duckworth wrote:
>>
>>> Hello,
>>>
>>> I have been working on a Lucene Directory provider based on JBoss
>>> Cache,
>>> my starting point was an implementation Manik had already written
>>> which
>>> pretty much worked with a few minor tweaks. Our use case was to
>>> cluster a
>>> Lucene index being used with Hibernate Search in our application,
>>> with the
>>> requirements that searching needed to be fast, there was no shared
>>> file
>>> system and it was important that the index was consistent across
>>> the cluster
>>> in a relatively short time frame.
>>>
>>> Manik's code used a token node in the cache to implement the
>>> distributed
>>> lock. During my testing I set up multiple cache copies with
>>> multiple threads
>>> reading/writing to each cache copy. I was finding a lot of
>>> transactions to
>>> acquire or release this lock were timing out; not understanding
>>> JBC well, I
>>> modified the distributed lock to use the JGroups
>>> DistributedLockManager. This
>>> worked quite well, however the time taken to acquire/release the
>>> lock (~100
>>> ms for both) dwarfed the time to process the index update, lowering
>>> throughput. Even using Hibernate Search with an async worker
>>> thread, there
>>> was still a lot of contention for the single lock which seemed to
>>> limit the
>>> scalability of the solution. I think part of the problem was that
>>> our use
>>> of HB Search generates a lot of small units of work (remove index
>>> entry, add
>>> index entry), and each of these UOWs acquires a new IndexWriter and
>>> new write
>>> lock on the underlying Lucene Directory implementation.
>>>
>>>
>>> Out of curiosity, I created an alternative implementation based on
>>> the
>>> Hibernate Search JMS clustering strategy. Inside JBoss Cache I
>>> created a
>>> queue node and each slave node in the cluster creates a separate
>>> queue
>>> underneath where indexing work is written:
>>>
>>> /queue/slave1/[work0, work1, work2 ....]
>>> /slave2
>>> /slave3
>>>
>>> etc
>>>
>>> In each cluster member a background thread runs continuously; when
>>> it wakes
>>> up, it decides if it is the master node or not (currently checks
>>> if it is
>>> the view coordinator, but I'm considering changing it to use a
>>> longer lived
>>> distributed lock). If it is the master it merges the tasks from
>>> each slave
>>> queue, and updates the JBCDirectory in one go, it can safely do
>>> this with
>>> only local VM locking. This approach means that in all the slave
>>> nodes they
>>> can write to their queue without needing a global lock that any
>>> other slave
>>> or the master would be using. On the master, it can perform
>>> multiple updates
>>> in the context of a single Lucene index writer. With a cache loader
>>> configured, work that is written into the slave queue is
>>> persistent, so it
>>> can survive the master node crashing with automatic fail over to a
>>> new
>>> master meaning that eventually all updates should be applied to
>>> the index.
>>> Each work element in the queue is time stamped to allow them to be
>>> processed
>>> in order (requires
>>> time synchronisation across the cluster) by the master. For our
>>> workload
>>> the master/slave pattern seems to improve the throughput of the
>>> system.
>>>
>>>
>>> Currently I'm refining the code and I have a few JBoss Cache
>>> questions
>>> which I hope you can help me with:
>>>
>>> 1) I have noticed that under high load I get LockTimeoutExceptions
>>> writing
>>> to /queue/slave0 when the lock owner is a transaction working on
>>> /queue/slave1 , i.e. the same lock seems to be used for 2
>>> unrelated nodes in
>>> the cache. I'm assuming this is a result of the lock striping
>>> algorithm, if
>>> you could give me some insight into how this works that would be
>>> very
>>> helpful. Bumping up the cache concurrency level from 500 to 2000
>>> seemed to
>>> reduce this problem, however I'm not sure if it just reduces the
>>> probability
>>> of a random event or if there is some level that will be
>>> sufficient to
>>> eliminate the issue.
>>
>> It could well be the lock striping at work. As of JBoss Cache
>> 3.1.0 you can
>> disable lock striping and have one lock per node. While this is
>> expensive
>> in that if you have a lot of nodes, you end up with a lot of locks,
>> if you
>> have a finite number of nodes this may help you a lot.
>>
>>> 2) Is there a reason to use separate nodes for each slave queue ?
>>> Will it
>>> help with locking, or can each slave safely insert to the same
>>> parent node
>>> in separate transactions without interfering or blocking each
>>> other ? If I
>>> can reduce it to a single queue, I think that would be a more elegant
>>> solution. I am setting the lockParentForChildInsertRemove to false
>>> for the
>>> queue nodes.
>>
>> It depends. Are the work objects attributes in /queue/slaveN ?
>> Remember
>> that the granularity for all locks is the node itself so if all
>> slaves write
>> to a single node, they will all compete for the same lock.
>>
>>> 3) Similarly, is there any reason why the master should/shouldn't
>>> take
>>> responsibility for removing work nodes that have been processed ?
>>
>> Not quite sure I understand your design - so this distributes the
>> work
>> objects and each cluster member maintains indexes locally? If so,
>> you need
>> to know when all members have processed the work objects before
>> removing
>> these.
>>
>>> Thanks in advance for help, I hope to make this solution general
>>> purpose
>>> enough to be able to contribute back to Hibernate Search and JBC
>>> teams.
>>
>> Thanks for offering to contribute. :-) One other thing that may
>> be of
>> interest is that I just launched Infinispan [1] [2] - a new data grid
>> product. You could implement a directory provider on Infinispan
>> too - it is
>> a lot more efficient than JBC at many things, including
>> concurrency. Also,
>> Infinispan's lock granularity is per-key/value pair. So a single
>> distributed cache would be all you need for work objects. Also,
>> another
>> thing that could help is the eager locking we have on the roadmap
>> [3] which
>> may make a more traditional approach of locking + writing indexes
>> to the
>> cache more feasible. I'd encourage you to check it out.
>>
>> [1] http://www.infinispan.org
>> [2]
>> http://infinispan.blogspot.com/2009/04/infinispan-start-of-new-era-in-ope...
>> [3] https://jira.jboss.org/jira/browse/ISPN-48
>> --
>> Manik Surtani
>> manik(a)jboss.org
>> Lead, Infinispan
>> Lead, JBoss Cache
>> http://www.infinispan.org
>> http://www.jbosscache.org
>>
>>
>>
>>
>> _______________________________________________
>> jbosscache-dev mailing list
>> jbosscache-dev(a)lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>
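For reference, the per-slave queue layout described in the quoted thread could look roughly like this against the JBoss Cache API (the wrapper class and helper names are made up; only the slave-side enqueue is shown):

import org.jboss.cache.Cache;
import org.jboss.cache.Fqn;

// each slave appends work under its own /queue/<slave> node, so slaves never
// contend for another slave's node-level lock; the master later merges the
// queues under /queue and applies them with a single local IndexWriter
public class SlaveQueueSketch {

   private final Cache<String, Object> cache;
   private final Fqn queueFqn;
   private long counter;

   public SlaveQueueSketch(Cache<String, Object> cache, String slaveName) {
      this.cache = cache;
      this.queueFqn = Fqn.fromString("/queue/" + slaveName);
   }

   // slave side: enqueue a unit of index work under this slave's own node
   public synchronized void enqueue(Object indexWork) {
      // timestamped key so the master can process work in order across slaves
      String key = System.currentTimeMillis() + "-" + (counter++);
      cache.put(queueFqn, key, indexWork);
   }
}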
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 5 months
NBST and TxInterceptor
by Mircea Markus
Hi,
At the moment we always use the TransactionLog within the TxInterceptor
to log stuff which *might* be needed by the in-memory state transfer
process.
This takes place even when 'fetchInMemoryState' is turned off - and this
is redundant (or isn't it?). If so, what about moving the code that
manages the TransactionLog from TxInterceptor into another interceptor
(StateTransferInterceptor, perhaps) that would only be present in the
chain if 'fetchInMemoryState' is set to true?
Ah, on second thought, even if fetchInMemoryState is set to false, this
node might still act as a state supplier for somebody else, so we
cannot decide based on 'fetchInMemoryState' alone. Do you reckon it's
worth having a canSupplyState attribute for a node? When canSupplyState
is false, we don't include the StateTransferInterceptor in the chain and
get a performance boost by not recording and unrecording 2PC prepares
all the time...
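A rough, self-contained sketch of the proposed decision - all names are made up, and this is not the real interceptor-chain builder:

import java.util.ArrayList;
import java.util.List;

public class InterceptorChainSketch {

   interface Interceptor { /* marker for the sketch */ }
   static class TxInterceptor implements Interceptor { }
   static class StateTransferInterceptor implements Interceptor { }

   static List<Interceptor> buildChain(boolean canSupplyState) {
      List<Interceptor> chain = new ArrayList<Interceptor>();
      chain.add(new TxInterceptor());
      if (canSupplyState) {
         // only pay for recording/unrecording 2PC prepares when someone might ask us for state
         chain.add(new StateTransferInterceptor());
      }
      return chain;
   }

   public static void main(String[] args) {
      System.out.println(buildChain(true).size());   // 2: includes state-transfer logging
      System.out.println(buildChain(false).size());  // 1: no logging overhead
   }
}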
Cheers,
Mircea
15 years, 5 months