lock within tx
by Mircea Markus
Hi Manik,
In the following slightly modified version of NodeAPITest.testAddingDataTx,
the output is "false".
public void testAddingDataTx() throws Exception {
   Node<Object, Object> rootNode = cache.getRoot();
   tm.begin();
   Node<Object, Object> nodeA = rootNode.addChild(A);
   NodeKey dataKey = new NodeKey(Fqn.fromString("/a"), NodeKey.Type.DATA);
   System.out.println("is it locked??? " +
         cache.getCache().getAdvancedCache().getInvocationContextContainer().get().hasLockedKey(dataKey));
   nodeA.put("key", "value");
   assertEquals("value", nodeA.get("key"));
   tm.commit();
}
Now, after creating a node within a tx, shouldn't the corresponding
dataKey have a write lock (i.e., shouldn't the code print "true")?
Cheers,
Mircea
15 years, 5 months
Async cache interface
by Manik Surtani
Picking up on
https://jira.jboss.org/jira/browse/ISPN-72
I was thinking about the point from which the call should be placed on a
separate thread. The obvious place is the network stack, so that the
caller gets a Future before the call goes out on the network, but maybe
it makes sense to do this even earlier - perhaps as soon as the
invocation is made? Things to consider, though, are the transaction
context - which exists on the caller's thread - and potential context
class loaders on the thread. This is stuff that can be dealt with easily
enough: e.g., we could have an AsyncInterceptor that handles the thread
pooling and management of the Future, sitting *after* the TxInterceptor
so that transaction participation is already determined and the
transaction context attached to the call. Context classloaders and the
like could be attached to the worker thread that then carries the call
down the chain and onto the wire.
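A minimal sketch of that idea - capture the caller-side context (here just the context classloader and a placeholder tx name), hand the call to a pool, and return a Future immediately. AsyncInvoker, txContext, etc. are hypothetical names for illustration, not Infinispan API:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

public class AsyncInvoker {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public <T> Future<T> invokeAsync(Supplier<T> call, String txContext) {
        // captured on the caller's thread, before the handoff
        ClassLoader callerCl = Thread.currentThread().getContextClassLoader();
        return pool.submit(() -> {
            ClassLoader old = Thread.currentThread().getContextClassLoader();
            Thread.currentThread().setContextClassLoader(callerCl);
            try {
                // txContext would be attached to the invocation context here
                return call.get();
            } finally {
                Thread.currentThread().setContextClassLoader(old);
            }
        });
    }

    // convenience: block on the Future without checked exceptions
    public <T> T join(Future<T> f) {
        try {
            return f.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public void shutdown() { pool.shutdown(); }
}
```

The caller gets the Future back as soon as the task is queued, well before the worker thread carries the call down the chain.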
Thoughts?
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 6 months
JDBM cache store
by Manik Surtani
Genman
Could you please have a look at the Maven deps for this module - it's
pulling in a lot of seemingly unnecessary JARs, including antlr,
cglib, LDAP constants from Apache, SLF4J, etc. Surely Apache's fork
of JDBM doesn't need this stuff?!?
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 6 months
Rehashing for DIST
by Manik Surtani
So the blocker for distribution now is rehashing. This is much
trickier a problem than I previously thought, since it brings all of
the concerns we have with state transfer - the ability to generate and
apply state, preferably while not stopping the cluster, not
overwriting state from ongoing transactions, etc.
I've detailed two approaches - one is pessimistic, based on FLUSH,
and probably won't be implemented, but I've included it here for
completeness. The second is more optimistic, based on NBST, although
several degrees more complex. I'm still not happy with it as a
solution, but it could be a fallback. I'm still researching other
alternatives as well, and am open to suggestions and ideas.
Anyway, here we go. For each of the steps outlined below, ML1 is the
old member list prior to a topology change, and ML2 is the new one.
A. Pessimistic approach
-------------------
JOIN:
1. Joiner FLUSHes
2. Joiner requests state from *all* other members (concurrently,
using separate threads?) using channel.getState(address)
3. All other caches iterate through their local data container.
3.1. If the cache is the primary owner based on ML1 and the joiner is
*an* owner based on ML2, this entry is written to the state stream.
3.2. If the cache is no longer an owner of the entry based on ML2,
the entry is moved to L1 cache if enabled (otherwise, removed).
4. All caches close their state streams.
5. Joiner, after applying state received, releases FLUSH and joins
the cluster.
LEAVE:
1. Coordinator requests a FLUSH
2. All caches iterate through their data containers.
2.1. If based on ML1, the cache is the primary owner of an entry, and
based on ML1 and ML2, there is a new owner who does not have the
entry, an RPC request is sent to the new owner.
2.1.1. The new owner requests state from the primary owner with
channel.getState()
2.1.2. The primary owner writes this state for the new owner.
2.2. If the owner is no longer an owner based on ML2, the entry is
moved to L1 if enabled (otherwise, removed)
3. Caches close any state streams opened. Readers apply all state
received.
4. Caches send an RPC to the coordinator informing the coord of an
end-of-rehash phase.
5. Coordinator releases FLUSH
This process will involve multiple streams connecting each cache to
potentially every other cache. Perhaps streams over UDP multicast may
be more efficient here?
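The ownership checks in steps 3.1/3.2 of JOIN (and 2.1/2.2 of LEAVE) could be sketched as below. Note the hash function here is a deliberately naive placeholder, not Infinispan's actual consistent hash:

```java
import java.util.ArrayList;
import java.util.List;

public class RehashCheck {
    // Placeholder ownership function: hash the key onto the member list,
    // with numOwners consecutive members owning it; index 0 is the primary.
    static List<String> owners(String key, List<String> members, int numOwners) {
        int primary = Math.abs(key.hashCode()) % members.size();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(numOwners, members.size()); i++)
            result.add(members.get((primary + i) % members.size()));
        return result;
    }

    // Step 3.1: stream the entry if we are the primary owner under ML1
    // and the joiner is *an* owner under ML2.
    static boolean streamToJoiner(String key, String self, String joiner,
                                  List<String> ml1, List<String> ml2, int numOwners) {
        return owners(key, ml1, numOwners).get(0).equals(self)
            && owners(key, ml2, numOwners).contains(joiner);
    }

    // Step 3.2: demote to L1 (or remove) if we are no longer an owner under ML2.
    static boolean demoteToL1(String key, String self, List<String> ml2, int numOwners) {
        return !owners(key, ml2, numOwners).contains(self);
    }
}
```

Each cache would run these predicates while iterating its local data container, writing matching entries to the state stream.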
B. NBST approach
-------------------
This approach is very similar to NBST where the FLUSH phases defined
in the pessimistic approach are replaced by modification logging and
streaming of the modification log, using RPC to control a brief
partial lock on modifications when the last of the state is written.
Also, to allow ongoing remote gets to retrieve correct state - when
dealing with a race between making a request to a new owner that hasn't
yet applied state, or to an old owner that has already removed state -
we should support responding to a remote get even if you are no longer
an owner: either by using L1, or by making another remote get in turn
to the view that you deem correct.
This does increase complexity over NBST quite significantly though,
since each cache would need to maintain a transaction log for each and
every other cache based on which keys are mapped there, to allow state
application as per the pessimistic steps above to be accurate.
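The bookkeeping this implies - a modification log per peer, keyed by which caches own the modified keys - could be sketched purely illustratively (strings standing in for addresses and modifications) as:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Each cache records every modification against every peer that owns the
// modified key, so that state application can replay only the relevant
// changes to each node.
public class ModLog {
    private final Map<String, List<String>> logPerNode = new HashMap<>();

    public void record(String key, String value, List<String> owners) {
        for (String node : owners)
            logPerNode.computeIfAbsent(node, n -> new ArrayList<>())
                      .add(key + "=" + value);
    }

    // the log segment that would be streamed to a given peer
    public List<String> drainFor(String node) {
        return logPerNode.getOrDefault(node, Collections.emptyList());
    }
}
```

The cost is visible even in the sketch: every write fans out into one log entry per owning peer, which is exactly the complexity concern above.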
Like I said, not the most elegant, but I thought I'd just throw it out
there until I come up with a better approach. :-)
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 6 months
Eviction overhaul
by Manik Surtani
Hello all.
I have finished my work on the eviction code in Infinispan; here is a
summary of what has happened.
For a user perspective (including API and configuration) as well as
a design overview, please have a look at
http://www.jboss.org/community/docs/DOC-13449
For an implementation perspective, have a look at the sources of
FIFODataContainer and LRUDataContainer. These two classes are where
everything happens. The javadocs should explain the details, but in a
nutshell you can expect constant-time operations for all puts, gets,
removes, and iterations. :-)
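This isn't the actual FIFODataContainer/LRUDataContainer code, but the constant-time LRU idea in miniature can be seen with a LinkedHashMap in access order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A LinkedHashMap in access order gives O(1) put/get/remove while keeping
// entries ordered by recency, so the eldest entry is always the LRU victim.
public class MiniLruContainer<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public MiniLruContainer(int maxEntries) {
        super(16, 0.75f, true); // true = access order, i.e. LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least-recently-used entry
    }
}
```

A FIFO variant is the same map with access order set to false, so insertion order decides the victim.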
Feedback on the impls would be handy. :-)
Cheers
--
Manik Surtani
Lead, JBoss Cache
http://www.jbosscache.org
manik(a)jboss.org
15 years, 6 months
Re: Infinispan and search APIs
by Manik Surtani
Emmanuel and I started discussing this offline, bringing this on to
the ML now.
On 5 May 2009, at 16:49, Emmanuel Bernard wrote:
>
> On May 1, 2009, at 16:13, Manik Surtani wrote:
>
>> Hey dude
>>
>> When you have some time, could we chat about this? Since we have a
>> chance to build any hooks into Infinispan that we may need in
>> future for the query API, it would be good to get an idea of this
>> now.
>
> you need:
> - change notification (create, update, delete)
> - define a change notification as belonging to a context (tx usually)
> - define a way to execute things at context ending (typically
> tx.commit)
> - start / stop notifications incl the ability to pass properties the
> Infinispan way (probably materialized as Map<String, Object> and the
> list of object types being indexed. (relaxing this constraint leads
> to a reduction of concurrency in HSearch)
> - ability to access the HSearch change interceptor to retrieve the
> SearchFactory from the main Infinispan api (could be an SPI) (i.e. from
> a Hibernate SessionImpl we can go access the SessionFactory and the
> interceptor and then the SearchFactory)
> - might need an object key from an object (not sure)
> - ability to load a set of objects by id (in batch)
Yes, this is all stuff we figured out in JBC-Searchable. All
straightforward enough to do. And rather than implement as a listener
as we did in JBC, we ought to make this an interceptor - it will
perform better and you have greater access to transaction context.
> I think that's the most important hooks.
> Also the ability to put a Lucene index on infinispan
Well, there should be two options for storing indexes. The first, a
simplistic approach where indexes are replicated everywhere, is
achieved by using a separate cache for indexes which uses REPL
regardless of what cache mode the data cache uses. Any query on any
instance uses the cached indexes, to retrieve keys, and then does a
get() on the data cache to load keys. This load may retrieve the
entries from across the network if the cache is in DIST mode, load off
a cache store, etc.
The second approach to storing indexes is more interesting IMO. Each
node only maintains LOCAL (*) indexes for keys that are mapped onto
itself. E.g., in a cluster of {A, B, C, D, E}, we have {K1, K2, K3}
mapped to A and {K4, K5, K6} mapped to C. A stores indexes for K1 -
K3 locally, and C stores indexes for K4 - K6. Running a query on,
say, E, would result in a query RPC call being broadcast around the
cluster; each node runs the query on its local indexes only, returning
results to E, which then collates them. I think this is much
more scalable as a) the indexes themselves are fragmented and the
system will be able to cope with more indexes as you add more nodes,
and b) processing time to search through the indexes is divided up
between the processors.
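The scatter/gather flow described above could be sketched like this, with plain maps standing in for per-node Lucene indexes and an executor standing in for the RPC broadcast (illustrative only, not Infinispan or HSearch API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScatterGatherQuery {
    // each node runs the query against its own local "index" only
    static List<String> queryLocal(Map<String, String> localIndex, String term) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : localIndex.entrySet())
            if (e.getValue().contains(term)) hits.add(e.getKey());
        return hits;
    }

    // the requester "broadcasts" the query and collates the partial results
    static List<String> broadcast(List<Map<String, String>> nodes, String term) {
        ExecutorService pool = Executors.newFixedThreadPool(nodes.size());
        List<Future<List<String>>> futures = new ArrayList<>();
        for (Map<String, String> node : nodes)
            futures.add(pool.submit(() -> queryLocal(node, term)));
        List<String> collated = new ArrayList<>();
        for (Future<List<String>> f : futures) {
            try {
                collated.addAll(f.get());
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
        pool.shutdown();
        return collated;
    }
}
```

A lazy result set would replace the blocking collate loop: iterate results as each node's Future completes rather than waiting for all of them.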
Naturally there is overhead in broadcasting the query and collating
results, and this is why we need efficient result set implementations
that could retrieve this lazily, etc. E.g., your result iterator
should not wait for all results to arrive before starting to iterate
through what you have, and nodes should first send back result counts
before actually loading and sending back results, etc. But this is a
part of what we need to design.
(*) We could use DIST for the indexes as well, provided the data
indexed and the indexes are hashed to the same nodes for storage.
This is, IMO, tricky since indexes would span several entries, not all
of which are guaranteed to be hashed to the same nodes. Hence my
thoughts on indexes being stored in a LOCAL cache.
>> Like I was saying, one of the things we would need is the ability
>> to do aggregate queries. The "nice to have" would be to offer an
>> EJBQL style interface rather than a Lucene one, but the second part
>> is not crucial IMO.
>
> aggregate query is kinda possible:
> - count(*) is really just getResultSize()
> - distinct count(property) is harder, needs some aggregation in
> memory but is likely possible
> - sum() / avg() would need some actual aggregation in memory from
> all matching results; we could store some values in the lucene index
> and sum them
> - group by / having is harder and will probably have to either be
> done in memory or done by calling n queries, one per "group" (likely
> doable)
If you feel this can be achieved easily enough using Lucene queries,
I'm fine with this being the basis of our impl.
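The in-memory aggregation described above, in miniature - count(*) is just the result size, while sum()/avg() need a pass over values pulled from the matching results (helper names here are illustrative, not HSearch API):

```java
import java.util.List;

public class Aggregates {
    // count(*) is just the size of the matching result set
    static int count(List<Double> matches) { return matches.size(); }

    // sum() needs an in-memory pass over the matched values
    static double sum(List<Double> matches) {
        double total = 0;
        for (double v : matches) total += v;
        return total;
    }

    // avg() derives from sum() and count()
    static double avg(List<Double> matches) {
        return matches.isEmpty() ? 0 : sum(matches) / matches.size();
    }
}
```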
> JPA-QL query is more a fantasy, just like GAE has "support" for JPA-
> QL. Every time join is used, you would be fucked.
It's just a familiarity thing - I don't necessarily need JPA-QL in
itself, just that it has decent and familiar support for aggregation
and query writing in general. Joins would obviously not be supported
and the query parser would barf on encountering one.
The other plus with JPA-QL is that it could offer easy migration off
JPA and onto a data grid to some degree (certain caveats exist such as
joins though). Like I said, this isn't important, could even be a
plugin for later on that translates JPA-QL to Lucene queries.
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
15 years, 6 months
expose LogFactory.IS_LOG4J_AVAILABLE ?
by Adrian Cole
Hi, team.
Does anyone mind if I expose LogFactory.IS_LOG4J_AVAILABLE? I'd rather not
copy/paste the detection logic into the S3 module, which has similar needs.
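The detection logic presumably boils down to a Class.forName probe; a generic version of that check (hypothetical helper, not the actual LogFactory code) would be:

```java
public class ClassAvailability {
    // Probe for a class on the classpath without initializing it;
    // log4j detection would pass "org.apache.log4j.Logger" here.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false,
                          ClassAvailability.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```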
Cheers,
-Adrian
15 years, 6 months
OOB for commits
by Mircea Markus
Hi,
Currently we send all commit messages flagged as OOB, but not prepares.
I guess the reason for this is to make commits move quicker on the wire -
am I right?
If so, what about doing the same for the rest of the tx messages
(prepares and rollbacks)?
Cheers,
Mircea
15 years, 6 months
Infinispan - runGuiDemo.bat
by list. rb
I thought I'd share my infinispan start script for win32 platforms.
Is it worthy of committing? This is my first open-source contribution; just
wanted to help.
Kind regards
@echo off
SETLOCAL ENABLEDELAYEDEXPANSION
rem set JAVA_HOME="C:\Program Files\Java\jre1.5.0_06"
rem set PATH=%JAVA_HOME%\bin;%PATH%
rem run from the distribution root, even if started from bin\
if not exist bin (cd ..)
set INFINISPAN_HOME=%CD%
set CP=%INFINISPAN_HOME%\etc
set JVM_PARAMS=-Djava.net.preferIPv4Stack=true -Dlog4j.configuration=%INFINISPAN_HOME%\etc\log4j.xml
rem add all JARs from the core and gui-demo modules to the classpath
set module=core
for /F "tokens=* delims=\" %%a in ('dir /S /B /AA %INFINISPAN_HOME%\modules\%module%\*.jar') do set CP=%%a;!CP!
set module=gui-demo
for /F "tokens=* delims=\" %%a in ('dir /S /B /AA %INFINISPAN_HOME%\modules\%module%\*.jar') do set CP=%%a;!CP!
java -cp "%CP%" %JVM_PARAMS% org.infinispan.demo.InfinispanDemo
15 years, 6 months