Infinispan embedded off-heap cache
by yavuz gokirmak
Hi all,
Is it possible to use Infinispan as an embedded off-heap cache?
As far as I understand, this is not implemented yet.
If that is the case, we are planning to put effort into developing an
embedded off-heap cache.
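For context, this is roughly how we use the embedded API today (everything here lives on the JVM heap; an off-heap option would ideally keep the same programming model):

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;

public class EmbeddedCacheExample {
   public static void main(String[] args) {
      // In-process cache manager with the default configuration; all entries are on-heap today.
      DefaultCacheManager cacheManager = new DefaultCacheManager();
      try {
         Cache<String, String> cache = cacheManager.getCache();
         cache.put("greeting", "hello");
         System.out.println(cache.get("greeting"));
      } finally {
         cacheManager.stop();
      }
   }
}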
I would really like to hear your advice.
Best regards
11 years, 7 months

Design change in Infinispan Query
by Sanne Grinovero
                                        Hello all,
Currently Infinispan Query is an interceptor registered on the
specific Cache instance which has indexing enabled; each such
interceptor does everything it needs to do solely within the scope of
the cache it was registered in.
If you enable indexing - for example - on 3 different caches, there
will be 3 different Hibernate Search engines started in the background,
all unaware of each other.
After some design discussions with Ales for CapeDwarf, and also to
call attention to something that has bothered me for some time, I'd
like to evaluate the option of having a single Hibernate Search engine
registered in the CacheManager and shared across the indexed caches.
Current design limitations:
  A- If they are all configured to use the same base directory to
store indexes, and happen to have same-named indexes, they'll share
the index without being aware of each other. This is going to break
unless the user configures some tricky parameters, and even so
performance won't be great: instances will lock each other out, or at
best write in alternate turns.
  B- The search engine isn't particularly "heavy"; still, it would be
nice to share some components and internal services.
  C- Configuration details which need some care - like injecting a
JGroups channel for clustering - need to be done separately for each
isolated instance (so large parts of the configuration would be quite
similar but not entirely equal).
  D- Incoming messages into a JGroups Receiver need to be routed not
only among indexes, but also among Engine instances. This prevents
Query from reusing code from Hibernate Search.
Problems with a unified Hibernate Search Engine:
   1#- Isolation of types / indexes. If the same indexed class is
stored in different (indexed) caches, they'll share the same index. Is
it a problem? I'm tempted to consider this a good thing, but wonder if
it would surprise some users. Would you expect that?
   2#- Configuration format overhaul: indexing options won't be set in
the cache section but in the global section. I'm looking forward to
using the schema extensions anyway to provide a better configuration
experience than the current <properties />.
   3#- Assuming 1# is fine, when a search hit is found I'd need to be
able to figure out from which cache the value should be loaded.
      3#A  we could have the cache name encoded in the index, as part
of the identifier: {PK,cacheName} (see the sketch below)
      3#B  we actually shard the index, keeping a physically separate
index per cache. This would mean searching on the joint index view but
extracting hits from specific indexes to keep track of "which index".
I think we can do that, but it's definitely tricky.
It's likely easier to keep indexed values from different caches in
different indexes. That would mean rejecting 1# and messing with the
user-defined index name, for example adding the cache name to the
user-defined string.
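To illustrate 3#A: the identifier stored in the index would pair the primary
key with the owning cache's name, so a hit can be resolved back to the right
cache. Purely illustrative - no such class exists today:

import java.util.Objects;

// Illustrative only: composite identifier = {PK, cacheName}.
public final class IndexedEntryId {

   private final String cacheName;
   private final Object key;

   public IndexedEntryId(String cacheName, Object key) {
      this.cacheName = cacheName;
      this.key = key;
   }

   public String cacheName() {
      return cacheName;
   }

   public Object key() {
      return key;
   }

   @Override
   public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof IndexedEntryId)) return false;
      IndexedEntryId other = (IndexedEntryId) o;
      return cacheName.equals(other.cacheName) && key.equals(other.key);
   }

   @Override
   public int hashCode() {
      return Objects.hash(cacheName, key);
   }
}

A hit carrying such an identifier would then be loaded via
cacheManager.getCache(id.cacheName()).get(id.key()).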
Any comment?
Cheers,
Sanne
11 years, 8 months

singleton @Listeners
by Mircea Markus
This is a problem that pops up constantly:
User:      "I add a listener to my distributed/replicated cache but it gets invoked numOwners times - can I make it be invoked only once cluster-wide?"
Developer: "Yes, you can! You have to do this and that..."
What about a "singleton" attribute on the Listener? It would make the reply shorter:
Developer: "Use @Listener(singleton=true)"
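Something along these lines - just a sketch, since "singleton" is not an existing @Listener attribute today:

import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryCreated;
import org.infinispan.notifications.cachelistener.event.CacheEntryCreatedEvent;

// Hypothetical usage of the proposed attribute: the listener would be invoked
// exactly once per event across the whole cluster, instead of numOwners times.
@Listener(singleton = true)
public class ClusterSingletonListener {

   @CacheEntryCreated
   public void entryCreated(CacheEntryCreatedEvent<?, ?> event) {
      System.out.println("Entry created, notified once cluster-wide: " + event.getKey());
   }
}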
Cheers,
Mircea
12 years, 2 months

storeAsBinary keeps both the object and the byte[] - why?
by Mircea Markus
                                        Hi,
We have the following behaviour when storeAsBinary is enabled:
- when an entry is added it is initially stored in binary format (byte[])
- when it is read from an *owning node*, it is unmarshalled and the object reference is cached in memory together with the byte representation
- the object reference is only cleaned up when cache.compact() is invoked explicitly
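In code, the behaviour looks roughly like this (a sketch; configuration method names may differ slightly between versions):

import org.infinispan.Cache;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class StoreAsBinaryExample {
   public static void main(String[] args) {
      Configuration cfg = new ConfigurationBuilder()
            .storeAsBinary().enable()     // entries are kept in binary form (byte[])
            .build();
      DefaultCacheManager cm = new DefaultCacheManager(cfg);
      Cache<String, Object> cache = cm.getCache();

      cache.put("k", new java.util.Date()); // stored as byte[] only
      cache.get("k");                       // on an owning node: deserialises AND caches the object reference
      // from here on the entry is held twice (byte[] + object reference)...
      cache.compact();                      // ...until compact() is invoked explicitly
      cm.stop();
   }
}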
Assuming keys are read uniformly on all the nodes, after a while the system ends up with every entry stored twice: once as a byte[] and once as the deserialized object. Of course this can be mitigated by asking users to invoke Cache.compact(), but that's quite confusing and not very user friendly, as the user then needs to be concerned with memory management.
Can anybody think of some reasons why the value is kept twice? I mean besides optimising for local gets, which I think is not a good enough reason given the potentially huge memory consumption and the complexity added.
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
12 years, 3 months

CacheStore redesign: no XA cache stores
by Mircea Markus
                                        Hi,
I don't think we can support XA (JTA) enabled cache stores. Here's why:
- C1 (a cache store instance) runs on node N1
- a JTA tx is started on N2 which writes to (has a key that maps to) N1 - both to the DataContainer and to C1
- the JTA transaction manager running the transaction resides on N2, so it's impossible for C1@N1 (a different process) to enlist in that transaction
This limitation doesn't exist for local caches configured with a cache store, but that's rather a particular case.
Our current recovery mechanism supports situations where writing to a cache store fails during commit. If the commit fails, the user is given the opportunity to re-apply the transaction's changes, and that includes both memory and local storage.
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
12 years, 3 months

Secondary cache stores moved to their own repos
by Tristan Tarrant
                                        Dear all,
I have just completed the move of all "secondary" cache stores to their
own repos on GitHub. This affects:
bdbje, jdbm, cassandra, cloud, hbase, leveldb, jpa and mongodb
It means that their release cycle is now decoupled from that of
Infinispan itself and releases will happen on a "best-effort" basis.
Needless to say, we should not consider them abandoned at all, and we
should strive to keep them working with upstream, which is what I intend
to do, for example, as part of the upcoming "undeprecation" of the
cachestore API.
Tristan
12 years, 3 months

Re: [infinispan-dev] SimpleFileCacheStore
by Mircea Markus
Adding infinispan-dev and Martin.
I think it makes a lot of sense for QE to run the tests you suggested. 
Sent from my iPhone
On 30 Jul 2013, at 17:56, Shane Johnson <shjohnso(a)redhat.com> wrote:
> I was looking at the code for this cache store the other day.
> 
> I noticed that it is neither a log-structured nor a B/B+ tree implementation, and it lacks compaction. Maybe it is elsewhere or I simply missed it?
> 
> Have we run a long test (as in hours) with variable-length values? It looks like this implementation may be inefficient in that it would result in numerous gaps (unusable free entries) if the values grow or shrink in size. Were the tests run on SSD drives? I'd think that this type of random read/write would not be terribly efficient on an HDD.
> 
> Have we done performance testing using FileDescriptor.sync() and have we considered using an AsynchronousFileChannel instead?
> 
> Thanks,
> Shane
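For reference, the two flush strategies mentioned above, sketched with plain JDK APIs (illustrative only, not the actual cache store code):

import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class FlushStrategies {
   public static void main(String[] args) throws Exception {
      byte[] payload = "some entry".getBytes();

      // 1) Blocking write + FileDescriptor.sync(): the calling thread pays the
      //    full cost of forcing the data to the device before returning.
      try (RandomAccessFile raf = new RandomAccessFile("store.dat", "rw")) {
         raf.write(payload);
         raf.getFD().sync();
      }

      // 2) AsynchronousFileChannel: the write is handed off and completes in the
      //    background; force() is still needed for durability.
      try (AsynchronousFileChannel ch = AsynchronousFileChannel.open(
            Paths.get("store-async.dat"),
            StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
         ch.write(ByteBuffer.wrap(payload), 0).get(); // wait here only for the demo
         ch.force(false);
      }
   }
}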
                                
                         
                        
                                
12 years, 3 months

Re: [infinispan-dev] SimpleFileCacheStore
by Mircea Markus
                                        On 30 Jul 2013, at 20:03, Shane Johnson <shjohnso(a)redhat.com> wrote:
> One option might be to use a fixed key set size and simply increment the value for each key by X every time it is written. Sort of like an object with a collection: every time a nested object is added to the collection, the parent object is written to the cache.
In this example the aggregated objects should hold a foreign key to the aggregator, otherwise the object would grow indefinitely causing OOMs.
But good point nevertheless: if the size of an object cycles through 1k, 2k, 3k ... Nk, each rewrite at a larger size needs a fresh segment and leaves the previous one as an unusable gap, so the total disk capacity allocated for that entry is 1k + 2k + ... + Nk = ((1+N)*N)/2 k.
So for storing 100MB of data you'd end up with a file of roughly 5GB. On top of that, the memory consumption grows proportionally, as we keep in memory information about all the segments allocated on disk.
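A quick back-of-the-envelope check of those numbers (illustrative arithmetic only, assuming no segment reuse):

public class FragmentationEstimate {
   public static void main(String[] args) {
      int n = 100;                                // value grows 1k, 2k, ... 100k, never fitting an old slot
      long allocatedKb = (long) n * (n + 1) / 2;  // 1 + 2 + ... + 100 = 5050 KB allocated for one 100 KB entry
      double amplification = (double) allocatedKb / n;
      // ~50x amplification: 100 MB of live data occupies roughly 5 GB on disk
      System.out.printf("allocated=%d KB, live=%d KB, amplification=%.1fx%n",
            allocatedKb, n, amplification);
   }
}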
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
12 years, 3 months

requirements for the new CacheStore API
by Mircea Markus
                                        Hi,
Starting from the original document Manik rolled out a while ago [1], here's the list of requirements I'm currently aware of in the context of the new CacheStore API:
- better integration with the fluent API (CacheStore.init() is horrendous) 
- support for non-distributed transactional cache stores (1PC) and for XA-capable cache stores
- support iteration over all the keys/entries in the store
  - needed for efficient Map/Reduce integration
  - needed for efficient implementation of Cache.keySet(), Cache.entrySet(), Cache.values() methods
- a simple read(k) + write(k,v) interface to be implemented by users who just want to position ISPN as a cache between an app and a legacy system and who don't need/want to be bothered with all the other, more complex features (see the sketch after this list)
- support for expiration notification (ISPN-3064)
- support for size (efficient implementation of the cache.size() method)
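To illustrate the "simple read(k) + write(k,v)" point above, the user-facing contract could look roughly like this (names are purely illustrative, not a final SPI):

// Illustrative only: the minimal contract a user would implement to put
// Infinispan in front of a legacy system, ignoring transactions, iteration,
// expiration and the other capabilities listed above.
public interface SimpleCacheWriterLoader<K, V> {

   /** Load the value for the given key from the underlying system, or null if absent. */
   V read(K key);

   /** Create or update the value for the given key in the underlying system. */
   void write(K key, V value);
}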
Re: JSR-107 integration, I don't think we should depend on the JSR-107 API as it forces us to use JSR-107 internal structures[2] but we should at least provide an adapter layer.
[1] https://community.jboss.org/wiki/CacheLoaderAndCacheStoreSPIRedesign
[2] https://github.com/jsr107/jsr107spec/blob/v0.8/src/main/java/javax/cache/...
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
                                12 years, 3 months