Infinispan embedded off-heap cache
by yavuz gokirmak
Hi all,
Is it possible to use Infinispan as an embedded off-heap cache?
As far as I understand, this is not implemented yet.
If this is the case, we are planning to put effort into developing an off-heap embedded
cache.
I would really like to hear your advice.
best regards
10 years, 9 months
Design change in Infinispan Query
by Sanne Grinovero
Hello all,
currently Infinispan Query is an interceptor registered on the
specific Cache instance which has indexing enabled; each such
interceptor does everything it needs to do solely within the scope of the
cache it was registered in.
If you enable indexing - for example - on 3 different caches, there
will be 3 different Hibernate Search engines started in the background,
and they are all unaware of each other.
After some design discussions with Ales for CapeDwarf, and also to
call attention to something that has bothered me for some time, I'd
like to evaluate the option of having a single Hibernate Search engine
registered in the CacheManager and shared across the indexed caches.
Current design limitations:
A- If they are all configured to use the same base directory to
store indexes, and happen to have same-named indexes, they'll share
the index without being aware of each other. This is going to break
unless the user configures some tricky parameters, and even so
performance won't be great: instances will lock each other out, or at
best write in alternate turns.
B- The search engine isn't particularly "heavy"; still, it would be
nice to share some components and internal services.
C- Configuration details which need some care - like injecting a
JGroups channel for clustering - need to be done separately for each
instance (so large parts of the configuration would be quite similar but
not identical).
D- Incoming messages into a JGroups Receiver need to be routed not
only among indexes, but also among Engine instances. This prevents
Query from reusing code from Hibernate Search.
Problems with a unified Hibernate Search Engine:
1#- Isolation of types / indexes. If the same indexed class is
stored in different (indexed) caches, they'll share the same index. Is
it a problem? I'm tempted to consider this a good thing, but wonder if
it would surprise some users. Would you expect that?
2#- Configuration format overhaul: indexing options won't be set in
the cache section but in the global section. I'm looking forward to
using the schema extensions anyway to provide a better configuration
experience than the current <properties />.
3#- Assuming 1# is fine, when a search hit is found I'd need to be
able to figure out from which cache the value should be loaded.
3#A we could have the cache name encoded in the index, as part
of the identifier: {PK,cacheName} (a sketch of such a composite identifier follows below)
3#B we actually shard the index, keeping a physically separate
index per cache. This would mean searching on the joint index view but
extracting hits from specific indexes to keep track of which index each
hit came from. I think we can do that, but it's definitely tricky.
It's likely easier to keep indexed values from different caches in
different indexes. That would mean rejecting 1# and altering the user-defined
index name, for example by appending the cache name to the user-defined
string.
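To make option 3#A a little more concrete, here is a rough sketch of what such a composite identifier could look like. This is purely illustrative: the class and field names are invented for this example and nothing like it exists in the codebase today.

import java.io.Serializable;
import java.util.Objects;

// Illustration only of option 3#A, not existing Infinispan Query code: the
// document id stored in the index pairs the entry's key ("PK") with the name
// of the cache it came from, so a search hit knows which cache to load from.
final class IndexedDocumentId implements Serializable {
   final Object key;          // the cache entry's key
   final String cacheName;    // which indexed cache holds the value

   IndexedDocumentId(Object key, String cacheName) {
      this.key = key;
      this.cacheName = cacheName;
   }

   @Override
   public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof IndexedDocumentId)) return false;
      IndexedDocumentId other = (IndexedDocumentId) o;
      return key.equals(other.key) && cacheName.equals(other.cacheName);
   }

   @Override
   public int hashCode() {
      return Objects.hash(key, cacheName);
   }
}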
Any comment?
Cheers,
Sanne
10 years, 10 months
singleton @Listeners
by Mircea Markus
This is a problem that pops up constantly:
User: "I add a listener to my distributed/replicated cache but this gets invoked numOwners times - can I make that to be invoked only once cluster wise?"
Developer: "Yes, you can! You have to do that and that..."
What about a "singleton" attribute on the Listener? Would make the reply shorter:
Developer: "Use @Listener(singleton=true)"
Cheers,
Mircea
11 years, 4 months
L1 Consistency with Sync Caches
by William Burns
First off I apologize for the length.
There have been a few Jiras recently that have identified L1 consistency
issues with both TX and non TX sync caches. Async caches with L1 have
their own issues as well, but I only wanted to talk about sync caches.
https://issues.jboss.org/browse/ISPN-3197
https://issues.jboss.org/browse/ISPN-2965
https://issues.jboss.org/browse/ISPN-2990
I have proposed a solution in
https://github.com/infinispan/infinispan/pull/1922 which should start L1
consistency down the right track. There are quite a few comments on it if
you want to look into it more, but because of that I am moving this to the
dev mailing list.
The key changes in the PR are the following (non-tx):
1. Concurrent reads for a key that can retrieve a remote value are
"corralled" into a single thread of execution for that given key. This
would reduce network traffic with concurrent gets for the same key. Note
the "corralling" only happens on a per key basis.
2. The single thread that is doing the remote get would update the L1 if
able (without locking) and make the value available to all the requests
waiting on the get.
3. Invalidations that are received would first check to see if there is a
remote get currently occurring for their keys. If there is, they will attempt
to cancel the L1 write(s) before they occur. If the L1 write cannot be
cancelled, the invalidation must also wait for the current remote get to
complete and subsequently run. Note the cancellation would only fail when
the remote get is already done and is in the middle of updating the L1, so
this would be a very small window.
4. Local writes will also do the same thing as invalidations, cancelling or
waiting. Note that non-tx local writes only do L1 invalidations and don't
write the value to the data container; the reasons why can be found at
https://issues.jboss.org/browse/ISPN-3214
5. Writes that require the previous value and don't have it in the L1 would
also do their get operations using the same "corralling" method.
Points 4 and 5 are not currently implemented in the PR.
This approach would use no locking for non-tx caches for any L1 operations.
The synchronization point would be provided by the "corralling" method,
with invalidations/writes communicating with it.
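To illustrate the idea, here is a minimal sketch of the per-key "corralling"; it is not the code from the PR, and the Function-based remote get plus the map bookkeeping are simplifications invented for this example.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Sketch only: concurrent reads of the same key share a single in-flight
// remote get instead of each issuing their own.
class CorrallingReader<K, V> {
   private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

   V get(K key, Function<K, V> remoteGet) {
      CompletableFuture<V> mine = new CompletableFuture<>();
      CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
      if (existing != null) {
         return existing.join();         // corralled: wait for the in-flight remote get
      }
      try {
         V value = remoteGet.apply(key); // only this thread goes remote for the key
         // ...the real code would also attempt the non-locking L1 update here,
         // unless an invalidation managed to cancel it in the meantime...
         mine.complete(value);
         return value;
      } catch (RuntimeException e) {
         mine.completeExceptionally(e);  // wake up corralled readers with the failure
         throw e;
      } finally {
         inFlight.remove(key, mine);     // later gets for this key go remote again
      }
   }
}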
Transactional caches would do almost the same thing as non-tx. Note that these
changes have not been implemented in any way yet.
1. Gets would now update the L1 immediately after retrieving the value
without locking, but still using the "corralling" technique that non-tx
does. Previously the L1 update from a get was transactional. This
would actually remedy issue [1].
2. Writes currently acquire the remote lock when committing, which is why
tx caches are able to update the L1 with the value. Writes would use the
same cancellation/wait method as non-tx.
3. Writes that require the previous value and don't have it in the L1 would
also do their get operations using the same method.
4. For tx caches, [2] would also have to be done.
[1] -
https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&page=...
[2] - https://issues.jboss.org/browse/ISPN-1540
Also rehashing is another issue, but we should be able to acquire the state
transfer lock before updating the L1 on a get, just like when an entry is
committed to the data container.
Any comments/concerns would be appreciated.
Thanks,
- Will
11 years, 6 months
protobuf as a marshalling format for infinispan remote-query
by Adrian Nistor
Hi all,
as some of you probably know, protobuf was chosen as the serialization format
for remote query [1]. The main reason for choosing it was that it is
/simple/, time tested, has good support for schema evolution, is nearly
ubiquitous (some people say google uses it :)), has multi-language
support and, most importantly, it mandates the existence of a schema for
our objects - the proto file. We need that schema on the server side to
be able to extract indexable fields from those binary cache values and
index them without the need to unmarshal them into plain java domain
objects.
And I need to stress that the format was chosen, rather than the actual
API/library provided by Google [2]. While the protobuf wire format is
superb, Google's approach to creating a library for marshalling objects
to/from a protobuf stream is heavily based on code generation via the
protoc tool. Both the marshalling code /and the entities/ to be
marshaled (your beloved domain model) are generated. This does not work
well if you want to bring your own domain classes to the party.
So what we did was create a small set of support classes on top of
google's low-level wire format classes to assist users in marshalling
their own domain model to the protobuf wire format without using
google's code generator. These attempts are hosted on a small github
project experiment [3] that will be moved to infinispan once we have a
conclusion. This project contains several modules demonstrating the
attempts. [4] explains the purpose of each module. So far, the approach
in module /stream-like/ offers the best user experience. I would name
this plan A, which is going to be implemented for infinispan 6.0.
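To give a flavour of the stream-like approach, here is a hypothetical sketch. The interfaces and names below are made up for illustration only - the real API lives in the /stream-like/ module of [3] and may look different.

import java.io.IOException;

// Assumed, simplified writer/reader abstractions standing in for the support
// classes layered over google's low-level wire format classes.
interface ProtoWriter {
   void writeInt(String field, int value) throws IOException;
   void writeString(String field, String value) throws IOException;
}

interface ProtoReader {
   int readInt(String field) throws IOException;
   String readString(String field) throws IOException;
}

// The user brings their own domain class...
class User {
   final int id;
   final String name;
   User(int id, String name) { this.id = id; this.name = name; }
}

// ...and writes the marshaller by hand; no protoc-generated entities involved.
class UserMarshaller {
   void writeTo(ProtoWriter writer, User user) throws IOException {
      writer.writeInt("id", user.id);          // field names map to the .proto schema
      writer.writeString("name", user.name);
   }
   User readFrom(ProtoReader reader) throws IOException {
      return new User(reader.readInt("id"), reader.readString("name"));
   }
}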
Plan A can also use the classes generated by google's protoc code
generator tool. So if somebody prefers to go old school it can still
work well. Also, the amount of new support code we added on top of
google's library is small, so I don't foresee any nightmare in porting
this to another language.
I would like to get some feedback from anyone who has some time to have
a look at [3], specifically at the /stream-like/ module.
Stream-like is about 89.9% implemented. There are still some
unimplemented methods but do not mind them for the review.
And finally, could anyone suggest a better name for the stream-like
package? I can't think of any other option except /streamlike/ :) (which
might be trademarked). Any other options? If there are none, we'll
just call it /marshaling/. I'll move this to the ispn branch once I know
the package name :)
Have a nice weekend guys!
--------------
[1] https://community.jboss.org/wiki/RemoteQueryDesignInInfinispan
[2] https://developers.google.com/protocol-buffers/
[3] https://github.com/anistor/protobuf-playground
[4] https://github.com/anistor/protobuf-playground/blob/master/README.md
11 years, 6 months
L1 Data Container
by William Burns
All the L1 data for a DIST cache is stored in the same data container as
the actual distributed data itself. I wanted to propose breaking this out
so there is a separate data container for the L1 cache as compared to the
distributed data.
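As a rough illustration of the idea (not Infinispan's actual DataContainer SPI - the class and methods below are invented for this sketch), L1 entries and owned entries would go into two independently bounded containers:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: two separate maps so L1 growth cannot evict real distributed
// data, and each side could be sized and evicted independently.
class SplitContainer<K, V> {
   private final ConcurrentMap<K, V> owned = new ConcurrentHashMap<>();
   private final ConcurrentMap<K, V> l1 = new ConcurrentHashMap<>();

   void put(K key, V value, boolean isL1) {
      (isL1 ? l1 : owned).put(key, value);   // route to the right container
   }

   V get(K key) {
      V v = owned.get(key);
      return v != null ? v : l1.get(key);    // the owned copy wins over the L1 copy
   }

   void invalidateL1(K key) {
      l1.remove(key);                        // L1 invalidations never touch owned data
   }
}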
I thought of a few quick benefits/drawbacks:
Benefits:
1. L1 cache can be separately tuned - L1 maxEntries for example
2. L1 values will not cause eviction of real data
3. Would make https://issues.jboss.org/browse/ISPN-3229 an easy fix
4. Could add a new DataContainer implementation specific to L1 with
additional optimizations
5. Would help with some concurrency issues with L1 without requiring wider
locking (such as locking a key for an entire ClusteredGet rpc call) -
https://issues.jboss.org/browse/ISPN-3197.
Drawbacks:
1. Would require, depending on configuration, an additional thread for
eviction
2. Users upgrading could see up to double the memory used due to the two data containers
Both?:
1. Additional configuration available
a. Add maxEntries just like the normal data container (use the data
container size if not configured?)
b. Eviction wakeup timer? We could just reuse the task cleanup
frequency?
c. Eviction strategy? I would think the default data container's would
be sufficient.
I was wondering what you guys thought.
Thanks,
- Will
11 years, 6 months
Cachestores performance
by Radim Vansa
Hi all,
following up on [1], I've created a comparison of cache store performance in stress tests.
All setups used a local cache; the benchmark was executed via Radargun (actually a version not yet merged into master [2]). I used 4 nodes just to get more data - each slave was completely independent of the others.
The first test was preloading performance - the cache started and tried to load 1 GB of data from the hard drive. Without a cache store the startup takes about 2 - 4 seconds; average numbers for the cache stores are below:
FileCacheStore: 9.8 s
KarstenFileCacheStore: 14 s
LevelDB-JAVA impl.: 12.3 s
LevelDB-JNI impl.: 12.9 s
IMO nothing special, all times seem affordable. We don't benchmark storing the data into the cache store itself, but for the record: FileCacheStore took about 44 minutes, Karsten about 38 seconds, LevelDB-JAVA 4 minutes and LevelDB-JNI 96 seconds. The units are right: it's minutes compared to seconds. But we all know that FileCacheStore is bloody slow.
The second test is a stress test (5 minutes, preceded by a 2-minute warmup) where each of 10 threads works on 10k entries with 1 kB values (~100 MB in total); 20 % writes, 80 % reads, as usual. No eviction is configured, so the cache store works only as persistent storage for the case of a crash.
FileCacheStore: 3.1M reads/s 112 writes/s // on one node the performance was only 2.96M reads/s 75 writes/s
KarstenFileCacheStore: 9.2M reads/s 226k writes/s // yikes!
LevelDB-JAVA impl.: 3.9M reads/s 5100 writes/s
LevelDB-JNI impl.: 6.6M reads/s 14k writes/s // on one node the performance was 3.9M/8.3k - about half of the others
Without cache store: 15.5M reads/s 4.4M writes/s
The Karsten implementation pretty much rules here, for two reasons. First of all, it does not flush the data (it calls only RandomAccessFile.write()). The other cheat is that it keeps in memory the keys and the offsets of the data values in the database file. Therefore it's definitely the best choice for this scenario, but it does not let the cache store scale, especially in cases where the keys are big and the values small. However, this performance boost is definitely worth investigating - I could imagine caching the disk offsets in memory and querying a persistent index only in case of a missing record, with part of the persistent index flushed asynchronously (the index can always be rebuilt during preloading in case of a crash).
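As a sketch of that technique (not Karsten's actual code - all names below are invented for the example), appending values with RandomAccessFile.write() and keeping a key -> offset index purely in memory looks roughly like this:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: values are appended to a single file and never explicitly
// flushed; reads need one in-memory lookup plus one seek.
class AppendOnlyStore {
   private static final class Location {
      final long offset; final int length;
      Location(long offset, int length) { this.offset = offset; this.length = length; }
   }

   private final RandomAccessFile file;
   private final ConcurrentMap<String, Location> index = new ConcurrentHashMap<>();

   AppendOnlyStore(String path) throws IOException {
      this.file = new RandomAccessFile(path, "rw");
   }

   synchronized void put(String key, byte[] value) throws IOException {
      long offset = file.length();
      file.seek(offset);
      file.write(value);                       // no flush: the OS decides when data hits disk
      index.put(key, new Location(offset, value.length));
   }

   synchronized byte[] get(String key) throws IOException {
      Location loc = index.get(key);
      if (loc == null) return null;
      byte[] buf = new byte[loc.length];
      file.seek(loc.offset);
      file.readFully(buf);
      return buf;
   }
}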
The third test covers the scenario with more data to store than fits in memory - the stressors operated on 100k entries (~100 MB of data) but eviction was set to 10k entries (9216 entries ended up in memory after the test ended).
FileCacheStore: 750 reads/s 285 writes/s // one node had only 524 reads and 213 writes per second
KarstenFileCacheStore: 458k reads/s 137k writes/s
LevelDB-JAVA impl.: 21k reads/s 9k writes/s // a bit varying performance
LevelDB-JNI impl.: 13k-46k reads/s 6.6k-15.2k writes/s // the performance varied a lot!
100 MB of data is not much, but it takes so long to push it into FileCacheStore that I won't use more unless we exclude this loser from the comparison :)
Radim
[1] https://community.jboss.org/wiki/FileCacheStoreRedesign
[2] https://github.com/rvansa/radargun/tree/t_keygen
-----------------------------------------------------------
Radim Vansa
Quality Assurance Engineer
JBoss Datagrid
tel. +420532294559 ext. 62559
Red Hat Czech, s.r.o.
Brno, Purkyňova 99/71, PSČ 612 45
Czech Republic
11 years, 6 months
New bundler performance
by Radim Vansa
Hi,
I was going through the commits (running tests on each of them) to find the performance regression we recently discovered, and it seems that our test (replicated UDP non-transactional stress test on 4 nodes) shows a serious regression starting with the commit
ISPN-2848 Use the new bundling mechanism from JGroups 3.3.0 (73da108cdcf9db4f3edbcd6dbda6938d6e45d148)
The performance drops from about 7800 writes/s to 4800 writes/s, and from 1.5M reads/s to 1.2M reads/s (having slower reads in replicated mode is really odd).
It seems that the bundler is not as good as we had hoped - it may be the bottleneck. I have tried to create another bundler which shares the queue between 4 instances of TransportQueueBundler (so 4 threads are actually sending the messages that go into one queue), and the performance improved mildly - to 5200 writes/s - but that's not enough.
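For reference, the experiment was roughly along these lines (a simplified sketch, not the actual JGroups bundler code; Message and Transport stand in for the real JGroups types):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: several sender threads drain one shared queue, so the threads
// producing messages never wait on a single sender.
class SharedQueueBundler {
   interface Message { }
   interface Transport { void send(List<Message> batch); }

   private final BlockingQueue<Message> queue = new ArrayBlockingQueue<>(8192);

   SharedQueueBundler(Transport transport, int senderThreads) {
      for (int i = 0; i < senderThreads; i++) {          // e.g. 4 senders sharing one queue
         Thread t = new Thread(() -> {
            List<Message> batch = new ArrayList<>();
            try {
               while (true) {
                  batch.clear();
                  batch.add(queue.take());               // block until at least one message
                  queue.drainTo(batch);                  // then bundle whatever else is queued
                  transport.send(batch);
               }
            } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
            }
         }, "bundler-sender-" + i);
         t.setDaemon(true);
         t.start();
      }
   }

   void send(Message msg) throws InterruptedException {
      queue.put(msg);                                    // callers just enqueue
   }
}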
Radim
Note: you may have seen my conversation with Pedro Ruivo on IRC about the bundler several days ago; at that time our configuration still used the old bundler. That was fixed, but as I had not built Infinispan properly (something got cached), I did not notice the regression between those builds.
-----------------------------------------------------------
Radim Vansa
Quality Assurance Engineer
JBoss Datagrid
tel. +420532294559 ext. 62559
Red Hat Czech, s.r.o.
Brno, Purkyňova 99/71, PSČ 612 45
Czech Republic
11 years, 6 months