Configuration visitor - Re: [JBoss JIRA] Commented: (ISPN-145) No transport and singleton store enabled should not be allowed
by Vladimir Blagojevic
Hi,
Galder and I talked about this offline. Time to involve you guys!
I just completed a visitor pattern for our configuration objects. The visitor
is passed from the root of the configuration - the InfinispanConfiguration object.
The InfinispanConfiguration class has a new method:
public void accept(ConfigurationBeanVisitor v)
How do we want to integrate this visitor into existing structure?
1) We add a new factory method to InfinispanConfiguration with an
additional ConfigurationBeanVisitor parameter
2) We leave everything as is, and if there is a need to pass some visitor
we pass it to the InfinispanConfiguration instance directly (from
DefaultCacheManager)
DefaultCacheManager will pass a ValidationVisitor to
InfinispanConfiguration that will verify the configuration semantically.
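To make this concrete, here is a rough sketch of what the validation hook
could look like. The visitor callbacks and configuration getters below are
made-up names for illustration only, not the actual API:

// Sketch only: callback and getter names are assumptions.
public class ValidationVisitor implements ConfigurationBeanVisitor {

   private boolean hasTransport;

   public void visitInfinispanConfiguration(InfinispanConfiguration bean) {
      // a null transport class would mean the cache manager is local-only
      hasTransport = bean.parseGlobalConfiguration().getTransportClass() != null;
   }

   public void visitSingletonStoreConfig(SingletonStoreConfig bean) {
      // ISPN-145: a singleton store without a transport makes no sense
      if (bean.isSingletonStoreEnabled() && !hasTransport)
         throw new ConfigurationException(
               "Singleton store enabled but no transport has been configured");
   }
}

// DefaultCacheManager (option 2) would then simply call:
//    configuration.accept(new ValidationVisitor());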
Regards,
Vladimir
On 09-09-09 10:19 AM, Galder Zamarreno wrote:
> Good idea :)
>
> On 09/09/2009 04:13 PM, Vladimir Blagojevic wrote:
>> Yeah,
>>
>> I was thinking that we can make a visitor for configuration tree and
>> then you can do verification of any node and other things as well. Use
>> cases will come up in the future for sure.
>>
>> Cheers
>>
>>
>>
>> On 09-09-09 3:29 AM, Galder Zamarreno (JIRA) wrote:
>>> [ https://jira.jboss.org/jira/browse/ISPN-145?page=com.atlassian.jira.plugi... ]
>>>
>>> Galder Zamarreno commented on ISPN-145:
>>> ---------------------------------------
>>>
>>> Not sure I understand what you mean by generic though. You mean any
>>> component to have a validation step of some sort?
>>>
>>> Thanks for taking this on :)
>>>
>>>> No transport and singleton store enabled should not be allowed
>>>> --------------------------------------------------------------
>>>>
>>>> Key: ISPN-145
>>>> URL: https://jira.jboss.org/jira/browse/ISPN-145
>>>> Project: Infinispan
>>>> Issue Type: Bug
>>>> Components: Loaders and Stores
>>>> Affects Versions: 4.0.0.ALPHA6
>>>> Reporter: Galder Zamarreno
>>>> Assignee: Vladimir Blagojevic
>>>> Priority: Minor
>>>> Fix For: 4.0.0.CR1
>>>>
>>>>
>>>> Throw configuration exception if singleton store configured without
>>>> transport having been configured.
>>>> It makes no sense to have singleton store enabled when there's no
>>>> transport.
>>
>
13 years, 3 months
Using Coverity scan?
by Sanne Grinovero
Hello,
Did you consider enabling Infinispan to be monitored by Coverity's
code analysis services? They are free for OSS projects; I saw a demo
recently and was quite amazed. It's similar to FindBugs, but not only
about static code checks. They check out your code from trunk and then
run several analyses on it periodically; one of them instruments the
code and looks at dynamic thread behavior to predict deadlocks or
missing fences, and they produce nice public reports. AFAIK you don't
need to set up anything yourself, besides getting in touch to ask for it.
It's only available for C and Java code, and they have an impressive
list of OSS projects in the C world (linux kernel, httpd server,
samba, gnome, GCC, PostgreSQL, ...) but not much on Java.
http://scan.coverity.com/
No, I'm not affiliated :-) Just thinking that it might be useful to
have if it's not too hard to set up.
Cheers,
Sanne
14 years
Server location hints in Infinispan
by Manik Surtani
This relates to https://jira.jboss.org/jira/browse/ISPN-180.
In JBoss Cache, we had a provision to allow for pluggable buddy selection algorithms. By default, the buddy selection process would first try to pick a buddy in the same buddy group, failing which any buddy *not* on the same physical machine, failing which any buddy not in the same JVM, and finally any buddy at all. Further, since the mechanism was pluggable, people could write their own buddy selection algorithms to pick buddies based on any additional metrics, such as machine performance, by hooking into monitoring tools, etc.
In Infinispan we do not have an equivalent as yet. The consistent hash approach to distribution takes a hash of each server's address and uses this to place the server on a consistent hash wheel. Owners for keys are picked based on consecutive places on the wheel. So there is every possibility that nodes on the same physical host or rack are selected to back each other up, which is not optimal for data durability.
One approach is for each node to provide additional hints as to where it is - hints including "machine id", "rack id" and maybe even "site id". The hash function that calculates an address's position on the hash wheel would take these 3 metrics into account, so this should be robust and pretty efficient. The only drawback with this approach is that, for each address, this additional data needs to be globally available, since consistent hashes need to work globally and deterministically. This information could be a part of a DIST JOIN request, which would work well.
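As a rough sketch of the hint idea (the interface and the combining function below are illustrative only, not a concrete design):

// Illustrative only: an address enriched with location hints.
public interface TopologyAwareAddress extends Address {
   String getSiteId();
   String getRackId();
   String getMachineId();
}

// The consistent hash could derive the wheel position from the address plus
// its hints; since the hints travel with the address (e.g. in the DIST JOIN
// request), every node computes the same position deterministically.
int positionOnWheel(TopologyAwareAddress a, int hashSpace) {
   String key = a.getSiteId() + "|" + a.getRackId() + "|" + a.getMachineId() + "|" + a;
   return (key.hashCode() & Integer.MAX_VALUE) % hashSpace;
}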
What do people think? Any interesting alternate approaches to this problem?
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
14 years, 7 months
HotRod, ClientIntelligence and client-side key location
by Manik Surtani
I've been thinking about how we handle this, and I think we have a problem with smart clients - clients that have the ability to locate the key on the server cluster in order to direct the request to the specific node.
The problem is in hash code calculation. The HotRod protocol caters for this with regard to calculating node address hash codes by passing them in the topology map (see "Hasher Client Topology Change Header" in [1]), but the only way this can be meaningfully used is if the client can calculate the hash code of the key in the same manner the servers do. First, this is hard when consumed by non-Java clients, as you'd need to implement the way the JDK calculates the hash code of a byte array. Second, you'd need detailed and specific knowledge of any bit spreading that takes place within Infinispan - and this is internal implementation detail which may change from release to release.
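For reference, this is roughly what a non-Java client would have to reproduce just to match java.util.Arrays.hashCode(byte[]), before even considering any internal bit spreading (shown in Java only to illustrate the semantics every client language would need to copy):

// Equivalent of java.util.Arrays.hashCode(byte[]): a 31-based polynomial over
// the *signed* byte values with 32-bit overflow semantics.
static int jdkByteArrayHashCode(byte[] a) {
   if (a == null) return 0;
   int result = 1;
   for (byte b : a)
      result = 31 * result + b;
   return result;
}
// ...and on top of this, whatever bit spreading Infinispan applies internally,
// which is implementation detail and may change between releases.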
So the way I see it I can't see how non-Java clients will be able to locate keys and then direct requests to the necessary nodes. In fact, even with Java clients the only way this could be done would be to send back marshalled Addresses in the topology map, *and* have the same version of the Infinispan server libs installed on the client, *and* ensure that the same JDK/JVM version is used on the client.
Can we think of a better way to do this? If not, is it worth still supporting client-side consistent hash based key location for the weird but vaguely workable scenario for Java-based clients?
Thoughts?
Cheers
Manik
[1] http://community.jboss.org/wiki/HotRodProtocol
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
14 years, 8 months
JBMAR issue with nested objects
by Galder Zamarreno
Hi David,
A user in the Infinispan forum reported an issue in JBoss Marshalling
(http://community.jboss.org/message/534814#534814). I've condensed the
test into a JBoss Marshalling test that you can find attached. Could you
please have a look at it? I've tested this with JBMAR 1.2.0.GA.
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
14 years, 9 months
More thoughts on HotRod's ForceReturnPreviousValue
by Galder Zamarreno
Hi all,
I've been thinking further about ForceReturnPreviousValue flag in
http://community.jboss.org/wiki/HotRodProtocol.
Currently, if no ForceReturnPreviousValue is passed, the responses for
put, putIfAbsent, replace, replaceIfUnmodified, remove and
removeIfUnmodified look like this:
[header]
Now, if ForceReturnPreviousValue is passed, the idea is this: responses
will contain the previous [value length][value] for that key. If the key
does not exist or the previous value was null, value length would be 0.
Otherwise, if no ForceReturnPreviousValue was sent, the response would be empty.
However, as it is, this means that for a put the response would be
either [header] or [header][value length][value]. So, in effect, the
client decoder needs to know about the request itself to be able to
determine whether it needs to read the value length or not. I wonder
whether this could be imposing some restrictions on the client decoder.
So, instead, I was wondering whether this might make more sense and make
implementation of the client decoder easier.
put, putIfAbsent, replace, replaceIfUnmodified, remove and
removeIfUnmodified responses look like this:
[header][value length]
*If no ForceReturnPreviousValue is sent, value length will always be 0*.
If ForceReturnPreviousValue is passed and key was not present, or previous
value was null, value length is 0. Otherwise, value length will be non
zero and value will follow.
The idea here is that these ops responses always return a value length,
regardless of whether the flag is present or not. This, I think, would
make client dcoders easier since they always know what they need to read
and can pass it back without knowing which flags were passed on the
request.
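A minimal sketch of why the fixed layout helps (hypothetical helper and names, not the actual client code):

// Hypothetical client decoder fragment: because [value length] is always
// present, the decoder never needs to know which flags the request carried.
// imports assumed: java.io.DataInputStream, java.io.IOException
byte[] readPreviousValue(DataInputStream in) throws IOException {
   int length = readLength(in); // assumed helper for however lengths are encoded on the wire
   if (length == 0)
      return null; // flag not set, key missing, or previous value was null
   byte[] previous = new byte[length];
   in.readFully(previous);
   return previous;
}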
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
14 years, 9 months
Hash functions
by Manik Surtani
Some very early results,
http://pastie.org/883111
Only testing MurmurHash2 (endian-neutral variant which is 2x as slow as the original algo) and SuperFastHash. I haven't implemented FNV-1 as yet.
Some notes:
* Test was run over 100k random keys
* Max size for a String and byte[] key set to 16. Actual size is a random number between 1 and MaxSize.
* Functions mainly implemented to handle byte[]'s.
* Functions handle Strings by calling String.getBytes(). The bulk of the time spent in String keys is therefore attributed to String.getBytes().
* Functions handle Object hashcodes by taking the int hashcode and creating a 4-element byte[] out of it. Again, the bulk of the time spent here is in this conversion.
* Keys generated before any measurements taken, a full cycle run to warm up the hotspot compiler as well.
Looks like MurmurHash2, despite using the slower version to accommodate CPU endian neutral behaviour, is winning in terms of distribution. And by a fair way too.
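For the Object-hashcode case in the notes above, the conversion is presumably along these lines (a sketch, not the exact code used in the test - see the pastie below for that):

// Turn an Object's int hash code into a 4-element byte[] so the same
// byte[]-oriented hash functions can be applied to it.
static byte[] toBytes(Object key) {
   int h = key.hashCode();
   return new byte[] {
      (byte) (h >>> 24), (byte) (h >>> 16), (byte) (h >>> 8), (byte) h
   };
}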
For those interested in the test and the hash impls, have a look at this (dependency on Apache commons-math):
http://pastie.org/883135
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
14 years, 9 months
asynchronous change manager for high load Lucene
by Sanne Grinovero
Hello,
as I had anticipated in some chats, my current feeling with the Lucene
Directory is that I got a marvelous new device but I can't completely
use it - like in my home town, where they recently installed a new pair
of train tracks for high-speed trains to connect to the city, but they
can't afford the high-speed trains.
Or, more fitting, like getting a 64-way server for a service which
can't use more than one thread because of some old dependency. You
would definitely want to solve this limitation :D
The Lucene Directory is doing well in read-mostly situations, and is a
nice way to easily sync huge indexes in dynamic-topology clusters.
But when the many participant nodes all potentially apply changes, a
global index lock needs to be acquired; this is safely achievable by
using the LockFactory implementations provided in the lucene-directory
package, but of course it doesn't scale.
In practice, it's worse than a performance problem: Lucene expects
full ownership of all resources, so the IndexWriter implementation
doesn't expect the Lock to time out and has no notion of fairness in
the wait process. If your architecture does something like "each
node can ask for the lock and apply changes" without some external
coordination, this only works while contention on this lock is
low enough; if it piles up, it will blow up the application with
indexing exceptions.
The best solution I've seen so far is what Emmanuel implemented years ago
for Hibernate Search: using JMS to send changes to a single master
node. Currently I think the state-of-the-art installation should
combine such a queue-like solution, delegating all changes to a single
node, with that single node applying the changes to an Infinispan
Directory - thus making changes visible to all other nodes through
efficient Infinispan distribution/replication.
Replication was done in the past by using an rsync-like file copy, so
the new benefit would be to ease the setup of Directory replication,
but you still need a dedicated master node and the work of setting this
up.
Now I would like to use Infinispan to replace the JMS approach,
especially as in cloud environments it's useful that the different
participants which make up the service are all equally configured:
having a Master Writer node is fine as long as it's auto-elected.
(not mandatory, just very nice to have)
The problems to solve:
A) Locking
The current locking solution is implemented by atomically adding a
marker value in the cache. I can't use transactions as the lock could
span several transactions and must be visible.
It's Lucene's responsibility to clear this lock properly, and I can
trust it for that as far as the code can do. But what happens if the
node dies? Other nodes taking over the writer role should be able to
detect the situation and remove the lock.
Proposals:
- A1) We don't lock, but find a way to elect one and only one writer
node. This would be very cool, but I have no idea about how to
implement it.
- A2) We could store the node address as the value of the marker entry;
if that address isn't part of the members view, the lock is cleared.
(Can I trust the members view? The index will corrupt if it's changed
by two nodes.)
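A rough sketch of A2, assuming the lock is a marker entry in a cache and
that we can obtain the local Address and the current members list from the
cache manager (names below are illustrative):

// Illustrative sketch of A2: the lock value records which node holds it, so a
// node taking over the writer role can clear a lock left behind by a dead member.
boolean tryAcquireWriteLock(Cache<String, Address> lockCache, String lockKey,
                            Address self, List<Address> currentMembers) {
   Address owner = lockCache.putIfAbsent(lockKey, self);
   if (owner == null || owner.equals(self))
      return true; // lock acquired, or we already held it
   if (!currentMembers.contains(owner)) {
      // previous owner is no longer in the view: steal the lock, but only
      // if nobody else stole it first
      return lockCache.replace(lockKey, owner, self);
   }
   return false; // lock held by a live member
}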
B) Sending changes to writer
By Lucene's design the IndexWriter is thread-safe and is a heavy object
to build, so it should be reused as much as possible to insert
many Documents at once. So when a node manages to acquire the Lock it
should keep the IndexWriter open for a relatively long time, and possibly
receive changes from other nodes to be applied to the index.
Proposal:
- B1) Use JMS or JGroups directly (Hibernate Search is currently
capable of using both); again I face the problem of IndexWriter node
election and of having the messages sent to the correct node.
In both cases I would like to receive enough information from
Infinispan to know where to send the messages from the queue.
- B2) Exploit the ConcurrentMap nature of Infinispan: I don't need
strict ordering of change requests if we can make an assumption about the
Lucene Documents: that each document is identifiable.
This is usually the case, and always is in Hibernate Search, where
each Document entry is identified by (typeOfEntity, PrimaryKey).
Assuming we find out how to start the IndexWriting process on a
single node, we could have a Cache to store change requests on the
index: for each changed entity I would insert a key made of
(documentID, typeOfOperation) and a value containing, where applicable,
the new Document to be written plus some timestamp/counter. typeOfOperation
could be "delete" or "add", which are the only operations supported by
Lucene. I believe I could read the timestamp from the entry?
The timestamp would be needed to recognize what to do in case of
having both a delete and an add operation on the same entity (add, delete ->
noop; delete, add -> update).
So the node which is running the Lucene IndexWriting task could
periodically iterate over all entries of this cache and apply the latest
needed operations, using an atomic removeIfUnchanged. Overwritten entries
would be fine, even better, as I would only write the latest version of a
Document in case it's being changed quicker than what we can write out.
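To make this concrete, a rough sketch of the writer node's periodic pass;
ChangeKey and ChangeValue are hypothetical types for (documentID,
typeOfOperation) and (timestamp, Document), and error handling/batching is
omitted:

// Hypothetical sketch of the single writer node draining the change cache.
void applyPendingChanges(Cache<ChangeKey, ChangeValue> changes, IndexWriter writer)
      throws IOException {
   for (Map.Entry<ChangeKey, ChangeValue> e : changes.entrySet()) {
      ChangeKey k = e.getKey();
      ChangeValue v = e.getValue();
      if (k.isDelete())
         writer.deleteDocuments(new Term("id", k.getDocumentId())); // "id" field name is assumed
      else
         writer.addDocument(v.getDocument());
      // atomic removeIfUnchanged: only remove if no newer change overwrote the entry
      changes.remove(k, v);
   }
   writer.commit();
}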
The drawback of this approach is that, while it will buffer temporary
spikes of load, it assumes we are on average fast enough to write out
all changes, and that changes on different entities are applied in
unpredictable order. Still, I like this solution the most as it looks
like it can best use Infinispan for maximum performance.
Thoughts? Problems?
I'd also appreciate an example of a job that needs to run on a single
node only, if you have one, and would love to avoid depending on
anything more than Infinispan.
Cheers,
Sanne
14 years, 9 months
Hot Rod protocol design at CR stage. Feedback welcome
by Galder Zamarreno
Hi all,
After having yet another round of discussions, we consider the Hot Rod
protocol design to be at CR stage. Here are the final results:
http://community.jboss.org/wiki/HotRodProtocol
While Mircea and I are busy coding both the server and the client, please
take some time to read through it and let us know what you think.
By the way, remember that we've deferred certain functionality for the
moment as indicated in a previous email:
[ISPN-375] Enable Hot Rod clients to start transactions [Open, Major,
Galder Zamarreno] http://jira.jboss.org/jira/browse/ISPN-375
[ISPN-374] Add event handling to Hot Rod [Open, Major, Galder Zamarreno]
http://jira.jboss.org/jira/browse/ISPN-374
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
14 years, 9 months
Infinispan public API
by Manik Surtani
Mircea and I were chatting about the HotRod client that he is writing, and we agreed that the best approach as far as API is concerned is for RemoteCache to extend Cache, RemoteCacheManager to extend CacheManager, etc., to maintain familiarity with p2p-style interaction with Infinispan as well as some ease in switching between interaction styles.
One impact of this is that the HotRod client would then have a dependency on infinispan-core. So the question is, is this OK? Pulling in a large number of classes that won't really be used except for the super-interfaces?
One alternative is to pull the public API interfaces out of infinispan-core into a very light infinispan-api module. This way the HotRod client can just add a dependency on this jar, but it would mean infinispan-core has a dependency on this jar as well. One side-effect benefit here is that public API will be very clearly defined.
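A minimal sketch of what the split might look like (module layout and package names are illustrative only):

// In a hypothetical infinispan-api module:
public interface Cache<K, V> extends java.util.concurrent.ConcurrentMap<K, V> {
   // ...the public API methods...
}

// In the HotRod client module, depending only on infinispan-api:
public interface RemoteCache<K, V> extends Cache<K, V> {
   // remote-only additions, e.g. flags or forced return of previous values
}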
Thoughts?
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
14 years, 9 months