Avro for basic type marshalling (ISPN-508)
by Galder Zamarreño
Hi,
Re: https://jira.jboss.org/browse/ISPN-508
I've run some small tests with simple data types; you can find the results in http://spreadsheets.google.com/pub?key=0Ag5RGdzR_GsldDdOVnpwbDUyT20xZlZPZ...
After some consideration, I'm going for Apache Avro for basic type marshalling. It offers the best compromise between small payloads and a simple marshaller implementation. I was torn between Avro and Protobufs, particularly when Protobufs is encoded manually. However, Avro offers a higher-level, easier-to-read and simpler implementation (http://pastebin.mozilla.org/759797) compared to manual Protobufs (http://pastebin.mozilla.org/759800). On top of that, Avro supports two extra languages that Protobufs doesn't: C and Ruby. Finally, Avro produces the same payload sizes as the most efficient Protobufs encoding.
Like manual Protobufs encoding, Avro does not require any precompilation to handle basic types/collections.
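For illustration, this is roughly what marshalling a basic type looks like with Avro's generic API, with no schema compilation or code generation involved. This is only a sketch against recent Avro releases (EncoderFactory/DecoderFactory); the class and method names here are mine, not part of any Infinispan code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.util.Utf8;

public class AvroBasicTypes {

   // Marshall a String using the built-in "string" schema - no
   // precompiled schema classes needed for basic types.
   static byte[] toBytes(String s) throws IOException {
      Schema schema = Schema.create(Schema.Type.STRING);
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
      new GenericDatumWriter<Utf8>(schema).write(new Utf8(s), encoder);
      encoder.flush();
      return out.toByteArray();
   }

   static String fromBytes(byte[] payload) throws IOException {
      Schema schema = Schema.create(Schema.Type.STRING);
      Utf8 datum = new GenericDatumReader<Utf8>(schema)
            .read(null, DecoderFactory.get().binaryDecoder(payload, null));
      return datum.toString();
   }

   public static void main(String[] args) throws IOException {
      byte[] payload = toBytes("hello");
      // Avro's binary string encoding is a zig-zag varint length
      // followed by the UTF-8 bytes, so "hello" marshalls to 6 bytes
      System.out.println(payload.length + " bytes -> " + fromBytes(payload));
   }
}
```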
Note that the choice of library at this level does not force users to use the same library for custom objects, although the basic-type library might not support a language that users choose for custom types, which could be a problem. With Avro, I think we have a fair few languages covered, so I don't foresee this being an issue in the near future. Users can safely use Protobufs or any other library, and the marshaller will just treat the result as a plain byte[].
Thrift and MessagePack are IMO still in their infancy and not ready for the job.
Next up I'll build the full marshaller. At this point, I think it makes sense to have this marshaller in the Hot Rod module.
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
multiple conflicting versions as dependencies (Lucene but not only)
by Sanne Grinovero
Hello,
I'm wondering how to solve ISPN-275 (Update Lucene Directory to work
with newer Lucene versions v.3.0.x)
The Lucene Directory works fine with the latest Lucene 3.0.2; my doubt is
whether I should bump the dependency version so that the tests really
run against 3.0.2 - they currently run against 2.9.3.
The property <version.lucene> was kept local to the Lucene Directory
module so as not to interfere with a potentially different version in
other modules: the Query module depends on Hibernate Search 3.2, which
needs Lucene 2.9.x (and is not 100% compatible with 3.0.x).
So until Hibernate Search can deal properly with the newer Lucene, we
should keep 2.9.x as the main reference; but for users on the newer
Lucene, I think we should have the Lucene Directory module tested on
3.0.x too - not an easy feat with Maven, AFAIK.
Compatibility with Hibernate Search is not the only reason: since the
directory works fine with all Lucene versions from 2.4 to 3.x, it's
useful to a broader range of users. Lucene's API changes are significant
and intrusive - it's not easy for existing applications to jump from
2.4 to 3 - so IMHO it's of great value to keep it tested against both
branches of Lucene, at least for some time.
As said, no code changes are needed. The easy way is to add a Hudson
target running "mvn clean test -Dversion.lucene=3.0.2". I could add a
profile for that, but you'd still have to invoke the build twice.
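For the record, such a profile could be as small as a property override - just a sketch, and the profile id "lucene3" is made up:

```xml
<!-- run the Lucene Directory tests against the 3.0.x line
     with: mvn clean test -Plucene3 -->
<profile>
   <id>lucene3</id>
   <properties>
      <version.lucene>3.0.2</version.lucene>
   </properties>
</profile>
```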
I wonder if someone has better ideas?
As a side note, I noticed an additional inconsistency: the parent
pom.xml defines a dependency on javax.persistence 1.0, used by
infinispan-jopr-plugin, but Search is going to introduce
hibernate-jpa-2.0-api into the classpath; also, this module is tested
with hibernate-core 3.3.1. Shouldn't this be tested against the same
versions introduced by the Query module?
Comments on latest iteration of Distributed Task Execution design
by Manik Surtani
Vladimir,
Here are some thoughts on your latest iteration (Version 8) of this page:
http://community.jboss.org/wiki/InfinispanDistributedExecutionFramework
There are actually 2 distinct use cases that this feature would touch:
1) pushing a Callable to the node where the state is located and pulling back a result
2) breaking up a task into Callables and then pushing those Callables out as per (1).
Now only (2) is true map/reduce (and relevant to fork/join), but for many cases (1) alone is enough. So whatever API we propose should support both: simple remote code execution that leverages data locality, where the user doesn't care about breaking up tasks into subtasks, as well as proper task decomposition.
But we should be clear about these differences - even though on an impl level they are very closely related. The stuff you have so far handles case (2) pretty well but we should also expose (1).
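To make the shape of case (2) concrete, here is a toy sketch. The interface and class names are illustrative only, not a proposal for the actual API, and the "grid" is just a local thread pool standing in for remote nodes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DistExecSketch {

   // Case (2): a task that knows how to break itself into Callables
   // and how to combine the partial results (map/reduce style).
   interface DistributedTask<T, R> {
      List<Callable<T>> decompose();
      R combine(List<T> partials);
   }

   // Stand-in "grid": a local thread pool. In Infinispan each Callable
   // would instead be pushed to the node owning the relevant state -
   // which is exactly case (1), applied once per sub-task.
   static <T, R> R execute(DistributedTask<T, R> task) throws Exception {
      ExecutorService pool = Executors.newFixedThreadPool(4);
      try {
         List<T> partials = new ArrayList<T>();
         for (Future<T> f : pool.invokeAll(task.decompose()))
            partials.add(f.get());
         return task.combine(partials);
      } finally {
         pool.shutdown();
      }
   }

   // Toy decomposition: sum the lengths of a few data "chunks".
   static int demoTotal() throws Exception {
      final List<String> chunks = Arrays.asList("foo", "quux", "ab");
      return execute(new DistributedTask<Integer, Integer>() {
         public List<Callable<Integer>> decompose() {
            List<Callable<Integer>> tasks = new ArrayList<Callable<Integer>>();
            for (final String c : chunks)
               tasks.add(new Callable<Integer>() {
                  public Integer call() { return c.length(); }
               });
            return tasks;
         }
         public Integer combine(List<Integer> partials) {
            int sum = 0;
            for (int p : partials) sum += p;
            return sum;
         }
      });
   }

   public static void main(String[] args) throws Exception {
      System.out.println(demoTotal()); // 9
   }
}
```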
Some other feedback:
* I presume distributed task monitoring and annotations are sections that fall under "outside of scope"? They appear at the same heading level, so I wasn't sure of your intentions here.
* Proposed interfaces - Not sure if I understand the purpose of DistributedCallable#mapped(). You already assign a cache to the callable in DistributedCallable#initialized(), right?
* DistributedCallable#preferredExecutionNodes() - do we really want to support this at this stage? Or is it better to not support arbitrary node selection by end-user code? Simpler would be to add a DistributedCallable#getRelatedKeys() which returns a Set of keys which the callable would be expected to touch, so we can decide where to route the task. Or maybe you want to offer both forms, so that a DistributedCallable could *either* provide a set of nodes to execute on or a set of keys which it expects to touch.
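To illustrate the two routing-hint forms in the last point, here's a minimal sketch; every name below is illustrative, not Infinispan's actual API (nodes are represented as plain Strings for simplicity):

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.Callable;

public class RoutingHints {

   interface DistributedCallable<T> extends Callable<T> {
      // form (a): the callable explicitly names the nodes to run on
      Set<String> preferredExecutionNodes();
      // form (b): the callable names the keys it expects to touch,
      // and the framework routes it to the node(s) owning those keys
      Set<Object> getRelatedKeys();
   }

   static DistributedCallable<Integer> sampleTask() {
      return new DistributedCallable<Integer>() {
         public Integer call() { return 42; }
         public Set<String> preferredExecutionNodes() {
            // empty: defer to key-based routing
            return Collections.emptySet();
         }
         public Set<Object> getRelatedKeys() {
            return Collections.<Object>singleton("account-17");
         }
      };
   }

   public static void main(String[] args) throws Exception {
      // a router would map "account-17" to its owner via consistent hashing
      System.out.println(sampleTask().getRelatedKeys());
   }
}
```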
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org