[infinispan-dev] Feedback from Mobicents Cluster Framework on top of Infinispan 5.0 Alpha1

Tue Jan 4 05:30:45 EST 2011

See comments inline:

On Jan 3, 2011, at 3:44 PM, Eduardo Martins wrote:

> See inline.
> 
> On Mon, Jan 3, 2011 at 10:38 AM, Galder Zamarreño <galder at redhat.com> wrote:
>> See below my comments:
>> 
>> On Dec 30, 2010, at 7:05 AM, Eduardo Martins wrote:
>> 
>>> Hi all, just completed first iteration of Mobicents Cluster Framework
>>> 2.x, which includes impl using first alpha release of Infinispan 5. We
>>> have everything we had in jboss cache working, it's a dumb down but
>>> higher level framework, with stuff like Fault Tolerant timers, which
>>> is then reuse in whole Mobicents platform to provide cluster related
>>> features. I believe we now already use a lot of stuff from Infinispan
>>> rich feature set, so I guess it's good timing to give some feedback,
>>> report some minor issues, and clear some doubts.
>>> 
>>> 1) Marshallers
>>> 
>>> Generally I'm "just OK" with the current API, the fact it
>>> Externalizers depends on Annotations makes impossible for a higher
>>> level framework to allow it's clients to plug their own kind of
>>> Externalizers, without exposing Infinispan classes. In concrete, our
>>> framework exposes its own Externalizer model, but to use them we need
>>> to wrap the instances of classes these handle, in classes bound to
>>> Infinispan Externalizers, which when invoked by Infinispan then lookup
>>> our "own" Externalizers, that is, 2 Externalizers lookups per call. If
>>> there was no annotations required we could wrap instead our clients
>>> Externalizers and plug them into Infinispan, this would mean a single
>>> Externalizer lookup. By the way, since the Externalizer now uses
>>> generics the Annotations values look bad, specially since it's
>>> possible to introduce errors that can compile.
>> 
>> First of all, the current API that you use in 5.0.0.ALPHA1 is not yet set in stone, and so feedback like this is invaluable to get it right for ALPHA2 or BETA.
>> 
>> Externalizer is a concept that was borrowed from JBoss Marshalling that was squashed to the bare minimum needed by Infinispan. Unless you plan to provide alternative Externalizer implementations based in another framework (I don't see this really doable since the way they're instantiated and used is very particular), I don't see the need to abstract this as hugely important, although I can understand your leak worries.
>> 
> 
> We have the need to keep our framework fully independent of what is
> below, for instance in 1.x we had a few bounds to JBoss Cache API,
> such as FQN, now we had to redo a lot in the framework. The framework
> is used in different projects of the Mobicents platform, being
> independent of the underlying impl allows the "client" code to don't
> need any update, that is, we just update the impl and everything keeps
> working (ideally).
> 
>> Personally I think annotations fit perfectly for defining the classes it externalizers and optionally the ids. However, a different avenue could be taken here which is adding getTypeClasses()/getTypeClassNames()/getId() methods to Externalizer interface, but I'm not keen on doing that since IMO pollutes a very clean Externalizer interface. Another possibility could be to provide a sub-interface of Externalizer where those methods are added and give people the freedom to decide how to feed in this metadata, either via API or annotations. However, this could be adding complexity that might confuse Externalizer providers.
>> 
> 
> We are talking about 3 methods, don't think it would be that bad, the
> implementation of these methods would possibly be as simple as setting
> the annotations. IMHO this would be a very good move.
> 
>> Not sure I understand your 2 lookup call problem though. Maybe you can clarify this further with the proposal I make above and how it would vary?
> 
> Our framework does not limits what the "clients" store in the cache,
> it can be objects of any class, which means that if we want
> serialization to be as fast as possible we need to expose
> Externalizers concept. Our current solution is to wrap the data going
> into the Cache in objects that have Externalizers, the double lookup
> is the Infinispan lookup for the wrapper externalizer, and then the
> wrapper's Externalizer lookup process for the concrete "client"
> externalizer code.
> 
> A quick example, lets say the "client" wants to store X and provided
> our framework a class XExternalizer that has the methods to read and
> write X to I/O. When storing an X instance in Infinispan we wrap it in
> XWrapper, being XWrapper bound to a XWrapperExternalizer installed in
> Infinispan. So for instance on reading an X instance from Input,
> Infinispan will lookup XWrapperExternalizer and invoke it, then
> XWrapperExternalizer will lookup XExternalizer to do the actual read.

Right, so your XWrapperExternalizer would be the bridge that would be annotated, hence hiding away the annotation from the user, correct?

> 
>> And I don't understand the errors that compile either. Please provide an example to understand this better.
> 
> @Marshalls(typeClasses = Y.class, id = ExternalizerIds.XExternalizer)
> public class XExternalizer implements
> 		Externalizer<X> {
> 
> Is there any validation that Y is related to X?

Good point Eduardo. No, there's no such validation and it's not possible to do cos generics are not available at runtime and there's no link between the annotation and the Externalizer. I think this is a powerful reason to integrate the get* methods referred above in the Externalizer interface to allow us to tie types, i.e.

List<Class<? extends T>> getTypeClasses();

Bounded wildcards in this case are Ok cos the consumer of this method would be the internal framework and not any client code. Also, using collections rather than arrays makes it easier to code.

This is being tracked by https://issues.jboss.org/browse/ISPN-857

> 
>>> 
>>> Another issue, in this case I consider it of minor importance due to
>>> low level of Externalizers concept, is the lack of inheritance of
>>> Externalizers, for instance we extend DefaultConsistentHash, instead
>>> of extending its Externalizer I had to make an Externalizer from
>>> scratch, and copy of DefaultConsistenHash's Externalizer code. This is
>>> messy to manage, if the code on Infinispan change we will always have
>>> to check if the code we copied is still valid.
>> 
>> Hmmm, can't you use delegation instead of inheritance? Delegation is generally considered to be cleaner than inheritance. For example, see org.infinispan.transaction.xa.DldGlobalTransaction and org.infinispan.transaction.xa.GlobalTransaction. DldGlobalTransaction extends GlobalTransaction but DldGlobalTransaction's Externalizer does not extend GlobalTransaction's. Instead, DldGlobalTransaction's Externalizer takes GlobalTransaction's Externalizer in the constructor and then delegates accordingly. This avoids any code duplication.
>> 
> 
> The problem in such model is that you need a way to construct the
> externalizer to delegate, so unless you have a factory somewhere in
> Infinispan you will always rely on its code. That or require that
> externalizers have a specific constructor.

Hmmm, not necessarily. Normally Externalizer instances are singletons, without any state that actually changes, so this kind of code would be quite typical:

private final GlobalTransaction.Externalizer delegate = new GlobalTransaction.Externalizer(new GlobalTransactionFactory(true));

Where DldGlobalTransaction's Externalizer has the delegate as a private final constant.

> 
>> Btw, out of curiosity, why finding yourself needing to extend the DefaultConsistentHash externalizer?
> 
> I actually don't know, it was not made due to an execution error. I
> saw there an Externalizer, I assume it's going to be "externalized",
> which means our DefaultConsistentHashExt that replaces it will also
> be.
> 
> Out of curiosity, here is the code, it forces colocation of certain data:
> 
> http://mobicents.googlecode.com/svn/branches/servers/cluster/2.x/infinispan/src/main/java/org/mobicents/cluster/infinispan/distribution/DefaultConsistentHashExt.java

That code is a bit strange. If you need to make sure that the hash from InfinispanClusterDataKey is derived from its key's hash depending on whether the key is DependentClusterDataKey or something else, why not just adjust  InfinispanClusterDataKey.hashCode() accordingly? i.e.

InfinispanClusterDataKey.hashCode() {
  if (getKey() instanceof DependentClusterDataKey) {
    return ((DependentClusterDataKey) clusterDataKey).dependsOn().hashCode();
  } else {
    return getKey().hashCode()
  }
}

I don't think you need a different consistent hash implementation.

> 
>>> 
>>> 2) CacheContainer vs CacheManager
>>> 
>>> The API is now deprecating the CacheManager, replacing with
>>> CacheContainer, but some functionality is missing, for instance
>>> CacheContainer is not a Listenable, thus no option to register
>>> Listeners, unless unsafe casting of the instance exposed by the Cache
>>> or a ref is available to the original CacheManager. Related matter,
>>> IMHO a Cache listener should be able to listen to CacheManager events,
>>> becoming global listeners.
>> 
>> Did you look at EmbeddedCacheManager? That's what you should be coding against that allows you to add/remove listeners. You should not use CacheManager at all.
>> 
> Yes, I looked, the concrete issue here is not being able to register a
> CacheManager listener starting with a Cache instance, that is, not
> without un unsafe cast, did I miss anything?

Oh right, I see what you mean. You're saying that getCacheManager() in Cache returns a CacheContainer rather than an EmbeddedCacheManager. There's a reason for this:

CacheContainer is the common demoninator for operations that are accessible by remote and local caches. EmbeddedCacheManager extends CacheContainer to add operations that only make sense on a local basis, whereas RemoteCacheManager does the same for remotely accessed Infinispan instances. For the moment, you cannot add listeners to remote cache/cachemanager instances, hence why those methods are only available in EmbeddedCacheManager.

It might make sense to have an EmbeddedCache interface that ties up this up via covariant returns, i.e.:

EmbeddedCacheManager getCacheManager();

> 
>> Wrt listener, what are you trying to say? That if a listener is registered to a Cache, then it should receive CacheManager notifications if it has methods annotated accordingly, i.e. @CacheStopped?
> 
> Yes.

I think this could be tricky cos if you promote something like this and you have @CacheStarted and @ViewChanged annotations, you're likely to miss them at least for the Cache in which you've added the cache manager listener, and you'd have maybe lost the first view change. This could be partially solved by adding the capability to get a cache without starting it, but you'd still have the issue of view change miss.

To make sure you don't miss a notification, the recommended thing is add cache manager listeners after creating the cache manager and before starting it.

> 
>>> 
>>> 3) Configuration stuff
>>> 
>>> Generally I think the Configuration and GlobalConfiguration could be
>>> simplified a bit, I found myself several times looking at the impl
>>> code to understand how to achieve some configurations.
>> 
>> Configurations such as?
> 
> Example below, but this is a generic remark, more like a call to try
> to simplify as much as possible, I can say that it took me much more
> time to settle on configurations than to code.
> 
>> 
>>> Better
>>> documentation wouldn't hurt too, it's great to have a complete
>>> reference, but the configuration samples are not ok, one is basically
>>> empty, the other has all possible stuff, very unrealistic, would be
>>> better to have reusable examples for each mode, with then
>>> recommendations on how to improve these.
>> 
>> Which configuration samples do you refer to? The ones in the testsuite?
> 
> The ones bundled in etc/config-samples.

Precisely for 5.0.0.ALPHA1 and 4.2.0.FINAL we removed all.xml and added sample.xml that has more meaningful configurations in there with several of them commented and ready for use.

> 
>> 
>> I suppose the reusable examples that you look for might be partially available via TestCacheManagerFactory except that these are oriented at testing (i.e. make sure each testsuite thread uses a different mcast address). Maybe we should extract these out and make more easily available.
>> 
>>> 
>>> Infinispan 5 introduces a new global configuration setter to provide
>>> an instance, with same method name as the one to provide the class
>>> name. I believe one is enough, and to be more friendly with
>>> Microcontainer and similar frameworks I would choose the one to set
>>> the instance.
>> 
>> Which configuration setter are you referring to?
> 
> setMBeanServerLookup(...)

Right. The one that takes a String is used by JAXB code whereas the one taking the instance is used by programmatic configuration.

I think the latter should be renamed to setMBeanServerLookupInstance at least.

On a more global level, GlobalConfiguration might need separating into a superclass for programmatic configuration and a subclass for JAXB or something similar, in order to avoid confusion between the different methods. Vladimir WDYT?

> 
>> It's worth mentioning that GlobalConfiguration mixes both JAXB method requirements and programmatic configuration needs, so maybe that's what's confusing things.
> 
> Perhaps.
> 
>>> 
>>> 4) AtomicMap doubt
>>> 
>>> I read in the Infinispan blog that AtomicMap provides colocation of
>>> all entries, is that idea outdated? If not we may need a way to turn
>>> that off :) For instance would not that mean the Tree API does not
>>> works well with distribution mode? I apologize in advance if I'm
>>> missing something, but if AtomicMap defines colocation, AtomicMap is
>>> good for the node's data map, but not for the node's childs fqns.
>>> Shouldn't each child fqn be freely distributed, being colocated
>>> instead with the related node cache entry and data (atomic)map? Our
>>> impl is kind of an "hybrid" of the Tree API, allows cache entries
>>> references (similar to childs) but no data map, and the storage of
>>> references through AtomicMap in same way as Tree API worries me.
>>> Please clarify.
>> 
>> Each FQN has underneath an AtomicMap, so each you won't find yourself finding k,v pairs belonging to a particular Fqn in different nodes.
>> 
>> We make no guarantees wrt child fqn nodes. So, just cos FQN B is child of FQN A, it does not mean that the atomic map of B will be in same node as atomic map of B.
> 
> Before I proceed with my reasoning, please clarify, the colocation
> within AtomicMap is real? If I store data there, all data will be
> colocated?

Yup, all *data* stored in the AtomicMap will be located in the same node. It's treated as a single entity.

> 
>> 
>>> 
>>> 5) Minor issues found
>>> 
>>> See these a lot, forgotten info logging?
>>> 
>>> 03:39:06,603 INFO  [TransactionLoggerImpl] Starting transaction logging
>>> 03:39:06,623 INFO  [TransactionLoggerImpl] Stopping transaction logging
>> 
>> Most likely: https://issues.jboss.org/browse/ISPN-852
>> 
>>> 
>>> MBean registration tries twice the same MBean, the second time fails
>>> and prints log (no harm besides that, the process continues without
>>> failures):
>>> 
>>> 03:39:06,395 INFO  [ComponentsJmxRegistration] Could not register
>>> object with name:
>>> org.infinispan:type=Cache,name="___defaultcache(dist_sync)",manager="DefaultCacheManager",component=Cache
>> 
>> That should not happen: https://issues.jboss.org/browse/ISPN-853
>> 
>>> 
>>> 6) Final thoughts
>>> 
>>> Kind of feel bad to provide all these negative stuff in a single mail,
>>> took me an hour to write it, but don't get me wrong, I really enjoy
>>> Infinispan, it's a great improvement. I'm really excited to have it
>>> plugged in AS7 (any plan on this?) and then migrate our platform to
>>> this new cluster framework. I expect a big performance improvement, on
>>> something already pretty fast, and much less memory usage, our
>>> Infinispan impl "feels" very fine grained and optimized in all points
>>> of view. Of course, the distribution mode is the cherry on top of the
>>> cake, hello true scalability.
>> 
>> I don't think it's negative stuff at all. On the contrary, I think it's really valuable feedback from another Infinispan user :)
> 
> I don't think it's negative also, but some could interpret like that,
> you know, too many complaints, whatever ;-)
> 
>> 
>> AS7 will of course have Infinispan but in due course.
>> 
> 
> So right now, there is no work being done to plug Infinispan in AS7?

I can't say 100% for sure, Paul Ferraro or Scott Marlow will be able to clarify this further, but AFAIK, this is not being done yet.  

> 
> -- Eduardo
> ..............................................
> http://emmartins.blogspot.com
> http://redhat.com/solutions/telco
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache