We have talked about dropping support for asymmetric clusters quite a few
times before. I still think it's worth making the cluster look symmetric in
the API; however, I'm starting to doubt it will do much to simplify the
implementation.
These are the reasons I think we'll need to keep different cache topologies
(and consistent hashes):
a) Having the same consistent hash (or should I say hashing function)
requires some pretty drastic restrictions: the same nodes, number of
segments, capacity factors, and numOwners. In particular, enforcing the same
numOwners for all the caches would not allow a distributed cache and a
replicated cache to be part of the same CacheManager.
b) Even if all caches start state transfer at exactly the same time, not
all caches will finish state transfer at the same time. It makes sense to
clean up the temporary structures related to state transfer as soon as
possible.
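To make point a) concrete, here is a minimal sketch of the restriction. It is only an illustration: CacheHashParams is a hypothetical class, not an Infinispan type, and it assumes a replicated cache can be modelled as numOwners equal to the cluster size.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: CacheHashParams is not an Infinispan class.
final class CacheHashParams {
    final List<String> members;               // nodes that run the cache
    final int numSegments;
    final int numOwners;
    final Map<String, Float> capacityFactors; // per-node capacity factor

    CacheHashParams(List<String> members, int numSegments, int numOwners,
                    Map<String, Float> capacityFactors) {
        this.members = List.copyOf(members);
        this.numSegments = numSegments;
        this.numOwners = numOwners;
        this.capacityFactors = Map.copyOf(capacityFactors);
    }

    // Assumption: a replicated cache is modelled as "every member owns every
    // segment", i.e. numOwners equals the cluster size.
    static CacheHashParams replicated(List<String> members, int numSegments,
                                      Map<String, Float> capacityFactors) {
        return new CacheHashParams(members, numSegments, members.size(), capacityFactors);
    }

    // Two caches can share one consistent hash only if all four inputs match.
    boolean canShareConsistentHashWith(CacheHashParams other) {
        return members.equals(other.members)
                && numSegments == other.numSegments
                && numOwners == other.numOwners
                && capacityFactors.equals(other.capacityFactors);
    }
}
```

With three nodes, a distributed cache with numOwners=2 and a replicated cache built by replicated(...) differ in numOwners, so under this model they could not share a single consistent hash.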
Further comments inline...
On Fri, Feb 20, 2015 at 12:12 PM, Tristan Tarrant <ttarrant(a)redhat.com>
wrote:
> Yes, I agree with the rationale.
> We should have some kind of entry point, a hypothetical "ClusterManager"
> which would handle common generic components (transport, thread pools,
> ...) and be a factory for other subcomponents (CacheManagers and other
> clustered services). Dropping support for asymmetric clusters would mean
> reduced complexity and memory usage.
> This gets a big +1 from me.
> Tristan
> On 19/02/2015 17:46, Sanne Grinovero wrote:
> All,
> at the beginning of time, the expectation was that an application
> server (aka WildFly) would have a single CacheManager, and different
> applications would define their different Cache configuration on this
> app-server singleton.
>
> In that primitive world that sounded reasonable, as system
> administrators wouldn't want to manage firewalls and port assignments
> for a new Transport for each deployed application.
>
> Then the complexities came:
> - deployments are asymmetric compared to the application server
> - each deployment has its own ClassLoader
> - deployments start/stop independently from each other
>
> At that point a considerable investment was made to get lazily
> starting Caches, per-Cache sets of Externalizer(s) to isolate
> classloaders, ClassLoader-aware Cache decorators, up to the recently
> introduced Cache-dependency rules for stopping dependent Caches last.
I don't think we ever had a per-Cache set of Externalizers; that's what
ISPN-2133 is about...
AFAIK the classloader-aware decorators work only with storeAsBinary
enabled, and they actually enable users to share the same Cache (not
CacheManager) between deployments. I'm not sure how many users need that,
but we need to think hard about dropping support for the decorators (and
the related support for Thread.getContextClassLoader()).
> Not to mention we now have complex per-Cache View handling, which
> results in performance complexities such as ISPN-4842.
I believe we will still need the per-Cache topologies (i.e. consistent
hashes), possibly with bulk join operations and memoization to help with
performance.
Note that ISPN-4842 talks about evolving from 500 caches to 3000 caches.
Should we require the application to use an extra CacheManager every time
it needs a new Cache?
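As a rough sketch of the memoization idea mentioned above: caches whose hash inputs are equal could share one computed ownership table instead of each computing its own. HashInputs and ConsistentHashMemo are hypothetical names, not Infinispan classes, and the round-robin placement is only a placeholder for the real algorithm.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: HashInputs and ConsistentHashMemo are not Infinispan classes.
final class HashInputs {
    final List<String> members;
    final int numSegments;
    final int numOwners;

    HashInputs(List<String> members, int numSegments, int numOwners) {
        this.members = List.copyOf(members);
        this.numSegments = numSegments;
        this.numOwners = numOwners;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof HashInputs)) return false;
        HashInputs other = (HashInputs) o;
        return numSegments == other.numSegments
                && numOwners == other.numOwners
                && members.equals(other.members);
    }

    @Override
    public int hashCode() {
        return Objects.hash(members, numSegments, numOwners);
    }
}

final class ConsistentHashMemo {
    private final Map<HashInputs, int[][]> memo = new HashMap<>();
    private int computations = 0;

    // Returns owners[segment][i] as an index into the member list.
    // Caches with equal inputs receive the same shared array instance.
    int[][] ownersFor(HashInputs inputs) {
        return memo.computeIfAbsent(inputs, this::compute);
    }

    int computations() {
        return computations;
    }

    // Toy round-robin placement, standing in for the real algorithm.
    private int[][] compute(HashInputs in) {
        computations++;
        int n = in.members.size();
        int owners = Math.min(in.numOwners, n);
        int[][] result = new int[in.numSegments][owners];
        for (int s = 0; s < in.numSegments; s++) {
            for (int i = 0; i < owners; i++) {
                result[s][i] = (s + i) % n;
            }
        }
        return result;
    }
}
```

In this sketch, 3000 caches with identical inputs trigger a single computation, and each cache with different inputs (say, a different numOwners) adds one more. A real implementation would presumably hand out an immutable structure rather than a shared mutable array.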
> There are some more complexities coming:
> Hibernate OGM wishes to control the context of deserialization - this
> is actually an important optimisation to keep garbage production under
> control, but also applications might want to register custom RPC
> commands; this has been a long-standing problem for Search (among
> others).
> Infinispan Query does have custom RPC commands, and this just happens
> to work because the Infinispan core module has an explicit dependency
> on query, but that's a twisted dependency scheme, as the module would
> need to list each possible extension point: it's not something you can
> do for all projects using it.
Indeed, the WildFly module that applications depend on now needs to have
access to query classes, but I think that's just a matter of supplying the
proper ClassLoader to GlobalConfigurationBuilder. Fixing it shouldn't
require core changes.
For OGM, would having the deployment's ClassLoader in the
GlobalConfiguration remove the need for a per-Cache deserialization context?
>
> Interestingly enough, there is a very simple solution which wipes out
> all of the above complexity, and also resolves some pain points:
> today the app server supports the FORK protocol from JGroups, so we
> can get rid of the idea of a single CacheManager per appserver, and
> create one per classloader and *within* the classloader.
I guess Paul should chime in here: is it feasible to create the
CacheManager(s) required for a deployment only after the module has been
loaded and a proper classloader has been instantiated?
> By doing so, we can delete all code about per-Cache classloaders,
> remove the CacheView concept, and also allow the deployment (the
> application) which needs caching services to register whatever it
> wants.
> It could register custom interceptors, commands, externalizers,
> CacheStore(s), etc., without pain.
To be clear, we already removed the per-Cache classloaders some time ago.
All we have now is the per-CacheManager classloader, the thread-context
classloader, and the classloader decorator. And I don't feel I know enough
about how the latter two are used now to decide whether to remove them or
not...
>
> Finally, we could get rid of the concept that Caches start lazily. I'd
> change to a simplified lifecycle which expects the CacheManager to
> initialize, then allows Cache configurations to be defined, and then
> it all starts atomically.
> At that point, you'd not even be responsible anymore for complex
> dependency resolutions across caches.
How would that work? I.e. if I define an indexed cache (either in the
configuration or in a module), when would the indexed cache define the
cache it needs to store its index?
+1 to start all the defined caches when the CacheManager starts, but I
would still like to be able to start a temporary cache and then stop it for
Map/Reduce, instead of having to spin up an entire CacheManager.
>
> I'd hope this would allow OGM to get the features it needs, and also
> significantly simplify the definition of the boot process for any user,
> not least it should simplify quite some code which is now being
> maintained.
>
> A nice follow-up task would be that WildFly would need to be able to
> "pick up" some configuration file from the deployment and inject a
> matching CacheManager, so this requires a bit of cooperation with the
> app server team, and an agreement on a conventional configuration
> name.
> This should be done by WildFly (and not the app), so that the user
> deployment can look up the CacheManager via JNDI without needing to
> understand how to wire things up in the FORK channel.
>
> I also believe this is a winner from a usability point of view, as many
> of the troubles I see in forums and from customers are about "how do I
> start this up?". Remember our guides essentially teach you to either
> take the AS CacheManager, or to start your own. Neither of those is
> the optimal solution, and people get in trouble.
I think looking up a "forkable channel" in JNDI and calling new
DefaultCacheManager() with that should be easy enough, if it's properly
documented :)
One thing I've seen people ask about many times is sharing a
Cache/CacheManager between deployments. Granted, some of those questions
might have actually needed only a way to share the transport, but others
probably really needed to access the same data. Having a clear way to
instantiate the CacheManager and put it up in JNDI from a common dependency
would definitely be a plus.
Cheers
Dan