[infinispan-dev] RPCs for non-existant caches ought not throw exception
Paul Ferraro
paul.ferraro at redhat.com
Mon Sep 13 10:19:13 EDT 2010
On Mon, 2010-09-13 at 15:05 +0200, Galder Zamarreño wrote:
> I've had a brief look at this. I need to spend a bit more time on it, but here's an initial view:
>
> At the moment at least, InboundInvocationHandlerImpl doesn't take into
> account ComponentStatus to see if it's up. It only checks whether the
> component registry is null, but a ComponentStatus check might make
> more sense.
The component registry null check is followed by this:
    if (!cr.getStatus().allowInvocations()) {
        giveupTime = System.currentTimeMillis() + localConfig.getStateRetrievalTimeout();
        while (cr.getStatus().startingUp() && System.currentTimeMillis() < giveupTime) Thread.sleep(100);
        if (!cr.getStatus().allowInvocations()) {
            log.warn("Cache named [{0}] exists but isn't in a state to handle invocations. Its state is {1}.", cacheName, cr.getStatus());
            return RequestIgnoredResponse.INSTANCE;
        }
    }
So, there is, in fact, a ComponentStatus check. If the registry is not
RUNNING, then we spin for up to 30 seconds for the status to become
RUNNING. For a stopping or stopped cache, this does not seem to make
sense, since these states do not indicate that the cache is in the
process of starting.
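
To illustrate the point, here is a minimal, self-contained sketch of a wait
that only spins for states that can still reach RUNNING. It is not Infinispan
code: the Status enum, the Outcome enum and awaitRunning() are hypothetical
stand-ins for ComponentStatus and the handler logic, and it assumes that only
INSTANTIATED and INITIALIZING should count as "starting up".

```java
import java.util.function.Supplier;

final class StatusAwareWait {

    // Hypothetical lifecycle states, loosely modelled on ComponentStatus.
    enum Status {
        INSTANTIATED, INITIALIZING, RUNNING, STOPPING, TERMINATED, FAILED;

        boolean allowInvocations() {
            return this == RUNNING;
        }

        // Assumption: only states that can still progress to RUNNING qualify.
        boolean startingUp() {
            return this == INSTANTIATED || this == INITIALIZING;
        }
    }

    // HANDLE: process the RPC; IGNORE: reply immediately with an "ignored" response.
    enum Outcome { HANDLE, IGNORE }

    // Polls the status until it leaves the starting-up states or the timeout expires.
    static Outcome awaitRunning(Supplier<Status> status, long timeoutMillis, long pollMillis) {
        long giveupTime = System.currentTimeMillis() + timeoutMillis;
        Status s = status.get();
        while (s.startingUp() && System.currentTimeMillis() < giveupTime) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return Outcome.IGNORE;
            }
            s = status.get();
        }
        return s.allowInvocations() ? Outcome.HANDLE : Outcome.IGNORE;
    }
}
```

With this shape, a STOPPING or TERMINATED cache is answered right away instead
of holding the caller until the timeout expires, while a cache that is still
starting gets the same grace period as before.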
> When I looked at this a while back, I'd ideally have liked to be able
> to start a cache associated with the unknown cache request. However,
> this is not feasible because you can't know what configuration it
> should be started with.
>
> At first glance, a different valid status would be the way forward,
> but you have to think about the state transfer and distribution logic
> and that's the hard bit. If a cache is started in a non-coordinator,
> and the coordinator has not yet started that cache, how does state
> transfer or rehash control work? Both of them rely on some kind of
> logic running on coordinator. Now, who's the coordinator in that case?
> The coordinator is in theory the first node started, but what if the
> cache is not yet started in the coordinator? The coordinator now
> becomes a variant of the Cache rather than the CacheManager.
>
> I think the latter is the bigger problem to solve here.
Agreed.
> On Sep 10, 2010, at 7:16 PM, Paul Ferraro wrote:
>
> > OK - the plot thickens...
> > RequestIgnoredResponse is not actually appropriate because it's an
> > invalid response (i.e. extends InvalidResponse). Oops.
> > So, not only would we either need to return a valid response (perhaps
> > null, like the behavior prior to ISPN-447?), but an RPC for a stopped
> > (or stopping) cache should also be considered valid. For example, if I
> > have an app deployed on 2 nodes, and I undeploy the app from node2, this
> > would cause RPC-bound cache operations to fail on node1. Actually,
> > these RPCs would time out, since the InboundInvocationHandler will wait
> > 30 seconds for the cache to start. That's no good.
> >
> > To address this would require some changes to the behavior of some of
> > the ComponentStatus values. For example, ComponentStatus.startingUp()
> > returns true for STOPPING and TERMINATED, and consequently
> > InboundInvocationHandler loops for 30 seconds hoping the cache will
> > start. That doesn't seem appropriate for the use case above. Would it
> > be possible to return a valid ignored response (e.g. null) for these
> > states?
> >
> > Thoughts?
> >
> > On Fri, 2010-09-10 at 11:54 -0400, Paul Ferraro wrote:
> >> In AS clustering, there are several use cases where a specific cache
> >> instance may not exist (or may not be started) for every member of the
> >> group. Currently, Infinispan treats this as an exception case, and any
> >> cache operation resulting in an RPC will fail. This is problematic for
> >> the following AS use cases:
> >>
> >> 1. For a given clustering service (e.g. web session, SFSBs, entity
> >> caching) there is a shared cache manager for all applications, while
> >> each application uses its own cache instance. If I have app1 running on
> >> node1 and node2, everything is fine. But if I deploy app2 on node1,
> >> its membership will include node2 (because of the shared cache manager)
> >> even though there is no cache instance for app2 on node2. Consequently,
> >> the cache instances for app2 will be non-functional until app2 is
> >> deployed on node2.
> >> 2. In Hibernate's 2nd level cache, custom cache regions are created on
> >> demand. So, even with a single app running on 2 nodes, the first
> >> request to cache an entity in a custom cache region on node1 will fail,
> >> since the cache corresponding to the region will not exist on node2.
> >>
> >> Here's the relevant code in
> >> InboundInvocationHandlerImpl.handle(CacheRpcCommand):
> >>
> >>     String cacheName = cmd.getCacheName();
> >>     ComponentRegistry cr = gcr.getNamedComponentRegistry(cacheName);
> >>     long giveupTime = System.currentTimeMillis() + 30000; // arbitrary (?) wait time for caches to start
> >>     while (cr == null && System.currentTimeMillis() < giveupTime) {
> >>         Thread.sleep(100);
> >>         cr = gcr.getNamedComponentRegistry(cacheName);
> >>     }
> >>
> >>     if (cr == null) {
> >>         if (log.isDebugEnabled()) log.debug("Cache named {0} does not exist on this cache manager!", cacheName);
> >>         return new ExceptionResponse(new NamedCacheNotFoundException(cacheName));
> >>         // return RequestIgnoredResponse.INSTANCE; // Suggested fix?
> >>     }
> >>
> >> From the perspective of the AS, a request for a non-existent cache should
> >> be treated the same way as a request for a stopped cache (that logic
> >> returns RequestIgnoredResponse.INSTANCE).
> >> As Galder pointed out, handling this case via exception was an explicit
> >> workaround for this issue: https://jira.jboss.org/browse/ISPN-447
> >> In the comments for ISPN-447, Manik seemed to suggest that returning an
> >> exception is merely a workaround until this issue is fixed:
> >> https://jira.jboss.org/browse/ISPN-434
> >>
> >> As it stands, this is a blocker issue for AS infinispan integration.
> >>
> >> Thoughts?
> >>
> >> _______________________________________________
> >> infinispan-dev mailing list
> >> infinispan-dev at lists.jboss.org
> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache
>
>