FYI, just checked in 

http://fisheye.jboss.org/changelog/Infinispan/branches/4.2.x?cs=2361

and tests run clean.


On 14 Sep 2010, at 15:27, Manik Surtani wrote:


On 14 Sep 2010, at 05:04, Paul Ferraro wrote:

On Mon, 2010-09-13 at 18:12 +0100, Manik Surtani wrote:
So in essence a "correct" response would be:

1)  If the cache is stopping -> ACK with a ValidResponse

Do we have a notion of an ignored (but not invalid) response, i.e. don't
trigger a retry/rollback?

We can certainly change this for RequestIgnoredResponse by overriding isValid() to return true since it is, as you say, a valid response.  Would need to run through the test suite to make sure such a change doesn't break anything though.
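
A minimal sketch of the isValid() idea, for illustration. The types below are simplified stand-ins and not the actual Infinispan classes; only the names RequestIgnoredResponse, InvalidResponse and isValid() come from this thread, and everything else (the Response interface shape, the IgnoredResponseSketch driver) is an assumption.

```java
// Simplified stand-ins for illustration only; not the real Infinispan types.
interface Response {
    boolean isValid();
}

// Base class for responses the caller treats as failures.
class InvalidResponse implements Response {
    @Override
    public boolean isValid() {
        return false;
    }
}

// An ignored request: the receiver did nothing, but the caller should not
// retry or roll back, so the response reports itself as valid.
final class RequestIgnoredResponse extends InvalidResponse {
    static final RequestIgnoredResponse INSTANCE = new RequestIgnoredResponse();

    private RequestIgnoredResponse() {
    }

    @Override
    public boolean isValid() {
        return true;
    }
}

// Hypothetical driver class, only here to make the sketch runnable.
class IgnoredResponseSketch {
    public static void main(String[] args) {
        Response ignored = RequestIgnoredResponse.INSTANCE;
        // The type still marks the response as "ignored", while isValid()
        // no longer flags it as a failure.
        System.out.println(ignored.isValid());
        System.out.println(ignored instanceof RequestIgnoredResponse);
    }
}
```

With this shape a caller can still tell an ignored response apart by its type, which is one possible way to keep the "ignored but valid" distinction visible.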

2)  If the cache is starting, try and wait till we can accept the RPC

Yes, except that ComponentStatus.startingUp() currently returns true for
every status except RUNNING.  IMO, it would make more sense to
restrict this to INSTANTIATED and INITIALIZING.

Again, startingUp() would need to be fixed accordingly - and tested.
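
The narrower startingUp() could look roughly like the sketch below. This is a hypothetical, cut-down enum containing only the status names mentioned in this thread; it is not the real ComponentStatus source, and the allowInvocations() body is an assumption based on the "not RUNNING" description in this discussion.

```java
// Hypothetical, simplified version of the status enum discussed in this thread.
// Only the constants named in the discussion are included.
enum ComponentStatus {
    INSTANTIATED, INITIALIZING, RUNNING, STOPPING, TERMINATED;

    // Assumption: invocations are accepted only while RUNNING.
    boolean allowInvocations() {
        return this == RUNNING;
    }

    // Proposed behaviour: only states that precede RUNNING count as "starting up".
    boolean startingUp() {
        return this == INSTANTIATED || this == INITIALIZING;
    }
}

// Hypothetical driver class, only here to make the sketch runnable.
class StartingUpSketch {
    public static void main(String[] args) {
        for (ComponentStatus s : ComponentStatus.values()) {
            System.out.println(s + " startingUp=" + s.startingUp());
        }
    }
}
```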


3)  If the cache doesn't exist, ACK with a valid response as well?  Surely this will lead to inconsistencies, since the RPC originator will assume the RPC has completed when in fact nothing has happened?

From the AS's perspective, an RPC for a non-existent cache (e.g. yet to
be deployed app) should be handled no differently than an RPC for a
stopping/stopped cache (e.g. undeployed app).
I'm not suggesting we should lie to the RPC originator, but rather
that it should be able to distinguish a normal valid response from an
ignored (but valid) response.

Agreed, but how does this difference manifest itself from a caller's perspective?


On 13 Sep 2010, at 15:19, Paul Ferraro wrote:

On Mon, 2010-09-13 at 15:05 +0200, Galder Zamarreño wrote:
I've had a brief look at this, need to spend a bit more time but here's an initial view on this,

At the moment at least, InboundInvocationHandlerImpl doesn't take
ComponentStatus into account to see if it's up. It only checks whether the
component registry is null, but a ComponentStatus check might make
more sense.

After the component registry null check comes the following:

if (!cr.getStatus().allowInvocations()) {
   giveupTime = System.currentTimeMillis() + localConfig.getStateRetrievalTimeout();
   while (cr.getStatus().startingUp() && System.currentTimeMillis() < giveupTime) Thread.sleep(100);
   if (!cr.getStatus().allowInvocations()) {
      log.warn("Cache named [{0}] exists but isn't in a state to handle invocations.  Its state is {1}.", cacheName, cr.getStatus());
      return RequestIgnoredResponse.INSTANCE;
   }
}

So, there is, in fact, a ComponentStatus check.  If the registry is not
RUNNING, then we spin for up to 30 seconds for the status to become
RUNNING.  For a stopping or stopped cache, this does not seem to make
sense, since these states do not indicate that the cache is in the
process of starting.
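
A minimal sketch of a status check that waits only while the cache is genuinely starting, and answers straight away for a stopping or stopped cache. Everything here (StatusWaitSketch, Status, Outcome, awaitRunning) is a hypothetical stand-in rather than Infinispan code, and it assumes startingUp() has been narrowed to INSTANTIATED and INITIALIZING.

```java
import java.util.function.Supplier;

class StatusWaitSketch {
    // Hypothetical stand-in for the status enum; not the Infinispan class.
    enum Status {
        INSTANTIATED, INITIALIZING, RUNNING, STOPPING, TERMINATED;

        boolean allowInvocations() {
            return this == RUNNING;
        }

        // Assumes the narrower definition proposed in this thread.
        boolean startingUp() {
            return this == INSTANTIATED || this == INITIALIZING;
        }
    }

    enum Outcome { PROCEED, IGNORED }

    // Polls the status until it leaves the "starting up" states or the
    // timeout expires. A stopping or stopped cache never enters the loop,
    // so the caller is answered immediately instead of being blocked.
    static Outcome awaitRunning(Supplier<Status> status, long timeoutMillis) {
        long giveUpTime = System.currentTimeMillis() + timeoutMillis;
        while (status.get().startingUp() && System.currentTimeMillis() < giveUpTime) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Outcome.IGNORED;
            }
        }
        return status.get().allowInvocations() ? Outcome.PROCEED : Outcome.IGNORED;
    }

    public static void main(String[] args) {
        System.out.println(awaitRunning(() -> Status.STOPPING, 30_000));
        System.out.println(awaitRunning(() -> Status.RUNNING, 30_000));
    }
}
```

The status is passed in as a Supplier so the loop re-reads it on every iteration, mirroring the repeated cr.getStatus() calls in the quoted code.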

When I looked at this a while back, I'd ideally have liked to be able
to start a cache associated with the unknown cache request. However,
this is not feasible because you can't know what configuration it should
be started with.

At first glance, a different valid status would be the way forward,
but you have to think about the state transfer and distribution logic
and that's the hard bit. If a cache is started in a non-coordinator,
and the coordinator has not yet started that cache, how does state
transfer or rehash control work? Both of them rely on some kind of
logic running on coordinator. Now, who's the coordinator in that case?
The coordinator is in theory the first node started, but what if the
cache is not yet started in the coordinator? The coordinator now
becomes a variant of the Cache rather than the CacheManager.

I think the latter is the bigger problem to solve here.

Agreed.

On Sep 10, 2010, at 7:16 PM, Paul Ferraro wrote:

OK - the plot thickens...
RequestIgnoredResponse is not actually appropriate because it's an
invalid response (i.e. extends InvalidResponse).  Oops.
So, not only would we either need to return a valid response (perhaps
null, like the behavior prior to ISPN-447 ?), but an RPC for a stopped
(or stopping) cache should also be considered valid.  For example, if I
have an app deployed on 2 nodes, and I undeploy the app from node2, this
would cause RPC-bound cache operations to fail on node1.  Actually,
these RPCs would time out, since the InboundInvocationHandler will wait
30 seconds for them to start.  That's no good.

To address this would require some changes to the behavior of some of
the ComponentStatus values.  For example, ComponentStatus.startingUp()
returns true for STOPPING and TERMINATED, and consequently
InboundInvocationHandler loops for 30 seconds hoping the cache will
start.  That doesn't seem appropriate for the use case above.  Would it
be possible to return a valid ignored response (e.g. null) for these
states?

Thoughts?

On Fri, 2010-09-10 at 11:54 -0400, Paul Ferraro wrote:
In AS clustering, there are several use cases where a specific cache
instance may not exist (or may not be started) for every member of the
group.  Currently, Infinispan treats this as an exception case, and any
cache operation resulting in an RPC will fail.  This is problematic for
the following AS use cases:

1. For a given clustering service (e.g. web session, SFSBs, entity
caching) there is a shared cache manager for all applications, while
each application uses its own cache instance.  If I have app1 running on
node1 and node2, everything is fine.  But if I deploy app2 on node1,
its membership will include node2 (because of the shared cache manager)
even though there is no cache instance for app2 on node2.  Consequently,
the cache instances for app2 will be non-functional until app2 is
deployed on node2.
2. In Hibernate's 2nd level cache, custom cache regions are created on
demand.  So, even with a single app running on 2 nodes, the first
request to cache an entity in a custom cache region on node1 will fail,
since the cache corresponding to the region will not exist on node2.

Here's the relevant code in
InboundInvocationHandlerImpl.handle(CacheRpcCommand):

String cacheName = cmd.getCacheName();
ComponentRegistry cr = gcr.getNamedComponentRegistry(cacheName);
long giveupTime = System.currentTimeMillis() + 30000; // arbitrary (?) wait time for caches to start
while (cr == null && System.currentTimeMillis() < giveupTime) {
   Thread.sleep(100);
   cr = gcr.getNamedComponentRegistry(cacheName);
}

if (cr == null) {
   if (log.isDebugEnabled()) log.debug("Cache named {0} does not exist on this cache manager!", cacheName);
   return new ExceptionResponse(new NamedCacheNotFoundException(cacheName));
   // return RequestIgnoredResponse.INSTANCE; // Suggested fix?
}
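
The suggested fix in the comment above can be sketched in isolation as follows. This is a hypothetical model and not the handler itself: the map stands in for the named component registries, Reply stands in for the response types, and the wait time is a parameter rather than the hard-coded 30000.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MissingCacheSketch {
    // Hypothetical reply type standing in for the real response classes.
    enum Reply { HANDLED, IGNORED }

    // Hypothetical registry of started caches, keyed by cache name.
    static final Map<String, Object> REGISTRIES = new ConcurrentHashMap<>();

    // Looks up the cache, polling for at most waitMillis. A cache that never
    // appears yields IGNORED rather than an exception response.
    static Reply handle(String cacheName, long waitMillis) {
        long giveUpTime = System.currentTimeMillis() + waitMillis;
        Object cr = REGISTRIES.get(cacheName);
        while (cr == null && System.currentTimeMillis() < giveUpTime) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            cr = REGISTRIES.get(cacheName);
        }
        return cr == null ? Reply.IGNORED : Reply.HANDLED;
    }

    public static void main(String[] args) {
        REGISTRIES.put("app1", new Object());
        System.out.println(handle("app1", 0));
        System.out.println(handle("app2", 20));
    }
}
```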

From the perspective of the AS, a request for a non-existent cache should
be treated the same way as a request for a stopped cache (that logic
returns RequestIgnoredResponse.INSTANCE).
As Galder pointed out, handling this case via exception was an explicit
workaround for this issue: https://jira.jboss.org/browse/ISPN-447
In the comments for ISPN-447, Manik seemed to suggest that returning an
exception is merely a workaround until this issue is fixed:
https://jira.jboss.org/browse/ISPN-434

As it stands, this is a blocker issue for AS infinispan integration.

Thoughts?

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache


--
Manik Surtani
manik@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org




