Cache ops after view changes result in RehashInProgressException very easily
by Galder Zamarreño
Hi Dan,
Re: http://goo.gl/TGwrP
There are a few of these in the Hot Rod server and client test suites, and it's easy to replicate locally. It seems that cache operations issued right after a cache has started are rather problematic.
In a local run of HotRodReplicationTest, I was able to replicate the issue when testing topology changes. Please find the log file attached; here are the interesting bits:
1. A new view installation is being prepared with NodeA and NodeB:
2011-10-24 14:36:09,046 4221 TRACE [org.infinispan.cacheviews.CacheViewsManagerImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) ___hotRodTopologyCache: Preparing cache view CacheView{viewId=4, members=[NodeA-63227, NodeB-15806]}, committed view is CacheView{viewId=3, members=[NodeA-63227, NodeB-15806, NodeC-17654]}
…
2011-10-24 14:36:09,047 4222 DEBUG [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Blocking new transactions
2011-10-24 14:36:09,047 4222 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquiring exclusive state transfer shared lock, shared holders: 0
2011-10-24 14:36:09,047 4222 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquired state transfer lock in exclusive mode
2. The cluster coordinator discovers a view change and requests NodeA and NodeB to remove NodeC from the topology view:
2011-10-24 14:36:09,048 4223 TRACE [org.infinispan.interceptors.InvocationContextInterceptor] (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Invoked with command RemoveCommand{key=NodeC-17654, value=null, flags=null} and InvocationContext [NonTxInvocationContext{flags=null}]
3. NodeB has not yet finished installing the cache view, so that remove times out:
2011-10-24 14:36:09,049 4224 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) ISPN000136: Execution error
org.infinispan.distribution.RehashInProgressException: Timed out waiting for the transaction lock
One way to solve this is to avoid relying on cluster view changes and instead wait for the cache view to be installed before doing the operations. Is there any way to wait until then?
One way would be to have some CacheView-installed callbacks or similar. This could be a good option because I could register a CacheView listener for the Hot Rod topology cache, check its callbacks for isPre=false, and then do the cache ops safely.
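To illustrate the waiting pattern such a callback would enable, here is a small sketch (all names here are hypothetical — no such listener API exists yet; the latch simulates blocking topology-cache operations until the isPre=false view-installed callback has fired):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: block topology-cache operations until the
// post-install (isPre=false) cache view callback has been invoked.
public class ViewInstallWaiter {
    private final CountDownLatch installed = new CountDownLatch(1);

    // Would be invoked by the proposed CacheView listener callback.
    public void onCacheViewChanged(boolean isPre) {
        if (!isPre) {
            installed.countDown(); // view is now committed
        }
    }

    // Called before doing cache ops against the topology cache.
    public boolean awaitViewInstalled(long timeout, TimeUnit unit) {
        try {
            return installed.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, the Hot Rod topology update would call awaitViewInstalled() instead of racing the view installation.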
Otherwise, code like the one I used for maintaining the Hot Rod topology is going to race against your cache view installation code.
You seem to have some pieces in place for this, e.g. CacheViewListener, but it seems designed only for internal core work.
Any other suggestions?
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
12 years, 6 months
Syncing to central
by Pete Muir
Guys
I'm just looking at having Infinispan synced to Maven Central. One issue raised is that we have a number of releases which refer to SNAPSHOT dependencies (which is a really bad idea anyway). I can have them sync only the 5.1.0 releases newer than ALPHA1, which appear to be clean. But if we do this, we need to be much more careful going forward about not using SNAPSHOTs in releases.
WDYT? IMO we should definitely do this; it's good practice anyway, and much easier for users as a result.
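One way to guard against SNAPSHOTs sneaking into releases (a sketch, assuming the standard maven-enforcer-plugin) is to fail the build whenever a dependency resolves to a SNAPSHOT:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>no-snapshot-deps</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <!-- Fails the build if any dependency is a SNAPSHOT -->
          <requireReleaseDeps>
            <message>Releases must not depend on SNAPSHOTs</message>
          </requireReleaseDeps>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```

This could be bound only to the release profile so day-to-day development against SNAPSHOTs still works.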
Pete
Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)
by Dan Berindei
Hi Galder
On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño <galder(a)redhat.com> wrote:
>
> On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:
>
>> ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
>> interesting question: if the preloading happens before joining, the
>> preloading code won't know anything about the consistent hash. It will
>> load everything from the cache store, including the keys that are
>> owned by other nodes.
>
> It's been defined to work that way:
> https://docs.jboss.org/author/display/ISPN/CacheLoaders
>
> Tbh, that will only happen in shared cache stores. In non-shared ones, you'll only have data that belongs to that node.
>
Not really... in distributed mode, every time the cache starts it will
have another position on the hash wheel.
That means even with a non-shared cache store, it's likely most of the
stored keys will no longer be local.
Actually, I just noticed that you've fixed ISPN-1404, which looks like
it would solve my problem when the cache is created by a HotRod
server. I would like to extend it to work like this by default, e.g.
by using the transport's nodeName as the seed.
>> I think there is a check in place already so that the joiner won't
>> push stale data from its cache store to the other nodes, but we should
>> also discard the keys that don't map locally or we'll have stale data
>> (since we don't have a way to check if those keys are stale and
>> register to receive invalidations for those keys).
>
> +1, only for shared cache stores.
>
>>
>> What do you think, should I discard the non-local keys with the fix
>> for ISPN-1470 or should I let them be and warn the user about
>> potentially stale data?
>
> Discard only for shared cache stores.
>
> Cache configurations should be symmetrical, so if other nodes preload, they'll preload only data local to them with your change.
>
Discarding works fine from the correctness POV, but for performance
it's not that great: we may do a lot of work to preload keys and have
nothing to show for it at the end.
Enabling the fixed hash seed by default should make the performance
issue go away. I think it would also require virtual nodes enabled by
default and a way to ensure that the nodeNames are unique across the
cluster.
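The discard step being discussed could look roughly like this (a sketch with hypothetical names; the isLocal predicate stands in for an ownership check against the committed consistent hash):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Sketch of the preload-discard step: keep only the keys this node
// owns according to the (hypothetical) consistent-hash check.
public class PreloadFilter {
    public static <K, V> Map<K, V> discardNonLocal(Map<K, V> loaded,
                                                   Predicate<K> isLocal) {
        Map<K, V> kept = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : loaded.entrySet()) {
            if (isLocal.test(e.getKey())) {
                kept.put(e.getKey(), e.getValue()); // locally owned, keep
            }
            // Non-local keys are dropped: we cannot tell whether they are
            // stale, and we won't receive invalidations for them.
        }
        return kept;
    }
}
```

The performance concern above is visible here: everything is loaded from the store first, and the non-local portion is simply thrown away.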
Cheers
Dan
>>
>> Cheers
>> Dan
>>
>>
>> On Mon, Oct 3, 2011 at 3:09 AM, Manik Surtani <manik(a)jboss.org> wrote:
>>>
>>> On 28 Sep 2011, at 10:56, Dan Berindei wrote:
>>>
>>> I'm not sure if the comment is valid though, since the old
>>> StateTransferManager had priority 55 and it also cleared the data
>>> container before applying the state from the coordinator. I'm not sure
>>> how preloading and state transfer are supposed to interact, maybe
>>> Manik can help clear this up?
>>>
>>> Hmm - this is interesting. I think preloading should happen first, since
>>> the cache store may contain old data.
>>> --
>>> Manik Surtani
>>> manik(a)jboss.org
>>> twitter.com/maniksurtani
>>> Lead, Infinispan
>>> http://www.infinispan.org
>>>
>>>
>>>
>
> --
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache
>
>
Testing ISPN-200
by Israel Lacerra
Guys,
I was thinking about testing ISPN-200 in a real environment. I want to compare
the distributed queries with local queries in different scenarios. Do you
have any thoughts about this? Any suggestions?
My first idea is to create an application that puts and searches "n" values
(of size "x") in a cache with "k" nodes...
What do you think about this?
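A rough skeleton for that test client might look like this (a sketch only; the HashMap is a stand-in for the Infinispan cache and query API, and all names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Benchmark skeleton: put n values of size x, then time the lookups.
// The HashMap would be replaced by the Infinispan Cache under test,
// and the lookup loop by the (distributed or local) query calls.
public class QueryBench {
    public static long[] run(int n, int valueSize) {
        Map<String, byte[]> cache = new HashMap<>();
        Random rnd = new Random(42); // fixed seed for repeatable data

        long putStart = System.nanoTime();
        for (int i = 0; i < n; i++) {
            byte[] value = new byte[valueSize];
            rnd.nextBytes(value);
            cache.put("key-" + i, value);
        }
        long putNanos = System.nanoTime() - putStart;

        long getStart = System.nanoTime();
        int hits = 0;
        for (int i = 0; i < n; i++) {
            if (cache.get("key-" + i) != null) hits++;
        }
        long getNanos = System.nanoTime() - getStart;

        return new long[] { putNanos, getNanos, hits };
    }
}
```

Running the same loop with varying n, x, and k (one JVM per node) would give the comparison matrix you describe.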
thanks!
Israel
JBoss Logging upgrade breaking build
by Galder Zamarreño
Hi,
This JBoss Logging upgrade is rather annoying for a couple of reasons:
1. IDE integration is broken: before, the JBoss Logging processor was a dependency, which meant the annotation processor could be found on the classpath. Now it's no longer on the classpath, so IntelliJ forces you to point to a jar. However, the processor is now split between two jars: ./jboss-logging-processor/1.0.0.CR3/jboss-logging-processor-1.0.0.CR3.jar and ./jboss-logging-generator/1.0.0.CR3/jboss-logging-generator-1.0.0.CR3.jar - and the generation won't work unless the annotation processor settings point at both jars. David, why did you split these jars?
2. It breaks our build, see https://infinispan.ci.cloudbees.com/job/Infinispan-master-JDK6-tcp/268/or... - What are these errors about? And why is it logged at INFO level when it breaks the compilation? :)
[INFO] diagnostic error: All message bundles and message logger message methods must have or inherit a message.
What is wrong with https://github.com/infinispan/infinispan/blob/master/query/src/main/java/... ?
Cheers,
p.s. Can we please test these upgrades thoroughly before committing them? Thx :)
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
5.1.0.BETA3 this Monday?
by Galder Zamarreño
Hi guys,
How are things looking for a potential BETA3 this coming Monday?
I'm fairly close to finishing https://issues.jboss.org/browse/ISPN-1408. I'm hoping to send protocol version 1.1 for review this afternoon, including a pull request. It's a relatively big pull request due to the coordination of several pieces. Obviously, if Pete's configuration stuff gets changed before that, I might need more time.
What about the rest?
Vladimir, your fine-grained locking on AtomicMaps?
Manik, versioned API?
Pete, XML stuff?
Dan, anything on your plate?
And Mircea?
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)
by Galder Zamarreño
Hi,
Re: https://issues.jboss.org/browse/ISPN-1384
I've had a look at this, and the race condition could, in theory, be resolved by making InboundInvocationHandlerImpl.handle() wait for the cache not only to be defined, but also to be started. Otherwise there'll always be a potential race condition like the one shown in the log.
In this particular case, it's a clustered get command sent by a clustered cache loader that arrives at the cache before the cache is started (and before its cache loader has been created, hence the NPE).
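The proposed fix can be sketched as follows (a hypothetical simplification of InboundInvocationHandlerImpl, not the real class: inbound commands are held until the cache reports started, not just defined):

```java
// Sketch: hold inbound commands until the cache status is RUNNING,
// not merely DEFINED, so e.g. a clustered get cannot hit a
// half-initialised cache (the NPE from the missing cache loader).
public class StartAwareHandler {
    enum Status { DEFINED, RUNNING }

    private Status status = Status.DEFINED;

    // Called at the end of the cache start sequence.
    public synchronized void cacheStarted() {
        status = Status.RUNNING;
        notifyAll();
    }

    // Would wrap handle(): wait for start before dispatching.
    public synchronized boolean handle(Runnable command, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (status != Status.RUNNING) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // cache never started in time; caller retries
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        command.run();
        return true;
    }
}
```

The same wait-with-timeout shape also bounds how long a remote caller can be blocked by a cache that never finishes starting.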
Another question: is there any reason why CacheLoader is not a named cache component that can be initialised with a corresponding factory and into which other components can be injected (e.g. the marshaller, the cache, etc.)?
In this particular case, this would also resolve the issue, because ClusterCacheLoader.start() does nothing, so all the interceptor needs is a proper instance of ClusterCacheLoader. The factory would make these available before injection.
Thoughts?
Cheers,
p.s. Dan, I am aware of https://issues.jboss.org/browse/ISPN-1324, maybe you're solving this indirectly with the work for that JIRA?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Deprecated AdvancedCache.with() method?
by Galder Zamarreño
Hi,
I assume that once https://issues.jboss.org/browse/ISPN-1413 is in place, the AdvancedCache.with() method will be deprecated, right?
Remember that ISPN-1413 will result in a classloader being maintained per CacheManager.
The reason I ask is because http://goo.gl/4bfFb is broken and I'm planning to disable it (or remove it), unless we plan to support that with() call.
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
JBoss Logging and conversion format failures
by Sanne Grinovero
We're failing on log messages; I'm having a look, but I'm wondering if it rings a bell for anyone?
It complains about illegal formats, while the format is correct.
In this case the stop method is passing a long and converting it via %d:
Caused by: java.util.IllegalFormatConversionException: d != java.lang.String
at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:3999)
at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2709)
at java.util.Formatter$FormatSpecifier.print(Formatter.java:2661)
at java.util.Formatter.format(Formatter.java:2433)
at java.util.Formatter.format(Formatter.java:2367)
at java.lang.String.format(String.java:2769)
at org.jboss.logging.Log4jLogger.doLogf(Log4jLogger.java:48)
at org.jboss.logging.Logger.logf(Logger.java:2097)
at org.infinispan.util.logging.Log_$logger.tracef(Log_$logger.java:650)
at org.infinispan.transaction.TransactionTable.stop(TransactionTable.java:134)
In this other case, it's a debug message passing an int value, again
formatted with %d:
Caused by: java.util.IllegalFormatConversionException: d != java.lang.String
at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:3999)
at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2709)
at java.util.Formatter$FormatSpecifier.print(Formatter.java:2661)
at java.util.Formatter.format(Formatter.java:2433)
at java.util.Formatter.format(Formatter.java:2367)
at java.lang.String.format(String.java:2769)
at org.jboss.logging.Log4jLogger.doLogf(Log4jLogger.java:48)
at org.jboss.logging.Logger.logf(Logger.java:2097)
at org.infinispan.util.logging.Log_$logger.debugf(Log_$logger.java:875)
at org.infinispan.interceptors.InterceptorChain.printChainInfo(InterceptorChain.java:71)
Using org.jboss.logging:jboss-logging:jar:3.0.1 and version 1.0.0.CR2
of jboss-logging-processor
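The mismatch itself is easy to reproduce outside the logger: %d rejects a String argument with exactly this exception, which suggests the generated Log_$logger methods are handing the arguments to the formatter as strings (the message text below is illustrative, not the actual Infinispan message):

```java
import java.util.IllegalFormatConversionException;

// Minimal reproduction: %d with a String argument throws the same
// IllegalFormatConversionException ("d != java.lang.String") as the
// traces above; with the correct numeric type it formats fine.
public class FormatRepro {
    public static boolean failsWithString() {
        try {
            String.format("stopped after %d ms", "134"); // String, not long
            return false;
        } catch (IllegalFormatConversionException e) {
            return true;
        }
    }

    public static String worksWithLong() {
        return String.format("stopped after %d ms", 134L); // correct type
    }
}
```

So the values reaching String.format are Strings by the time the generated logger runs, even though the call sites pass a long and an int.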
startCaches and asymmetric clusters
by Sanne Grinovero
Since we now support asymmetric clusters, could we avoid scaring users
with the WARN log "You are not starting all your caches at the same
time. This can lead to problems as asymmetric clusters are not
supported, see ISPN-658"?
If you're all OK with it, I'll open an issue to drop the log.
What about the startCaches() API? We can either deprecate it, or leave
it there: one day we might improve on that and provide a more
efficient way to start multiple caches at once.
Sanne