[JBoss JIRA] Work started: (ISPN-186) Smart L1 cache invalidation
by Pete Muir (JIRA)
[ https://issues.jboss.org/browse/ISPN-186?page=com.atlassian.jira.plugin.s... ]
Work on ISPN-186 started by Pete Muir.
> Smart L1 cache invalidation
> ---------------------------
>
> Key: ISPN-186
> URL: https://issues.jboss.org/browse/ISPN-186
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache
> Reporter: Manik Surtani
> Assignee: Pete Muir
> Labels: l1
> Fix For: 5.0.0.BETA1, 5.0.0.Final
>
>
> Need to build a mechanism in which L1 invalidation is NOT multicast, but instead is unicast _if necessary_ to specific nodes that may have cached a given entry. This can be detected by maintaining a list of nodes that have requested a key via a remote get, but this information would need to be relayed by all data owners.
> The benefits would be better performance, by removing invalidation messages where they are not needed, and less noise in the network stacks of most nodes.
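A minimal sketch of the idea (not Infinispan's implementation; the class and method names below are hypothetical): each data owner records which nodes fetched a key via a remote get, and a later write unicasts the invalidation only to those nodes.

import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class L1RequestorTracker<K, N> {

    private final Map<K, Set<N>> requestors = new ConcurrentHashMap<>();

    // Called by a data owner whenever it serves a remote get for 'key' to 'node'.
    void recordRequestor(K key, N node) {
        requestors.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet()).add(node);
    }

    // Called on a write: returns the nodes that may hold 'key' in their L1 cache,
    // so the invalidation can be sent point-to-point only where it is needed.
    Set<N> requestorsToInvalidate(K key) {
        Set<N> nodes = requestors.remove(key);
        return nodes != null ? nodes : Collections.emptySet();
    }
}

On a write, the owner would then send the L1 invalidation only to requestorsToInvalidate(key) instead of multicasting it to the whole cluster.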
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-493) Harden rehash leave process
by Vladimir Blagojevic (JIRA)
Harden rehash leave process
---------------------------
Key: ISPN-493
URL: https://jira.jboss.org/browse/ISPN-493
Project: Infinispan
Issue Type: Task
Affects Versions: 4.1.0.BETA2, 4.0.0.Final
Reporter: Vladimir Blagojevic
Assignee: Vladimir Blagojevic
Fix For: 5.0.0.BETA1, 5.0.0.Final
We need to make sure that the leave rehash process properly handles massive and rapid node failures.
Massive failures:
JGroups detects multiple node failures and pushes up to Infinispan views that are more "volatile" than we currently assume (we assume only one member at a time can leave). For example, if we have view V1={A,B,C,D,E} and a massive failure causes {C,D,E} to fail, JGroups failure detection and GMS will install a view V2={A,B} on the surviving members. LeaveTask does not handle this scenario.
Rapid node failure:
We need to revisit how LeaveTasks are queued up and executed/cancelled during rapid node failures. Do we always cancel currently running leave tasks? At what stage are we allowed to cancel a running task, and at what stage is it better to wait for it to complete?
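A minimal sketch of the massive-failure case, assuming the members of each JGroups view are available as lists of addresses (the helper below is hypothetical, not existing Infinispan code): a LeaveTask has to cope with every member present in the old view but missing from the new one.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ViewDiff {
    // With V1 = {A,B,C,D,E} and V2 = {A,B} this returns {C,D,E}, i.e. more than
    // one leaver at a time, which the current LeaveTask does not handle.
    static <A> Set<A> leavers(List<A> oldView, List<A> newView) {
        Set<A> left = new HashSet<>(oldView);
        left.removeAll(newView);
        return left;
    }
}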
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-902) Data consistency across rehashing
by Erik Salter (JIRA)
Data consistency across rehashing
---------------------------------
Key: ISPN-902
URL: https://issues.jboss.org/browse/ISPN-902
Project: Infinispan
Issue Type: Bug
Reporter: Erik Salter
Assignee: Manik Surtani
Priority: Critical
Attachments: cacheTest.zip
There are two scenarios we're seeing on rehashing, both of which are critical.
1. On a node leaving a running cluster, we're seeing an inordinate number of timeout errors, such as the one below. The end result is that the cluster loses data.
org.infinispan.util.concurrent.TimeoutException: Timed out waiting for valid responses!
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:417)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:101)
at org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:341)
at org.infinispan.interceptors.DistributionInterceptor.realRemoteGet(DistributionInterceptor.java:143)
at org.infinispan.interceptors.DistributionInterceptor.remoteGetAndStoreInL1(DistributionInterceptor.java:131)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:59)
06:07:44,097 WARN [GMS] cms-node-20192: merge leader did not get data from all partition coordinators [cms-node-20192, mydht1-18445], merge is cancelled
2. Joining a node into a running cluster causes transactional failures on the other nodes. Most of the time, depending on the load, a node can take upwards of 8 minutes to join.
I've attached a unit test that can reproduce these issues.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-931) NPE in /infinispan-gridfs-webdav demo
by Radoslav Husar (JIRA)
NPE in /infinispan-gridfs-webdav demo
-------------------------------------
Key: ISPN-931
URL: https://issues.jboss.org/browse/ISPN-931
Project: Infinispan
Issue Type: Bug
Components: Demos and Tutorials
Affects Versions: 4.2.1.CR1
Reporter: Radoslav Husar
Assignee: Manik Surtani
Priority: Minor
We ran into this when we were demoing the gridFS WebDAV demo at the developer conference, using GNOME's 'Connect to Server' function and then accessing it via Nautilus. It looks like an error in the third-party WebDAV library.
14:57:07,790 ERROR [WebDavServletBean] Exception: java.lang.NullPointerException
at net.sf.webdav.methods.DoPropfind.parseProperties(DoPropfind.java:256)
at net.sf.webdav.methods.DoPropfind.recursiveParseProperties(DoPropfind.java:212)
at net.sf.webdav.methods.DoPropfind.recursiveParseProperties(DoPropfind.java:227)
at net.sf.webdav.methods.DoPropfind.execute(DoPropfind.java:165)
at net.sf.webdav.WebDavServletBean.service(WebDavServletBean.java:128)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:235)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:190)
at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:92)
at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.process(SecurityContextEstablishmentValve.java:126)
at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.invoke(SecurityContextEstablishmentValve.java:70)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:158)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:330)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:829)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:598)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:662)
Before this happened, I noticed the following (maybe unrelated):
14:42:31,269 INFO [STDOUT] LockedObject.removeLockedObjectOwner()
14:42:31,269 INFO [STDOUT] java.lang.ArrayIndexOutOfBoundsException: 1
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-921) REST server war deployment problem for standalone REST server
by Michal Linhard (JIRA)
REST server war deployment problem for standalone REST server
-------------------------------------------------------------
Key: ISPN-921
URL: https://issues.jboss.org/browse/ISPN-921
Project: Infinispan
Issue Type: Bug
Components: Cache Server
Affects Versions: 4.2.1.CR1
Reporter: Michal Linhard
Assignee: Manik Surtani
When running the standalone Infinispan REST server (i.e. only the REST server creates the DefaultCacheManager), the method org.infinispan.rest.StartupListener.getMcInjectedCacheManager() fails and results in an ERROR log message:
org.jboss.kernel.spi.registry.KernelRegistryEntryNotFoundException: Entry not found with name: DefaultCacheManager
at org.jboss.kernel.plugins.registry.AbstractKernelRegistry.getEntry(AbstractKernelRegistry.java:96)
The problem is that the method org.jboss.kernel.spi.registry.KernelRegistry.getEntry, invoked via reflection, does not return null when an entry isn't found but instead throws KernelRegistryEntryNotFoundException, which is not handled by the code in getMcInjectedCacheManager().
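A minimal sketch of the missing handling, assuming getEntry keeps being invoked via reflection; the lookup by method name avoids assuming its exact parameter type, and all names below are illustrative rather than the actual StartupListener code.

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class KernelRegistryLookup {
    // Returns the registry entry, or null when the kernel throws
    // KernelRegistryEntryNotFoundException instead of returning null.
    static Object getEntryOrNull(Object kernelRegistry, Object name) {
        try {
            for (Method m : kernelRegistry.getClass().getMethods()) {
                if (m.getName().equals("getEntry") && m.getParameterCount() == 1) {
                    return m.invoke(kernelRegistry, name);
                }
            }
            return null; // no getEntry method found
        } catch (InvocationTargetException e) {
            Throwable cause = e.getCause();
            if (cause != null && cause.getClass().getSimpleName()
                    .equals("KernelRegistryEntryNotFoundException")) {
                return null; // standalone deployment: no MC-injected DefaultCacheManager
            }
            throw new IllegalStateException(cause);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

getMcInjectedCacheManager() could then treat a null result as "no MC-injected cache manager" and fall back to creating its own DefaultCacheManager, instead of logging an ERROR.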
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-929) All example/demos should have JMX stats enabled
by Galder Zamarreño (JIRA)
All example/demos should have JMX stats enabled
-----------------------------------------------
Key: ISPN-929
URL: https://issues.jboss.org/browse/ISPN-929
Project: Infinispan
Issue Type: Task
Components: Demos and Tutorials, JMX, reporting and management
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Fix For: 4.2.1.Final, 5.0.0.ALPHA3, 5.0.0.Final
All demos should have global and cache level JMX statistics enabled.
Firstly, it helps users view the stats without the need for extra tweaking.
Secondly, it makes life easier for me or QA when we want to test these demos with RHQ/JOPR.
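A minimal sketch of what each demo would need, assuming the 4.x/5.0 programmatic API (setExposeGlobalJmxStatistics / setExposeJmxStatistics are taken from memory of that API and should be checked); the demos' XML configurations would carry the equivalent attributes.

import org.infinispan.config.Configuration;
import org.infinispan.config.GlobalConfiguration;
import org.infinispan.manager.DefaultCacheManager;

public class JmxEnabledDemoConfig {
    public static DefaultCacheManager createManager() {
        GlobalConfiguration global = GlobalConfiguration.getClusteredDefault();
        global.setExposeGlobalJmxStatistics(true);   // global (cache manager) level stats

        Configuration cfg = new Configuration();
        cfg.setExposeJmxStatistics(true);            // per-cache level stats

        return new DefaultCacheManager(global, cfg);
    }
}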
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-927) Cache inserts inside a transaction are not propagated across the cluster in distributed mode
by Sudheer Krishna (JIRA)
Cache inserts inside a transaction are not propagated across the cluster in distributed mode
--------------------------------------------------------------------------------------------
Key: ISPN-927
URL: https://issues.jboss.org/browse/ISPN-927
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 4.2.0.Final
Environment: Tested in windows
Reporter: Sudheer Krishna
Assignee: Manik Surtani
Attachments: cache -framework.rar, testcase.rar, transaction-framework.rar
When I try to use transactions in distributed mode (mode=dist), changes made inside a transaction on one node are not reflected on another node. When I change the mode to replication, everything works fine, so I assume this is a bug in dist mode when transactions are enabled.
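A minimal sketch of the reported scenario (the cache name and values are made up), assuming a transaction-enabled cache running in distribution mode on two nodes:

import javax.transaction.TransactionManager;
import org.infinispan.Cache;
import org.infinispan.manager.EmbeddedCacheManager;

public class DistTxRepro {
    static void putInTx(EmbeddedCacheManager manager) throws Exception {
        Cache<String, String> cache = manager.getCache("distCache");
        TransactionManager tm = cache.getAdvancedCache().getTransactionManager();
        tm.begin();
        try {
            cache.put("key", "value");   // insert inside the transaction
            tm.commit();                 // expected: visible on the other node after commit
        } catch (Exception e) {
            tm.rollback();
            throw e;
        }
        // On the second node, cache.get("key") reportedly stays null in dist mode,
        // while the same test passes in replication mode.
    }
}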
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (ISPN-920) TimeoutException: Replication timeout in DIST_ASYNC mode
by Dror Bereznitsky (JIRA)
TimeoutException: Replication timeout in DIST_ASYNC mode
--------------------------------------------------------
Key: ISPN-920
URL: https://issues.jboss.org/browse/ISPN-920
Project: Infinispan
Issue Type: Bug
Affects Versions: 4.2.0.Final
Reporter: Dror Bereznitsky
Assignee: Manik Surtani
Attachments: infinispan-test.xml, InfinispanClusterTest.java, jgroups-test.xml, unit-test-1.log, unit-test-2.log
I'm running a simple unit test to check our Infinispan setup. The test involves starting two instances of Infinispan with a single cache in DIST_ASYNC mode.
The test creates the cache and starts putting new values every X ms.
After starting the 2nd or 3rd test instance, I always get the following exception:
org.infinispan.util.concurrent.TimeoutException: Replication timeout for DRORB-LAP-TAN-26976
at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:49)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:414)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:101)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:125)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:230)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:217)
at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:200)
at org.infinispan.distribution.JoinTask.performRehash(JoinTask.java:132)
at org.infinispan.distribution.RehashTask.call(RehashTask.java:53)
at org.infinispan.distribution.RehashTask.call(RehashTask.java:33)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Disabling the L1 cache resolves the problem.
Attached are the configuration files, test console output and the test source code.
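A minimal sketch of the workaround, assuming the 4.2 programmatic Configuration API (setCacheMode / setL1CacheEnabled are taken from memory of that API and should be checked); the same change can be made in the attached XML configuration.

import org.infinispan.config.Configuration;

public class DisableL1 {
    static Configuration distAsyncWithoutL1() {
        Configuration cfg = new Configuration();
        cfg.setCacheMode(Configuration.CacheMode.DIST_ASYNC);
        cfg.setL1CacheEnabled(false);   // workaround: avoids the replication timeout
        return cfg;
    }
}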
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira