[JBoss JIRA] (ISPN-4717) Hot Rod 2.0 should add error codes for suspected nodes and stopping/stopped caches
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4717?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4717:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Hot Rod 2.0 should add error codes for suspected nodes and stopping/stopped caches
> ----------------------------------------------------------------------------------
>
> Key: ISPN-4717
> URL: https://issues.jboss.org/browse/ISPN-4717
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 6.0.2.Final
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 7.0.0.CR1, 7.0.0.Final
>
>
> The way the Hot Rod protocol deals with suspect exceptions is hacky: it inspects the error message to detect whether a SuspectException has been passed in. Instead, suspect exceptions should have a dedicated error code so that clients can handle them appropriately.
> On top of that, another exception that should be handled silently, with failover, is the one raised when a cache is stopping or stopped. -Currently, this produces the following log messages without affecting functionality- Scrap that, it does get propagated to the client without any chance to fail over, so it's a bug:
> {code}
> 2014-09-11 08:11:04,984 ERROR [HotRodDecoder] (HotRodServerWorker-6-1) ISPN005003: Exception reported
> java.lang.IllegalStateException: Default cache is in 'STOPPING' state and this is an invocation not belonging to an on-going transaction, so it does not accept new invocations. Either restart it or recreate the cache container.
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:94)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:33)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
> at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1490)
> at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:968)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:960)
> at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:485)
> at org.infinispan.server.core.AbstractProtocolDecoder.put(AbstractProtocolDecoder.scala:252)
> at org.infinispan.server.core.AbstractProtocolDecoder.org$infinispan$server$core$AbstractProtocolDecoder$$decodeValue(AbstractProtocolDecoder.scala:207)
> at org.infinispan.server.core.AbstractProtocolDecoder.decodeDispatch(AbstractProtocolDecoder.scala:73)
> at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:61)
> at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:362)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149)
> at org.infinispan.server.core.AbstractProtocolDecoder.channelRead(AbstractProtocolDecoder.scala:471)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:332)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:318)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:125)
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at java.lang.Thread.run(Thread.java:744)
> 2014-09-11 08:11:04,990 ERROR [HotRodDecoder] (HotRodServerWorker-6-1) ISPN005009: Unexpected error before any request parameters read
> io.netty.handler.codec.DecoderException: org.infinispan.server.hotrod.HotRodException: java.lang.IllegalStateException: Default cache is in 'STOPPING' state and this is an invocation not belonging to an on-going transaction, so it does not accept new invocations. Either restart it or recreate the cache container.
> at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:417)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149)
> at org.infinispan.server.core.AbstractProtocolDecoder.channelRead(AbstractProtocolDecoder.scala:471)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:332)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:318)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:125)
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: org.infinispan.server.hotrod.HotRodException: java.lang.IllegalStateException: Default cache is in 'STOPPING' state and this is an invocation not belonging to an on-going transaction, so it does not accept new invocations. Either restart it or recreate the cache container.
> at org.infinispan.server.hotrod.HotRodDecoder.createServerException(HotRodDecoder.scala:204)
> at org.infinispan.server.core.AbstractProtocolDecoder.decodeDispatch(AbstractProtocolDecoder.scala:77)
> at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:61)
> at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:362)
> ... 12 more
> Caused by: java.lang.IllegalStateException: Default cache is in 'STOPPING' state and this is an invocation not belonging to an on-going transaction, so it does not accept new invocations. Either restart it or recreate the cache container.
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:94)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:33)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
> at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1490)
> at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:968)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:960)
> at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:485)
> at org.infinispan.server.core.AbstractProtocolDecoder.put(AbstractProtocolDecoder.scala:252)
> at org.infinispan.server.core.AbstractProtocolDecoder.org$infinispan$server$core$AbstractProtocolDecoder$$decodeValue(AbstractProtocolDecoder.scala:207)
> at org.infinispan.server.core.AbstractProtocolDecoder.decodeDispatch(AbstractProtocolDecoder.scala:73)
> ... 14 more
> 2014-09-11 08:11:04,991 WARN [Codec20] (ForkThread-1,DistTopologyChangeUnderLoadTest) ISPN004005: Error received from the server: io.netty.handler.codec.DecoderException: org.infinispan.server.hotrod.HotRodException: java.lang.IllegalStateException: Default cache is in 'STOPPING' state and this is an invocation not belonging to an on-going transaction, so it does not accept new invocations. Either restart it or recreate the cache container.
> {code}
> The stopping/stopped cache states should get a dedicated error code too, so that clients can handle them properly.
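> To illustrate, here is a minimal, self-contained Java sketch of what client-side handling could look like once dedicated codes exist; the constant values and the 0x85 comparison are assumptions for illustration, not the codes actually assigned by this issue:
> {code}
> // Values below are placeholders, not the real protocol constants.
> public class ErrorCodeSketch {
>    static final int NODE_SUSPECTED = 0x87;          // assumed value
>    static final int ILLEGAL_LIFECYCLE_STATE = 0x88; // assumed value
>
>    static boolean shouldFailOver(int status) {
>       // Both conditions are transient, so the client should retry on
>       // another server instead of parsing the error message for
>       // "SuspectException" as it does today.
>       return status == NODE_SUSPECTED || status == ILLEGAL_LIFECYCLE_STATE;
>    }
>
>    public static void main(String[] args) {
>       System.out.println(shouldFailOver(0x87)); // true: fail over silently
>       System.out.println(shouldFailOver(0x85)); // false: surface the error
>    }
> }
> {code}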
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
[JBoss JIRA] (ISPN-4512) CacheManagerTest.testCacheManagerRestartReusingConfigurations random failures
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-4512?page=com.atlassian.jira.plugin.... ]
Work on ISPN-4512 stopped by William Burns.
-------------------------------------------
> CacheManagerTest.testCacheManagerRestartReusingConfigurations random failures
> -----------------------------------------------------------------------------
>
> Key: ISPN-4512
> URL: https://issues.jboss.org/browse/ISPN-4512
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.0.0.Alpha4
> Reporter: Dan Berindei
> Assignee: William Burns
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.0.0.CR1
>
> Attachments: CacheManagerTest_t_ISPN-4154_failing_elasticity_test_20140707.log.gz
>
>
> When a new cache manager is started with the same configuration, it reuses the same JGroupsTransport instance. In some rare cases, the JGroupsTransport keeps using the old marshaller, which no longer works, and the cache fails to start:
> {noformat}
> 23:54:08,203 TRACE (testng-CacheManagerTest:___defaultcache) [JGroupsTransport] dests=[NodeB-24139], command=CacheTopologyControlCommand{cache=___defaultcache, type=JOIN, sender=NodeA-33664, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.ReplicatedConsistentHashFactory@b8c8791, hashFunction=MurmurHash3, numSegments=60, numOwners=2, timeout=240000, totalOrder=false, distributed=false}, topologyId=0, currentCH=null, pendingCH=null, throwable=null, viewId=3}, mode=SYNCHRONOUS, timeout=240000
> 23:54:08,207 DEBUG (testng-CacheManagerTest:___defaultcache) [VersionAwareMarshaller] Object is not serializable
> java.io.NotSerializableException: org.infinispan.topology.CacheTopologyControlCommand
> at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:890)
> at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
> at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
> at org.infinispan.commons.marshall.jboss.AbstractJBossMarshaller.objectToObjectStream(AbstractJBossMarshaller.java:73)
> at org.infinispan.marshall.core.VersionAwareMarshaller.objectToBuffer(VersionAwareMarshaller.java:77)
> at org.infinispan.commons.marshall.AbstractMarshaller.objectToBuffer(AbstractMarshaller.java:41)
> at org.infinispan.commons.marshall.AbstractDelegatingMarshaller.objectToBuffer(AbstractDelegatingMarshaller.java:85)
> at org.infinispan.remoting.transport.jgroups.MarshallerAdapter.objectToBuffer(MarshallerAdapter.java:23)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.marshallCall(CommandAwareRpcDispatcher.java:335)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:352)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:165)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:526)
> at org.infinispan.topology.LocalTopologyManagerImpl.executeOnCoordinator(LocalTopologyManagerImpl.java:290)
> at org.infinispan.topology.LocalTopologyManagerImpl.join(LocalTopologyManagerImpl.java:100)
> at org.infinispan.statetransfer.StateTransferManagerImpl.start(StateTransferManagerImpl.java:104)
> {noformat}
> The only test that does this is CacheManagerTest.testCacheManagerRestartReusingConfigurations.
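> A rough Java sketch of the restart pattern that test exercises, assuming the core testing API ({{TestCacheManagerFactory}}); that the transport reuse happens via the reused configuration objects is an assumption based on the description above:
> {code}
> import org.infinispan.configuration.cache.Configuration;
> import org.infinispan.configuration.global.GlobalConfiguration;
> import org.infinispan.manager.DefaultCacheManager;
> import org.infinispan.manager.EmbeddedCacheManager;
> import org.infinispan.test.fwk.TestCacheManagerFactory;
>
> public class RestartSketch {
>    public static void main(String[] args) {
>       EmbeddedCacheManager cm = TestCacheManagerFactory.createClusteredCacheManager();
>       cm.getCache();
>       GlobalConfiguration gc = cm.getCacheManagerConfiguration();
>       Configuration defaultCfg = cm.getDefaultCacheConfiguration();
>       cm.stop();
>       // Restarting from the reused configuration can hand the new manager the
>       // old JGroupsTransport, whose stopped marshaller then throws the
>       // NotSerializableException above during the JOIN command.
>       EmbeddedCacheManager cm2 = new DefaultCacheManager(gc, defaultCfg);
>       try {
>          cm2.getCache(); // fails intermittently on cluster join
>       } finally {
>          cm2.stop();
>       }
>    }
> }
> {code}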
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
[JBoss JIRA] (ISPN-4584) Stricter validation of cache configurations for distributed indexes
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-4584?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant commented on ISPN-4584:
---------------------------------------
Gustavo, I like the config proposal; however, I would not embed a cache declaration inside the <infinispan> element, but just reference a named cache to be used as a template.
> Stricter validation of cache configurations for distributed indexes
> -------------------------------------------------------------------
>
> Key: ISPN-4584
> URL: https://issues.jboss.org/browse/ISPN-4584
> Project: Infinispan
> Issue Type: Enhancement
> Components: Lucene Directory
> Reporter: Sanne Grinovero
> Assignee: Gustavo Fernandes
> Priority: Minor
>
> See also ISPN-4577: it should not be allowed to configure a distributed metadata cache while the chunks cache is using local mode (a sketch of such a check follows below).
> Ideally, think of additional strict checks we should apply... suggestions?
> Mitigated by ISPN-4340
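> A minimal sketch of the check described above, against the standard Configuration API; the class name, exception message, and placement are illustrative:
> {code}
> import org.infinispan.commons.CacheConfigurationException;
> import org.infinispan.configuration.cache.CacheMode;
> import org.infinispan.configuration.cache.Configuration;
>
> public class IndexCacheValidator {
>    static void validate(Configuration metadata, Configuration chunks) {
>       CacheMode metadataMode = metadata.clustering().cacheMode();
>       CacheMode chunksMode = chunks.clustering().cacheMode();
>       // Reject the ISPN-4577 combination: clustered metadata, local chunks.
>       if (metadataMode.isClustered() && !chunksMode.isClustered()) {
>          throw new CacheConfigurationException("The index metadata cache is " +
>             metadataMode + " but the chunks cache is " + chunksMode +
>             "; both caches must be clustered consistently.");
>       }
>    }
> }
> {code}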
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
[JBoss JIRA] (ISPN-4802) HotRodConcurrentStartTest.testConcurrentStartup random failures
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4802?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-4802:
------------------------------------
The test also seems to leak HotRod servers when the future times out, sometimes causing failures in other tests:
{noformat}
00:24:46,784 ERROR (testng-HotRodReplicatedEventsTest:) [UnitTestTestNGListener] Configuration method createBeforeClass(org.infinispan.server.hotrod.event.HotRodReplicatedEventsTest) threw an exception
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:476)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:454)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:439)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:195)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:338)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:370)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at java.lang.Thread.run(Thread.java:745)
{noformat}
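A self-contained Java sketch of the cleanup idea; the {{Server}} type stands in for {{HotRodServer}}, and everything here is illustrative rather than the actual test code:
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class LeakFreeStartSketch {
   interface Server { void stop(); }

   static Server startServer(int port) { return () -> {}; } // placeholder

   public static void main(String[] args) throws Exception {
      ExecutorService exec = Executors.newFixedThreadPool(2);
      // Track every server the forked task manages to start...
      List<Server> servers = Collections.synchronizedList(new ArrayList<>());
      try {
         Future<?> f = exec.submit(() -> servers.add(startServer(13081)));
         f.get(20, TimeUnit.SECONDS);
      } finally {
         // ...so each one is stopped even if f.get() times out, releasing the
         // bound port for the next test instead of leaking it.
         for (Server s : servers)
            s.stop();
         exec.shutdownNow();
      }
   }
}
{code}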
It would also be nice if the threads creating the servers included the test name (e.g. by using {{AbstractInfinispanTest.fork()}}), as that would make filtering the logs much easier.
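A fragment of what that could look like; the {{fork(Callable)}} signature is assumed from {{AbstractInfinispanTest}}, and {{startHotRodServer}} stands for the {{HotRodTestingUtil}} helper:
{code}
// Illustrative fragment, not the actual test code. Running the startup via
// fork() names the worker thread after the test class, so the log prefix
// becomes e.g. "(ForkThread-1,HotRodConcurrentStartTest)" and is easy to grep.
Future<HotRodServer> f = fork(() -> startHotRodServer(cacheManager, port));
HotRodServer server = f.get(30, TimeUnit.SECONDS);
{code}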
> HotRodConcurrentStartTest.testConcurrentStartup random failures
> ---------------------------------------------------------------
>
> Key: ISPN-4802
> URL: https://issues.jboss.org/browse/ISPN-4802
> Project: Infinispan
> Issue Type: Bug
> Components: Server, Test Suite - Server
> Affects Versions: 7.0.0.Beta2
> Reporter: Dan Berindei
> Assignee: Galder Zamarreño
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 7.0.0.CR1
>
>
> Sometimes it takes a lot of time to start the cluster, and the 20s timeout is not enough:
> {noformat}
> 10:44:42,144 INFO (ForkJoinPool-1-worker-1:) [HotRodTestingUtil$] Start server in port 13081
> 10:44:42,171 INFO (ForkJoinPool-1-worker-3:) [HotRodTestingUtil$] Start server in port 13091
> 10:44:42,234 INFO (ForkJoinPool-1-worker-1:) [JGroupsTransport] ISPN000078: Starting JGroups channel ISPN
> 10:44:42,254 INFO (ForkJoinPool-1-worker-3:) [JGroupsTransport] ISPN000078: Starting JGroups channel ISPN
> 10:44:47,383 DEBUG (ForkJoinPool-1-worker-3:) [GMS] address=HotRodConcurrentStartTest-NodeB-30943, cluster=ISPN, physical address=127.0.0.1:9000
> 10:44:47,746 DEBUG (ForkJoinPool-1-worker-3:) [CacheImpl] Started cache __cluster_registry_cache__ on HotRodConcurrentStartTest-NodeB-30943
> 10:44:47,750 DEBUG (ForkJoinPool-1-worker-3:) [CacheImpl] Started cache ___defaultcache on HotRodConcurrentStartTest-NodeB-30943
> 10:44:48,078 DEBUG (ForkJoinPool-1-worker-1:) [GMS] address=HotRodConcurrentStartTest-NodeA-34821, cluster=ISPN, physical address=127.0.0.1:9001
> 10:44:48,187 DEBUG (ForkJoinPool-1-worker-3:) [CacheImpl] Started cache hotRodConcurrentStart on HotRodConcurrentStartTest-NodeB-30943
> 10:44:48,308 DEBUG (ForkJoinPool-1-worker-3:) [HotRodTestingUtil$$anon$1] Externally facing address is 127.0.0.1:13091
> 10:44:48,556 DEBUG (ForkJoinPool-1-worker-3:) [CacheImpl] Started cache ___hotRodTopologyCache on HotRodConcurrentStartTest-NodeB-30943
> 10:44:48,557 DEBUG (ForkJoinPool-1-worker-3:) [HotRodTestingUtil$$anon$1] Map HotRodConcurrentStartTest-NodeB-30943 cluster address with 127.0.0.1:13091 server endpoint in address cache
> 10:44:50,947 DEBUG (ForkJoinPool-1-worker-1:) [CacheImpl] Started cache __cluster_registry_cache__ on HotRodConcurrentStartTest-NodeA-34821
> 10:44:50,952 DEBUG (ForkJoinPool-1-worker-1:) [CacheImpl] Started cache ___defaultcache on HotRodConcurrentStartTest-NodeA-34821
> 10:44:51,925 DEBUG (ForkJoinPool-1-worker-1:) [CacheImpl] Started cache hotRodConcurrentStart on HotRodConcurrentStartTest-NodeA-34821
> 10:45:02,048 DEBUG (ForkJoinPool-1-worker-1:) [HotRodTestingUtil$$anon$1] Externally facing address is 127.0.0.1:13081
> 10:45:02,248 DEBUG (ForkJoinPool-1-worker-1:) [LocalTopologyManagerImpl] Node HotRodConcurrentStartTest-NodeA-34821 joining cache ___hotRodTopologyCache
> 10:45:02,356 ERROR (testng-HotRodConcurrentStartTest:) [UnitTestTestNGListener] Test testConcurrentStartup(org.infinispan.server.hotrod.HotRodConcurrentStartTest) failed.
> java.util.concurrent.TimeoutException: Futures timed out after [20 seconds]
> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:116)
> at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> at scala.concurrent.Await$.result(package.scala:116)
> at org.infinispan.server.hotrod.HotRodConcurrentStartTest.testConcurrentStartup(HotRodConcurrentStartTest.scala:64)
> {noformat}
> http://ci.infinispan.org/viewLog.html?buildId=12599&buildTypeId=bt8
> http://ci.infinispan.org/viewLog.html?buildId=12408&buildTypeId=Infinispa...
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
[JBoss JIRA] (ISPN-4801) ConcurrentModificationException on the FileListCacheValue
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-4801?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-4801:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
Integrated. Thanks [~gustavonalle]!
> ConcurrentModificationException on the FileListCacheValue
> ---------------------------------------------------------
>
> Key: ISPN-4801
> URL: https://issues.jboss.org/browse/ISPN-4801
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying
> Affects Versions: 7.0.0.Beta2
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Priority: Critical
> Fix For: 7.0.0.CR1
>
>
> Since ISPN-4692 made FileListCacheValue DeltaAware, the following happens when running {{org.infinispan.lucene.profiling.PerformanceCompareStressTest}}:
> {code}
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
> at java.util.HashMap$KeyIterator.next(HashMap.java:956)
> at java.util.AbstractCollection.toArray(AbstractCollection.java:195)
> at org.infinispan.lucene.impl.FileListCacheValue.toArray(FileListCacheValue.java:109)
> at org.infinispan.lucene.impl.FileListOperations.listFilenames(FileListOperations.java:101)
> at org.infinispan.lucene.impl.DirectoryImplementor.list(DirectoryImplementor.java:56)
> at org.infinispan.lucene.impl.DirectoryLuceneV4.listAll(DirectoryLuceneV4.java:123)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:759)
> {code}
> The problem is that the deltas are not being applied in a thread-safe manner.
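> A minimal sketch of one way to close the race, with field and method names that are illustrative rather than the actual FileListCacheValue code: readers such as {{toArray()}} iterate the backing set while delta application mutates it, so both sides need to share a lock.
> {code}
> import java.util.HashSet;
> import java.util.Set;
> import java.util.concurrent.locks.ReadWriteLock;
> import java.util.concurrent.locks.ReentrantReadWriteLock;
>
> public class FileListSketch {
>    private final Set<String> filenames = new HashSet<>();
>    private final ReadWriteLock lock = new ReentrantReadWriteLock();
>
>    public String[] toArray() {
>       lock.readLock().lock();
>       try {
>          // Iteration is now safe: no delta can mutate the set concurrently.
>          return filenames.toArray(new String[filenames.size()]);
>       } finally {
>          lock.readLock().unlock();
>       }
>    }
>
>    public void applyDelta(Set<String> toAdd, Set<String> toRemove) {
>       lock.writeLock().lock();
>       try {
>          filenames.addAll(toAdd);
>          filenames.removeAll(toRemove);
>       } finally {
>          lock.writeLock().unlock();
>       }
>    }
> }
> {code}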
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)