[JBoss JIRA] (ISPN-3149) Unexpected reporting of no live owners
by Michal Linhard (JIRA)
Michal Linhard created ISPN-3149:
------------------------------------
Summary: Unexpected reporting of no live owners
Key: ISPN-3149
URL: https://issues.jboss.org/browse/ISPN-3149
Project: Infinispan
Issue Type: Bug
Components: State transfer
Affects Versions: 5.3.0.Beta2
Reporter: Michal Linhard
Assignee: Mircea Markus
Running a four-node test with infinispan-server 5.3.0-SNAPSHOT
produces the following errors:
{code}
08:46:22,962 TRACE [org.infinispan.statetransfer.StateTransferManagerImpl] (MSC service thread 1-3) Starting StateTransferManager of cache testCache on node node01/default
08:46:22,969 TRACE [org.infinispan.statetransfer.StateTransferManagerImpl] (MSC service thread 1-3) Installing new cache topology CacheTopology{id=0, currentCH=DefaultConsistentHash{numSegments=40, numOwners=2, members=[node01/default]}, pendingCH=null} on cache testCache
08:46:22,972 TRACE [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) Received new topology for cache testCache, isRebalance = false, isMember = true, topology = CacheTopology{id=0, currentCH=DefaultConsistentHash{numSegments=40, numOwners=2, members=[node01/default]}, pendingCH=null}
08:46:22,972 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (MSC service thread 1-3) Signalling topology 0 is installed
08:46:22,973 TRACE [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) On cache testCache we have: added segments: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 31, 30, 34, 35, 32, 33, 38, 39, 36, 37]
08:46:22,974 DEBUG [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) Adding inbound state transfer for segments [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 31, 30, 34, 35, 32, 33, 38, 39, 36, 37] of cache testCache
08:46:22,974 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) ISPN000208: No live owners found for segment 0 of cache testCache. Current owners are: [node01/default]. Faulty owners: []
08:46:22,975 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) ISPN000208: No live owners found for segment 1 of cache testCache. Current owners are: [node01/default]. Faulty owners: []
08:46:22,976 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) ISPN000208: No live owners found for segment 2 of cache testCache. Current owners are: [node01/default]. Faulty owners: []
08:46:22,976 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) ISPN000208: No live owners found for segment 3 of cache testCache. Current owners are: [node01/default]. Faulty owners: []
08:46:22,977 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) ISPN000208: No live owners found for segment 4 of cache testCache. Current owners are: [node01/default]. Faulty owners: []
08:46:22,977 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (MSC service thread 1-3) ISPN000208: No live owners found for segment 5 of cache testCache. Current owners are:
{code}
all logs:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/user/mlinhard@REDHAT.COM...
configs:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/user/mlinhard@REDHAT.COM...
infinispan-server build info:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/user/mlinhard@REDHAT.COM...
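One possible reading of the ISPN000208 lines above is that node01/default, as the only member, looks for a remote owner to pull each segment from and finds none once it excludes itself. The sketch below only illustrates that reading. `SegmentSourceSketch`, `liveSources` and `needsInboundTransfer` are hypothetical names and this is not Infinispan's StateConsumerImpl code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Infinispan's actual implementation.
class SegmentSourceSketch {

    // Owners that could serve as a state transfer source:
    // every current owner except the local node and the faulty ones.
    static List<String> liveSources(String self, List<String> owners, Set<String> faulty) {
        List<String> candidates = new ArrayList<>();
        for (String owner : owners) {
            if (!owner.equals(self) && !faulty.contains(owner)) {
                candidates.add(owner);
            }
        }
        return candidates;
    }

    // Assumption: if the local node is the only current owner, there is
    // nobody to fetch from, so an empty source list is expected and
    // arguably should not be reported as an error.
    static boolean needsInboundTransfer(String self, List<String> owners) {
        return !(owners.size() == 1 && owners.get(0).equals(self));
    }

    public static void main(String[] args) {
        String self = "node01/default";
        List<String> owners = Collections.singletonList(self);
        Set<String> faulty = Collections.emptySet();
        System.out.println("live sources: " + liveSources(self, owners, faulty));
        System.out.println("needs inbound transfer: " + needsInboundTransfer(self, owners));
    }
}
```

With the values from the log (owners `[node01/default]`, faulty owners `[]`), the candidate list is empty even though nothing has failed, which would match the "unexpected" part of the summary.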
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 10 months
[JBoss JIRA] (ISPN-3145) DataRehashedEventTest intermittent failures
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-3145?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-3145:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
Integrated in master. Thanks!
> DataRehashedEventTest intermittent failures
> -------------------------------------------
>
> Key: ISPN-3145
> URL: https://issues.jboss.org/browse/ISPN-3145
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 5.3.0.Beta2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 5.3.0.CR1
>
>
> DataRehashedEventTest sometimes fails because of ISPN-3035.
> But other times it fails because it doesn't wait long enough for the rehash event listener to be called:
> {noformat}
> 2013-05-28 11:21:36,550 TRACE (asyncTransportThread-2,NodeA) [org.infinispan.statetransfer.DataRehashedEventTest] New event received: EventImpl{type=DATA_REHASHED, pre=true, cache=Cache '___defaultcache'@NodeA-19135, consistentHashAtStart=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-19135]}, consistentHashAtEnd=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-19135, NodeB-23459]}, newTopologyId=1}
> 2013-05-28 11:21:36,661 TRACE (testng-DataRehashedEventTest) [org.infinispan.test.TestingUtil] Node NodeA-19135 finished state transfer.
> 2013-05-28 11:21:36,661 TRACE (testng-DataRehashedEventTest) [org.infinispan.test.TestingUtil] Node NodeB-23459 finished state transfer.
> 2013-05-28 11:21:36,662 ERROR (testng-DataRehashedEventTest) [org.infinispan.test.fwk.UnitTestTestNGListener] Test testJoinAndLeave(org.infinispan.statetransfer.DataRehashedEventTest) failed.
> java.lang.AssertionError: expected [2] but found [1]
> at org.testng.Assert.fail(Assert.java:94)
> at org.testng.Assert.failNotEquals(Assert.java:494)
> at org.testng.Assert.assertEquals(Assert.java:123)
> at org.testng.Assert.assertEquals(Assert.java:370)
> at org.testng.Assert.assertEquals(Assert.java:380)
> at org.infinispan.statetransfer.DataRehashedEventTest.testJoinAndLeave(DataRehashedEventTest.java:73)
> ...
> 2013-05-28 11:21:36,670 TRACE (asyncTransportThread-4,NodeA) [org.infinispan.statetransfer.DataRehashedEventTest] New event received: EventImpl{type=DATA_REHASHED, pre=false, cache=Cache '___defaultcache'@NodeA-19135, consistentHashAtStart=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-19135]}, consistentHashAtEnd=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-19135, NodeB-23459]}, newTopologyId=2}
> {noformat}
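The trace above shows the assertion firing about 8 ms before the second DATA_REHASHED event (pre=false) arrives on another thread. A minimal sketch of one way to avoid that kind of race, assuming the test can poll for the expected event count instead of asserting immediately, follows. `AwaitEventsSketch` and `eventually` are hypothetical names written against the plain JDK, and this is not the fix that was integrated.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not the actual DataRehashedEventTest code.
class AwaitEventsSketch {

    // Polls the condition until it holds or the timeout elapses.
    static boolean eventually(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (System.nanoTime() - deadline < 0) {
                if (condition.getAsBoolean()) {
                    return true;
                }
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return condition.getAsBoolean();
    }

    // Simulates a listener whose second event is delivered late,
    // from a different thread than the one running the assertion.
    static int runScenario() {
        List<String> events = new CopyOnWriteArrayList<>();
        events.add("DATA_REHASHED pre=true");
        Thread notifier = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            events.add("DATA_REHASHED pre=false");
        });
        notifier.start();
        // Wait for both events instead of asserting right away.
        boolean sawBoth = eventually(() -> events.size() == 2, 5000);
        return sawBoth ? events.size() : -1;
    }

    public static void main(String[] args) {
        System.out.println("events seen: " + runScenario());
    }
}
```

A CountDownLatch counted down by the listener would work as well; polling is shown here only because it leaves the listener untouched.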
12 years, 10 months
[JBoss JIRA] (ISPN-2786) ThreadLocal memory leak in Tomcat
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2786?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-2786:
------------------------------------
Sorry, my previous comment is more relevant to the pull request than the issue...
The ThreadLocal documentation also says "Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive *and the ThreadLocal instance is accessible.*" So I'm pretty sure the thread-local can't leak unless the AbstractJBossMarshaller instance is leaked as well.
Indeed, your fix fixes the test, but I don't think it completely fixes the Tomcat use case. Even if it did remove all thread-local values properly, if we leak the AbstractJBossMarshaller then the classloader is still accessible and we still have a memory leak.
> ThreadLocal memory leak in Tomcat
> ---------------------------------
>
> Key: ISPN-2786
> URL: https://issues.jboss.org/browse/ISPN-2786
> Project: Infinispan
> Issue Type: Bug
> Components: Marshalling, Transactions
> Affects Versions: 5.1.8.Final
> Reporter: Johann Burkard
> Assignee: Galder Zamarreño
> Labels: leak, local, memory, thread, threadlocal
> Fix For: 5.3.0.Final
>
>
> Just started an app using Infinispan 5.1.8.Final on Tomcat and got a few ThreadLocal problems during un-deployment:
> (Shortened)
> {code}
> key=org.jboss.marshalling.UTFUtils.BytesHolder
> value=org.jboss.marshalling.UTFUtils$BytesHolder@697a1686
> key=java.lang.ThreadLocal@36ed5ba6
> value=org.infinispan.context.SingleKeyNonTxInvocationContext{flags=null}
> key=org.infinispan.marshall.jboss.AbstractJBossMarshaller$1
> value=org.infinispan.marshall.jboss.AbstractJBossMarshaller$1@75f10df7
> value=org.infinispan.marshall.jboss.AbstractJBossMarshaller.PerThreadInstanceHolder
> {code}
> I do call {{DefaultCacheManager#shutdown()}} during un-deployment. :)
> Thanks
12 years, 10 months
[JBoss JIRA] (ISPN-2802) Cache recovery fails due to missing responses
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-2802?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on ISPN-2802:
-----------------------------------
[~dan.berindei] I believe that I've checked the threadpool and it was not the case, but I don't remember for sure.
I'd suggest closing this issue for now. If it appears again in future resilience testing, I'll reopen it.
> Cache recovery fails due to missing responses
> ---------------------------------------------
>
> Key: ISPN-2802
> URL: https://issues.jboss.org/browse/ISPN-2802
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.0.CR3
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Fix For: 5.3.0.CR1, 5.3.0.Final
>
>
> When the cache recovery is started, the new coordinator sends CacheTopologyControlCommand.GET_STATUS to all nodes and waits for responses. However, I have a reproducible test-case where it always times out waiting for the responses.
> Here are the logs (TRACE is not doable here, but I added some byteman traces - see topology.btm in the archive): http://dl.dropbox.com/u/103079234/recovery.zip
> The problematic spot is on node3 at 05:37:57 receiving cluster view 34.
> All nodes (except the one which is killed, in this case node1) respond quickly to the GET_STATUS command (see BYTEMAN Receiving - Received pairs, these are bound to command execution in CommandAwareRpcDispatcher), but some responses are not received on node3 (look for Receiving rsp bound to GroupRequest).
> JGroups tracing could be useful here but it is not available (intensive logging often blocks on internal log4j locks and the node becomes unresponsive).
> As mentioned above, the case is reproducible, therefore if you can suggest any particular BYTEMAN hook, I can try it.
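To narrow down which responses go missing, it can help to record, per member, whether a reply arrived before the timeout. The sketch below shows that bookkeeping with plain JDK futures. `StatusGatherSketch`, `missingResponders` and the member names in `main` are hypothetical, and nothing here uses the Infinispan or JGroups APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not how Infinispan collects GET_STATUS replies.
class StatusGatherSketch {

    // Waits up to timeoutMillis for all pending replies, then returns the
    // members whose reply is still outstanding or failed.
    static Set<String> missingResponders(Map<String, CompletableFuture<String>> pending,
                                         long timeoutMillis) {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(pending.values().toArray(new CompletableFuture<?>[0]));
        try {
            all.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Some replies are late or failed; they are reported below.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Set<String> missing = new TreeSet<>();
        for (Map.Entry<String, CompletableFuture<String>> entry : pending.entrySet()) {
            CompletableFuture<String> reply = entry.getValue();
            if (!reply.isDone() || reply.isCompletedExceptionally()) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        // Hypothetical members: one replies immediately, one never does.
        pending.put("memberA", CompletableFuture.completedFuture("status"));
        pending.put("memberB", new CompletableFuture<>());
        System.out.println("no reply from: " + missingResponders(pending, 200));
    }
}
```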
12 years, 10 months
[JBoss JIRA] (ISPN-2786) ThreadLocal memory leak in Tomcat
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2786?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-2786:
------------------------------------
[~galderz], the {{ThreadLocal.remove()}} documentation says that it only "Removes the current thread's value for this thread-local variable." So if the thread-local was accessed from multiple threads (e.g. because it's used in a webapp and each HTTP request is handled by a different thread), calling {{ThreadLocal.remove()}} will *not* remove all the instances of the thread local.
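The per-thread behaviour of {{ThreadLocal.remove()}} can be shown with a small standalone program. `ThreadLocalRemoveSketch` and `HOLDER` are hypothetical names used only for this illustration; the sketch does not touch AbstractJBossMarshaller.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; plain JDK, no Infinispan classes.
class ThreadLocalRemoveSketch {

    static final ThreadLocal<String> HOLDER = new ThreadLocal<>();

    // Sets a value on a worker thread, calls remove() on the calling thread,
    // and returns what the worker still sees afterwards.
    static String workerValueAfterCallerRemove() {
        CountDownLatch workerSet = new CountDownLatch(1);
        CountDownLatch callerRemoved = new CountDownLatch(1);
        AtomicReference<String> seenByWorker = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            HOLDER.set("worker-value");
            workerSet.countDown();
            try {
                callerRemoved.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Read the worker's own copy after the caller has called remove().
            seenByWorker.set(HOLDER.get());
        });
        worker.start();

        try {
            workerSet.await();
            HOLDER.set("caller-value");
            HOLDER.remove(); // clears only the calling thread's value
            callerRemoved.countDown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByWorker.get();
    }

    public static void main(String[] args) {
        System.out.println("worker still sees: " + workerValueAfterCallerRemove());
        System.out.println("caller sees: " + HOLDER.get());
    }
}
```

The worker's copy survives the caller's {{remove()}}, which is why a single cleanup call during un-deployment cannot clear values set by other request threads.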
12 years, 10 months