[JBoss JIRA] (ISPN-5159) Make concurrent startup smooth
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-5159?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on ISPN-5159:
-----------------------------------
I can't remember precisely in which test this was causing trouble - probably in larger cluster tests, then. [~mcimbora], could you try to run a simple test without staggered start in LLNL, with e.g. 100 nodes?
> Make concurrent startup smooth
> ------------------------------
>
> Key: ISPN-5159
> URL: https://issues.jboss.org/browse/ISPN-5159
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 7.1.0.Beta1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
>
> When many instances start in parallel, nodes often fail to detect their neighbors reliably, which results in many subclusters, merging views, etc.
> Merging two available partitions has undefined results (AFAIK). While we can expect that there are no requests to the cluster from the application ^1^, Infinispan itself uses some caches to store internal information (HotRod routing, Protobuf, etc.). It would be better if the available-available merge provided hooks for rebuilding this info.
> ^1^) Being able to start the cluster with reads/writes disabled and enable them only when the cache has the expected number of members would be convenient, too.
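The footnote's idea of keeping reads/writes disabled until the expected number of members has joined could look roughly like the sketch below. This is only an illustration in plain Java, not an Infinispan API: the class `StartupGateSketch`, the `onViewChange` callback and the `checkOpen` guard are all hypothetical names, and the choice to keep the gate open once it has opened is an assumption.

```java
/**
 * Hypothetical sketch (not an Infinispan API): reject cache operations
 * until the cluster has reached an expected member count.
 */
final class StartupGateSketch {
    private final int expectedMembers;
    // Latched: once the gate opens it stays open (an assumption of this sketch).
    private volatile boolean open;

    StartupGateSketch(int expectedMembers) {
        if (expectedMembers < 1) {
            throw new IllegalArgumentException("expectedMembers must be >= 1");
        }
        this.expectedMembers = expectedMembers;
    }

    /** Hypothetical callback, invoked whenever a new cluster view is installed. */
    void onViewChange(int memberCount) {
        if (memberCount >= expectedMembers) {
            open = true;
        }
    }

    boolean isOpen() {
        return open;
    }

    /** Guard to call before serving a read or write. */
    void checkOpen() {
        if (!open) {
            throw new IllegalStateException(
                    "Cluster has not yet reached " + expectedMembers + " members");
        }
    }

    public static void main(String[] args) {
        StartupGateSketch gate = new StartupGateSketch(3);
        gate.onViewChange(2);
        System.out.println("open after 2 members: " + gate.isOpen());
        gate.onViewChange(3);
        System.out.println("open after 3 members: " + gate.isOpen());
    }
}
```

Whether the gate should close again when members leave is a separate design question that the footnote does not address.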
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
[JBoss JIRA] (ISPN-5159) Make concurrent startup smooth
by Matej Čimbora (JIRA)
[ https://issues.jboss.org/browse/ISPN-5159?page=com.atlassian.jira.plugin.... ]
Matej Čimbora commented on ISPN-5159:
-------------------------------------
[~dan.berindei] Test with 8 nodes starting in parallel repeatedly showed no merges on startup.
[JBoss JIRA] (ISPN-5356) Transactional, optimistic-locked caches do not honour Flag.FAIL_SILENTLY
by Mitchell Archibald (JIRA)
Mitchell Archibald created ISPN-5356:
----------------------------------------
Summary: Transactional, optimistic-locked caches do not honour Flag.FAIL_SILENTLY
Key: ISPN-5356
URL: https://issues.jboss.org/browse/ISPN-5356
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 6.0.2.Final
Environment: Infinispan 6.0.2
Hibernate ORM 4.3.6
Apache Tomcat 8
JDK 1.8.0_25
Reporter: Mitchell Archibald
There is a related issue in the Hibernate JIRA, which can be found at:
https://hibernate.atlassian.net/browse/HHH-7898
However, I believe this to be an Infinispan issue, as per my analysis below.
The following Exception is thrown when two threads are simultaneously committing a JTA transaction in which an entry has been written to Hibernate's "replicated-query" cache using the same key.
{code}
10:55:02.238 ERROR saction.TransactionCoordinator - ISPN000255: Error while processing prepare
org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [0 milliseconds] on key [sql: select ...; parameters: ...; named parameters: {...}; transformer: org.hibernate.transform.CacheableResultTransformer@110f2] for requestor [GlobalTransaction:<null>:197882:local]! Lock held by [null]
at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:198)
at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:171)
at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockKeyAndCheckOwnership(AbstractTxLockingInterceptor.java:169)
at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAndRegisterBackupLock(AbstractTxLockingInterceptor.java:98)
at org.infinispan.interceptors.locking.OptimisticLockingInterceptor$LockAcquisitionVisitor.lockAndRecord(OptimisticLockingInterceptor.java:211)
at org.infinispan.interceptors.locking.OptimisticLockingInterceptor$LockAcquisitionVisitor.visitSingleKeyCommand(OptimisticLockingInterceptor.java:206)
at org.infinispan.interceptors.locking.OptimisticLockingInterceptor$LockAcquisitionVisitor.visitPutKeyValueCommand(OptimisticLockingInterceptor.java:199)
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:70)
at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.acquireLocksVisitingCommands(OptimisticLockingInterceptor.java:270)
at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:75)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:124)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.NotificationInterceptor.visitPrepareCommand(NotificationInterceptor.java:36)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:124)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:114)
at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:101)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:124)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112)
at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:96)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:124)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:110)
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:73)
at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:96)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:124)
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
at org.infinispan.transaction.TransactionCoordinator.prepare(TransactionCoordinator.java:119)
at org.infinispan.transaction.TransactionCoordinator.prepare(TransactionCoordinator.java:101)
at org.infinispan.transaction.synchronization.SynchronizationAdapter.beforeCompletion(SynchronizationAdapter.java:44)
at com.metiom.core.transaction.TransactionImpl.lambda$commit$0(TransactionImpl.java:115)
at com.metiom.core.transaction.TransactionImpl$$Lambda$2/330107372.accept(Unknown Source)
at com.metiom.core.transaction.TransactionImpl.forEachSynchronisation(TransactionImpl.java:264)
at com.metiom.core.transaction.TransactionImpl.commit(TransactionImpl.java:114)
at com.metiom.core.transaction.TransactionManagerImpl.commit(TransactionManagerImpl.java:142)
at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:1021)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:757)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:726)
at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:497)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:277)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
...
{code}
The configuration for Hibernate's "replicated-query" cache is as follows (from Hibernate's infinispan-configs.xml)
{code}
<!-- An alternative configuration for entity/collection caching that uses replication instead of invalidation -->
<namedCache name="replicated-entity">
   <clustering mode="replication">
      <stateTransfer fetchInMemoryState="false" timeout="20000"/>
      <sync replTimeout="20000"/>
   </clustering>
   <locking isolationLevel="READ_COMMITTED" concurrencyLevel="1000"
            lockAcquisitionTimeout="15000" useLockStriping="false"/>
   <!-- Eviction configuration. WakeupInterval defines how often the eviction thread runs, in milliseconds.
        0 means the eviction thread will never run. A separate executor is used for eviction in each cache. -->
   <eviction maxEntries="10000" strategy="LRU"/>
   <expiration maxIdle="100000" wakeUpInterval="5000"/>
   <lazyDeserialization enabled="true"/>
   <transaction transactionMode="TRANSACTIONAL" autoCommit="false"
                lockingMode="OPTIMISTIC"/>
</namedCache>
{code}
Additionally, the cache is configured programmatically with:
* {{Flag.ZERO_LOCK_ACQUISITION_TIMEOUT}}; and
* {{Flag.FAIL_SILENTLY}}
The situation is that two threads are concurrently attempting to lock the same key on commit.
* During a JTA transaction, a Hibernate query is executed, and Hibernate places a result in the query cache.
* On commit, the {{SynchronizationAdapter}} creates a {{PrepareCommand}}, which contains one or more "modifications", one of which is a {{PutKeyValueCommand}}. The {{SynchronizationAdapter}} then invokes the {{PrepareCommand}}.
* When visited, the {{OptimisticLockingInterceptor}} loops through the modifications held by the {{PrepareCommand}} and visits each one.
* When the {{PutKeyValueCommand}} is visited, {{LockManagerImpl}} discovers that the key is locked, and throws {{TimeoutException}}.
* The Exception is caught by {{InvocationContextInterceptor}}, which attempts to find out whether the command it was invoking is affected by {{Flag.FAIL_SILENTLY}}. {{PrepareCommand}} does not implement {{FlagAffectedCommand}}, so this test fails and the Exception is propagated.
It is important to note here that the {{PutKeyValueCommand}} _does_ implement {{FlagAffectedCommand}}, and _does_ have the {{Flag.FAIL_SILENTLY}} flag set. So the problem seems to be that {{OptimisticLockingInterceptor}} is unaware that the modifications held by the {{PrepareCommand}} could throw Exceptions that can be suppressed.
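The mismatch described above can be modelled in a few lines of plain Java. This is a simplified sketch, not Infinispan's code: `Command`, `FlaggedCommand`, `Put` and `Prepare` are hypothetical stand-ins for the real command classes, and the second check (suppress only when every wrapped modification carries the flag) is just one assumed way the behaviour might be changed, not a proposed patch.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical model of the flag check; not Infinispan's real classes. */
final class FailSilentlySketch {

    enum Flag { FAIL_SILENTLY, ZERO_LOCK_ACQUISITION_TIMEOUT }

    interface Command {}

    /** Stand-in for a command that carries flags (the role of FlagAffectedCommand). */
    interface FlaggedCommand extends Command {
        Set<Flag> flags();
    }

    /** Stand-in for PutKeyValueCommand: carries flags. */
    static final class Put implements FlaggedCommand {
        private final Set<Flag> flags;
        Put(Set<Flag> flags) { this.flags = flags; }
        @Override public Set<Flag> flags() { return flags; }
    }

    /** Stand-in for PrepareCommand: wraps modifications but carries no flags itself. */
    static final class Prepare implements Command {
        final List<Command> modifications;
        Prepare(List<Command> modifications) { this.modifications = modifications; }
    }

    /** The check as described in the report: only the outer command is inspected. */
    static boolean suppressesOuterOnly(Command c) {
        return c instanceof FlaggedCommand
                && ((FlaggedCommand) c).flags().contains(Flag.FAIL_SILENTLY);
    }

    /** Assumed alternative: also suppress when every wrapped modification carries the flag. */
    static boolean suppressesIncludingModifications(Command c) {
        if (suppressesOuterOnly(c)) {
            return true;
        }
        if (c instanceof Prepare) {
            List<Command> mods = ((Prepare) c).modifications;
            return !mods.isEmpty()
                    && mods.stream().allMatch(FailSilentlySketch::suppressesOuterOnly);
        }
        return false;
    }

    public static void main(String[] args) {
        Command put = new Put(EnumSet.of(Flag.FAIL_SILENTLY, Flag.ZERO_LOCK_ACQUISITION_TIMEOUT));
        Command prepare = new Prepare(Arrays.asList(put));
        System.out.println("outer-only check suppresses: " + suppressesOuterOnly(prepare));
        System.out.println("modification-aware check suppresses: "
                + suppressesIncludingModifications(prepare));
    }
}
```

In this model the outer-only check never sees the flag on the wrapped put, which matches the behaviour reported here. Whether a transaction with mixed flagged and unflagged modifications should be allowed to fail silently is left open.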
[JBoss JIRA] (ISPN-5355) BackupForNotSpecifiedTest.testDataGetsReplicated always fails
by Dan Berindei (JIRA)
Dan Berindei created ISPN-5355:
----------------------------------
Summary: BackupForNotSpecifiedTest.testDataGetsReplicated always fails
Key: ISPN-5355
URL: https://issues.jboss.org/browse/ISPN-5355
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Core
Affects Versions: 7.2.0.CR1
Reporter: Dan Berindei
Assignee: Tristan Tarrant
Priority: Blocker
Fix For: 7.2.0.CR1
The test seems to be failing since the fix for ISPN-5243 was integrated in master:
{noformat}
java.lang.AssertionError: expected:<v_backup_lon> but was:<null>
at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:88)
at org.infinispan.xsite.BackupForNotSpecifiedTest.testDataGetsReplicated(BackupForNotSpecifiedTest.java:67)
{noformat}
http://ci.infinispan.org/viewLog.html?buildId=24683&buildTypeId=bt8&tab=b...
[JBoss JIRA] (ISPN-5270) Deadlock in InfinispanDirectoryProvider startup
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5270?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-5270:
------------------------------------
[~sannegrinovero] the way I see it, people who are in a hurry will use the default configuration, which is to use different caches. Only people who want to "optimize" things will change the configuration, and I think it would be good to stop them from going down a path that could cause a lot of pain later.
> Deadlock in InfinispanDirectoryProvider startup
> -----------------------------------------------
>
> Key: ISPN-5270
> URL: https://issues.jboss.org/browse/ISPN-5270
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying
> Affects Versions: 7.2.0.Alpha1, 7.1.1.Final
> Reporter: Dan Berindei
> Assignee: Gustavo Fernandes
> Priority: Minor
> Attachments: surefire.stacks, surefire2.stacks
>
>
> The InfinispanDirectoryProvider tries to start the metadata, data, and locking caches when it starts up, with {{DefaultCacheManager.startCaches()}}.
> However, when one of these caches (e.g. the metadata cache) starts, {{LifecycleManager.cacheStarting()}} is called, which can then try to start the InfinispanDirectoryProvider again:
> {noformat}
> "CacheStartThread,null,LuceneIndexesMetadata" prio=10 tid=0x00007f5f74484000 nid=0xe42 in Object.wait() [0x00007f5efff48000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000c2180000> (a org.infinispan.manager.DefaultCacheManager$1)
> at java.lang.Thread.join(Thread.java:1281)
> - locked <0x00000000c2180000> (a org.infinispan.manager.DefaultCacheManager$1)
> at java.lang.Thread.join(Thread.java:1355)
> at org.infinispan.manager.DefaultCacheManager.startCaches(DefaultCacheManager.java:465)
> at org.hibernate.search.infinispan.spi.InfinispanDirectoryProvider.start(InfinispanDirectoryProvider.java:84)
> at org.hibernate.search.indexes.spi.DirectoryBasedIndexManager.initialize(DirectoryBasedIndexManager.java:88)
> at org.hibernate.search.indexes.impl.IndexManagerHolder.createIndexManager(IndexManagerHolder.java:256)
> at org.hibernate.search.indexes.impl.IndexManagerHolder.createIndexManager(IndexManagerHolder.java:513)
> - locked <0x00000000ce6001d0> (a org.hibernate.search.indexes.impl.IndexManagerHolder)
> at org.hibernate.search.indexes.impl.IndexManagerHolder.createIndexManagers(IndexManagerHolder.java:482)
> at org.hibernate.search.indexes.impl.IndexManagerHolder.buildEntityIndexBinding(IndexManagerHolder.java:91)
> - locked <0x00000000ce6001d0> (a org.hibernate.search.indexes.impl.IndexManagerHolder)
> at org.hibernate.search.spi.SearchIntegratorBuilder.initDocumentBuilders(SearchIntegratorBuilder.java:366)
> at org.hibernate.search.spi.SearchIntegratorBuilder.buildNewSearchFactory(SearchIntegratorBuilder.java:204)
> at org.hibernate.search.spi.SearchIntegratorBuilder.buildSearchIntegrator(SearchIntegratorBuilder.java:122)
> at org.hibernate.search.spi.SearchFactoryBuilder.buildSearchFactory(SearchFactoryBuilder.java:35)
> at org.infinispan.query.impl.LifecycleManager.getSearchFactory(LifecycleManager.java:260)
> at org.infinispan.query.impl.LifecycleManager.cacheStarting(LifecycleManager.java:102)
> at org.infinispan.factories.ComponentRegistry.notifyCacheStarting(ComponentRegistry.java:230)
> at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:216)
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:814)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:591)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:546)
> at org.infinispan.manager.DefaultCacheManager.access$100(DefaultCacheManager.java:115)
> at org.infinispan.manager.DefaultCacheManager$1.run(DefaultCacheManager.java:452)
> {noformat}
> This can hang the test; the attached thread dumps show {{EmbeddedCompatTest}} and {{IndexCacheStopTest}}.
[JBoss JIRA] (ISPN-5314) EventSocketTimeoutTest.testSocketTimeoutWithEvent randomly failing
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-5314?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes updated ISPN-5314:
------------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> EventSocketTimeoutTest.testSocketTimeoutWithEvent randomly failing
> ------------------------------------------------------------------
>
> Key: ISPN-5314
> URL: https://issues.jboss.org/browse/ISPN-5314
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Protocols
> Affects Versions: 7.2.0.Beta1
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 7.2.0.CR1
>
> Attachments: infinispan.log
>
>
> {code}
> org.infinispan.client.hotrod.exceptions.TransportException:: java.net.SocketTimeoutException
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.readByte(TcpTransport.java:184)
> at org.infinispan.client.hotrod.impl.protocol.Codec20.readMagic(Codec20.java:282)
> at org.infinispan.client.hotrod.impl.protocol.Codec20.readHeader(Codec20.java:94)
> at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56)
> at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:50)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:30)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:19)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:52)
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:237)
> at org.infinispan.client.hotrod.impl.RemoteCacheSupport.put(RemoteCacheSupport.java:79)
> at org.infinispan.client.hotrod.event.EventSocketTimeoutTest$1.call(EventSocketTimeoutTest.java:60)
> at org.infinispan.client.hotrod.test.HotRodClientTestingUtil.withClientListener(HotRodClientTestingUtil.java:145)
> at org.infinispan.client.hotrod.event.EventSocketTimeoutTest.testSocketTimeoutWithEvent(EventSocketTimeoutTest.java:47)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.SocketTimeoutException
> at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.readByte(TcpTransport.java:179)
> ... 32 more
> {code}