[JBoss JIRA] (ISPN-10272) RocskDbStore BatchWrite org.rocksdb.RocksDBException: unknown WriteBatch tag
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-10272?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-10272:
-----------------------------------
Sprint: ISPN+RHDG Sprint #29
> RocskDbStore BatchWrite org.rocksdb.RocksDBException: unknown WriteBatch tag
> -----------------------------------------------------------------------------
>
> Key: ISPN-10272
> URL: https://issues.jboss.org/browse/ISPN-10272
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 9.4.14.Final
> Environment: h6. OS
> Ubuntu 16.0.4 4.15.0-1045-azure x86_64
> h6. Java
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
> Reporter: Jestan Nirojan
> Assignee: Will Burns
> Priority: Major
> Fix For: 10.0.0.Beta4, 9.4.15.Final
>
>
> h6. Issue
> RocksDBStore is not closing WriteBatch native handles and under heavy write with multiple databases, following error is continuously throwing and database never recovers (following error is from infinispan 9.2.5-Final but i believe same issue exists in master , see https://github.com/infinispan/infinispan/blob/1bcfe5823e7957060080741f7d0...)
> {code:java}
> Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
> at org.rocksdb.RocksDB.write0(Native Method)
> at org.rocksdb.RocksDB.write(RocksDB.java:602)
> at org.infinispan.persistence.rocksdb.RocksDBStore.writeBatch(RocksDBStore.java:393)
> at org.infinispan.persistence.rocksdb.RocksDBStore.writeBatch(RocksDBStore.java:374)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.lambda$writeBatchToAllNonTxStores$12
> {code}
> I was also getting other errors from this list, for example "bad WriteBatch Delete"https://github.com/facebook/rocksdb/blob/cae22c53fbad071be8aa3a8543415383...
> h6. Fix
> {code:java}
> private void writeBatch(WriteBatch batch) throws InterruptedException, RocksDBException {
> semaphore.acquire();
> try {
> if (stopped)
> throw new PersistenceException("RocksDB is stopped");
> db.write(dataWriteOptions(), batch);
> } finally {
> semaphore.release();
> batch.close();
> }
> }
> {code}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months
[JBoss JIRA] (ISPN-10269) ProtobufMetadataManagerImpl.registerProtofiles always times out
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-10269?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-10269:
-----------------------------------
Sprint: ISPN+RHDG Sprint #29
> ProtobufMetadataManagerImpl.registerProtofiles always times out
> ---------------------------------------------------------------
>
> Key: ISPN-10269
> URL: https://issues.jboss.org/browse/ISPN-10269
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying, Test Suite - Query
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
>
> {{ProtobufMetadataManagerImpl.registerProtofiles()}} uses {{protobufSchemaCache.putAll(files)}} under the covers, and because of a bug {{ProtobufMetadataManagerInterceptor.visitPutMapCommand()}} it tries to lock the {{.errors}} key both on the originator and on the remote nodes (during remote prepare). The {{LockControlCommand}} goes through {{TransactionSynchronizerInterceptor}} and tries to "lock" the transaction with {{enterSynchronizationAsync()}}, but it hangs because the {{PrepareCommand}} invocation can't continue and complete its {{releaseFuture}}.
> {noformat}
> 00:59:56,177 ERROR [org.infinispan.transaction.impl.TransactionCoordinator] (pool-8-thread-1) ISPN000097: Error while processing a prepare in a single-phase transaction: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:263)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:159)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionTable.afterCompletion(TransactionTable.java:870)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.synchronization.SynchronizationAdapter.afterCompletion(SynchronizationAdapter.java:33)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction.performConsumer(AbstractTransaction.java:223)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction$AssociatingSynchronization.afterCompletion(AbstractTransaction.java:306)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.afterCompletion(SynchronizationImple.java:96)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:545)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:101)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1288)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
> at org.jboss.jts.integration@5.9.0.Final//com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:89)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.LocalTransaction.commitAndDissociate(LocalTransaction.java:77)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.ContextTransactionManager.commit(ContextTransactionManager.java:71)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.tryCommit(CacheImpl.java:1981)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandWithInjectedTx(CacheImpl.java:1901)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1874)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:1458)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:2005)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:469)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.AbstractDelegatingCache.putAll(AbstractDelegatingCache.java:459)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.EncoderCache.putAll(EncoderCache.java:693)
> at org.infinispan.remote-query.server:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.query.remote.impl.ProtobufMetadataManagerImpl.registerProtofiles(ProtobufMetadataManagerImpl.java:166)
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> ... 3 more
> {noformat}
> {{enterSynchronizationAsync()}} doesn't have any timeout. When the prepare times out on the originator and it sends a {{RollbackCommand}} to all the affected nodes, the {{RollbackCommand}} will hang as well. However, because {{ProtobufMetadataManagerImpl}} configures the protobuf metadata cache with {{invocationBatching().enable()}}, it uses {{Synchronization}}, and all the exceptions are hidden from the user.
> The failure can still causes failures later in {{RemoteQueryDescriptorIT}}:
> {noformat}
> 00:57:59.423 [ERROR] testDescriptorPropagation(org.infinispan.server.test.query.RemoteQueryDescriptorIT) Time elapsed: 38.848 s <<< FAILURE!
> java.lang.AssertionError
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.registerProtoOnServer1(RemoteQueryDescriptorIT.java:83)
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.testDescriptorPropagation(RemoteQueryDescriptorIT.java:59)
> {noformat}
> Interesting enough, it never fails when running all the tests in {{server/integration/testsuite}} on master, but it fails when running only {{RemoteQueryDescriptorIT}} with {{mvn verify -pl server/integration/testsuite -Dit.test=RemoteQueryDescriptorIT -Psuite.queries}}, and it also fails when running all the tests after I removed {{StringBasedStoreMultinodeIT.testSingleton}}. Maybe it's because multiple ITs share the same global state persistence location?
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months