[JBoss JIRA] (ISPN-4847) Improve indexing performance
by Radim Vansa (JIRA)
Radim Vansa created ISPN-4847:
---------------------------------
Summary: Improve indexing performance
Key: ISPN-4847
URL: https://issues.jboss.org/browse/ISPN-4847
Project: Infinispan
Issue Type: Enhancement
Components: Embedded Querying, Lucene Directory
Affects Versions: 7.0.0.CR1
Reporter: Radim Vansa
Priority: Critical
This JIRA focuses on optimizing performance for the use case where the application uses short transactions or none at all (so no batching can be applied), then performs a query and expects the result to be consistent. Please specify in the documentation whether any lag after cache.put() or tx.commit() is required, and whether the application can detect that the update has been applied.
Indexing performance is currently insufficient compared to competitors. Competitors show very low overhead for indexed writes, while our throughput is thousands of times lower [1] (configuration in [2]) in distributed mode with the index stored in a replicated cache, and we see about a 4x slowdown when comparing a replicated cache without indexing against one indexing to NRT FS/RAM.
[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-perf-query-index...
[2] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-perf-query-index...
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
11 years, 5 months
[JBoss JIRA] (ISPN-4027) TransactionTable.start() initialize the TxCleanupService thread pool even when the cache is NON_TRANSACTIONAL
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4027?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4027:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 7.0.0.CR2
Resolution: Done
> TransactionTable.start() initialize the TxCleanupService thread pool even when the cache is NON_TRANSACTIONAL
> -------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-4027
> URL: https://issues.jboss.org/browse/ISPN-4027
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 6.0.1.Final
> Reporter: Guillermo GARCIA OCHOA
> Assignee: Takayoshi Kimura
> Labels: 630
> Fix For: 7.0.0.CR2
>
>
> In the {{TransactionTable.start()}} each cache creates a thread pool and a job is scheduled to clean up completed transactions.
> {code:java}
> private void start() {
>    ...
>    totalOrder = configuration.transaction().transactionProtocol().isTotalOrder();
>    if (!totalOrder) {
>       // Periodically run a task to cleanup the transaction table from completed transactions.
>       ThreadFactory tf = new ThreadFactory() {
>          @Override
>          public Thread newThread(Runnable r) {
>             String address = rpcManager != null ? rpcManager.getTransport().getAddress().toString() : "local";
>             Thread th = new Thread(r, "TxCleanupService," + cacheName + "," + address);
>             th.setDaemon(true);
>             return th;
>          }
>       };
>       executorService = Executors.newSingleThreadScheduledExecutor(tf);
>       long interval = configuration.transaction().reaperWakeUpInterval();
>       executorService.scheduleAtFixedRate(new Runnable() {
>          @Override
>          public void run() {
>             cleanupCompletedTransactions();
>          }
>       }, interval, interval, TimeUnit.MILLISECONDS);
>    }
> }
> {code}
> As you can see in the code, even if the cache is {{NON_TRANSACTIONAL}} the job is scheduled, consuming resources to do nothing (the {{completedTransactions}} map is always empty).
> Maybe I'm missing something, but our application profiling is showing us that these threads do nothing but they are consuming precious resources because we have more than 1000 {{NON_TRANSACTIONAL}} caches.
> (i) This can be considered when solving ISPN-3702 too.
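
One possible shape for the guard described in this issue, as a minimal sketch and not the actual Infinispan patch: create the scheduler only when the cache is transactional and not using total order. The class `CleanupScheduler`, its constructor parameters, and `hasReaper()` are hypothetical stand-ins for `TransactionTable` and its configuration lookups.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for TransactionTable; not Infinispan code.
final class CleanupScheduler {
    private final boolean transactional;   // assumed to come from the cache configuration
    private final boolean totalOrder;
    private final long intervalMillis;
    private ScheduledExecutorService executorService; // stays null when no reaper is needed

    CleanupScheduler(boolean transactional, boolean totalOrder, long intervalMillis) {
        this.transactional = transactional;
        this.totalOrder = totalOrder;
        this.intervalMillis = intervalMillis;
    }

    void start() {
        // Skip the thread pool entirely for non-transactional caches.
        if (!transactional || totalOrder) {
            return;
        }
        executorService = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread th = new Thread(r, "TxCleanupService");
            th.setDaemon(true);
            return th;
        });
        executorService.scheduleAtFixedRate(this::cleanupCompletedTransactions,
                intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        if (executorService != null) {
            executorService.shutdownNow();
        }
    }

    boolean hasReaper() {
        return executorService != null;
    }

    private void cleanupCompletedTransactions() {
        // Placeholder: the real method would prune the completed-transactions map.
    }
}
```

With this guard, 1000 non-transactional caches would create no cleanup threads at all, while transactional caches keep the current behavior.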
11 years, 5 months
[JBoss JIRA] (ISPN-4702) The ForkJoin thread pool should be started lazily
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4702?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-4702:
------------------------------------
The common pool is created the first time you create an {{EquivalentConcurrentHashMapV8}}, and it can't be stopped.
But AFAICT it doesn't create any worker threads until something is submitted to the pool, just like {{ThreadPoolExecutor}}.
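
The lazy thread creation mentioned above can be checked with a small sketch. It uses the JDK's `java.util.concurrent` classes only; it does not examine the fork/join copy bundled with {{EquivalentConcurrentHashMapV8}}, so treat it as an illustration of the general behavior. The class and method names are made up for this example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class LazyPoolThreads {
    private static ThreadPoolExecutor newExecutor() {
        return new ThreadPoolExecutor(4, 4, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Worker count of a ThreadPoolExecutor that has not received any task yet.
    static int freshExecutorPoolSize() {
        ThreadPoolExecutor tpe = newExecutor();
        try {
            return tpe.getPoolSize();
        } finally {
            tpe.shutdown();
        }
    }

    // Worker count after exactly one task has been submitted and completed.
    static int executorPoolSizeAfterOneTask() {
        ThreadPoolExecutor tpe = newExecutor();
        try {
            tpe.submit(() -> { }).get();
            return tpe.getPoolSize();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            tpe.shutdown();
        }
    }

    // Worker count of a ForkJoinPool that has not received any task yet.
    static int freshForkJoinPoolSize() {
        ForkJoinPool fjp = new ForkJoinPool(4);
        try {
            return fjp.getPoolSize();
        } finally {
            fjp.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("ThreadPoolExecutor, no tasks: " + freshExecutorPoolSize());
        System.out.println("ThreadPoolExecutor, one task: " + executorPoolSizeAfterOneTask());
        System.out.println("ForkJoinPool, no tasks:       " + freshForkJoinPoolSize());
    }
}
```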
> The ForkJoin thread pool should be started lazily
> -------------------------------------------------
>
> Key: ISPN-4702
> URL: https://issues.jboss.org/browse/ISPN-4702
> Project: Infinispan
> Issue Type: Enhancement
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 7.0.0.Beta1
> Reporter: Sanne Grinovero
> Assignee: Dan Berindei
> Priority: Minor
>
> I'm not using these features, yet a significant number of threads are being started.
> We should at least start this pool only on-demand, and ideally shut it down after a grace period if it's no longer being used.
11 years, 5 months
[JBoss JIRA] (ISPN-4826) X-Site State transfer values not propagated correctly
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4826?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4826:
-----------------------------------------------
Pedro Ruivo <pruivo(a)redhat.com> changed the Status of [bug 1151523|https://bugzilla.redhat.com/show_bug.cgi?id=1151523] from NEW to POST
> X-Site State transfer values not propagated correctly
> ------------------------------------------------------
>
> Key: ISPN-4826
> URL: https://issues.jboss.org/browse/ISPN-4826
> Project: Infinispan
> Issue Type: Bug
> Components: Cross-Site Replication
> Affects Versions: 7.0.0.CR1
> Reporter: Matej Čimbora
> Assignee: Pedro Ruivo
> Fix For: 7.0.0.CR2
>
>
> Used configuration:
> a) SITE1: 2 nodes, cache testCacheSite1
>    <backups>
>       <backup site="SITE2"/>
>    </backups>
> b) SITE2: 3 nodes, cache testCacheSite1_backup – backup cache for testCacheSite1
>    <backup-for remote-cache="testCacheSite1" remote-site="SITE1"/>
> When using backup cache with name (testCacheSite1_backup) different from the name of the main cache in SITE1 (testCacheSite1), the data is not propagated to the backup cache completely. The issue seems to be fixed by using the same name for the backup cache (testCacheSite1).
> Scenario
> 1. Start site1 and write data into it (1000 entries)
> 2. Start site2 and invoke XsiteAdminOperations.pushState("SITE2")
> 3. Wait 2 minutes
> 4. Check whether the state was transferred to site2 (tested on dist & repl backup cache configs)
> a) distributed mode (numOwners=2) - expected 2000 entries in total, was 648 on site2 master & 0 on other nodes
> b) replicated mode – expected 3000 entries in total, was 1000 on site2 master & 0 on other nodes
>
> Trace log:
> 04:14:39,116 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (OOB-10,edg-perf13-23152) Silently ignoring that testCacheSite1 cache is not defined
> 04:14:39,375 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (OOB-10,edg-perf13-23152) Attempting to execute command: SingleRpcCommand{cacheName='testCacheSite1_backup', command=PutKeyValueCommand{key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_X_SITE_STATE_TRANSFER, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}} [sender=edg-perf14-31850]
> 04:14:39,376 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-10,edg-perf13-23152) Checking if transaction data was received for topology 4, current topology is 4
> 04:14:39,376 TRACE [org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl] (OOB-10,edg-perf13-23152) Added a new task: 0 task(s) are waiting
> 04:14:39,376 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread--p3-t2) Calling perform() on SingleRpcCommand{cacheName='testCacheSite1_backup', command=PutKeyValueCommand{key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_X_SITE_STATE_TRANSFER, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}}
> 04:14:39,378 TRACE [org.infinispan.commands.remote.BaseRpcInvokingCommand] (remote-thread--p3-t2) Invoking command PutKeyValueCommand{key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_X_SITE_STATE_TRANSFER, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}, with originLocal flag set to false
> 04:14:39,378 TRACE [org.infinispan.interceptors.InvocationContextInterceptor] (remote-thread--p3-t2) Invoked with command PutKeyValueCommand{key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_X_SITE_STATE_TRANSFER, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true} and InvocationContext [org.infinispan.context.impl.NonTxInvocationContext@266883cb]
> 04:14:39,379 TRACE [org.infinispan.statetransfer.StateTransferInterceptor] (remote-thread--p3-t2) handleNonTxWriteCommand for command PutKeyValueCommand{key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_X_SITE_STATE_TRANSFER, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}
> 04:14:39,380 TRACE [org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor] (remote-thread--p3-t2) Are (edg-perf13-23152) we the lock owners for key 'key_0000000000000001'? false
> 04:14:39,380 TRACE [org.infinispan.interceptors.EntryWrappingInterceptor] (remote-thread--p3-t2) Wrapping entry 'key_0000000000000001'? true
> 04:14:39,380 TRACE [org.infinispan.container.EntryFactoryImpl] (remote-thread--p3-t2) Exists in context? null
> 04:14:39,382 TRACE [org.infinispan.container.EntryFactoryImpl] (remote-thread--p3-t2) Retrieved from container null (isL1Enabled=false, isLocal=true)
> 04:14:39,382 TRACE [org.infinispan.container.EntryFactoryImpl] (remote-thread--p3-t2) Creating new entry.
> 04:14:39,388 TRACE [org.infinispan.container.EntryFactoryImpl] (remote-thread--p3-t2) Wrap key_0000000000000001 for put. Entry=ReadCommittedEntry(197b92bc){key=key_0000000000000001, value=null, oldValue=null, isCreated=true, isChanged=false, isRemoved=false, isValid=true, skipRemoteGet=false, metadata=EmbeddedMetadata{version=null}}
> 04:14:39,390 TRACE [org.infinispan.interceptors.CallInterceptor] (remote-thread--p3-t2) Executing command: PutKeyValueCommand{key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_X_SITE_STATE_TRANSFER, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}.
> 04:14:39,391 TRACE [org.infinispan.interceptors.EntryWrappingInterceptor] (remote-thread--p3-t2) About to commit entry ReadCommittedEntry(197b92bc){key=key_0000000000000001, value=value_key_0000000000000001_SITE1_ORIGINAL@testCacheSite1, oldValue=null, isCreated=true, isChanged=true, isRemoved=false, isValid=true, skipRemoteGet=false, metadata=EmbeddedMetadata{version=null}}
> 04:14:39,392 TRACE [org.infinispan.statetransfer.CommitManager] (remote-thread--p3-t2) Trying to commit. Key=key_0000000000000001. Operation Flag=PUT_FOR_X_SITE_STATE_TRANSFER, L1 invalidation=false
> 04:14:39,392 TRACE [org.infinispan.statetransfer.CommitManager] (remote-thread--p3-t2) Not committing key=key_0000000000000001. It is a state transfer key but no track is enabled!
> 04:14:39,392 TRACE [org.infinispan.interceptors.EntryWrappingInterceptor] (remote-thread--p3-t2) The return value is null
> Suspicious lines:
> 04:14:39,116 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (OOB-10,edg-perf13-23152) Silently ignoring that testCacheSite1 cache is not defined
> 04:14:39,392 TRACE [org.infinispan.statetransfer.CommitManager] (remote-thread--p3-t2) Not committing key=key_0000000000000001. It is a state transfer key but no track is enabled!
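
The expected totals in step 4 of the scenario follow from simple arithmetic on the numbers in the report (1000 entries, 3 nodes on SITE2, numOwners=2). A minimal sketch, with a hypothetical helper class that is not part of Infinispan:

```java
// Hypothetical helper illustrating the expected entry counts from the scenario.
final class ExpectedEntryCount {
    // Distributed mode: every entry is stored on numOwners nodes
    // (capped by the cluster size).
    static int distributed(int entries, int numOwners, int nodes) {
        return entries * Math.min(numOwners, nodes);
    }

    // Replicated mode: every node stores every entry.
    static int replicated(int entries, int nodes) {
        return entries * nodes;
    }

    public static void main(String[] args) {
        System.out.println("dist: " + distributed(1000, 2, 3)); // prints "dist: 2000"
        System.out.println("repl: " + replicated(1000, 3));     // prints "repl: 3000"
    }
}
```

Against these totals, the observed 648 (distributed) and 1000 (replicated) entries, all on the site master, show that the pushed state never reached the other SITE2 nodes.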
11 years, 5 months
[JBoss JIRA] (ISPN-4842) Shared consistent hash
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4842?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-4842:
------------------------------------
Takayoshi, the JGroups channel is already shared. There are probably some per-cache threads that could be merged, like the TXCleanupService thread that you already issued a PR for, however most threads should be shared.
Sharing the consistent hash would require all those caches to be symmetric - i.e. if cache {{c}} is running on one node, it must also run on all the other nodes in the cluster. We are considering allowing only symmetric caches for 8.0 (adding support for multiple cache managers on FORKed JChannels instead), which should allow sharing the consistent hash.
However, 8.0 is quite far away, so I would focus on reducing the cost of computing the consistent hash (e.g. [TopologyAware]SyncConsistentHash results may be cached) and on merging per-cache threads into global threads. Please create separate issues for other per-cache threads that you've found.
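
To illustrate the caching idea, here is a minimal memoization sketch. It is not how Infinispan computes or stores its consistent hash: `SharedHashCache` is a hypothetical class, the round-robin assignment is a toy placeholder for the real computation, and the key (member list plus segment count) is an assumption about which inputs determine the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical cache of computed owner tables, shared by all caches on a node.
final class SharedHashCache {
    private final Map<List<Object>, int[]> cache = new ConcurrentHashMap<>();
    private final AtomicInteger computations = new AtomicInteger();

    // Returns the primary owner index for each segment; members must be non-empty.
    // The returned array is shared between callers and must be treated as read-only.
    int[] primaryOwners(List<String> members, int numSegments) {
        // Key on a defensive copy of the inputs that determine the result.
        List<Object> key = Arrays.<Object>asList(new ArrayList<String>(members), numSegments);
        return cache.computeIfAbsent(key, k -> compute(members.size(), numSegments));
    }

    // Toy placeholder for the expensive consistent hash computation.
    private int[] compute(int memberCount, int numSegments) {
        computations.incrementAndGet();
        int[] owners = new int[numSegments];
        for (int s = 0; s < numSegments; s++) {
            owners[s] = s % memberCount;
        }
        return owners;
    }

    int computations() {
        return computations.get();
    }
}
```

Two caches with the same members and segment count would then trigger a single computation; a membership change produces a different key and a fresh computation. A real implementation would also need to evict entries for old topologies.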
> Shared consistent hash
> ----------------------
>
> Key: ISPN-4842
> URL: https://issues.jboss.org/browse/ISPN-4842
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 7.0.0.CR1
> Reporter: Takayoshi Kimura
>
> A user is testing a 500-node cluster with 500 dist caches defined, and plans to expand it to 3000 caches.
> Infinispan manages a consistent hash per cache, uses a JGroups channel per cache, and uses several threads per cache. This adds significant overhead in a cluster of this size. When tested at this size, Infinispan easily exhausted all threads in the thread pools and deadlocked, and it requires several thousand threads to handle the massive number of JOIN requests - the coord receives 499 * 3000 JOIN requests.
> It would be great if we could share the consistent hash and resources between caches. For example, define a "master" dist cache and allow other caches to refer to the master cache for resource sharing.
11 years, 5 months
[JBoss JIRA] (ISPN-4842) Reduce the overhead of clustered caches
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4842?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4842:
-------------------------------
Summary: Reduce the overhead of clustered caches (was: Shared consistent hash)
11 years, 5 months
[JBoss JIRA] (ISPN-4027) TransactionTable.start() initialize the TxCleanupService thread pool even when the cache is NON_TRANSACTIONAL
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-4027?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero commented on ISPN-4027:
---------------------------------------
Great catch. It's actually a regression as I remember fixing a similar issues years ago.
Also related: ISPN-4702
11 years, 5 months
[JBoss JIRA] (ISPN-4027) TransactionTable.start() initialize the TxCleanupService thread pool even when the cache is NON_TRANSACTIONAL
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-4027?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero edited comment on ISPN-4027 at 10/15/14 6:34 AM:
-----------------------------------------------------------------
Great catch. It's actually a regression as I remember fixing a similar issue years ago.
Also related: ISPN-4702
was (Author: sannegrinovero):
Great catch. It's actually a regression as I remember fixing a similar issues years ago.
Also related: ISPN-4702
11 years, 5 months