[JBoss JIRA] (ISPN-4991) Implement clustered cache statistics
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-4991?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-4991:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 7.1.0.Beta1, 7.1.0.Final
Resolution: Done
> Implement clustered cache statistics
> ------------------------------------
>
> Key: ISPN-4991
> URL: https://issues.jboss.org/browse/ISPN-4991
> Project: Infinispan
> Issue Type: Sub-task
> Components: JMX, reporting and management
> Reporter: Vladimir Blagojevic
> Assignee: Vladimir Blagojevic
> Fix For: 7.1.0.Beta1, 7.1.0.Final
>
>
> As of the 7.0.0 release we implement cache statistics at the per-node cache level. For the Infinispan admin console we need to implement aggregate statistics for each cache across all nodes in the cluster. The implementing class should be a registered MBean and should expose statistics similar to those currently implemented by org.infinispan.interceptors.CacheMgmtInterceptor.
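As an illustration of what aggregating per-node statistics could look like, here is a minimal sketch. The class and field names (NodeCacheStats, hits, misses, stores) are hypothetical and are not the Infinispan API; the sketch only shows that cluster-wide ratios are best derived from summed counters rather than by averaging per-node ratios.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-node snapshot; names are illustrative, not Infinispan API. */
final class NodeCacheStats {
    final long hits;
    final long misses;
    final long stores;

    NodeCacheStats(long hits, long misses, long stores) {
        this.hits = hits;
        this.misses = misses;
        this.stores = stores;
    }
}

/** Aggregates per-node snapshots into cluster-wide totals for one cache. */
final class ClusterCacheStatsSketch {

    /** Sums every counter across the nodes that reported a snapshot. */
    static NodeCacheStats aggregate(List<NodeCacheStats> perNode) {
        long hits = 0, misses = 0, stores = 0;
        for (NodeCacheStats s : perNode) {
            hits += s.hits;
            misses += s.misses;
            stores += s.stores;
        }
        return new NodeCacheStats(hits, misses, stores);
    }

    /** Ratio computed from the given counters; returns 0.0 when there were no reads. */
    static double hitRatio(NodeCacheStats s) {
        long total = s.hits + s.misses;
        return total == 0 ? 0.0 : (double) s.hits / total;
    }

    public static void main(String[] args) {
        NodeCacheStats cluster = aggregate(Arrays.asList(
                new NodeCacheStats(90, 10, 50),
                new NodeCacheStats(10, 90, 5)));
        System.out.println("hits=" + cluster.hits + " misses=" + cluster.misses
                + " hitRatio=" + hitRatio(cluster));
    }
}
```

How the per-node snapshots are collected (for example through a remote call to each node) and how the result is exposed over JMX is left open here.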
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
9 years, 2 months
[JBoss JIRA] (ISPN-5103) Inefficient index updates cause high cost merges and increase overall latency
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-5103?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes commented on ISPN-5103:
-----------------------------------------
Addendum: the corner case described would work simply by omitting the common index name:
{code}
@Indexed
public class Country { ... }
@Indexed
public class Currency { ... }
cm.getCache("currencies").put(1, new Currency(...));
cm.getCache("countries").put(1, new Country(...));
{code}
> Inefficient index updates cause high cost merges and increase overall latency
> -----------------------------------------------------------------------------
>
> Key: ISPN-5103
> URL: https://issues.jboss.org/browse/ISPN-5103
> Project: Infinispan
> Issue Type: Enhancement
> Components: Embedded Querying
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Gustavo Fernandes
>
> Currently every change to the index is performed at the Lucene level by combining two operations:
> * Delete by query, using a boolean query on the id plus the entity class
> * Add
>
> Under high load, especially during merges, those numerous deletes cause very long delays and therefore high latency.
> We should instead use a simple Lucene Update to add/change documents, since internally it translates to a Delete by term plus an Add operation, and deletes by term are extremely efficient in Lucene.
> Some local tests showed the average latency of updating the index dropping by a factor of 4 with this strategy, for both the SYNC and ASYNC backends.
> With regard to sharing the index between entities, which was the original motivation for the Delete by query plus Add strategy, we have two scenarios:
> * Same cache with multiple entity types: that's a non-issue, since there's no id collision in this case
> * Different caches with the same index: this scenario happens when different caches share the same index, for example:
> {code}
> @Indexed(index = "common")
> public class Country { ... }
> @Indexed(index = "common")
> public class Currency { ... }
> cm.getCache("currencies").put(1, new Currency(...));
> cm.getCache("countries").put(1, new Country(...));
> {code}
> This would require a delete by query in order to persist both a Country and a Currency with id=1.
> It would also require setting "default.exclusive_index_use" to "false", with the associated cost of having to reopen the IndexWriter on every operation.
> Given that the performance gain of a simple Update is considerable, we should either support the corner case through extra configuration or, alternatively, generate a unique @ProvidedId that includes the entity class or the cache name and works for all the cases described above.
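To make the id collision and the composite-id idea concrete, here is a toy sketch that does not use Lucene at all. A HashMap keyed by the id term stands in for the shared index, so update() behaves like "delete by term plus add". The class SharedIndexSketch, the compositeId() helper and its "cache:class:key" format are assumptions made for illustration only; they are not how Infinispan Query generates ids.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a shared index holding at most one document per id term. */
final class SharedIndexSketch {
    static final class Country {}
    static final class Currency {}

    // id term -> document; put() replaces any previous document with the same term.
    private final Map<String, Object> docsByIdTerm = new HashMap<>();

    /** Mimics an update: delete whatever carries this id term, then add the document. */
    void update(String idTerm, Object document) {
        docsByIdTerm.put(idTerm, document);
    }

    int size() {
        return docsByIdTerm.size();
    }

    /**
     * Hypothetical unique id combining cache name, entity class and key.
     * A real format would also need to escape the ':' separator.
     */
    static String compositeId(String cacheName, Class<?> entityType, Object key) {
        return cacheName + ":" + entityType.getName() + ":" + key;
    }

    public static void main(String[] args) {
        // Plain ids: the Country with id 1 silently replaces the Currency with id 1.
        SharedIndexSketch plainIds = new SharedIndexSketch();
        plainIds.update("1", new Currency());
        plainIds.update("1", new Country());
        System.out.println("plain ids: " + plainIds.size() + " document(s)");

        // Composite ids: both documents survive in the same index.
        SharedIndexSketch compositeIds = new SharedIndexSketch();
        compositeIds.update(compositeId("currencies", Currency.class, 1), new Currency());
        compositeIds.update(compositeId("countries", Country.class, 1), new Country());
        System.out.println("composite ids: " + compositeIds.size() + " document(s)");
    }
}
```

With plain ids the second update overwrites the first, which is the corner case that forced the delete-by-query strategy; with a composite id a delete by term stays safe even when caches share an index.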
[JBoss JIRA] (ISPN-5103) Inefficient index updates cause high cost merges and increase overall latency
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-5103?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes commented on ISPN-5103:
-----------------------------------------
> Also: if we were to ban such "mixed types" from the same index we could use a straightforward Lucene Update command rather than applying the two commands Delete + Add
Yes, that's what I described above :)
But we don't necessarily need to ban mixed types; it works nicely when all the source data lives in the same cache.
The corner case (described above) is when entities with the same id living in different caches share the same index. Is there any other scenario where delete by id is unsafe?
[JBoss JIRA] (ISPN-5103) Inefficient index updates cause high cost merges and increase overall latency
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-5103?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero commented on ISPN-5103:
---------------------------------------
Also: if we where to ban such "mixed types" from the same index we could use a straight forward Lucene Update command rather than applying the two commands Delete + Add
[JBoss JIRA] (ISPN-5103) Inefficient index updates cause high cost merges and increase overall latency
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-5103?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero edited comment on ISPN-5103 at 1/5/15 10:48 AM:
----------------------------------------------------------------
Also: if we were to ban such "mixed types" from the same index we could use a straightforward Lucene Update command rather than applying the two commands Delete + Add
was (Author: sannegrinovero):
Also: if we where to ban such "mixed types" from the same index we could use a straight forward Lucene Update command rather than applying the two commands Delete + Add
[JBoss JIRA] (ISPN-5103) Inefficient index updates cause high cost merges and increase overall latency
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-5103?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero commented on ISPN-5103:
---------------------------------------
+1 to revive this discussion.
The reason that Hibernate Search can be more efficient (when using mapped Hibernate entities) is that it has comprehensive knowledge of which entities are mapped to the current index, so it can, for example, determine whether a delete by keyword is doable (it is safe when the index is known to contain no mixed class hierarchies).
I'd be tempted to introduce some limitations to Infinispan Query: for example, rejecting any mapping which would allow multiple entity types to be stored in the same index would remove a lot of complexity (and inefficiencies like this one).
Another benefit would be in the internal engine code. For example, today it's not possible to correctly compute an override for the {{Similarity}} strategy if it is defined on a parent indexed class which was only added to the known indexed types after its children: {{Similarity}} can't be changed per type and needs to apply to the whole index consistently from initial index usage.
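A rough sketch of what rejecting such mappings could look like, in plain Java. The SingleTypeIndexRegistry class and its methods are hypothetical and exist only for illustration; neither Infinispan Query nor Hibernate Search exposes this API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry that allows exactly one entity type per index name. */
final class SingleTypeIndexRegistry {
    static final class Country {}
    static final class Currency {}

    private final Map<String, Class<?>> typeByIndexName = new HashMap<>();

    /** Registers a mapping, rejecting a second entity type for the same index. */
    void register(String indexName, Class<?> entityType) {
        Class<?> existing = typeByIndexName.putIfAbsent(indexName, entityType);
        if (existing != null && !existing.equals(entityType)) {
            throw new IllegalStateException("Index '" + indexName + "' is already mapped to "
                    + existing.getName() + "; refusing to add " + entityType.getName());
        }
    }

    /** Returns the single entity type mapped to the index, or null if none is registered. */
    Class<?> typeFor(String indexName) {
        return typeByIndexName.get(indexName);
    }

    public static void main(String[] args) {
        SingleTypeIndexRegistry registry = new SingleTypeIndexRegistry();
        registry.register("common", Country.class);
        try {
            registry.register("common", Currency.class);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Once every index is known to hold a single type, a delete by term on the id is enough, and index-wide settings no longer depend on the order in which types are discovered.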