[JBoss JIRA] (ISPN-10286) Segmented Store can get stuck with bulk write
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-10286?page=com.atlassian.jira.plugin... ]
Will Burns commented on ISPN-10286:
-----------------------------------
Also, the old code wasn't properly waiting on the result returned by the store's bulkUpdate. If the store were actually asynchronous, writes could still be firing after the command had completed.
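To illustrate the point about waiting on the bulkUpdate return, here is a minimal sketch using only java.util.concurrent. The AsyncStore class and both write methods are hypothetical stand-ins invented for this example; they are not the Infinispan store SPI or the actual fix.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BulkUpdateAwait {
    // Hypothetical asynchronous store (assumption, not an Infinispan class).
    static class AsyncStore {
        final List<String> written = new CopyOnWriteArrayList<>();
        // Daemon thread so the pool never keeps the JVM alive.
        final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });

        // The write happens on the pool; the stage completes when it is done.
        CompletionStage<Void> bulkUpdate(List<String> entries) {
            return CompletableFuture.runAsync(() -> written.addAll(entries), pool);
        }
    }

    // Problematic shape: the returned stage is dropped, so the caller sees
    // completion even though the write may not have happened yet.
    static CompletionStage<Void> writeIgnoringResult(AsyncStore store, List<String> entries) {
        store.bulkUpdate(entries);
        return CompletableFuture.completedFuture(null);
    }

    // Safer shape: the command's completion is tied to the store's stage.
    static CompletionStage<Void> writeAwaitingResult(AsyncStore store, List<String> entries) {
        return store.bulkUpdate(entries);
    }
}
```

With writeAwaitingResult, anything chained on the returned stage only runs once the store has actually applied the batch.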
> Segmented Store can get stuck with bulk write
> ---------------------------------------------
>
> Key: ISPN-10286
> URL: https://issues.jboss.org/browse/ISPN-10286
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 10.0.0.Alpha3
> Reporter: Will Burns
> Priority: Major
> Fix For: 10.0.0.Beta4
>
>
> The code was refactored in ComposedSegmentedLoadWriteStore to be more non-blocking friendly. Unfortunately, due to how groupBy and flatMap interact, it is possible for the bulkUpdate to never complete.
> FlatMap by default sets a parallelism level of 128. This means it will request 128 groups from groupBy, but if there are more than 128 groups, it will never complete, as groupBy must publish all groups before a single one can complete. Thus, any time we use flatMap after a groupBy, we must set the parallelism level either to Integer.MAX_VALUE or to an explicit value if we know the maximum number of groups.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 7 months
[JBoss JIRA] (ISPN-10286) Segmented Store can get stuck with bulk write
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-10286?page=com.atlassian.jira.plugin... ]
Will Burns commented on ISPN-10286:
-----------------------------------
I believe the batch size in the configuration also plays a part. Unfortunately, I can't easily replicate it with a simple unit test. I was able to replicate it very easily by using the https://github.com/infinispan/infinispan-benchmarks/tree/master/cachestores JMH test with RocksDB, configured to run with 10.0.
> Segmented Store can get stuck with bulk write
> ---------------------------------------------
>
> Key: ISPN-10286
> URL: https://issues.jboss.org/browse/ISPN-10286
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 10.0.0.Alpha3
> Reporter: Will Burns
> Priority: Major
> Fix For: 10.0.0.Beta4
>
>
> The code was refactored in ComposedSegmentedLoadWriteStore to be more non-blocking friendly. Unfortunately, due to how groupBy and flatMap interact, it is possible for the bulkUpdate to never complete.
> FlatMap by default sets a parallelism level of 128. This means it will request 128 groups from groupBy, but if there are more than 128 groups, it will never complete, as groupBy must publish all groups before a single one can complete. Thus, any time we use flatMap after a groupBy, we must set the parallelism level either to Integer.MAX_VALUE or to an explicit value if we know the maximum number of groups.
[JBoss JIRA] (ISPN-10286) Segmented Store can get stuck with bulk write
by Will Burns (Jira)
Will Burns created ISPN-10286:
---------------------------------
Summary: Segmented Store can get stuck with bulk write
Key: ISPN-10286
URL: https://issues.jboss.org/browse/ISPN-10286
Project: Infinispan
Issue Type: Bug
Components: Loaders and Stores
Affects Versions: 10.0.0.Alpha3
Reporter: Will Burns
Fix For: 10.0.0.Beta4
The code was refactored in ComposedSegmentedLoadWriteStore to be more non-blocking friendly. Unfortunately, due to how groupBy and flatMap interact, it is possible for the bulkUpdate to never complete.
FlatMap by default sets a parallelism level of 128. This means it will request 128 groups from groupBy, but if there are more than 128 groups, it will never complete, as groupBy must publish all groups before a single one can complete. Thus, any time we use flatMap after a groupBy, we must set the parallelism level either to Integer.MAX_VALUE or to an explicit value if we know the maximum number of groups.
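The stall described above can be approximated with a deliberately simplified toy model in plain Java. It does not use RxJava and glosses over buffering and request accounting; it only assumes, as the description states, that a group cannot complete until the upstream has finished and that the downstream accepts at most maxConcurrency open groups. The class and method names are invented for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class GroupByStallModel {
    /**
     * Toy model: returns true if every item can be consumed, false if the
     * pipeline would stall because a new group is needed while all
     * maxConcurrency slots are held by groups that cannot complete yet.
     */
    static <T, K> boolean completes(List<T> items, Function<T, K> keyOf, int maxConcurrency) {
        Set<K> openGroups = new HashSet<>();
        for (T item : items) {
            K key = keyOf.apply(item);
            if (!openGroups.contains(key)) {
                // A new group must be handed downstream, but groups only
                // complete once the upstream is exhausted, so no slot frees up.
                if (openGroups.size() >= maxConcurrency) {
                    return false;
                }
                openGroups.add(key);
            }
        }
        // Upstream finished: every open group can now complete.
        return true;
    }
}
```

In this model, 256 distinct keys with a limit of 128 stall, while a limit of Integer.MAX_VALUE or of 256 (the known maximum number of groups) lets the run finish, matching the two remedies named above. In RxJava itself the corresponding knob is the maxConcurrency argument accepted by flatMap.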
[JBoss JIRA] (ISPN-9816) Handle non segmented container/store for publisher more efficiently
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-9816?page=com.atlassian.jira.plugin.... ]
Will Burns commented on ISPN-9816:
----------------------------------
Actually, running without the more recent changes, the non-segmented performance is pretty much the same.
{quote}
Benchmark                     (batchSize)  (entryAmount)  (keyObjectSize)  (nodes)  (useIdentityCache)  (useStrings)  (valueObjectSize)   Mode  Cnt  Score    Error  Units
JMHBenchmarks.sizeParallel           4096          50000               10        6                true         false                100  thrpt   20  3.525  ± 0.075  ops/s
JMHBenchmarks.sizeSequential         4096          50000               10        6                true         false                100  thrpt   20  0.757  ± 0.029  ops/s
{quote}
> Handle non segmented container/store for publisher more efficiently
> -------------------------------------------------------------------
>
> Key: ISPN-9816
> URL: https://issues.jboss.org/browse/ISPN-9816
> Project: Infinispan
> Issue Type: Sub-task
> Components: Publisher
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 10.0.0.Final
>
>
> The new Publisher is designed to take segmented data containers and segmented stores into account. However, if a store or data container is not segmented, the handling can cause performance issues, although it would still behave better with respect to memory and rehash. The tradeoff is probably fine for the data container, but the performance drop for stores would be massive. We need to process all segments together in at least the non-segmented store case to retain our old performance.
> To clarify, this would require changes in the LocalPublisherManagerImpl class: where it invokes `CacheCollection.localPublisher(int)`, we would instead need to invoke `CacheCollection.localPublisher(IntSet)`, so that we only have to iterate over the store once instead of IntSet.size times.
[JBoss JIRA] (ISPN-9816) Handle non segmented container/store for publisher more efficiently
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-9816?page=com.atlassian.jira.plugin.... ]
Will Burns commented on ISPN-9816:
----------------------------------
For example, using the https://github.com/infinispan/infinispan-benchmarks/tree/master/cachestores benchmark, we can see that the non-segmented size method with RocksDB is substantially slower than the segmented one:
{quote}
Non-Segmented
Benchmark                     (batchSize)  (entryAmount)  (keyObjectSize)  (nodes)  (useIdentityCache)  (useStrings)  (valueObjectSize)   Mode  Cnt   Score    Error  Units
JMHBenchmarks.sizeParallel           4096          50000               10        6                true         false                100  thrpt   20   3.479  ± 0.067  ops/s
JMHBenchmarks.sizeSequential         4096          50000               10        6                true         false                100  thrpt   20   0.940  ± 0.038  ops/s
Segmented
Benchmark                     (batchSize)  (entryAmount)  (keyObjectSize)  (nodes)  (useIdentityCache)  (useStrings)  (valueObjectSize)   Mode  Cnt   Score    Error  Units
JMHBenchmarks.sizeParallel           4096          50000               10        6                true         false                100  thrpt   20  84.156  ± 1.816  ops/s
JMHBenchmarks.sizeSequential         4096          50000               10        6                true         false                100  thrpt   20  40.973  ± 1.703  ops/s
{quote}
> Handle non segmented container/store for publisher more efficiently
> -------------------------------------------------------------------
>
> Key: ISPN-9816
> URL: https://issues.jboss.org/browse/ISPN-9816
> Project: Infinispan
> Issue Type: Sub-task
> Components: Publisher
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 10.0.0.Final
>
>
> The new Publisher is designed to take segmented data containers and segmented stores into account. However, if a store or data container is not segmented, the handling can cause performance issues, although it would still behave better with respect to memory and rehash. The tradeoff is probably fine for the data container, but the performance drop for stores would be massive. We need to process all segments together in at least the non-segmented store case to retain our old performance.
> To clarify, this would require changes in the LocalPublisherManagerImpl class: where it invokes `CacheCollection.localPublisher(int)`, we would instead need to invoke `CacheCollection.localPublisher(IntSet)`, so that we only have to iterate over the store once instead of IntSet.size times.
[JBoss JIRA] (ISPN-9816) Handle non segmented container/store for publisher more efficiently
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-9816?page=com.atlassian.jira.plugin.... ]
Will Burns updated ISPN-9816:
-----------------------------
Description:
The new Publisher is designed to take segmented data containers and segmented stores into account. However, if a store or data container is not segmented, the handling can cause performance issues, although it would still behave better with respect to memory and rehash. The tradeoff is probably fine for the data container, but the performance drop for stores would be massive. We need to process all segments together in at least the non-segmented store case to retain our old performance.
To clarify, this would require changes in the LocalPublisherManagerImpl class: where it invokes `CacheCollection.localPublisher(int)`, we would instead need to invoke `CacheCollection.localPublisher(IntSet)`, so that we only have to iterate over the store once instead of IntSet.size times.
was: The new Publisher is designed to take segmented data containers and segmented stores into account. However, if a store or data container is not segmented, the handling can cause performance issues, although it would still behave better with respect to memory and rehash. The tradeoff is probably fine for the data container, but the performance drop for stores would be massive. We need to process all segments together in at least the non-segmented store case to retain our old performance.
> Handle non segmented container/store for publisher more efficiently
> -------------------------------------------------------------------
>
> Key: ISPN-9816
> URL: https://issues.jboss.org/browse/ISPN-9816
> Project: Infinispan
> Issue Type: Sub-task
> Components: Publisher
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 10.0.0.Final
>
>
> The new Publisher is designed to take segmented data containers and segmented stores into account. However, if a store or data container is not segmented, the handling can cause performance issues, although it would still behave better with respect to memory and rehash. The tradeoff is probably fine for the data container, but the performance drop for stores would be massive. We need to process all segments together in at least the non-segmented store case to retain our old performance.
> To clarify, this would require changes in the LocalPublisherManagerImpl class: where it invokes `CacheCollection.localPublisher(int)`, we would instead need to invoke `CacheCollection.localPublisher(IntSet)`, so that we only have to iterate over the store once instead of IntSet.size times.
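The cost difference between the per-segment and the all-segments calls can be sketched with a small model. Everything here is an assumption made for illustration: Store is a hypothetical non-segmented store that can only answer a query by scanning all of its keys, Set<Integer> stands in for IntSet, and the segment function is a plain modulo rather than Infinispan's key partitioner.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class NonSegmentedScan {
    // Hypothetical non-segmented store: every query walks all keys.
    static class Store {
        final List<Integer> keys;
        final int numSegments;
        int fullScans = 0; // counts how many times the whole store was iterated

        Store(List<Integer> keys, int numSegments) {
            this.keys = keys;
            this.numSegments = numSegments;
        }

        int segmentOf(int key) {
            return Math.floorMod(key, numSegments);
        }

        // One full pass, keeping only keys that map to a requested segment.
        List<Integer> publish(Set<Integer> segments) {
            fullScans++;
            List<Integer> out = new ArrayList<>();
            for (int key : keys) {
                if (segments.contains(segmentOf(key))) {
                    out.add(key);
                }
            }
            return out;
        }
    }

    // Analogous to calling localPublisher(int) once per segment.
    static List<Integer> perSegment(Store store, Set<Integer> segments) {
        List<Integer> out = new ArrayList<>();
        for (int segment : segments) {
            out.addAll(store.publish(Collections.singleton(segment)));
        }
        return out;
    }

    // Analogous to calling localPublisher(IntSet) with all segments at once.
    static List<Integer> allAtOnce(Store store, Set<Integer> segments) {
        return store.publish(segments);
    }
}
```

Both methods return the same keys, but perSegment scans the store once per requested segment while allAtOnce scans it a single time, which is the "once instead of IntSet.size times" saving described above.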
[JBoss JIRA] (ISPN-10283) LocalPublisherManagerImpl performs blocking Cache operations
by Will Burns (Jira)
Will Burns created ISPN-10283:
---------------------------------
Summary: LocalPublisherManagerImpl performs blocking Cache operations
Key: ISPN-10283
URL: https://issues.jboss.org/browse/ISPN-10283
Project: Infinispan
Issue Type: Sub-task
Reporter: Will Burns
Assignee: Will Burns
The LocalPublisherManagerImpl invokes some methods that can block, such as a get that can go remote. We need to convert these to use the non-blocking API instead and change the appropriate Flowable to compensate.
Examples are invocations of Cache#containsKey, Cache#get, and Cache#getCacheEntry.
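As a rough illustration of replacing a blocking lookup with a non-blocking one, the sketch below composes asynchronous reads with CompletableFuture instead of calling a blocking get per key. FakeCache and valuesFor are hypothetical names for this example only; this is not the LocalPublisherManagerImpl code, and the real change would be expressed in terms of Flowable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;

class NonBlockingLookup {
    // Hypothetical cache facade offering a blocking and a non-blocking read.
    static class FakeCache {
        final Map<String, String> data = new ConcurrentHashMap<>();

        // Blocking style: the caller's thread waits for the value.
        String get(String key) {
            return data.get(key);
        }

        // Non-blocking style: the lookup runs elsewhere, the caller gets a future.
        CompletableFuture<String> getAsync(String key) {
            return CompletableFuture.supplyAsync(() -> data.get(key));
        }
    }

    // Starts all lookups without waiting, then assembles the results in key
    // order once every lookup has completed.
    static CompletionStage<List<String>> valuesFor(FakeCache cache, List<String> keys) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String key : keys) {
            pending.add(cache.getAsync(key));
        }
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    List<String> values = new ArrayList<>();
                    for (CompletableFuture<String> f : pending) {
                        values.add(f.join()); // does not block: already complete
                    }
                    return values;
                });
    }
}
```

The calling thread never waits on an individual lookup; it only receives a stage that completes when all values are available.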
[JBoss JIRA] (ISPN-9722) Perform all CacheStore operations on a separate thread
by Ryan Emerson (Jira)
[ https://issues.jboss.org/browse/ISPN-9722?page=com.atlassian.jira.plugin.... ]
Ryan Emerson resolved ISPN-9722.
--------------------------------
Fix Version/s: (was: 10.0.0.Final)
Resolution: Done
> Perform all CacheStore operations on a separate thread
> ------------------------------------------------------
>
> Key: ISPN-9722
> URL: https://issues.jboss.org/browse/ISPN-9722
> Project: Infinispan
> Issue Type: Enhancement
> Components: Loaders and Stores
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 10.0.0.Beta4
>
>
> Persistence is one of the few remaining systems that is still blocking. This needs to be remedied. We will eventually need to add an SPI that does this, but for now we need to offload the persistence operations to a different thread pool.
> This should only require changing the PersistenceManager methods to be non-blocking (i.e. to return CompletionStage). We should then update callers to use the non-blocking methods where possible.
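The offloading approach described above might look roughly like the following sketch. BlockingStore and OffloadedPersistence are hypothetical names invented here; this is not the PersistenceManager API, only an illustration of wrapping blocking calls so that they run on a dedicated executor and return CompletionStage.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

class OffloadedPersistence {
    // Hypothetical blocking store; imagine each call performing disk I/O.
    static class BlockingStore {
        final Map<String, String> data = new ConcurrentHashMap<>();

        String load(String key) {
            return data.get(key);
        }

        void write(String key, String value) {
            data.put(key, value);
        }
    }

    private final BlockingStore store;
    private final Executor persistenceExecutor;

    OffloadedPersistence(BlockingStore store, Executor persistenceExecutor) {
        this.store = store;
        this.persistenceExecutor = persistenceExecutor;
    }

    // The blocking load runs on the persistence pool, not on the caller's thread.
    CompletionStage<String> loadAsync(String key) {
        return CompletableFuture.supplyAsync(() -> store.load(key), persistenceExecutor);
    }

    // Same for writes; the stage completes once the write has been applied.
    CompletionStage<Void> writeAsync(String key, String value) {
        return CompletableFuture.runAsync(() -> store.write(key, value), persistenceExecutor);
    }
}
```

Callers can then chain work with thenCompose or thenApply rather than blocking, for example writeAsync("k", "v").thenCompose(ignored -> loadAsync("k")).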
[JBoss JIRA] (ISPN-9813) Convert DistributedStreams to use Distributed Publisher for single response
by Ryan Emerson (Jira)
[ https://issues.jboss.org/browse/ISPN-9813?page=com.atlassian.jira.plugin.... ]
Ryan Emerson resolved ISPN-9813.
--------------------------------
Fix Version/s: 10.0.0.Beta4
(was: 10.0.0.Final)
Resolution: Done
> Convert DistributedStreams to use Distributed Publisher for single response
> ---------------------------------------------------------------------------
>
> Key: ISPN-9813
> URL: https://issues.jboss.org/browse/ISPN-9813
> Project: Infinispan
> Issue Type: Sub-task
> Components: Publisher, Streams
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 10.0.0.Beta4
>
>
> After ISPN-9811 is complete, we should be able to easily convert the existing DistributedCacheStream and friends to use the Publisher instead. This JIRA covers the terminal operations that return a single result (e.g. Stream#count, Stream#collect, Stream#toArray, Stream#min, Stream#findAny).