From rory.odonnell at oracle.com Fri Jul 3 10:24:57 2015 From: rory.odonnell at oracle.com (Rory O'Donnell) Date: Fri, 03 Jul 2015 15:24:57 +0100 Subject: [infinispan-dev] Early Access builds for JDK 8u60 b21 and JDK 9 b70 are available on java.net Message-ID: <55969B39.60803@oracle.com> Hi Galder, Early Access build for JDK 8u60 b21 is available on java.net, summary of changes are listed here. As we enter the later phases of development for JDK 8u60, please log any show stoppers as soon as possible. Early Access build for JDK 9 b70 is available on java.net, summary of changes are listed here . The JDK 9 schedule of record is available on the JDK 9 Project page: http://openjdk.java.net/projects/jdk9 At https://wiki.openjdk.java.net/display/Adoption/JDK+9+Outreach you can find a (preliminary) list of other changes that might affect your project's code in JDK 9, and other things to consider when testing with JDK 9. I'd be curious to know if there is anything on that list you'd consider to have an effect on your project. Please keep in mind that as JEPs and others changes are integrated into (or out of) JDK 9, the list will change over time. Rgds,Rory -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150703/aa21bc5d/attachment.html From dan.berindei at gmail.com Mon Jul 6 14:50:21 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 6 Jul 2015 21:50:21 +0300 Subject: [infinispan-dev] Weekly IRC Meeting 2015-07-06 Message-ID: Hi all Here is the transcript of today's meeting: http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-07-06-14.02.log.html Cheers Dan From dan.berindei at gmail.com Tue Jul 7 03:17:02 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 7 Jul 2015 10:17:02 +0300 Subject: [infinispan-dev] Infinispan 8.0.0.Beta1 released Message-ID: Dear community Infinispan 8.0.0.Beta1 is now available! All the details are in the blog post: http://blog.infinispan.org/2015/07/infinispan-800beta1.html Cheers Dan From dan.berindei at gmail.com Thu Jul 9 05:10:59 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Thu, 9 Jul 2015 12:10:59 +0300 Subject: [infinispan-dev] Distributed Streams In-Reply-To: References: Message-ID: Hi Will After the discussion we started in Galder's PR's comments [1], I started thinking that we should really have a stream() method directly in the Cache/AdvancedCache interface. I feel entrySet(), keySet() or values() encourage users to use external iteration with a for-each loop or iterator(), and we should encourage users to use the Stream methods instead. I also really don't like the idea of users having to close their iterators explicitly, and lazy Iterators (or Spliterators) need to be properly closed or they will leak resources. My suggestion, then, is to make entrySet().iterator() and entrySet().spliterator() eager, so that they don't need to implement AutoCloseable. I would even go as far as to say that entrySet() should be eager itself, but maybe keySet().stream() would make a better API than adding a new keysStream() method. Now to your questions: 1) forEach() doesn't allow the consumer to modify the entries, so I think the most common use case would be doing something with a captured variable (or System.out). 
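For illustration, the kind of thing I have in mind (just a sketch, with a made-up Person value type and imports elided):

   Cache<String, Person> cache = ...;
   AtomicLong adults = new AtomicLong();   // captured variable, lives on the calling node
   cache.entrySet().stream()
        .filter(e -> e.getValue().getAge() >= 18)
        .forEach(e -> adults.incrementAndGet());
   System.out.println("adults: " + adults.get());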
So I would make forEach execute the consumer on the originator, and maybe add a distributedForEach method that executes its consumer on each owner (accepting that the consumer may be executed twice for some keys, or never, if the originator crashes). distributedForEach probably makes more sense once you start injecting the Cache (or other components) in the remote Consumers.

peek()'s intended use case is probably logging progress, so it will definitely need to interact with an external component. However, executing it on the originator would potentially change the execution of the stream dramatically, and adding logging shouldn't have that kind of impact. So peek() should be executed on the remote nodes, even if we don't have remote injection yet.

2) I would say implement sorting on the originator from the beginning, and limit() and skip() as well. It's true that users may be disappointed to see adding limit() doesn't improve the performance of their sorted() execution, but I would rather have a complete API available for applications that don't need to sort the entire cache.

Cheers
Dan

[1] https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22

On Wed, May 27, 2015 at 9:52 PM, William Burns wrote: > Hello everyone, > > I wanted to let you know I wrote up a design documenting the successor to > EntryRetriever, Distributed Streams [1] ! > > Any comments or feedback would be much appreciated. > > I especially would like targeted feedback regarding: > > 1. The operators forEach and peek may want to be ran locally. Should we > have an overridden method so users can pick which they want? Personally I > feel that peek is too specific to matter and forEach can always be done by > the caller locally if desired. > 2. The intermediate operators limit and skip do not seem worth implementing > unless we have sorting support. (Sorting support will come later). I am > thinking to not support these until sorting is added. > > Thanks, > > - Will > > [1] https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

From mudokonman at gmail.com Thu Jul 9 08:49:28 2015 From: mudokonman at gmail.com (William Burns) Date: Thu, 09 Jul 2015 12:49:28 +0000 Subject: [infinispan-dev] Distributed Streams In-Reply-To: References: Message-ID:

On Thu, Jul 9, 2015 at 5:11 AM Dan Berindei wrote: > Hi Will > > After the discussion we started in Galder's PR's comments [1], I > started thinking that we should really have a stream() method directly > in the Cache/AdvancedCache interface. > > I feel entrySet(), keySet() or values() encourage users to use > external iteration with a for-each loop or iterator(), and we should > encourage users to use the Stream methods instead. I also really don't > like the idea of users having to close their iterators explicitly, and > lazy Iterators (or Spliterators) need to be properly closed or they > will leak resources. >

The iterator and spliterator are automatically closed if they are fully iterated upon.

I don't think pulling in all entries from the cluster (and loader) for entrySet, keySet or values is a good idea. Unless you are suggesting that we pull local entries only? In which case we have reverted these changes back to ISPN 6 and older.

The entrySet, keySet and values as of ISPN 7.0 are actually completely backing collections and methods are evaluated per invocation.
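For example, roughly (a contrived sketch with String keys and values):

   Cache<String, String> cache = ...;
   Set<String> keys = cache.keySet();      // a backing view, not a snapshot
   cache.put("k", "v");
   boolean present = keys.contains("k");   // true - the put is visible through the view
   keys.remove("k");                       // removing through the view...
   boolean stillThere = cache.containsKey("k");   // ...false, the entry is gone from the cache too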
This means any updates to them or to the cache they were created from are seen by each other.

> > My suggestion, then, is to make entrySet().iterator() and > entrySet().spliterator() eager, so that they don't need to implement > AutoCloseable. I would even go as far as to say that entrySet() should > be eager itself, but maybe keySet().stream() would make a better API > than adding a new keysStream() method. >

Just so I understand, you are more saying that we leave the entrySet, keySet and values the way they are so they are backing collections, however invocation of the iterator or spliterator would pull in all entries from the entire cache into memory at once? It seems throwing UnsupportedOperationException with a message stating to use stream().iterator() and closing the stream would be better imo (however that would preclude the usage of foreach). Note the foreach loop is only an issue when iterating over that collection and you break out of the loop early.

try (Stream<Map.Entry<K, V>> stream = entrySet.stream()) {
   Iterator<Map.Entry<K, V>> iterator = stream.iterator();
}

Actually I think the issue here is that our CacheCollections don't currently implement CloseableIterable like the EntryIterable does. In that case you can do a simple foreach loop with a break in a try-with-resources. We could then document that close() closes any iterators or spliterators that were created from this instance of the collection.

It is a little awkward, but could work this way.

try (CacheSet<Map.Entry<K, V>> closeableEntrySet = entrySet) {
   for (Map.Entry<K, V> entry : closeableEntrySet) {
   }
}

> > Now to your questions: > > 1) > forEach() doesn't allow the consumer to modify the entries, so I think > the most common use case would be doing something with a captured > variable (or System.out).

This is actually something I didn't cover in the document. But upon thinking about this more I was thinking we probably want to allow for CDI injection of the cache for the consumer action before firing. In this way the user can change values as they want. This would behave almost identically to map/reduce, however it is much more understandable imo.

> So I would make forEach execute the consumer > on the originator, and maybe add a distributedForEach method that > executes its consumer on each owner (accepting that the consumer may > be executed twice for some keys, or never, if the originator crashes). > distributedForEach probably makes more sense once you start injecting > the Cache (or other components) in the remote Consumers. >

This was my conundrum before, however I believe I found a happy medium. I figured implementing it distributed gives more flexibility. The user can still choose to run it locally as they desire.

For example you can call *.stream().iterator().forEachRemaining(consumer) if you wanted to do a forEach locally in a single thread. And if you wanted it parallelized you can do StreamSupport.stream(*.stream().spliterator(), true).forEach(consumer)

This would all be documented on the forEach method.

> > peek()'s intended use case is probably logging progress, so it will > definitely need to interact with an external component. However, > executing it to the originator would potentially change the execution > of the stream dramatically, and adding logging shouldn't have that > kind of impact. So peek() should be executed on the remote nodes, even > if we don't have remote injection yet. >

This is how I ended up doing it: peek() is executed remotely.
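So the typical progress-logging pattern would look something like this (rough sketch, assuming String values and some log handle available on each node; the lambdas would of course have to be marshallable):

   long matches = cache.entrySet().stream()
         .filter(e -> e.getValue().contains("foo"))
         .peek(e -> log.tracef("matched key %s on this node", e.getKey()))  // runs where the entry lives
         .count();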
> > 2) > I would say implement sorting on the originator from the beginning, > and limit() and skip() as well. It's true that users may me > disappointed to see adding limit() doesn't improve the performance of > their sorted() execution, but I would rather have a complete API > available for applications who don't need to sort the entire cache. > This is how I did this as well :) Basically if we find that there is a sorted, distributed, limit or skip it performs all of the intermediate operations up that point then uses an iterator to bring the results back locally where it can be performed. Limit and distinct are also actually performed remotely first to reduce how many results are returned. I am not 100% sold on performing distinct remotely first as it could actually be significantly slower, but it should hopefully reduce some memory usage :P > > Cheers > Dan > > > [1] > https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 > > On Wed, May 27, 2015 at 9:52 PM, William Burns > wrote: > > Hello everyone, > > > > I wanted to let you know I wrote up a design documenting the successor to > > EntryRetriever, Distributed Streams [1] ! > > > > Any comments or feedback would be much appreciated. > > > > I especially would like targeted feedback regarding: > > > > 1. The operators forEach and peek may want to be ran locally. Should we > > have an overridden method so users can pick which they want? Personally > I > > feel that peek is too specific to matter and forEach can always be done > by > > the caller locally if desired. > > 2. The intermediate operators limit and skip do not seem worth > implementing > > unless we have sorting support. (Sorting support will come later). I am > > thinking to not support these until sorting is added. > > > > Thanks, > > > > - Will > > > > [1] > https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150709/374155c2/attachment-0001.html From rvansa at redhat.com Thu Jul 9 11:33:31 2015 From: rvansa at redhat.com (Radim Vansa) Date: Thu, 09 Jul 2015 17:33:31 +0200 Subject: [infinispan-dev] Distributed Streams In-Reply-To: References: Message-ID: <559E944B.60802@redhat.com> On 07/09/2015 02:49 PM, William Burns wrote: > > > On Thu, Jul 9, 2015 at 5:11 AM Dan Berindei > wrote: > > Hi Will > > After the discussion we started in Galder's PR's comments [1], I > started thinking that we should really have a stream() method directly > in the Cache/AdvancedCache interface. > > I feel entrySet(), keySet() or values() encourage users to use > external iteration with a for-each loop or iterator(), and we should > encourage users to use the Stream methods instead. I also really don't > like the idea of users having to close their iterators explicitly, and > lazy Iterators (or Spliterators) need to be properly closed or they > will leak resources. > > > The iterator and spliterator are automatically closed if it was fully > iterated upon. 
> > I don't think pulling all entries from the cluster (and loader) in for > entrySet, keySet or values is a good idea. Unless you are suggesting > that we only pull local entries only? In which case we have reverted > these changes back to ISPN 6 and older. > > The entrySet, keySet and values as of ISPN 7.0 are actually completely > backing collections and methods are evaluated per invocation. This > means any updates to them or the cache it was created from are seen by > each other. > > > My suggestion, then, is to make entrySet().iterator() and > entrySet().spliterator() eager, so that they don't need to implement > AutoCloseable. I would even go as far as to say that entrySet() should > be eager itself, but maybe keySet().stream() would make a better API > than adding a new keysStream() method. > > > Just so I understand you are more saying that we leave the entrySet, > keySet and values the way they are so they are backing collections, > however invocation of the iterator or spliterator would pull in all > entries from the entire cache into memory at once? It seems throwing > UnsupportedOperationException with a message stating to use > stream().iterator() and closing the stream would be better imo > (however that would preclude the usage of foreach). Note the foreach > loop is only an issue when iterating over that collection and you > break out of the loop early. > > try (Stream stream = entrySet.stream()) { > Iterator> iterator = stream.iterator(); > } > > Actually I think the issue here is that our CacheCollections don't > currently implement CloseableIterable like the EntryIterable does. In > that case you can do a simple foreach loop with a break in a try with > resource. We could then document that close closes any iterators or > spliterators that were created from this instance of the collection. > > It is a little awkward, but could work this way. > > try (CacheSet> closeableEntrySet = entrySet) { > for (Map.Entry entry : closeableEntrySet) { > } > } What resources is the iterator actually holding? If that is just memory, could we do some autoclose magic with phantom references? > > > Now to your questions: > > 1) > forEach() doesn't allow the consumer to modify the entries, so I think > the most common use case would be doing something with a captured > variable (or System.out). > > > This is actually something I didn't cover in the document. But upon > thinking about this more I was thinking we probably want to allow for > CDI Injection of the cache for the consumer action before firing. In > this way the user can change values as they want. This would behave > almost identically to map/reduce, however it is much more > understandable imo. > > So I would make forEach execute the consumer > on the originator, and maybe add a distributedForEach method that > executes its consumer on each owner (accepting that the consumer may > be executed twice for some keys, or never, if the originator crashes). > distributedForEach probably makes more sense once you start injecting > the Cache (or other components) in the remote Consumers. > > > This was my conundrum before, however I believe I found a happy > medium. I figured if we implement it distributed gives more > flexibility. The user can still choose to run it locally as they desire. > > For example you can call > *.stream().iterator().forEachRemaining(consumer) if you wanted to do a > forEach locally in a single thread. 
And if you wanted it parallelized > you can do > StreamSupport.stream(*.stream().spliterator(), true).forEach(consumer) *.stream().distributedForEach(serializable consumer) looks much more obvious than ^. Despite that it would be documented. > > This would all be documented on the forEach method. > > > peek()'s intended use case is probably logging progress, so it will > definitely need to interact with an external component. However, > executing it to the originator would potentially change the execution > of the stream dramatically, and adding logging shouldn't have that > kind of impact. So peek() should be executed on the remote nodes, even > if we don't have remote injection yet. > > > This is how I ended up doing it was to have it done remotely. > > > 2) > I would say implement sorting on the originator from the beginning, > and limit() and skip() as well. It's true that users may me > disappointed to see adding limit() doesn't improve the performance of > their sorted() execution, but I would rather have a complete API > available for applications who don't need to sort the entire cache. > > > This is how I did this as well :) Basically if we find that there is > a sorted, distributed, limit or skip it performs all of the > intermediate operations up that point then uses an iterator to bring > the results back locally where it can be performed. Limit and > distinct are also actually performed remotely first to reduce how many > results are returned. I am not 100% sold on performing distinct > remotely first as it could actually be significantly slower, but it > should hopefully reduce some memory usage :P Probably not in the first implementation, but sort should be performed on each node remotely, and then the local node should do just n-way merge. That way, you can apply skip & limit on remote nodes and then again on the reduced merged set, dramatically reducing bandwith. Not sure about NoSQL use cases, but in classical DBs the top-N query is quite frequent operation, afaik. My 2c Radim > > Cheers > Dan > > > [1] > https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 > > On Wed, May 27, 2015 at 9:52 PM, William Burns > > wrote: > > Hello everyone, > > > > I wanted to let you know I wrote up a design documenting the > successor to > > EntryRetriever, Distributed Streams [1] ! > > > > Any comments or feedback would be much appreciated. > > > > I especially would like targeted feedback regarding: > > > > 1. The operators forEach and peek may want to be ran locally. > Should we > > have an overridden method so users can pick which they want? > Personally I > > feel that peek is too specific to matter and forEach can always > be done by > > the caller locally if desired. > > 2. The intermediate operators limit and skip do not seem worth > implementing > > unless we have sorting support. (Sorting support will come > later). I am > > thinking to not support these until sorting is added. 
> > > > Thanks, > > > > - Will > > > > [1] > https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From rvansa at redhat.com Thu Jul 9 11:40:37 2015 From: rvansa at redhat.com (Radim Vansa) Date: Thu, 09 Jul 2015 17:40:37 +0200 Subject: [infinispan-dev] Distributed Streams In-Reply-To: <559E944B.60802@redhat.com> References: <559E944B.60802@redhat.com> Message-ID: <559E95F5.60902@redhat.com> On 07/09/2015 05:33 PM, Radim Vansa wrote: > On 07/09/2015 02:49 PM, William Burns wrote: >> >> On Thu, Jul 9, 2015 at 5:11 AM Dan Berindei > > wrote: >> >> Hi Will >> >> After the discussion we started in Galder's PR's comments [1], I >> started thinking that we should really have a stream() method directly >> in the Cache/AdvancedCache interface. >> >> I feel entrySet(), keySet() or values() encourage users to use >> external iteration with a for-each loop or iterator(), and we should >> encourage users to use the Stream methods instead. I also really don't >> like the idea of users having to close their iterators explicitly, and >> lazy Iterators (or Spliterators) need to be properly closed or they >> will leak resources. >> >> >> The iterator and spliterator are automatically closed if it was fully >> iterated upon. >> >> I don't think pulling all entries from the cluster (and loader) in for >> entrySet, keySet or values is a good idea. Unless you are suggesting >> that we only pull local entries only? In which case we have reverted >> these changes back to ISPN 6 and older. >> >> The entrySet, keySet and values as of ISPN 7.0 are actually completely >> backing collections and methods are evaluated per invocation. This >> means any updates to them or the cache it was created from are seen by >> each other. >> >> >> My suggestion, then, is to make entrySet().iterator() and >> entrySet().spliterator() eager, so that they don't need to implement >> AutoCloseable. I would even go as far as to say that entrySet() should >> be eager itself, but maybe keySet().stream() would make a better API >> than adding a new keysStream() method. >> >> >> Just so I understand you are more saying that we leave the entrySet, >> keySet and values the way they are so they are backing collections, >> however invocation of the iterator or spliterator would pull in all >> entries from the entire cache into memory at once? It seems throwing >> UnsupportedOperationException with a message stating to use >> stream().iterator() and closing the stream would be better imo >> (however that would preclude the usage of foreach). Note the foreach >> loop is only an issue when iterating over that collection and you >> break out of the loop early. >> >> try (Stream stream = entrySet.stream()) { >> Iterator> iterator = stream.iterator(); >> } >> >> Actually I think the issue here is that our CacheCollections don't >> currently implement CloseableIterable like the EntryIterable does. 
In >> that case you can do a simple foreach loop with a break in a try with >> resource. We could then document that close closes any iterators or >> spliterators that were created from this instance of the collection. >> >> It is a little awkward, but could work this way. >> >> try (CacheSet> closeableEntrySet = entrySet) { >> for (Map.Entry entry : closeableEntrySet) { >> } >> } > What resources is the iterator actually holding? If that is just memory, > could we do some autoclose magic with phantom references? > >> >> Now to your questions: >> >> 1) >> forEach() doesn't allow the consumer to modify the entries, so I think >> the most common use case would be doing something with a captured >> variable (or System.out). >> >> >> This is actually something I didn't cover in the document. But upon >> thinking about this more I was thinking we probably want to allow for >> CDI Injection of the cache for the consumer action before firing. In >> this way the user can change values as they want. This would behave >> almost identically to map/reduce, however it is much more >> understandable imo. >> >> So I would make forEach execute the consumer >> on the originator, and maybe add a distributedForEach method that >> executes its consumer on each owner (accepting that the consumer may >> be executed twice for some keys, or never, if the originator crashes). >> distributedForEach probably makes more sense once you start injecting >> the Cache (or other components) in the remote Consumers. >> >> >> This was my conundrum before, however I believe I found a happy >> medium. I figured if we implement it distributed gives more >> flexibility. The user can still choose to run it locally as they desire. >> >> For example you can call >> *.stream().iterator().forEachRemaining(consumer) if you wanted to do a >> forEach locally in a single thread. And if you wanted it parallelized >> you can do >> StreamSupport.stream(*.stream().spliterator(), true).forEach(consumer) > *.stream().distributedForEach(serializable consumer) looks much more > obvious than ^. Despite that it would be documented. > > >> This would all be documented on the forEach method. >> >> >> peek()'s intended use case is probably logging progress, so it will >> definitely need to interact with an external component. However, >> executing it to the originator would potentially change the execution >> of the stream dramatically, and adding logging shouldn't have that >> kind of impact. So peek() should be executed on the remote nodes, even >> if we don't have remote injection yet. >> >> >> This is how I ended up doing it was to have it done remotely. >> >> >> 2) >> I would say implement sorting on the originator from the beginning, >> and limit() and skip() as well. It's true that users may me >> disappointed to see adding limit() doesn't improve the performance of >> their sorted() execution, but I would rather have a complete API >> available for applications who don't need to sort the entire cache. >> >> >> This is how I did this as well :) Basically if we find that there is >> a sorted, distributed, limit or skip it performs all of the >> intermediate operations up that point then uses an iterator to bring >> the results back locally where it can be performed. Limit and >> distinct are also actually performed remotely first to reduce how many >> results are returned. 
I am not 100% sold on performing distinct >> remotely first as it could actually be significantly slower, but it >> should hopefully reduce some memory usage :P > Probably not in the first implementation, but sort should be performed > on each node remotely, and then the local node should do just n-way > merge. That way, you can apply skip & limit on remote nodes and then > again on the reduced merged set, dramatically reducing bandwith. Not > sure about NoSQL use cases, but in classical DBs the top-N query is > quite frequent operation, afaik. Err, only limit can be applied remotely... For remote application of skip, we would need multi-phase arbitration of the n-th element using some pivot points... just limit, just limit :) Radim > > My 2c > > Radim > >> Cheers >> Dan >> >> >> [1] >> https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 >> >> On Wed, May 27, 2015 at 9:52 PM, William Burns >> > wrote: >> > Hello everyone, >> > >> > I wanted to let you know I wrote up a design documenting the >> successor to >> > EntryRetriever, Distributed Streams [1] ! >> > >> > Any comments or feedback would be much appreciated. >> > >> > I especially would like targeted feedback regarding: >> > >> > 1. The operators forEach and peek may want to be ran locally. >> Should we >> > have an overridden method so users can pick which they want? >> Personally I >> > feel that peek is too specific to matter and forEach can always >> be done by >> > the caller locally if desired. >> > 2. The intermediate operators limit and skip do not seem worth >> implementing >> > unless we have sorting support. (Sorting support will come >> later). I am >> > thinking to not support these until sorting is added. >> > >> > Thanks, >> > >> > - Will >> > >> > [1] >> https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Radim Vansa JBoss Performance Team From mudokonman at gmail.com Thu Jul 9 12:09:02 2015 From: mudokonman at gmail.com (William Burns) Date: Thu, 09 Jul 2015 16:09:02 +0000 Subject: [infinispan-dev] Distributed Streams In-Reply-To: <559E944B.60802@redhat.com> References: <559E944B.60802@redhat.com> Message-ID: On Thu, Jul 9, 2015 at 11:33 AM Radim Vansa wrote: > On 07/09/2015 02:49 PM, William Burns wrote: > > > > > > On Thu, Jul 9, 2015 at 5:11 AM Dan Berindei > > wrote: > > > > Hi Will > > > > After the discussion we started in Galder's PR's comments [1], I > > started thinking that we should really have a stream() method > directly > > in the Cache/AdvancedCache interface. > > > > I feel entrySet(), keySet() or values() encourage users to use > > external iteration with a for-each loop or iterator(), and we should > > encourage users to use the Stream methods instead. I also really > don't > > like the idea of users having to close their iterators explicitly, > and > > lazy Iterators (or Spliterators) need to be properly closed or they > > will leak resources. 
> > > > > > The iterator and spliterator are automatically closed if it was fully > > iterated upon. > > > > I don't think pulling all entries from the cluster (and loader) in for > > entrySet, keySet or values is a good idea. Unless you are suggesting > > that we only pull local entries only? In which case we have reverted > > these changes back to ISPN 6 and older. > > > > The entrySet, keySet and values as of ISPN 7.0 are actually completely > > backing collections and methods are evaluated per invocation. This > > means any updates to them or the cache it was created from are seen by > > each other. > > > > > > My suggestion, then, is to make entrySet().iterator() and > > entrySet().spliterator() eager, so that they don't need to implement > > AutoCloseable. I would even go as far as to say that entrySet() > should > > be eager itself, but maybe keySet().stream() would make a better API > > than adding a new keysStream() method. > > > > > > Just so I understand you are more saying that we leave the entrySet, > > keySet and values the way they are so they are backing collections, > > however invocation of the iterator or spliterator would pull in all > > entries from the entire cache into memory at once? It seems throwing > > UnsupportedOperationException with a message stating to use > > stream().iterator() and closing the stream would be better imo > > (however that would preclude the usage of foreach). Note the foreach > > loop is only an issue when iterating over that collection and you > > break out of the loop early. > > > > try (Stream stream = entrySet.stream()) { > > Iterator> iterator = stream.iterator(); > > } > > > > Actually I think the issue here is that our CacheCollections don't > > currently implement CloseableIterable like the EntryIterable does. In > > that case you can do a simple foreach loop with a break in a try with > > resource. We could then document that close closes any iterators or > > spliterators that were created from this instance of the collection. > > > > It is a little awkward, but could work this way. > > > > try (CacheSet> closeableEntrySet = entrySet) { > > for (Map.Entry entry : closeableEntrySet) { > > } > > } > > What resources is the iterator actually holding? If that is just memory, > could we do some autoclose magic with phantom references? > The iterator will hold onto a thread on various machines while it is processing return values. There is a finalizer, but we all know we can't trust that to run anytime soon. I have thought of a different implementation that would not have this limitatiion, but there is definitely not time for 8.0 to do that. Also I am not confident on the performance of it, but who knows :) > > > > > > > Now to your questions: > > > > 1) > > forEach() doesn't allow the consumer to modify the entries, so I > think > > the most common use case would be doing something with a captured > > variable (or System.out). > > > > > > This is actually something I didn't cover in the document. But upon > > thinking about this more I was thinking we probably want to allow for > > CDI Injection of the cache for the consumer action before firing. In > > this way the user can change values as they want. This would behave > > almost identically to map/reduce, however it is much more > > understandable imo. 
> > > > So I would make forEach execute the consumer > > on the originator, and maybe add a distributedForEach method that > > executes its consumer on each owner (accepting that the consumer may > > be executed twice for some keys, or never, if the originator > crashes). > > distributedForEach probably makes more sense once you start injecting > > the Cache (or other components) in the remote Consumers. > > > > > > This was my conundrum before, however I believe I found a happy > > medium. I figured if we implement it distributed gives more > > flexibility. The user can still choose to run it locally as they desire. > > > > For example you can call > > *.stream().iterator().forEachRemaining(consumer) if you wanted to do a > > forEach locally in a single thread. And if you wanted it parallelized > > you can do > > StreamSupport.stream(*.stream().spliterator(), true).forEach(consumer) > > *.stream().distributedForEach(serializable consumer) looks much more > obvious than ^. Despite that it would be documented. > The method looks very similar to the single threaded example to me. I personally think we should be drawing people to using the distributed forEach by default, not the other way around at least. The local ones will be orders of magnitude slower due to having to pull the entire contents to the local node. One issue with adding a new method that is designed to be a terminal or intermediate operation is that Stream uses builder type pattern and returns Stream. Currently I don't mess with that as the new methods are only available on the original CacheStream, however due to how Stream is typed you can't override it by default to return CacheStream, which will add quite a bit more bloat to start adding custom methods (which we may want to do). > > > > > > This would all be documented on the forEach method. > > > > > > peek()'s intended use case is probably logging progress, so it will > > definitely need to interact with an external component. However, > > executing it to the originator would potentially change the execution > > of the stream dramatically, and adding logging shouldn't have that > > kind of impact. So peek() should be executed on the remote nodes, > even > > if we don't have remote injection yet. > > > > > > This is how I ended up doing it was to have it done remotely. > > > > > > 2) > > I would say implement sorting on the originator from the beginning, > > and limit() and skip() as well. It's true that users may me > > disappointed to see adding limit() doesn't improve the performance of > > their sorted() execution, but I would rather have a complete API > > available for applications who don't need to sort the entire cache. > > > > > > This is how I did this as well :) Basically if we find that there is > > a sorted, distributed, limit or skip it performs all of the > > intermediate operations up that point then uses an iterator to bring > > the results back locally where it can be performed. Limit and > > distinct are also actually performed remotely first to reduce how many > > results are returned. I am not 100% sold on performing distinct > > remotely first as it could actually be significantly slower, but it > > should hopefully reduce some memory usage :P > > Probably not in the first implementation, but sort should be performed > on each node remotely, and then the local node should do just n-way > merge. That way, you can apply skip & limit on remote nodes and then > That is already planned to add for sorting with https://issues.jboss.org/browse/ISPN-4358. 
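The local half of that is basically the classic k-way merge; roughly something like this (only a sketch of the idea, not the actual implementation, and it assumes each node has already sent back its chunk sorted):

   static <E> List<E> mergeSorted(List<List<E>> sortedPerNode, Comparator<E> cmp, int limit) {
      // heap entry = (next element, iterator it came from)
      PriorityQueue<Map.Entry<E, Iterator<E>>> heap =
            new PriorityQueue<>((a, b) -> cmp.compare(a.getKey(), b.getKey()));
      for (List<E> nodeResults : sortedPerNode) {
         Iterator<E> it = nodeResults.iterator();
         if (it.hasNext()) {
            heap.add(new AbstractMap.SimpleEntry<>(it.next(), it));
         }
      }
      List<E> merged = new ArrayList<>();
      while (!heap.isEmpty() && merged.size() < limit) {
         Map.Entry<E, Iterator<E>> smallest = heap.poll();
         merged.add(smallest.getKey());
         Iterator<E> it = smallest.getValue();
         if (it.hasNext()) {
            heap.add(new AbstractMap.SimpleEntry<>(it.next(), it));
         }
      }
      return merged;
   }

With limit() in the pipeline each node would only need to send back at most that many elements, which is where the bandwidth saving comes from.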
You already replied about skip, so I don't have to mention that one :) Limit is already applied on remote nodes, it just may bring in (limit * n) where n is the number of nodes and is locally limited to finally ensure proper limiting. It however doesn't support limit ran remotely after a sort remotely. I can only do this in the case of there being no map/flatMap operation between them otherwise I would lose the key we are sorting on. It could be a use case to cover though. > again on the reduced merged set, dramatically reducing bandwith. Not > sure about NoSQL use cases, but in classical DBs the top-N query is > quite frequent operation, afaik. > > My 2c > > Radim > > > > > Cheers > > Dan > > > > > > [1] > > > https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 > > > > On Wed, May 27, 2015 at 9:52 PM, William Burns > > > wrote: > > > Hello everyone, > > > > > > I wanted to let you know I wrote up a design documenting the > > successor to > > > EntryRetriever, Distributed Streams [1] ! > > > > > > Any comments or feedback would be much appreciated. > > > > > > I especially would like targeted feedback regarding: > > > > > > 1. The operators forEach and peek may want to be ran locally. > > Should we > > > have an overridden method so users can pick which they want? > > Personally I > > > feel that peek is too specific to matter and forEach can always > > be done by > > > the caller locally if desired. > > > 2. The intermediate operators limit and skip do not seem worth > > implementing > > > unless we have sorting support. (Sorting support will come > > later). I am > > > thinking to not support these until sorting is added. > > > > > > Thanks, > > > > > > - Will > > > > > > [1] > > > https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support > > > > > > _______________________________________________ > > > infinispan-dev mailing list > > > infinispan-dev at lists.jboss.org > > > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org infinispan-dev at lists.jboss.org> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150709/c8c5be99/attachment-0001.html From dan.berindei at gmail.com Thu Jul 9 12:27:56 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Thu, 9 Jul 2015 19:27:56 +0300 Subject: [infinispan-dev] Distributed Streams In-Reply-To: References: Message-ID: On Thu, Jul 9, 2015 at 3:49 PM, William Burns wrote: > > > On Thu, Jul 9, 2015 at 5:11 AM Dan Berindei wrote: >> >> Hi Will >> >> After the discussion we started in Galder's PR's comments [1], I >> started thinking that we should really have a stream() method directly >> in the Cache/AdvancedCache interface. 
>> >> I feel entrySet(), keySet() or values() encourage users to use >> external iteration with a for-each loop or iterator(), and we should >> encourage users to use the Stream methods instead. I also really don't >> like the idea of users having to close their iterators explicitly, and >> lazy Iterators (or Spliterators) need to be properly closed or they >> will leak resources. > >

> The iterator and spliterator are automatically closed if it was fully > iterated upon. > > I don't think pulling all entries from the cluster (and loader) in for > entrySet, keySet or values is a good idea. Unless you are suggesting that > we only pull local entries only? In which case we have reverted these > changes back to ISPN 6 and older. > > The entrySet, keySet and values as of ISPN 7.0 are actually completely > backing collections and methods are evaluated per invocation. This means > any updates to them or the cache it was created from are seen by each other. >

And the iterators are already AutoCloseable, I know. But with the streams API we can hide resource management from the user, so I was hoping we could avoid using AutoCloseable altogether.

>> >> My suggestion, then, is to make entrySet().iterator() and >> entrySet().spliterator() eager, so that they don't need to implement >> AutoCloseable. I would even go as far as to say that entrySet() should >> be eager itself, but maybe keySet().stream() would make a better API >> than adding a new keysStream() method. > >

> Just so I understand you are more saying that we leave the entrySet, keySet > and values the way they are so they are backing collections, however > invocation of the iterator or spliterator would pull in all entries from the > entire cache into memory at once? It seems throwing

Yes

> UnsupportedOperationException with a message stating to use > stream().iterator() and closing the stream would be better imo (however that > would preclude the usage of foreach). Note the foreach loop is only an > issue when iterating over that collection and you break out of the loop > early. > > try (Stream stream = entrySet.stream()) { > Iterator> iterator = stream.iterator(); > } > > Actually I think the issue here is that our CacheCollections don't currently > implement CloseableIterable like the EntryIterable does. In that case you > can do a simple foreach loop with a break in a try with resource. We could > then document that close closes any iterators or spliterators that were > created from this instance of the collection. > > It is a little awkward, but could work this way. > > try (CacheSet> closeableEntrySet = entrySet) { > for (Map.Entry entry : closeableEntrySet) { > } > }

On the other hand, you wouldn't need any try-with-resources if you used the forEach method, because the resources are both acquired and released in the forEach call:

entrySet.stream().forEach(entry -> { ... })

But if you make stream() or entrySet() return an AutoCloseable, then users will wrap it in a try-with-resources anyway, just in case. So I'd rather keep the iterator AutoCloseable than the stream/entry set.

Making the EntryIterable implement AutoCloseable made sense in Java 7, because the only thing you could do with it was iterate on it, maybe with for-each. But in Java 8 Iterable also has a forEach method, and I wouldn't want users of forEach() to think about whether they need a try block or not.
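i.e. the difference between these two (sketch only; process() is just a stand-in for whatever the application does, and the second variant assumes entrySet() returned a closeable CacheSet as discussed above):

   // no resource management in sight - the forEach call acquires and releases everything internally
   cache.entrySet().forEach(e -> process(e));

   // versus what an AutoCloseable entry set pushes people towards "just in case"
   try (CacheSet<Map.Entry<K, V>> entries = cache.entrySet()) {
      for (Map.Entry<K, V> e : entries) {
         process(e);
      }
   }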
> >> >> >> >> Now to your questions: >> >> 1) >> forEach() doesn't allow the consumer to modify the entries, so I think >> the most common use case would be doing something with a captured >> variable (or System.out). > > > This is actually something I didn't cover in the document. But upon > thinking about this more I was thinking we probably want to allow for CDI > Injection of the cache for the consumer action before firing. In this way > the user can change values as they want. This would behave almost > identically to map/reduce, however it is much more understandable imo. > You mean like map/reduce when it is configured to store its results in an intermediate cache? I think you'd also need a CacheStream.reduceOnOwner() operation to put in the pipeline before forEach(), but yeah, it sounds like it should work. >> >> So I would make forEach execute the consumer >> on the originator, and maybe add a distributedForEach method that >> executes its consumer on each owner (accepting that the consumer may >> be executed twice for some keys, or never, if the originator crashes). >> distributedForEach probably makes more sense once you start injecting >> the Cache (or other components) in the remote Consumers. > > > This was my conundrum before, however I believe I found a happy medium. I > figured if we implement it distributed gives more flexibility. The user can > still choose to run it locally as they desire. > > For example you can call *.stream().iterator().forEachRemaining(consumer) if > you wanted to do a forEach locally in a single thread. And if you wanted it > parallelized you can do > StreamSupport.stream(*.stream().spliterator(), true).forEach(consumer) Except now you have to access the spliterator directly... Will this work, or will the user need try-with-resources? Do we want users to think about it? StreamSupport.stream(*.stream().spliterator(), true).limit(10).forEach(consumer) > > This would all be documented on the forEach method. > >> >> >> peek()'s intended use case is probably logging progress, so it will >> definitely need to interact with an external component. However, >> executing it to the originator would potentially change the execution >> of the stream dramatically, and adding logging shouldn't have that >> kind of impact. So peek() should be executed on the remote nodes, even >> if we don't have remote injection yet. > > > This is how I ended up doing it was to have it done remotely. > >> >> >> 2) >> I would say implement sorting on the originator from the beginning, >> and limit() and skip() as well. It's true that users may me >> disappointed to see adding limit() doesn't improve the performance of >> their sorted() execution, but I would rather have a complete API >> available for applications who don't need to sort the entire cache. > > > This is how I did this as well :) Basically if we find that there is a > sorted, distributed, limit or skip it performs all of the intermediate > operations up that point then uses an iterator to bring the results back > locally where it can be performed. Limit and distinct are also actually > performed remotely first to reduce how many results are returned. I am not > 100% sold on performing distinct remotely first as it could actually be > significantly slower, but it should hopefully reduce some memory usage :P > Shouldn't running distinct remotely actually require *more* memory, because now you have to keep track of unique results on each node? 
There are definitely scenarios where running it remotely will save a lot in network traffic, though. >> >> >> Cheers >> Dan >> >> >> [1] >> https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 >> >> On Wed, May 27, 2015 at 9:52 PM, William Burns >> wrote: >> > Hello everyone, >> > >> > I wanted to let you know I wrote up a design documenting the successor >> > to >> > EntryRetriever, Distributed Streams [1] ! >> > >> > Any comments or feedback would be much appreciated. >> > >> > I especially would like targeted feedback regarding: >> > >> > 1. The operators forEach and peek may want to be ran locally. Should we >> > have an overridden method so users can pick which they want? Personally >> > I >> > feel that peek is too specific to matter and forEach can always be done >> > by >> > the caller locally if desired. >> > 2. The intermediate operators limit and skip do not seem worth >> > implementing >> > unless we have sorting support. (Sorting support will come later). I am >> > thinking to not support these until sorting is added. >> > >> > Thanks, >> > >> > - Will >> > >> > [1] >> > https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From mudokonman at gmail.com Thu Jul 9 12:53:19 2015 From: mudokonman at gmail.com (William Burns) Date: Thu, 09 Jul 2015 16:53:19 +0000 Subject: [infinispan-dev] Distributed Streams In-Reply-To: References: Message-ID: On Thu, Jul 9, 2015 at 12:28 PM Dan Berindei wrote: > On Thu, Jul 9, 2015 at 3:49 PM, William Burns > wrote: > > > > > > On Thu, Jul 9, 2015 at 5:11 AM Dan Berindei > wrote: > >> > >> Hi Will > >> > >> After the discussion we started in Galder's PR's comments [1], I > >> started thinking that we should really have a stream() method directly > >> in the Cache/AdvancedCache interface. > >> > >> I feel entrySet(), keySet() or values() encourage users to use > >> external iteration with a for-each loop or iterator(), and we should > >> encourage users to use the Stream methods instead. I also really don't > >> like the idea of users having to close their iterators explicitly, and > >> lazy Iterators (or Spliterators) need to be properly closed or they > >> will leak resources. > > > > > > The iterator and spliterator are automatically closed if it was fully > > iterated upon. > > > > I don't think pulling all entries from the cluster (and loader) in for > > entrySet, keySet or values is a good idea. Unless you are suggesting > that > > we only pull local entries only? In which case we have reverted these > > changes back to ISPN 6 and older. > > > > The entrySet, keySet and values as of ISPN 7.0 are actually completely > > backing collections and methods are evaluated per invocation. This means > > any updates to them or the cache it was created from are seen by each > other. > > > > And the iterators are already AutoCloseable, I know. 
But with the > streams API we can hide resource management from the user, so I was > hoping we could avoid using AutoCloseable altogether. > I would like that very much myself as well :) Currently the streams don't even return CloseableIterator/CloseableSpliterator because I rely on the user calling close on the Stream itself. > > >> > >> > >> My suggestion, then, is to make entrySet().iterator() and > >> entrySet().spliterator() eager, so that they don't need to implement > >> AutoCloseable. I would even go as far as to say that entrySet() should > >> be eager itself, but maybe keySet().stream() would make a better API > >> than adding a new keysStream() method. > > > > > > Just so I understand you are more saying that we leave the entrySet, > keySet > > and values the way they are so they are backing collections, however > > invocation of the iterator or spliterator would pull in all entries from > the > > entire cache into memory at once? It seems throwing > > Yes > I am not a big fan of having the entire cache contents in memory on one node. > > > UnsupportedOperationException with a message stating to use > > stream().iterator() and closing the stream would be better imo (however > that > > would preclude the usage of foreach). Note the foreach loop is only an > > issue when iterating over that collection and you break out of the loop > > early. > > > > try (Stream stream = entrySet.stream()) { > > Iterator> iterator = stream.iterator(); > > } > > > > Actually I think the issue here is that our CacheCollections don't > currently > > implement CloseableIterable like the EntryIterable does. In that case > you > > can do a simple foreach loop with a break in a try with resource. We > could > > then document that close closes any iterators or spliterators that were > > created from this instance of the collection. > > > > It is a little awkward, but could work this way. > > > > try (CacheSet> closeableEntrySet = entrySet) { > > for (Map.Entry entry : closeableEntrySet) { > > } > > } > > On the other hand, you wouldn't need any try-with-resources if you > used the forEach method, because the resources are both acquired and > released in the forEach call: > > entrySet.stream().forEach((k, v) -> { ... }) > Guessing you just meant entrySet.forEach ? > > But if you make stream() or entrySet() return an AutoCloseable, then > users will put a try-with-resources without it anyway, just in case. > So I'd rather keep the iterator AutoCloseable than the stream/entry > set. > > Making the EntryIterable implement AutoCloseable made sense in Java 7, > because the only thing you could do with it was iterate on it, maybe > with for-each. But in Java 8 Iterable also has a forEach method, and I > wouldn't want users of forEach() to think about whether they need a > try block or not. When in doubt close is my policy. There wouldn't be any noticable overhead by calling close. But I agree having it on the collections is not favorable :( > > > > >> > >> > >> > >> Now to your questions: > >> > >> 1) > >> forEach() doesn't allow the consumer to modify the entries, so I think > >> the most common use case would be doing something with a captured > >> variable (or System.out). > > > > > > This is actually something I didn't cover in the document. But upon > > thinking about this more I was thinking we probably want to allow for CDI > > Injection of the cache for the consumer action before firing. In this > way > > the user can change values as they want. 
This would behave almost > > identically to map/reduce, however it is much more understandable imo. > > > > You mean like map/reduce when it is configured to store its results in > an intermediate cache? I was referring to the injection part being the same. I am not sure what brings up the intermediate cache. > I think you'd also need a > CacheStream.reduceOnOwner() operation to put in the pipeline before > forEach(), but yeah, it sounds like it should work. > Not sure what this method would do. The forEach action would already be ran on the primary owner node, unless we had an intermediate operation that required it come local. > > >> > >> So I would make forEach execute the consumer > >> on the originator, and maybe add a distributedForEach method that > >> executes its consumer on each owner (accepting that the consumer may > >> be executed twice for some keys, or never, if the originator crashes). > >> distributedForEach probably makes more sense once you start injecting > >> the Cache (or other components) in the remote Consumers. > > > > > > This was my conundrum before, however I believe I found a happy medium. > I > > figured if we implement it distributed gives more flexibility. The user > can > > still choose to run it locally as they desire. > > > > For example you can call > *.stream().iterator().forEachRemaining(consumer) if > > you wanted to do a forEach locally in a single thread. And if you > wanted it > > parallelized you can do > > StreamSupport.stream(*.stream().spliterator(), true).forEach(consumer) > > Except now you have to access the spliterator directly... Will this > work, or will the user need try-with-resources? Do we want users to > think about it? > > StreamSupport.stream(*.stream().spliterator(), > true).limit(10).forEach(consumer) > The user can't call close on the spliterator from the stream, they would have to do it on the Stream. The AutoCloseable interfaces is only added to the iterator and spliterator on the collections returned via keySet, entrySet, and values. > > > > > This would all be documented on the forEach method. > > > >> > >> > >> peek()'s intended use case is probably logging progress, so it will > >> definitely need to interact with an external component. However, > >> executing it to the originator would potentially change the execution > >> of the stream dramatically, and adding logging shouldn't have that > >> kind of impact. So peek() should be executed on the remote nodes, even > >> if we don't have remote injection yet. > > > > > > This is how I ended up doing it was to have it done remotely. > > > >> > >> > >> 2) > >> I would say implement sorting on the originator from the beginning, > >> and limit() and skip() as well. It's true that users may me > >> disappointed to see adding limit() doesn't improve the performance of > >> their sorted() execution, but I would rather have a complete API > >> available for applications who don't need to sort the entire cache. > > > > > > This is how I did this as well :) Basically if we find that there is a > > sorted, distributed, limit or skip it performs all of the intermediate > > operations up that point then uses an iterator to bring the results back > > locally where it can be performed. Limit and distinct are also actually > > performed remotely first to reduce how many results are returned. 
I am > not > > 100% sold on performing distinct remotely first as it could actually be > > significantly slower, but it should hopefully reduce some memory usage :P > > > > Shouldn't running distinct remotely actually require *more* memory, > because now you have to keep track of unique results on each node? > There are definitely scenarios where running it remotely will save a > lot in network traffic, though. > It depends on what you mean by more. I am thinking more total memory on a given node. Running remote only reads data from each remote node from segments it primarily owns. If any elements are found to be the same it will reduce the overall total memory required on the originator node by the same amount of matches. If you do it only local you will have to have all data from all nodes including matches in memory on node. > > >> > >> > >> Cheers > >> Dan > >> > >> > >> [1] > >> > https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 > >> > >> On Wed, May 27, 2015 at 9:52 PM, William Burns > >> wrote: > >> > Hello everyone, > >> > > >> > I wanted to let you know I wrote up a design documenting the successor > >> > to > >> > EntryRetriever, Distributed Streams [1] ! > >> > > >> > Any comments or feedback would be much appreciated. > >> > > >> > I especially would like targeted feedback regarding: > >> > > >> > 1. The operators forEach and peek may want to be ran locally. Should > we > >> > have an overridden method so users can pick which they want? > Personally > >> > I > >> > feel that peek is too specific to matter and forEach can always be > done > >> > by > >> > the caller locally if desired. > >> > 2. The intermediate operators limit and skip do not seem worth > >> > implementing > >> > unless we have sorting support. (Sorting support will come later). I > am > >> > thinking to not support these until sorting is added. > >> > > >> > Thanks, > >> > > >> > - Will > >> > > >> > [1] > >> > > https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support > >> > > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150709/c43dd76d/attachment-0001.html From galder at redhat.com Thu Jul 9 19:10:06 2015 From: galder at redhat.com (Galder Zamarreno) Date: Thu, 9 Jul 2015 19:10:06 -0400 (EDT) Subject: [infinispan-dev] Distributed Streams In-Reply-To: References: Message-ID: <1034624896.16910951.1436483406980.JavaMail.zimbra@redhat.com> ----- Original Message ----- > Hi Will > > After the discussion we started in Galder's PR's comments [1], I > started thinking that we should really have a stream() method directly > in the Cache/AdvancedCache interface. ^ I don't think that's a good idea. A stream of what? keys only? values and keys? 
In the end you'd end up with 2/3 stream methods.... > I feel entrySet(), keySet() or values() encourage users to use > external iteration with a for-each loop or iterator(), and we should > encourage users to use the Stream methods instead. I also really don't > like the idea of users having to close their iterators explicitly, and > lazy Iterators (or Spliterators) need to be properly closed or they > will leak resources. ^ I don't think there's a need to do any extra education here. If they need/want streams, they can get them via entrySet/keySet/values, without the need for extra API. I don't have an opinion on the need, or not need, to close iterators. > > My suggestion, then, is to make entrySet().iterator() and > entrySet().spliterator() eager, so that they don't need to implement > AutoCloseable. I would even go as far as to say that entrySet() should > be eager itself, but maybe keySet().stream() would make a better API > than adding a new keysStream() method. > > > Now to your questions: > > 1) > forEach() doesn't allow the consumer to modify the entries, so I think > the most common use case would be doing something with a captured > variable (or System.out). So I would make forEach execute the consumer > on the originator, and maybe add a distributedForEach method that > executes its consumer on each owner (accepting that the consumer may > be executed twice for some keys, or never, if the originator crashes). > distributedForEach probably makes more sense once you start injecting > the Cache (or other components) in the remote Consumers. > > peek()'s intended use case is probably logging progress, so it will > definitely need to interact with an external component. However, > executing it to the originator would potentially change the execution > of the stream dramatically, and adding logging shouldn't have that > kind of impact. So peek() should be executed on the remote nodes, even > if we don't have remote injection yet. > > 2) > I would say implement sorting on the originator from the beginning, > and limit() and skip() as well. It's true that users may me > disappointed to see adding limit() doesn't improve the performance of > their sorted() execution, but I would rather have a complete API > available for applications who don't need to sort the entire cache. > > Cheers > Dan > > > [1] > https://github.com/infinispan/infinispan/pull/3571#discussion-diff-34033399R22 > > On Wed, May 27, 2015 at 9:52 PM, William Burns wrote: > > Hello everyone, > > > > I wanted to let you know I wrote up a design documenting the successor to > > EntryRetriever, Distributed Streams [1] ! > > > > Any comments or feedback would be much appreciated. > > > > I especially would like targeted feedback regarding: > > > > 1. The operators forEach and peek may want to be ran locally. Should we > > have an overridden method so users can pick which they want? Personally I > > feel that peek is too specific to matter and forEach can always be done by > > the caller locally if desired. > > 2. The intermediate operators limit and skip do not seem worth implementing > > unless we have sorting support. (Sorting support will come later). I am > > thinking to not support these until sorting is added. 
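To make the first question above concrete, these are the caller-side alternatives Will mentions earlier in the thread for running the consumer locally instead of on the owners; a sketch only, with the generic helper methods invented for the example:

import java.util.Map;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;
import org.infinispan.Cache;

public class LocalForEachSketch {

   // Pull the entries back to the originator and run the consumer there,
   // single-threaded.
   static <K, V> void forEachLocally(Cache<K, V> cache, Consumer<Map.Entry<K, V>> consumer) {
      cache.entrySet().stream().iterator().forEachRemaining(consumer);
   }

   // Same idea, parallelized on the originator by re-wrapping the spliterator.
   static <K, V> void forEachLocallyParallel(Cache<K, V> cache, Consumer<Map.Entry<K, V>> consumer) {
      StreamSupport.stream(cache.entrySet().stream().spliterator(), true)
            .forEach(consumer);
   }
}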
> > > > Thanks, > > > > - Will > > > > [1] > > https://github.com/infinispan/infinispan/wiki/Distributed-Stream-Support > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From ttarrant at redhat.com Mon Jul 13 06:40:28 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 13 Jul 2015 11:40:28 +0100 Subject: [infinispan-dev] My work log Message-ID: <55A3959C.8060300@redhat.com> Hi all guys, I've just returned from two weeks of PTO. Before leaving I left a PR [1] which updates the JGroups subsystem to the one found in WildFly 9 as well as a bunch of other changes. Unfortunately I broke authorization in there, so I'm cleaning up my PR and hopefully should have it ready by later today. After that it's the infamous template PR which will be based on the one below. I have a ton of e-mails and PR comments to read and review so I'll probably re-emerge some time later this week :) Tristan [1] https://github.com/infinispan/infinispan/pull/3562 -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Mon Jul 13 06:53:10 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 13 Jul 2015 11:53:10 +0100 Subject: [infinispan-dev] My work log In-Reply-To: <55A3959C.8060300@redhat.com> References: <55A3959C.8060300@redhat.com> Message-ID: Welcome back! but.. we're in WildFly 10 season now ;) On 13 July 2015 at 11:40, Tristan Tarrant wrote: > Hi all guys, > > I've just returned from two weeks of PTO. > Before leaving I left a PR [1] which updates the JGroups subsystem to > the one found in WildFly 9 as well as a bunch of other changes. > Unfortunately I broke authorization in there, so I'm cleaning up my PR > and hopefully should have it ready by later today. After that it's the > infamous template PR which will be based on the one below. > > I have a ton of e-mails and PR comments to read and review so I'll > probably re-emerge some time later this week :) > > Tristan > > [1] https://github.com/infinispan/infinispan/pull/3562 > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Jul 13 07:33:32 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 13 Jul 2015 12:33:32 +0100 Subject: [infinispan-dev] My work log In-Reply-To: References: <55A3959C.8060300@redhat.com> Message-ID: <55A3A20C.3030807@redhat.com> Infinispan 8 will come out long before WildFly 10. The intention is to rebase the server to WildFly core 2.0 IFF there is an effort from some subsystems to reduce their dependency overhead (transactions especially). See a thread in wildfly-dev (to which I'll soon be replying). Tristan On 13/07/2015 11:53, Sanne Grinovero wrote: > Welcome back! > but.. we're in WildFly 10 season now ;) > > On 13 July 2015 at 11:40, Tristan Tarrant wrote: >> Hi all guys, >> >> I've just returned from two weeks of PTO. >> Before leaving I left a PR [1] which updates the JGroups subsystem to >> the one found in WildFly 9 as well as a bunch of other changes. 
>> Unfortunately I broke authorization in there, so I'm cleaning up my PR >> and hopefully should have it ready by later today. After that it's the >> infamous template PR which will be based on the one below. >> >> I have a ton of e-mails and PR comments to read and review so I'll >> probably re-emerge some time later this week :) >> >> Tristan >> >> [1] https://github.com/infinispan/infinispan/pull/3562 >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Mon Jul 13 07:51:27 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 13 Jul 2015 12:51:27 +0100 Subject: [infinispan-dev] My work log In-Reply-To: <55A3A20C.3030807@redhat.com> References: <55A3959C.8060300@redhat.com> <55A3A20C.3030807@redhat.com> Message-ID: I think we actually are talking about two different things. Infinispan server will of course want to be based on a stable server, and possibly WildFly core 2. But we also want to make sure - better sooner than later - that the Infinispan 8 modules can be consumed in embedded mode by applications running on WildFly 10 "full". I guess it's ok for some integration tests can be moved to WildFly 10 already? For example, embedded query modules, and the Hibernate Search / Infinispan Directory integrations. On 13 July 2015 at 12:33, Tristan Tarrant wrote: > Infinispan 8 will come out long before WildFly 10. > The intention is to rebase the server to WildFly core 2.0 IFF there is > an effort from some subsystems to reduce their dependency overhead > (transactions especially). See a thread in wildfly-dev (to which I'll > soon be replying). > > Tristan > > On 13/07/2015 11:53, Sanne Grinovero wrote: >> Welcome back! >> but.. we're in WildFly 10 season now ;) >> >> On 13 July 2015 at 11:40, Tristan Tarrant wrote: >>> Hi all guys, >>> >>> I've just returned from two weeks of PTO. >>> Before leaving I left a PR [1] which updates the JGroups subsystem to >>> the one found in WildFly 9 as well as a bunch of other changes. >>> Unfortunately I broke authorization in there, so I'm cleaning up my PR >>> and hopefully should have it ready by later today. After that it's the >>> infamous template PR which will be based on the one below. 
>>> >>> I have a ton of e-mails and PR comments to read and review so I'll >>> probably re-emerge some time later this week :) >>> >>> Tristan >>> >>> [1] https://github.com/infinispan/infinispan/pull/3562 >>> >>> -- >>> Tristan Tarrant >>> Infinispan Lead >>> JBoss, a division of Red Hat >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Jul 13 08:05:09 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 13 Jul 2015 13:05:09 +0100 Subject: [infinispan-dev] My work log In-Reply-To: References: <55A3959C.8060300@redhat.com> <55A3A20C.3030807@redhat.com> Message-ID: <55A3A975.9070208@redhat.com> I don't see the ambiguity in my original e-mail. However, you are right about making sure or modules are compatible with the latest WF. Bear in mind that I intend to eventually include our "server" subsystem modules in the module pack. Tristan On 13/07/2015 12:51, Sanne Grinovero wrote: > I think we actually are talking about two different things. > Infinispan server will of course want to be based on a stable server, > and possibly WildFly core 2. > > But we also want to make sure - better sooner than later - that the > Infinispan 8 modules can be consumed in embedded mode by applications > running on WildFly 10 "full". > I guess it's ok for some integration tests can be moved to WildFly 10 > already? For example, embedded query modules, and the Hibernate Search > / Infinispan Directory integrations. > > On 13 July 2015 at 12:33, Tristan Tarrant wrote: >> Infinispan 8 will come out long before WildFly 10. >> The intention is to rebase the server to WildFly core 2.0 IFF there is >> an effort from some subsystems to reduce their dependency overhead >> (transactions especially). See a thread in wildfly-dev (to which I'll >> soon be replying). >> >> Tristan >> >> On 13/07/2015 11:53, Sanne Grinovero wrote: >>> Welcome back! >>> but.. we're in WildFly 10 season now ;) >>> >>> On 13 July 2015 at 11:40, Tristan Tarrant wrote: >>>> Hi all guys, >>>> >>>> I've just returned from two weeks of PTO. >>>> Before leaving I left a PR [1] which updates the JGroups subsystem to >>>> the one found in WildFly 9 as well as a bunch of other changes. >>>> Unfortunately I broke authorization in there, so I'm cleaning up my PR >>>> and hopefully should have it ready by later today. After that it's the >>>> infamous template PR which will be based on the one below. 
>>>> >>>> I have a ton of e-mails and PR comments to read and review so I'll >>>> probably re-emerge some time later this week :) >>>> >>>> Tristan >>>> >>>> [1] https://github.com/infinispan/infinispan/pull/3562 >>>> >>>> -- >>>> Tristan Tarrant >>>> Infinispan Lead >>>> JBoss, a division of Red Hat >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Mon Jul 13 08:53:03 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 13 Jul 2015 13:53:03 +0100 Subject: [infinispan-dev] My work log In-Reply-To: <55A3A975.9070208@redhat.com> References: <55A3959C.8060300@redhat.com> <55A3A20C.3030807@redhat.com> <55A3A975.9070208@redhat.com> Message-ID: On 13 Jul 2015 13:07, "Tristan Tarrant" wrote: > > I don't see the ambiguity in my original e-mail. My fault, I was just thinking about the integration tests I've been involved with. I think those should move to WF10 asap, of course that's unrelated to your server patches. > However, you are right about making sure or modules are compatible with > the latest WF. Bear in mind that I intend to eventually include our > "server" subsystem modules in the module pack. Shouldn't that be a separate add-on? (And you don't mean a "feature pack" right?) > > Tristan > > On 13/07/2015 12:51, Sanne Grinovero wrote: > > I think we actually are talking about two different things. > > Infinispan server will of course want to be based on a stable server, > > and possibly WildFly core 2. > > > > But we also want to make sure - better sooner than later - that the > > Infinispan 8 modules can be consumed in embedded mode by applications > > running on WildFly 10 "full". > > I guess it's ok for some integration tests can be moved to WildFly 10 > > already? For example, embedded query modules, and the Hibernate Search > > / Infinispan Directory integrations. > > > > On 13 July 2015 at 12:33, Tristan Tarrant wrote: > >> Infinispan 8 will come out long before WildFly 10. > >> The intention is to rebase the server to WildFly core 2.0 IFF there is > >> an effort from some subsystems to reduce their dependency overhead > >> (transactions especially). See a thread in wildfly-dev (to which I'll > >> soon be replying). > >> > >> Tristan > >> > >> On 13/07/2015 11:53, Sanne Grinovero wrote: > >>> Welcome back! > >>> but.. we're in WildFly 10 season now ;) > >>> > >>> On 13 July 2015 at 11:40, Tristan Tarrant wrote: > >>>> Hi all guys, > >>>> > >>>> I've just returned from two weeks of PTO. > >>>> Before leaving I left a PR [1] which updates the JGroups subsystem to > >>>> the one found in WildFly 9 as well as a bunch of other changes. 
> >>>> Unfortunately I broke authorization in there, so I'm cleaning up my PR > >>>> and hopefully should have it ready by later today. After that it's the > >>>> infamous template PR which will be based on the one below. > >>>> > >>>> I have a ton of e-mails and PR comments to read and review so I'll > >>>> probably re-emerge some time later this week :) > >>>> > >>>> Tristan > >>>> > >>>> [1] https://github.com/infinispan/infinispan/pull/3562 > >>>> > >>>> -- > >>>> Tristan Tarrant > >>>> Infinispan Lead > >>>> JBoss, a division of Red Hat > >>>> _______________________________________________ > >>>> infinispan-dev mailing list > >>>> infinispan-dev at lists.jboss.org > >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev at lists.jboss.org > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> > >>> > >> > >> -- > >> Tristan Tarrant > >> Infinispan Lead > >> JBoss, a division of Red Hat > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150713/c784b389/attachment-0001.html From dan.berindei at gmail.com Mon Jul 13 10:45:52 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 13 Jul 2015 17:45:52 +0300 Subject: [infinispan-dev] Weekly IRC Meeting Minutes 2015-07-13 Message-ID: Hi all Here are the minutes of today's IRC meeting: http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-07-13-14.03.log.html Cheers Dan PS: The #topic command seems to be ignored if the user is not a chair. I've looked into the jbott docs, and it looks like there's a #chairs command that could be used to make everyone a chair at the beginning at the meeting. From mudokonman at gmail.com Mon Jul 13 13:13:06 2015 From: mudokonman at gmail.com (William Burns) Date: Mon, 13 Jul 2015 17:13:06 +0000 Subject: [infinispan-dev] Strict Expiration Message-ID: This is a necro of [1]. With Infinispan 8.0 we are adding in clustered expiration. That includes an expiration event raised that is clustered as well. Unfortunately expiration events currently occur multiple times (if numOwners > 1) at different times across nodes in a cluster. This makes coordinating a single cluster expiration event quite difficult. To work around this I am proposing that the expiration of an event is done solely by the owner of the given key that is now expired. This would fix the issue of having multiple events and the event can be raised while holding the lock for the given key so concurrent modifications would not be an issue. The problem arises when you have other nodes that have expiration set but expire at different times. 
Max idle is the biggest offender with this as a read on an owner only refreshes the owners timestamp, meaning other owners would not be updated and expire preemptively. To have expiration work properly in this case you would need coordination between the owners to see if anyone has a higher value. This requires blocking and would have to be done while accessing a key that is expired to be sure if expiration happened or not. The linked dev listing proposed instead to only expire an entry by the reaper thread and not on access. In this case a read will return a non null value until it is fully expired, increasing hit ratios possibly. Their are quire a bit of real benefits for this: 1. Cluster cache reads would be much simpler and wouldn't have to block to verify the object exists or not since this would only be done by the reaper thread (note this would have only happened if the entry was expired locally). An access would just return the value immediately. 2. Each node only expires entries it owns in the reaper thread reducing how many entries they must check or remove. This also provides a single point where events would be raised as we need. 3. A lot of code can now be removed and made simpler as it no longer has to check for expiration. The expiration check would only be done in 1 place, the expiration reaper thread. The main issue with this proposal is as the other listing mentions is if user code expects the value to be gone after expiration for correctness. I would say this use case is not as compelling for maxIdle, especially since we never supported it properly. And in the case of lifespan the user could very easily store the expiration time in the object that they can check after a get as pointed out in the other thread. [1] http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150713/b65a0e1c/attachment.html From sanne at infinispan.org Mon Jul 13 13:41:29 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 13 Jul 2015 18:41:29 +0100 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: Message-ID: +1 You had me convinced at the first line, although "A lot of code can now be removed and made simpler" makes it look extremely nice. On 13 Jul 2015 18:14, "William Burns" wrote: > This is a necro of [1]. > > With Infinispan 8.0 we are adding in clustered expiration. That includes > an expiration event raised that is clustered as well. Unfortunately > expiration events currently occur multiple times (if numOwners > 1) at > different times across nodes in a cluster. This makes coordinating a > single cluster expiration event quite difficult. > > To work around this I am proposing that the expiration of an event is done > solely by the owner of the given key that is now expired. This would fix > the issue of having multiple events and the event can be raised while > holding the lock for the given key so concurrent modifications would not be > an issue. > > The problem arises when you have other nodes that have expiration set but > expire at different times. Max idle is the biggest offender with this as a > read on an owner only refreshes the owners timestamp, meaning other owners > would not be updated and expire preemptively. 
To have expiration work > properly in this case you would need coordination between the owners to see > if anyone has a higher value. This requires blocking and would have to be > done while accessing a key that is expired to be sure if expiration > happened or not. > > The linked dev listing proposed instead to only expire an entry by the > reaper thread and not on access. In this case a read will return a non > null value until it is fully expired, increasing hit ratios possibly. > > Their are quire a bit of real benefits for this: > > 1. Cluster cache reads would be much simpler and wouldn't have to block to > verify the object exists or not since this would only be done by the reaper > thread (note this would have only happened if the entry was expired > locally). An access would just return the value immediately. > 2. Each node only expires entries it owns in the reaper thread reducing > how many entries they must check or remove. This also provides a single > point where events would be raised as we need. > 3. A lot of code can now be removed and made simpler as it no longer has > to check for expiration. The expiration check would only be done in 1 > place, the expiration reaper thread. > > The main issue with this proposal is as the other listing mentions is if > user code expects the value to be gone after expiration for correctness. I > would say this use case is not as compelling for maxIdle, especially since > we never supported it properly. And in the case of lifespan the user could > very easily store the expiration time in the object that they can check > after a get as pointed out in the other thread. > > [1] > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150713/de0ba282/attachment.html From ttarrant at redhat.com Mon Jul 13 14:25:37 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 13 Jul 2015 19:25:37 +0100 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: Message-ID: <55A402A1.4010004@redhat.com> After re-reading the whole original thread, I agree with the proposal with two caveats: - ensure that we don't break JCache compatibility - ensure that we document this properly Tristan On 13/07/2015 18:41, Sanne Grinovero wrote: > +1 > You had me convinced at the first line, although "A lot of code can now > be removed and made simpler" makes it look extremely nice. > > On 13 Jul 2015 18:14, "William Burns" > wrote: > > This is a necro of [1]. > > With Infinispan 8.0 we are adding in clustered expiration. That > includes an expiration event raised that is clustered as well. > Unfortunately expiration events currently occur multiple times (if > numOwners > 1) at different times across nodes in a cluster. This > makes coordinating a single cluster expiration event quite difficult. > > To work around this I am proposing that the expiration of an event > is done solely by the owner of the given key that is now expired. > This would fix the issue of having multiple events and the event can > be raised while holding the lock for the given key so concurrent > modifications would not be an issue. 
> > The problem arises when you have other nodes that have expiration > set but expire at different times. Max idle is the biggest offender > with this as a read on an owner only refreshes the owners timestamp, > meaning other owners would not be updated and expire preemptively. > To have expiration work properly in this case you would need > coordination between the owners to see if anyone has a higher > value. This requires blocking and would have to be done while > accessing a key that is expired to be sure if expiration happened or > not. > > The linked dev listing proposed instead to only expire an entry by > the reaper thread and not on access. In this case a read will > return a non null value until it is fully expired, increasing hit > ratios possibly. > > Their are quire a bit of real benefits for this: > > 1. Cluster cache reads would be much simpler and wouldn't have to > block to verify the object exists or not since this would only be > done by the reaper thread (note this would have only happened if the > entry was expired locally). An access would just return the value > immediately. > 2. Each node only expires entries it owns in the reaper thread > reducing how many entries they must check or remove. This also > provides a single point where events would be raised as we need. > 3. A lot of code can now be removed and made simpler as it no longer > has to check for expiration. The expiration check would only be > done in 1 place, the expiration reaper thread. > > The main issue with this proposal is as the other listing mentions > is if user code expects the value to be gone after expiration for > correctness. I would say this use case is not as compelling for > maxIdle, especially since we never supported it properly. And in > the case of lifespan the user could very easily store the expiration > time in the object that they can check after a get as pointed out in > the other thread. > > [1] > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Mon Jul 13 14:30:04 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 13 Jul 2015 19:30:04 +0100 Subject: [infinispan-dev] Distributed Streams In-Reply-To: <1034624896.16910951.1436483406980.JavaMail.zimbra@redhat.com> References: <1034624896.16910951.1436483406980.JavaMail.zimbra@redhat.com> Message-ID: <55A403AC.7090408@redhat.com> On 10/07/2015 00:10, Galder Zamarreno wrote: >> After the discussion we started in Galder's PR's comments [1], I >> started thinking that we should really have a stream() method directly >> in the Cache/AdvancedCache interface. > > ^ I don't think that's a good idea. A stream of what? keys only? values and keys? > > In the end you'd end up with 2/3 stream methods.... I agree with Galder here: streams only apply to Collections and not to Maps (like in the JDK). It has already proven difficult mimicking the Map interface in a distributed environment: let's not aggravate this. entrySet().stream(), keySet().stream() are unambiguous and respect the original interfaces. 
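In code, the point is that the three existing views already give three unambiguous streams; a small sketch with invented predicates:

import java.util.Set;
import java.util.stream.Collectors;
import org.infinispan.Cache;

public class ViewStreamsSketch {

   static long countLargeValues(Cache<String, byte[]> cache) {
      return cache.entrySet().stream()          // Stream<Map.Entry<String, byte[]>>
            .filter(e -> e.getValue().length > 1024)
            .count();
   }

   static Set<String> userKeys(Cache<String, byte[]> cache) {
      return cache.keySet().stream()            // Stream<String>
            .filter(k -> k.startsWith("user:"))
            .collect(Collectors.toSet());
   }

   static long nonEmptyValues(Cache<String, byte[]> cache) {
      return cache.values().stream()            // Stream<byte[]>
            .filter(v -> v.length > 0)
            .count();
   }
}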
Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From dan.berindei at gmail.com Tue Jul 14 02:11:53 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 14 Jul 2015 09:11:53 +0300 Subject: [infinispan-dev] Distributed Streams In-Reply-To: <55A403AC.7090408@redhat.com> References: <1034624896.16910951.1436483406980.JavaMail.zimbra@redhat.com> <55A403AC.7090408@redhat.com> Message-ID: On Mon, Jul 13, 2015 at 9:30 PM, Tristan Tarrant wrote: > > > On 10/07/2015 00:10, Galder Zamarreno wrote: >>> After the discussion we started in Galder's PR's comments [1], I >>> started thinking that we should really have a stream() method directly >>> in the Cache/AdvancedCache interface. >> >> ^ I don't think that's a good idea. A stream of what? keys only? values and keys? >> >> In the end you'd end up with 2/3 stream methods.... > > I agree with Galder here: streams only apply to Collections and not to > Maps (like in the JDK). It has already proven difficult mimicking the > Map interface in a distributed environment: let's not aggravate this. > > entrySet().stream(), keySet().stream() are unambiguous and respect the > original interfaces. I was thinking stream() would stream CacheEntries, like cacheEntrySet().stream() does in Will's PR. But I guess people who want to try the streaming API will still go for entrySet().stream() first, and they'll only try to find another method if it doesn't work or if they need access to the metadata. Dan From dan.berindei at gmail.com Tue Jul 14 04:41:01 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 14 Jul 2015 11:41:01 +0300 Subject: [infinispan-dev] Strict Expiration In-Reply-To: <55A402A1.4010004@redhat.com> References: <55A402A1.4010004@redhat.com> Message-ID: Processing expiration only on the reaper thread sounds nice, but I have one reservation: processing 1 million entries to see that 1 of them is expired is a lot of work, and in the general case we will not be able to ensure an expiration precision of less than 1 minute (maybe more, with a huge SingleFileStore attached). What happens to users who need better precision? In particular, I know some JCache tests were failing because HotRod was only supporting 1-second resolution instead of the 1-millisecond resolution they were expecting. I'm even less convinced about the need to guarantee that a clustered expiration listener will only be triggered once, and that the entry must be null everywhere after that listener was invoked. What's the use case? Note that this would make the reaper thread less efficient: with numOwners=2 (best case), half of the entries that the reaper touches cannot be expired, because the node isn't the primary node. And to make matters worse, the same reaper thread would have to perform a (synchronous?) RPC for each entry to ensure it expires everywhere. For maxIdle I'd like to know more information about how exactly the owners would coordinate to expire an entry. I'm pretty sure we cannot avoid ignoring some reads (expiring an entry immediately after it was read), and ensuring that we don't accidentally extend an entry's life (like the current code does, when we transfer an entry to a new owner) also sounds problematic. I'm not saying expiring entries on each node independently is perfect, far from it. But I wouldn't want us to provide new guarantees that could hurt performance without a really good use case. 
Cheers Dan On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant wrote: > After re-reading the whole original thread, I agree with the proposal > with two caveats: > > - ensure that we don't break JCache compatibility > - ensure that we document this properly > > Tristan > > On 13/07/2015 18:41, Sanne Grinovero wrote: >> +1 >> You had me convinced at the first line, although "A lot of code can now >> be removed and made simpler" makes it look extremely nice. >> >> On 13 Jul 2015 18:14, "William Burns" > > wrote: >> >> This is a necro of [1]. >> >> With Infinispan 8.0 we are adding in clustered expiration. That >> includes an expiration event raised that is clustered as well. >> Unfortunately expiration events currently occur multiple times (if >> numOwners > 1) at different times across nodes in a cluster. This >> makes coordinating a single cluster expiration event quite difficult. >> >> To work around this I am proposing that the expiration of an event >> is done solely by the owner of the given key that is now expired. >> This would fix the issue of having multiple events and the event can >> be raised while holding the lock for the given key so concurrent >> modifications would not be an issue. >> >> The problem arises when you have other nodes that have expiration >> set but expire at different times. Max idle is the biggest offender >> with this as a read on an owner only refreshes the owners timestamp, >> meaning other owners would not be updated and expire preemptively. >> To have expiration work properly in this case you would need >> coordination between the owners to see if anyone has a higher >> value. This requires blocking and would have to be done while >> accessing a key that is expired to be sure if expiration happened or >> not. >> >> The linked dev listing proposed instead to only expire an entry by >> the reaper thread and not on access. In this case a read will >> return a non null value until it is fully expired, increasing hit >> ratios possibly. >> >> Their are quire a bit of real benefits for this: >> >> 1. Cluster cache reads would be much simpler and wouldn't have to >> block to verify the object exists or not since this would only be >> done by the reaper thread (note this would have only happened if the >> entry was expired locally). An access would just return the value >> immediately. >> 2. Each node only expires entries it owns in the reaper thread >> reducing how many entries they must check or remove. This also >> provides a single point where events would be raised as we need. >> 3. A lot of code can now be removed and made simpler as it no longer >> has to check for expiration. The expiration check would only be >> done in 1 place, the expiration reaper thread. >> >> The main issue with this proposal is as the other listing mentions >> is if user code expects the value to be gone after expiration for >> correctness. I would say this use case is not as compelling for >> maxIdle, especially since we never supported it properly. And in >> the case of lifespan the user could very easily store the expiration >> time in the object that they can check after a get as pointed out in >> the other thread. 
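The check-after-get workaround referred to in the quoted proposal could look something like this; purely illustrative, with ExpiryAwareValue and its fields invented for the example:

import java.io.Serializable;
import java.util.concurrent.TimeUnit;
import org.infinispan.Cache;

public class ExpiryAwareValue implements Serializable {
   final String payload;
   final long expiresAtMillis;   // the lifespan deadline, stored alongside the data

   ExpiryAwareValue(String payload, long lifespanMillis) {
      this.payload = payload;
      this.expiresAtMillis = System.currentTimeMillis() + lifespanMillis;
   }

   static void put(Cache<String, ExpiryAwareValue> cache, String key, String payload,
                   long lifespan, TimeUnit unit) {
      cache.put(key, new ExpiryAwareValue(payload, unit.toMillis(lifespan)), lifespan, unit);
   }

   // Application-side strictness: even if the entry is still physically present
   // because the reaper has not run yet, treat it as gone once its stored
   // deadline has passed.
   static String get(Cache<String, ExpiryAwareValue> cache, String key) {
      ExpiryAwareValue v = cache.get(key);
      return (v == null || System.currentTimeMillis() >= v.expiresAtMillis) ? null : v.payload;
   }
}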
>> >> [1] >> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Tue Jul 14 05:48:52 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 14 Jul 2015 10:48:52 +0100 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> Message-ID: <55A4DB04.9080205@redhat.com> Actually, soon after sending my e-mail yesterday, I've been having other thoughts on this. I think we should really distinguish between maxIdle (which IMO semantically only makes sense in a pure caching environment) and lifespan. While it is acceptable to have stale data for the former (and therefore justify a best-effort approach on expiration), I don't think we can apply that blanket statement to the latter, since a user might have legitimate reasons to depend on a fixed expiration time. While offloading the burden of actually checking for staleness on the user code might seem like the nice thing to do from our point of view, I'm not sure the user would think the same. Also, it doesn't seem correct for things like putIfAbsent and computeIfAbsent. +1 to only have the primary owner generate the expiration event. Tristan On 14/07/2015 09:41, Dan Berindei wrote: > Processing expiration only on the reaper thread sounds nice, but I > have one reservation: processing 1 million entries to see that 1 of > them is expired is a lot of work, and in the general case we will not > be able to ensure an expiration precision of less than 1 minute (maybe > more, with a huge SingleFileStore attached). > > What happens to users who need better precision? In particular, I know > some JCache tests were failing because HotRod was only supporting > 1-second resolution instead of the 1-millisecond resolution they were > expecting. > > > I'm even less convinced about the need to guarantee that a clustered > expiration listener will only be triggered once, and that the entry > must be null everywhere after that listener was invoked. What's the > use case? > > Note that this would make the reaper thread less efficient: with > numOwners=2 (best case), half of the entries that the reaper touches > cannot be expired, because the node isn't the primary node. And to > make matters worse, the same reaper thread would have to perform a > (synchronous?) RPC for each entry to ensure it expires everywhere. > > For maxIdle I'd like to know more information about how exactly the > owners would coordinate to expire an entry. I'm pretty sure we cannot > avoid ignoring some reads (expiring an entry immediately after it was > read), and ensuring that we don't accidentally extend an entry's life > (like the current code does, when we transfer an entry to a new owner) > also sounds problematic. > > I'm not saying expiring entries on each node independently is perfect, > far from it. 
But I wouldn't want us to provide new guarantees that > could hurt performance without a really good use case. > > Cheers > Dan > > > On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant wrote: >> After re-reading the whole original thread, I agree with the proposal >> with two caveats: >> >> - ensure that we don't break JCache compatibility >> - ensure that we document this properly >> >> Tristan >> >> On 13/07/2015 18:41, Sanne Grinovero wrote: >>> +1 >>> You had me convinced at the first line, although "A lot of code can now >>> be removed and made simpler" makes it look extremely nice. >>> >>> On 13 Jul 2015 18:14, "William Burns" >> > wrote: >>> >>> This is a necro of [1]. >>> >>> With Infinispan 8.0 we are adding in clustered expiration. That >>> includes an expiration event raised that is clustered as well. >>> Unfortunately expiration events currently occur multiple times (if >>> numOwners > 1) at different times across nodes in a cluster. This >>> makes coordinating a single cluster expiration event quite difficult. >>> >>> To work around this I am proposing that the expiration of an event >>> is done solely by the owner of the given key that is now expired. >>> This would fix the issue of having multiple events and the event can >>> be raised while holding the lock for the given key so concurrent >>> modifications would not be an issue. >>> >>> The problem arises when you have other nodes that have expiration >>> set but expire at different times. Max idle is the biggest offender >>> with this as a read on an owner only refreshes the owners timestamp, >>> meaning other owners would not be updated and expire preemptively. >>> To have expiration work properly in this case you would need >>> coordination between the owners to see if anyone has a higher >>> value. This requires blocking and would have to be done while >>> accessing a key that is expired to be sure if expiration happened or >>> not. >>> >>> The linked dev listing proposed instead to only expire an entry by >>> the reaper thread and not on access. In this case a read will >>> return a non null value until it is fully expired, increasing hit >>> ratios possibly. >>> >>> Their are quire a bit of real benefits for this: >>> >>> 1. Cluster cache reads would be much simpler and wouldn't have to >>> block to verify the object exists or not since this would only be >>> done by the reaper thread (note this would have only happened if the >>> entry was expired locally). An access would just return the value >>> immediately. >>> 2. Each node only expires entries it owns in the reaper thread >>> reducing how many entries they must check or remove. This also >>> provides a single point where events would be raised as we need. >>> 3. A lot of code can now be removed and made simpler as it no longer >>> has to check for expiration. The expiration check would only be >>> done in 1 place, the expiration reaper thread. >>> >>> The main issue with this proposal is as the other listing mentions >>> is if user code expects the value to be gone after expiration for >>> correctness. I would say this use case is not as compelling for >>> maxIdle, especially since we never supported it properly. And in >>> the case of lifespan the user could very easily store the expiration >>> time in the object that they can check after a get as pointed out in >>> the other thread. 
>>> >>> [1] >>> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From smarlow at redhat.com Tue Jul 14 08:34:01 2015 From: smarlow at redhat.com (Scott Marlow) Date: Tue, 14 Jul 2015 08:34:01 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... Message-ID: <55A501B9.7060608@redhat.com> Hi, I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. If that happens, how does that impact Hibernate ORM 5.0 which currently integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any changes to integrate with Infinispan 8.0? Thanks, Scott From mudokonman at gmail.com Tue Jul 14 09:37:32 2015 From: mudokonman at gmail.com (William Burns) Date: Tue, 14 Jul 2015 13:37:32 +0000 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> Message-ID: On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei wrote: > Processing expiration only on the reaper thread sounds nice, but I > have one reservation: processing 1 million entries to see that 1 of > them is expired is a lot of work, and in the general case we will not > be able to ensure an expiration precision of less than 1 minute (maybe > more, with a huge SingleFileStore attached). > This isn't much different then before. The only difference is that if a user touched a value after it expired it wouldn't show up (which is unlikely with maxIdle especially). > > What happens to users who need better precision? In particular, I know > some JCache tests were failing because HotRod was only supporting > 1-second resolution instead of the 1-millisecond resolution they were > expecting. > JCache is an interesting piece. The thing about JCache is that the spec is only defined for local caches. However I wouldn't want to muddy up the waters in regards to it behaving differently for local/remote. In the JCache scenario we could add an interceptor to prevent it returning such values (we do something similar already for events). JCache behavior vs ISPN behavior seems a bit easier to differentiate. But like you are getting at, either way is not very appealing. > > > I'm even less convinced about the need to guarantee that a clustered > expiration listener will only be triggered once, and that the entry > must be null everywhere after that listener was invoked. What's the > use case? > Maybe Tristan would know more to answer. To be honest this work seems fruitless unless we know what our end users want here. 
Spending time on something for it to thrown out is never fun :( And the more I thought about this the more I question the validity of maxIdle even. It seems like a very poor way to prevent memory exhaustion, which eviction does in a much better way and has much more flexible algorithms. Does anyone know what maxIdle would be used for that wouldn't be covered by eviction? The only thing I can think of is cleaning up the cache store as well. > > Note that this would make the reaper thread less efficient: with > numOwners=2 (best case), half of the entries that the reaper touches > cannot be expired, because the node isn't the primary node. And to > make matters worse, the same reaper thread would have to perform a > (synchronous?) RPC for each entry to ensure it expires everywhere. > I have debated about this, it could something like a sync removeAll which has a special marker to tell it is due to expiration (which would raise listeners there), while also sending a cluster expiration event to other non owners. > > For maxIdle I'd like to know more information about how exactly the > owners would coordinate to expire an entry. I'm pretty sure we cannot > avoid ignoring some reads (expiring an entry immediately after it was > read), and ensuring that we don't accidentally extend an entry's life > (like the current code does, when we transfer an entry to a new owner) > also sounds problematic. > For lifespan it is simple, the primary owner just expires it when it expires there. There is no coordination needed in this case it just sends the expired remove to owners etc. Max idle is more complicated as we all know. The primary owner would send a request for the last used time for a given key or set of keys. Then the owner would take those times and check for a new access it isn't aware of. If there isn't then it would send a remove command for the key(s). If there is a new access the owner would instead send the last used time to all of the owners. The expiration obviously would have a window that if a read occurred after sending a response that could be ignored. This could be resolved by using some sort of 2PC and blocking reads during that period but I would say it isn't worth it. The issue with transferring to a new node refreshing the last update/lifespan seems like just a bug we need to fix irrespective of this issue IMO. > > I'm not saying expiring entries on each node independently is perfect, > far from it. But I wouldn't want us to provide new guarantees that > could hurt performance without a really good use case. > I would guess that user perceived performance should be a little faster with this. But this also depends on an alternative that we decided on :) Also the expiration thread pool is set to min priority atm so it may delay removal of said objects but hopefully (if the jvm supports) it wouldn't overrun a CPU while processing unless it has availability. > > Cheers > Dan > > > On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > wrote: > > After re-reading the whole original thread, I agree with the proposal > > with two caveats: > > > > - ensure that we don't break JCache compatibility > > - ensure that we document this properly > > > > Tristan > > > > On 13/07/2015 18:41, Sanne Grinovero wrote: > >> +1 > >> You had me convinced at the first line, although "A lot of code can now > >> be removed and made simpler" makes it look extremely nice. > >> > >> On 13 Jul 2015 18:14, "William Burns" >> > wrote: > >> > >> This is a necro of [1]. 
> >> > >> With Infinispan 8.0 we are adding in clustered expiration. That > >> includes an expiration event raised that is clustered as well. > >> Unfortunately expiration events currently occur multiple times (if > >> numOwners > 1) at different times across nodes in a cluster. This > >> makes coordinating a single cluster expiration event quite > difficult. > >> > >> To work around this I am proposing that the expiration of an event > >> is done solely by the owner of the given key that is now expired. > >> This would fix the issue of having multiple events and the event can > >> be raised while holding the lock for the given key so concurrent > >> modifications would not be an issue. > >> > >> The problem arises when you have other nodes that have expiration > >> set but expire at different times. Max idle is the biggest offender > >> with this as a read on an owner only refreshes the owners timestamp, > >> meaning other owners would not be updated and expire preemptively. > >> To have expiration work properly in this case you would need > >> coordination between the owners to see if anyone has a higher > >> value. This requires blocking and would have to be done while > >> accessing a key that is expired to be sure if expiration happened or > >> not. > >> > >> The linked dev listing proposed instead to only expire an entry by > >> the reaper thread and not on access. In this case a read will > >> return a non null value until it is fully expired, increasing hit > >> ratios possibly. > >> > >> Their are quire a bit of real benefits for this: > >> > >> 1. Cluster cache reads would be much simpler and wouldn't have to > >> block to verify the object exists or not since this would only be > >> done by the reaper thread (note this would have only happened if the > >> entry was expired locally). An access would just return the value > >> immediately. > >> 2. Each node only expires entries it owns in the reaper thread > >> reducing how many entries they must check or remove. This also > >> provides a single point where events would be raised as we need. > >> 3. A lot of code can now be removed and made simpler as it no longer > >> has to check for expiration. The expiration check would only be > >> done in 1 place, the expiration reaper thread. > >> > >> The main issue with this proposal is as the other listing mentions > >> is if user code expects the value to be gone after expiration for > >> correctness. I would say this use case is not as compelling for > >> maxIdle, especially since we never supported it properly. And in > >> the case of lifespan the user could very easily store the expiration > >> time in the object that they can check after a get as pointed out in > >> the other thread. 
> >> > >> [1] > >> > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org infinispan-dev at lists.jboss.org> > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > > > > -- > > Tristan Tarrant > > Infinispan Lead > > JBoss, a division of Red Hat > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150714/41db7c48/attachment-0001.html From mudokonman at gmail.com Tue Jul 14 10:19:06 2015 From: mudokonman at gmail.com (William Burns) Date: Tue, 14 Jul 2015 14:19:06 +0000 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> Message-ID: On Tue, Jul 14, 2015 at 9:37 AM William Burns wrote: > On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > wrote: > >> Processing expiration only on the reaper thread sounds nice, but I >> have one reservation: processing 1 million entries to see that 1 of >> them is expired is a lot of work, and in the general case we will not >> be able to ensure an expiration precision of less than 1 minute (maybe >> more, with a huge SingleFileStore attached). >> > > This isn't much different then before. The only difference is that if a > user touched a value after it expired it wouldn't show up (which is > unlikely with maxIdle especially). > > >> >> What happens to users who need better precision? In particular, I know >> some JCache tests were failing because HotRod was only supporting >> 1-second resolution instead of the 1-millisecond resolution they were >> expecting. >> > > JCache is an interesting piece. The thing about JCache is that the spec > is only defined for local caches. However I wouldn't want to muddy up the > waters in regards to it behaving differently for local/remote. In the > JCache scenario we could add an interceptor to prevent it returning such > values (we do something similar already for events). JCache behavior vs > ISPN behavior seems a bit easier to differentiate. But like you are > getting at, either way is not very appealing. > > >> >> >> I'm even less convinced about the need to guarantee that a clustered >> expiration listener will only be triggered once, and that the entry >> must be null everywhere after that listener was invoked. What's the >> use case? >> > > Maybe Tristan would know more to answer. To be honest this work seems > fruitless unless we know what our end users want here. Spending time on > something for it to thrown out is never fun :( > > And the more I thought about this the more I question the validity of > maxIdle even. It seems like a very poor way to prevent memory exhaustion, > which eviction does in a much better way and has much more flexible > algorithms. 
Does anyone know what maxIdle would be used for that wouldn't > be covered by eviction? The only thing I can think of is cleaning up the > cache store as well. > Actually I guess for session/authentication related information this would be important. However maxIdle isn't really as usable in that case since most likely you would have a sticky session to go back to that node which means you would never refresh the last used date on the copies (current implementation). Without cluster expiration you could lose that session information on a failover very easily. > > >> >> Note that this would make the reaper thread less efficient: with >> numOwners=2 (best case), half of the entries that the reaper touches >> cannot be expired, because the node isn't the primary node. And to >> make matters worse, the same reaper thread would have to perform a >> (synchronous?) RPC for each entry to ensure it expires everywhere. >> > > I have debated about this, it could something like a sync removeAll which > has a special marker to tell it is due to expiration (which would raise > listeners there), while also sending a cluster expiration event to other > non owners. > > >> >> For maxIdle I'd like to know more information about how exactly the >> owners would coordinate to expire an entry. I'm pretty sure we cannot >> avoid ignoring some reads (expiring an entry immediately after it was >> read), and ensuring that we don't accidentally extend an entry's life >> (like the current code does, when we transfer an entry to a new owner) >> also sounds problematic. >> > > For lifespan it is simple, the primary owner just expires it when it > expires there. There is no coordination needed in this case it just sends > the expired remove to owners etc. > > Max idle is more complicated as we all know. The primary owner would send > a request for the last used time for a given key or set of keys. Then the > owner would take those times and check for a new access it isn't aware of. > If there isn't then it would send a remove command for the key(s). If > there is a new access the owner would instead send the last used time to > all of the owners. The expiration obviously would have a window that if a > read occurred after sending a response that could be ignored. This could > be resolved by using some sort of 2PC and blocking reads during that period > but I would say it isn't worth it. > > The issue with transferring to a new node refreshing the last > update/lifespan seems like just a bug we need to fix irrespective of this > issue IMO. > > >> >> I'm not saying expiring entries on each node independently is perfect, >> far from it. But I wouldn't want us to provide new guarantees that >> could hurt performance without a really good use case. >> > > I would guess that user perceived performance should be a little faster > with this. But this also depends on an alternative that we decided on :) > > Also the expiration thread pool is set to min priority atm so it may delay > removal of said objects but hopefully (if the jvm supports) it wouldn't > overrun a CPU while processing unless it has availability. 
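To make the owner coordination above a bit more concrete, here is a rough,
hypothetical sketch of the check the reaper could run on the primary owner.
requestLastAccess, expireRemotely and broadcastLastAccess are invented names
for the coordination messages described above, not existing Infinispan APIs;
InternalCacheEntry and Address are the existing types:

    // Hypothetical sketch of the primary-owner max-idle check run by the reaper thread.
    // requestLastAccess(), expireRemotely() and broadcastLastAccess() are invented
    // placeholders for the owner coordination described above.
    void checkMaxIdle(Object key, InternalCacheEntry entry, Collection<Address> backupOwners) {
        long newestAccess = entry.getLastUsed();
        for (Address backup : backupOwners) {
            // Ask each backup owner for its local last-access timestamp.
            newestAccess = Math.max(newestAccess, requestLastAccess(backup, key));
        }
        if (newestAccess + entry.getMaxIdle() <= timeService.wallClockTime()) {
            // No owner saw a newer read: remove the entry everywhere and raise a single expiration event.
            expireRemotely(key, backupOwners);
        } else {
            // A newer access exists: propagate the freshest timestamp instead of expiring,
            // accepting the small window where a concurrent read may still be missed.
            broadcastLastAccess(key, newestAccess, backupOwners);
        }
    }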
> > >> >> Cheers >> Dan >> >> >> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant >> wrote: >> > After re-reading the whole original thread, I agree with the proposal >> > with two caveats: >> > >> > - ensure that we don't break JCache compatibility >> > - ensure that we document this properly >> > >> > Tristan >> > >> > On 13/07/2015 18:41, Sanne Grinovero wrote: >> >> +1 >> >> You had me convinced at the first line, although "A lot of code can now >> >> be removed and made simpler" makes it look extremely nice. >> >> >> >> On 13 Jul 2015 18:14, "William Burns" > >> > wrote: >> >> >> >> This is a necro of [1]. >> >> >> >> With Infinispan 8.0 we are adding in clustered expiration. That >> >> includes an expiration event raised that is clustered as well. >> >> Unfortunately expiration events currently occur multiple times (if >> >> numOwners > 1) at different times across nodes in a cluster. This >> >> makes coordinating a single cluster expiration event quite >> difficult. >> >> >> >> To work around this I am proposing that the expiration of an event >> >> is done solely by the owner of the given key that is now expired. >> >> This would fix the issue of having multiple events and the event >> can >> >> be raised while holding the lock for the given key so concurrent >> >> modifications would not be an issue. >> >> >> >> The problem arises when you have other nodes that have expiration >> >> set but expire at different times. Max idle is the biggest >> offender >> >> with this as a read on an owner only refreshes the owners >> timestamp, >> >> meaning other owners would not be updated and expire preemptively. >> >> To have expiration work properly in this case you would need >> >> coordination between the owners to see if anyone has a higher >> >> value. This requires blocking and would have to be done while >> >> accessing a key that is expired to be sure if expiration happened >> or >> >> not. >> >> >> >> The linked dev listing proposed instead to only expire an entry by >> >> the reaper thread and not on access. In this case a read will >> >> return a non null value until it is fully expired, increasing hit >> >> ratios possibly. >> >> >> >> Their are quire a bit of real benefits for this: >> >> >> >> 1. Cluster cache reads would be much simpler and wouldn't have to >> >> block to verify the object exists or not since this would only be >> >> done by the reaper thread (note this would have only happened if >> the >> >> entry was expired locally). An access would just return the value >> >> immediately. >> >> 2. Each node only expires entries it owns in the reaper thread >> >> reducing how many entries they must check or remove. This also >> >> provides a single point where events would be raised as we need. >> >> 3. A lot of code can now be removed and made simpler as it no >> longer >> >> has to check for expiration. The expiration check would only be >> >> done in 1 place, the expiration reaper thread. >> >> >> >> The main issue with this proposal is as the other listing mentions >> >> is if user code expects the value to be gone after expiration for >> >> correctness. I would say this use case is not as compelling for >> >> maxIdle, especially since we never supported it properly. And in >> >> the case of lifespan the user could very easily store the >> expiration >> >> time in the object that they can check after a get as pointed out >> in >> >> the other thread. 
>> >> >> >> [1] >> >> >> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org > infinispan-dev at lists.jboss.org> >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> > >> > -- >> > Tristan Tarrant >> > Infinispan Lead >> > JBoss, a division of Red Hat >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150714/d3b09300/attachment.html From rvansa at redhat.com Tue Jul 14 11:08:46 2015 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 14 Jul 2015 17:08:46 +0200 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> Message-ID: <55A525FE.80500@redhat.com> On 07/14/2015 04:19 PM, William Burns wrote: > > > On Tue, Jul 14, 2015 at 9:37 AM William Burns > wrote: > > On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > > wrote: > > Processing expiration only on the reaper thread sounds nice, but I > have one reservation: processing 1 million entries to see that > 1 of > them is expired is a lot of work, and in the general case we > will not > be able to ensure an expiration precision of less than 1 > minute (maybe > more, with a huge SingleFileStore attached). > > > This isn't much different then before. The only difference is > that if a user touched a value after it expired it wouldn't show > up (which is unlikely with maxIdle especially). > > > What happens to users who need better precision? In > particular, I know > some JCache tests were failing because HotRod was only supporting > 1-second resolution instead of the 1-millisecond resolution > they were > expecting. > > > JCache is an interesting piece. The thing about JCache is that > the spec is only defined for local caches. However I wouldn't > want to muddy up the waters in regards to it behaving differently > for local/remote. In the JCache scenario we could add an > interceptor to prevent it returning such values (we do something > similar already for events). JCache behavior vs ISPN behavior > seems a bit easier to differentiate. But like you are getting at, > either way is not very appealing. > > > > I'm even less convinced about the need to guarantee that a > clustered > expiration listener will only be triggered once, and that the > entry > must be null everywhere after that listener was invoked. > What's the > use case? > > > Maybe Tristan would know more to answer. To be honest this work > seems fruitless unless we know what our end users want here. > Spending time on something for it to thrown out is never fun :( > > And the more I thought about this the more I question the validity > of maxIdle even. 
It seems like a very poor way to prevent memory > exhaustion, which eviction does in a much better way and has much > more flexible algorithms. Does anyone know what maxIdle would be > used for that wouldn't be covered by eviction? The only thing I > can think of is cleaning up the cache store as well. > > > Actually I guess for session/authentication related information this > would be important. However maxIdle isn't really as usable in that > case since most likely you would have a sticky session to go back to > that node which means you would never refresh the last used date on > the copies (current implementation). Without cluster expiration you > could lose that session information on a failover very easily. I would say that maxIdle can be used as for memory management as kind of WeakHashMap - e.g. in 2LC the maxIdle is used to store some record for a short while (regular transaction lifespan ~ seconds to minutes), and regularly the record is removed. However, to make sure that we don't leak records in this cache (if something goes wrong and the remove does not occur), it is removed. I can guess how long the transaction takes place, but not how many parallel transactions there are. With eviction algorithms (where I am not sure about the exact guarantees) I can set the cache to not hold more than N entries, but I can't know for sure that my record does not suddenly get evicted after shorter period, possibly causing some inconsistency. So this is similar to WeakHashMap by removing the key "when it can't be used anymore" because I know that the transaction will finish before the deadline. I don't care about the exact size, I don't want to tune that, I just don't want to leak. From my POV the non-strict maxIdle and strict expiration would be a nice compromise. Radim > > Note that this would make the reaper thread less efficient: with > numOwners=2 (best case), half of the entries that the reaper > touches > cannot be expired, because the node isn't the primary node. And to > make matters worse, the same reaper thread would have to perform a > (synchronous?) RPC for each entry to ensure it expires everywhere. > > > I have debated about this, it could something like a sync > removeAll which has a special marker to tell it is due to > expiration (which would raise listeners there), while also sending > a cluster expiration event to other non owners. > > > For maxIdle I'd like to know more information about how > exactly the > owners would coordinate to expire an entry. I'm pretty sure we > cannot > avoid ignoring some reads (expiring an entry immediately after > it was > read), and ensuring that we don't accidentally extend an > entry's life > (like the current code does, when we transfer an entry to a > new owner) > also sounds problematic. > > > For lifespan it is simple, the primary owner just expires it when > it expires there. There is no coordination needed in this case it > just sends the expired remove to owners etc. > > Max idle is more complicated as we all know. The primary owner > would send a request for the last used time for a given key or set > of keys. Then the owner would take those times and check for a > new access it isn't aware of. If there isn't then it would send a > remove command for the key(s). If there is a new access the owner > would instead send the last used time to all of the owners. The > expiration obviously would have a window that if a read occurred > after sending a response that could be ignored. 
This could be > resolved by using some sort of 2PC and blocking reads during that > period but I would say it isn't worth it. > > The issue with transferring to a new node refreshing the last > update/lifespan seems like just a bug we need to fix irrespective > of this issue IMO. > > > I'm not saying expiring entries on each node independently is > perfect, > far from it. But I wouldn't want us to provide new guarantees that > could hurt performance without a really good use case. > > > I would guess that user perceived performance should be a little > faster with this. But this also depends on an alternative that we > decided on :) > > Also the expiration thread pool is set to min priority atm so it > may delay removal of said objects but hopefully (if the jvm > supports) it wouldn't overrun a CPU while processing unless it has > availability. > > > Cheers > Dan > > > On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > > wrote: > > After re-reading the whole original thread, I agree with the > proposal > > with two caveats: > > > > - ensure that we don't break JCache compatibility > > - ensure that we document this properly > > > > Tristan > > > > On 13/07/2015 18:41, Sanne Grinovero wrote: > >> +1 > >> You had me convinced at the first line, although "A lot of > code can now > >> be removed and made simpler" makes it look extremely nice. > >> > >> On 13 Jul 2015 18:14, "William Burns" > >> >> wrote: > >> > >> This is a necro of [1]. > >> > >> With Infinispan 8.0 we are adding in clustered > expiration. That > >> includes an expiration event raised that is clustered > as well. > >> Unfortunately expiration events currently occur > multiple times (if > >> numOwners > 1) at different times across nodes in a > cluster. This > >> makes coordinating a single cluster expiration event > quite difficult. > >> > >> To work around this I am proposing that the expiration > of an event > >> is done solely by the owner of the given key that is > now expired. > >> This would fix the issue of having multiple events and > the event can > >> be raised while holding the lock for the given key so > concurrent > >> modifications would not be an issue. > >> > >> The problem arises when you have other nodes that have > expiration > >> set but expire at different times. Max idle is the > biggest offender > >> with this as a read on an owner only refreshes the > owners timestamp, > >> meaning other owners would not be updated and expire > preemptively. > >> To have expiration work properly in this case you would > need > >> coordination between the owners to see if anyone has a > higher > >> value. This requires blocking and would have to be > done while > >> accessing a key that is expired to be sure if > expiration happened or > >> not. > >> > >> The linked dev listing proposed instead to only expire > an entry by > >> the reaper thread and not on access. In this case a > read will > >> return a non null value until it is fully expired, > increasing hit > >> ratios possibly. > >> > >> Their are quire a bit of real benefits for this: > >> > >> 1. Cluster cache reads would be much simpler and > wouldn't have to > >> block to verify the object exists or not since this > would only be > >> done by the reaper thread (note this would have only > happened if the > >> entry was expired locally). An access would just > return the value > >> immediately. > >> 2. Each node only expires entries it owns in the reaper > thread > >> reducing how many entries they must check or remove. 
> This also > >> provides a single point where events would be raised as > we need. > >> 3. A lot of code can now be removed and made simpler as > it no longer > >> has to check for expiration. The expiration check > would only be > >> done in 1 place, the expiration reaper thread. > >> > >> The main issue with this proposal is as the other > listing mentions > >> is if user code expects the value to be gone after > expiration for > >> correctness. I would say this use case is not as > compelling for > >> maxIdle, especially since we never supported it > properly. And in > >> the case of lifespan the user could very easily store > the expiration > >> time in the object that they can check after a get as > pointed out in > >> the other thread. > >> > >> [1] > >> > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > > > > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > > > > -- > > Tristan Tarrant > > Infinispan Lead > > JBoss, a division of Red Hat > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From rvansa at redhat.com Tue Jul 14 11:16:37 2015 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 14 Jul 2015 17:16:37 +0200 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55A501B9.7060608@redhat.com> References: <55A501B9.7060608@redhat.com> Message-ID: <55A527D5.8060606@redhat.com> IIRC currently hibernate-infinispan module uses just the basic cache API, which won't change. However, with certain consistency fixes ([1] [2] and maybe more) the Interceptor API is used (as the basic invalidation mode cannot make 2LC consistent on it own), which is about to change in Infinispan 8.0. I will take care of updating hibernate-infinispan module to 8.0 when it will be out. Since that would require only changes internal to that module, I hope this upgrade can be scoped to a micro-release. Radim [1] https://hibernate.atlassian.net/browse/HHH-9868 [2] https://hibernate.atlassian.net/browse/HHH-9881 On 07/14/2015 02:34 PM, Scott Marlow wrote: > Hi, > > I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. > If that happens, how does that impact Hibernate ORM 5.0 which currently > integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any > changes to integrate with Infinispan 8.0? 
> > Thanks, > Scott > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From dereed at redhat.com Tue Jul 14 11:45:16 2015 From: dereed at redhat.com (Dennis Reed) Date: Tue, 14 Jul 2015 11:45:16 -0400 Subject: [infinispan-dev] Strict Expiration In-Reply-To: <55A525FE.80500@redhat.com> References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> Message-ID: <55A52E8C.3060906@redhat.com> On 07/14/2015 11:08 AM, Radim Vansa wrote: > On 07/14/2015 04:19 PM, William Burns wrote: >> >> >> On Tue, Jul 14, 2015 at 9:37 AM William Burns > > wrote: >> >> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei >> > wrote: >> >> Processing expiration only on the reaper thread sounds nice, but I >> have one reservation: processing 1 million entries to see that >> 1 of >> them is expired is a lot of work, and in the general case we >> will not >> be able to ensure an expiration precision of less than 1 >> minute (maybe >> more, with a huge SingleFileStore attached). >> >> >> This isn't much different then before. The only difference is >> that if a user touched a value after it expired it wouldn't show >> up (which is unlikely with maxIdle especially). >> >> >> What happens to users who need better precision? In >> particular, I know >> some JCache tests were failing because HotRod was only supporting >> 1-second resolution instead of the 1-millisecond resolution >> they were >> expecting. >> >> >> JCache is an interesting piece. The thing about JCache is that >> the spec is only defined for local caches. However I wouldn't >> want to muddy up the waters in regards to it behaving differently >> for local/remote. In the JCache scenario we could add an >> interceptor to prevent it returning such values (we do something >> similar already for events). JCache behavior vs ISPN behavior >> seems a bit easier to differentiate. But like you are getting at, >> either way is not very appealing. >> >> >> >> I'm even less convinced about the need to guarantee that a >> clustered >> expiration listener will only be triggered once, and that the >> entry >> must be null everywhere after that listener was invoked. >> What's the >> use case? >> >> >> Maybe Tristan would know more to answer. To be honest this work >> seems fruitless unless we know what our end users want here. >> Spending time on something for it to thrown out is never fun :( >> >> And the more I thought about this the more I question the validity >> of maxIdle even. It seems like a very poor way to prevent memory >> exhaustion, which eviction does in a much better way and has much >> more flexible algorithms. Does anyone know what maxIdle would be >> used for that wouldn't be covered by eviction? The only thing I >> can think of is cleaning up the cache store as well. >> >> >> Actually I guess for session/authentication related information this >> would be important. However maxIdle isn't really as usable in that >> case since most likely you would have a sticky session to go back to >> that node which means you would never refresh the last used date on >> the copies (current implementation). Without cluster expiration you >> could lose that session information on a failover very easily. > > I would say that maxIdle can be used as for memory management as kind of > WeakHashMap - e.g. 
in 2LC the maxIdle is used to store some record for a > short while (regular transaction lifespan ~ seconds to minutes), and > regularly the record is removed. However, to make sure that we don't > leak records in this cache (if something goes wrong and the remove does > not occur), it is removed. Note that just relying on maxIdle doesn't guarantee you won't leak records in this use case (specifically with the way the current hibernate-infinispan 2LC implementation uses it). Hibernate-infinispan adds entries to its own Map stored in Infinispan, and expects maxIdle to remove the map if it skips a remove. But in a current case, we found that due to frequent accesses to that same map the entries never idle out and it ends up in OOME). -Dennis > I can guess how long the transaction takes place, but not how many > parallel transactions there are. With eviction algorithms (where I am > not sure about the exact guarantees) I can set the cache to not hold > more than N entries, but I can't know for sure that my record does not > suddenly get evicted after shorter period, possibly causing some > inconsistency. > So this is similar to WeakHashMap by removing the key "when it can't be > used anymore" because I know that the transaction will finish before the > deadline. I don't care about the exact size, I don't want to tune that, > I just don't want to leak. > > From my POV the non-strict maxIdle and strict expiration would be a > nice compromise. > > Radim > >> >> Note that this would make the reaper thread less efficient: with >> numOwners=2 (best case), half of the entries that the reaper >> touches >> cannot be expired, because the node isn't the primary node. And to >> make matters worse, the same reaper thread would have to perform a >> (synchronous?) RPC for each entry to ensure it expires everywhere. >> >> >> I have debated about this, it could something like a sync >> removeAll which has a special marker to tell it is due to >> expiration (which would raise listeners there), while also sending >> a cluster expiration event to other non owners. >> >> >> For maxIdle I'd like to know more information about how >> exactly the >> owners would coordinate to expire an entry. I'm pretty sure we >> cannot >> avoid ignoring some reads (expiring an entry immediately after >> it was >> read), and ensuring that we don't accidentally extend an >> entry's life >> (like the current code does, when we transfer an entry to a >> new owner) >> also sounds problematic. >> >> >> For lifespan it is simple, the primary owner just expires it when >> it expires there. There is no coordination needed in this case it >> just sends the expired remove to owners etc. >> >> Max idle is more complicated as we all know. The primary owner >> would send a request for the last used time for a given key or set >> of keys. Then the owner would take those times and check for a >> new access it isn't aware of. If there isn't then it would send a >> remove command for the key(s). If there is a new access the owner >> would instead send the last used time to all of the owners. The >> expiration obviously would have a window that if a read occurred >> after sending a response that could be ignored. This could be >> resolved by using some sort of 2PC and blocking reads during that >> period but I would say it isn't worth it. >> >> The issue with transferring to a new node refreshing the last >> update/lifespan seems like just a bug we need to fix irrespective >> of this issue IMO. 
>> >> >> I'm not saying expiring entries on each node independently is >> perfect, >> far from it. But I wouldn't want us to provide new guarantees that >> could hurt performance without a really good use case. >> >> >> I would guess that user perceived performance should be a little >> faster with this. But this also depends on an alternative that we >> decided on :) >> >> Also the expiration thread pool is set to min priority atm so it >> may delay removal of said objects but hopefully (if the jvm >> supports) it wouldn't overrun a CPU while processing unless it has >> availability. >> >> >> Cheers >> Dan >> >> >> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant >> > wrote: >> > After re-reading the whole original thread, I agree with the >> proposal >> > with two caveats: >> > >> > - ensure that we don't break JCache compatibility >> > - ensure that we document this properly >> > >> > Tristan >> > >> > On 13/07/2015 18:41, Sanne Grinovero wrote: >> >> +1 >> >> You had me convinced at the first line, although "A lot of >> code can now >> >> be removed and made simpler" makes it look extremely nice. >> >> >> >> On 13 Jul 2015 18:14, "William Burns" > >> >> > >> wrote: >> >> >> >> This is a necro of [1]. >> >> >> >> With Infinispan 8.0 we are adding in clustered >> expiration. That >> >> includes an expiration event raised that is clustered >> as well. >> >> Unfortunately expiration events currently occur >> multiple times (if >> >> numOwners > 1) at different times across nodes in a >> cluster. This >> >> makes coordinating a single cluster expiration event >> quite difficult. >> >> >> >> To work around this I am proposing that the expiration >> of an event >> >> is done solely by the owner of the given key that is >> now expired. >> >> This would fix the issue of having multiple events and >> the event can >> >> be raised while holding the lock for the given key so >> concurrent >> >> modifications would not be an issue. >> >> >> >> The problem arises when you have other nodes that have >> expiration >> >> set but expire at different times. Max idle is the >> biggest offender >> >> with this as a read on an owner only refreshes the >> owners timestamp, >> >> meaning other owners would not be updated and expire >> preemptively. >> >> To have expiration work properly in this case you would >> need >> >> coordination between the owners to see if anyone has a >> higher >> >> value. This requires blocking and would have to be >> done while >> >> accessing a key that is expired to be sure if >> expiration happened or >> >> not. >> >> >> >> The linked dev listing proposed instead to only expire >> an entry by >> >> the reaper thread and not on access. In this case a >> read will >> >> return a non null value until it is fully expired, >> increasing hit >> >> ratios possibly. >> >> >> >> Their are quire a bit of real benefits for this: >> >> >> >> 1. Cluster cache reads would be much simpler and >> wouldn't have to >> >> block to verify the object exists or not since this >> would only be >> >> done by the reaper thread (note this would have only >> happened if the >> >> entry was expired locally). An access would just >> return the value >> >> immediately. >> >> 2. Each node only expires entries it owns in the reaper >> thread >> >> reducing how many entries they must check or remove. >> This also >> >> provides a single point where events would be raised as >> we need. >> >> 3. A lot of code can now be removed and made simpler as >> it no longer >> >> has to check for expiration. 
The expiration check >> would only be >> >> done in 1 place, the expiration reaper thread. >> >> >> >> The main issue with this proposal is as the other >> listing mentions >> >> is if user code expects the value to be gone after >> expiration for >> >> correctness. I would say this use case is not as >> compelling for >> >> maxIdle, especially since we never supported it >> properly. And in >> >> the case of lifespan the user could very easily store >> the expiration >> >> time in the object that they can check after a get as >> pointed out in >> >> the other thread. >> >> >> >> [1] >> >> >> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org >> >> > > >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org >> >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> > >> > -- >> > Tristan Tarrant >> > Infinispan Lead >> > JBoss, a division of Red Hat >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >

From ttarrant at redhat.com Tue Jul 14 11:58:36 2015
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Tue, 14 Jul 2015 16:58:36 +0100
Subject: [infinispan-dev] Strict Expiration
In-Reply-To:
References: <55A402A1.4010004@redhat.com>
Message-ID: <55A531AC.7020607@redhat.com>

On 14/07/2015 14:37, William Burns wrote:
> I'm even less convinced about the need to guarantee that a clustered
> expiration listener will only be triggered once, and that the entry
> must be null everywhere after that listener was invoked. What's the
> use case?
>
> Maybe Tristan would know more to answer. To be honest this work seems
> fruitless unless we know what our end users want here. Spending time on
> something for it to be thrown out is never fun :(

I see two use-cases related to taking action on expiration:

1. do something with the data before it definitely is removed. This could
be something like archiving to secondary storage. This particular case
would also require still having the value to pass to the event handler.
I don't think we need special guarantees about "only once" but it would be
a nice-to-have. Also null-after-the-fact wouldn't be a requirement.

2. pre-emptive memoization, i.e. a way to implement refreshOnExpiration
akin to computeIfAbsent. "Only once" would definitely be preferable here.
"Null-after-the-fact" is also irrelevant since the code would probably
end up replacing the existing value.
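As a rough illustration of use-case 1, assuming the clustered expiration
work surfaces as a @CacheEntryExpired event on clustered listeners in 8.0
(archiveToSecondaryStorage is just a made-up placeholder), something along
these lines should become possible:

    // Sketch only: relies on the clustered expiration events being discussed for 8.0.
    import org.infinispan.notifications.Listener;
    import org.infinispan.notifications.cachelistener.annotation.CacheEntryExpired;
    import org.infinispan.notifications.cachelistener.event.CacheEntryExpiredEvent;

    @Listener(clustered = true)
    public class ArchivingExpirationListener {

        @CacheEntryExpired
        public void onExpired(CacheEntryExpiredEvent<String, byte[]> event) {
            // The value is still available at this point, so it can be copied
            // to secondary storage before it disappears from the grid.
            archiveToSecondaryStorage(event.getKey(), event.getValue());
        }

        private void archiveToSecondaryStorage(String key, byte[] value) {
            // Placeholder for the actual archiving logic (database, object store, ...).
        }
    }

A single clustered event per expired key is what would keep something like
this from archiving the same entry numOwners times.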
Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From rvansa at redhat.com Tue Jul 14 13:51:26 2015 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 14 Jul 2015 19:51:26 +0200 Subject: [infinispan-dev] Strict Expiration In-Reply-To: <55A52E8C.3060906@redhat.com> References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> Message-ID: <55A54C1E.3080703@redhat.com> Yes, I know about [1]. I've worked that around by storing timestamp in the entry as well and when a new record is added, the 'expired' invalidations are purged. But I can't purge that if I don't access it - Infinispan needs to handle that internally. Radim [1] https://hibernate.atlassian.net/browse/HHH-6219 On 07/14/2015 05:45 PM, Dennis Reed wrote: > On 07/14/2015 11:08 AM, Radim Vansa wrote: >> On 07/14/2015 04:19 PM, William Burns wrote: >>> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns >> > wrote: >>> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei >>> > wrote: >>> >>> Processing expiration only on the reaper thread sounds nice, but I >>> have one reservation: processing 1 million entries to see that >>> 1 of >>> them is expired is a lot of work, and in the general case we >>> will not >>> be able to ensure an expiration precision of less than 1 >>> minute (maybe >>> more, with a huge SingleFileStore attached). >>> >>> >>> This isn't much different then before. The only difference is >>> that if a user touched a value after it expired it wouldn't show >>> up (which is unlikely with maxIdle especially). >>> >>> >>> What happens to users who need better precision? In >>> particular, I know >>> some JCache tests were failing because HotRod was only supporting >>> 1-second resolution instead of the 1-millisecond resolution >>> they were >>> expecting. >>> >>> >>> JCache is an interesting piece. The thing about JCache is that >>> the spec is only defined for local caches. However I wouldn't >>> want to muddy up the waters in regards to it behaving differently >>> for local/remote. In the JCache scenario we could add an >>> interceptor to prevent it returning such values (we do something >>> similar already for events). JCache behavior vs ISPN behavior >>> seems a bit easier to differentiate. But like you are getting at, >>> either way is not very appealing. >>> >>> >>> >>> I'm even less convinced about the need to guarantee that a >>> clustered >>> expiration listener will only be triggered once, and that the >>> entry >>> must be null everywhere after that listener was invoked. >>> What's the >>> use case? >>> >>> >>> Maybe Tristan would know more to answer. To be honest this work >>> seems fruitless unless we know what our end users want here. >>> Spending time on something for it to thrown out is never fun :( >>> >>> And the more I thought about this the more I question the validity >>> of maxIdle even. It seems like a very poor way to prevent memory >>> exhaustion, which eviction does in a much better way and has much >>> more flexible algorithms. Does anyone know what maxIdle would be >>> used for that wouldn't be covered by eviction? The only thing I >>> can think of is cleaning up the cache store as well. >>> >>> >>> Actually I guess for session/authentication related information this >>> would be important. However maxIdle isn't really as usable in that >>> case since most likely you would have a sticky session to go back to >>> that node which means you would never refresh the last used date on >>> the copies (current implementation). 
Without cluster expiration you >>> could lose that session information on a failover very easily. >> I would say that maxIdle can be used as for memory management as kind of >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some record for a >> short while (regular transaction lifespan ~ seconds to minutes), and >> regularly the record is removed. However, to make sure that we don't >> leak records in this cache (if something goes wrong and the remove does >> not occur), it is removed. > Note that just relying on maxIdle doesn't guarantee you won't leak > records in this use case (specifically with the way the current > hibernate-infinispan 2LC implementation uses it). > > Hibernate-infinispan adds entries to its own Map stored in Infinispan, > and expects maxIdle to remove the map if it skips a remove. But in a > current case, we found that due to frequent accesses to that same map > the entries never idle out and it ends up in OOME). > > -Dennis > >> I can guess how long the transaction takes place, but not how many >> parallel transactions there are. With eviction algorithms (where I am >> not sure about the exact guarantees) I can set the cache to not hold >> more than N entries, but I can't know for sure that my record does not >> suddenly get evicted after shorter period, possibly causing some >> inconsistency. >> So this is similar to WeakHashMap by removing the key "when it can't be >> used anymore" because I know that the transaction will finish before the >> deadline. I don't care about the exact size, I don't want to tune that, >> I just don't want to leak. >> >> From my POV the non-strict maxIdle and strict expiration would be a >> nice compromise. >> >> Radim >> >>> Note that this would make the reaper thread less efficient: with >>> numOwners=2 (best case), half of the entries that the reaper >>> touches >>> cannot be expired, because the node isn't the primary node. And to >>> make matters worse, the same reaper thread would have to perform a >>> (synchronous?) RPC for each entry to ensure it expires everywhere. >>> >>> >>> I have debated about this, it could something like a sync >>> removeAll which has a special marker to tell it is due to >>> expiration (which would raise listeners there), while also sending >>> a cluster expiration event to other non owners. >>> >>> >>> For maxIdle I'd like to know more information about how >>> exactly the >>> owners would coordinate to expire an entry. I'm pretty sure we >>> cannot >>> avoid ignoring some reads (expiring an entry immediately after >>> it was >>> read), and ensuring that we don't accidentally extend an >>> entry's life >>> (like the current code does, when we transfer an entry to a >>> new owner) >>> also sounds problematic. >>> >>> >>> For lifespan it is simple, the primary owner just expires it when >>> it expires there. There is no coordination needed in this case it >>> just sends the expired remove to owners etc. >>> >>> Max idle is more complicated as we all know. The primary owner >>> would send a request for the last used time for a given key or set >>> of keys. Then the owner would take those times and check for a >>> new access it isn't aware of. If there isn't then it would send a >>> remove command for the key(s). If there is a new access the owner >>> would instead send the last used time to all of the owners. The >>> expiration obviously would have a window that if a read occurred >>> after sending a response that could be ignored. 
This could be >>> resolved by using some sort of 2PC and blocking reads during that >>> period but I would say it isn't worth it. >>> >>> The issue with transferring to a new node refreshing the last >>> update/lifespan seems like just a bug we need to fix irrespective >>> of this issue IMO. >>> >>> >>> I'm not saying expiring entries on each node independently is >>> perfect, >>> far from it. But I wouldn't want us to provide new guarantees that >>> could hurt performance without a really good use case. >>> >>> >>> I would guess that user perceived performance should be a little >>> faster with this. But this also depends on an alternative that we >>> decided on :) >>> >>> Also the expiration thread pool is set to min priority atm so it >>> may delay removal of said objects but hopefully (if the jvm >>> supports) it wouldn't overrun a CPU while processing unless it has >>> availability. >>> >>> >>> Cheers >>> Dan >>> >>> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant >>> > wrote: >>> > After re-reading the whole original thread, I agree with the >>> proposal >>> > with two caveats: >>> > >>> > - ensure that we don't break JCache compatibility >>> > - ensure that we document this properly >>> > >>> > Tristan >>> > >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: >>> >> +1 >>> >> You had me convinced at the first line, although "A lot of >>> code can now >>> >> be removed and made simpler" makes it look extremely nice. >>> >> >>> >> On 13 Jul 2015 18:14, "William Burns" >> >>> >> >> >> wrote: >>> >> >>> >> This is a necro of [1]. >>> >> >>> >> With Infinispan 8.0 we are adding in clustered >>> expiration. That >>> >> includes an expiration event raised that is clustered >>> as well. >>> >> Unfortunately expiration events currently occur >>> multiple times (if >>> >> numOwners > 1) at different times across nodes in a >>> cluster. This >>> >> makes coordinating a single cluster expiration event >>> quite difficult. >>> >> >>> >> To work around this I am proposing that the expiration >>> of an event >>> >> is done solely by the owner of the given key that is >>> now expired. >>> >> This would fix the issue of having multiple events and >>> the event can >>> >> be raised while holding the lock for the given key so >>> concurrent >>> >> modifications would not be an issue. >>> >> >>> >> The problem arises when you have other nodes that have >>> expiration >>> >> set but expire at different times. Max idle is the >>> biggest offender >>> >> with this as a read on an owner only refreshes the >>> owners timestamp, >>> >> meaning other owners would not be updated and expire >>> preemptively. >>> >> To have expiration work properly in this case you would >>> need >>> >> coordination between the owners to see if anyone has a >>> higher >>> >> value. This requires blocking and would have to be >>> done while >>> >> accessing a key that is expired to be sure if >>> expiration happened or >>> >> not. >>> >> >>> >> The linked dev listing proposed instead to only expire >>> an entry by >>> >> the reaper thread and not on access. In this case a >>> read will >>> >> return a non null value until it is fully expired, >>> increasing hit >>> >> ratios possibly. >>> >> >>> >> Their are quire a bit of real benefits for this: >>> >> >>> >> 1. Cluster cache reads would be much simpler and >>> wouldn't have to >>> >> block to verify the object exists or not since this >>> would only be >>> >> done by the reaper thread (note this would have only >>> happened if the >>> >> entry was expired locally). 
An access would just >>> return the value >>> >> immediately. >>> >> 2. Each node only expires entries it owns in the reaper >>> thread >>> >> reducing how many entries they must check or remove. >>> This also >>> >> provides a single point where events would be raised as >>> we need. >>> >> 3. A lot of code can now be removed and made simpler as >>> it no longer >>> >> has to check for expiration. The expiration check >>> would only be >>> >> done in 1 place, the expiration reaper thread. >>> >> >>> >> The main issue with this proposal is as the other >>> listing mentions >>> >> is if user code expects the value to be gone after >>> expiration for >>> >> correctness. I would say this use case is not as >>> compelling for >>> >> maxIdle, especially since we never supported it >>> properly. And in >>> >> the case of lifespan the user could very easily store >>> the expiration >>> >> time in the object that they can check after a get as >>> pointed out in >>> >> the other thread. >>> >> >>> >> [1] >>> >> >>> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >>> >> >>> >> _______________________________________________ >>> >> infinispan-dev mailing list >>> >> infinispan-dev at lists.jboss.org >>> >>> >> > >>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> >>> >> >>> >> >>> >> _______________________________________________ >>> >> infinispan-dev mailing list >>> >> infinispan-dev at lists.jboss.org >>> >>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> >>> > >>> > -- >>> > Tristan Tarrant >>> > Infinispan Lead >>> > JBoss, a division of Red Hat >>> > _______________________________________________ >>> > infinispan-dev mailing list >>> > infinispan-dev at lists.jboss.org >>> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From sanne at infinispan.org Tue Jul 14 17:27:25 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Tue, 14 Jul 2015 22:27:25 +0100 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55A527D5.8060606@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> Message-ID: Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 in a micro release of Hibernate.. that's against our conventions. The better plan would be to work towards a Hibernate 5.1 version for this, or make sure to talk with him in advance if that doesn't work out. On 14 July 2015 at 16:16, Radim Vansa wrote: > IIRC currently hibernate-infinispan module uses just the basic cache > API, which won't change. However, with certain consistency fixes ([1] > [2] and maybe more) the Interceptor API is used (as the basic > invalidation mode cannot make 2LC consistent on it own), which is about > to change in Infinispan 8.0. > I will take care of updating hibernate-infinispan module to 8.0 when it > will be out. 
Since that would require only changes internal to that > module, I hope this upgrade can be scoped to a micro-release. > > Radim > > [1] https://hibernate.atlassian.net/browse/HHH-9868 > [2] https://hibernate.atlassian.net/browse/HHH-9881 > > On 07/14/2015 02:34 PM, Scott Marlow wrote: >> Hi, >> >> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >> If that happens, how does that impact Hibernate ORM 5.0 which currently >> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >> changes to integrate with Infinispan 8.0? >> >> Thanks, >> Scott >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From smarlow at redhat.com Wed Jul 15 09:25:46 2015 From: smarlow at redhat.com (Scott Marlow) Date: Wed, 15 Jul 2015 09:25:46 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> Message-ID: <55A65F5A.7090904@redhat.com> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: " hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: error: cannot find symbol rpcManager.broadcastRpcCommand( cmd, isSync ); ^ symbol: method broadcastRpcCommand(EvictAllCommand,boolean) location: variable rpcManager of type RpcManager " Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? More inline below... On 07/14/2015 05:27 PM, Sanne Grinovero wrote: > Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 > in a micro release of Hibernate.. that's against our conventions. > The better plan would be to work towards a Hibernate 5.1 version for > this, or make sure to talk with him in advance if that doesn't work > out. The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, with that target in mind, we should figure out if we can have a Hibernate 5.x and Infinispan 8.x that work together. I don't particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, as long as someone is making sure Infinispan 8.x and ORM 5.0, work well together. IMO, we should either change the Infinispan 8 API to still work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. > > On 14 July 2015 at 16:16, Radim Vansa wrote: >> IIRC currently hibernate-infinispan module uses just the basic cache >> API, which won't change. However, with certain consistency fixes ([1] >> [2] and maybe more) the Interceptor API is used (as the basic >> invalidation mode cannot make 2LC consistent on it own), which is about >> to change in Infinispan 8.0. >> I will take care of updating hibernate-infinispan module to 8.0 when it >> will be out. Since that would require only changes internal to that >> module, I hope this upgrade can be scoped to a micro-release. >> >> Radim >> >> [1] https://hibernate.atlassian.net/browse/HHH-9868 >> [2] https://hibernate.atlassian.net/browse/HHH-9881 >> >> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>> Hi, >>> >>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>> integrates with Infinispan 7.2.1.Final? 
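For what it's worth, one possible shape of the Caches.java fix, assuming the
8.0 RpcManager still exposes invokeRemotely(Collection, ReplicableCommand,
RpcOptions) and getDefaultRpcOptions(boolean) - I haven't verified this
against Beta1, so treat it as a sketch:

    // Possible replacement for the removed broadcastRpcCommand( cmd, isSync ) call,
    // assuming invokeRemotely(...) and getDefaultRpcOptions(boolean) survive in 8.0.
    RpcManager rpcManager = cache.getRpcManager();
    rpcManager.invokeRemotely(
            null,                                        // null recipients = broadcast to all members
            cmd,                                         // the EvictAllCommand from the old call
            rpcManager.getDefaultRpcOptions( isSync ) ); // keeps the old sync/async behaviour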
Does Hibernate ORM 5.0 need any >>> changes to integrate with Infinispan 8.0? >>> >>> Thanks, >>> Scott >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From rvansa at redhat.com Thu Jul 16 04:34:44 2015 From: rvansa at redhat.com (Radim Vansa) Date: Thu, 16 Jul 2015 10:34:44 +0200 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55A65F5A.7090904@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> Message-ID: <55A76CA4.3000608@redhat.com> On 07/15/2015 03:25 PM, Scott Marlow wrote: > Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: > > " > hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: > error: cannot find symbol > rpcManager.broadcastRpcCommand( cmd, isSync ); > ^ > symbol: method broadcastRpcCommand(EvictAllCommand,boolean) > location: variable rpcManager of type RpcManager > " > > Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? This should be fixed in ORM 5.x, but should I do a PR that uses Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get out before ORM 5.0.0.Final? > > More inline below... > > On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >> in a micro release of Hibernate.. that's against our conventions. >> The better plan would be to work towards a Hibernate 5.1 version for >> this, or make sure to talk with him in advance if that doesn't work >> out. > The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, > with that target in mind, we should figure out if we can have a > Hibernate 5.x and Infinispan 8.x that work together. I don't > particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, > as long as someone is making sure Infinispan 8.x and ORM 5.0, work well > together. IMO, we should either change the Infinispan 8 API to still > work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. See above: I would 'make sure' that they work together, but it's IMO not possible until Infinispan 8.0 is released. Radim > >> On 14 July 2015 at 16:16, Radim Vansa wrote: >>> IIRC currently hibernate-infinispan module uses just the basic cache >>> API, which won't change. However, with certain consistency fixes ([1] >>> [2] and maybe more) the Interceptor API is used (as the basic >>> invalidation mode cannot make 2LC consistent on it own), which is about >>> to change in Infinispan 8.0. >>> I will take care of updating hibernate-infinispan module to 8.0 when it >>> will be out. Since that would require only changes internal to that >>> module, I hope this upgrade can be scoped to a micro-release. 
>>> >>> Radim >>> >>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>> >>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>> Hi, >>>> >>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>> changes to integrate with Infinispan 8.0? >>>> >>>> Thanks, >>>> Scott >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> -- >>> Radim Vansa >>> JBoss Performance Team >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

From sanne at infinispan.org Thu Jul 16 09:32:11 2015
From: sanne at infinispan.org (Sanne Grinovero)
Date: Thu, 16 Jul 2015 14:32:11 +0100
Subject: [infinispan-dev] Shared vs Non-Shared CacheStores
Message-ID:

I would like to propose a clear cut separation between our shared and
non-shared CacheStores, in all terms such as:

- Configuration options
- Integration contracts (Split the CacheStore SPI)
- Implementations
- Terminology, to avoid any further confusion around valid configurations
  and sensible architectures

We have loads of examples of users who get into trouble by configuring one
incorrectly, but there are also plenty of efficiency improvements we could
take advantage of by clearly splitting the integration points and the
implementations into two categories.

Not least, it's a very common and dangerous pitfall to assume that
Infinispan is able to restore a consistent state after having stopped a
DIST cluster which passivated into non-shared CacheStore instances, or even
REPL clusters when they don't shut down all at the same exact time (and
"exact same time" is a strange concept at least..). We need to clarify the
different options, tradeoffs and their consequences.. to users and
ourselves, as a clearly defined use case will avoid bugs and simplify
implementations.

# The purpose of each

I think that people should use a non-shared (local?) CacheStore for the
sole purpose of expanding the storage capacity of each single node.. be it
because you don't have enough memory at all, or be it because you prefer
some extra safety margin because either your estimates are complex, or
maybe because we live in a real world where the hashing function might not
be perfect in practice. I hope we all agree that Infinispan should be able
to handle such situations with, at worst, graceful performance degradation,
rather than complaining by sending OOMs to the admin and setting the
service on strike.

A Shared CacheStore is useful for very different purposes; primarily to
implement a Cache on some other service - for example your (single, shared)
RDBMS, a slow (or expensive) webservice your organization has to call
frequently, etc..
Or it's useful even as a write-through cache on a similar service, maybe internal but not able to handle the high variation of load spikes which Infinsipan can handle better. Finally, a great use case is to have a consistent backup of all your data-grid content, possibly in some "reference" form such as JPA mapped entities. # Benefits of a Non-Shared A non-shared CacheStore implementor should be able to take advantage of *its purpose*, among the big ones I see: - Exclusive usage -> locking of a specific entry can be handled at datacontainer level, can simplify quite some internal code. - Reliability -> since a clustered node needs to wipe its state at reboot (after a crash), it's much simpler to code any such CacheStore to avoid any form of disk synch or persistance guarantees. - Encoding format -> this can be controlled entirely by Infinispan, and no need to take factors like rolling upgrade compatible encodings in mind. JBoss Marshalling would be good enough, or some implementations might not need to serialize at all. Our non-shared CacheStore implentation(s) could take advantage of lower level more complex code optimisations and interfaces, as users would rarely want to customize one of these, while the use case of mapping data to a shared service needs a more user friendly SPI so to keep it simple to plug in custom stores: custom data formats, custom connectors, get some help in implementing concurrency correctly. Proper Transaction integration for the CacheStore has been on our wishlist for some time too, I suspect that accepting that we have been mixing up two different things under a same name so far, would make it simpler to implement further improvements such as transactions: the way to do such a thing is very different in each of these use cases, so it would help at least to implement it on a subset first, or maybe only if it turns out there's no need for such things in the context of the local-only-dedicated "swapfile". # Mixed types should be killed I'm aware that some of our current implementations _could_ work both as shared or non-shared, for example the JDBC or JPACacheStore or the Remote Cachestore.. but in most cases it doesn't make much sense. Why would you ever want to use the JPACacheStore if not to share data with a _shared_ database? We should take such options away, and by doing so focus on the use cases which actually matter and simplify the implementations and improve the configuration validations. If ever a compelling storage technology is identified which we'd like to offer as an option for both shared or non-shared, I would still recommend to make two different implementations, as there certainly are different requirements and assumptions when coding such a thing. Not least, I would very like to see a default local CacheStore: picking one for local "emergency swapping" should be a no-brainer for users; we could setup one by default and not bother newcomers with complex choices. If we simplify the requirement of such a thing, it should be easy to write one on standard Java NIO2 APIs and get rid of the complexities of maintaining the native integration with things like LevelDB, not least the inefficiency of Java to make such native calls. Then as a second step, we should attack the other use case: backups; from a *purpose driven perspective* I'd then see us revive the Cassandra integration; obviously as a shared-only option. 
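To make the SPI split more concrete, here is a very rough sketch of the kind of separation I have in mind, building on the existing org.infinispan.persistence.spi types; the interface names and methods below are placeholders for discussion only, not a proposal for the final API:

  // Placeholder names - the point is only to illustrate the split.

  /**
   * A store owned exclusively by a single node: Infinispan controls the
   * encoding, persistence guarantees can be relaxed, and the content is
   * wiped when the node restarts after a crash.
   */
  public interface LocalStore<K, V> extends ExternalStore<K, V> {
     // invoked at startup after a non-clean shutdown
     void wipe();
  }

  /**
   * A store shared by all nodes: must use a stable, rolling-upgrade
   * friendly encoding and tolerate concurrent writers from the whole cluster.
   */
  public interface SharedStore<K, V> extends ExternalStore<K, V> {
  }

A configuration marked as shared would then only accept SharedStore implementations (and vice versa), which is also the natural place to hook the validation rejecting the invalid combinations mentioned above.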
Cheers, Sanne From smarlow at redhat.com Thu Jul 16 10:00:23 2015 From: smarlow at redhat.com (Scott Marlow) Date: Thu, 16 Jul 2015 10:00:23 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55A76CA4.3000608@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> Message-ID: <55A7B8F7.4010800@redhat.com> On 07/16/2015 04:34 AM, Radim Vansa wrote: > > On 07/15/2015 03:25 PM, Scott Marlow wrote: >> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >> >> " >> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >> error: cannot find symbol >> rpcManager.broadcastRpcCommand( cmd, isSync ); >> ^ >> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >> location: variable rpcManager of type RpcManager >> " >> >> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? > > This should be fixed in ORM 5.x, but should I do a PR that uses > Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get > out before ORM 5.0.0.Final? Could Infinispan 8.0 support the older RpcManager.broadcastRpcCommand(EvictAllCommand,boolean)? Adding Steve, so he can respond about the idea of bringing Infinispan 8.0.0.Beta1 into ORM 5.0. > >> >> More inline below... >> >> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>> in a micro release of Hibernate.. that's against our conventions. >>> The better plan would be to work towards a Hibernate 5.1 version for >>> this, or make sure to talk with him in advance if that doesn't work >>> out. >> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >> with that target in mind, we should figure out if we can have a >> Hibernate 5.x and Infinispan 8.x that work together. I don't >> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >> together. IMO, we should either change the Infinispan 8 API to still >> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. > > See above: I would 'make sure' that they work together, but it's IMO not > possible until Infinispan 8.0 is released. Just don't break Hibernate ORM and this won't be an issue ;) > > Radim > >> >>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>> API, which won't change. However, with certain consistency fixes ([1] >>>> [2] and maybe more) the Interceptor API is used (as the basic >>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>> to change in Infinispan 8.0. >>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>> will be out. Since that would require only changes internal to that >>>> module, I hope this upgrade can be scoped to a micro-release. >>>> >>>> Radim >>>> >>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>> >>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>> Hi, >>>>> >>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>> changes to integrate with Infinispan 8.0? 
>>>>> >>>>> Thanks, >>>>> Scott >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> -- >>>> Radim Vansa >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > From steve at hibernate.org Thu Jul 16 10:20:10 2015 From: steve at hibernate.org (Steve Ebersole) Date: Thu, 16 Jul 2015 14:20:10 +0000 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55A7B8F7.4010800@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55A7B8F7.4010800@redhat.com> Message-ID: Scott, I don't think I am subscribed to the Infinispan list so I doubt my reply goes through... It's already been stated, but for the record, my take is that it is not proper to have pre-production releases (Beta, etc) be a dependency in a Final release. If we have to upgrade Infinispan to 8 in Hibernate to fix this, that would have to be done in 5.1 or later at this point. On Thu, Jul 16, 2015, 9:00 AM Scott Marlow wrote: > > > On 07/16/2015 04:34 AM, Radim Vansa wrote: > > > > On 07/15/2015 03:25 PM, Scott Marlow wrote: > >> Looks like Hibernate 5.0 cannot be compiled against Infinispan > 8.0.0.Beta1: > >> > >> " > >> > hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: > >> error: cannot find symbol > >> rpcManager.broadcastRpcCommand( cmd, isSync > ); > >> ^ > >> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) > >> location: variable rpcManager of type RpcManager > >> " > >> > >> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? > > > > This should be fixed in ORM 5.x, but should I do a PR that uses > > Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get > > out before ORM 5.0.0.Final? > > Could Infinispan 8.0 support the older > RpcManager.broadcastRpcCommand(EvictAllCommand,boolean)? > > Adding Steve, so he can respond about the idea of bringing Infinispan > 8.0.0.Beta1 into ORM 5.0. > > > > >> > >> More inline below... > >> > >> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: > >>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 > >>> in a micro release of Hibernate.. that's against our conventions. > >>> The better plan would be to work towards a Hibernate 5.1 version for > >>> this, or make sure to talk with him in advance if that doesn't work > >>> out. > >> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, > >> with that target in mind, we should figure out if we can have a > >> Hibernate 5.x and Infinispan 8.x that work together. I don't > >> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, > >> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well > >> together. 
IMO, we should either change the Infinispan 8 API to still > >> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. > > > > See above: I would 'make sure' that they work together, but it's IMO not > > possible until Infinispan 8.0 is released. > > Just don't break Hibernate ORM and this won't be an issue ;) > > > > > Radim > > > >> > >>> On 14 July 2015 at 16:16, Radim Vansa wrote: > >>>> IIRC currently hibernate-infinispan module uses just the basic cache > >>>> API, which won't change. However, with certain consistency fixes ([1] > >>>> [2] and maybe more) the Interceptor API is used (as the basic > >>>> invalidation mode cannot make 2LC consistent on it own), which is > about > >>>> to change in Infinispan 8.0. > >>>> I will take care of updating hibernate-infinispan module to 8.0 when > it > >>>> will be out. Since that would require only changes internal to that > >>>> module, I hope this upgrade can be scoped to a micro-release. > >>>> > >>>> Radim > >>>> > >>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 > >>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 > >>>> > >>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: > >>>>> Hi, > >>>>> > >>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. > >>>>> If that happens, how does that impact Hibernate ORM 5.0 which > currently > >>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need > any > >>>>> changes to integrate with Infinispan 8.0? > >>>>> > >>>>> Thanks, > >>>>> Scott > >>>>> _______________________________________________ > >>>>> infinispan-dev mailing list > >>>>> infinispan-dev at lists.jboss.org > >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>> > >>>> -- > >>>> Radim Vansa > >>>> JBoss Performance Team > >>>> > >>>> _______________________________________________ > >>>> infinispan-dev mailing list > >>>> infinispan-dev at lists.jboss.org > >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev at lists.jboss.org > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150716/eab959ff/attachment-0001.html From smarlow at redhat.com Thu Jul 16 10:25:08 2015 From: smarlow at redhat.com (Scott Marlow) Date: Thu, 16 Jul 2015 10:25:08 -0400 Subject: [infinispan-dev] Fwd: Re: Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: Message-ID: <55A7BEC4.4070001@redhat.com> Thanks Steve, I am forwarding your answer. -------- Forwarded Message -------- Subject: Re: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... Date: Thu, 16 Jul 2015 14:20:10 +0000 From: Steve Ebersole To: Scott Marlow , infinispan -Dev List Scott, I don't think I am subscribed to the Infinispan list so I doubt my reply goes through... It's already been stated, but for the record, my take is that it is not proper to have pre-production releases (Beta, etc) be a dependency in a Final release. If we have to upgrade Infinispan to 8 in Hibernate to fix this, that would have to be done in 5.1 or later at this point. 
On Thu, Jul 16, 2015, 9:00 AM Scott Marlow > wrote: On 07/16/2015 04:34 AM, Radim Vansa wrote: > > On 07/15/2015 03:25 PM, Scott Marlow wrote: >> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >> >> " >> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >> error: cannot find symbol >> rpcManager.broadcastRpcCommand( cmd, isSync ); >> ^ >> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >> location: variable rpcManager of type RpcManager >> " >> >> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? > > This should be fixed in ORM 5.x, but should I do a PR that uses > Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get > out before ORM 5.0.0.Final? Could Infinispan 8.0 support the older RpcManager.broadcastRpcCommand(EvictAllCommand,boolean)? Adding Steve, so he can respond about the idea of bringing Infinispan 8.0.0.Beta1 into ORM 5.0. > >> >> More inline below... >> >> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>> in a micro release of Hibernate.. that's against our conventions. >>> The better plan would be to work towards a Hibernate 5.1 version for >>> this, or make sure to talk with him in advance if that doesn't work >>> out. >> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >> with that target in mind, we should figure out if we can have a >> Hibernate 5.x and Infinispan 8.x that work together. I don't >> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >> together. IMO, we should either change the Infinispan 8 API to still >> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. > > See above: I would 'make sure' that they work together, but it's IMO not > possible until Infinispan 8.0 is released. Just don't break Hibernate ORM and this won't be an issue ;) > > Radim > >> >>> On 14 July 2015 at 16:16, Radim Vansa > wrote: >>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>> API, which won't change. However, with certain consistency fixes ([1] >>>> [2] and maybe more) the Interceptor API is used (as the basic >>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>> to change in Infinispan 8.0. >>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>> will be out. Since that would require only changes internal to that >>>> module, I hope this upgrade can be scoped to a micro-release. >>>> >>>> Radim >>>> >>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>> >>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>> Hi, >>>>> >>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>> changes to integrate with Infinispan 8.0? 
>>>>> >>>>> Thanks, >>>>> Scott >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> -- >>>> Radim Vansa > >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > From wfink at redhat.com Fri Jul 17 07:48:44 2015 From: wfink at redhat.com (Wolf-Dieter Fink) Date: Fri, 17 Jul 2015 13:48:44 +0200 Subject: [infinispan-dev] Shared vs Non-Shared CacheStores In-Reply-To: References: Message-ID: <55A8EB9C.8020104@redhat.com> +1000 On 16/07/15 15:32, Sanne Grinovero wrote: > I would like to propose a clear cut separation between our shared and > non-shared CacheStores, > in all terms such as: > - Configuration options > - Integration contracts (Split the CacheStore SPI) > - Implementations > - Terminology, to avoid any further confusion around valid > configurations and sensible architectures > > We have loads of examples of users who get in trouble by configuring > one incorrectly, but also there are plenty of efficiency improvements > we could take advantage of by clearly splitting the integration points > and the implementations in two categories. > > Not least, it's a very common and dangerous pitfall to assume that > Infinispan is able to restore a consistent state after having stopped > a DIST cluster which passivated into non-shared CacheStore instances, > or even REPL clusters when they don't shutdown all at the same exact > time (and "exact same time" is a strange concept at least..). We need > to clarify the different options, tradeoffs and their consequences.. > to users and ourselves, as a clearly defined use case will avoid bugs > and simplify implementations. > > # The purpose of each > I think that people should use a non-shared (local?) CacheStore for > the sole purpose of expanding to storage capacity of each single > node.. be it because you don't have enough memory at all, or be it > because you prefer some extra safety margin because either your > estimates are complex, or maybe because we live in a real world were > the hashing function might not be perfect in practice. I hope we all > agree that Infinispan should be able to take such situations with at > worst a graceful performance degradatation, rather than complain > sending OOMs to the admin and setting the service on strike. > > A Shared CacheStore is useful for very different purposes; primarily > to implement a Cache on some other service - for example your (single, > shared) RDBMs, a slow (or expensive) webservice your organization has > to call frequently, etc.. Or it's useful even as a write-through cache > on a similar service, maybe internal but not able to handle the high > variation of load spikes which Infinsipan can handle better. > Finally, a great use case is to have a consistent backup of all your > data-grid content, possibly in some "reference" form such as JPA > mapped entities. 
> > # Benefits of a Non-Shared > A non-shared CacheStore implementor should be able to take advantage > of *its purpose*, among the big ones I see: > - Exclusive usage -> locking of a specific entry can be handled at > datacontainer level, can simplify quite some internal code. > - Reliability -> since a clustered node needs to wipe its state at > reboot (after a crash), it's much simpler to code any such CacheStore > to avoid any form of disk synch or persistance guarantees. > - Encoding format -> this can be controlled entirely by Infinispan, > and no need to take factors like rolling upgrade compatible encodings > in mind. JBoss Marshalling would be good enough, or some > implementations might not need to serialize at all. > > Our non-shared CacheStore implentation(s) could take advantage of > lower level more complex code optimisations and interfaces, as users > would rarely want to customize one of these, while the use case of > mapping data to a shared service needs a more user friendly SPI so to > keep it simple to plug in custom stores: custom data formats, custom > connectors, get some help in implementing concurrency correctly. > Proper Transaction integration for the CacheStore has been on our > wishlist for some time too, I suspect that accepting that we have been > mixing up two different things under a same name so far, would make it > simpler to implement further improvements such as transactions: the > way to do such a thing is very different in each of these use cases, > so it would help at least to implement it on a subset first, or maybe > only if it turns out there's no need for such things in the context of > the local-only-dedicated "swapfile". > > # Mixed types should be killed > I'm aware that some of our current implementations _could_ work both as > shared or non-shared, for example the JDBC or JPACacheStore or the > Remote Cachestore.. but in most cases it doesn't make much sense. Why > would you ever want to use the JPACacheStore if not to share data with > a _shared_ database? > > We should take such options away, and by doing so focus on the use > cases which actually matter and simplify the implementations and > improve the configuration validations. > > If ever a compelling storage technology is identified which we'd like to > offer as an option for both shared or non-shared, I would still > recommend to make two different implementations, as there certainly are > different requirements and assumptions when coding such a thing. > > Not least, I would very like to see a default local CacheStore: > picking one for local "emergency swapping" should be a no-brainer for > users; we could setup one by default and not bother newcomers with > complex choices. > > If we simplify the requirement of such a thing, it should be easy to > write one on standard Java NIO2 APIs and get rid of the complexities of > maintaining the native integration with things like LevelDB, not least > the inefficiency of Java to make such native calls. > > Then as a second step, we should attack the other use case: backups; > from a *purpose driven perspective* I'd then see us revive the Cassandra > integration; obviously as a shared-only option. 
> > Cheers, > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev
From rory.odonnell at oracle.com Fri Jul 17 08:51:51 2015 From: rory.odonnell at oracle.com (Rory O'Donnell) Date: Fri, 17 Jul 2015 13:51:51 +0100 Subject: [infinispan-dev] Early Access builds for JDK 8u60 b24 and JDK 9 b72 are available on java.net Message-ID: <55A8FA67.40409@oracle.com> Hi Galder, Early Access build for JDK 8u60 b24 is available on java.net, a summary of changes is listed here. As we enter the later phases of development for JDK 8u60, please log any show stoppers as soon as possible. Early Access build for JDK 9 b72 is available on java.net, a summary of changes is listed here. Note: b72 includes 8081708 - Change default GC for server configurations to G1. As described in JEP 248, switching to a low-pause collector in JDK 9 such as G1 should provide a better overall experience, for most users, than a throughput-oriented collector such as the Parallel GC, which is currently the default. A call for feedback and data points regarding this change went out on the hotspot-dev mailing list, where such input should be provided. Rgds,Rory -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA, Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150717/ef55ae9d/attachment.html
From ancosen1985 at yahoo.com Sun Jul 19 05:45:47 2015 From: ancosen1985 at yahoo.com (Andrea Cosentino) Date: Sun, 19 Jul 2015 09:45:47 +0000 (UTC) Subject: [infinispan-dev] Infinispan-Cachestore Message-ID: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> Hi, Yesterday I've opened this ticket on GitHub: https://github.com/infinispan/infinispan-cachestore-cassandra/issues/3 I'm interested in the different cachestore projects (mongodb, leveldb and so on) and I'd like to see them revived by contributing to those projects. Andrea -- Andrea Cosentino ---------------------------------- Apache Camel Committer Email: ancosen1985 at yahoo.com Twitter: @oscerd2 Github: oscerd -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150719/ac1a3ccc/attachment.html
From slaskawi at redhat.com Mon Jul 20 02:22:40 2015 From: slaskawi at redhat.com (Sebastian Łaskawiec) Date: Mon, 20 Jul 2015 08:22:40 +0200 Subject: [infinispan-dev] Infinispan-Cachestore In-Reply-To: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> References: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> Message-ID: <55AC93B0.2000306@redhat.com> Hey! +1000, that's a great idea!
I think the first step is to update them and create CI jobs for them (at least one for Pull Requests and one for SNAPSHOTs).
After that we could modify the existing cache stores (at least Cassandra and Mongo) to depend on Infinispan core directly and not on the parent. This would allow us to use version ranges.
Does it make sense to put our Cache Stores into the Infinispan release cycle? This way the Cache Store version would be the same as the Infinispan version.
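For example (just to illustrate the idea - the exact range is open for discussion), a store POM depending directly on the core artifact could declare:

  <dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-core</artifactId>
    <version>[8.0.0.Alpha1,9.0.0)</version>
  </dependency>

so an independently released cache store would keep resolving against any 8.x core without needing a re-release.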
Thanks Sebastian On 07/19/2015 11:45 AM, Andrea Cosentino wrote: > Hi, > > Yesterday I've opened this ticket on github > > https://github.com/infinispan/infinispan-cachestore-cassandra/issues/3 > I'm interested in the different cachestore projects (mongodb, leveldb > and so on) and I'd like to see it revived by contributing to those > projects. > > Andrea > > -- > Andrea Cosentino > ---------------------------------- > Apache Camel Committer > Email: ancosen1985 at yahoo.com > Twitter: @oscerd2 > Github: oscerd > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150720/20911cdd/attachment-0001.html From dan.berindei at gmail.com Mon Jul 20 03:52:07 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 20 Jul 2015 10:52:07 +0300 Subject: [infinispan-dev] Infinispan-Cachestore In-Reply-To: <55AC93B0.2000306@redhat.com> References: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> <55AC93B0.2000306@redhat.com> Message-ID: Of course, any contribution is welcome! I don't think we should commit to update all of them with each Infinispan release, though - as you've seen, we don't always have the resources to update and test them. Cheers Dan On Mon, Jul 20, 2015 at 9:22 AM, Sebastian ?askawiec wrote: > Hey! > > +1000, that's great idea! > > I think the first step is to update them and create CI jobs for them (at > least one for Pull Requests and one for SNAPSHOTs). > > After that we could modify existing cache stores (at least Cassandra and > Mongo) and use Infinispan core dependency and not the parent. This would > allow us to use version ranges. > > Does it make sense to put our Cache Stores into Infinispan release cycle? > This way the Cache Store version would be the same as Infinispan version. > > Thanks > Sebastian > > On 07/19/2015 11:45 AM, Andrea Cosentino wrote: > > Hi, > > Yesterday I've opened this ticket on github > > https://github.com/infinispan/infinispan-cachestore-cassandra/issues/3 > > I'm interested in the different cachestore projects (mongodb, leveldb and so > on) and I'd like to see it revived by contributing to those projects. > > Andrea > > -- > Andrea Cosentino > ---------------------------------- > Apache Camel Committer > Email: ancosen1985 at yahoo.com > Twitter: @oscerd2 > Github: oscerd > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Jul 20 05:00:36 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 20 Jul 2015 11:00:36 +0200 Subject: [infinispan-dev] Infinispan-Cachestore In-Reply-To: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> References: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> Message-ID: <55ACB8B4.8080307@redhat.com> LevelDB is part of the main tree and maintained. If I had to choose among the others, I would vote for the Cassandra cachestore to be the first to be aligned to the latest Infinispan SPI. 
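Roughly speaking, aligning a store to the current SPI means implementing org.infinispan.persistence.spi.ExternalStore (or AdvancedLoadWriteStore if bulk/purge operations are needed). A minimal skeleton looks more or less like the following - written from memory, so double-check the exact signatures against the persistence SPI javadocs:

  import org.infinispan.marshall.core.MarshalledEntry;
  import org.infinispan.persistence.spi.ExternalStore;
  import org.infinispan.persistence.spi.InitializationContext;

  public class CassandraStore<K, V> implements ExternalStore<K, V> {
     private InitializationContext ctx;

     @Override public void init(InitializationContext ctx) { this.ctx = ctx; }
     @Override public void start() { /* open the Cassandra session here */ }
     @Override public void stop()  { /* close the session */ }

     @Override public void write(MarshalledEntry<? extends K, ? extends V> entry) { /* upsert the serialized entry */ }
     @Override public boolean delete(Object key) { /* remove the row; return true if it existed */ return false; }
     @Override public MarshalledEntry<K, V> load(Object key) { /* read and deserialize, or return null */ return null; }
     @Override public boolean contains(Object key) { return load(key) != null; }
  }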
We now have an infinispan-cachestore-archetype that should help with setting up an initial project with information on getting started. Tristan On 19/07/2015 11:45, Andrea Cosentino wrote: > Hi, > > Yesterday I've opened this ticket on github > > https://github.com/infinispan/infinispan-cachestore-cassandra/issues/3 > I'm interested in the different cachestore projects (mongodb, leveldb > and so on) and I'd like to see it revived by contributing to those projects. > > Andrea > > -- > Andrea Cosentino > ---------------------------------- > Apache Camel Committer > Email: ancosen1985 at yahoo.com > Twitter: @oscerd2 > Github: oscerd > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Mon Jul 20 05:07:14 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 20 Jul 2015 11:07:14 +0200 Subject: [infinispan-dev] Shared vs Non-Shared CacheStores In-Reply-To: References: Message-ID: <55ACBA42.5070507@redhat.com> Sanne, well written. Before actually implementing any of the optimizations/changes you mention, I think the lowest-hanging fruit we should grab now is just to add checks to all of our cachestores to actually throw an exception when they are being enabled in unsupported configurations. I've created [1] to get us started Tristan [1] https://issues.jboss.org/browse/ISPN-5617 On 16/07/2015 15:32, Sanne Grinovero wrote: > I would like to propose a clear cut separation between our shared and > non-shared CacheStores, > in all terms such as: > - Configuration options > - Integration contracts (Split the CacheStore SPI) > - Implementations > - Terminology, to avoid any further confusion around valid > configurations and sensible architectures > > We have loads of examples of users who get in trouble by configuring > one incorrectly, but also there are plenty of efficiency improvements > we could take advantage of by clearly splitting the integration points > and the implementations in two categories. > > Not least, it's a very common and dangerous pitfall to assume that > Infinispan is able to restore a consistent state after having stopped > a DIST cluster which passivated into non-shared CacheStore instances, > or even REPL clusters when they don't shutdown all at the same exact > time (and "exact same time" is a strange concept at least..). We need > to clarify the different options, tradeoffs and their consequences.. > to users and ourselves, as a clearly defined use case will avoid bugs > and simplify implementations. > > # The purpose of each > I think that people should use a non-shared (local?) CacheStore for > the sole purpose of expanding to storage capacity of each single > node.. be it because you don't have enough memory at all, or be it > because you prefer some extra safety margin because either your > estimates are complex, or maybe because we live in a real world were > the hashing function might not be perfect in practice. I hope we all > agree that Infinispan should be able to take such situations with at > worst a graceful performance degradatation, rather than complain > sending OOMs to the admin and setting the service on strike. 
> > A Shared CacheStore is useful for very different purposes; primarily > to implement a Cache on some other service - for example your (single, > shared) RDBMs, a slow (or expensive) webservice your organization has > to call frequently, etc.. Or it's useful even as a write-through cache > on a similar service, maybe internal but not able to handle the high > variation of load spikes which Infinsipan can handle better. > Finally, a great use case is to have a consistent backup of all your > data-grid content, possibly in some "reference" form such as JPA > mapped entities. > > # Benefits of a Non-Shared > A non-shared CacheStore implementor should be able to take advantage > of *its purpose*, among the big ones I see: > - Exclusive usage -> locking of a specific entry can be handled at > datacontainer level, can simplify quite some internal code. > - Reliability -> since a clustered node needs to wipe its state at > reboot (after a crash), it's much simpler to code any such CacheStore > to avoid any form of disk synch or persistance guarantees. > - Encoding format -> this can be controlled entirely by Infinispan, > and no need to take factors like rolling upgrade compatible encodings > in mind. JBoss Marshalling would be good enough, or some > implementations might not need to serialize at all. > > Our non-shared CacheStore implentation(s) could take advantage of > lower level more complex code optimisations and interfaces, as users > would rarely want to customize one of these, while the use case of > mapping data to a shared service needs a more user friendly SPI so to > keep it simple to plug in custom stores: custom data formats, custom > connectors, get some help in implementing concurrency correctly. > Proper Transaction integration for the CacheStore has been on our > wishlist for some time too, I suspect that accepting that we have been > mixing up two different things under a same name so far, would make it > simpler to implement further improvements such as transactions: the > way to do such a thing is very different in each of these use cases, > so it would help at least to implement it on a subset first, or maybe > only if it turns out there's no need for such things in the context of > the local-only-dedicated "swapfile". > > # Mixed types should be killed > I'm aware that some of our current implementations _could_ work both as > shared or non-shared, for example the JDBC or JPACacheStore or the > Remote Cachestore.. but in most cases it doesn't make much sense. Why > would you ever want to use the JPACacheStore if not to share data with > a _shared_ database? > > We should take such options away, and by doing so focus on the use > cases which actually matter and simplify the implementations and > improve the configuration validations. > > If ever a compelling storage technology is identified which we'd like to > offer as an option for both shared or non-shared, I would still > recommend to make two different implementations, as there certainly are > different requirements and assumptions when coding such a thing. > > Not least, I would very like to see a default local CacheStore: > picking one for local "emergency swapping" should be a no-brainer for > users; we could setup one by default and not bother newcomers with > complex choices. 
> > If we simplify the requirement of such a thing, it should be easy to > write one on standard Java NIO2 APIs and get rid of the complexities of > maintaining the native integration with things like LevelDB, not least > the inefficiency of Java to make such native calls. > > Then as a second step, we should attack the other use case: backups; > from a *purpose driven perspective* I'd then see us revive the Cassandra > integration; obviously as a shared-only option. > > Cheers, > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ancosen1985 at yahoo.com Mon Jul 20 05:08:53 2015 From: ancosen1985 at yahoo.com (Andrea Cosentino) Date: Mon, 20 Jul 2015 09:08:53 +0000 (UTC) Subject: [infinispan-dev] Infinispan-Cachestore In-Reply-To: <55ACB8B4.8080307@redhat.com> References: <55ACB8B4.8080307@redhat.com> Message-ID: <1786965614.877463.1437383333091.JavaMail.yahoo@mail.yahoo.com> Perfect.??I'll take a look to the archetype and start from it. Best. --? Andrea Cosentino? ----------------------------------? Apache Camel Committer? Email: ancosen1985 at yahoo.com? Twitter: @oscerd2? Github: oscerd? On Monday, July 20, 2015 11:01 AM, Tristan Tarrant wrote: LevelDB is part of the main tree and maintained. If I had to choose among the others, I would vote for the Cassandra cachestore to be the first to be aligned to the latest Infinispan SPI. We now have an infinispan-cachestore-archetype that should help with setting up an initial project with information on getting started. Tristan On 19/07/2015 11:45, Andrea Cosentino wrote: > Hi, > > Yesterday I've opened this ticket on github > > https://github.com/infinispan/infinispan-cachestore-cassandra/issues/3 > I'm interested in the different cachestore projects (mongodb, leveldb > and so on) and I'd like to see it revived by contributing to those projects. > > Andrea > > -- > Andrea Cosentino > ---------------------------------- > Apache Camel Committer > Email: ancosen1985 at yahoo.com > Twitter: @oscerd2 > Github: oscerd > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat _______________________________________________ infinispan-dev mailing list infinispan-dev at lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150720/5e7ff939/attachment.html From sanne at infinispan.org Mon Jul 20 05:41:31 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 20 Jul 2015 10:41:31 +0100 Subject: [infinispan-dev] Shared vs Non-Shared CacheStores In-Reply-To: <55ACBA42.5070507@redhat.com> References: <55ACBA42.5070507@redhat.com> Message-ID: +1 for incremental changes.. I'd see the first step as defining two different interfaces; essentially we need to choose two good names. Then we could have both interfaces still implement the same identical methods, but go through each implementation and decide to "mark" it as shared-only or never-shared. That would make it simpler to make concrete change proposals on each of them and start taking some advantage from the split. 
I think you'll need the two different interfaces to implement the validations you mentioned. For Infinispan 8's goals, I'd be happy enough to keep the "shared-only" interface quite similar to the current one, but mark the never-shared one as a private or experimental SPI to allow ourselves some more flexibility in performance oriented changes. Thanks, Sanne On 20 July 2015 at 10:07, Tristan Tarrant wrote: > Sanne, well written. > Before actually implementing any of the optimizations/changes you > mention, I think the lowest-hanging fruit we should grab now is just to > add checks to all of our cachestores to actually throw an exception when > they are being enabled in unsupported configurations. > > I've created [1] to get us started > > Tristan > > [1] https://issues.jboss.org/browse/ISPN-5617 > > On 16/07/2015 15:32, Sanne Grinovero wrote: >> I would like to propose a clear cut separation between our shared and >> non-shared CacheStores, >> in all terms such as: >> - Configuration options >> - Integration contracts (Split the CacheStore SPI) >> - Implementations >> - Terminology, to avoid any further confusion around valid >> configurations and sensible architectures >> >> We have loads of examples of users who get in trouble by configuring >> one incorrectly, but also there are plenty of efficiency improvements >> we could take advantage of by clearly splitting the integration points >> and the implementations in two categories. >> >> Not least, it's a very common and dangerous pitfall to assume that >> Infinispan is able to restore a consistent state after having stopped >> a DIST cluster which passivated into non-shared CacheStore instances, >> or even REPL clusters when they don't shutdown all at the same exact >> time (and "exact same time" is a strange concept at least..). We need >> to clarify the different options, tradeoffs and their consequences.. >> to users and ourselves, as a clearly defined use case will avoid bugs >> and simplify implementations. >> >> # The purpose of each >> I think that people should use a non-shared (local?) CacheStore for >> the sole purpose of expanding to storage capacity of each single >> node.. be it because you don't have enough memory at all, or be it >> because you prefer some extra safety margin because either your >> estimates are complex, or maybe because we live in a real world were >> the hashing function might not be perfect in practice. I hope we all >> agree that Infinispan should be able to take such situations with at >> worst a graceful performance degradatation, rather than complain >> sending OOMs to the admin and setting the service on strike. >> >> A Shared CacheStore is useful for very different purposes; primarily >> to implement a Cache on some other service - for example your (single, >> shared) RDBMs, a slow (or expensive) webservice your organization has >> to call frequently, etc.. Or it's useful even as a write-through cache >> on a similar service, maybe internal but not able to handle the high >> variation of load spikes which Infinsipan can handle better. >> Finally, a great use case is to have a consistent backup of all your >> data-grid content, possibly in some "reference" form such as JPA >> mapped entities. >> >> # Benefits of a Non-Shared >> A non-shared CacheStore implementor should be able to take advantage >> of *its purpose*, among the big ones I see: >> - Exclusive usage -> locking of a specific entry can be handled at >> datacontainer level, can simplify quite some internal code. 
>> - Reliability -> since a clustered node needs to wipe its state at >> reboot (after a crash), it's much simpler to code any such CacheStore >> to avoid any form of disk synch or persistance guarantees. >> - Encoding format -> this can be controlled entirely by Infinispan, >> and no need to take factors like rolling upgrade compatible encodings >> in mind. JBoss Marshalling would be good enough, or some >> implementations might not need to serialize at all. >> >> Our non-shared CacheStore implentation(s) could take advantage of >> lower level more complex code optimisations and interfaces, as users >> would rarely want to customize one of these, while the use case of >> mapping data to a shared service needs a more user friendly SPI so to >> keep it simple to plug in custom stores: custom data formats, custom >> connectors, get some help in implementing concurrency correctly. >> Proper Transaction integration for the CacheStore has been on our >> wishlist for some time too, I suspect that accepting that we have been >> mixing up two different things under a same name so far, would make it >> simpler to implement further improvements such as transactions: the >> way to do such a thing is very different in each of these use cases, >> so it would help at least to implement it on a subset first, or maybe >> only if it turns out there's no need for such things in the context of >> the local-only-dedicated "swapfile". >> >> # Mixed types should be killed >> I'm aware that some of our current implementations _could_ work both as >> shared or non-shared, for example the JDBC or JPACacheStore or the >> Remote Cachestore.. but in most cases it doesn't make much sense. Why >> would you ever want to use the JPACacheStore if not to share data with >> a _shared_ database? >> >> We should take such options away, and by doing so focus on the use >> cases which actually matter and simplify the implementations and >> improve the configuration validations. >> >> If ever a compelling storage technology is identified which we'd like to >> offer as an option for both shared or non-shared, I would still >> recommend to make two different implementations, as there certainly are >> different requirements and assumptions when coding such a thing. >> >> Not least, I would very like to see a default local CacheStore: >> picking one for local "emergency swapping" should be a no-brainer for >> users; we could setup one by default and not bother newcomers with >> complex choices. >> >> If we simplify the requirement of such a thing, it should be easy to >> write one on standard Java NIO2 APIs and get rid of the complexities of >> maintaining the native integration with things like LevelDB, not least >> the inefficiency of Java to make such native calls. >> >> Then as a second step, we should attack the other use case: backups; >> from a *purpose driven perspective* I'd then see us revive the Cassandra >> integration; obviously as a shared-only option. 
>> >> Cheers, >> Sanne >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From sanne at infinispan.org Mon Jul 20 05:43:13 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 20 Jul 2015 10:43:13 +0100 Subject: [infinispan-dev] Infinispan-Cachestore In-Reply-To: <55ACB8B4.8080307@redhat.com> References: <991586958.453914.1437299147025.JavaMail.yahoo@mail.yahoo.com> <55ACB8B4.8080307@redhat.com> Message-ID: On 20 July 2015 at 10:00, Tristan Tarrant wrote: > LevelDB is part of the main tree and maintained. > If I had to choose among the others, I would vote for the Cassandra > cachestore to be the first to be aligned to the latest Infinispan SPI. +1 also for the many reasons I listed in the other thread. > We now have an infinispan-cachestore-archetype that should help with > setting up an initial project with information on getting started. > > Tristan > > On 19/07/2015 11:45, Andrea Cosentino wrote: >> Hi, >> >> Yesterday I've opened this ticket on github >> >> https://github.com/infinispan/infinispan-cachestore-cassandra/issues/3 >> I'm interested in the different cachestore projects (mongodb, leveldb >> and so on) and I'd like to see it revived by contributing to those projects. >> >> Andrea >> >> -- >> Andrea Cosentino >> ---------------------------------- >> Apache Camel Committer >> Email: ancosen1985 at yahoo.com >> Twitter: @oscerd2 >> Github: oscerd >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Mon Jul 20 06:02:26 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 20 Jul 2015 13:02:26 +0300 Subject: [infinispan-dev] Shared vs Non-Shared CacheStores In-Reply-To: References: <55ACBA42.5070507@redhat.com> Message-ID: Sanne, I think changing the cache store API is actually the most painful part, so we should only do it if we gain a concrete advantage from doing it. From a compatibility point of view, implementing a new interface vs implementing the same interface with completely different methods is just as bad. On Mon, Jul 20, 2015 at 12:41 PM, Sanne Grinovero wrote: > +1 for incremental changes.. > > I'd see the first step as defining two different interfaces; > essentially we need to choose two good names. > > Then we could have both interfaces still implement the same identical > methods, but go through each implementation and decide to "mark" it as > shared-only or never-shared. > > That would make it simpler to make concrete change proposals on each > of them and start taking some advantage from the split. I think you'll > need the two different interfaces to implement the validations you > mentioned. 
> > For Infinispan 8's goals, I'd be happy enough to keep the > "shared-only" interface quite similar to the current one, but mark the > never-shared one as a private or experimental SPI to allow ourselves > some more flexibility in performance oriented changes. > > Thanks, > Sanne > > On 20 July 2015 at 10:07, Tristan Tarrant wrote: >> Sanne, well written. >> Before actually implementing any of the optimizations/changes you >> mention, I think the lowest-hanging fruit we should grab now is just to >> add checks to all of our cachestores to actually throw an exception when >> they are being enabled in unsupported configurations. >> >> I've created [1] to get us started >> >> Tristan >> >> [1] https://issues.jboss.org/browse/ISPN-5617 >> >> On 16/07/2015 15:32, Sanne Grinovero wrote: >>> I would like to propose a clear cut separation between our shared and >>> non-shared CacheStores, >>> in all terms such as: >>> - Configuration options >>> - Integration contracts (Split the CacheStore SPI) >>> - Implementations >>> - Terminology, to avoid any further confusion around valid >>> configurations and sensible architectures >>> >>> We have loads of examples of users who get in trouble by configuring >>> one incorrectly, but also there are plenty of efficiency improvements >>> we could take advantage of by clearly splitting the integration points >>> and the implementations in two categories. >>> >>> Not least, it's a very common and dangerous pitfall to assume that >>> Infinispan is able to restore a consistent state after having stopped >>> a DIST cluster which passivated into non-shared CacheStore instances, >>> or even REPL clusters when they don't shutdown all at the same exact >>> time (and "exact same time" is a strange concept at least..). We need >>> to clarify the different options, tradeoffs and their consequences.. >>> to users and ourselves, as a clearly defined use case will avoid bugs >>> and simplify implementations. >>> >>> # The purpose of each >>> I think that people should use a non-shared (local?) CacheStore for >>> the sole purpose of expanding to storage capacity of each single >>> node.. be it because you don't have enough memory at all, or be it >>> because you prefer some extra safety margin because either your >>> estimates are complex, or maybe because we live in a real world were >>> the hashing function might not be perfect in practice. I hope we all >>> agree that Infinispan should be able to take such situations with at >>> worst a graceful performance degradatation, rather than complain >>> sending OOMs to the admin and setting the service on strike. >>> >>> A Shared CacheStore is useful for very different purposes; primarily >>> to implement a Cache on some other service - for example your (single, >>> shared) RDBMs, a slow (or expensive) webservice your organization has >>> to call frequently, etc.. Or it's useful even as a write-through cache >>> on a similar service, maybe internal but not able to handle the high >>> variation of load spikes which Infinsipan can handle better. >>> Finally, a great use case is to have a consistent backup of all your >>> data-grid content, possibly in some "reference" form such as JPA >>> mapped entities. >>> >>> # Benefits of a Non-Shared >>> A non-shared CacheStore implementor should be able to take advantage >>> of *its purpose*, among the big ones I see: >>> - Exclusive usage -> locking of a specific entry can be handled at >>> datacontainer level, can simplify quite some internal code. 
>>> - Reliability -> since a clustered node needs to wipe its state at >>> reboot (after a crash), it's much simpler to code any such CacheStore >>> to avoid any form of disk synch or persistance guarantees. >>> - Encoding format -> this can be controlled entirely by Infinispan, >>> and no need to take factors like rolling upgrade compatible encodings >>> in mind. JBoss Marshalling would be good enough, or some >>> implementations might not need to serialize at all. >>> >>> Our non-shared CacheStore implentation(s) could take advantage of >>> lower level more complex code optimisations and interfaces, as users >>> would rarely want to customize one of these, while the use case of >>> mapping data to a shared service needs a more user friendly SPI so to >>> keep it simple to plug in custom stores: custom data formats, custom >>> connectors, get some help in implementing concurrency correctly. >>> Proper Transaction integration for the CacheStore has been on our >>> wishlist for some time too, I suspect that accepting that we have been >>> mixing up two different things under a same name so far, would make it >>> simpler to implement further improvements such as transactions: the >>> way to do such a thing is very different in each of these use cases, >>> so it would help at least to implement it on a subset first, or maybe >>> only if it turns out there's no need for such things in the context of >>> the local-only-dedicated "swapfile". >>> >>> # Mixed types should be killed >>> I'm aware that some of our current implementations _could_ work both as >>> shared or non-shared, for example the JDBC or JPACacheStore or the >>> Remote Cachestore.. but in most cases it doesn't make much sense. Why >>> would you ever want to use the JPACacheStore if not to share data with >>> a _shared_ database? >>> >>> We should take such options away, and by doing so focus on the use >>> cases which actually matter and simplify the implementations and >>> improve the configuration validations. >>> >>> If ever a compelling storage technology is identified which we'd like to >>> offer as an option for both shared or non-shared, I would still >>> recommend to make two different implementations, as there certainly are >>> different requirements and assumptions when coding such a thing. >>> >>> Not least, I would very like to see a default local CacheStore: >>> picking one for local "emergency swapping" should be a no-brainer for >>> users; we could setup one by default and not bother newcomers with >>> complex choices. >>> >>> If we simplify the requirement of such a thing, it should be easy to >>> write one on standard Java NIO2 APIs and get rid of the complexities of >>> maintaining the native integration with things like LevelDB, not least >>> the inefficiency of Java to make such native calls. >>> >>> Then as a second step, we should attack the other use case: backups; >>> from a *purpose driven perspective* I'd then see us revive the Cassandra >>> integration; obviously as a shared-only option. 
>>> >>> Cheers, >>> Sanne >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Jul 20 12:08:38 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 20 Jul 2015 18:08:38 +0200 Subject: [infinispan-dev] Weekly IRC meeting log 2015-07-20 Message-ID: <55AD1D06.6080409@redhat.com> Hi all, here are the logs from the weekly IRC meeting: http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-07-20-14.02.log.html Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Mon Jul 20 12:44:59 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 20 Jul 2015 18:44:59 +0200 Subject: [infinispan-dev] Development process and handling of PRs Message-ID: <55AD258B.20702@redhat.com> Hi all, there is something about our current development model which I feel is holding us back a little. This is caused by a number of issues: - Handling Pull Requests: we are really slow at doing this. When issuing a PR, a developer expects at least one review to happen within the next half-day at most. Instead, requests sit in the queue for days (weeks) before they even get considered. I don't expect everybody to just drop what they are doing and review immediately, but at least be a bit more reactive. - It seems like we're always aiming for the perfect PR. Obviously a PR should have zero failures, but we should be a bit more iterative about the way we make changes. This is probably also a consequence of the above: why should I break up my PR into small chunks, if it takes so long to review each one and the cumulative delay is detrimental to my progress. I like what Pedro has done for his locking changes. - We're afraid of changes, but that's what a development phase is for, especially for a new major release. We should be a bit more aggressive with trying things out. A PR can be merged even if there are some concerns (obviously not from a fundamental design POV), and it can be refined in later steps. This is what I would like to see in Beta2: - The functional API (I can take care of rebasing the PR) - The management console - The query grouping/aggregation stuff - anything else we can merge soon I would like to release Wednesday at the latest, so please do your best to help in achieving this goal. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From dan.berindei at gmail.com Tue Jul 21 07:31:25 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 21 Jul 2015 14:31:25 +0300 Subject: [infinispan-dev] Development process and handling of PRs In-Reply-To: <55AD258B.20702@redhat.com> References: <55AD258B.20702@redhat.com> Message-ID: On Mon, Jul 20, 2015 at 7:44 PM, Tristan Tarrant wrote: > Hi all, > > there is something about our current development model which I feel is > holding us back a little. This is caused by a number of issues: > > - Handling Pull Requests: we are really slow at doing this. 
When issuing > a PR, a developer expects at least one review to happen within the next > half-day at most. Instead, requests sit in the queue for days (weeks) > before they even get considered. I don't expect everybody to just drop > what they are doing and review immediately, but at least be a bit more > reactive. Lately we've had a lot of PRs that take more than 4 hours just to read and try to remember what was going on in the code that's being modified. If I issue a PR modifying 50+ files (which I freely admit I've done in the past) I definitely don't expect the first review pass to be done within a day. > - It seems like we're always aiming for the perfect PR. Obviously a PR > should have zero failures, but we should be a bit more iterative about > the way we make changes. This is probably also a consequence of the > above: why should I break up my PR into small chunks, if it takes so > long to review each one and the cumulative delay is detrimental to my > progress. I like what Pedro has done for his locking changes. First of all, I'm not sure our zero failures policy works. I know more about core, so I try to make sure PRs I review don't introduce random failures in core, but I'm not sure anyone ever investigated the random failures in AtomicObjectFactoryTest. Breaking your work into small chunks is clearly more work, and a small PR will sometimes take just as long to be integrated as a big one. But I really think the extra feedback you get by having smaller PRs is worth it. And if you think your PR is spending too much time in the queue, it's much easier to ping someone on IRC to review a 50-lines change than a 5000-lines one. > - We're afraid of changes, but that's what a development phase is for, > especially for a new major release. We should be a bit more aggressive > with trying things out. A PR can be merged even if there are some > concerns (obviously not from a fundamental design POV), and it can be > refined in later steps. I'm ok with postponing some decisions, as long as we're not forgetting about all the discussions and moving on to the next big thing when the PR is closed. > > This is what I would like to see in Beta2: > - The functional API (I can take care of rebasing the PR) > - The management console > - The query grouping/aggregation stuff > - anything else we can merge soon > > I would like to release Wednesday at the latest, so please do your best > to help in achieving this goal. > > Tristan > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Tue Jul 21 08:05:46 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 21 Jul 2015 14:05:46 +0200 Subject: [infinispan-dev] Development process and handling of PRs In-Reply-To: References: <55AD258B.20702@redhat.com> Message-ID: <55AE359A.1060606@redhat.com> On 21/07/2015 13:31, Dan Berindei wrote: > Lately we've had a lot of PRs that take more than 4 hours just to read > and try to remember what was going on in the code that's being > modified. If I issue a PR modifying 50+ files (which I freely admit > I've done in the past) I definitely don't expect the first review pass > to be done within a day. Of course not. But it also doesn't mean that days can go by even for 1-liners. >> - It seems like we're always aiming for the perfect PR. 
Obviously a PR >> should have zero failures, but we should be a bit more iterative about > > First of all, I'm not sure our zero failures policy works. I know more > about core, so I try to make sure PRs I review don't introduce random > failures in core, but I'm not sure anyone ever investigated the random > failures in AtomicObjectFactoryTest. There are high-priority modules (core) and low-priority ones (atomic). If a test is randomly failing in a low-priority module, it should be disabled and a Jira created to track its resolution at a later date. > Breaking your work into small chunks is clearly more work, and a small > PR will sometimes take just as long to be integrated as a big one. But > I really think the extra feedback you get by having smaller PRs is > worth it. And if you think your PR is spending too much time in the > queue, it's much easier to ping someone on IRC to review a 50-lines > change than a 5000-lines one. +1 >> - We're afraid of changes, but that's what a development phase is for, >> especially for a new major release. We should be a bit more aggressive >> with trying things out. A PR can be merged even if there are some >> concerns (obviously not from a fundamental design POV), and it can be >> refined in later steps. > > I'm ok with postponing some decisions, as long as we're not forgetting > about all the discussions and moving on to the next big thing when the All the postponed decisions need the creation of a Jira. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From mudokonman at gmail.com Tue Jul 21 10:25:42 2015 From: mudokonman at gmail.com (William Burns) Date: Tue, 21 Jul 2015 14:25:42 +0000 Subject: [infinispan-dev] Strict Expiration In-Reply-To: <55A54C1E.3080703@redhat.com> References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> Message-ID: So I wanted to sum up what it looks like the plan is for this in regards to cluster expiration for ISPN 8. First off to not make it ambiguous, maxIdle being used with a clustered cache will provide undefined and unsupported behavior. This can and will expire entries on a single node without notifying other cluster members (essentially it will operate as it does today unchanged). This leaves me to talk solely about lifespan cluster expiration. Lifespan Expiration events are fired by the primary owner of an expired key - when accessing an expired entry. - by the reaper thread. If the expiration is detected by a node other than the primary owner, an expiration command is sent to it and null is returned immediately not waiting for a response. Expiration event listeners follow the usual rules for sync/async: in the case of a sync listener, the handler is invoked while holding the lock, whereas an async listener will not hold locks. It is desirable for expiration events to contain both the key and value. However currently cache stores do not provide the value when they expire values. Thus we can only guarantee the value is present when an in memory expiration event occurs. We could plan on adding this later. Also as you may have guessed this doesn't touch strict expiration, which I think we have come to the conclusion should only work with maxIdle and as such this is not explored with this iteration. Let me know if you guys think this approach is okay. Cheers, - Will On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa wrote: > Yes, I know about [1]. 
I've worked that around by storing timestamp in > the entry as well and when a new record is added, the 'expired' > invalidations are purged. But I can't purge that if I don't access it - > Infinispan needs to handle that internally. > > Radim > > [1] https://hibernate.atlassian.net/browse/HHH-6219 > > On 07/14/2015 05:45 PM, Dennis Reed wrote: > > On 07/14/2015 11:08 AM, Radim Vansa wrote: > >> On 07/14/2015 04:19 PM, William Burns wrote: > >>> > >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns >>> > wrote: > >>> > >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > >>> > wrote: > >>> > >>> Processing expiration only on the reaper thread sounds nice, > but I > >>> have one reservation: processing 1 million entries to see > that > >>> 1 of > >>> them is expired is a lot of work, and in the general case we > >>> will not > >>> be able to ensure an expiration precision of less than 1 > >>> minute (maybe > >>> more, with a huge SingleFileStore attached). > >>> > >>> > >>> This isn't much different then before. The only difference is > >>> that if a user touched a value after it expired it wouldn't show > >>> up (which is unlikely with maxIdle especially). > >>> > >>> > >>> What happens to users who need better precision? In > >>> particular, I know > >>> some JCache tests were failing because HotRod was only > supporting > >>> 1-second resolution instead of the 1-millisecond resolution > >>> they were > >>> expecting. > >>> > >>> > >>> JCache is an interesting piece. The thing about JCache is that > >>> the spec is only defined for local caches. However I wouldn't > >>> want to muddy up the waters in regards to it behaving differently > >>> for local/remote. In the JCache scenario we could add an > >>> interceptor to prevent it returning such values (we do something > >>> similar already for events). JCache behavior vs ISPN behavior > >>> seems a bit easier to differentiate. But like you are getting > at, > >>> either way is not very appealing. > >>> > >>> > >>> > >>> I'm even less convinced about the need to guarantee that a > >>> clustered > >>> expiration listener will only be triggered once, and that the > >>> entry > >>> must be null everywhere after that listener was invoked. > >>> What's the > >>> use case? > >>> > >>> > >>> Maybe Tristan would know more to answer. To be honest this work > >>> seems fruitless unless we know what our end users want here. > >>> Spending time on something for it to thrown out is never fun :( > >>> > >>> And the more I thought about this the more I question the > validity > >>> of maxIdle even. It seems like a very poor way to prevent memory > >>> exhaustion, which eviction does in a much better way and has much > >>> more flexible algorithms. Does anyone know what maxIdle would be > >>> used for that wouldn't be covered by eviction? The only thing I > >>> can think of is cleaning up the cache store as well. > >>> > >>> > >>> Actually I guess for session/authentication related information this > >>> would be important. However maxIdle isn't really as usable in that > >>> case since most likely you would have a sticky session to go back to > >>> that node which means you would never refresh the last used date on > >>> the copies (current implementation). Without cluster expiration you > >>> could lose that session information on a failover very easily. > >> I would say that maxIdle can be used as for memory management as kind of > >> WeakHashMap - e.g. 
in 2LC the maxIdle is used to store some record for a > >> short while (regular transaction lifespan ~ seconds to minutes), and > >> regularly the record is removed. However, to make sure that we don't > >> leak records in this cache (if something goes wrong and the remove does > >> not occur), it is removed. > > Note that just relying on maxIdle doesn't guarantee you won't leak > > records in this use case (specifically with the way the current > > hibernate-infinispan 2LC implementation uses it). > > > > Hibernate-infinispan adds entries to its own Map stored in Infinispan, > > and expects maxIdle to remove the map if it skips a remove. But in a > > current case, we found that due to frequent accesses to that same map > > the entries never idle out and it ends up in OOME). > > > > -Dennis > > > >> I can guess how long the transaction takes place, but not how many > >> parallel transactions there are. With eviction algorithms (where I am > >> not sure about the exact guarantees) I can set the cache to not hold > >> more than N entries, but I can't know for sure that my record does not > >> suddenly get evicted after shorter period, possibly causing some > >> inconsistency. > >> So this is similar to WeakHashMap by removing the key "when it can't be > >> used anymore" because I know that the transaction will finish before the > >> deadline. I don't care about the exact size, I don't want to tune that, > >> I just don't want to leak. > >> > >> From my POV the non-strict maxIdle and strict expiration would be a > >> nice compromise. > >> > >> Radim > >> > >>> Note that this would make the reaper thread less efficient: > with > >>> numOwners=2 (best case), half of the entries that the reaper > >>> touches > >>> cannot be expired, because the node isn't the primary node. > And to > >>> make matters worse, the same reaper thread would have to > perform a > >>> (synchronous?) RPC for each entry to ensure it expires > everywhere. > >>> > >>> > >>> I have debated about this, it could something like a sync > >>> removeAll which has a special marker to tell it is due to > >>> expiration (which would raise listeners there), while also > sending > >>> a cluster expiration event to other non owners. > >>> > >>> > >>> For maxIdle I'd like to know more information about how > >>> exactly the > >>> owners would coordinate to expire an entry. I'm pretty sure > we > >>> cannot > >>> avoid ignoring some reads (expiring an entry immediately > after > >>> it was > >>> read), and ensuring that we don't accidentally extend an > >>> entry's life > >>> (like the current code does, when we transfer an entry to a > >>> new owner) > >>> also sounds problematic. > >>> > >>> > >>> For lifespan it is simple, the primary owner just expires it when > >>> it expires there. There is no coordination needed in this case > it > >>> just sends the expired remove to owners etc. > >>> > >>> Max idle is more complicated as we all know. The primary owner > >>> would send a request for the last used time for a given key or > set > >>> of keys. Then the owner would take those times and check for a > >>> new access it isn't aware of. If there isn't then it would send > a > >>> remove command for the key(s). If there is a new access the > owner > >>> would instead send the last used time to all of the owners. The > >>> expiration obviously would have a window that if a read occurred > >>> after sending a response that could be ignored. 
This could be > >>> resolved by using some sort of 2PC and blocking reads during that > >>> period but I would say it isn't worth it. > >>> > >>> The issue with transferring to a new node refreshing the last > >>> update/lifespan seems like just a bug we need to fix irrespective > >>> of this issue IMO. > >>> > >>> > >>> I'm not saying expiring entries on each node independently is > >>> perfect, > >>> far from it. But I wouldn't want us to provide new > guarantees that > >>> could hurt performance without a really good use case. > >>> > >>> > >>> I would guess that user perceived performance should be a little > >>> faster with this. But this also depends on an alternative that > we > >>> decided on :) > >>> > >>> Also the expiration thread pool is set to min priority atm so it > >>> may delay removal of said objects but hopefully (if the jvm > >>> supports) it wouldn't overrun a CPU while processing unless it > has > >>> availability. > >>> > >>> > >>> Cheers > >>> Dan > >>> > >>> > >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > >>> > wrote: > >>> > After re-reading the whole original thread, I agree with > the > >>> proposal > >>> > with two caveats: > >>> > > >>> > - ensure that we don't break JCache compatibility > >>> > - ensure that we document this properly > >>> > > >>> > Tristan > >>> > > >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: > >>> >> +1 > >>> >> You had me convinced at the first line, although "A lot of > >>> code can now > >>> >> be removed and made simpler" makes it look extremely nice. > >>> >> > >>> >> On 13 Jul 2015 18:14, "William Burns" < > mudokonman at gmail.com > >>> > >>> >> >>> >> wrote: > >>> >> > >>> >> This is a necro of [1]. > >>> >> > >>> >> With Infinispan 8.0 we are adding in clustered > >>> expiration. That > >>> >> includes an expiration event raised that is clustered > >>> as well. > >>> >> Unfortunately expiration events currently occur > >>> multiple times (if > >>> >> numOwners > 1) at different times across nodes in a > >>> cluster. This > >>> >> makes coordinating a single cluster expiration event > >>> quite difficult. > >>> >> > >>> >> To work around this I am proposing that the expiration > >>> of an event > >>> >> is done solely by the owner of the given key that is > >>> now expired. > >>> >> This would fix the issue of having multiple events and > >>> the event can > >>> >> be raised while holding the lock for the given key so > >>> concurrent > >>> >> modifications would not be an issue. > >>> >> > >>> >> The problem arises when you have other nodes that have > >>> expiration > >>> >> set but expire at different times. Max idle is the > >>> biggest offender > >>> >> with this as a read on an owner only refreshes the > >>> owners timestamp, > >>> >> meaning other owners would not be updated and expire > >>> preemptively. > >>> >> To have expiration work properly in this case you > would > >>> need > >>> >> coordination between the owners to see if anyone has a > >>> higher > >>> >> value. This requires blocking and would have to be > >>> done while > >>> >> accessing a key that is expired to be sure if > >>> expiration happened or > >>> >> not. > >>> >> > >>> >> The linked dev listing proposed instead to only expire > >>> an entry by > >>> >> the reaper thread and not on access. In this case a > >>> read will > >>> >> return a non null value until it is fully expired, > >>> increasing hit > >>> >> ratios possibly. > >>> >> > >>> >> Their are quire a bit of real benefits for this: > >>> >> > >>> >> 1. 
Cluster cache reads would be much simpler and > >>> wouldn't have to > >>> >> block to verify the object exists or not since this > >>> would only be > >>> >> done by the reaper thread (note this would have only > >>> happened if the > >>> >> entry was expired locally). An access would just > >>> return the value > >>> >> immediately. > >>> >> 2. Each node only expires entries it owns in the > reaper > >>> thread > >>> >> reducing how many entries they must check or remove. > >>> This also > >>> >> provides a single point where events would be raised > as > >>> we need. > >>> >> 3. A lot of code can now be removed and made simpler > as > >>> it no longer > >>> >> has to check for expiration. The expiration check > >>> would only be > >>> >> done in 1 place, the expiration reaper thread. > >>> >> > >>> >> The main issue with this proposal is as the other > >>> listing mentions > >>> >> is if user code expects the value to be gone after > >>> expiration for > >>> >> correctness. I would say this use case is not as > >>> compelling for > >>> >> maxIdle, especially since we never supported it > >>> properly. And in > >>> >> the case of lifespan the user could very easily store > >>> the expiration > >>> >> time in the object that they can check after a get as > >>> pointed out in > >>> >> the other thread. > >>> >> > >>> >> [1] > >>> >> > >>> > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > >>> >> > >>> >> _______________________________________________ > >>> >> infinispan-dev mailing list > >>> >> infinispan-dev at lists.jboss.org > >>> > >>> >>> > > >>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> >> > >>> >> > >>> >> > >>> >> _______________________________________________ > >>> >> infinispan-dev mailing list > >>> >> infinispan-dev at lists.jboss.org > >>> > >>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> >> > >>> > > >>> > -- > >>> > Tristan Tarrant > >>> > Infinispan Lead > >>> > JBoss, a division of Red Hat > >>> > _______________________________________________ > >>> > infinispan-dev mailing list > >>> > infinispan-dev at lists.jboss.org > >>> > >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev at lists.jboss.org > >>> > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> > >>> > >>> > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev at lists.jboss.org > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150721/c82306b7/attachment-0001.html From dan.berindei at gmail.com Wed Jul 22 10:53:19 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 22 Jul 2015 17:53:19 +0300 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> Message-ID: Is it possible/feasible to skip the notification from the backups to the primary (and back) when there is no clustered expiration listener? Dan On Tue, Jul 21, 2015 at 5:25 PM, William Burns wrote: > So I wanted to sum up what it looks like the plan is for this in regards to > cluster expiration for ISPN 8. > > First off to not make it ambiguous, maxIdle being used with a clustered > cache will provide undefined and unsupported behavior. This can and will > expire entries on a single node without notifying other cluster members > (essentially it will operate as it does today unchanged). > > This leaves me to talk solely about lifespan cluster expiration. > > Lifespan Expiration events are fired by the primary owner of an expired key > > - when accessing an expired entry. > > - by the reaper thread. > > If the expiration is detected by a node other than the primary owner, an > expiration command is sent to it and null is returned immediately not > waiting for a response. > > Expiration event listeners follow the usual rules for sync/async: in the > case of a sync listener, the handler is invoked while holding the lock, > whereas an async listener will not hold locks. > > It is desirable for expiration events to contain both the key and value. > However currently cache stores do not provide the value when they expire > values. Thus we can only guarantee the value is present when an in memory > expiration event occurs. We could plan on adding this later. > > Also as you may have guessed this doesn't touch strict expiration, which I > think we have come to the conclusion should only work with maxIdle and as > such this is not explored with this iteration. > > Let me know if you guys think this approach is okay. > > Cheers, > > - Will > > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa wrote: >> >> Yes, I know about [1]. I've worked that around by storing timestamp in >> the entry as well and when a new record is added, the 'expired' >> invalidations are purged. But I can't purge that if I don't access it - >> Infinispan needs to handle that internally. >> >> Radim >> >> [1] https://hibernate.atlassian.net/browse/HHH-6219 >> >> On 07/14/2015 05:45 PM, Dennis Reed wrote: >> > On 07/14/2015 11:08 AM, Radim Vansa wrote: >> >> On 07/14/2015 04:19 PM, William Burns wrote: >> >>> >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns > >>> > wrote: >> >>> >> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei >> >>> > wrote: >> >>> >> >>> Processing expiration only on the reaper thread sounds nice, >> >>> but I >> >>> have one reservation: processing 1 million entries to see >> >>> that >> >>> 1 of >> >>> them is expired is a lot of work, and in the general case we >> >>> will not >> >>> be able to ensure an expiration precision of less than 1 >> >>> minute (maybe >> >>> more, with a huge SingleFileStore attached). >> >>> >> >>> >> >>> This isn't much different then before. The only difference is >> >>> that if a user touched a value after it expired it wouldn't show >> >>> up (which is unlikely with maxIdle especially). 
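
To make the feature under discussion concrete, the clustered expiration listener Dan refers to at the top of this message would be registered roughly as below. The CacheEntryExpired annotation and event type are assumed here; they are exactly what this thread is designing for Infinispan 8, so treat the names and semantics as provisional rather than final.

    import org.infinispan.notifications.Listener;
    import org.infinispan.notifications.cachelistener.annotation.CacheEntryExpired;
    import org.infinispan.notifications.cachelistener.event.CacheEntryExpiredEvent;

    // Provisional sketch: assumes the CacheEntryExpired annotation/event
    // being added for Infinispan 8 as discussed in this thread.
    @Listener(clustered = true)
    public class ExpirationLogger {

       @CacheEntryExpired
       public void onExpired(CacheEntryExpiredEvent<String, String> event) {
          // Under the proposal, this fires once per expired key, driven by
          // the primary owner (either on access or from the reaper thread);
          // the value may be absent when the entry expired from a cache store.
          System.out.println("Expired: " + event.getKey() + " -> " + event.getValue());
       }
    }

Registration would be the usual cache.addListener(new ExpirationLogger()); the open question raised above is how much primary/backup coordination is needed when no such listener is registered.
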
>> >>> >> >>> >> >>> What happens to users who need better precision? In >> >>> particular, I know >> >>> some JCache tests were failing because HotRod was only >> >>> supporting >> >>> 1-second resolution instead of the 1-millisecond resolution >> >>> they were >> >>> expecting. >> >>> >> >>> >> >>> JCache is an interesting piece. The thing about JCache is that >> >>> the spec is only defined for local caches. However I wouldn't >> >>> want to muddy up the waters in regards to it behaving >> >>> differently >> >>> for local/remote. In the JCache scenario we could add an >> >>> interceptor to prevent it returning such values (we do something >> >>> similar already for events). JCache behavior vs ISPN behavior >> >>> seems a bit easier to differentiate. But like you are getting >> >>> at, >> >>> either way is not very appealing. >> >>> >> >>> >> >>> >> >>> I'm even less convinced about the need to guarantee that a >> >>> clustered >> >>> expiration listener will only be triggered once, and that >> >>> the >> >>> entry >> >>> must be null everywhere after that listener was invoked. >> >>> What's the >> >>> use case? >> >>> >> >>> >> >>> Maybe Tristan would know more to answer. To be honest this work >> >>> seems fruitless unless we know what our end users want here. >> >>> Spending time on something for it to thrown out is never fun :( >> >>> >> >>> And the more I thought about this the more I question the >> >>> validity >> >>> of maxIdle even. It seems like a very poor way to prevent >> >>> memory >> >>> exhaustion, which eviction does in a much better way and has >> >>> much >> >>> more flexible algorithms. Does anyone know what maxIdle would >> >>> be >> >>> used for that wouldn't be covered by eviction? The only thing I >> >>> can think of is cleaning up the cache store as well. >> >>> >> >>> >> >>> Actually I guess for session/authentication related information this >> >>> would be important. However maxIdle isn't really as usable in that >> >>> case since most likely you would have a sticky session to go back to >> >>> that node which means you would never refresh the last used date on >> >>> the copies (current implementation). Without cluster expiration you >> >>> could lose that session information on a failover very easily. >> >> I would say that maxIdle can be used as for memory management as kind >> >> of >> >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some record for >> >> a >> >> short while (regular transaction lifespan ~ seconds to minutes), and >> >> regularly the record is removed. However, to make sure that we don't >> >> leak records in this cache (if something goes wrong and the remove does >> >> not occur), it is removed. >> > Note that just relying on maxIdle doesn't guarantee you won't leak >> > records in this use case (specifically with the way the current >> > hibernate-infinispan 2LC implementation uses it). >> > >> > Hibernate-infinispan adds entries to its own Map stored in Infinispan, >> > and expects maxIdle to remove the map if it skips a remove. But in a >> > current case, we found that due to frequent accesses to that same map >> > the entries never idle out and it ends up in OOME). >> > >> > -Dennis >> > >> >> I can guess how long the transaction takes place, but not how many >> >> parallel transactions there are. 
With eviction algorithms (where I am >> >> not sure about the exact guarantees) I can set the cache to not hold >> >> more than N entries, but I can't know for sure that my record does not >> >> suddenly get evicted after shorter period, possibly causing some >> >> inconsistency. >> >> So this is similar to WeakHashMap by removing the key "when it can't be >> >> used anymore" because I know that the transaction will finish before >> >> the >> >> deadline. I don't care about the exact size, I don't want to tune that, >> >> I just don't want to leak. >> >> >> >> From my POV the non-strict maxIdle and strict expiration would be a >> >> nice compromise. >> >> >> >> Radim >> >> >> >>> Note that this would make the reaper thread less efficient: >> >>> with >> >>> numOwners=2 (best case), half of the entries that the reaper >> >>> touches >> >>> cannot be expired, because the node isn't the primary node. >> >>> And to >> >>> make matters worse, the same reaper thread would have to >> >>> perform a >> >>> (synchronous?) RPC for each entry to ensure it expires >> >>> everywhere. >> >>> >> >>> >> >>> I have debated about this, it could something like a sync >> >>> removeAll which has a special marker to tell it is due to >> >>> expiration (which would raise listeners there), while also >> >>> sending >> >>> a cluster expiration event to other non owners. >> >>> >> >>> >> >>> For maxIdle I'd like to know more information about how >> >>> exactly the >> >>> owners would coordinate to expire an entry. I'm pretty sure >> >>> we >> >>> cannot >> >>> avoid ignoring some reads (expiring an entry immediately >> >>> after >> >>> it was >> >>> read), and ensuring that we don't accidentally extend an >> >>> entry's life >> >>> (like the current code does, when we transfer an entry to a >> >>> new owner) >> >>> also sounds problematic. >> >>> >> >>> >> >>> For lifespan it is simple, the primary owner just expires it >> >>> when >> >>> it expires there. There is no coordination needed in this case >> >>> it >> >>> just sends the expired remove to owners etc. >> >>> >> >>> Max idle is more complicated as we all know. The primary owner >> >>> would send a request for the last used time for a given key or >> >>> set >> >>> of keys. Then the owner would take those times and check for a >> >>> new access it isn't aware of. If there isn't then it would send >> >>> a >> >>> remove command for the key(s). If there is a new access the >> >>> owner >> >>> would instead send the last used time to all of the owners. The >> >>> expiration obviously would have a window that if a read occurred >> >>> after sending a response that could be ignored. This could be >> >>> resolved by using some sort of 2PC and blocking reads during >> >>> that >> >>> period but I would say it isn't worth it. >> >>> >> >>> The issue with transferring to a new node refreshing the last >> >>> update/lifespan seems like just a bug we need to fix >> >>> irrespective >> >>> of this issue IMO. >> >>> >> >>> >> >>> I'm not saying expiring entries on each node independently >> >>> is >> >>> perfect, >> >>> far from it. But I wouldn't want us to provide new >> >>> guarantees that >> >>> could hurt performance without a really good use case. >> >>> >> >>> >> >>> I would guess that user perceived performance should be a little >> >>> faster with this. 
But this also depends on an alternative that >> >>> we >> >>> decided on :) >> >>> >> >>> Also the expiration thread pool is set to min priority atm so it >> >>> may delay removal of said objects but hopefully (if the jvm >> >>> supports) it wouldn't overrun a CPU while processing unless it >> >>> has >> >>> availability. >> >>> >> >>> >> >>> Cheers >> >>> Dan >> >>> >> >>> >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant >> >>> > wrote: >> >>> > After re-reading the whole original thread, I agree with >> >>> the >> >>> proposal >> >>> > with two caveats: >> >>> > >> >>> > - ensure that we don't break JCache compatibility >> >>> > - ensure that we document this properly >> >>> > >> >>> > Tristan >> >>> > >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: >> >>> >> +1 >> >>> >> You had me convinced at the first line, although "A lot >> >>> of >> >>> code can now >> >>> >> be removed and made simpler" makes it look extremely >> >>> nice. >> >>> >> >> >>> >> On 13 Jul 2015 18:14, "William Burns" >> >>> > >>> >> >>> >> > >> >>> >> wrote: >> >>> >> >> >>> >> This is a necro of [1]. >> >>> >> >> >>> >> With Infinispan 8.0 we are adding in clustered >> >>> expiration. That >> >>> >> includes an expiration event raised that is clustered >> >>> as well. >> >>> >> Unfortunately expiration events currently occur >> >>> multiple times (if >> >>> >> numOwners > 1) at different times across nodes in a >> >>> cluster. This >> >>> >> makes coordinating a single cluster expiration event >> >>> quite difficult. >> >>> >> >> >>> >> To work around this I am proposing that the >> >>> expiration >> >>> of an event >> >>> >> is done solely by the owner of the given key that is >> >>> now expired. >> >>> >> This would fix the issue of having multiple events >> >>> and >> >>> the event can >> >>> >> be raised while holding the lock for the given key so >> >>> concurrent >> >>> >> modifications would not be an issue. >> >>> >> >> >>> >> The problem arises when you have other nodes that >> >>> have >> >>> expiration >> >>> >> set but expire at different times. Max idle is the >> >>> biggest offender >> >>> >> with this as a read on an owner only refreshes the >> >>> owners timestamp, >> >>> >> meaning other owners would not be updated and expire >> >>> preemptively. >> >>> >> To have expiration work properly in this case you >> >>> would >> >>> need >> >>> >> coordination between the owners to see if anyone has >> >>> a >> >>> higher >> >>> >> value. This requires blocking and would have to be >> >>> done while >> >>> >> accessing a key that is expired to be sure if >> >>> expiration happened or >> >>> >> not. >> >>> >> >> >>> >> The linked dev listing proposed instead to only >> >>> expire >> >>> an entry by >> >>> >> the reaper thread and not on access. In this case a >> >>> read will >> >>> >> return a non null value until it is fully expired, >> >>> increasing hit >> >>> >> ratios possibly. >> >>> >> >> >>> >> Their are quire a bit of real benefits for this: >> >>> >> >> >>> >> 1. Cluster cache reads would be much simpler and >> >>> wouldn't have to >> >>> >> block to verify the object exists or not since this >> >>> would only be >> >>> >> done by the reaper thread (note this would have only >> >>> happened if the >> >>> >> entry was expired locally). An access would just >> >>> return the value >> >>> >> immediately. >> >>> >> 2. Each node only expires entries it owns in the >> >>> reaper >> >>> thread >> >>> >> reducing how many entries they must check or remove. 
>> >>> This also >> >>> >> provides a single point where events would be raised >> >>> as >> >>> we need. >> >>> >> 3. A lot of code can now be removed and made simpler >> >>> as >> >>> it no longer >> >>> >> has to check for expiration. The expiration check >> >>> would only be >> >>> >> done in 1 place, the expiration reaper thread. >> >>> >> >> >>> >> The main issue with this proposal is as the other >> >>> listing mentions >> >>> >> is if user code expects the value to be gone after >> >>> expiration for >> >>> >> correctness. I would say this use case is not as >> >>> compelling for >> >>> >> maxIdle, especially since we never supported it >> >>> properly. And in >> >>> >> the case of lifespan the user could very easily store >> >>> the expiration >> >>> >> time in the object that they can check after a get as >> >>> pointed out in >> >>> >> the other thread. >> >>> >> >> >>> >> [1] >> >>> >> >> >>> >> >>> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >> >>> >> >> >>> >> _______________________________________________ >> >>> >> infinispan-dev mailing list >> >>> >> infinispan-dev at lists.jboss.org >> >>> >> >>> > >>> > >> >>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> >> >> >>> >> >> >>> >> >> >>> >> _______________________________________________ >> >>> >> infinispan-dev mailing list >> >>> >> infinispan-dev at lists.jboss.org >> >>> >> >>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> >> >> >>> > >> >>> > -- >> >>> > Tristan Tarrant >> >>> > Infinispan Lead >> >>> > JBoss, a division of Red Hat >> >>> > _______________________________________________ >> >>> > infinispan-dev mailing list >> >>> > infinispan-dev at lists.jboss.org >> >>> >> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> _______________________________________________ >> >>> infinispan-dev mailing list >> >>> infinispan-dev at lists.jboss.org >> >>> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> >> >>> >> >>> >> >>> _______________________________________________ >> >>> infinispan-dev mailing list >> >>> infinispan-dev at lists.jboss.org >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From mudokonman at gmail.com Wed Jul 22 11:06:47 2015 From: mudokonman at gmail.com (William Burns) Date: Wed, 22 Jul 2015 15:06:47 +0000 Subject: [infinispan-dev] Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> Message-ID: On Wed, Jul 22, 2015 at 10:53 AM Dan Berindei wrote: > Is it possible/feasible to skip the notification from the backups to > the primary (and back) when there is no clustered expiration listener? > Unfortunately there is no way to distinguish whether or a listener is create, modify, remove or expiration. 
So this would only work if there are no clustered listeners. This however should be feasible. This shouldn't be hard to add. The only thing I would have to figure out is what happens in the case of a rehash and the node that removed the value is now the primary owner and some nodes have the old value and someone registers an expiration listener. I am thinking I should only raise the event if the primary owner still has the value. > > Dan > > > On Tue, Jul 21, 2015 at 5:25 PM, William Burns > wrote: > > So I wanted to sum up what it looks like the plan is for this in regards > to > > cluster expiration for ISPN 8. > > > > First off to not make it ambiguous, maxIdle being used with a clustered > > cache will provide undefined and unsupported behavior. This can and will > > expire entries on a single node without notifying other cluster members > > (essentially it will operate as it does today unchanged). > > > > This leaves me to talk solely about lifespan cluster expiration. > > > > Lifespan Expiration events are fired by the primary owner of an expired > key > > > > - when accessing an expired entry. > > > > - by the reaper thread. > > > > If the expiration is detected by a node other than the primary owner, an > > expiration command is sent to it and null is returned immediately not > > waiting for a response. > > > > Expiration event listeners follow the usual rules for sync/async: in the > > case of a sync listener, the handler is invoked while holding the lock, > > whereas an async listener will not hold locks. > > > > It is desirable for expiration events to contain both the key and value. > > However currently cache stores do not provide the value when they expire > > values. Thus we can only guarantee the value is present when an in > memory > > expiration event occurs. We could plan on adding this later. > > > > Also as you may have guessed this doesn't touch strict expiration, which > I > > think we have come to the conclusion should only work with maxIdle and as > > such this is not explored with this iteration. > > > > Let me know if you guys think this approach is okay. > > > > Cheers, > > > > - Will > > > > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa wrote: > >> > >> Yes, I know about [1]. I've worked that around by storing timestamp in > >> the entry as well and when a new record is added, the 'expired' > >> invalidations are purged. But I can't purge that if I don't access it - > >> Infinispan needs to handle that internally. > >> > >> Radim > >> > >> [1] https://hibernate.atlassian.net/browse/HHH-6219 > >> > >> On 07/14/2015 05:45 PM, Dennis Reed wrote: > >> > On 07/14/2015 11:08 AM, Radim Vansa wrote: > >> >> On 07/14/2015 04:19 PM, William Burns wrote: > >> >>> > >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns >> >>> > wrote: > >> >>> > >> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > >> >>> > > wrote: > >> >>> > >> >>> Processing expiration only on the reaper thread sounds > nice, > >> >>> but I > >> >>> have one reservation: processing 1 million entries to see > >> >>> that > >> >>> 1 of > >> >>> them is expired is a lot of work, and in the general case > we > >> >>> will not > >> >>> be able to ensure an expiration precision of less than 1 > >> >>> minute (maybe > >> >>> more, with a huge SingleFileStore attached). > >> >>> > >> >>> > >> >>> This isn't much different then before. The only difference is > >> >>> that if a user touched a value after it expired it wouldn't > show > >> >>> up (which is unlikely with maxIdle especially). 
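
As background to the lifespan versus maxIdle distinction debated in the quoted text: both can already be set per cache or per entry through the embedded API. The snippet below is only meant as orientation, using what I believe are long-standing calls (ConfigurationBuilder.expiration() and the timed put() overloads), and is not part of the proposal itself.

    import java.util.concurrent.TimeUnit;
    import org.infinispan.Cache;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.manager.DefaultCacheManager;

    public class ExpirationSettings {
       public static void main(String[] args) {
          ConfigurationBuilder cfg = new ConfigurationBuilder();
          // Cache-wide defaults: entries live at most 10 minutes (lifespan)
          // and are also dropped after 2 minutes without a read (maxIdle).
          cfg.expiration().lifespan(10, TimeUnit.MINUTES).maxIdle(2, TimeUnit.MINUTES);

          DefaultCacheManager cm = new DefaultCacheManager(cfg.build());
          try {
             Cache<String, String> cache = cm.getCache();

             // Per-entry override: 30 second lifespan, default maxIdle.
             cache.put("token", "abc", 30, TimeUnit.SECONDS);

             // Per-entry lifespan and maxIdle together.
             cache.put("session", "data", 30, TimeUnit.MINUTES, 5, TimeUnit.MINUTES);
          } finally {
             cm.stop();
          }
       }
    }

Per the summary above, the proposal only changes how lifespan expiration is coordinated across owners; maxIdle keeps its current, per-node behaviour in clustered caches.
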
> >> >>> > >> >>> > >> >>> What happens to users who need better precision? In > >> >>> particular, I know > >> >>> some JCache tests were failing because HotRod was only > >> >>> supporting > >> >>> 1-second resolution instead of the 1-millisecond > resolution > >> >>> they were > >> >>> expecting. > >> >>> > >> >>> > >> >>> JCache is an interesting piece. The thing about JCache is > that > >> >>> the spec is only defined for local caches. However I wouldn't > >> >>> want to muddy up the waters in regards to it behaving > >> >>> differently > >> >>> for local/remote. In the JCache scenario we could add an > >> >>> interceptor to prevent it returning such values (we do > something > >> >>> similar already for events). JCache behavior vs ISPN behavior > >> >>> seems a bit easier to differentiate. But like you are getting > >> >>> at, > >> >>> either way is not very appealing. > >> >>> > >> >>> > >> >>> > >> >>> I'm even less convinced about the need to guarantee that a > >> >>> clustered > >> >>> expiration listener will only be triggered once, and that > >> >>> the > >> >>> entry > >> >>> must be null everywhere after that listener was invoked. > >> >>> What's the > >> >>> use case? > >> >>> > >> >>> > >> >>> Maybe Tristan would know more to answer. To be honest this > work > >> >>> seems fruitless unless we know what our end users want here. > >> >>> Spending time on something for it to thrown out is never fun > :( > >> >>> > >> >>> And the more I thought about this the more I question the > >> >>> validity > >> >>> of maxIdle even. It seems like a very poor way to prevent > >> >>> memory > >> >>> exhaustion, which eviction does in a much better way and has > >> >>> much > >> >>> more flexible algorithms. Does anyone know what maxIdle would > >> >>> be > >> >>> used for that wouldn't be covered by eviction? The only > thing I > >> >>> can think of is cleaning up the cache store as well. > >> >>> > >> >>> > >> >>> Actually I guess for session/authentication related information this > >> >>> would be important. However maxIdle isn't really as usable in that > >> >>> case since most likely you would have a sticky session to go back to > >> >>> that node which means you would never refresh the last used date on > >> >>> the copies (current implementation). Without cluster expiration you > >> >>> could lose that session information on a failover very easily. > >> >> I would say that maxIdle can be used as for memory management as kind > >> >> of > >> >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some record > for > >> >> a > >> >> short while (regular transaction lifespan ~ seconds to minutes), and > >> >> regularly the record is removed. However, to make sure that we don't > >> >> leak records in this cache (if something goes wrong and the remove > does > >> >> not occur), it is removed. > >> > Note that just relying on maxIdle doesn't guarantee you won't leak > >> > records in this use case (specifically with the way the current > >> > hibernate-infinispan 2LC implementation uses it). > >> > > >> > Hibernate-infinispan adds entries to its own Map stored in Infinispan, > >> > and expects maxIdle to remove the map if it skips a remove. But in a > >> > current case, we found that due to frequent accesses to that same map > >> > the entries never idle out and it ends up in OOME). > >> > > >> > -Dennis > >> > > >> >> I can guess how long the transaction takes place, but not how many > >> >> parallel transactions there are. 
With eviction algorithms (where I am > >> >> not sure about the exact guarantees) I can set the cache to not hold > >> >> more than N entries, but I can't know for sure that my record does > not > >> >> suddenly get evicted after shorter period, possibly causing some > >> >> inconsistency. > >> >> So this is similar to WeakHashMap by removing the key "when it can't > be > >> >> used anymore" because I know that the transaction will finish before > >> >> the > >> >> deadline. I don't care about the exact size, I don't want to tune > that, > >> >> I just don't want to leak. > >> >> > >> >> From my POV the non-strict maxIdle and strict expiration would be > a > >> >> nice compromise. > >> >> > >> >> Radim > >> >> > >> >>> Note that this would make the reaper thread less > efficient: > >> >>> with > >> >>> numOwners=2 (best case), half of the entries that the > reaper > >> >>> touches > >> >>> cannot be expired, because the node isn't the primary > node. > >> >>> And to > >> >>> make matters worse, the same reaper thread would have to > >> >>> perform a > >> >>> (synchronous?) RPC for each entry to ensure it expires > >> >>> everywhere. > >> >>> > >> >>> > >> >>> I have debated about this, it could something like a sync > >> >>> removeAll which has a special marker to tell it is due to > >> >>> expiration (which would raise listeners there), while also > >> >>> sending > >> >>> a cluster expiration event to other non owners. > >> >>> > >> >>> > >> >>> For maxIdle I'd like to know more information about how > >> >>> exactly the > >> >>> owners would coordinate to expire an entry. I'm pretty > sure > >> >>> we > >> >>> cannot > >> >>> avoid ignoring some reads (expiring an entry immediately > >> >>> after > >> >>> it was > >> >>> read), and ensuring that we don't accidentally extend an > >> >>> entry's life > >> >>> (like the current code does, when we transfer an entry to > a > >> >>> new owner) > >> >>> also sounds problematic. > >> >>> > >> >>> > >> >>> For lifespan it is simple, the primary owner just expires it > >> >>> when > >> >>> it expires there. There is no coordination needed in this > case > >> >>> it > >> >>> just sends the expired remove to owners etc. > >> >>> > >> >>> Max idle is more complicated as we all know. The primary > owner > >> >>> would send a request for the last used time for a given key or > >> >>> set > >> >>> of keys. Then the owner would take those times and check for > a > >> >>> new access it isn't aware of. If there isn't then it would > send > >> >>> a > >> >>> remove command for the key(s). If there is a new access the > >> >>> owner > >> >>> would instead send the last used time to all of the owners. > The > >> >>> expiration obviously would have a window that if a read > occurred > >> >>> after sending a response that could be ignored. This could be > >> >>> resolved by using some sort of 2PC and blocking reads during > >> >>> that > >> >>> period but I would say it isn't worth it. > >> >>> > >> >>> The issue with transferring to a new node refreshing the last > >> >>> update/lifespan seems like just a bug we need to fix > >> >>> irrespective > >> >>> of this issue IMO. > >> >>> > >> >>> > >> >>> I'm not saying expiring entries on each node independently > >> >>> is > >> >>> perfect, > >> >>> far from it. But I wouldn't want us to provide new > >> >>> guarantees that > >> >>> could hurt performance without a really good use case. > >> >>> > >> >>> > >> >>> I would guess that user perceived performance should be a > little > >> >>> faster with this. 
But this also depends on an alternative > that > >> >>> we > >> >>> decided on :) > >> >>> > >> >>> Also the expiration thread pool is set to min priority atm so > it > >> >>> may delay removal of said objects but hopefully (if the jvm > >> >>> supports) it wouldn't overrun a CPU while processing unless it > >> >>> has > >> >>> availability. > >> >>> > >> >>> > >> >>> Cheers > >> >>> Dan > >> >>> > >> >>> > >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > >> >>> > wrote: > >> >>> > After re-reading the whole original thread, I agree with > >> >>> the > >> >>> proposal > >> >>> > with two caveats: > >> >>> > > >> >>> > - ensure that we don't break JCache compatibility > >> >>> > - ensure that we document this properly > >> >>> > > >> >>> > Tristan > >> >>> > > >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: > >> >>> >> +1 > >> >>> >> You had me convinced at the first line, although "A lot > >> >>> of > >> >>> code can now > >> >>> >> be removed and made simpler" makes it look extremely > >> >>> nice. > >> >>> >> > >> >>> >> On 13 Jul 2015 18:14, "William Burns" > >> >>> >> >>> > >> >>> >> >> > >> >>> >> wrote: > >> >>> >> > >> >>> >> This is a necro of [1]. > >> >>> >> > >> >>> >> With Infinispan 8.0 we are adding in clustered > >> >>> expiration. That > >> >>> >> includes an expiration event raised that is > clustered > >> >>> as well. > >> >>> >> Unfortunately expiration events currently occur > >> >>> multiple times (if > >> >>> >> numOwners > 1) at different times across nodes in a > >> >>> cluster. This > >> >>> >> makes coordinating a single cluster expiration > event > >> >>> quite difficult. > >> >>> >> > >> >>> >> To work around this I am proposing that the > >> >>> expiration > >> >>> of an event > >> >>> >> is done solely by the owner of the given key that > is > >> >>> now expired. > >> >>> >> This would fix the issue of having multiple events > >> >>> and > >> >>> the event can > >> >>> >> be raised while holding the lock for the given key > so > >> >>> concurrent > >> >>> >> modifications would not be an issue. > >> >>> >> > >> >>> >> The problem arises when you have other nodes that > >> >>> have > >> >>> expiration > >> >>> >> set but expire at different times. Max idle is the > >> >>> biggest offender > >> >>> >> with this as a read on an owner only refreshes the > >> >>> owners timestamp, > >> >>> >> meaning other owners would not be updated and > expire > >> >>> preemptively. > >> >>> >> To have expiration work properly in this case you > >> >>> would > >> >>> need > >> >>> >> coordination between the owners to see if anyone > has > >> >>> a > >> >>> higher > >> >>> >> value. This requires blocking and would have to be > >> >>> done while > >> >>> >> accessing a key that is expired to be sure if > >> >>> expiration happened or > >> >>> >> not. > >> >>> >> > >> >>> >> The linked dev listing proposed instead to only > >> >>> expire > >> >>> an entry by > >> >>> >> the reaper thread and not on access. In this case > a > >> >>> read will > >> >>> >> return a non null value until it is fully expired, > >> >>> increasing hit > >> >>> >> ratios possibly. > >> >>> >> > >> >>> >> Their are quire a bit of real benefits for this: > >> >>> >> > >> >>> >> 1. Cluster cache reads would be much simpler and > >> >>> wouldn't have to > >> >>> >> block to verify the object exists or not since this > >> >>> would only be > >> >>> >> done by the reaper thread (note this would have > only > >> >>> happened if the > >> >>> >> entry was expired locally). 
An access would just > >> >>> return the value > >> >>> >> immediately. > >> >>> >> 2. Each node only expires entries it owns in the > >> >>> reaper > >> >>> thread > >> >>> >> reducing how many entries they must check or > remove. > >> >>> This also > >> >>> >> provides a single point where events would be > raised > >> >>> as > >> >>> we need. > >> >>> >> 3. A lot of code can now be removed and made > simpler > >> >>> as > >> >>> it no longer > >> >>> >> has to check for expiration. The expiration check > >> >>> would only be > >> >>> >> done in 1 place, the expiration reaper thread. > >> >>> >> > >> >>> >> The main issue with this proposal is as the other > >> >>> listing mentions > >> >>> >> is if user code expects the value to be gone after > >> >>> expiration for > >> >>> >> correctness. I would say this use case is not as > >> >>> compelling for > >> >>> >> maxIdle, especially since we never supported it > >> >>> properly. And in > >> >>> >> the case of lifespan the user could very easily > store > >> >>> the expiration > >> >>> >> time in the object that they can check after a get > as > >> >>> pointed out in > >> >>> >> the other thread. > >> >>> >> > >> >>> >> [1] > >> >>> >> > >> >>> > >> >>> > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > >> >>> >> > >> >>> >> _______________________________________________ > >> >>> >> infinispan-dev mailing list > >> >>> >> infinispan-dev at lists.jboss.org > >> >>> > >> >>> >> >>> > > >> >>> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> _______________________________________________ > >> >>> >> infinispan-dev mailing list > >> >>> >> infinispan-dev at lists.jboss.org > >> >>> > >> >>> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> >> > >> >>> > > >> >>> > -- > >> >>> > Tristan Tarrant > >> >>> > Infinispan Lead > >> >>> > JBoss, a division of Red Hat > >> >>> > _______________________________________________ > >> >>> > infinispan-dev mailing list > >> >>> > infinispan-dev at lists.jboss.org > >> >>> > >> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> _______________________________________________ > >> >>> infinispan-dev mailing list > >> >>> infinispan-dev at lists.jboss.org > >> >>> > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> infinispan-dev mailing list > >> >>> infinispan-dev at lists.jboss.org > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >> > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> -- > >> Radim Vansa > >> JBoss Performance Team > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150722/6a8ecb47/attachment-0001.html From mudokonman at gmail.com Thu Jul 23 08:37:25 2015 From: mudokonman at gmail.com (William Burns) Date: Thu, 23 Jul 2015 12:37:25 +0000 Subject: [infinispan-dev] Fwd: Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> Message-ID: I actually found another hiccup with cache stores. It seems that currently we only allow for a callback when an entry is expired from a cache store by the reaper thread [1]. However, we don't allow for such a callback on a read which finds an expired entry and wants to remove it [2]. Interestingly, our cache stores in general don't even expire entries on load, with the few exceptions below: 1. SingleCacheStore returns true for an expired entry on contains 2. SingleCacheStore removes expired entries on load 3. RemoteStore does not need to worry about expiration since it is handled by another remote server. All of the other stores I have looked at properly return false for expired entries and only purge elements from within the reaper thread. I propose we change SingleCacheStore to behave as the other cache stores. This doesn't require any API changes. We would then rely on the store expiring elements only during the reaper thread run or when the element expires in memory. We should also guarantee that the reaper thread is enabled when a cache store is used (throw an exception at init if a store is present and the reaper is not enabled). Should I worry about the case when only a RemoteStore is used (this seems a bit fragile)? To be honest we would need to revamp the CacheLoader/Writer API at a later point anyway, to allow for values to be optionally provided on expiration, so I would say to do that in addition to allowing loaders/stores to expire on access. [1] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/AdvancedCacheWriter.java#L29 [2] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/CacheLoader.java#L34 ---------- Forwarded message --------- From: William Burns Date: Wed, Jul 22, 2015 at 11:06 AM Subject: Re: [infinispan-dev] Strict Expiration To: infinispan -Dev List On Wed, Jul 22, 2015 at 10:53 AM Dan Berindei wrote: > Is it possible/feasible to skip the notification from the backups to > the primary (and back) when there is no clustered expiration listener? > Unfortunately there is no way to distinguish whether a listener is for create, modify, remove or expiration events. So this would only work if there are no clustered listeners. This however should be feasible. This shouldn't be hard to add. The only thing I would have to figure out is what happens in the case of a rehash, where the node that removed the value is now the primary owner, some nodes still have the old value, and someone registers an expiration listener. I am thinking I should only raise the event if the primary owner still has the value. > > Dan > > > On Tue, Jul 21, 2015 at 5:25 PM, William Burns > wrote: > > So I wanted to sum up what it looks like the plan is for this in regards > to > > cluster expiration for ISPN 8. > > > > First off to not make it ambiguous, maxIdle being used with a clustered > > cache will provide undefined and unsupported behavior. This can and will > > expire entries on a single node without notifying other cluster members > > (essentially it will operate as it does today unchanged). 
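(Coming back to the loader/writer point above: purely as an illustrative sketch - the interface and method names below are made up for the example and are not the current persistence SPI - the kind of expiration callback I have in mind would look roughly like this:)

// Made-up names for illustration only; this is NOT the current
// org.infinispan.persistence.spi API, just the shape of callback being proposed.
public interface ExpirationAwareLoader<K, V> {

    // Load the entry for the given key. If the stored entry is already expired,
    // the loader removes it and notifies the callback instead of returning it,
    // so the caller can raise a single expiration event carrying key and value.
    StoredEntry<K, V> load(K key, ExpirationCallback<K, V> callback);

    interface ExpirationCallback<K, V> {
        // Unlike today's purge callback, this carries the value as well.
        void entryExpired(K key, V value);
    }

    // Minimal holder so the sketch stands alone.
    final class StoredEntry<K, V> {
        public final K key;
        public final V value;
        public StoredEntry(K key, V value) { this.key = key; this.value = value; }
    }
}
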
> > > > This leaves me to talk solely about lifespan cluster expiration. > > > > Lifespan Expiration events are fired by the primary owner of an expired > key > > > > - when accessing an expired entry. > > > > - by the reaper thread. > > > > If the expiration is detected by a node other than the primary owner, an > > expiration command is sent to it and null is returned immediately not > > waiting for a response. > > > > Expiration event listeners follow the usual rules for sync/async: in the > > case of a sync listener, the handler is invoked while holding the lock, > > whereas an async listener will not hold locks. > > > > It is desirable for expiration events to contain both the key and value. > > However currently cache stores do not provide the value when they expire > > values. Thus we can only guarantee the value is present when an in > memory > > expiration event occurs. We could plan on adding this later. > > > > Also as you may have guessed this doesn't touch strict expiration, which > I > > think we have come to the conclusion should only work with maxIdle and as > > such this is not explored with this iteration. > > > > Let me know if you guys think this approach is okay. > > > > Cheers, > > > > - Will > > > > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa wrote: > >> > >> Yes, I know about [1]. I've worked that around by storing timestamp in > >> the entry as well and when a new record is added, the 'expired' > >> invalidations are purged. But I can't purge that if I don't access it - > >> Infinispan needs to handle that internally. > >> > >> Radim > >> > >> [1] https://hibernate.atlassian.net/browse/HHH-6219 > >> > >> On 07/14/2015 05:45 PM, Dennis Reed wrote: > >> > On 07/14/2015 11:08 AM, Radim Vansa wrote: > >> >> On 07/14/2015 04:19 PM, William Burns wrote: > >> >>> > >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns >> >>> > wrote: > >> >>> > >> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > >> >>> > > wrote: > >> >>> > >> >>> Processing expiration only on the reaper thread sounds > nice, > >> >>> but I > >> >>> have one reservation: processing 1 million entries to see > >> >>> that > >> >>> 1 of > >> >>> them is expired is a lot of work, and in the general case > we > >> >>> will not > >> >>> be able to ensure an expiration precision of less than 1 > >> >>> minute (maybe > >> >>> more, with a huge SingleFileStore attached). > >> >>> > >> >>> > >> >>> This isn't much different then before. The only difference is > >> >>> that if a user touched a value after it expired it wouldn't > show > >> >>> up (which is unlikely with maxIdle especially). > >> >>> > >> >>> > >> >>> What happens to users who need better precision? In > >> >>> particular, I know > >> >>> some JCache tests were failing because HotRod was only > >> >>> supporting > >> >>> 1-second resolution instead of the 1-millisecond > resolution > >> >>> they were > >> >>> expecting. > >> >>> > >> >>> > >> >>> JCache is an interesting piece. The thing about JCache is > that > >> >>> the spec is only defined for local caches. However I wouldn't > >> >>> want to muddy up the waters in regards to it behaving > >> >>> differently > >> >>> for local/remote. In the JCache scenario we could add an > >> >>> interceptor to prevent it returning such values (we do > something > >> >>> similar already for events). JCache behavior vs ISPN behavior > >> >>> seems a bit easier to differentiate. But like you are getting > >> >>> at, > >> >>> either way is not very appealing. 
> >> >>> > >> >>> > >> >>> > >> >>> I'm even less convinced about the need to guarantee that a > >> >>> clustered > >> >>> expiration listener will only be triggered once, and that > >> >>> the > >> >>> entry > >> >>> must be null everywhere after that listener was invoked. > >> >>> What's the > >> >>> use case? > >> >>> > >> >>> > >> >>> Maybe Tristan would know more to answer. To be honest this > work > >> >>> seems fruitless unless we know what our end users want here. > >> >>> Spending time on something for it to thrown out is never fun > :( > >> >>> > >> >>> And the more I thought about this the more I question the > >> >>> validity > >> >>> of maxIdle even. It seems like a very poor way to prevent > >> >>> memory > >> >>> exhaustion, which eviction does in a much better way and has > >> >>> much > >> >>> more flexible algorithms. Does anyone know what maxIdle would > >> >>> be > >> >>> used for that wouldn't be covered by eviction? The only > thing I > >> >>> can think of is cleaning up the cache store as well. > >> >>> > >> >>> > >> >>> Actually I guess for session/authentication related information this > >> >>> would be important. However maxIdle isn't really as usable in that > >> >>> case since most likely you would have a sticky session to go back to > >> >>> that node which means you would never refresh the last used date on > >> >>> the copies (current implementation). Without cluster expiration you > >> >>> could lose that session information on a failover very easily. > >> >> I would say that maxIdle can be used as for memory management as kind > >> >> of > >> >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some record > for > >> >> a > >> >> short while (regular transaction lifespan ~ seconds to minutes), and > >> >> regularly the record is removed. However, to make sure that we don't > >> >> leak records in this cache (if something goes wrong and the remove > does > >> >> not occur), it is removed. > >> > Note that just relying on maxIdle doesn't guarantee you won't leak > >> > records in this use case (specifically with the way the current > >> > hibernate-infinispan 2LC implementation uses it). > >> > > >> > Hibernate-infinispan adds entries to its own Map stored in Infinispan, > >> > and expects maxIdle to remove the map if it skips a remove. But in a > >> > current case, we found that due to frequent accesses to that same map > >> > the entries never idle out and it ends up in OOME). > >> > > >> > -Dennis > >> > > >> >> I can guess how long the transaction takes place, but not how many > >> >> parallel transactions there are. With eviction algorithms (where I am > >> >> not sure about the exact guarantees) I can set the cache to not hold > >> >> more than N entries, but I can't know for sure that my record does > not > >> >> suddenly get evicted after shorter period, possibly causing some > >> >> inconsistency. > >> >> So this is similar to WeakHashMap by removing the key "when it can't > be > >> >> used anymore" because I know that the transaction will finish before > >> >> the > >> >> deadline. I don't care about the exact size, I don't want to tune > that, > >> >> I just don't want to leak. > >> >> > >> >> From my POV the non-strict maxIdle and strict expiration would be > a > >> >> nice compromise. 
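(For reference, the WeakHashMap-style usage described above boils down to a per-entry idle timeout used as a leak guard; roughly like the following, using the existing per-entry overload of put - the 30 minute timeout is only an example value:)

import java.util.concurrent.TimeUnit;
import org.infinispan.Cache;

// Example only: a record kept alive while it is being touched, but guaranteed
// to disappear eventually even if the explicit remove is skipped.
public class PendingPutGuard {
    public static void recordPendingPut(Cache<String, String> cache, String key, String txId) {
        // Negative lifespan means no hard lifespan limit; the 30 minute maxIdle
        // acts purely as the leak guard (both values are illustrative).
        cache.put(key, txId, -1, TimeUnit.SECONDS, 30, TimeUnit.MINUTES);
    }
}
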
> >> >> > >> >> Radim > >> >> > >> >>> Note that this would make the reaper thread less > efficient: > >> >>> with > >> >>> numOwners=2 (best case), half of the entries that the > reaper > >> >>> touches > >> >>> cannot be expired, because the node isn't the primary > node. > >> >>> And to > >> >>> make matters worse, the same reaper thread would have to > >> >>> perform a > >> >>> (synchronous?) RPC for each entry to ensure it expires > >> >>> everywhere. > >> >>> > >> >>> > >> >>> I have debated about this, it could something like a sync > >> >>> removeAll which has a special marker to tell it is due to > >> >>> expiration (which would raise listeners there), while also > >> >>> sending > >> >>> a cluster expiration event to other non owners. > >> >>> > >> >>> > >> >>> For maxIdle I'd like to know more information about how > >> >>> exactly the > >> >>> owners would coordinate to expire an entry. I'm pretty > sure > >> >>> we > >> >>> cannot > >> >>> avoid ignoring some reads (expiring an entry immediately > >> >>> after > >> >>> it was > >> >>> read), and ensuring that we don't accidentally extend an > >> >>> entry's life > >> >>> (like the current code does, when we transfer an entry to > a > >> >>> new owner) > >> >>> also sounds problematic. > >> >>> > >> >>> > >> >>> For lifespan it is simple, the primary owner just expires it > >> >>> when > >> >>> it expires there. There is no coordination needed in this > case > >> >>> it > >> >>> just sends the expired remove to owners etc. > >> >>> > >> >>> Max idle is more complicated as we all know. The primary > owner > >> >>> would send a request for the last used time for a given key or > >> >>> set > >> >>> of keys. Then the owner would take those times and check for > a > >> >>> new access it isn't aware of. If there isn't then it would > send > >> >>> a > >> >>> remove command for the key(s). If there is a new access the > >> >>> owner > >> >>> would instead send the last used time to all of the owners. > The > >> >>> expiration obviously would have a window that if a read > occurred > >> >>> after sending a response that could be ignored. This could be > >> >>> resolved by using some sort of 2PC and blocking reads during > >> >>> that > >> >>> period but I would say it isn't worth it. > >> >>> > >> >>> The issue with transferring to a new node refreshing the last > >> >>> update/lifespan seems like just a bug we need to fix > >> >>> irrespective > >> >>> of this issue IMO. > >> >>> > >> >>> > >> >>> I'm not saying expiring entries on each node independently > >> >>> is > >> >>> perfect, > >> >>> far from it. But I wouldn't want us to provide new > >> >>> guarantees that > >> >>> could hurt performance without a really good use case. > >> >>> > >> >>> > >> >>> I would guess that user perceived performance should be a > little > >> >>> faster with this. But this also depends on an alternative > that > >> >>> we > >> >>> decided on :) > >> >>> > >> >>> Also the expiration thread pool is set to min priority atm so > it > >> >>> may delay removal of said objects but hopefully (if the jvm > >> >>> supports) it wouldn't overrun a CPU while processing unless it > >> >>> has > >> >>> availability. 
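(For context, the reaper discussed here is the periodic expiration task already controlled through the expiration configuration - roughly as below; this assumes the current ConfigurationBuilder API, so the exact method names should be double-checked against the version in use:)

import java.util.concurrent.TimeUnit;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

// Sketch: default lifespan of 10 minutes, reaper waking up once a minute.
public class ReaperConfigExample {
    public static Configuration expiringCache() {
        return new ConfigurationBuilder()
                .expiration()
                    .lifespan(10, TimeUnit.MINUTES)
                    .wakeUpInterval(1, TimeUnit.MINUTES)
                .build();
    }
}
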
> >> >>> > >> >>> > >> >>> Cheers > >> >>> Dan > >> >>> > >> >>> > >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > >> >>> > wrote: > >> >>> > After re-reading the whole original thread, I agree with > >> >>> the > >> >>> proposal > >> >>> > with two caveats: > >> >>> > > >> >>> > - ensure that we don't break JCache compatibility > >> >>> > - ensure that we document this properly > >> >>> > > >> >>> > Tristan > >> >>> > > >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: > >> >>> >> +1 > >> >>> >> You had me convinced at the first line, although "A lot > >> >>> of > >> >>> code can now > >> >>> >> be removed and made simpler" makes it look extremely > >> >>> nice. > >> >>> >> > >> >>> >> On 13 Jul 2015 18:14, "William Burns" > >> >>> >> >>> > >> >>> >> >> > >> >>> >> wrote: > >> >>> >> > >> >>> >> This is a necro of [1]. > >> >>> >> > >> >>> >> With Infinispan 8.0 we are adding in clustered > >> >>> expiration. That > >> >>> >> includes an expiration event raised that is > clustered > >> >>> as well. > >> >>> >> Unfortunately expiration events currently occur > >> >>> multiple times (if > >> >>> >> numOwners > 1) at different times across nodes in a > >> >>> cluster. This > >> >>> >> makes coordinating a single cluster expiration > event > >> >>> quite difficult. > >> >>> >> > >> >>> >> To work around this I am proposing that the > >> >>> expiration > >> >>> of an event > >> >>> >> is done solely by the owner of the given key that > is > >> >>> now expired. > >> >>> >> This would fix the issue of having multiple events > >> >>> and > >> >>> the event can > >> >>> >> be raised while holding the lock for the given key > so > >> >>> concurrent > >> >>> >> modifications would not be an issue. > >> >>> >> > >> >>> >> The problem arises when you have other nodes that > >> >>> have > >> >>> expiration > >> >>> >> set but expire at different times. Max idle is the > >> >>> biggest offender > >> >>> >> with this as a read on an owner only refreshes the > >> >>> owners timestamp, > >> >>> >> meaning other owners would not be updated and > expire > >> >>> preemptively. > >> >>> >> To have expiration work properly in this case you > >> >>> would > >> >>> need > >> >>> >> coordination between the owners to see if anyone > has > >> >>> a > >> >>> higher > >> >>> >> value. This requires blocking and would have to be > >> >>> done while > >> >>> >> accessing a key that is expired to be sure if > >> >>> expiration happened or > >> >>> >> not. > >> >>> >> > >> >>> >> The linked dev listing proposed instead to only > >> >>> expire > >> >>> an entry by > >> >>> >> the reaper thread and not on access. In this case > a > >> >>> read will > >> >>> >> return a non null value until it is fully expired, > >> >>> increasing hit > >> >>> >> ratios possibly. > >> >>> >> > >> >>> >> Their are quire a bit of real benefits for this: > >> >>> >> > >> >>> >> 1. Cluster cache reads would be much simpler and > >> >>> wouldn't have to > >> >>> >> block to verify the object exists or not since this > >> >>> would only be > >> >>> >> done by the reaper thread (note this would have > only > >> >>> happened if the > >> >>> >> entry was expired locally). An access would just > >> >>> return the value > >> >>> >> immediately. > >> >>> >> 2. Each node only expires entries it owns in the > >> >>> reaper > >> >>> thread > >> >>> >> reducing how many entries they must check or > remove. > >> >>> This also > >> >>> >> provides a single point where events would be > raised > >> >>> as > >> >>> we need. > >> >>> >> 3. 
A lot of code can now be removed and made > simpler > >> >>> as > >> >>> it no longer > >> >>> >> has to check for expiration. The expiration check > >> >>> would only be > >> >>> >> done in 1 place, the expiration reaper thread. > >> >>> >> > >> >>> >> The main issue with this proposal is as the other > >> >>> listing mentions > >> >>> >> is if user code expects the value to be gone after > >> >>> expiration for > >> >>> >> correctness. I would say this use case is not as > >> >>> compelling for > >> >>> >> maxIdle, especially since we never supported it > >> >>> properly. And in > >> >>> >> the case of lifespan the user could very easily > store > >> >>> the expiration > >> >>> >> time in the object that they can check after a get > as > >> >>> pointed out in > >> >>> >> the other thread. > >> >>> >> > >> >>> >> [1] > >> >>> >> > >> >>> > >> >>> > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > >> >>> >> > >> >>> >> _______________________________________________ > >> >>> >> infinispan-dev mailing list > >> >>> >> infinispan-dev at lists.jboss.org > >> >>> > >> >>> >> >>> > > >> >>> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> _______________________________________________ > >> >>> >> infinispan-dev mailing list > >> >>> >> infinispan-dev at lists.jboss.org > >> >>> > >> >>> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> >> > >> >>> > > >> >>> > -- > >> >>> > Tristan Tarrant > >> >>> > Infinispan Lead > >> >>> > JBoss, a division of Red Hat > >> >>> > _______________________________________________ > >> >>> > infinispan-dev mailing list > >> >>> > infinispan-dev at lists.jboss.org > >> >>> > >> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> _______________________________________________ > >> >>> infinispan-dev mailing list > >> >>> infinispan-dev at lists.jboss.org > >> >>> > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> infinispan-dev mailing list > >> >>> infinispan-dev at lists.jboss.org > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >> > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> -- > >> Radim Vansa > >> JBoss Performance Team > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150723/0f68b5f7/attachment-0001.html From dan.berindei at gmail.com Thu Jul 23 09:59:58 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Thu, 23 Jul 2015 16:59:58 +0300 Subject: [infinispan-dev] Fwd: Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> Message-ID: On Thu, Jul 23, 2015 at 3:37 PM, William Burns wrote: > I actually found another hiccup with cache stores. It seems currently we > only allow for a callback when an entry is expired from a cache store when > using the reaper thread [1]. However we don't allow for such a callback on > a read which finds an expired entry and wants to remove it [2]. > > Interestingly our cache stores in general don't even expire entries on load > with the few exceptions below: > > 1. SingleCacheStore returns true for an expired entry on contains > 2. SingleCacheStore removes expired entries on load > 3. RemoteStore does not need to worry about expiration since it is handled > by another remote server. > > Of all of the other stores I have looked at they return false properly for > expired entries and only purge elements from within reaper thread. > > I propose we change SingleCacheStore to behave as the other cache stores. > This doesn't require any API changes. We would then rely on store expiring > elements only during reaper thread or if the element expires in memory. We > should also guarantee that when a cache store is used that the reaper thread > is enabled (throw exception if not enabled and store is present at init). > Should I worry about when only a RemoteStore is used (this seems a bit > fragile)? +1, I wouldn't add a special case for RemoteStore. > > To be honest we would need to revamp the CacheLoader/Writer API at a later > point to allow for values to be optionally provided for expiration anyways, > so I would say to do that in addition to allowing loader/stores to expire on > access. Sounds good. > > [1] > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/AdvancedCacheWriter.java#L29 > [2] > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/CacheLoader.java#L34 > > > ---------- Forwarded message --------- > From: William Burns > Date: Wed, Jul 22, 2015 at 11:06 AM > Subject: Re: [infinispan-dev] Strict Expiration > To: infinispan -Dev List > > > On Wed, Jul 22, 2015 at 10:53 AM Dan Berindei > wrote: >> >> Is it possible/feasible to skip the notification from the backups to >> the primary (and back) when there is no clustered expiration listener? > > > Unfortunately there is no way to distinguish whether or a listener is > create, modify, remove or expiration. So this would only work if there are > no clustered listeners. > > This however should be feasible. This shouldn't be hard to add. It should be good enough. > > The only thing I would have to figure out is what happens in the case of a > rehash and the node that removed the value is now the primary owner and some > nodes have the old value and someone registers an expiration listener. I am > thinking I should only raise the event if the primary owner still has the > value. +1 > >> >> >> Dan >> >> >> On Tue, Jul 21, 2015 at 5:25 PM, William Burns >> wrote: >> > So I wanted to sum up what it looks like the plan is for this in regards >> > to >> > cluster expiration for ISPN 8. 
>> > >> > First off to not make it ambiguous, maxIdle being used with a clustered >> > cache will provide undefined and unsupported behavior. This can and >> > will >> > expire entries on a single node without notifying other cluster members >> > (essentially it will operate as it does today unchanged). >> > >> > This leaves me to talk solely about lifespan cluster expiration. >> > >> > Lifespan Expiration events are fired by the primary owner of an expired >> > key >> > >> > - when accessing an expired entry. >> > >> > - by the reaper thread. >> > >> > If the expiration is detected by a node other than the primary owner, an >> > expiration command is sent to it and null is returned immediately not >> > waiting for a response. >> > >> > Expiration event listeners follow the usual rules for sync/async: in the >> > case of a sync listener, the handler is invoked while holding the lock, >> > whereas an async listener will not hold locks. >> > >> > It is desirable for expiration events to contain both the key and value. >> > However currently cache stores do not provide the value when they expire >> > values. Thus we can only guarantee the value is present when an in >> > memory >> > expiration event occurs. We could plan on adding this later. >> > >> > Also as you may have guessed this doesn't touch strict expiration, which >> > I >> > think we have come to the conclusion should only work with maxIdle and >> > as >> > such this is not explored with this iteration. >> > >> > Let me know if you guys think this approach is okay. >> > >> > Cheers, >> > >> > - Will >> > >> > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa wrote: >> >> >> >> Yes, I know about [1]. I've worked that around by storing timestamp in >> >> the entry as well and when a new record is added, the 'expired' >> >> invalidations are purged. But I can't purge that if I don't access it - >> >> Infinispan needs to handle that internally. >> >> >> >> Radim >> >> >> >> [1] https://hibernate.atlassian.net/browse/HHH-6219 >> >> >> >> On 07/14/2015 05:45 PM, Dennis Reed wrote: >> >> > On 07/14/2015 11:08 AM, Radim Vansa wrote: >> >> >> On 07/14/2015 04:19 PM, William Burns wrote: >> >> >>> >> >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns > >> >>> > wrote: >> >> >>> >> >> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei >> >> >>> > >> >> >>> wrote: >> >> >>> >> >> >>> Processing expiration only on the reaper thread sounds >> >> >>> nice, >> >> >>> but I >> >> >>> have one reservation: processing 1 million entries to see >> >> >>> that >> >> >>> 1 of >> >> >>> them is expired is a lot of work, and in the general case >> >> >>> we >> >> >>> will not >> >> >>> be able to ensure an expiration precision of less than 1 >> >> >>> minute (maybe >> >> >>> more, with a huge SingleFileStore attached). >> >> >>> >> >> >>> >> >> >>> This isn't much different then before. The only difference >> >> >>> is >> >> >>> that if a user touched a value after it expired it wouldn't >> >> >>> show >> >> >>> up (which is unlikely with maxIdle especially). >> >> >>> >> >> >>> >> >> >>> What happens to users who need better precision? In >> >> >>> particular, I know >> >> >>> some JCache tests were failing because HotRod was only >> >> >>> supporting >> >> >>> 1-second resolution instead of the 1-millisecond >> >> >>> resolution >> >> >>> they were >> >> >>> expecting. >> >> >>> >> >> >>> >> >> >>> JCache is an interesting piece. The thing about JCache is >> >> >>> that >> >> >>> the spec is only defined for local caches. 
However I >> >> >>> wouldn't >> >> >>> want to muddy up the waters in regards to it behaving >> >> >>> differently >> >> >>> for local/remote. In the JCache scenario we could add an >> >> >>> interceptor to prevent it returning such values (we do >> >> >>> something >> >> >>> similar already for events). JCache behavior vs ISPN >> >> >>> behavior >> >> >>> seems a bit easier to differentiate. But like you are >> >> >>> getting >> >> >>> at, >> >> >>> either way is not very appealing. >> >> >>> >> >> >>> >> >> >>> >> >> >>> I'm even less convinced about the need to guarantee that >> >> >>> a >> >> >>> clustered >> >> >>> expiration listener will only be triggered once, and that >> >> >>> the >> >> >>> entry >> >> >>> must be null everywhere after that listener was invoked. >> >> >>> What's the >> >> >>> use case? >> >> >>> >> >> >>> >> >> >>> Maybe Tristan would know more to answer. To be honest this >> >> >>> work >> >> >>> seems fruitless unless we know what our end users want here. >> >> >>> Spending time on something for it to thrown out is never fun >> >> >>> :( >> >> >>> >> >> >>> And the more I thought about this the more I question the >> >> >>> validity >> >> >>> of maxIdle even. It seems like a very poor way to prevent >> >> >>> memory >> >> >>> exhaustion, which eviction does in a much better way and has >> >> >>> much >> >> >>> more flexible algorithms. Does anyone know what maxIdle >> >> >>> would >> >> >>> be >> >> >>> used for that wouldn't be covered by eviction? The only >> >> >>> thing I >> >> >>> can think of is cleaning up the cache store as well. >> >> >>> >> >> >>> >> >> >>> Actually I guess for session/authentication related information >> >> >>> this >> >> >>> would be important. However maxIdle isn't really as usable in that >> >> >>> case since most likely you would have a sticky session to go back >> >> >>> to >> >> >>> that node which means you would never refresh the last used date on >> >> >>> the copies (current implementation). Without cluster expiration >> >> >>> you >> >> >>> could lose that session information on a failover very easily. >> >> >> I would say that maxIdle can be used as for memory management as >> >> >> kind >> >> >> of >> >> >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some record >> >> >> for >> >> >> a >> >> >> short while (regular transaction lifespan ~ seconds to minutes), and >> >> >> regularly the record is removed. However, to make sure that we don't >> >> >> leak records in this cache (if something goes wrong and the remove >> >> >> does >> >> >> not occur), it is removed. >> >> > Note that just relying on maxIdle doesn't guarantee you won't leak >> >> > records in this use case (specifically with the way the current >> >> > hibernate-infinispan 2LC implementation uses it). >> >> > >> >> > Hibernate-infinispan adds entries to its own Map stored in >> >> > Infinispan, >> >> > and expects maxIdle to remove the map if it skips a remove. But in a >> >> > current case, we found that due to frequent accesses to that same map >> >> > the entries never idle out and it ends up in OOME). >> >> > >> >> > -Dennis >> >> > >> >> >> I can guess how long the transaction takes place, but not how many >> >> >> parallel transactions there are. 
With eviction algorithms (where I >> >> >> am >> >> >> not sure about the exact guarantees) I can set the cache to not hold >> >> >> more than N entries, but I can't know for sure that my record does >> >> >> not >> >> >> suddenly get evicted after shorter period, possibly causing some >> >> >> inconsistency. >> >> >> So this is similar to WeakHashMap by removing the key "when it can't >> >> >> be >> >> >> used anymore" because I know that the transaction will finish before >> >> >> the >> >> >> deadline. I don't care about the exact size, I don't want to tune >> >> >> that, >> >> >> I just don't want to leak. >> >> >> >> >> >> From my POV the non-strict maxIdle and strict expiration would be >> >> >> a >> >> >> nice compromise. >> >> >> >> >> >> Radim >> >> >> >> >> >>> Note that this would make the reaper thread less >> >> >>> efficient: >> >> >>> with >> >> >>> numOwners=2 (best case), half of the entries that the >> >> >>> reaper >> >> >>> touches >> >> >>> cannot be expired, because the node isn't the primary >> >> >>> node. >> >> >>> And to >> >> >>> make matters worse, the same reaper thread would have to >> >> >>> perform a >> >> >>> (synchronous?) RPC for each entry to ensure it expires >> >> >>> everywhere. >> >> >>> >> >> >>> >> >> >>> I have debated about this, it could something like a sync >> >> >>> removeAll which has a special marker to tell it is due to >> >> >>> expiration (which would raise listeners there), while also >> >> >>> sending >> >> >>> a cluster expiration event to other non owners. >> >> >>> >> >> >>> >> >> >>> For maxIdle I'd like to know more information about how >> >> >>> exactly the >> >> >>> owners would coordinate to expire an entry. I'm pretty >> >> >>> sure >> >> >>> we >> >> >>> cannot >> >> >>> avoid ignoring some reads (expiring an entry immediately >> >> >>> after >> >> >>> it was >> >> >>> read), and ensuring that we don't accidentally extend an >> >> >>> entry's life >> >> >>> (like the current code does, when we transfer an entry to >> >> >>> a >> >> >>> new owner) >> >> >>> also sounds problematic. >> >> >>> >> >> >>> >> >> >>> For lifespan it is simple, the primary owner just expires it >> >> >>> when >> >> >>> it expires there. There is no coordination needed in this >> >> >>> case >> >> >>> it >> >> >>> just sends the expired remove to owners etc. >> >> >>> >> >> >>> Max idle is more complicated as we all know. The primary >> >> >>> owner >> >> >>> would send a request for the last used time for a given key >> >> >>> or >> >> >>> set >> >> >>> of keys. Then the owner would take those times and check for >> >> >>> a >> >> >>> new access it isn't aware of. If there isn't then it would >> >> >>> send >> >> >>> a >> >> >>> remove command for the key(s). If there is a new access the >> >> >>> owner >> >> >>> would instead send the last used time to all of the owners. >> >> >>> The >> >> >>> expiration obviously would have a window that if a read >> >> >>> occurred >> >> >>> after sending a response that could be ignored. This could >> >> >>> be >> >> >>> resolved by using some sort of 2PC and blocking reads during >> >> >>> that >> >> >>> period but I would say it isn't worth it. >> >> >>> >> >> >>> The issue with transferring to a new node refreshing the last >> >> >>> update/lifespan seems like just a bug we need to fix >> >> >>> irrespective >> >> >>> of this issue IMO. >> >> >>> >> >> >>> >> >> >>> I'm not saying expiring entries on each node >> >> >>> independently >> >> >>> is >> >> >>> perfect, >> >> >>> far from it. 
But I wouldn't want us to provide new >> >> >>> guarantees that >> >> >>> could hurt performance without a really good use case. >> >> >>> >> >> >>> >> >> >>> I would guess that user perceived performance should be a >> >> >>> little >> >> >>> faster with this. But this also depends on an alternative >> >> >>> that >> >> >>> we >> >> >>> decided on :) >> >> >>> >> >> >>> Also the expiration thread pool is set to min priority atm so >> >> >>> it >> >> >>> may delay removal of said objects but hopefully (if the jvm >> >> >>> supports) it wouldn't overrun a CPU while processing unless >> >> >>> it >> >> >>> has >> >> >>> availability. >> >> >>> >> >> >>> >> >> >>> Cheers >> >> >>> Dan >> >> >>> >> >> >>> >> >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant >> >> >>> > wrote: >> >> >>> > After re-reading the whole original thread, I agree >> >> >>> with >> >> >>> the >> >> >>> proposal >> >> >>> > with two caveats: >> >> >>> > >> >> >>> > - ensure that we don't break JCache compatibility >> >> >>> > - ensure that we document this properly >> >> >>> > >> >> >>> > Tristan >> >> >>> > >> >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: >> >> >>> >> +1 >> >> >>> >> You had me convinced at the first line, although "A >> >> >>> lot >> >> >>> of >> >> >>> code can now >> >> >>> >> be removed and made simpler" makes it look extremely >> >> >>> nice. >> >> >>> >> >> >> >>> >> On 13 Jul 2015 18:14, "William Burns" >> >> >>> > >> >>> >> >> >>> >> > >> >> >> >>> >> wrote: >> >> >>> >> >> >> >>> >> This is a necro of [1]. >> >> >>> >> >> >> >>> >> With Infinispan 8.0 we are adding in clustered >> >> >>> expiration. That >> >> >>> >> includes an expiration event raised that is >> >> >>> clustered >> >> >>> as well. >> >> >>> >> Unfortunately expiration events currently occur >> >> >>> multiple times (if >> >> >>> >> numOwners > 1) at different times across nodes in >> >> >>> a >> >> >>> cluster. This >> >> >>> >> makes coordinating a single cluster expiration >> >> >>> event >> >> >>> quite difficult. >> >> >>> >> >> >> >>> >> To work around this I am proposing that the >> >> >>> expiration >> >> >>> of an event >> >> >>> >> is done solely by the owner of the given key that >> >> >>> is >> >> >>> now expired. >> >> >>> >> This would fix the issue of having multiple events >> >> >>> and >> >> >>> the event can >> >> >>> >> be raised while holding the lock for the given key >> >> >>> so >> >> >>> concurrent >> >> >>> >> modifications would not be an issue. >> >> >>> >> >> >> >>> >> The problem arises when you have other nodes that >> >> >>> have >> >> >>> expiration >> >> >>> >> set but expire at different times. Max idle is >> >> >>> the >> >> >>> biggest offender >> >> >>> >> with this as a read on an owner only refreshes the >> >> >>> owners timestamp, >> >> >>> >> meaning other owners would not be updated and >> >> >>> expire >> >> >>> preemptively. >> >> >>> >> To have expiration work properly in this case you >> >> >>> would >> >> >>> need >> >> >>> >> coordination between the owners to see if anyone >> >> >>> has >> >> >>> a >> >> >>> higher >> >> >>> >> value. This requires blocking and would have to >> >> >>> be >> >> >>> done while >> >> >>> >> accessing a key that is expired to be sure if >> >> >>> expiration happened or >> >> >>> >> not. >> >> >>> >> >> >> >>> >> The linked dev listing proposed instead to only >> >> >>> expire >> >> >>> an entry by >> >> >>> >> the reaper thread and not on access. 
In this case >> >> >>> a >> >> >>> read will >> >> >>> >> return a non null value until it is fully expired, >> >> >>> increasing hit >> >> >>> >> ratios possibly. >> >> >>> >> >> >> >>> >> Their are quire a bit of real benefits for this: >> >> >>> >> >> >> >>> >> 1. Cluster cache reads would be much simpler and >> >> >>> wouldn't have to >> >> >>> >> block to verify the object exists or not since >> >> >>> this >> >> >>> would only be >> >> >>> >> done by the reaper thread (note this would have >> >> >>> only >> >> >>> happened if the >> >> >>> >> entry was expired locally). An access would just >> >> >>> return the value >> >> >>> >> immediately. >> >> >>> >> 2. Each node only expires entries it owns in the >> >> >>> reaper >> >> >>> thread >> >> >>> >> reducing how many entries they must check or >> >> >>> remove. >> >> >>> This also >> >> >>> >> provides a single point where events would be >> >> >>> raised >> >> >>> as >> >> >>> we need. >> >> >>> >> 3. A lot of code can now be removed and made >> >> >>> simpler >> >> >>> as >> >> >>> it no longer >> >> >>> >> has to check for expiration. The expiration check >> >> >>> would only be >> >> >>> >> done in 1 place, the expiration reaper thread. >> >> >>> >> >> >> >>> >> The main issue with this proposal is as the other >> >> >>> listing mentions >> >> >>> >> is if user code expects the value to be gone after >> >> >>> expiration for >> >> >>> >> correctness. I would say this use case is not as >> >> >>> compelling for >> >> >>> >> maxIdle, especially since we never supported it >> >> >>> properly. And in >> >> >>> >> the case of lifespan the user could very easily >> >> >>> store >> >> >>> the expiration >> >> >>> >> time in the object that they can check after a get >> >> >>> as >> >> >>> pointed out in >> >> >>> >> the other thread. 
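(That application-side guard is simple enough to sketch - a made-up wrapper class, not anything Infinispan itself provides:)

// The writer records its own deadline so a reader can ignore a value that the
// reaper has not physically removed yet.
public final class TimestampedValue<V> {
    private final V value;
    private final long expiresAtMillis;

    public TimestampedValue(V value, long lifespanMillis) {
        this.value = value;
        this.expiresAtMillis = System.currentTimeMillis() + lifespanMillis;
    }

    public V valueOrNull() {
        return System.currentTimeMillis() < expiresAtMillis ? value : null;
    }
}
// Usage after a cache.get(key): treat a wrapper whose valueOrNull() returns null
// exactly as if the entry had already expired.
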
>> >> >>> >> >> >> >>> >> [1] >> >> >>> >> >> >> >>> >> >> >>> >> >> >>> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >> >> >>> >> >> >> >>> >> _______________________________________________ >> >> >>> >> infinispan-dev mailing list >> >> >>> >> infinispan-dev at lists.jboss.org >> >> >>> >> >> >>> > >> >>> > >> >> >>> >> >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> >> >> >> >>> >> >> >> >>> >> >> >> >>> >> _______________________________________________ >> >> >>> >> infinispan-dev mailing list >> >> >>> >> infinispan-dev at lists.jboss.org >> >> >>> >> >> >>> >> >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> >> >> >> >>> > >> >> >>> > -- >> >> >>> > Tristan Tarrant >> >> >>> > Infinispan Lead >> >> >>> > JBoss, a division of Red Hat >> >> >>> > _______________________________________________ >> >> >>> > infinispan-dev mailing list >> >> >>> > infinispan-dev at lists.jboss.org >> >> >>> >> >> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> _______________________________________________ >> >> >>> infinispan-dev mailing list >> >> >>> infinispan-dev at lists.jboss.org >> >> >>> >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> >> >> >>> >> >> >>> >> >> >>> _______________________________________________ >> >> >>> infinispan-dev mailing list >> >> >>> infinispan-dev at lists.jboss.org >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> > _______________________________________________ >> >> > infinispan-dev mailing list >> >> > infinispan-dev at lists.jboss.org >> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> >> -- >> >> Radim Vansa >> >> JBoss Performance Team >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Thu Jul 23 12:54:37 2015 From: rvansa at redhat.com (Radim Vansa) Date: Thu, 23 Jul 2015 18:54:37 +0200 Subject: [infinispan-dev] Fwd: Strict Expiration In-Reply-To: References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> Message-ID: <55B11C4D.3030507@redhat.com> When you're into the stores & expiration: any plans for handling [1]? Radim [1] https://issues.jboss.org/browse/ISPN-3202 On 07/23/2015 02:37 PM, William Burns wrote: > I actually found another hiccup with cache stores. It seems currently > we only allow for a callback when an entry is expired from a cache > store when using the reaper thread [1]. However we don't allow for > such a callback on a read which finds an expired entry and wants to > remove it [2]. > > Interestingly our cache stores in general don't even expire entries on > load with the few exceptions below: > > 1. 
SingleCacheStore returns true for an expired entry on contains > 2. SingleCacheStore removes expired entries on load > 3. RemoteStore does not need to worry about expiration since it is > handled by another remote server. > > Of all of the other stores I have looked at they return false properly > for expired entries and only purge elements from within reaper thread. > > I propose we change SingleCacheStore to behave as the other cache > stores. This doesn't require any API changes. We would then rely on > store expiring elements only during reaper thread or if the element > expires in memory. We should also guarantee that when a cache store is > used that the reaper thread is enabled (throw exception if not enabled > and store is present at init). Should I worry about when only a > RemoteStore is used (this seems a bit fragile)? > > To be honest we would need to revamp the CacheLoader/Writer API at a > later point to allow for values to be optionally provided for > expiration anyways, so I would say to do that in addition to allowing > loader/stores to expire on access. > > [1] > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/AdvancedCacheWriter.java#L29 > > [2] > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/CacheLoader.java#L34 > > ---------- Forwarded message --------- > From: William Burns > > Date: Wed, Jul 22, 2015 at 11:06 AM > Subject: Re: [infinispan-dev] Strict Expiration > To: infinispan -Dev List > > > > On Wed, Jul 22, 2015 at 10:53 AM Dan Berindei > wrote: > > Is it possible/feasible to skip the notification from the backups to > the primary (and back) when there is no clustered expiration listener? > > > Unfortunately there is no way to distinguish whether or a listener is > create, modify, remove or expiration. So this would only work if > there are no clustered listeners. > > This however should be feasible. This shouldn't be hard to add. > > The only thing I would have to figure out is what happens in the case > of a rehash and the node that removed the value is now the primary > owner and some nodes have the old value and someone registers an > expiration listener. I am thinking I should only raise the event if > the primary owner still has the value. > > > Dan > > > On Tue, Jul 21, 2015 at 5:25 PM, William Burns > > wrote: > > So I wanted to sum up what it looks like the plan is for this in > regards to > > cluster expiration for ISPN 8. > > > > First off to not make it ambiguous, maxIdle being used with a > clustered > > cache will provide undefined and unsupported behavior. This can > and will > > expire entries on a single node without notifying other cluster > members > > (essentially it will operate as it does today unchanged). > > > > This leaves me to talk solely about lifespan cluster expiration. > > > > Lifespan Expiration events are fired by the primary owner of an > expired key > > > > - when accessing an expired entry. > > > > - by the reaper thread. > > > > If the expiration is detected by a node other than the primary > owner, an > > expiration command is sent to it and null is returned > immediately not > > waiting for a response. > > > > Expiration event listeners follow the usual rules for > sync/async: in the > > case of a sync listener, the handler is invoked while holding > the lock, > > whereas an async listener will not hold locks. > > > > It is desirable for expiration events to contain both the key > and value. 
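(From the application side, the listener receiving such an event would look roughly like the sketch below; it assumes the clustered expiration event support being planned in this thread, so the exact annotation and event names are tentative:)

import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryExpired;
import org.infinispan.notifications.cachelistener.event.CacheEntryExpiredEvent;

// Tentative sketch of a clustered expiration listener for the behaviour described above.
@Listener(clustered = true)
public class ExpirationLogger {

    @CacheEntryExpired
    public void onExpired(CacheEntryExpiredEvent<String, String> event) {
        // Raised once, by the primary owner; the value may be null when the
        // entry expired from a store that cannot supply it.
        System.out.printf("expired key=%s value=%s%n", event.getKey(), event.getValue());
    }
}
// Registered the usual way: cache.addListener(new ExpirationLogger());
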
> > However currently cache stores do not provide the value when > they expire > > values. Thus we can only guarantee the value is present when an > in memory > > expiration event occurs. We could plan on adding this later. > > > > Also as you may have guessed this doesn't touch strict > expiration, which I > > think we have come to the conclusion should only work with > maxIdle and as > > such this is not explored with this iteration. > > > > Let me know if you guys think this approach is okay. > > > > Cheers, > > > > - Will > > > > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa > wrote: > >> > >> Yes, I know about [1]. I've worked that around by storing > timestamp in > >> the entry as well and when a new record is added, the 'expired' > >> invalidations are purged. But I can't purge that if I don't > access it - > >> Infinispan needs to handle that internally. > >> > >> Radim > >> > >> [1] https://hibernate.atlassian.net/browse/HHH-6219 > >> > >> On 07/14/2015 05:45 PM, Dennis Reed wrote: > >> > On 07/14/2015 11:08 AM, Radim Vansa wrote: > >> >> On 07/14/2015 04:19 PM, William Burns wrote: > >> >>> > >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns > > >> >>> >> wrote: > >> >>> > >> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > >> >>> >> wrote: > >> >>> > >> >>> Processing expiration only on the reaper thread > sounds nice, > >> >>> but I > >> >>> have one reservation: processing 1 million > entries to see > >> >>> that > >> >>> 1 of > >> >>> them is expired is a lot of work, and in the > general case we > >> >>> will not > >> >>> be able to ensure an expiration precision of less > than 1 > >> >>> minute (maybe > >> >>> more, with a huge SingleFileStore attached). > >> >>> > >> >>> > >> >>> This isn't much different then before. The only > difference is > >> >>> that if a user touched a value after it expired it > wouldn't show > >> >>> up (which is unlikely with maxIdle especially). > >> >>> > >> >>> > >> >>> What happens to users who need better precision? In > >> >>> particular, I know > >> >>> some JCache tests were failing because HotRod was > only > >> >>> supporting > >> >>> 1-second resolution instead of the 1-millisecond > resolution > >> >>> they were > >> >>> expecting. > >> >>> > >> >>> > >> >>> JCache is an interesting piece. The thing about > JCache is that > >> >>> the spec is only defined for local caches. However I > wouldn't > >> >>> want to muddy up the waters in regards to it behaving > >> >>> differently > >> >>> for local/remote. In the JCache scenario we could add an > >> >>> interceptor to prevent it returning such values (we > do something > >> >>> similar already for events). JCache behavior vs ISPN > behavior > >> >>> seems a bit easier to differentiate. But like you > are getting > >> >>> at, > >> >>> either way is not very appealing. > >> >>> > >> >>> > >> >>> > >> >>> I'm even less convinced about the need to > guarantee that a > >> >>> clustered > >> >>> expiration listener will only be triggered once, > and that > >> >>> the > >> >>> entry > >> >>> must be null everywhere after that listener was > invoked. > >> >>> What's the > >> >>> use case? > >> >>> > >> >>> > >> >>> Maybe Tristan would know more to answer. To be > honest this work > >> >>> seems fruitless unless we know what our end users > want here. > >> >>> Spending time on something for it to thrown out is > never fun :( > >> >>> > >> >>> And the more I thought about this the more I question the > >> >>> validity > >> >>> of maxIdle even. 
It seems like a very poor way to prevent > >> >>> memory > >> >>> exhaustion, which eviction does in a much better way > and has > >> >>> much > >> >>> more flexible algorithms. Does anyone know what > maxIdle would > >> >>> be > >> >>> used for that wouldn't be covered by eviction? The > only thing I > >> >>> can think of is cleaning up the cache store as well. > >> >>> > >> >>> > >> >>> Actually I guess for session/authentication related > information this > >> >>> would be important. However maxIdle isn't really as usable > in that > >> >>> case since most likely you would have a sticky session to > go back to > >> >>> that node which means you would never refresh the last used > date on > >> >>> the copies (current implementation). Without cluster > expiration you > >> >>> could lose that session information on a failover very easily. > >> >> I would say that maxIdle can be used as for memory > management as kind > >> >> of > >> >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some > record for > >> >> a > >> >> short while (regular transaction lifespan ~ seconds to > minutes), and > >> >> regularly the record is removed. However, to make sure that > we don't > >> >> leak records in this cache (if something goes wrong and the > remove does > >> >> not occur), it is removed. > >> > Note that just relying on maxIdle doesn't guarantee you won't > leak > >> > records in this use case (specifically with the way the current > >> > hibernate-infinispan 2LC implementation uses it). > >> > > >> > Hibernate-infinispan adds entries to its own Map stored in > Infinispan, > >> > and expects maxIdle to remove the map if it skips a remove. > But in a > >> > current case, we found that due to frequent accesses to that > same map > >> > the entries never idle out and it ends up in OOME). > >> > > >> > -Dennis > >> > > >> >> I can guess how long the transaction takes place, but not > how many > >> >> parallel transactions there are. With eviction algorithms > (where I am > >> >> not sure about the exact guarantees) I can set the cache to > not hold > >> >> more than N entries, but I can't know for sure that my > record does not > >> >> suddenly get evicted after shorter period, possibly causing some > >> >> inconsistency. > >> >> So this is similar to WeakHashMap by removing the key "when > it can't be > >> >> used anymore" because I know that the transaction will > finish before > >> >> the > >> >> deadline. I don't care about the exact size, I don't want to > tune that, > >> >> I just don't want to leak. > >> >> > >> >> From my POV the non-strict maxIdle and strict expiration > would be a > >> >> nice compromise. > >> >> > >> >> Radim > >> >> > >> >>> Note that this would make the reaper thread less > efficient: > >> >>> with > >> >>> numOwners=2 (best case), half of the entries that > the reaper > >> >>> touches > >> >>> cannot be expired, because the node isn't the > primary node. > >> >>> And to > >> >>> make matters worse, the same reaper thread would > have to > >> >>> perform a > >> >>> (synchronous?) RPC for each entry to ensure it > expires > >> >>> everywhere. > >> >>> > >> >>> > >> >>> I have debated about this, it could something like a sync > >> >>> removeAll which has a special marker to tell it is due to > >> >>> expiration (which would raise listeners there), while > also > >> >>> sending > >> >>> a cluster expiration event to other non owners. 
> >> >>> > >> >>> > >> >>> For maxIdle I'd like to know more information > about how > >> >>> exactly the > >> >>> owners would coordinate to expire an entry. I'm > pretty sure > >> >>> we > >> >>> cannot > >> >>> avoid ignoring some reads (expiring an entry > immediately > >> >>> after > >> >>> it was > >> >>> read), and ensuring that we don't accidentally > extend an > >> >>> entry's life > >> >>> (like the current code does, when we transfer an > entry to a > >> >>> new owner) > >> >>> also sounds problematic. > >> >>> > >> >>> > >> >>> For lifespan it is simple, the primary owner just > expires it > >> >>> when > >> >>> it expires there. There is no coordination needed in > this case > >> >>> it > >> >>> just sends the expired remove to owners etc. > >> >>> > >> >>> Max idle is more complicated as we all know. The > primary owner > >> >>> would send a request for the last used time for a > given key or > >> >>> set > >> >>> of keys. Then the owner would take those times and > check for a > >> >>> new access it isn't aware of. If there isn't then it > would send > >> >>> a > >> >>> remove command for the key(s). If there is a new > access the > >> >>> owner > >> >>> would instead send the last used time to all of the > owners. The > >> >>> expiration obviously would have a window that if a > read occurred > >> >>> after sending a response that could be ignored. This > could be > >> >>> resolved by using some sort of 2PC and blocking reads > during > >> >>> that > >> >>> period but I would say it isn't worth it. > >> >>> > >> >>> The issue with transferring to a new node refreshing > the last > >> >>> update/lifespan seems like just a bug we need to fix > >> >>> irrespective > >> >>> of this issue IMO. > >> >>> > >> >>> > >> >>> I'm not saying expiring entries on each node > independently > >> >>> is > >> >>> perfect, > >> >>> far from it. But I wouldn't want us to provide new > >> >>> guarantees that > >> >>> could hurt performance without a really good use > case. > >> >>> > >> >>> > >> >>> I would guess that user perceived performance should > be a little > >> >>> faster with this. But this also depends on an > alternative that > >> >>> we > >> >>> decided on :) > >> >>> > >> >>> Also the expiration thread pool is set to min > priority atm so it > >> >>> may delay removal of said objects but hopefully (if > the jvm > >> >>> supports) it wouldn't overrun a CPU while processing > unless it > >> >>> has > >> >>> availability. > >> >>> > >> >>> > >> >>> Cheers > >> >>> Dan > >> >>> > >> >>> > >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > >> >>> > >> wrote: > >> >>> > After re-reading the whole original thread, I > agree with > >> >>> the > >> >>> proposal > >> >>> > with two caveats: > >> >>> > > >> >>> > - ensure that we don't break JCache compatibility > >> >>> > - ensure that we document this properly > >> >>> > > >> >>> > Tristan > >> >>> > > >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: > >> >>> >> +1 > >> >>> >> You had me convinced at the first line, > although "A lot > >> >>> of > >> >>> code can now > >> >>> >> be removed and made simpler" makes it look > extremely > >> >>> nice. > >> >>> >> > >> >>> >> On 13 Jul 2015 18:14, "William Burns" > >> >>> > >> >>> > > >> >>> >> > >> > >> >>> >>> wrote: > >> >>> >> > >> >>> >> This is a necro of [1]. > >> >>> >> > >> >>> >> With Infinispan 8.0 we are adding in clustered > >> >>> expiration. That > >> >>> >> includes an expiration event raised that is > clustered > >> >>> as well. 
> >> >>> >> Unfortunately expiration events currently occur > >> >>> multiple times (if > >> >>> >> numOwners > 1) at different times across > nodes in a > >> >>> cluster. This > >> >>> >> makes coordinating a single cluster > expiration event > >> >>> quite difficult. > >> >>> >> > >> >>> >> To work around this I am proposing that the > >> >>> expiration > >> >>> of an event > >> >>> >> is done solely by the owner of the given key > that is > >> >>> now expired. > >> >>> >> This would fix the issue of having multiple > events > >> >>> and > >> >>> the event can > >> >>> >> be raised while holding the lock for the > given key so > >> >>> concurrent > >> >>> >> modifications would not be an issue. > >> >>> >> > >> >>> >> The problem arises when you have other nodes that > >> >>> have > >> >>> expiration > >> >>> >> set but expire at different times. Max idle > is the > >> >>> biggest offender > >> >>> >> with this as a read on an owner only > refreshes the > >> >>> owners timestamp, > >> >>> >> meaning other owners would not be updated and > expire > >> >>> preemptively. > >> >>> >> To have expiration work properly in this case you > >> >>> would > >> >>> need > >> >>> >> coordination between the owners to see if > anyone has > >> >>> a > >> >>> higher > >> >>> >> value. This requires blocking and would have > to be > >> >>> done while > >> >>> >> accessing a key that is expired to be sure if > >> >>> expiration happened or > >> >>> >> not. > >> >>> >> > >> >>> >> The linked dev listing proposed instead to only > >> >>> expire > >> >>> an entry by > >> >>> >> the reaper thread and not on access. In this > case a > >> >>> read will > >> >>> >> return a non null value until it is fully > expired, > >> >>> increasing hit > >> >>> >> ratios possibly. > >> >>> >> > >> >>> >> Their are quire a bit of real benefits for this: > >> >>> >> > >> >>> >> 1. Cluster cache reads would be much simpler and > >> >>> wouldn't have to > >> >>> >> block to verify the object exists or not > since this > >> >>> would only be > >> >>> >> done by the reaper thread (note this would > have only > >> >>> happened if the > >> >>> >> entry was expired locally). An access would just > >> >>> return the value > >> >>> >> immediately. > >> >>> >> 2. Each node only expires entries it owns in the > >> >>> reaper > >> >>> thread > >> >>> >> reducing how many entries they must check or > remove. > >> >>> This also > >> >>> >> provides a single point where events would be > raised > >> >>> as > >> >>> we need. > >> >>> >> 3. A lot of code can now be removed and made > simpler > >> >>> as > >> >>> it no longer > >> >>> >> has to check for expiration. The expiration > check > >> >>> would only be > >> >>> >> done in 1 place, the expiration reaper thread. > >> >>> >> > >> >>> >> The main issue with this proposal is as the other > >> >>> listing mentions > >> >>> >> is if user code expects the value to be gone > after > >> >>> expiration for > >> >>> >> correctness. I would say this use case is not as > >> >>> compelling for > >> >>> >> maxIdle, especially since we never supported it > >> >>> properly. And in > >> >>> >> the case of lifespan the user could very > easily store > >> >>> the expiration > >> >>> >> time in the object that they can check after > a get as > >> >>> pointed out in > >> >>> >> the other thread. 
> >> >>> >> > >> >>> >> [1] > >> >>> >> > >> >>> > >> >>> > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > >> >>> >> > >> >>> >> _______________________________________________ > >> >>> >> infinispan-dev mailing list > >> >>> >> infinispan-dev at lists.jboss.org > > >> >>> > > >> >>> > >> >>> >> > >> >>> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> _______________________________________________ > >> >>> >> infinispan-dev mailing list > >> >>> >> infinispan-dev at lists.jboss.org > > >> >>> > > >> >>> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> >> > >> >>> > > >> >>> > -- > >> >>> > Tristan Tarrant > >> >>> > Infinispan Lead > >> >>> > JBoss, a division of Red Hat > >> >>> > _______________________________________________ > >> >>> > infinispan-dev mailing list > >> >>> > infinispan-dev at lists.jboss.org > > >> >>> > > >> >>> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> _______________________________________________ > >> >>> infinispan-dev mailing list > >> >>> infinispan-dev at lists.jboss.org > > >> >>> > > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> infinispan-dev mailing list > >> >>> infinispan-dev at lists.jboss.org > > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> >> > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> -- > >> Radim Vansa > > >> JBoss Performance Team > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From mudokonman at gmail.com Thu Jul 23 13:03:15 2015 From: mudokonman at gmail.com (William Burns) Date: Thu, 23 Jul 2015 17:03:15 +0000 Subject: [infinispan-dev] Fwd: Strict Expiration In-Reply-To: <55B11C4D.3030507@redhat.com> References: <55A402A1.4010004@redhat.com> <55A525FE.80500@redhat.com> <55A52E8C.3060906@redhat.com> <55A54C1E.3080703@redhat.com> <55B11C4D.3030507@redhat.com> Message-ID: On Thu, Jul 23, 2015 at 12:54 PM Radim Vansa wrote: > When you're into the stores & expiration: any plans for handling [1]? > > Radim > > [1] https://issues.jboss.org/browse/ISPN-3202 I am not planning on it. This is yet another thing I wasn't aware of that makes me dislike maxIdle. > > On 07/23/2015 02:37 PM, William Burns wrote: > > I actually found another hiccup with cache stores. It seems currently > > we only allow for a callback when an entry is expired from a cache > > store when using the reaper thread [1]. 
However we don't allow for > > such a callback on a read which finds an expired entry and wants to > > remove it [2]. > > > > Interestingly our cache stores in general don't even expire entries on > > load with the few exceptions below: > > > > 1. SingleCacheStore returns true for an expired entry on contains > > 2. SingleCacheStore removes expired entries on load > > 3. RemoteStore does not need to worry about expiration since it is > > handled by another remote server. > > > > Of all of the other stores I have looked at they return false properly > > for expired entries and only purge elements from within reaper thread. > > > > I propose we change SingleCacheStore to behave as the other cache > > stores. This doesn't require any API changes. We would then rely on > > store expiring elements only during reaper thread or if the element > > expires in memory. We should also guarantee that when a cache store is > > used that the reaper thread is enabled (throw exception if not enabled > > and store is present at init). Should I worry about when only a > > RemoteStore is used (this seems a bit fragile)? > > > > To be honest we would need to revamp the CacheLoader/Writer API at a > > later point to allow for values to be optionally provided for > > expiration anyways, so I would say to do that in addition to allowing > > loader/stores to expire on access. > > > > [1] > > > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/AdvancedCacheWriter.java#L29 > > > > [2] > > > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/CacheLoader.java#L34 > > > > ---------- Forwarded message --------- > > From: William Burns > > > Date: Wed, Jul 22, 2015 at 11:06 AM > > Subject: Re: [infinispan-dev] Strict Expiration > > To: infinispan -Dev List > > > > > > > > On Wed, Jul 22, 2015 at 10:53 AM Dan Berindei > > wrote: > > > > Is it possible/feasible to skip the notification from the backups to > > the primary (and back) when there is no clustered expiration > listener? > > > > > > Unfortunately there is no way to distinguish whether or a listener is > > create, modify, remove or expiration. So this would only work if > > there are no clustered listeners. > > > > This however should be feasible. This shouldn't be hard to add. > > > > The only thing I would have to figure out is what happens in the case > > of a rehash and the node that removed the value is now the primary > > owner and some nodes have the old value and someone registers an > > expiration listener. I am thinking I should only raise the event if > > the primary owner still has the value. > > > > > > Dan > > > > > > On Tue, Jul 21, 2015 at 5:25 PM, William Burns > > > wrote: > > > So I wanted to sum up what it looks like the plan is for this in > > regards to > > > cluster expiration for ISPN 8. > > > > > > First off to not make it ambiguous, maxIdle being used with a > > clustered > > > cache will provide undefined and unsupported behavior. This can > > and will > > > expire entries on a single node without notifying other cluster > > members > > > (essentially it will operate as it does today unchanged). > > > > > > This leaves me to talk solely about lifespan cluster expiration. > > > > > > Lifespan Expiration events are fired by the primary owner of an > > expired key > > > > > > - when accessing an expired entry. > > > > > > - by the reaper thread. 
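(To illustrate how an application would consume these single-shot expiration events, here is a minimal sketch in Java. It assumes the new listener support ends up mirroring the existing @CacheEntry* annotations, i.e. a @CacheEntryExpired annotation and a CacheEntryExpiredEvent; treat those names and the exactly-once delivery as the work in progress described above, not a final API.)

import org.infinispan.Cache;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryExpired;
import org.infinispan.notifications.cachelistener.event.CacheEntryExpiredEvent;

// Hypothetical sketch: a clustered listener that would be notified once per
// expired key, by the primary owner, instead of once per owner as today.
@Listener(clustered = true)
public class ExpiredEntryLogger {

   @CacheEntryExpired
   public void entryExpired(CacheEntryExpiredEvent<String, String> event) {
      // The value is only guaranteed for in-memory expiration; a store-originated
      // expiration may not be able to supply it (see the discussion above).
      System.out.printf("expired: %s -> %s%n", event.getKey(), event.getValue());
   }

   public static void register(Cache<String, String> cache) {
      cache.addListener(new ExpiredEntryLogger());
   }
}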
> > > > > > If the expiration is detected by a node other than the primary > > owner, an > > > expiration command is sent to it and null is returned > > immediately not > > > waiting for a response. > > > > > > Expiration event listeners follow the usual rules for > > sync/async: in the > > > case of a sync listener, the handler is invoked while holding > > the lock, > > > whereas an async listener will not hold locks. > > > > > > It is desirable for expiration events to contain both the key > > and value. > > > However currently cache stores do not provide the value when > > they expire > > > values. Thus we can only guarantee the value is present when an > > in memory > > > expiration event occurs. We could plan on adding this later. > > > > > > Also as you may have guessed this doesn't touch strict > > expiration, which I > > > think we have come to the conclusion should only work with > > maxIdle and as > > > such this is not explored with this iteration. > > > > > > Let me know if you guys think this approach is okay. > > > > > > Cheers, > > > > > > - Will > > > > > > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa > > wrote: > > >> > > >> Yes, I know about [1]. I've worked that around by storing > > timestamp in > > >> the entry as well and when a new record is added, the 'expired' > > >> invalidations are purged. But I can't purge that if I don't > > access it - > > >> Infinispan needs to handle that internally. > > >> > > >> Radim > > >> > > >> [1] https://hibernate.atlassian.net/browse/HHH-6219 > > >> > > >> On 07/14/2015 05:45 PM, Dennis Reed wrote: > > >> > On 07/14/2015 11:08 AM, Radim Vansa wrote: > > >> >> On 07/14/2015 04:19 PM, William Burns wrote: > > >> >>> > > >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns > > > > >> >>> > >> wrote: > > >> >>> > > >> >>> On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei > > >> >>> > > >> wrote: > > >> >>> > > >> >>> Processing expiration only on the reaper thread > > sounds nice, > > >> >>> but I > > >> >>> have one reservation: processing 1 million > > entries to see > > >> >>> that > > >> >>> 1 of > > >> >>> them is expired is a lot of work, and in the > > general case we > > >> >>> will not > > >> >>> be able to ensure an expiration precision of less > > than 1 > > >> >>> minute (maybe > > >> >>> more, with a huge SingleFileStore attached). > > >> >>> > > >> >>> > > >> >>> This isn't much different then before. The only > > difference is > > >> >>> that if a user touched a value after it expired it > > wouldn't show > > >> >>> up (which is unlikely with maxIdle especially). > > >> >>> > > >> >>> > > >> >>> What happens to users who need better precision? In > > >> >>> particular, I know > > >> >>> some JCache tests were failing because HotRod was > > only > > >> >>> supporting > > >> >>> 1-second resolution instead of the 1-millisecond > > resolution > > >> >>> they were > > >> >>> expecting. > > >> >>> > > >> >>> > > >> >>> JCache is an interesting piece. The thing about > > JCache is that > > >> >>> the spec is only defined for local caches. However I > > wouldn't > > >> >>> want to muddy up the waters in regards to it behaving > > >> >>> differently > > >> >>> for local/remote. In the JCache scenario we could add an > > >> >>> interceptor to prevent it returning such values (we > > do something > > >> >>> similar already for events). JCache behavior vs ISPN > > behavior > > >> >>> seems a bit easier to differentiate. But like you > > are getting > > >> >>> at, > > >> >>> either way is not very appealing. 
> > >> >>> > > >> >>> > > >> >>> > > >> >>> I'm even less convinced about the need to > > guarantee that a > > >> >>> clustered > > >> >>> expiration listener will only be triggered once, > > and that > > >> >>> the > > >> >>> entry > > >> >>> must be null everywhere after that listener was > > invoked. > > >> >>> What's the > > >> >>> use case? > > >> >>> > > >> >>> > > >> >>> Maybe Tristan would know more to answer. To be > > honest this work > > >> >>> seems fruitless unless we know what our end users > > want here. > > >> >>> Spending time on something for it to thrown out is > > never fun :( > > >> >>> > > >> >>> And the more I thought about this the more I question > the > > >> >>> validity > > >> >>> of maxIdle even. It seems like a very poor way to > prevent > > >> >>> memory > > >> >>> exhaustion, which eviction does in a much better way > > and has > > >> >>> much > > >> >>> more flexible algorithms. Does anyone know what > > maxIdle would > > >> >>> be > > >> >>> used for that wouldn't be covered by eviction? The > > only thing I > > >> >>> can think of is cleaning up the cache store as well. > > >> >>> > > >> >>> > > >> >>> Actually I guess for session/authentication related > > information this > > >> >>> would be important. However maxIdle isn't really as usable > > in that > > >> >>> case since most likely you would have a sticky session to > > go back to > > >> >>> that node which means you would never refresh the last used > > date on > > >> >>> the copies (current implementation). Without cluster > > expiration you > > >> >>> could lose that session information on a failover very easily. > > >> >> I would say that maxIdle can be used as for memory > > management as kind > > >> >> of > > >> >> WeakHashMap - e.g. in 2LC the maxIdle is used to store some > > record for > > >> >> a > > >> >> short while (regular transaction lifespan ~ seconds to > > minutes), and > > >> >> regularly the record is removed. However, to make sure that > > we don't > > >> >> leak records in this cache (if something goes wrong and the > > remove does > > >> >> not occur), it is removed. > > >> > Note that just relying on maxIdle doesn't guarantee you won't > > leak > > >> > records in this use case (specifically with the way the current > > >> > hibernate-infinispan 2LC implementation uses it). > > >> > > > >> > Hibernate-infinispan adds entries to its own Map stored in > > Infinispan, > > >> > and expects maxIdle to remove the map if it skips a remove. > > But in a > > >> > current case, we found that due to frequent accesses to that > > same map > > >> > the entries never idle out and it ends up in OOME). > > >> > > > >> > -Dennis > > >> > > > >> >> I can guess how long the transaction takes place, but not > > how many > > >> >> parallel transactions there are. With eviction algorithms > > (where I am > > >> >> not sure about the exact guarantees) I can set the cache to > > not hold > > >> >> more than N entries, but I can't know for sure that my > > record does not > > >> >> suddenly get evicted after shorter period, possibly causing > some > > >> >> inconsistency. > > >> >> So this is similar to WeakHashMap by removing the key "when > > it can't be > > >> >> used anymore" because I know that the transaction will > > finish before > > >> >> the > > >> >> deadline. I don't care about the exact size, I don't want to > > tune that, > > >> >> I just don't want to leak. > > >> >> > > >> >> From my POV the non-strict maxIdle and strict expiration > > would be a > > >> >> nice compromise. 
> > >> >> > > >> >> Radim > > >> >> > > >> >>> Note that this would make the reaper thread less > > efficient: > > >> >>> with > > >> >>> numOwners=2 (best case), half of the entries that > > the reaper > > >> >>> touches > > >> >>> cannot be expired, because the node isn't the > > primary node. > > >> >>> And to > > >> >>> make matters worse, the same reaper thread would > > have to > > >> >>> perform a > > >> >>> (synchronous?) RPC for each entry to ensure it > > expires > > >> >>> everywhere. > > >> >>> > > >> >>> > > >> >>> I have debated about this, it could something like a > sync > > >> >>> removeAll which has a special marker to tell it is due > to > > >> >>> expiration (which would raise listeners there), while > > also > > >> >>> sending > > >> >>> a cluster expiration event to other non owners. > > >> >>> > > >> >>> > > >> >>> For maxIdle I'd like to know more information > > about how > > >> >>> exactly the > > >> >>> owners would coordinate to expire an entry. I'm > > pretty sure > > >> >>> we > > >> >>> cannot > > >> >>> avoid ignoring some reads (expiring an entry > > immediately > > >> >>> after > > >> >>> it was > > >> >>> read), and ensuring that we don't accidentally > > extend an > > >> >>> entry's life > > >> >>> (like the current code does, when we transfer an > > entry to a > > >> >>> new owner) > > >> >>> also sounds problematic. > > >> >>> > > >> >>> > > >> >>> For lifespan it is simple, the primary owner just > > expires it > > >> >>> when > > >> >>> it expires there. There is no coordination needed in > > this case > > >> >>> it > > >> >>> just sends the expired remove to owners etc. > > >> >>> > > >> >>> Max idle is more complicated as we all know. The > > primary owner > > >> >>> would send a request for the last used time for a > > given key or > > >> >>> set > > >> >>> of keys. Then the owner would take those times and > > check for a > > >> >>> new access it isn't aware of. If there isn't then it > > would send > > >> >>> a > > >> >>> remove command for the key(s). If there is a new > > access the > > >> >>> owner > > >> >>> would instead send the last used time to all of the > > owners. The > > >> >>> expiration obviously would have a window that if a > > read occurred > > >> >>> after sending a response that could be ignored. This > > could be > > >> >>> resolved by using some sort of 2PC and blocking reads > > during > > >> >>> that > > >> >>> period but I would say it isn't worth it. > > >> >>> > > >> >>> The issue with transferring to a new node refreshing > > the last > > >> >>> update/lifespan seems like just a bug we need to fix > > >> >>> irrespective > > >> >>> of this issue IMO. > > >> >>> > > >> >>> > > >> >>> I'm not saying expiring entries on each node > > independently > > >> >>> is > > >> >>> perfect, > > >> >>> far from it. But I wouldn't want us to provide new > > >> >>> guarantees that > > >> >>> could hurt performance without a really good use > > case. > > >> >>> > > >> >>> > > >> >>> I would guess that user perceived performance should > > be a little > > >> >>> faster with this. But this also depends on an > > alternative that > > >> >>> we > > >> >>> decided on :) > > >> >>> > > >> >>> Also the expiration thread pool is set to min > > priority atm so it > > >> >>> may delay removal of said objects but hopefully (if > > the jvm > > >> >>> supports) it wouldn't overrun a CPU while processing > > unless it > > >> >>> has > > >> >>> availability. 
> > >> >>> > > >> >>> > > >> >>> Cheers > > >> >>> Dan > > >> >>> > > >> >>> > > >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant > > >> >>> > > >> wrote: > > >> >>> > After re-reading the whole original thread, I > > agree with > > >> >>> the > > >> >>> proposal > > >> >>> > with two caveats: > > >> >>> > > > >> >>> > - ensure that we don't break JCache compatibility > > >> >>> > - ensure that we document this properly > > >> >>> > > > >> >>> > Tristan > > >> >>> > > > >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: > > >> >>> >> +1 > > >> >>> >> You had me convinced at the first line, > > although "A lot > > >> >>> of > > >> >>> code can now > > >> >>> >> be removed and made simpler" makes it look > > extremely > > >> >>> nice. > > >> >>> >> > > >> >>> >> On 13 Jul 2015 18:14, "William Burns" > > >> >>> > > >> >>> > > > > >> >>> >> > > > >> > > >> >>> > >>> wrote: > > >> >>> >> > > >> >>> >> This is a necro of [1]. > > >> >>> >> > > >> >>> >> With Infinispan 8.0 we are adding in clustered > > >> >>> expiration. That > > >> >>> >> includes an expiration event raised that is > > clustered > > >> >>> as well. > > >> >>> >> Unfortunately expiration events currently occur > > >> >>> multiple times (if > > >> >>> >> numOwners > 1) at different times across > > nodes in a > > >> >>> cluster. This > > >> >>> >> makes coordinating a single cluster > > expiration event > > >> >>> quite difficult. > > >> >>> >> > > >> >>> >> To work around this I am proposing that the > > >> >>> expiration > > >> >>> of an event > > >> >>> >> is done solely by the owner of the given key > > that is > > >> >>> now expired. > > >> >>> >> This would fix the issue of having multiple > > events > > >> >>> and > > >> >>> the event can > > >> >>> >> be raised while holding the lock for the > > given key so > > >> >>> concurrent > > >> >>> >> modifications would not be an issue. > > >> >>> >> > > >> >>> >> The problem arises when you have other nodes > that > > >> >>> have > > >> >>> expiration > > >> >>> >> set but expire at different times. Max idle > > is the > > >> >>> biggest offender > > >> >>> >> with this as a read on an owner only > > refreshes the > > >> >>> owners timestamp, > > >> >>> >> meaning other owners would not be updated and > > expire > > >> >>> preemptively. > > >> >>> >> To have expiration work properly in this case > you > > >> >>> would > > >> >>> need > > >> >>> >> coordination between the owners to see if > > anyone has > > >> >>> a > > >> >>> higher > > >> >>> >> value. This requires blocking and would have > > to be > > >> >>> done while > > >> >>> >> accessing a key that is expired to be sure if > > >> >>> expiration happened or > > >> >>> >> not. > > >> >>> >> > > >> >>> >> The linked dev listing proposed instead to only > > >> >>> expire > > >> >>> an entry by > > >> >>> >> the reaper thread and not on access. In this > > case a > > >> >>> read will > > >> >>> >> return a non null value until it is fully > > expired, > > >> >>> increasing hit > > >> >>> >> ratios possibly. > > >> >>> >> > > >> >>> >> Their are quire a bit of real benefits for this: > > >> >>> >> > > >> >>> >> 1. Cluster cache reads would be much simpler and > > >> >>> wouldn't have to > > >> >>> >> block to verify the object exists or not > > since this > > >> >>> would only be > > >> >>> >> done by the reaper thread (note this would > > have only > > >> >>> happened if the > > >> >>> >> entry was expired locally). An access would > just > > >> >>> return the value > > >> >>> >> immediately. > > >> >>> >> 2. 
Each node only expires entries it owns in the > > >> >>> reaper > > >> >>> thread > > >> >>> >> reducing how many entries they must check or > > remove. > > >> >>> This also > > >> >>> >> provides a single point where events would be > > raised > > >> >>> as > > >> >>> we need. > > >> >>> >> 3. A lot of code can now be removed and made > > simpler > > >> >>> as > > >> >>> it no longer > > >> >>> >> has to check for expiration. The expiration > > check > > >> >>> would only be > > >> >>> >> done in 1 place, the expiration reaper thread. > > >> >>> >> > > >> >>> >> The main issue with this proposal is as the > other > > >> >>> listing mentions > > >> >>> >> is if user code expects the value to be gone > > after > > >> >>> expiration for > > >> >>> >> correctness. I would say this use case is not > as > > >> >>> compelling for > > >> >>> >> maxIdle, especially since we never supported it > > >> >>> properly. And in > > >> >>> >> the case of lifespan the user could very > > easily store > > >> >>> the expiration > > >> >>> >> time in the object that they can check after > > a get as > > >> >>> pointed out in > > >> >>> >> the other thread. > > >> >>> >> > > >> >>> >> [1] > > >> >>> >> > > >> >>> > > >> >>> > > > http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html > > >> >>> >> > > >> >>> >> _______________________________________________ > > >> >>> >> infinispan-dev mailing list > > >> >>> >> infinispan-dev at lists.jboss.org > > > > >> >>> > > > > >> >>> > > > >> >>> > >> > > >> >>> >> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > >> >>> >> > > >> >>> >> > > >> >>> >> > > >> >>> >> _______________________________________________ > > >> >>> >> infinispan-dev mailing list > > >> >>> >> infinispan-dev at lists.jboss.org > > > > >> >>> > > > > >> >>> >> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > >> >>> >> > > >> >>> > > > >> >>> > -- > > >> >>> > Tristan Tarrant > > >> >>> > Infinispan Lead > > >> >>> > JBoss, a division of Red Hat > > >> >>> > _______________________________________________ > > >> >>> > infinispan-dev mailing list > > >> >>> > infinispan-dev at lists.jboss.org > > > > >> >>> > > > > >> >>> > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > >> >>> _______________________________________________ > > >> >>> infinispan-dev mailing list > > >> >>> infinispan-dev at lists.jboss.org > > > > >> >>> > > > > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > >> >>> > > >> >>> > > >> >>> > > >> >>> _______________________________________________ > > >> >>> infinispan-dev mailing list > > >> >>> infinispan-dev at lists.jboss.org > > > > >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > >> >> > > >> > _______________________________________________ > > >> > infinispan-dev mailing list > > >> > infinispan-dev at lists.jboss.org > > > > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > >> > > >> > > >> -- > > >> Radim Vansa > > > >> JBoss Performance Team > > >> > > >> _______________________________________________ > > >> infinispan-dev mailing list > > >> infinispan-dev at lists.jboss.org > > > > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > > _______________________________________________ > > > infinispan-dev mailing list > > > infinispan-dev at lists.jboss.org > > > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > 
infinispan-dev mailing list > > infinispan-dev at lists.jboss.org infinispan-dev at lists.jboss.org> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150723/ba97fd4a/attachment-0001.html From anistor at redhat.com Fri Jul 24 17:52:36 2015 From: anistor at redhat.com (Adrian Nistor) Date: Sat, 25 Jul 2015 00:52:36 +0300 Subject: [infinispan-dev] Infinispan 8.0.0.Beta2 released Message-ID: <55B2B3A4.3060408@redhat.com> Dear community Infinispan 8.0.0.Beta2 is now available! Further details in the blog post: http://blog.infinispan.org/2015/07/infinispan-800beta2.html Cheers Adrian -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150725/13ead1e1/attachment.html From ttarrant at redhat.com Mon Jul 27 09:59:21 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 27 Jul 2015 15:59:21 +0200 Subject: [infinispan-dev] Infinispan 8.0.0.Beta2 released In-Reply-To: <55B2B3A4.3060408@redhat.com> References: <55B2B3A4.3060408@redhat.com> Message-ID: <55B63939.2010303@redhat.com> The HTML version of this e-mail had a wrong link to Beta1. Here's the correct link: http://blog.infinispan.org/2015/07/infinispan-800beta2.html Tristan On 24/07/2015 23:52, Adrian Nistor wrote: > Dear community > > Infinispan 8.0.0.Beta2 is now available! Further details in the blog post: > http://blog.infinispan.org/2015/07/infinispan-800beta2.html > > Cheers > Adrian > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Mon Jul 27 10:32:23 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 27 Jul 2015 16:32:23 +0200 Subject: [infinispan-dev] Weekly IRC Meeting log 2015-07-27 Message-ID: <55B640F7.1020603@redhat.com> The meeting logs for this weeks #infinispan IRC meeting are available at http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-07-27-14.01.log.html Cheers Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Mon Jul 27 10:41:07 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 27 Jul 2015 16:41:07 +0200 Subject: [infinispan-dev] Special cache types and their configuration (or lack of) Message-ID: <55B64303.4070006@redhat.com> Hi all, I wanted to bring attention to some discussion that has happened in the context of Radim's work on simplified code for specific cache types [1]. In particular, Radim proposes adding explicit configuration options (i.e. a new simple-cache cache type) to the programmatic/declarative API to ensure that a user is aware of the limitations of the resulting cache type (no interceptors, no persistence, no tx, etc). 
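(To make the two alternatives concrete, a hypothetical sketch in Java of what each style could look like. The simpleCache(boolean) builder flag below only illustrates the proposed explicit option and is not an existing API; the implicit variant is today's plain local cache, whose eligibility for the optimized implementation would be detected from the configuration itself.)

import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class SimpleCacheConfigSketch {

   // Explicit opt-in (the proposal): the user declares the cache as "simple" and
   // persistence, transactions and custom interceptors are rejected up front.
   // NOTE: simpleCache(boolean) is hypothetical, shown only to illustrate the idea.
   static Configuration explicitSimpleCache() {
      return new ConfigurationBuilder().simpleCache(true).build();
   }

   // Implicit alternative: an ordinary local cache; if the configuration declares
   // no store, no transactions, no indexing and so on, the cache manager could
   // silently pick the optimized implementation at construction time.
   static Configuration implicitSimpleCache() {
      return new ConfigurationBuilder().build();
   }
}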
My opinion is that we should aim for "less" configuration and not "more", and that optimizations such as these should get enabled implicitly when the parameters allow it: if the configuration code detects it can use a "simple" cache. Also, this choice should happen at cache construction time, and not dynamically at cache usage time. WDYT ? Tristan [1] https://github.com/infinispan/infinispan/pull/3577 -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat

From rvansa at redhat.com Mon Jul 27 11:29:48 2015 From: rvansa at redhat.com (Radim Vansa) Date: Mon, 27 Jul 2015 17:29:48 +0200 Subject: [infinispan-dev] Special cache types and their configuration (or lack of) In-Reply-To: <55B64303.4070006@redhat.com> References: <55B64303.4070006@redhat.com> Message-ID: <55B64E6C.6020706@redhat.com>

There's one glitch that needs to be stressed: some limitations of the simplified cache are not discoverable at creation time. While persistence, tx and others are, adding custom interceptors and running map-reduce or distributed executors can't be guessed when the cache is created. I could (theoretically) implement MR and DistExec, but never the custom interceptors: the whole idea of the simple cache is that there are *no interceptors*. And regrettably, this is not as rare a case as I had initially assumed; for example, JCache grabs any cache, inserts its interceptor and provides a wrapper.

One way to go would be to not return the simple cache directly, but to wrap it in a delegating cache that would switch the implementation on the fly as soon as someone tries to play with interceptors. However, this is not without cost - the delegate would have to read a volatile field and execute a megamorphic call on every cache operation. Applications could get around that by doing an instanceof check and calling an unwrap method during initialization, but that's not really an elegant solution. I wanted the choice to be transparent to the user from the beginning, but there is no way to do that without penalties.

For those who will suggest 'just a flag on the local cache': following the 'less configuration, not more' principle, I believe that the number of runtime-prohibited configurations should be kept to a minimum. With such a flag we would double the configuration state space, while 95% of those configurations would be illegal. That's why I have used a new cache mode rather than adding a flag.

Radim

On 07/27/2015 04:41 PM, Tristan Tarrant wrote: > Hi all, > > I wanted to bring attention to some discussion that has happened in the > context of Radim's work on simplified code for specific cache types [1]. > > In particular, Radim proposes adding explicit configuration options > (i.e. a new simple-cache cache type) to the programmatic/declarative API > to ensure that a user is aware of the limitations of the resulting cache > type (no interceptors, no persistence, no tx, etc). > > My opinion is that we should aim for "less" configuration and not > "more", and that optimizations such as these should get enabled > implicitly when the parameters allow it: if the configuration code > detects it can use a "simple" cache. > > Also, this choice should happen at cache construction time, and not > dynamically at cache usage time. > > WDYT ? > > Tristan > > [1] https://github.com/infinispan/infinispan/pull/3577 -- Radim Vansa JBoss Performance Team

From smarlow at redhat.com Mon Jul 27 15:51:01 2015 From: smarlow at redhat.com (Scott Marlow) Date: Mon, 27 Jul 2015 15:51:01 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0...
In-Reply-To: <55A76CA4.3000608@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> Message-ID: <55B68BA5.9080304@redhat.com> Radim, I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan 8.0.0.Beta2 and get the same compile error. I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 (which is almost ready to go final). Would it be possible for Infinispan 8.0.0.Final to correct the below compiler error, so that WildFly 10 can work with both Hibernate 5.0 + Infinispan 8.0? Scott On 07/16/2015 04:34 AM, Radim Vansa wrote: > > On 07/15/2015 03:25 PM, Scott Marlow wrote: >> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >> >> " >> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >> error: cannot find symbol >> rpcManager.broadcastRpcCommand( cmd, isSync ); >> ^ >> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >> location: variable rpcManager of type RpcManager >> " >> >> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? > > This should be fixed in ORM 5.x, but should I do a PR that uses > Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get > out before ORM 5.0.0.Final? > >> >> More inline below... >> >> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>> in a micro release of Hibernate.. that's against our conventions. >>> The better plan would be to work towards a Hibernate 5.1 version for >>> this, or make sure to talk with him in advance if that doesn't work >>> out. >> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >> with that target in mind, we should figure out if we can have a >> Hibernate 5.x and Infinispan 8.x that work together. I don't >> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >> together. IMO, we should either change the Infinispan 8 API to still >> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. > > See above: I would 'make sure' that they work together, but it's IMO not > possible until Infinispan 8.0 is released. > > Radim > >> >>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>> API, which won't change. However, with certain consistency fixes ([1] >>>> [2] and maybe more) the Interceptor API is used (as the basic >>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>> to change in Infinispan 8.0. >>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>> will be out. Since that would require only changes internal to that >>>> module, I hope this upgrade can be scoped to a micro-release. >>>> >>>> Radim >>>> >>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>> >>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>> Hi, >>>>> >>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>> changes to integrate with Infinispan 8.0? 
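(For reference on the compile error quoted above: broadcastRpcCommand(cmd, isSync) was a convenience wrapper, and something equivalent can be expressed with the RpcOptions-based overloads that the 7.x RpcManager already exposes, roughly as sketched below in Java. This is only an assumption about one possible caller-side fix, not the agreed solution; restoring and deprecating the old method is the other option being discussed.)

import org.infinispan.AdvancedCache;
import org.infinispan.commands.ReplicableCommand;
import org.infinispan.remoting.rpc.RpcManager;
import org.infinispan.remoting.rpc.RpcOptions;

public final class BroadcastSketch {

   private BroadcastSketch() {
   }

   // Rough equivalent of the removed broadcastRpcCommand(cmd, isSync) helper:
   // a null recipient list means "all members", and sync vs async behaviour
   // comes from the RpcOptions. Method names are from the existing 7.x API,
   // but they may still change before 8.0.0.Final.
   public static void broadcast(AdvancedCache<?, ?> cache, ReplicableCommand cmd, boolean isSync) {
      RpcManager rpcManager = cache.getRpcManager();
      RpcOptions options = rpcManager.getDefaultRpcOptions(isSync);
      rpcManager.invokeRemotely(null, cmd, options);
   }
}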
>>>>> >>>>> Thanks, >>>>> Scott >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> -- >>>> Radim Vansa >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > From ttarrant at redhat.com Mon Jul 27 18:13:48 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 28 Jul 2015 00:13:48 +0200 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55B68BA5.9080304@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> Message-ID: <55B6AD1C.5080309@redhat.com> We should work to provide a public API we can maintain for the future, instead of ORM relying on Infinispan internals. Tristan On 27/07/2015 21:51, Scott Marlow wrote: > Radim, > > I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan > 8.0.0.Beta2 and get the same compile error. > > I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 > but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 > (which is almost ready to go final). > > Would it be possible for Infinispan 8.0.0.Final to correct the below > compiler error, so that WildFly 10 can work with both Hibernate 5.0 + > Infinispan 8.0? > > Scott > > On 07/16/2015 04:34 AM, Radim Vansa wrote: >> >> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>> >>> " >>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>> error: cannot find symbol >>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>> ^ >>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>> location: variable rpcManager of type RpcManager >>> " >>> >>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >> >> This should be fixed in ORM 5.x, but should I do a PR that uses >> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >> out before ORM 5.0.0.Final? >> >>> >>> More inline below... >>> >>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>> in a micro release of Hibernate.. that's against our conventions. >>>> The better plan would be to work towards a Hibernate 5.1 version for >>>> this, or make sure to talk with him in advance if that doesn't work >>>> out. >>> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >>> with that target in mind, we should figure out if we can have a >>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>> together. 
IMO, we should either change the Infinispan 8 API to still >>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >> >> See above: I would 'make sure' that they work together, but it's IMO not >> possible until Infinispan 8.0 is released. >> >> Radim >> >>> >>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>> to change in Infinispan 8.0. >>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>> will be out. Since that would require only changes internal to that >>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>> >>>>> Radim >>>>> >>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>> >>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>> Hi, >>>>>> >>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>> changes to integrate with Infinispan 8.0? >>>>>> >>>>>> Thanks, >>>>>> Scott >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> -- >>>>> Radim Vansa >>>>> JBoss Performance Team >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Tue Jul 28 07:13:14 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Tue, 28 Jul 2015 14:13:14 +0300 Subject: [infinispan-dev] Lucene 5 is coming: pitfalls to consider Message-ID: Hi all, the Hibernate Search branch upgrading to Apache Lucene 5.2.x is almost ready, but there are some drawbacks on top of the many nice efficiency improvements. # API changes The API changes are not too bad, and definitely an improvement. I'll provide a detailed list as usual in the Hibernate Search migration guide - for now let it suffice to know that it's an easy upgrade for end users, as long as they were just creating Query instances and not using the more powerful and complex stuff. # Sorting To sort on a field will require an UninvertingReader to wrap the cached IndexReaders, and the uninverting process is very inefficient. On top of that, the result of the uninverting process is not cacheable, so that will need to be repeated on each index, for each query which is executed. 
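(As a rough illustration of the two code paths, a sketch against the stock Lucene 5 API rather than Hibernate Search code; the field name and the setup are made up. A field written only to the inverted index has to be uninverted at query time, while a field that was also written as DocValues can be sorted on directly.)

import java.util.Collections;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.store.Directory;
import org.apache.lucene.uninverting.UninvertingReader;
import org.apache.lucene.util.BytesRef;

public class SortingSketch {

   // Index time: to avoid uninverting, the field is written twice, once for
   // matching and once as DocValues for sorting.
   static Document makeDoc(String surname) {
      Document doc = new Document();
      doc.add(new StringField("surname", surname, Field.Store.NO));
      doc.add(new SortedDocValuesField("surname", new BytesRef(surname)));
      return doc;
   }

   // Query time, field indexed with DocValues: sort directly.
   static void sortWithDocValues(Directory dir) throws Exception {
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
         new IndexSearcher(reader).search(new MatchAllDocsQuery(), 10,
               new Sort(new SortField("surname", SortField.Type.STRING)));
      }
   }

   // Query time, legacy index without DocValues: wrap with UninvertingReader,
   // paying the uninverting cost (not cacheable) for every wrapped reader.
   static void sortWithUninverting(Directory dir) throws Exception {
      try (DirectoryReader reader = UninvertingReader.wrap(DirectoryReader.open(dir),
            Collections.singletonMap("surname", UninvertingReader.Type.SORTED))) {
         new IndexSearcher(reader).search(new MatchAllDocsQuery(), 10,
               new Sort(new SortField("surname", SortField.Type.STRING)));
      }
   }
}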
In short, I expect performance of sorted queries to be quite degraded in our first milestone using Lucene 5, and we'll have to discuss how to fix this. Needless to say, fixing this is a blocking requirement before we can consider the migration complete.

Sorting will not need an UninvertingReader if the target field has been indexed as DocValues, but that implies:
- we'll need an explicit, upfront (indexing time) flag to be set
- we'll need to detect if the matching indexing options are compatible with the runtime query to skip the uninverting process

This is mostly a job for Hibernate Search, but in terms of user experience it means you have to mark fields for "sortability" explicitly; will we need to extend the protobuf schema? Please make sure we'll just have to hook in existing metadata, we can't fix this after API freeze.

# Filters
We did some clever bitset-level optimisations to merge multiple Filter instances and to save memory when caching multiple filter instances. I had to drop that code: we no longer deal with in-heap structures (the design is now about iterating over off-heap chunks of data), so we fall back on the more traditional Lucene stack for filtering. I haven't been able to measure the performance impact yet; it's a significantly different approach and while it sounds promising on paper, we'll need some help testing this. The Lucene team can generally be trusted to go in the better direction, but we'll have to verify whether we're using it in the best way.

# Analyzers
It is no longer possible to override the field->analyzer mapping at runtime. We did expose this feature as a public API and I found a way to still do it, but it comes with a performance price tag. We'll soon deprecate this feature; if you can, start making sure there's no need for it in Infinispan, as at some point in the near future we'll have to drop it, with no replacement.

# Index encoding
As usual the index encoding evolves and the easy solution is to rebuild it. Lucene 5 no longer ships with backwards compatible de-coders, but these are available as separate dependencies. If you feel the need to be able to read existing indexes, we should include these. (I'm including these as private dependencies in the Hibernate Search modules).

Thanks, Sanne

From dan.berindei at gmail.com Tue Jul 28 07:48:21 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 28 Jul 2015 14:48:21 +0300 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55B6AD1C.5080309@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B6AD1C.5080309@redhat.com> Message-ID:
>> >> I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 >> but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 >> (which is almost ready to go final). >> >> Would it be possible for Infinispan 8.0.0.Final to correct the below >> compiler error, so that WildFly 10 can work with both Hibernate 5.0 + >> Infinispan 8.0? >> >> Scott >> >> On 07/16/2015 04:34 AM, Radim Vansa wrote: >>> >>> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>>> >>>> " >>>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>>> error: cannot find symbol >>>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>>> ^ >>>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>>> location: variable rpcManager of type RpcManager >>>> " >>>> >>>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >>> >>> This should be fixed in ORM 5.x, but should I do a PR that uses >>> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >>> out before ORM 5.0.0.Final? >>> >>>> >>>> More inline below... >>>> >>>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>>> in a micro release of Hibernate.. that's against our conventions. >>>>> The better plan would be to work towards a Hibernate 5.1 version for >>>>> this, or make sure to talk with him in advance if that doesn't work >>>>> out. >>>> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >>>> with that target in mind, we should figure out if we can have a >>>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>>> together. IMO, we should either change the Infinispan 8 API to still >>>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >>> >>> See above: I would 'make sure' that they work together, but it's IMO not >>> possible until Infinispan 8.0 is released. >>> >>> Radim >>> >>>> >>>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>>> to change in Infinispan 8.0. >>>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>>> will be out. Since that would require only changes internal to that >>>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>>> >>>>>> Radim >>>>>> >>>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>>> >>>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>>> changes to integrate with Infinispan 8.0? 
>>>>>>> >>>>>>> Thanks, >>>>>>> Scott >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>>> -- >>>>>> Radim Vansa >>>>>> JBoss Performance Team >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Tue Jul 28 08:09:50 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 28 Jul 2015 14:09:50 +0200 Subject: [infinispan-dev] Lucene 5 is coming: pitfalls to consider In-Reply-To: References: Message-ID: <55B7710E.8080208@redhat.com> On 28/07/2015 13:13, Sanne Grinovero wrote: > # Sorting > To sort on a field will require an UninvertingReader to wrap the > cached IndexReaders, and the uninverting process is very inefficient. > On top of that, the result of the uninverting process is not > cacheable, so that will need to be repeated on each index, for each > query which is executed. > In short, I expect performance of sorted queries to be quite degraded > in our first milestone using Lucene 5, and we'll have to discuss how > to fix this. > Needless to say, fixing this is a blocking requirement before we can > consider the migration complete. > > Sorting will not need an UninvertingReader if the target field has > been indexed as DocValues, but that implies: > - we'll need an explicit, upfront (indexing time) flag to be set > - we'll need to detect if the matching indexing options are > compatible with the runtime query to skip the uninverting process > > This is mostly a job for Hibernate Search, but in terms of user > experience it means you have to mark fields for "sortability" > explicitly; will we need to extend the protobuf schema? > > Please make sure we'll just have to hook in existing metadata, we > can't fix this after API freeze. This is very important to get right. As a user, I'd honestly expect an indexed field to also be used for sorting, so probably having some per-index flag (off by default) which implicitly enables DocValues for all indexed fields. Does uninverting offer any performance advantage over the sorting we already do ? Sorting wouldn't help anyway in the clustered query scenario, where you'd have to merge the results from multiple nodes anyway (I guess there is some win in pre-sorting on each node and then splicing the sorted sub-resultsets). > # Index encoding > As usual the index encoding evolves and the easy solution is to > rebuild it. 
Lucene 5 no longer ships with backwards compatible > de-coders, but these are available as separate dependencies. If you > feel the need to be able to read existing indexes, we should include > these. Possibly make them optional. I guess our recommendation is to mass-reindex anyway. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Tue Jul 28 08:13:31 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 28 Jul 2015 14:13:31 +0200 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B6AD1C.5080309@redhat.com> Message-ID: <55B771EB.6090208@redhat.com> Yes, while I said "internal", it is that kind of public integration point for advanced users (and ORM is one such user) which we feel the right to change in a major version upgrade. Dan, can we temporarily restore that method for Beta3 and mark it as deprecated, while at the same time we work on ORM's needs in a way that we can also backport to 7.2.x ? Tristan On 28/07/2015 13:48, Dan Berindei wrote: > To be fair, the RpcManager is not in an impl package, so I think it's > fair for the Hibernate 2LC plugin to use it (as long as they accept > more breakage than in the basic cache API). OTOH it may be a good idea > to discuss what 2LC needs, and if we could integrate that in the basic > API. > > Dan > > On Tue, Jul 28, 2015 at 1:13 AM, Tristan Tarrant wrote: >> We should work to provide a public API we can maintain for the future, >> instead of ORM relying on Infinispan internals. >> >> Tristan >> >> On 27/07/2015 21:51, Scott Marlow wrote: >>> Radim, >>> >>> I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan >>> 8.0.0.Beta2 and get the same compile error. >>> >>> I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 >>> but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 >>> (which is almost ready to go final). >>> >>> Would it be possible for Infinispan 8.0.0.Final to correct the below >>> compiler error, so that WildFly 10 can work with both Hibernate 5.0 + >>> Infinispan 8.0? >>> >>> Scott >>> >>> On 07/16/2015 04:34 AM, Radim Vansa wrote: >>>> >>>> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>>>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>>>> >>>>> " >>>>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>>>> error: cannot find symbol >>>>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>>>> ^ >>>>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>>>> location: variable rpcManager of type RpcManager >>>>> " >>>>> >>>>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >>>> >>>> This should be fixed in ORM 5.x, but should I do a PR that uses >>>> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >>>> out before ORM 5.0.0.Final? >>>> >>>>> >>>>> More inline below... >>>>> >>>>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>>>> in a micro release of Hibernate.. that's against our conventions. >>>>>> The better plan would be to work towards a Hibernate 5.1 version for >>>>>> this, or make sure to talk with him in advance if that doesn't work >>>>>> out. >>>>> The first beta of WildFly 10 is scheduled for August 6th, 2015. 
IMO, >>>>> with that target in mind, we should figure out if we can have a >>>>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>>>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>>>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>>>> together. IMO, we should either change the Infinispan 8 API to still >>>>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >>>> >>>> See above: I would 'make sure' that they work together, but it's IMO not >>>> possible until Infinispan 8.0 is released. >>>> >>>> Radim >>>> >>>>> >>>>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>>>> to change in Infinispan 8.0. >>>>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>>>> will be out. Since that would require only changes internal to that >>>>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>>>> >>>>>>> Radim >>>>>>> >>>>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>>>> >>>>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>>>> changes to integrate with Infinispan 8.0? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Scott >>>>>>>> _______________________________________________ >>>>>>>> infinispan-dev mailing list >>>>>>>> infinispan-dev at lists.jboss.org >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>>> -- >>>>>>> Radim Vansa >>>>>>> JBoss Performance Team >>>>>>> >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Tue Jul 28 08:43:30 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Tue, 28 Jul 2015 15:43:30 +0300 Subject: [infinispan-dev] Lucene 5 is coming: pitfalls 
to consider In-Reply-To: <55B7710E.8080208@redhat.com> References: <55B7710E.8080208@redhat.com> Message-ID: On 28 July 2015 at 15:09, Tristan Tarrant wrote: > On 28/07/2015 13:13, Sanne Grinovero wrote: >> # Sorting >> To sort on a field will require an UninvertingReader to wrap the >> cached IndexReaders, and the uninverting process is very inefficient. >> On top of that, the result of the uninverting process is not >> cacheable, so that will need to be repeated on each index, for each >> query which is executed. >> In short, I expect performance of sorted queries to be quite degraded >> in our first milestone using Lucene 5, and we'll have to discuss how >> to fix this. >> Needless to say, fixing this is a blocking requirement before we can >> consider the migration complete. >> >> Sorting will not need an UninvertingReader if the target field has >> been indexed as DocValues, but that implies: >> - we'll need an explicit, upfront (indexing time) flag to be set >> - we'll need to detect if the matching indexing options are >> compatible with the runtime query to skip the uninverting process >> >> This is mostly a job for Hibernate Search, but in terms of user >> experience it means you have to mark fields for "sortability" >> explicitly; will we need to extend the protobuf schema? >> >> Please make sure we'll just have to hook in existing metadata, we >> can't fix this after API freeze. > > This is very important to get right. As a user, I'd honestly expect an > indexed field to also be used for sorting, so probably having some > per-index flag (off by default) which implicitly enables DocValues for > all indexed fields. Even in past Lucene versions it never has been that simple: a field to be sortable *correctly* would require the tokenizer to output a single token. Which implies that the user had to explicitly design some fields for the "purpose of sorting". DocValues express this as a stricter requirement: you can't encode a multi-token value as a DocValue. The problem is that we can't guess for which analyzers it would be safe to automatically enable DocValue(s). - The Analyzer is an open set - The number of tokens depends on the input What we could do is to store the value as DocValue IFF the output of a specific analyzer chain happens to be a single token.. but to me it sounds a bit ugly, not sure what kind of issues we'd be getting into. > Does uninverting offer any performance advantage > over the sorting we already do ? Sorting wouldn't help anyway in the > clustered query scenario, where you'd have to merge the results from > multiple nodes anyway (I guess there is some win in pre-sorting on each > node and then splicing the sorted sub-resultsets). If I skip "uninverting" you will only be able to sort on fields which have DocValues stored in the index; it's not offering any performance benefit it all: it's much slower and takes quite some memory, we've put the uninverting process in place just as a fallback to allow sorting to happen on current models. We'll need to a) define a convenient metadata to mark fields used for sorting (for embedded mode we'll discuss this on hibernate-dev, for Hot Rod queries it's up to you) b) encourage users to explicitly use this: - should we log a warning when we fallback to the slow (uninverting) strategy? - should we disable the fallback and have people stare at an explicit stacktrace? >> # Index encoding >> As usual the index encoding evolves and the easy solution is to >> rebuild it. 
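To make the sorting discussion above concrete, here is a minimal sketch at the raw Lucene 5 level of indexing a field for DocValues-based sorting, plus the UninvertingReader fallback for indexes that lack DocValues. This is plain Lucene API rather than the Hibernate Search or Infinispan integration; the "surname" field and the writer/directory/query variables are assumed to exist, and the imports come from org.apache.lucene.document.*, org.apache.lucene.index.*, org.apache.lucene.search.* and org.apache.lucene.uninverting.UninvertingReader (lucene-misc).

    // Indexing: add the value both as an indexed field and as a DocValue,
    // so queries can sort on it without uninverting. This only makes sense
    // for single-token values, which is why an explicit "sortable" flag is needed.
    Document doc = new Document();
    doc.add(new StringField("surname", surname, Field.Store.NO));
    doc.add(new SortedDocValuesField("surname", new BytesRef(surname)));
    writer.addDocument(doc);

    // Querying: sorting on the DocValues field needs no UninvertingReader.
    IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(directory));
    TopDocs hits = searcher.search(query, 10,
          new Sort(new SortField("surname", SortField.Type.STRING)));

    // Fallback for fields indexed without DocValues: wrap the reader and pay
    // the uninverting cost on every reopen - the slow path described above.
    Map<String, UninvertingReader.Type> mapping =
          Collections.singletonMap("surname", UninvertingReader.Type.SORTED);
    DirectoryReader slowReader =
          UninvertingReader.wrap(DirectoryReader.open(directory), mapping);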
Lucene 5 no longer ships with backwards compatible >> de-coders, but these are available as separate dependencies. If you >> feel the need to be able to read existing indexes, we should include >> these. > > Possibly make them optional. +1 > I guess our recommendation is to > mass-reindex anyway. Always ;) > Tristan > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From gustavo at infinispan.org Tue Jul 28 09:13:08 2015 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Tue, 28 Jul 2015 14:13:08 +0100 Subject: [infinispan-dev] Lucene 5 is coming: pitfalls to consider In-Reply-To: References: Message-ID: > This is mostly a job for Hibernate Search, but in terms of user > experience it means you have to mark fields for "sortability" > explicitly; will we need to extend the protobuf schema? > Docvalues in theory are not only useful for sorting, but for aggregations as well, so the extra flag should not be tied conceptually to "sorting". > > Please make sure we'll just have to hook in existing metadata, we > can't fix this after API freeze. > > # Filters > We did some clever bitset level optimisations to merge multiple Filter > instances and save memory to cache multiple filter instances, I had to > drop that code as we don't deal with in-heap structures more but the > design is about iterating off heap chunks of data, Unless the directory implementation stores data in the heap itself :) > and resort on the > more traditional Lucene stack for filtering. > I couldn't measure the performance impact yet; it's a significantly > different approach and while it sounds promising on paper, we'll need > some help testing this. The Lucene team can generally be trusted to go > in the better direction, but we'll have to verify if we're using it in > the best way. > > # Analyzers > It is no longer possible to override the field->analyzer mapping at > runtime. We did expose this feature as a public API and I found a way > to still do it, but it comes with a performance price tag. > We'll soon deprecate this feature; if you can, start making sure > there's no need for this in Infinispan as at some time in the near > future we'll have to drop this, with no replacement. > > # Index encoding > As usual the index encoding evolves and the easy solution is to > rebuild it. Lucene 5 no longer ships with backwards compatible > de-coders, but these are available as separate dependencies. If you > feel the need to be able to read existing indexes, we should include > these. > (I'm including these as private dependencies in the Hibernate Search > modules). > > Thanks, > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150728/f367dedd/attachment-0001.html From spaulger at codezen.co.uk Tue Jul 28 16:43:56 2015 From: spaulger at codezen.co.uk (Simon Paulger) Date: Tue, 28 Jul 2015 21:43:56 +0100 Subject: [infinispan-dev] Redis infinispan cache store Message-ID: Hi, I'm interested in developing inifinispan integration with Redis for use in JBoss. Before working on JBoss, I first need to add the capability to Infinispan itself. 
Is this an enhancement that the infinispan community would be interested in? Regards, Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150728/94ea5246/attachment.html From ttarrant at redhat.com Wed Jul 29 05:47:52 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Wed, 29 Jul 2015 11:47:52 +0200 Subject: [infinispan-dev] Redis infinispan cache store In-Reply-To: References: Message-ID: <55B8A148.1090709@redhat.com> Yes, we would be very interested. Check out the Infinispan cachestore archetype [1] to get things started, and ask here or on IRC on #infinispan for help, if you need more information. Tristan [1] https://github.com/infinispan/infinispan-cachestore-archetype On 28/07/2015 22:43, Simon Paulger wrote: > Hi, > > I'm interested in developing inifinispan integration with Redis for use > in JBoss. Before working on JBoss, I first need to add the capability to > Infinispan itself. > > Is this an enhancement that the infinispan community would be interested in? > > Regards, > Simon > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Wed Jul 29 08:58:03 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Wed, 29 Jul 2015 14:58:03 +0200 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55B68BA5.9080304@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> Message-ID: <55B8CDDB.2020507@redhat.com> Looking at the RpcManager interface in 7.x, that method had already been deprecated and the recommendation is to migrate to RpcManager.invokeRemotely(java.util.Collection, org.infinispan.commands.ReplicableCommand, RpcOptions) so I suggest the change should be in ORM rather than in Infinispan. Tristan On 27/07/2015 21:51, Scott Marlow wrote: > Radim, > > I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan > 8.0.0.Beta2 and get the same compile error. > > I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 > but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 > (which is almost ready to go final). > > Would it be possible for Infinispan 8.0.0.Final to correct the below > compiler error, so that WildFly 10 can work with both Hibernate 5.0 + > Infinispan 8.0? > > Scott > > On 07/16/2015 04:34 AM, Radim Vansa wrote: >> >> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>> >>> " >>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>> error: cannot find symbol >>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>> ^ >>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>> location: variable rpcManager of type RpcManager >>> " >>> >>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >> >> This should be fixed in ORM 5.x, but should I do a PR that uses >> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >> out before ORM 5.0.0.Final? >> >>> >>> More inline below... 
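To illustrate the migration Tristan recommends above, a rough sketch of how the broadcast in the 2LC Caches helper could move off the removed method. This is only a guess at the shape of the change, not the actual HHH-9999 patch, and it assumes the 7.x behaviour where a null recipient collection means "send to all members":

    // Old call, no longer present on RpcManager in Infinispan 8:
    //   rpcManager.broadcastRpcCommand( cmd, isSync );

    // Sketch of the replacement using the non-deprecated overload:
    rpcManager.invokeRemotely(
          null,                                        // null recipients: broadcast to the whole cluster
          cmd,                                         // the EvictAllCommand being propagated
          rpcManager.getDefaultRpcOptions( isSync ) ); // sync vs. async with the default timeout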
>>> >>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>> in a micro release of Hibernate.. that's against our conventions. >>>> The better plan would be to work towards a Hibernate 5.1 version for >>>> this, or make sure to talk with him in advance if that doesn't work >>>> out. >>> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >>> with that target in mind, we should figure out if we can have a >>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>> together. IMO, we should either change the Infinispan 8 API to still >>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >> >> See above: I would 'make sure' that they work together, but it's IMO not >> possible until Infinispan 8.0 is released. >> >> Radim >> >>> >>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>> to change in Infinispan 8.0. >>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>> will be out. Since that would require only changes internal to that >>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>> >>>>> Radim >>>>> >>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>> >>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>> Hi, >>>>>> >>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>> changes to integrate with Infinispan 8.0? 
>>>>>> >>>>>> Thanks, >>>>>> Scott >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> -- >>>>> Radim Vansa >>>>> JBoss Performance Team >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From spaulger at codezen.co.uk Wed Jul 29 14:31:28 2015 From: spaulger at codezen.co.uk (Simon Paulger) Date: Wed, 29 Jul 2015 19:31:28 +0100 Subject: [infinispan-dev] Redis infinispan cache store In-Reply-To: <55B8A148.1090709@redhat.com> References: <55B8A148.1090709@redhat.com> Message-ID: Hi Tristan, With regards to project repositories, should I add the code to a fork of the main infinispan project or create a standalone repository as per hbase, jdbm, etc? And I presume there's no objections to using a third party Redis client? I was thinking Jedis (https://github.com/xetorthio/jedis - MIT license, currently maintained). Thanks, Simon On 29 July 2015 at 10:47, Tristan Tarrant wrote: > Yes, we would be very interested. Check out the Infinispan cachestore > archetype [1] to get things started, and ask here or on IRC on > #infinispan for help, if you need more information. > > > Tristan > > [1] https://github.com/infinispan/infinispan-cachestore-archetype > > On 28/07/2015 22:43, Simon Paulger wrote: > > Hi, > > > > I'm interested in developing inifinispan integration with Redis for use > > in JBoss. Before working on JBoss, I first need to add the capability to > > Infinispan itself. > > > > Is this an enhancement that the infinispan community would be interested > in? > > > > Regards, > > Simon > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150729/b44c4c0e/attachment.html From smarlow at redhat.com Wed Jul 29 14:48:03 2015 From: smarlow at redhat.com (Scott Marlow) Date: Wed, 29 Jul 2015 14:48:03 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... 
In-Reply-To: <55B8CDDB.2020507@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> Message-ID: <55B91FE3.70008@redhat.com> On 07/29/2015 08:58 AM, Tristan Tarrant wrote: > Looking at the RpcManager interface in 7.x, that method had already been > deprecated and the recommendation is to migrate to > > RpcManager.invokeRemotely(java.util.Collection, > org.infinispan.commands.ReplicableCommand, RpcOptions) > > so I suggest the change should be in ORM rather than in Infinispan. I think that this will block WildFly 10 from upgrading to Infinispan 8, as doing so will break Hibernate 5.0 (when Infinispan 8 is used as the 2lc). > > Tristan > > On 27/07/2015 21:51, Scott Marlow wrote: >> Radim, >> >> I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan >> 8.0.0.Beta2 and get the same compile error. >> >> I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 >> but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 >> (which is almost ready to go final). >> >> Would it be possible for Infinispan 8.0.0.Final to correct the below >> compiler error, so that WildFly 10 can work with both Hibernate 5.0 + >> Infinispan 8.0? >> >> Scott >> >> On 07/16/2015 04:34 AM, Radim Vansa wrote: >>> >>> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>>> >>>> " >>>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>>> error: cannot find symbol >>>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>>> ^ >>>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>>> location: variable rpcManager of type RpcManager >>>> " >>>> >>>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >>> >>> This should be fixed in ORM 5.x, but should I do a PR that uses >>> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >>> out before ORM 5.0.0.Final? >>> >>>> >>>> More inline below... >>>> >>>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>>> in a micro release of Hibernate.. that's against our conventions. >>>>> The better plan would be to work towards a Hibernate 5.1 version for >>>>> this, or make sure to talk with him in advance if that doesn't work >>>>> out. >>>> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >>>> with that target in mind, we should figure out if we can have a >>>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>>> together. IMO, we should either change the Infinispan 8 API to still >>>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >>> >>> See above: I would 'make sure' that they work together, but it's IMO not >>> possible until Infinispan 8.0 is released. >>> >>> Radim >>> >>>> >>>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>>> to change in Infinispan 8.0. 
>>>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>>> will be out. Since that would require only changes internal to that >>>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>>> >>>>>> Radim >>>>>> >>>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>>> >>>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>>> changes to integrate with Infinispan 8.0? >>>>>>> >>>>>>> Thanks, >>>>>>> Scott >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>>> -- >>>>>> Radim Vansa >>>>>> JBoss Performance Team >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > From smarlow at redhat.com Wed Jul 29 15:49:41 2015 From: smarlow at redhat.com (Scott Marlow) Date: Wed, 29 Jul 2015 15:49:41 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55B91FE3.70008@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> Message-ID: <55B92E55.9030709@redhat.com> Paul Ferraro is going to work on a HHH-9999 change to address this that could help Hibernate ORM 5.0 work with Infinispan 8.0. If this works out, the change could be in Hibernate ORM 5.0 instead of Infinispan 8.0.0.Final. If not, we need a different solution. On 07/29/2015 02:48 PM, Scott Marlow wrote: > On 07/29/2015 08:58 AM, Tristan Tarrant wrote: >> Looking at the RpcManager interface in 7.x, that method had already been >> deprecated and the recommendation is to migrate to >> >> RpcManager.invokeRemotely(java.util.Collection, >> org.infinispan.commands.ReplicableCommand, RpcOptions) >> >> so I suggest the change should be in ORM rather than in Infinispan. > > I think that this will block WildFly 10 from upgrading to Infinispan 8, > as doing so will break Hibernate 5.0 (when Infinispan 8 is used as the > 2lc). > >> >> Tristan >> >> On 27/07/2015 21:51, Scott Marlow wrote: >>> Radim, >>> >>> I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan >>> 8.0.0.Beta2 and get the same compile error. >>> >>> I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 >>> but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 >>> (which is almost ready to go final). 
>>> >>> Would it be possible for Infinispan 8.0.0.Final to correct the below >>> compiler error, so that WildFly 10 can work with both Hibernate 5.0 + >>> Infinispan 8.0? >>> >>> Scott >>> >>> On 07/16/2015 04:34 AM, Radim Vansa wrote: >>>> >>>> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>>>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>>>> >>>>> " >>>>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>>>> error: cannot find symbol >>>>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>>>> ^ >>>>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>>>> location: variable rpcManager of type RpcManager >>>>> " >>>>> >>>>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >>>> >>>> This should be fixed in ORM 5.x, but should I do a PR that uses >>>> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >>>> out before ORM 5.0.0.Final? >>>> >>>>> >>>>> More inline below... >>>>> >>>>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>>>> in a micro release of Hibernate.. that's against our conventions. >>>>>> The better plan would be to work towards a Hibernate 5.1 version for >>>>>> this, or make sure to talk with him in advance if that doesn't work >>>>>> out. >>>>> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >>>>> with that target in mind, we should figure out if we can have a >>>>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>>>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>>>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>>>> together. IMO, we should either change the Infinispan 8 API to still >>>>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >>>> >>>> See above: I would 'make sure' that they work together, but it's IMO not >>>> possible until Infinispan 8.0 is released. >>>> >>>> Radim >>>> >>>>> >>>>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>>>> to change in Infinispan 8.0. >>>>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>>>> will be out. Since that would require only changes internal to that >>>>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>>>> >>>>>>> Radim >>>>>>> >>>>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>>>> >>>>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>>>> changes to integrate with Infinispan 8.0? 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Scott >>>>>>>> _______________________________________________ >>>>>>>> infinispan-dev mailing list >>>>>>>> infinispan-dev at lists.jboss.org >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>>> -- >>>>>>> Radim Vansa >>>>>>> JBoss Performance Team >>>>>>> >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From smarlow at redhat.com Wed Jul 29 16:58:47 2015 From: smarlow at redhat.com (Scott Marlow) Date: Wed, 29 Jul 2015 16:58:47 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55B92E55.9030709@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> Message-ID: <55B93E87.2060604@redhat.com> With Paul's HHH-9999 fix, we can now compile and pass the ORM tests with Infinispan 7.3.x. We can also compile Hibernate 5.0 against Infinispan 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn On 07/29/2015 03:49 PM, Scott Marlow wrote: > Paul Ferraro is going to work on a HHH-9999 change to address this that > could help Hibernate ORM 5.0 work with Infinispan 8.0. > > If this works out, the change could be in Hibernate ORM 5.0 instead of > Infinispan 8.0.0.Final. If not, we need a different solution. > > On 07/29/2015 02:48 PM, Scott Marlow wrote: >> On 07/29/2015 08:58 AM, Tristan Tarrant wrote: >>> Looking at the RpcManager interface in 7.x, that method had already been >>> deprecated and the recommendation is to migrate to >>> >>> RpcManager.invokeRemotely(java.util.Collection, >>> org.infinispan.commands.ReplicableCommand, RpcOptions) >>> >>> so I suggest the change should be in ORM rather than in Infinispan. >> >> I think that this will block WildFly 10 from upgrading to Infinispan 8, >> as doing so will break Hibernate 5.0 (when Infinispan 8 is used as the >> 2lc). >> >>> >>> Tristan >>> >>> On 27/07/2015 21:51, Scott Marlow wrote: >>>> Radim, >>>> >>>> I tried compiling Hibernate ORM 5.0 (master branch) against Infinispan >>>> 8.0.0.Beta2 and get the same compile error. >>>> >>>> I think that others want to bring Infinispan 8.0.0.Final into WildFly 10 >>>> but Infinispan 8.0.0.Final currently isn't compatible with Hibernate 5.0 >>>> (which is almost ready to go final). >>>> >>>> Would it be possible for Infinispan 8.0.0.Final to correct the below >>>> compiler error, so that WildFly 10 can work with both Hibernate 5.0 + >>>> Infinispan 8.0? 
>>>> >>>> Scott >>>> >>>> On 07/16/2015 04:34 AM, Radim Vansa wrote: >>>>> >>>>> On 07/15/2015 03:25 PM, Scott Marlow wrote: >>>>>> Looks like Hibernate 5.0 cannot be compiled against Infinispan 8.0.0.Beta1: >>>>>> >>>>>> " >>>>>> hibernate-infinispan/src/main/java/org/hibernate/cache/infinispan/util/Caches.java:209: >>>>>> error: cannot find symbol >>>>>> rpcManager.broadcastRpcCommand( cmd, isSync ); >>>>>> ^ >>>>>> symbol: method broadcastRpcCommand(EvictAllCommand,boolean) >>>>>> location: variable rpcManager of type RpcManager >>>>>> " >>>>>> >>>>>> Will this be fixed in Hibernate ORM 5.0 or Infinispan 8.0.0.Beta2? >>>>> >>>>> This should be fixed in ORM 5.x, but should I do a PR that uses >>>>> Infinispan 8.0.0.Beta1 as dependency, even though 8.0.0.Final won't get >>>>> out before ORM 5.0.0.Final? >>>>> >>>>>> >>>>>> More inline below... >>>>>> >>>>>> On 07/14/2015 05:27 PM, Sanne Grinovero wrote: >>>>>>> Hi Radim, I suspect that Steve won't allow an Infinispan upgrade to 8 >>>>>>> in a micro release of Hibernate.. that's against our conventions. >>>>>>> The better plan would be to work towards a Hibernate 5.1 version for >>>>>>> this, or make sure to talk with him in advance if that doesn't work >>>>>>> out. >>>>>> The first beta of WildFly 10 is scheduled for August 6th, 2015. IMO, >>>>>> with that target in mind, we should figure out if we can have a >>>>>> Hibernate 5.x and Infinispan 8.x that work together. I don't >>>>>> particularly care if the Hibernate ORM 5.0 release uses Infinispan 8.x, >>>>>> as long as someone is making sure Infinispan 8.x and ORM 5.0, work well >>>>>> together. IMO, we should either change the Infinispan 8 API to still >>>>>> work with Hibernate ORM 5 or update ORM 5 to work with Infinispan 8. >>>>> >>>>> See above: I would 'make sure' that they work together, but it's IMO not >>>>> possible until Infinispan 8.0 is released. >>>>> >>>>> Radim >>>>> >>>>>> >>>>>>> On 14 July 2015 at 16:16, Radim Vansa wrote: >>>>>>>> IIRC currently hibernate-infinispan module uses just the basic cache >>>>>>>> API, which won't change. However, with certain consistency fixes ([1] >>>>>>>> [2] and maybe more) the Interceptor API is used (as the basic >>>>>>>> invalidation mode cannot make 2LC consistent on it own), which is about >>>>>>>> to change in Infinispan 8.0. >>>>>>>> I will take care of updating hibernate-infinispan module to 8.0 when it >>>>>>>> will be out. Since that would require only changes internal to that >>>>>>>> module, I hope this upgrade can be scoped to a micro-release. >>>>>>>> >>>>>>>> Radim >>>>>>>> >>>>>>>> [1] https://hibernate.atlassian.net/browse/HHH-9868 >>>>>>>> [2] https://hibernate.atlassian.net/browse/HHH-9881 >>>>>>>> >>>>>>>> On 07/14/2015 02:34 PM, Scott Marlow wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I heard that Infinispan 8.0 may soon be integrated into WildFly 10.0. >>>>>>>>> If that happens, how does that impact Hibernate ORM 5.0 which currently >>>>>>>>> integrates with Infinispan 7.2.1.Final? Does Hibernate ORM 5.0 need any >>>>>>>>> changes to integrate with Infinispan 8.0? 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Scott >>>>>>>>> _______________________________________________ >>>>>>>>> infinispan-dev mailing list >>>>>>>>> infinispan-dev at lists.jboss.org >>>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>>> >>>>>>>> -- >>>>>>>> Radim Vansa >>>>>>>> JBoss Performance Team >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> infinispan-dev mailing list >>>>>>>> infinispan-dev at lists.jboss.org >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From sanne at infinispan.org Wed Jul 29 17:08:50 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Wed, 29 Jul 2015 22:08:50 +0100 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55B93E87.2060604@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> Message-ID: On 29 July 2015 at 21:58, Scott Marlow wrote: > With Paul's HHH-9999 fix, we can now compile and pass the ORM tests with > Infinispan 7.3.x. We can also compile Hibernate 5.0 against Infinispan > 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn Let me guess.. you expect Infinispan 8 to parse configuration files meant for Infinispan 7? From sanne at infinispan.org Wed Jul 29 18:09:50 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Wed, 29 Jul 2015 23:09:50 +0100 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> Message-ID: On 29 July 2015 at 22:08, Sanne Grinovero wrote: > On 29 July 2015 at 21:58, Scott Marlow wrote: >> With Paul's HHH-9999 fix, we can now compile and pass the ORM tests with >> Infinispan 7.3.x. We can also compile Hibernate 5.0 against Infinispan >> 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn > > Let me guess.. you expect Infinispan 8 to parse configuration files > meant for Infinispan 7? 
I was wrong, the problem is caused by this commit, which makes Infinispan now very picky on not having used AbstractInfinispanTest as a base class: https://github.com/infinispan/infinispan/commit/c9dfb Hibernate uses the Infinispan core testing helpers, but doesn't use TestNG so I'm not sure what the best solution is here. Should we set the thread name consistently in all Hibernate tests too? Seems quite pointless, the tests aren't run in parallel either. Could the helper be made a bit more tolerant for other use cases? Or maybe I should add a system property to have it change behaviour? I'd prefer to set the system property for Infinispan's own use, and default to lenient so that other people can reuse the helpers w/o these contraints. Thanks, Sanne From sanne at infinispan.org Wed Jul 29 19:24:04 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Thu, 30 Jul 2015 00:24:04 +0100 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> Message-ID: I'm fixing the ORM testsuite so that we can run it on both versions as https://hibernate.atlassian.net/browse/HHH-10001 Looks promising: After working around the testsuite problem I'm down to two errors - which seem more likely related to my changes than the Infinispan version change - but will need some sleep before debugging those ;) Sanne On 29 July 2015 at 23:09, Sanne Grinovero wrote: > On 29 July 2015 at 22:08, Sanne Grinovero wrote: >> On 29 July 2015 at 21:58, Scott Marlow wrote: >>> With Paul's HHH-9999 fix, we can now compile and pass the ORM tests with >>> Infinispan 7.3.x. We can also compile Hibernate 5.0 against Infinispan >>> 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn >> >> Let me guess.. you expect Infinispan 8 to parse configuration files >> meant for Infinispan 7? > > I was wrong, the problem is caused by this commit, which makes > Infinispan now very picky on not having used AbstractInfinispanTest as > a base class: > https://github.com/infinispan/infinispan/commit/c9dfb > > Hibernate uses the Infinispan core testing helpers, but doesn't use > TestNG so I'm not sure what the best solution is here. > Should we set the thread name consistently in all Hibernate tests too? > Seems quite pointless, the tests aren't run in parallel either. > > Could the helper be made a bit more tolerant for other use cases? Or > maybe I should add a system property to have it change behaviour? > I'd prefer to set the system property for Infinispan's own use, and > default to lenient so that other people can reuse the helpers w/o > these contraints. > > Thanks, > Sanne From smarlow at redhat.com Thu Jul 30 10:33:09 2015 From: smarlow at redhat.com (Scott Marlow) Date: Thu, 30 Jul 2015 10:33:09 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> Message-ID: <55BA35A5.1010803@redhat.com> Thanks Sanne, this is a big help! 
IMO, we should run the Hibernate 5.0 testsuite again with Infinispan 8.x/master before releasing Infinispan 8.0.0.Final. Scott On 07/29/2015 07:24 PM, Sanne Grinovero wrote: > I'm fixing the ORM testsuite so that we can run it on both versions as > https://hibernate.atlassian.net/browse/HHH-10001 > > Looks promising: > After working around the testsuite problem I'm down to two errors - > which seem more likely related to my changes than the Infinispan > version change - but will need some sleep before debugging those ;) > > Sanne > > On 29 July 2015 at 23:09, Sanne Grinovero wrote: >> On 29 July 2015 at 22:08, Sanne Grinovero wrote: >>> On 29 July 2015 at 21:58, Scott Marlow wrote: >>>> With Paul's HHH-9999 fix, we can now compile and pass the ORM tests with >>>> Infinispan 7.3.x. We can also compile Hibernate 5.0 against Infinispan >>>> 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn >>> >>> Let me guess.. you expect Infinispan 8 to parse configuration files >>> meant for Infinispan 7? >> >> I was wrong, the problem is caused by this commit, which makes >> Infinispan now very picky on not having used AbstractInfinispanTest as >> a base class: >> https://github.com/infinispan/infinispan/commit/c9dfb >> >> Hibernate uses the Infinispan core testing helpers, but doesn't use >> TestNG so I'm not sure what the best solution is here. >> Should we set the thread name consistently in all Hibernate tests too? >> Seems quite pointless, the tests aren't run in parallel either. >> >> Could the helper be made a bit more tolerant for other use cases? Or >> maybe I should add a system property to have it change behaviour? >> I'd prefer to set the system property for Infinispan's own use, and >> default to lenient so that other people can reuse the helpers w/o >> these contraints. >> >> Thanks, >> Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From sanne at infinispan.org Thu Jul 30 10:55:10 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Thu, 30 Jul 2015 14:55:10 +0000 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: <55BA35A5.1010803@redhat.com> References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> <55BA35A5.1010803@redhat.com> Message-ID: Good point. It works now (the issue is fixed) but we need to monitor both projects. I'll see how far that can be automated. Can I have ci spam you in case of failures? On Thu, 30 Jul 2015 15:34 Scott Marlow wrote: > Thanks Sanne, this is a big help! IMO, we should run the Hibernate 5.0 > testsuite again with Infinispan 8.x/master before releasing Infinispan > 8.0.0.Final. 
> > Scott > > On 07/29/2015 07:24 PM, Sanne Grinovero wrote: > > I'm fixing the ORM testsuite so that we can run it on both versions as > > https://hibernate.atlassian.net/browse/HHH-10001 > > > > Looks promising: > > After working around the testsuite problem I'm down to two errors - > > which seem more likely related to my changes than the Infinispan > > version change - but will need some sleep before debugging those ;) > > > > Sanne > > > > On 29 July 2015 at 23:09, Sanne Grinovero wrote: > >> On 29 July 2015 at 22:08, Sanne Grinovero wrote: > >>> On 29 July 2015 at 21:58, Scott Marlow wrote: > >>>> With Paul's HHH-9999 fix, we can now compile and pass the ORM tests > with > >>>> Infinispan 7.3.x. We can also compile Hibernate 5.0 against > Infinispan > >>>> 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn > >>> > >>> Let me guess.. you expect Infinispan 8 to parse configuration files > >>> meant for Infinispan 7? > >> > >> I was wrong, the problem is caused by this commit, which makes > >> Infinispan now very picky on not having used AbstractInfinispanTest as > >> a base class: > >> https://github.com/infinispan/infinispan/commit/c9dfb > >> > >> Hibernate uses the Infinispan core testing helpers, but doesn't use > >> TestNG so I'm not sure what the best solution is here. > >> Should we set the thread name consistently in all Hibernate tests too? > >> Seems quite pointless, the tests aren't run in parallel either. > >> > >> Could the helper be made a bit more tolerant for other use cases? Or > >> maybe I should add a system property to have it change behaviour? > >> I'd prefer to set the system property for Infinispan's own use, and > >> default to lenient so that other people can reuse the helpers w/o > >> these contraints. > >> > >> Thanks, > >> Sanne > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150730/93784f5b/attachment-0001.html From smarlow at redhat.com Thu Jul 30 12:30:17 2015 From: smarlow at redhat.com (Scott Marlow) Date: Thu, 30 Jul 2015 12:30:17 -0400 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> <55BA35A5.1010803@redhat.com> Message-ID: <55BA5119.5030304@redhat.com> I don't think the email should be sent directly to me, better if that goes to someone that works on Infinispan and the Hibernate-Infinispan integration. On 07/30/2015 10:55 AM, Sanne Grinovero wrote: > Good point. It works now (the issue is fixed) but we need to monitor > both projects. I'll see how far that can be automated. Can I have ci > spam you in case of failures? > > > On Thu, 30 Jul 2015 15:34 Scott Marlow > wrote: > > Thanks Sanne, this is a big help! IMO, we should run the Hibernate 5.0 > testsuite again with Infinispan 8.x/master before releasing Infinispan > 8.0.0.Final. 
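On the thread-name question quoted just above: if renaming the Hibernate test threads turned out to be the easier route, it would only take a couple of JUnit callbacks along these lines. The "testng-" prefix is a placeholder; the exact pattern the Infinispan helper enforces would have to be checked against that commit:

    private String originalThreadName;

    @Before
    public void renameTestThread() {
        originalThreadName = Thread.currentThread().getName();
        // Placeholder prefix: use whatever AbstractInfinispanTest actually expects.
        Thread.currentThread().setName("testng-" + getClass().getSimpleName());
    }

    @After
    public void restoreThreadName() {
        Thread.currentThread().setName(originalThreadName);
    }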
> > Scott > > On 07/29/2015 07:24 PM, Sanne Grinovero wrote: > > I'm fixing the ORM testsuite so that we can run it on both > versions as > > https://hibernate.atlassian.net/browse/HHH-10001 > > > > Looks promising: > > After working around the testsuite problem I'm down to two errors - > > which seem more likely related to my changes than the Infinispan > > version change - but will need some sleep before debugging those ;) > > > > Sanne > > > > On 29 July 2015 at 23:09, Sanne Grinovero > wrote: > >> On 29 July 2015 at 22:08, Sanne Grinovero > wrote: > >>> On 29 July 2015 at 21:58, Scott Marlow > wrote: > >>>> With Paul's HHH-9999 fix, we can now compile and pass the ORM > tests with > >>>> Infinispan 7.3.x. We can also compile Hibernate 5.0 against > Infinispan > >>>> 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn > >>> > >>> Let me guess.. you expect Infinispan 8 to parse configuration files > >>> meant for Infinispan 7? > >> > >> I was wrong, the problem is caused by this commit, which makes > >> Infinispan now very picky on not having used > AbstractInfinispanTest as > >> a base class: > >> https://github.com/infinispan/infinispan/commit/c9dfb > >> > >> Hibernate uses the Infinispan core testing helpers, but doesn't use > >> TestNG so I'm not sure what the best solution is here. > >> Should we set the thread name consistently in all Hibernate > tests too? > >> Seems quite pointless, the tests aren't run in parallel either. > >> > >> Could the helper be made a bit more tolerant for other use cases? Or > >> maybe I should add a system property to have it change behaviour? > >> I'd prefer to set the system property for Infinispan's own use, and > >> default to lenient so that other people can reuse the helpers w/o > >> these contraints. > >> > >> Thanks, > >> Sanne > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From ttarrant at redhat.com Fri Jul 31 04:01:12 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 31 Jul 2015 10:01:12 +0200 Subject: [infinispan-dev] Redis infinispan cache store In-Reply-To: References: <55B8A148.1090709@redhat.com> Message-ID: <55BB2B48.5080802@redhat.com> Let's start with a separate repo to begin with. As for third party clients, choose the one you feel is the best. Thanks for looking into this Tristan On 29/07/2015 20:31, Simon Paulger wrote: > Hi Tristan, > > With regards to project repositories, should I add the code to a fork of > the main infinispan project or create a standalone repository as per > hbase, jdbm, etc? > > And I presume there's no objections to using a third party Redis client? > I was thinking Jedis (https://github.com/xetorthio/jedis - MIT license, > currently maintained). > > Thanks, > Simon > > On 29 July 2015 at 10:47, Tristan Tarrant > wrote: > > Yes, we would be very interested. Check out the Infinispan cachestore > archetype [1] to get things started, and ask here or on IRC on > #infinispan for help, if you need more information. 
> > > Tristan > > [1] https://github.com/infinispan/infinispan-cachestore-archetype > > On 28/07/2015 22:43, Simon Paulger wrote: > > Hi, > > > > I'm interested in developing inifinispan integration with Redis > for use > > in JBoss. Before working on JBoss, I first need to add the > capability to > > Infinispan itself. > > > > Is this an enhancement that the infinispan community would be > interested in? > > > > Regards, > > Simon > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From dan.berindei at gmail.com Fri Jul 31 06:30:27 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 31 Jul 2015 13:30:27 +0300 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> Message-ID: Hi Sanne Does Hibernate really need to use the Infinispan test helpers? They don't really do much unless you run the tests in parallel and you need them to be isolated... Cheers Dan On Thu, Jul 30, 2015 at 1:09 AM, Sanne Grinovero wrote: > On 29 July 2015 at 22:08, Sanne Grinovero wrote: >> On 29 July 2015 at 21:58, Scott Marlow wrote: >>> With Paul's HHH-9999 fix, we can now compile and pass the ORM tests with >>> Infinispan 7.3.x. We can also compile Hibernate 5.0 against Infinispan >>> 8.0.0.Beta2 but get ORM test failures http://pastebin.com/T2Yt3gdn >> >> Let me guess.. you expect Infinispan 8 to parse configuration files >> meant for Infinispan 7? > > I was wrong, the problem is caused by this commit, which makes > Infinispan now very picky on not having used AbstractInfinispanTest as > a base class: > https://github.com/infinispan/infinispan/commit/c9dfb > > Hibernate uses the Infinispan core testing helpers, but doesn't use > TestNG so I'm not sure what the best solution is here. > Should we set the thread name consistently in all Hibernate tests too? > Seems quite pointless, the tests aren't run in parallel either. > > Could the helper be made a bit more tolerant for other use cases? Or > maybe I should add a system property to have it change behaviour? > I'd prefer to set the system property for Infinispan's own use, and > default to lenient so that other people can reuse the helpers w/o > these contraints. > > Thanks, > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From sanne at infinispan.org Fri Jul 31 08:17:24 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 31 Jul 2015 13:17:24 +0100 Subject: [infinispan-dev] Question about Hibernate ORM 5.0 + Infinispan 8.0... 
In-Reply-To: References: <55A501B9.7060608@redhat.com> <55A527D5.8060606@redhat.com> <55A65F5A.7090904@redhat.com> <55A76CA4.3000608@redhat.com> <55B68BA5.9080304@redhat.com> <55B8CDDB.2020507@redhat.com> <55B91FE3.70008@redhat.com> <55B92E55.9030709@redhat.com> <55B93E87.2060604@redhat.com> Message-ID: On 31 July 2015 at 11:30, Dan Berindei wrote: > Hi Sanne > > Does Hibernate really need to use the Infinispan test helpers? They > don't really do much unless you run the tests in parallel and you need > them to be isolated... Good point. I don't know why it's using them exactly, but I guess it's also to reduce thread pools and similar. We should ask Galder. But I thought there was an intention to ultimately suggest Infinispan end users to use our test helpers for development too? Sanne From sanne at infinispan.org Fri Jul 31 08:30:03 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 31 Jul 2015 13:30:03 +0100 Subject: [infinispan-dev] Shared vs Non-Shared CacheStores In-Reply-To: References: <55ACBA42.5070507@redhat.com> Message-ID: On 20 July 2015 at 11:02, Dan Berindei wrote: > Sanne, I think changing the cache store API is actually the most > painful part, so we should only do it if we gain a concrete advantage > from doing it. From a compatibility point of view, implementing a new > interface vs implementing the same interface with completely different > methods is just as bad. Right, from that perspective it's a quite horrible proposal. But I think we can agree that only the "SharedCacheStore" deserves to be considered an SPI, right? That's the one people will normally customize to map stuff to other stores one might have. I think it's important that beyond Infinispan 8.0 API's freeze, we can make any change to the non-shared SPI without affecting users who implement a custom shared cachestore. I highly doubt someone will implement a high-performance custom off heap swap strategy, but if someone does he should contribute it and will probably need to make integration level changes. We probably won't have the time to implement a new super efficient local-only cachestore to replace the leveldb one, but I'd like to keep the possibility open to do that beyond 8.0, *especially* without breaking compatibility for other people. Sanne > > On Mon, Jul 20, 2015 at 12:41 PM, Sanne Grinovero wrote: >> +1 for incremental changes.. >> >> I'd see the first step as defining two different interfaces; >> essentially we need to choose two good names. >> >> Then we could have both interfaces still implement the same identical >> methods, but go through each implementation and decide to "mark" it as >> shared-only or never-shared. >> >> That would make it simpler to make concrete change proposals on each >> of them and start taking some advantage from the split. I think you'll >> need the two different interfaces to implement the validations you >> mentioned. >> >> For Infinispan 8's goals, I'd be happy enough to keep the >> "shared-only" interface quite similar to the current one, but mark the >> never-shared one as a private or experimental SPI to allow ourselves >> some more flexibility in performance oriented changes. >> >> Thanks, >> Sanne >> >> On 20 July 2015 at 10:07, Tristan Tarrant wrote: >>> Sanne, well written. 
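A sketch of that incremental first step, with deliberately made-up interface names; ExternalStore and StoreConfiguration.shared() are the existing 7.x SPI and configuration types as far as I recall them, and the real check would live wherever the configuration validation Tristan mentioned ends up:

    // Step 1: two marker interfaces over the existing contract, so every
    // implementation declares what it is meant for.
    public interface SharedStore<K, V> extends ExternalStore<K, V> { }      // safe behind shared="true"
    public interface NonSharedStore<K, V> extends ExternalStore<K, V> { }   // local "swap space" only

    // Step 2: fail fast at configuration time instead of corrupting data later.
    void validateStore(StoreConfiguration config, ExternalStore<?, ?> store) {
        if (config.shared() && store instanceof NonSharedStore) {
            throw new CacheConfigurationException(
                  store.getClass().getSimpleName() + " cannot be enabled as a shared store");
        }
        if (!config.shared() && store instanceof SharedStore) {
            // Probably legal, but worth at least a warning: a shared store
            // configured per-node will see duplicate writes from every owner.
        }
    }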
From sanne at infinispan.org  Fri Jul 31 08:30:03 2015
From: sanne at infinispan.org (Sanne Grinovero)
Date: Fri, 31 Jul 2015 13:30:03 +0100
Subject: [infinispan-dev] Shared vs Non-Shared CacheStores
In-Reply-To:
References: <55ACBA42.5070507@redhat.com>
Message-ID:

On 20 July 2015 at 11:02, Dan Berindei wrote:
> Sanne, I think changing the cache store API is actually the most
> painful part, so we should only do it if we gain a concrete advantage
> from doing it. From a compatibility point of view, implementing a new
> interface vs implementing the same interface with completely different
> methods is just as bad.

Right, from that perspective it's a quite horrible proposal.

But I think we can agree that only the "SharedCacheStore" deserves to
be considered an SPI, right? That's the one people will normally
customize to map stuff to other stores one might have.

I think it's important that beyond the Infinispan 8.0 API freeze we can
make any change to the non-shared SPI without affecting users who
implement a custom shared cache store. I highly doubt someone will
implement a high-performance custom off-heap swap strategy, but if
someone does he should contribute it and will probably need to make
integration-level changes.

We probably won't have the time to implement a new, super efficient
local-only cache store to replace the LevelDB one, but I'd like to keep
the possibility open to do that beyond 8.0, *especially* without
breaking compatibility for other people.

Sanne

> On Mon, Jul 20, 2015 at 12:41 PM, Sanne Grinovero wrote:
>> +1 for incremental changes..
>>
>> I'd see the first step as defining two different interfaces;
>> essentially we need to choose two good names.
>>
>> Then we could have both interfaces still implement the same identical
>> methods, but go through each implementation and decide to "mark" it as
>> shared-only or never-shared.
>>
>> That would make it simpler to make concrete change proposals on each
>> of them and start taking some advantage from the split. I think you'll
>> need the two different interfaces to implement the validations you
>> mentioned.
>>
>> For Infinispan 8's goals, I'd be happy enough to keep the
>> "shared-only" interface quite similar to the current one, but mark the
>> never-shared one as a private or experimental SPI to allow ourselves
>> some more flexibility in performance oriented changes.
>>
>> Thanks,
>> Sanne
>>
>> On 20 July 2015 at 10:07, Tristan Tarrant wrote:
>>> Sanne, well written.
>>> Before actually implementing any of the optimizations/changes you
>>> mention, I think the lowest-hanging fruit we should grab now is just to
>>> add checks to all of our cache stores to actually throw an exception when
>>> they are being enabled in unsupported configurations.
>>>
>>> I've created [1] to get us started
>>>
>>> Tristan
>>>
>>> [1] https://issues.jboss.org/browse/ISPN-5617
>>>
>>> On 16/07/2015 15:32, Sanne Grinovero wrote:
>>>> I would like to propose a clear-cut separation between our shared and
>>>> non-shared CacheStores, in all terms such as:
>>>> - Configuration options
>>>> - Integration contracts (split the CacheStore SPI)
>>>> - Implementations
>>>> - Terminology, to avoid any further confusion around valid
>>>> configurations and sensible architectures
>>>>
>>>> We have loads of examples of users who get in trouble by configuring
>>>> one incorrectly, but there are also plenty of efficiency improvements
>>>> we could take advantage of by clearly splitting the integration points
>>>> and the implementations into two categories.
>>>>
>>>> Not least, it's a very common and dangerous pitfall to assume that
>>>> Infinispan is able to restore a consistent state after having stopped
>>>> a DIST cluster which passivated into non-shared CacheStore instances,
>>>> or even REPL clusters when they don't shut down all at the same exact
>>>> time (and "exact same time" is a strange concept at least..). We need
>>>> to clarify the different options, tradeoffs and their consequences to
>>>> users and ourselves, as a clearly defined use case will avoid bugs
>>>> and simplify implementations.
>>>>
>>>> # The purpose of each
>>>> I think that people should use a non-shared (local?) CacheStore for
>>>> the sole purpose of expanding the storage capacity of each single
>>>> node.. be it because you don't have enough memory at all, or be it
>>>> because you prefer some extra safety margin because either your
>>>> estimates are complex, or maybe because we live in a real world where
>>>> the hashing function might not be perfect in practice. I hope we all
>>>> agree that Infinispan should be able to take such situations with at
>>>> worst a graceful performance degradation, rather than complain by
>>>> sending OOMs to the admin and setting the service on strike.
>>>>
>>>> A Shared CacheStore is useful for very different purposes; primarily
>>>> to implement a Cache on some other service - for example your (single,
>>>> shared) RDBMS, a slow (or expensive) webservice your organization has
>>>> to call frequently, etc. Or it's useful even as a write-through cache
>>>> on a similar service, maybe internal but not able to handle the high
>>>> variation of load spikes which Infinispan can handle better.
>>>> Finally, a great use case is to have a consistent backup of all your
>>>> data-grid content, possibly in some "reference" form such as JPA
>>>> mapped entities.
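To make the two purposes concrete in configuration terms, here is a rough sketch using the programmatic configuration API. It is a hedged illustration only: the builder method names are quoted from memory of the 7.x/8.x API and may not match exactly, the paths are placeholders, and the JDBC connection and table settings are deliberately omitted.

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.persistence.jdbc.configuration.JdbcStringBasedStoreConfigurationBuilder;

    public class StorePurposeExamples {

       // Non-shared: a per-node overflow area, wiped and repopulated as the node restarts.
       static Configuration localOverflow() {
          ConfigurationBuilder builder = new ConfigurationBuilder();
          builder.clustering().cacheMode(CacheMode.DIST_SYNC)
                 .persistence()
                    .passivation(true)
                    .addSingleFileStore()
                       .location("/var/lib/infinispan/overflow")
                       .shared(false)
                       .purgeOnStartup(true);
          return builder.build();
       }

       // Shared: every node talks to the same external database, the system of record.
       static Configuration sharedJdbcBackup() {
          ConfigurationBuilder builder = new ConfigurationBuilder();
          builder.clustering().cacheMode(CacheMode.DIST_SYNC)
                 .persistence()
                    .addStore(JdbcStringBasedStoreConfigurationBuilder.class)
                       .shared(true)
                       .preload(false);
          // connection pool, dialect and table settings omitted for brevity
          return builder.build();
       }
    }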
>>>> # Benefits of a Non-Shared
>>>> A non-shared CacheStore implementor should be able to take advantage
>>>> of *its purpose*; among the big ones I see:
>>>> - Exclusive usage -> locking of a specific entry can be handled at
>>>> data container level, which can simplify quite some internal code.
>>>> - Reliability -> since a clustered node needs to wipe its state at
>>>> reboot (after a crash), it's much simpler to code any such CacheStore
>>>> to avoid any form of disk sync or persistence guarantees.
>>>> - Encoding format -> this can be controlled entirely by Infinispan,
>>>> and there is no need to take factors like rolling-upgrade-compatible
>>>> encodings into account. JBoss Marshalling would be good enough, or some
>>>> implementations might not need to serialize at all.
>>>>
>>>> Our non-shared CacheStore implementation(s) could take advantage of
>>>> lower-level, more complex code optimisations and interfaces, as users
>>>> would rarely want to customize one of these, while the use case of
>>>> mapping data to a shared service needs a more user-friendly SPI so as
>>>> to keep it simple to plug in custom stores: custom data formats, custom
>>>> connectors, get some help in implementing concurrency correctly.
>>>> Proper transaction integration for the CacheStore has been on our
>>>> wishlist for some time too. I suspect that accepting that we have been
>>>> mixing up two different things under the same name so far would make it
>>>> simpler to implement further improvements such as transactions: the
>>>> way to do such a thing is very different in each of these use cases,
>>>> so it would help at least to implement it on a subset first, or maybe
>>>> only if it turns out there's no need for such things in the context of
>>>> the local-only-dedicated "swapfile".
>>>>
>>>> # Mixed types should be killed
>>>> I'm aware that some of our current implementations _could_ work both as
>>>> shared or non-shared, for example the JDBC or JPACacheStore or the
>>>> Remote CacheStore.. but in most cases it doesn't make much sense. Why
>>>> would you ever want to use the JPACacheStore if not to share data with
>>>> a _shared_ database?
>>>>
>>>> We should take such options away, and by doing so focus on the use
>>>> cases which actually matter, simplify the implementations and
>>>> improve the configuration validations.
>>>>
>>>> If ever a compelling storage technology is identified which we'd like to
>>>> offer as an option for both shared and non-shared, I would still
>>>> recommend making two different implementations, as there certainly are
>>>> different requirements and assumptions when coding such a thing.
>>>>
>>>> Not least, I would very much like to see a default local CacheStore:
>>>> picking one for local "emergency swapping" should be a no-brainer for
>>>> users; we could set one up by default and not bother newcomers with
>>>> complex choices.
>>>>
>>>> If we simplify the requirements of such a thing, it should be easy to
>>>> write one on standard Java NIO2 APIs and get rid of the complexities of
>>>> maintaining the native integration with things like LevelDB, not least
>>>> the inefficiency of making such native calls from Java.
>>>>
>>>> Then as a second step, we should attack the other use case: backups;
>>>> from a *purpose-driven perspective* I'd then see us revive the Cassandra
>>>> integration, obviously as a shared-only option.
>>>>
>>>> Cheers,
>>>> Sanne
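The split Sanne and Dan are debating, together with the validation Tristan filed ISPN-5617 for, could end up looking roughly like the sketch below. All of the type and method names introduced here are hypothetical, invented purely for illustration; the imports refer to what I believe are existing Infinispan 7.x/8.x types, but even their use here is a sketch of one possible design, not the project's agreed one.

    import org.infinispan.commons.CacheConfigurationException;
    import org.infinispan.configuration.cache.StoreConfiguration;
    import org.infinispan.persistence.spi.CacheLoader;
    import org.infinispan.persistence.spi.CacheWriter;

    public final class StoreContracts {

       // Hypothetical: stable, user-facing SPI for stores that map entries onto an
       // external shared system (RDBMS, web service, backup target, ...).
       public interface SharedStore<K, V> extends CacheLoader<K, V>, CacheWriter<K, V> {
       }

       // Hypothetical: internal/experimental SPI for per-node "swap space" stores;
       // may assume exclusive access, volatile contents and an Infinispan-chosen encoding.
       public interface LocalOnlyStore<K, V> extends CacheLoader<K, V>, CacheWriter<K, V> {
       }

       // The kind of startup check ISPN-5617 asks for: refuse configurations that
       // declare a store as shared when the implementation can never be shared.
       public static void validate(StoreConfiguration configuration, Object store) {
          if (configuration.shared() && store instanceof LocalOnlyStore) {
             throw new CacheConfigurationException("Store " + store.getClass().getName() +
                   " is local-only and must not be configured as shared");
          }
       }

       private StoreContracts() {
       }
    }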
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> --
>>> Tristan Tarrant
>>> Infinispan Lead
>>> JBoss, a division of Red Hat