From ttarrant at redhat.com  Thu Oct  2 09:21:07 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Thu, 02 Oct 2014 15:21:07 +0200
Subject: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan
In-Reply-To: <851849146.38574524.1412064092251.JavaMail.zimbra@redhat.com>
References: <54257C74.2050600@redhat.com> <1898996727.37561681.1411744742097.JavaMail.zimbra@redhat.com> <54297D4F.9060009@jboss.com> <133530692.31414088.1412009840434.JavaMail.zimbra@redhat.com> <542A5583.4030908@redhat.com> <851849146.38574524.1412064092251.JavaMail.zimbra@redhat.com>
Message-ID: <542D5143.3070006@redhat.com>

I have successfully created a "hybrid" cluster between an application
using Infinispan in embedded mode and an Infinispan server, by doing the
following on the embedded side:

- use a JGroups Channel wrapped in a MuxHandler
- use a custom class resolver which simulates (or rather... hacks) the
behaviour of the ModularClassResolver when not using modules

You can find the code at my personal GitHub repo:

https://github.com/tristantarrant/infinispan-playground/tree/master/src/main/java/net/dataforte/infinispan/hybrid
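In spirit, the resolver part looks like this - a hand-written sketch, not
the code from the repo (the callbacks are the JBoss Marshalling
ClassResolver SPI; the module-name handling is simplified, and the wire
format has to match ModularClassResolver exactly, so refer to the repo
above for the working version):

    import java.io.IOException;
    import org.jboss.marshalling.AbstractClassResolver;
    import org.jboss.marshalling.Marshaller;
    import org.jboss.marshalling.Unmarshaller;

    public class HybridClassResolver extends AbstractClassResolver {
        @Override
        protected ClassLoader getClassLoader() {
            return Thread.currentThread().getContextClassLoader();
        }

        @Override
        public void annotateClass(Marshaller marshaller, Class<?> clazz) throws IOException {
            // ModularClassResolver prefixes each class name with the identifier
            // of the module that loaded it; without modules we fake the one the
            // server expects (simplified - the real code maps per artifact).
            marshaller.writeObject("org.infinispan:main");
        }

        @Override
        public Class<?> resolveClass(Unmarshaller unmarshaller, String name, long serialVersionUID)
                throws IOException, ClassNotFoundException {
            unmarshaller.readObject(); // skip the module identifier written by the peer
            return super.resolveClass(unmarshaller, name, serialVersionUID);
        }
    }

It gets plugged in via builder.serialization().classResolver(...), mirroring
what the server side does (see the EmbeddedCacheManagerConfigurationService
snippet quoted below).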
Suggestions and improvements are welcome.

Tristan

On 30/09/14 10:01, Stelios Koussouris wrote:
> Hi,
>
> To give a bit of context on this. We are doing a POC where the customer
> wishes to utilize JDG to speed up their application. We need (due to some
> customer requirements) to cluster EMBEDDED JDG (Infinispan library mode)
> with REMOTE JDG (Infinispan Server) nodes. The Infinispan jars should be
> the same, as they are only libraries and are on the same version. However,
> during "clustering" of the caches we started seeing errors which looked
> like they were due to the clustering info differing between the two types
> of cache instantiation (embedded vs. server).
>
> The result was a suggestion to create our own MuxChannel (I don't know if
> we have any other alternatives at this stage to cluster embedded with
> server Infinispan caches), but at the moment we are facing
> https://gist.github.com/skoussou/5edc5689446b67f85ae8
>
> Regards,
>
> Stylianos Kousouris
> Red Hat Middleware Consultant
>
> ----- Original Message -----
> From: "Tristan Tarrant"
> To: "infinispan -Dev List", "Kurt T Stam"
> Cc: "Stelios Koussouris", "Richard Achmatowicz"
> Sent: Tuesday, 30 September, 2014 8:02:27 AM
> Subject: Re: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan
>
> I don't know what Kurt is doing, but Stelios is attempting to cluster an
> application using embedded Infinispan deployed within WF together with
> an Infinispan Server instance.
> The application is managing its own caches, and therefore it is not
> interacting with the underlying Infinispan and JGroups subsystems in WF.
> Infinispan Server uses its Infinispan and JGroups subsystems (which are
> forked from WF's) and therefore is using MuxChannels.
>
> I told Stelios to use a MuxChannel-wrapped Channel in his application
> and it solved part of the issue (he was initially importing the one
> included in WF's jgroups subsystem, but now he's using his local copy),
> but now he has run into further problems, and I believe what Paul &
> Dennis have written might be correct.
>
> The code that configures this is in
> EmbeddedCacheManagerConfigurationService:
>
> GlobalConfigurationBuilder builder = new GlobalConfigurationBuilder();
> ModuleLoader moduleLoader = this.dependencies.getModuleLoader();
> builder.serialization().classResolver(ModularClassResolver.getInstance(moduleLoader));
>
> I don't know how you'd get a ModuleLoader from within a WF deployment,
> but I'm sure it can be done.
>
> Tristan
>
> On 29/09/14 18:57, Paul Ferraro wrote:
>> You should not need to use a MuxChannel. This would only be necessary
>> if there are other EAP services sharing the channel. Using a MuxChannel
>> allows your standalone Infinispan instance to filter these irrelevant
>> messages. However, in JDG, there should be no services other than
>> Infinispan using the channel - hence the MuxChannel stuff is unnecessary.
>>
>> I think Dennis' earlier response was spot on. EAP/JDG configures its
>> cache managers using a ModularClassResolver (which includes a module
>> name along with the class name when marshalling). Your standalone
>> Infinispan instances do not use this and therefore cannot make sense of
>> the message body.
>>
>> Paul
>>
>> ----- Original Message -----
>>> From: "Kurt T Stam"
>>> To: "Stelios Koussouris", "Radoslav Husar"
>>> Cc: "Galder Zamarreño", "Paul Ferraro", "Richard Achmatowicz",
>>> "infinispan -Dev List"
>>> Sent: Monday, September 29, 2014 11:39:59 AM
>>> Subject: Re: Clustering standalone Infinispan w/ WF running Infinispan
>>>
>>> Thanks for following up Stelios, I think Galder is traveling the next 2
>>> weeks.
>>>
>>> So - do we need fixes on both ends then so that the boot order does not
>>> matter? In which project(s) would we apply these changes? Or can they
>>> be applied in the end-user's code?
>>>
>>> Thx,
>>>
>>> --Kurt
>>>
>>> On 9/26/14, 11:19 AM, Stelios Koussouris wrote:
>>>> Hi,
>>>>
>>>> Rado: It is both ways, i.e. if I start the JDG Server first, I get the
>>>> issue on the library-mode side when I start that one. If I reverse the
>>>> startup order, I get it on the JDG Server side.
>>>>
>>>> Question:
>>>> ----------------------------------------------------------------------
>>>> ...IMO the channel needs to be wrapped as
>>>> org.jboss.as.clustering.jgroups.MuxChannel before passing to infinispan.
>>>> ...
>>>> ----------------------------------------------------------------------
>>>> For now this is not being done. If I wanted to do it manually on the
>>>> library side, where I can create the protocol programmatically, are we
>>>> talking about something like this?
>>>>
>>>> ProtocolStackConfigurator configurator =
>>>>     ConfiguratorFactory.getStackConfigurator("jgroups-udp.xml");
>>>> MuxChannel channel = new MuxChannel(configurator);
>>>> org.infinispan.remoting.transport.Transport transport = new
>>>>     org.infinispan.remoting.transport.jgroups.JGroupsTransport(channel);
>>>>
>>>> ....
>>>> then replace the below
>>>>
>>>> new GlobalConfigurationBuilder().clusteredDefault().globalJmxStatistics()
>>>>     .cacheManagerName("RDSCacheManager").allowDuplicateDomains(true).enable()
>>>>     .transport().clusterName("UDM-CLUSTER")
>>>>     .addProperty("configurationFile", "jgroups-udp.xml")
>>>>
>>>> WITH
>>>>
>>>> new GlobalConfigurationBuilder().clusteredDefault().globalJmxStatistics()
>>>>     .cacheManagerName("RDSCacheManager").allowDuplicateDomains(true).enable()
>>>>     .transport(Transport).clusterName("UDM-CLUSTER")
>>>>
>>>> Btw, someone mentioned that if I follow this method I need to know the
>>>> assigned mux ids, but it is not quite clear what that means with regard
>>>> to the JGroupsTransport configuration.
>>>>
>>>> Thanks,
>>>>
>>>> Stylianos Kousouris
>>>> Red Hat Middleware Consultant
>>>>
>>>> ----- Original Message -----
>>>> From: "Radoslav Husar"
>>>> To: "Galder Zamarreño", "Paul Ferraro"
>>>> Cc: "Richard Achmatowicz", "infinispan -Dev List",
>>>> "Stelios Koussouris", "Kurt T Stam"
>>>> Sent: Friday, 26 September, 2014 3:47:16 PM
>>>> Subject: Re: Clustering standalone Infinispan w/ WF running Infinispan
>>>>
>>>> From what Stelios is telling me, the question is a little bit the other
>>>> way round: he is using library-mode Infinispan and JGroups in EAP and
>>>> connecting to JDG. So the question is what JDG is doing with the stack,
>>>> not AS/WF, as its infinispan/jgroups subsystems are not used.
>>>>
>>>> Unfortunately I don't have access to the JDG repo, so I don't know what
>>>> changes have been made there, but if you are using the same jgroups
>>>> logic, IMO the channel needs to be wrapped as
>>>> org.jboss.as.clustering.jgroups.MuxChannel before passing to infinispan.
>>>>
>>>> Rado
>>>>
>>>> On 26/09/14 15:03, Galder Zamarreño wrote:
>>>>> Hey Paul,
>>>>>
>>>>> In the last couple of days, a couple of people have encountered the
>>>>> exception in [1] when trying to cluster a standalone Infinispan app,
>>>>> with its own JGroups configuration file, with an AS/WF-running
>>>>> Infinispan cache.
>>>>>
>>>>> From my POV, 3 possible causes:
>>>>>
>>>>> 1. Dependency mismatches between AS/WF and the standalone app. Having
>>>>> done some quick study of Kurt's case, apart from micro version changes,
>>>>> all looks good.
>>>>>
>>>>> 2. Mismatch in the Infinispan and/or JGroups configuration file.
>>>>>
>>>>> 3. AS/WF puts something on the clustered wire that standalone
>>>>> Infinispan does not expect. Are you still doing multiplexing? Could
>>>>> you be adding extra info to the wire?
>>>>>
>>>>> With this email, I'm trying to get some clarification from you on
>>>>> whether the issue could be due to the 3rd option. If it's either of
>>>>> the first two, it's a matter of digging and finding the difference,
>>>>> but if it's the 3rd one, it's more problematic.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> [1] https://gist.github.com/skoussou/92f062f2d0bd17168e01
>>>>> --
>>>>> Galder Zamarreño
>>>>> galder at redhat.com
>>>>> twitter.com/galderz

From paul.ferraro at redhat.com  Thu Oct  2 10:06:01 2014
From: paul.ferraro at redhat.com (Paul Ferraro)
Date: Thu, 2 Oct 2014 10:06:01 -0400 (EDT)
Subject: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan
In-Reply-To: <542D5143.3070006@redhat.com>
References: <54257C74.2050600@redhat.com> <1898996727.37561681.1411744742097.JavaMail.zimbra@redhat.com> <54297D4F.9060009@jboss.com> <133530692.31414088.1412009840434.JavaMail.zimbra@redhat.com> <542A5583.4030908@redhat.com> <851849146.38574524.1412064092251.JavaMail.zimbra@redhat.com> <542D5143.3070006@redhat.com>
Message-ID: <905338673.1908952.1412258761247.JavaMail.zimbra@redhat.com>

The only other obvious alternative of which I can think is to actually
start the application which uses embedded Infinispan using jboss-modules.
That way you don't need to hack the behavior of ModularClassResolver.
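In practice that means booting through the jboss-modules launcher rather
than a flat classpath - something like this, where the module name and
path are invented for illustration:

    java -jar jboss-modules.jar -mp /path/to/modules com.example.embedded-app

with a module.xml for com.example.embedded-app declaring a dependency on
the org.infinispan module, so classes are loaded (and identified on the
wire) by module.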
----- Original Message -----
> From: "Tristan Tarrant"
> Sent: Thursday, October 2, 2014 9:21:07 AM
> Subject: Re: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan
>
> I have successfully created a "hybrid" cluster between an application
> using Infinispan in embedded mode and an Infinispan server, by doing the
> following on the embedded side:
>
> - use a JGroups Channel wrapped in a MuxHandler
> - use a custom class resolver which simulates (or rather... hacks) the
> behaviour of the ModularClassResolver when not using modules
> [...]
From ttarrant at redhat.com  Thu Oct  2 10:38:21 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Thu, 02 Oct 2014 16:38:21 +0200
Subject: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan
In-Reply-To: <905338673.1908952.1412258761247.JavaMail.zimbra@redhat.com>
References: <54257C74.2050600@redhat.com> <1898996727.37561681.1411744742097.JavaMail.zimbra@redhat.com> <54297D4F.9060009@jboss.com> <133530692.31414088.1412009840434.JavaMail.zimbra@redhat.com> <542A5583.4030908@redhat.com> <851849146.38574524.1412064092251.JavaMail.zimbra@redhat.com> <542D5143.3070006@redhat.com> <905338673.1908952.1412258761247.JavaMail.zimbra@redhat.com>
Message-ID: <542D635D.6030609@redhat.com>

But then the module identifier wouldn't make sense: if you are embedding
infinispan-core.jar, it would definitely not send "org.infinispan:main"
as module:slot, which is what the server needs instead.

Tristan

On 02/10/14 16:06, Paul Ferraro wrote:
> The only other obvious alternative of which I can think is to actually
> start the application which uses embedded Infinispan using jboss-modules.
> That way you don't need to hack the behavior of ModularClassResolver.
> [...]
From paul.ferraro at redhat.com  Thu Oct  2 13:46:14 2014
From: paul.ferraro at redhat.com (Paul Ferraro)
Date: Thu, 2 Oct 2014 13:46:14 -0400 (EDT)
Subject: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan
In-Reply-To: <542D635D.6030609@redhat.com>
References: <54297D4F.9060009@jboss.com> <133530692.31414088.1412009840434.JavaMail.zimbra@redhat.com> <542A5583.4030908@redhat.com> <851849146.38574524.1412064092251.JavaMail.zimbra@redhat.com> <542D5143.3070006@redhat.com> <905338673.1908952.1412258761247.JavaMail.zimbra@redhat.com> <542D635D.6030609@redhat.com>
Message-ID: <1607379301.2119649.1412271974105.JavaMail.zimbra@redhat.com>

infinispan-core and its dependencies would need to be bundled as modules
using the same module descriptors as the server.
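For illustration, such a descriptor has roughly this shape (a skeleton
only - the authoritative module.xml is the one shipped in the server's
modules directory, with the complete resource and dependency lists):

    <module xmlns="urn:jboss:module:1.1" name="org.infinispan" slot="main">
        <resources>
            <resource-root path="infinispan-core.jar"/>
        </resources>
        <dependencies>
            <module name="org.infinispan.commons"/>
            <module name="org.jgroups"/>
            <module name="org.jboss.marshalling"/>
        </dependencies>
    </module>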
> > Tristan > > > On 02/10/14 16:06, Paul Ferraro wrote: > > The only other obvious alternative of which I can think is to actually > > start the application which uses embedded Infinispan using jboss-modules. > > That way you don't need to hack the behavior of ModularClassResolver. > > > > ----- Original Message ----- > >> From: "Tristan Tarrant" > >> To: "Stelios Koussouris" > >> Cc: "Kurt T Stam" , "infinispan -Dev List" > >> , "Richard > >> Achmatowicz" > >> Sent: Thursday, October 2, 2014 9:21:07 AM > >> Subject: Re: [infinispan-dev] Clustering standalone Infinispan w/ WF > >> running Infinispan > >> > >> I have successfully created a "hybrid" cluster between an application > >> using Infinispan in embedded mode and an Infinispan server by doing the > >> following on the embedded side: > >> > >> - use a JGroups Channel wrapped in a MuxHandler > >> - use a custom class resolver which simulates (or rather... hacks) the > >> behaviour of the ModularClassResolver when not using modules > >> > >> You can find the code at my personal GitHub repo: > >> > >> https://github.com/tristantarrant/infinispan-playground/tree/master/src/main/java/net/dataforte/infinispan/hybrid > >> > >> suggestions and improvements are welcome. > >> > >> Tristan > >> > >> On 30/09/14 10:01, Stelios Koussouris wrote: > >>> Hi, > >>> > >>> To give a bit of context on this. We are doing a POC where the customer > >>> wishes to utilize JDG to speed up their application. We need (due to some > >>> customer requirements) to cluster > >>> EMBEDDED JDG (infinispan library mode) with REMOTE JDG (Infinispan > >>> Server) > >>> nodes. The infinispan jars should be the same as they are only libraries > >>> and they > >>> are on the same version. However, during "clustering" of the caches we > >>> started seeing errors which looked like there were due to the fact that > >>> the clustering of the caches contained different > >>> info between the 2 types of cache instantiation (embedded vs server). > >>> > >>> The result was to for a suggestion to create our own MuxChannel (I don't > >>> know if we have any other alternatives at this stage to cluster embedded > >>> with server infinispan caches) but at the moment we are facing > >>> https://gist.github.com/skoussou/5edc5689446b67f85ae8 > >>> > >>> Regards, > >>> > >>> Stylianos Kousouris > >>> Red Hat Middleware Consultant > >>> > >>> ----- Original Message ----- > >>> From: "Tristan Tarrant" > >>> To: "infinispan -Dev List" , "Kurt T > >>> Stam" > >>> > >>> Cc: "Stelios Koussouris" , "Richard Achmatowicz" > >>> > >>> Sent: Tuesday, 30 September, 2014 8:02:27 AM > >>> Subject: Re: [infinispan-dev] Clustering standalone Infinispan w/ WF > >>> running Infinispan > >>> > >>> I don't know what Kurt is doing, but Stelios is attempting to cluster an > >>> application using embedded Infinispan deployed within WF together with > >>> an Infinispan Server instance. > >>> The application is managing its own caches, and therefore it is not > >>> interacting with the underlying Infinispan and JGroups subsystems in WF. > >>> Infinispan Server uses its Infinispan and JGroups subsystems (which are > >>> forked from WF's) and therefore are using MuxChannels. 
From rvansa at redhat.com  Fri Oct  3 04:30:10 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Fri, 03 Oct 2014 10:30:10 +0200
Subject: [infinispan-dev] About size()
Message-ID: <542E5E92.7060504@redhat.com>

Hi,

recently we had a discussion about what size() returns, but I've realized
there are more things that users would like to know. My question is
whether you think that they would really appreciate it, or whether it's
just my QA point of view, where I sometimes compute the 'checksums' of a
cache to see if I didn't lose anything.

There are these sizes:
A) number of owned entries
B) number of entries stored locally in memory
C) number of entries stored in each local cache store
D) number of entries stored in each shared cache store
E) total number of entries in the cache

So far, we can get:
B via withFlags(SKIP_CACHE_LOAD).size()
(passivation ? B : 0) + firstNonZero(C, D) via size()
E via distributed iterators / MR
A via data container iteration + distribution manager query, but only
without a cache store
C or D through
getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores()

I think that it would match users' expectations if size() returned E, and
for the rest we should have special methods on AdvancedCache. That would
of course change the meaning of size(), but I'd say finally to something
that has a firm meaning.
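To make the list concrete, a sketch of the views we can get today (the
Flag usage is public API; ComponentRegistry/PersistenceManager are
internal components, so that part is not a supported way to do it and may
change between versions):

    import org.infinispan.AdvancedCache;
    import org.infinispan.Cache;
    import org.infinispan.context.Flag;
    import org.infinispan.factories.ComponentRegistry;
    import org.infinispan.persistence.manager.PersistenceManager;

    public class SizeViews {
        public static void print(Cache<?, ?> cache) {
            AdvancedCache<?, ?> ac = cache.getAdvancedCache();
            // (B): local in-memory entries, cache stores not consulted
            int inMemory = ac.withFlags(Flag.SKIP_CACHE_LOAD).size();
            // (C)/(D): ask the stores themselves - internal API!
            ComponentRegistry cr = ac.getComponentRegistry();
            PersistenceManager pm = cr.getLocalComponent(PersistenceManager.class);
            // for each store returned by pm.getStores(...) one can query its size
            System.out.println("in memory: " + inMemory);
        }
    }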
WDYT?

Radim

--
Radim Vansa
JBoss DataGrid QA

From dereed at redhat.com  Fri Oct  3 13:38:50 2014
From: dereed at redhat.com (Dennis Reed)
Date: Fri, 03 Oct 2014 12:38:50 -0500
Subject: [infinispan-dev] About size()
In-Reply-To: <542E5E92.7060504@redhat.com>
References: <542E5E92.7060504@redhat.com>
Message-ID: <542EDF2A.7080807@redhat.com>

Since size() is defined by the ConcurrentMap interface, it already has a
precisely defined meaning. The only "correct" implementation is E.

The current non-correct implementation was just because it's expensive
to calculate correctly. I'm not sure the current impl is really that
useful for anything.

-Dennis

On 10/03/2014 03:30 AM, Radim Vansa wrote:
> Hi,
>
> recently we had a discussion about what size() returns, but I've realized
> there are more things that users would like to know.
> [...]

From radhamohanmaheshwari at gmail.com  Fri Oct  3 15:36:07 2014
From: radhamohanmaheshwari at gmail.com (Radha Mohan Maheshwari)
Date: Sat, 4 Oct 2014 01:06:07 +0530
Subject: [infinispan-dev] Configure named cache in remote infinispan 6.0.2 cluster
In-Reply-To: <54196237.2060906@redhat.com>
References: <54196237.2060906@redhat.com>
Message-ID:

Hi all,

I am getting an exception while enabling JMX in Infinispan 6 server:

WARNING: Failed to load the specified log manager class org.jboss.logmanager.LogManager
Oct 04, 2014 1:01:36 AM org.jboss.msc.service.ServiceLogger_$logger greeting
INFO: JBoss MSC version 1.0.4.GA
Oct 04, 2014 1:01:36 AM org.jboss.as.server.ApplicationServerService start
INFO: JBAS015899: JBoss Infinispan Server 6.0.2.Final (AS 7.2.0.Final) starting
Oct 04, 2014 1:01:38 AM org.jboss.as.controller.AbstractOperationContext executeStep
ERROR: JBAS014612: Operation ("parallel-extension-add") failed - address: ([])
java.lang.RuntimeException: JBAS014670: Failed initializing module org.jboss.as.logging
    at org.jboss.as.controller.extension.ParallelExtensionAddHandler$1.execute(ParallelExtensionAddHandler.java:99)
    at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:440)
    at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:322)
    at org.jboss.as.controller.AbstractOperationContext.completeStepInternal(AbstractOperationContext.java:229)
    at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:224)
    at org.jboss.as.controller.ModelControllerImpl.boot(ModelControllerImpl.java:172)
    at org.jboss.as.controller.AbstractControllerService.boot(AbstractControllerService.java:225)
    at org.jboss.as.server.ServerService.boot(ServerService.java:333)
    at org.jboss.as.server.ServerService.boot(ServerService.java:308)
    at org.jboss.as.controller.AbstractControllerService$1.run(AbstractControllerService.java:188)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: JBAS011592: The logging subsystem requires the log manager to be org.jboss.logmanager.LogManager. The subsystem has not be initialized and cannot be used. To use JBoss Log Manager you must add the system property "java.util.logging.manager" and set it to "org.jboss.logmanager.LogManager"
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.jboss.as.controller.extension.ParallelExtensionAddHandler$1.execute(ParallelExtensionAddHandler.java:91)
    ... 10 more
Caused by: java.lang.IllegalStateException: JBAS011592: The logging subsystem requires the log manager to be org.jboss.logmanager.LogManager. The subsystem has not be initialized and cannot be used. To use JBoss Log Manager you must add the system property "java.util.logging.manager" and set it to "org.jboss.logmanager.LogManager"
    at org.jboss.as.logging.LoggingExtension.initialize(LoggingExtension.java:103)
    at org.jboss.as.controller.extension.ExtensionAddHandler.initializeExtension(ExtensionAddHandler.java:97)
    at org.jboss.as.controller.extension.ParallelExtensionAddHandler$ExtensionInitializeTask.call(ParallelExtensionAddHandler.java:127)
    at org.jboss.as.controller.extension.ParallelExtensionAddHandler$ExtensionInitializeTask.call(ParallelExtensionAddHandler.java:113)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
    at org.jboss.threads.JBossThread.run(JBossThread.java:122)

Radha Mohan Maheshwari
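The IllegalStateException above spells out the fix: the JVM has to boot
with JBoss Log Manager selected before any java.util.logging code runs.
Assuming the stock launch scripts, that means passing this property to
the JVM (e.g. via JAVA_OPTS in bin/standalone.conf):

    -Djava.util.logging.manager=org.jboss.logmanager.LogManager

The initial WARNING ("Failed to load the specified log manager class")
suggests the property was set but the jboss-logmanager classes were not
visible to the system class loader, so the jar also has to be loadable
at that point.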
From sanne at infinispan.org  Mon Oct  6 06:57:36 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Mon, 6 Oct 2014 11:57:36 +0100
Subject: [infinispan-dev] About size()
In-Reply-To: <542EDF2A.7080807@redhat.com>
References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com>
Message-ID:

On 3 October 2014 18:38, Dennis Reed wrote:
> Since size() is defined by the ConcurrentMap interface, it already has a
> precisely defined meaning. The only "correct" implementation is E.

+1

> The current non-correct implementation was just because it's expensive
> to calculate correctly. I'm not sure the current impl is really that
> useful for anything.

+1

And not just size() but many others from ConcurrentMap. The question is
if we should drop the interface and all the methods which aren't
efficiently implementable, or fix all those methods.

In the past I loved that I could inject "Infinispan superpowers" into an
application making extensive use of Map and ConcurrentMap without
changes, but that has been deceiving and required great care, such as
verifying that these features would not be used anywhere in the code.
I ended up wrapping the Cache implementation in a custom adapter which
would also implement ConcurrentMap but would throw a RuntimeException if
any of the "unallowed" methods was called; at least I would detect
violations safely.
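The shape of it was roughly this - a from-memory sketch, not the original
code, with only a few methods shown (the real adapter implemented the
whole ConcurrentMap interface):

    import org.infinispan.Cache;

    // Delegates the well-defined operations, fails fast on the ones that
    // would silently return local-only or partial answers.
    public class GuardedCacheMap<K, V> {
        private final Cache<K, V> cache;

        public GuardedCacheMap(Cache<K, V> cache) {
            this.cache = cache;
        }

        public V get(Object key) {
            return cache.get(key); // safe: well-defined in all modes
        }

        public V put(K key, V value) {
            return cache.put(key, value); // safe
        }

        public int size() {
            // a local-only answer would lie in DIST mode
            throw new UnsupportedOperationException("size() is not allowed");
        }

        public boolean containsValue(Object value) {
            throw new UnsupportedOperationException("containsValue() is not allowed");
        }
    }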
I still think that for the time being - until a better solution is
planned - we should throw exceptions... alas, that's an old conversation
and it was never done.

Sanne

From ttarrant at redhat.com  Mon Oct  6 07:44:40 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Mon, 06 Oct 2014 13:44:40 +0200
Subject: [infinispan-dev] About size()
In-Reply-To:
References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com>
Message-ID: <543280A8.5040109@redhat.com>

I think we should provide correct implementations of size() (and others)
and provide shortcut implementations using our usual Flag API (e.g.
SKIP_REMOTE_LOOKUP).
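From the caller's perspective it would look something like this (a sketch
of the proposed semantics, not current behaviour; the flags shown exist
today, but the exact flag set for size() would still need to be defined):

    // 'cache' is an org.infinispan.Cache<K, V>
    int total = cache.size();   // correct, cluster-wide, expensive
    int local = cache.getAdvancedCache()
                     .withFlags(Flag.SKIP_REMOTE_LOOKUP, Flag.SKIP_CACHE_LOAD)
                     .size();   // explicit cheap shortcut: local memory only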
>>> >>> There are those sizes: >>> A) number of owned entries >>> B) number of entries stored locally in memory >>> C) number of entries stored in each local cache store >>> D) number of entries stored in each shared cache store >>> E) total number of entries in cache >>> >>> So far, we can get >>> B via withFlags(SKIP_CACHE_LOAD).size() >>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>> E via distributed iterators / MR >>> A via data container iteration + distribution manager query, but only >>> without cache store >>> C or D through >>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>> >>> I think that it would go along with users' expectations if size() >>> returned E and for the rest we should have special methods on >>> AdvancedCache. That would of course change the meaning of size(), but >>> I'd say that finally to something that has firm meaning. >>> >>> WDYT? >>> >>> Radim >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > From sanne at infinispan.org Mon Oct 6 07:57:29 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 6 Oct 2014 12:57:29 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <543280A8.5040109@redhat.com> References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> Message-ID: On 6 October 2014 12:44, Tristan Tarrant wrote: > I think we should provide correct implementations of size() (and others) > and provide shortcut implementations using our usual Flag API (e.g. > SKIP_REMOTE_LOOKUP). Right that would be very nice. Same for CacheStore interaction: all cachestores should be included unless skipped explicitly. Sanne > > Tristan > > On 06/10/14 12:57, Sanne Grinovero wrote: >> On 3 October 2014 18:38, Dennis Reed wrote: >>> Since size() is defined by the ConcurrentMap interface, it already has a >>> precisely defined meaning. The only "correct" implementation is E. >> +1 >> >>> The current non-correct implementation was just because it's expensive >>> to calculate correctly. I'm not sure the current impl is really that >>> useful for anything. >> +1 >> >> And not just size() but many others from ConcurrentMap. >> The question is if we should drop the interface and all the methods >> which aren't efficiently implementable, or fix all those methods. >> >> In the past I loved that I could inject "Infinispan superpowers" into >> an application making extensive use of Map and ConcurrentMap without >> changes, but that has been deceiving and required great care such as >> verifying that these features would not be used anywhere in the code. >> I ended up wrapping the Cache implementation in a custom adapter which >> would also implement ConcurrentMap but would throw a RuntimeException >> if any of the "unallowed" methods was called, at least I would detect >> violations safely. >> >> I still think that for the time being - until a better solution is >> planned - we should throw exceptions.. alas that's an old conversation >> and it was never done. 
>> Sanne >> >>> -Dennis >>> >>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>> Hi, >>>> >>>> recently we had a discussion about what size() returns, but I've >>>> realized there are more things that users would like to know. My >>>> question is whether you think that they would really appreciate it, or >>>> whether it's just my QA point of view where I sometimes compute the >>>> 'checksums' of cache to see if I didn't lost anything. >>>> >>>> There are those sizes: >>>> A) number of owned entries >>>> B) number of entries stored locally in memory >>>> C) number of entries stored in each local cache store >>>> D) number of entries stored in each shared cache store >>>> E) total number of entries in cache >>>> >>>> So far, we can get >>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>> E via distributed iterators / MR >>>> A via data container iteration + distribution manager query, but only >>>> without cache store >>>> C or D through >>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>> >>>> I think that it would go along with users' expectations if size() >>>> returned E and for the rest we should have special methods on >>>> AdvancedCache. That would of course change the meaning of size(), but >>>> I'd say that finally to something that has firm meaning. >>>> >>>> WDYT? >>>> >>>> Radim >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Tue Oct 7 03:47:11 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 07 Oct 2014 09:47:11 +0200 Subject: [infinispan-dev] Weekly Infinispan IRC meeting 2014-10-06 Message-ID: <54339A7F.7080201@redhat.com> Get the minutes from here: http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2014/infinispan.2014-10-06-14.02.log.html From ttarrant at redhat.com Tue Oct 7 03:48:30 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 07 Oct 2014 09:48:30 +0200 Subject: [infinispan-dev] Weekly Infinispan IRC meeting 2014-09-29 Message-ID: <54339ACE.6090805@redhat.com> I forgot to send this last week :) Get the minutes from here: http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2014/infinispan.2014-09-29-14.02.log.html Tristan From isavin at redhat.com Tue Oct 7 04:06:30 2014 From: isavin at redhat.com (Ion Savin) Date: Tue, 07 Oct 2014 11:06:30 +0300 Subject: [infinispan-dev] GSoC 2015 Message-ID: <54339F06.2020107@redhat.com> http://google-opensource.blogspot.ro/2014/10/google-summer-of-code-2015-and-google.html http://www.google-melange.com/gsoc/events/google/gsoc2015 From pedro at infinispan.org Tue Oct 7 04:09:17 2014 From: pedro at infinispan.org (Pedro Ruivo) Date: Tue, 07 Oct 2014 11:09:17 +0300 Subject: [infinispan-dev] Weekly Infinispan IRC meeting 2014-10-06 In-Reply-To: <54339A7F.7080201@redhat.com> References: <54339A7F.7080201@redhat.com> Message-ID: <54339FAD.20502@infinispan.org> My update: Last week: * I worked on Cross-Site state transfer: applied the last review comments and finally got it integrated
(it will be available in the next release). * Also, the code was backported to product. * I started adding the Cross-Site state transfer configuration to the server mode. * Reviewed and integrated pull requests. This week: * LEADS meeting in Crete. Cheers, Pedro On 10/07/2014 10:47 AM, Tristan Tarrant wrote: > Get the minutes from here: > > http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2014/infinispan.2014-10-06-14.02.log.html > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From isavin at redhat.com Tue Oct 7 04:16:23 2014 From: isavin at redhat.com (Ion Savin) Date: Tue, 07 Oct 2014 11:16:23 +0300 Subject: [infinispan-dev] Weekly Infinispan IRC meeting 2014-10-06 In-Reply-To: <54339A7F.7080201@redhat.com> References: <54339A7F.7080201@redhat.com> Message-ID: <5433A157.1090106@redhat.com> Last week: * HRCPP-174 MSI installer not working on WIN32 platforms * cleanup + OSGi tests for https://github.com/infinispan/infinispan/pull/2640 * product work * integrated the uberjar fixes This week: * finish tests and integrate PR #2640 * ISPN-3836 TCCL socket leak * HRCPP-173 The HotRod client should support a separate CH for each cache -- Ion Savin From ttarrant at redhat.com Tue Oct 7 05:28:03 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 07 Oct 2014 11:28:03 +0200 Subject: [infinispan-dev] Feature branches on Infinispan GitHub repo Message-ID: <5433B223.1000304@redhat.com> Hi guys, since Vladimir and myself are starting work on the server management console task (ISPN-4800), I have created the feature branch to which pull requests will be issued directly on the Infinispan GitHub repository. https://github.com/infinispan/infinispan/tree/ISPN-4800/management_ui So when you see that branch appear when you pull from "origin", know that it wasn't pushed there by mistake :) Tristan From rvansa at redhat.com Tue Oct 7 07:32:04 2014 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 07 Oct 2014 13:32:04 +0200 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> Message-ID: <5433CF34.8010209@redhat.com> If you have one local and one shared cache store, how should the command behave? a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no SKIP_BACKUP_ENTRIES flag right now), where this method returns localStore.size() for first non-shared cache store + passivation ? dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) b) distexec/MR sum of sharedStore.size() + passivation ? sum of dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 c) MR that would count the entries d) wrapper on distributed entry iteration with converters set to return 0-sized entries And what about nodes with different configuration? Radim On 10/06/2014 01:57 PM, Sanne Grinovero wrote: > On 6 October 2014 12:44, Tristan Tarrant wrote: >> I think we should provide correct implementations of size() (and others) >> and provide shortcut implementations using our usual Flag API (e.g. >> SKIP_REMOTE_LOOKUP). > Right that would be very nice. Same for CacheStore interaction: all > cachestores should be included unless skipped explicitly.
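(Editorial sketch of option (c) above: counting entries with a Map/Reduce task, written against the 7.x org.infinispan.distexec.mapreduce API. It illustrates the approach only, assuming the mapping phase visits each key exactly once, which is what option (c) relies on; as discussed later in the thread, a rehash while the task runs can still skew or abort the count.)

  import java.util.Iterator;
  import java.util.Map;
  import org.infinispan.Cache;
  import org.infinispan.distexec.mapreduce.Collector;
  import org.infinispan.distexec.mapreduce.Mapper;
  import org.infinispan.distexec.mapreduce.MapReduceTask;
  import org.infinispan.distexec.mapreduce.Reducer;

  class MapReduceCount {
     static <K, V> long count(Cache<K, V> cache) {
        Map<String, Long> result = new MapReduceTask<K, V, String, Long>(cache)
           .mappedWith(new Mapper<K, V, String, Long>() {
              @Override
              public void map(K key, V value, Collector<String, Long> collector) {
                 collector.emit("count", 1L); // every entry contributes exactly one
              }
           })
           .reducedWith(new Reducer<String, Long>() {
              @Override
              public Long reduce(String reducedKey, Iterator<Long> iter) {
                 long sum = 0;
                 while (iter.hasNext()) sum += iter.next();
                 return sum;
              }
           })
           .execute();
        Long total = result.get("count");
        return total == null ? 0L : total;
     }
  }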
> > Sanne > >> Tristan >> >> On 06/10/14 12:57, Sanne Grinovero wrote: >>> On 3 October 2014 18:38, Dennis Reed wrote: >>>> Since size() is defined by the ConcurrentMap interface, it already has a >>>> precisely defined meaning. The only "correct" implementation is E. >>> +1 >>> >>>> The current non-correct implementation was just because it's expensive >>>> to calculate correctly. I'm not sure the current impl is really that >>>> useful for anything. >>> +1 >>> >>> And not just size() but many others from ConcurrentMap. >>> The question is if we should drop the interface and all the methods >>> which aren't efficiently implementable, or fix all those methods. >>> >>> In the past I loved that I could inject "Infinispan superpowers" into >>> an application making extensive use of Map and ConcurrentMap without >>> changes, but that has been deceiving and required great care such as >>> verifying that these features would not be used anywhere in the code. >>> I ended up wrapping the Cache implementation in a custom adapter which >>> would also implement ConcurrentMap but would throw a RuntimeException >>> if any of the "unallowed" methods was called, at least I would detect >>> violations safely. >>> >>> I still think that for the time being - until a better solution is >>> planned - we should throw exceptions.. alas that's an old conversation >>> and it was never done. >>> >>> Sanne >>> >>> >>>> -Dennis >>>> >>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>> Hi, >>>>> >>>>> recently we had a discussion about what size() returns, but I've >>>>> realized there are more things that users would like to know. My >>>>> question is whether you think that they would really appreciate it, or >>>>> whether it's just my QA point of view where I sometimes compute the >>>>> 'checksums' of cache to see if I didn't lost anything. >>>>> >>>>> There are those sizes: >>>>> A) number of owned entries >>>>> B) number of entries stored locally in memory >>>>> C) number of entries stored in each local cache store >>>>> D) number of entries stored in each shared cache store >>>>> E) total number of entries in cache >>>>> >>>>> So far, we can get >>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>> E via distributed iterators / MR >>>>> A via data container iteration + distribution manager query, but only >>>>> without cache store >>>>> C or D through >>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>> >>>>> I think that it would go along with users' expectations if size() >>>>> returned E and for the rest we should have special methods on >>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>> I'd say that finally to something that has firm meaning. >>>>> >>>>> WDYT? 
>>>>> >>>>> Radim >>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss DataGrid QA From mudokonman at gmail.com Tue Oct 7 08:21:02 2014 From: mudokonman at gmail.com (William Burns) Date: Tue, 7 Oct 2014 08:21:02 -0400 Subject: [infinispan-dev] About size() In-Reply-To: <5433CF34.8010209@redhat.com> References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> <5433CF34.8010209@redhat.com> Message-ID: On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa wrote: > If you have one local and one shared cache store, how should the command > behave? > > a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, > SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no > SKIP_BACKUP_ENTRIES flag right now), where this method returns > localStore.size() for first non-shared cache store + passivation ? > dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) Calling the size method in either distexec or MR will give you inflated numbers as you need to pay attention to the numOwners to get a proper count. > b) distexec/MR sum of sharedStore.size() + passivation ? sum of > dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 Calling the size on a shared cache actually should work somewhat well (assuming all entries are stored in the shared cache). The problem is if passivation is enabled as you point out because you also have to check the data container which means you can also have an issue with concurrent activations and passivations (which you can't verify properly in either case without knowing the keys). > c) MR that would count the entries This is the only reliable way to do this with MR. And unfortunately if a rehash occurs I am not sure if you would get inconsistent numbers or an Exception. In the latter at least you should be able to make sure that you have the proper number when it does return without exception. I can't say how it works with multiple loaders though, my guess is that it may process the entry more than once so it depends on if your mapper is smart enough to realize it. > d) wrapper on distributed entry iteration with converters set to return > 0-sized entries Entry iterator can't return 0 sized entries (just the values). The keys are required to make sure that the count is correct and also to ensure that if a rehash happens in the middle it can properly continue to operate without having to start over. Entry iterator should work properly irrespective of the number of stores/loaders that are configured, since it keep track of already seen keys (so duplicates are ignored). > > And what about nodes with different configuration? Hard to know without knowing what the differences are. 
> > Radim > > On 10/06/2014 01:57 PM, Sanne Grinovero wrote: >> On 6 October 2014 12:44, Tristan Tarrant wrote: >>> I think we should provide correct implementations of size() (and others) >>> and provide shortcut implementations using our usual Flag API (e.g. >>> SKIP_REMOTE_LOOKUP). >> Right that would be very nice. Same for CacheStore interaction: all >> cachestores should be included unless skipped explicitly. >> >> Sanne >> >>> Tristan >>> >>> On 06/10/14 12:57, Sanne Grinovero wrote: >>>> On 3 October 2014 18:38, Dennis Reed wrote: >>>>> Since size() is defined by the ConcurrentMap interface, it already has a >>>>> precisely defined meaning. The only "correct" implementation is E. >>>> +1 This is one of the things I have been wanting to do is actually implement the other Map methods across the entire cache. However to do a lot of these in a memory conscious way they would need to be ran ignoring any ongoing transactions. Actually having this requirement allows these methods to be implemented quite easily especially in conjunction with the EntryIterator. I almost made a PR for it a while back, but it seemed a little zealous to do at the same time and it didn't seem that people were pushing for it very hard (maybe that was a wrong assumption). Also I wasn't quite sure the transactional part not being functional anymore would be a deterrent. >>>> >>>>> The current non-correct implementation was just because it's expensive >>>>> to calculate correctly. I'm not sure the current impl is really that >>>>> useful for anything. >>>> +1 >>>> >>>> And not just size() but many others from ConcurrentMap. >>>> The question is if we should drop the interface and all the methods >>>> which aren't efficiently implementable, or fix all those methods. >>>> >>>> In the past I loved that I could inject "Infinispan superpowers" into >>>> an application making extensive use of Map and ConcurrentMap without >>>> changes, but that has been deceiving and required great care such as >>>> verifying that these features would not be used anywhere in the code. >>>> I ended up wrapping the Cache implementation in a custom adapter which >>>> would also implement ConcurrentMap but would throw a RuntimeException >>>> if any of the "unallowed" methods was called, at least I would detect >>>> violations safely. >>>> >>>> I still think that for the time being - until a better solution is >>>> planned - we should throw exceptions.. alas that's an old conversation >>>> and it was never done. >>>> >>>> Sanne >>>> >>>> >>>>> -Dennis >>>>> >>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>>> Hi, >>>>>> >>>>>> recently we had a discussion about what size() returns, but I've >>>>>> realized there are more things that users would like to know. My >>>>>> question is whether you think that they would really appreciate it, or >>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>> >>>>>> There are those sizes: >>>>>> A) number of owned entries >>>>>> B) number of entries stored locally in memory >>>>>> C) number of entries stored in each local cache store >>>>>> D) number of entries stored in each shared cache store >>>>>> E) total number of entries in cache >>>>>> >>>>>> So far, we can get >>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>> (passivation ? 
B : 0) + firstNonZero(C, D) via size() >>>>>> E via distributed iterators / MR >>>>>> A via data container iteration + distribution manager query, but only >>>>>> without cache store >>>>>> C or D through >>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>> >>>>>> I think that it would go along with users' expectations if size() >>>>>> returned E and for the rest we should have special methods on >>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>> I'd say that finally to something that has firm meaning. >>>>>> >>>>>> WDYT? >>>>>> >>>>>> Radim >>>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Tue Oct 7 08:43:31 2014 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 07 Oct 2014 14:43:31 +0200 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> <5433CF34.8010209@redhat.com> Message-ID: <5433DFF3.2060100@redhat.com> On 10/07/2014 02:21 PM, William Burns wrote: > On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa wrote: >> If you have one local and one shared cache store, how should the command >> behave? >> >> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, >> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no >> SKIP_BACKUP_ENTRIES flag right now), where this method returns >> localStore.size() for first non-shared cache store + passivation ? >> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) > Calling the size method in either distexec or MR will give you > inflated numbers as you need to pay attention to the numOwners to get > a proper count. That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be able to report only primary-owned entries, or we have to iterate and apply the filtering outside. > >> b) distexec/MR sum of sharedStore.size() + passivation ? sum of >> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 > Calling the size on a shared cache actually should work somewhat well > (assuming all entries are stored in the shared cache). The problem is > if passivation is enabled as you point out because you also have to > check the data container which means you can also have an issue with > concurrent activations and passivations (which you can't verify > properly in either case without knowing the keys). > >> c) MR that would count the entries > This is the only reliable way to do this with MR. And unfortunately > if a rehash occurs I am not sure if you would get inconsistent numbers > or an Exception. 
In the latter at least you should be able to make > sure that you have the proper number when it does return without > exception. I can't say how it works with multiple loaders though, my > guess is that it may process the entry more than once so it depends on > if your mapper is smart enough to realize it. I don't think that reporting incorrect size is *that* harmful - even ConcurrentMap interface says that it's just a wild guess and when things are changing, you can't rely on that. > >> d) wrapper on distributed entry iteration with converters set to return >> 0-sized entries > Entry iterator can't return 0 sized entries (just the values). The > keys are required to make sure that the count is correct and also to > ensure that if a rehash happens in the middle it can properly continue > to operate without having to start over. Entry iterator should work > properly irrespective of the number of stores/loaders that are > configured, since it keep track of already seen keys (so duplicates > are ignored). Ok, I was simplifying that a bit. And by the way, I don't really like the fact that for distributed entry iteration you need to be able to keep all keys from one segment at one moment in memory. But fine - distributed entry iteration is probably not the right way. > > >> And what about nodes with different configuration? > Hard to know without knowing what the differences are. I had in my mind different loaders and passivation configuration (e.g. some node could use shared store and some don't - do we want to handle such obscure configs? Can we design that without the need to have complicated decision trees what to include and what not?). Radim > >> Radim >> >> On 10/06/2014 01:57 PM, Sanne Grinovero wrote: >>> On 6 October 2014 12:44, Tristan Tarrant wrote: >>>> I think we should provide correct implementations of size() (and others) >>>> and provide shortcut implementations using our usual Flag API (e.g. >>>> SKIP_REMOTE_LOOKUP). >>> Right that would be very nice. Same for CacheStore interaction: all >>> cachestores should be included unless skipped explicitly. >>> >>> Sanne >>> >>>> Tristan >>>> >>>> On 06/10/14 12:57, Sanne Grinovero wrote: >>>>> On 3 October 2014 18:38, Dennis Reed wrote: >>>>>> Since size() is defined by the ConcurrentMap interface, it already has a >>>>>> precisely defined meaning. The only "correct" implementation is E. >>>>> +1 > This is one of the things I have been wanting to do is actually > implement the other Map methods across the entire cache. However to > do a lot of these in a memory conscious way they would need to be ran > ignoring any ongoing transactions. Actually having this requirement > allows these methods to be implemented quite easily especially in > conjunction with the EntryIterator. I almost made a PR for it a while > back, but it seemed a little zealous to do at the same time and it > didn't seem that people were pushing for it very hard (maybe that was > a wrong assumption). Also I wasn't quite sure the transactional part > not being functional anymore would be a deterrent. > >>>>>> The current non-correct implementation was just because it's expensive >>>>>> to calculate correctly. I'm not sure the current impl is really that >>>>>> useful for anything. >>>>> +1 >>>>> >>>>> And not just size() but many others from ConcurrentMap. >>>>> The question is if we should drop the interface and all the methods >>>>> which aren't efficiently implementable, or fix all those methods. 
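(Editorial sketch of the entry-iteration variant being discussed, assuming the 7.x distributed entry iterator API, i.e. AdvancedCache.filterEntries()/EntryIterable and the AcceptAllKeyValueFilter helper. The iterator deduplicates by key and tracks rehash per segment, which is exactly why it has to hold the keys of a segment in memory at once.)

  import org.infinispan.AdvancedCache;
  import org.infinispan.filter.AcceptAllKeyValueFilter;
  import org.infinispan.iteration.EntryIterable;

  class IteratorCount {
     static long count(AdvancedCache<Object, Object> cache) throws Exception {
        long n = 0;
        // cluster-wide iteration; duplicates from multiple stores are filtered out
        try (EntryIterable<Object, Object> entries =
              cache.filterEntries(AcceptAllKeyValueFilter.getInstance())) {
           for (Object ignored : entries) {
              n++;
           }
        }
        return n;
     }
  }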
>>>>> >>>>> In the past I loved that I could inject "Infinispan superpowers" into >>>>> an application making extensive use of Map and ConcurrentMap without >>>>> changes, but that has been deceiving and required great care such as >>>>> verifying that these features would not be used anywhere in the code. >>>>> I ended up wrapping the Cache implementation in a custom adapter which >>>>> would also implement ConcurrentMap but would throw a RuntimeException >>>>> if any of the "unallowed" methods was called, at least I would detect >>>>> violations safely. >>>>> >>>>> I still think that for the time being - until a better solution is >>>>> planned - we should throw exceptions.. alas that's an old conversation >>>>> and it was never done. >>>>> >>>>> Sanne >>>>> >>>>> >>>>>> -Dennis >>>>>> >>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>>>> Hi, >>>>>>> >>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>> realized there are more things that users would like to know. My >>>>>>> question is whether you think that they would really appreciate it, or >>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>> >>>>>>> There are those sizes: >>>>>>> A) number of owned entries >>>>>>> B) number of entries stored locally in memory >>>>>>> C) number of entries stored in each local cache store >>>>>>> D) number of entries stored in each shared cache store >>>>>>> E) total number of entries in cache >>>>>>> >>>>>>> So far, we can get >>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>> E via distributed iterators / MR >>>>>>> A via data container iteration + distribution manager query, but only >>>>>>> without cache store >>>>>>> C or D through >>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>> >>>>>>> I think that it would go along with users' expectations if size() >>>>>>> returned E and for the rest we should have special methods on >>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>> I'd say that finally to something that has firm meaning. >>>>>>> >>>>>>> WDYT? 
>>>>>>> >>>>>>> Radim >>>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa >> JBoss DataGrid QA >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss DataGrid QA From sanne at infinispan.org Tue Oct 7 09:16:06 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Tue, 7 Oct 2014 14:16:06 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <5433DFF3.2060100@redhat.com> References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> <5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com> Message-ID: Considering all these very valid concerns I'd return on my proposal for throwing runtime exceptions via an (optional) decorator. I'd have such a decorator in place by default, so that we make it very clear that - while you can remove it - the behaviour of such methods is "unusual" and that a user would be better off avoiding them unless he's into the advanced stuff. As said before, that worked very well for me in the past and it was great that - even while I did know - I had a safety guard to highlight unintended refactorings by others on my team who didn't know the black art of using Infinispan correctly. Sanne On 7 October 2014 13:43, Radim Vansa wrote: > On 10/07/2014 02:21 PM, William Burns wrote: >> On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa wrote: >>> If you have one local and one shared cache store, how should the command >>> behave? >>> >>> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, >>> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no >>> SKIP_BACKUP_ENTRIES flag right now), where this method returns >>> localStore.size() for first non-shared cache store + passivation ? >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) >> Calling the size method in either distexec or MR will give you >> inflated numbers as you need to pay attention to the numOwners to get >> a proper count. > > That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be > able to report only primary-owned entries, or we have to iterate and > apply the filtering outside. > >> >>> b) distexec/MR sum of sharedStore.size() + passivation ? sum of >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 >> Calling the size on a shared cache actually should work somewhat well >> (assuming all entries are stored in the shared cache). 
The problem is >> if passivation is enabled as you point out because you also have to >> check the data container which means you can also have an issue with >> concurrent activations and passivations (which you can't verify >> properly in either case without knowing the keys). >> >>> c) MR that would count the entries >> This is the only reliable way to do this with MR. And unfortunately >> if a rehash occurs I am not sure if you would get inconsistent numbers >> or an Exception. In the latter at least you should be able to make >> sure that you have the proper number when it does return without >> exception. I can't say how it works with multiple loaders though, my >> guess is that it may process the entry more than once so it depends on >> if your mapper is smart enough to realize it. > > I don't think that reporting incorrect size is *that* harmful - even > ConcurrentMap interface says that it's just a wild guess and when things > are changing, you can't rely on that. > >> >>> d) wrapper on distributed entry iteration with converters set to return >>> 0-sized entries >> Entry iterator can't return 0 sized entries (just the values). The >> keys are required to make sure that the count is correct and also to >> ensure that if a rehash happens in the middle it can properly continue >> to operate without having to start over. Entry iterator should work >> properly irrespective of the number of stores/loaders that are >> configured, since it keep track of already seen keys (so duplicates >> are ignored). > > Ok, I was simplifying that a bit. And by the way, I don't really like > the fact that for distributed entry iteration you need to be able to > keep all keys from one segment at one moment in memory. But fine - > distributed entry iteration is probably not the right way. > >> >> >>> And what about nodes with different configuration? >> Hard to know without knowing what the differences are. > > I had in my mind different loaders and passivation configuration (e.g. > some node could use shared store and some don't - do we want to handle > such obscure configs? Can we design that without the need to have > complicated decision trees what to include and what not?). > > Radim > >> >>> Radim >>> >>> On 10/06/2014 01:57 PM, Sanne Grinovero wrote: >>>> On 6 October 2014 12:44, Tristan Tarrant wrote: >>>>> I think we should provide correct implementations of size() (and others) >>>>> and provide shortcut implementations using our usual Flag API (e.g. >>>>> SKIP_REMOTE_LOOKUP). >>>> Right that would be very nice. Same for CacheStore interaction: all >>>> cachestores should be included unless skipped explicitly. >>>> >>>> Sanne >>>> >>>>> Tristan >>>>> >>>>> On 06/10/14 12:57, Sanne Grinovero wrote: >>>>>> On 3 October 2014 18:38, Dennis Reed wrote: >>>>>>> Since size() is defined by the ConcurrentMap interface, it already has a >>>>>>> precisely defined meaning. The only "correct" implementation is E. >>>>>> +1 >> This is one of the things I have been wanting to do is actually >> implement the other Map methods across the entire cache. However to >> do a lot of these in a memory conscious way they would need to be ran >> ignoring any ongoing transactions. Actually having this requirement >> allows these methods to be implemented quite easily especially in >> conjunction with the EntryIterator. I almost made a PR for it a while >> back, but it seemed a little zealous to do at the same time and it >> didn't seem that people were pushing for it very hard (maybe that was >> a wrong assumption). 
Also I wasn't quite sure the transactional part >> not being functional anymore would be a deterrent. >> >>>>>>> The current non-correct implementation was just because it's expensive >>>>>>> to calculate correctly. I'm not sure the current impl is really that >>>>>>> useful for anything. >>>>>> +1 >>>>>> >>>>>> And not just size() but many others from ConcurrentMap. >>>>>> The question is if we should drop the interface and all the methods >>>>>> which aren't efficiently implementable, or fix all those methods. >>>>>> >>>>>> In the past I loved that I could inject "Infinispan superpowers" into >>>>>> an application making extensive use of Map and ConcurrentMap without >>>>>> changes, but that has been deceiving and required great care such as >>>>>> verifying that these features would not be used anywhere in the code. >>>>>> I ended up wrapping the Cache implementation in a custom adapter which >>>>>> would also implement ConcurrentMap but would throw a RuntimeException >>>>>> if any of the "unallowed" methods was called, at least I would detect >>>>>> violations safely. >>>>>> >>>>>> I still think that for the time being - until a better solution is >>>>>> planned - we should throw exceptions.. alas that's an old conversation >>>>>> and it was never done. >>>>>> >>>>>> Sanne >>>>>> >>>>>> >>>>>>> -Dennis >>>>>>> >>>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>> realized there are more things that users would like to know. My >>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>>> >>>>>>>> There are those sizes: >>>>>>>> A) number of owned entries >>>>>>>> B) number of entries stored locally in memory >>>>>>>> C) number of entries stored in each local cache store >>>>>>>> D) number of entries stored in each shared cache store >>>>>>>> E) total number of entries in cache >>>>>>>> >>>>>>>> So far, we can get >>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>>> E via distributed iterators / MR >>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>> without cache store >>>>>>>> C or D through >>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>> >>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>> returned E and for the rest we should have special methods on >>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>> >>>>>>>> WDYT? 
>>>>>>>> >>>>>>>> Radim >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> -- >>> Radim Vansa >>> JBoss DataGrid QA >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From mudokonman at gmail.com Tue Oct 7 09:17:54 2014 From: mudokonman at gmail.com (William Burns) Date: Tue, 7 Oct 2014 09:17:54 -0400 Subject: [infinispan-dev] About size() In-Reply-To: <5433DFF3.2060100@redhat.com> References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> <5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com> Message-ID: On Tue, Oct 7, 2014 at 8:43 AM, Radim Vansa wrote: > On 10/07/2014 02:21 PM, William Burns wrote: >> On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa wrote: >>> If you have one local and one shared cache store, how should the command >>> behave? >>> >>> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, >>> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no >>> SKIP_BACKUP_ENTRIES flag right now), where this method returns >>> localStore.size() for first non-shared cache store + passivation ? >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) >> Calling the size method in either distexec or MR will give you >> inflated numbers as you need to pay attention to the numOwners to get >> a proper count. > > That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be > able to report only primary-owned entries, or we have to iterate and > apply the filtering outside. If we added this functionality then yes, it would be promoted up to the same status as the MR entry counting, though it would still have issues with rehash, as well as with concurrent activations and passivations. > >> >>> b) distexec/MR sum of sharedStore.size() + passivation ? sum of >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 >> Calling the size on a shared cache actually should work somewhat well >> (assuming all entries are stored in the shared cache). The problem is >> if passivation is enabled as you point out because you also have to >> check the data container which means you can also have an issue with >> concurrent activations and passivations (which you can't verify >> properly in either case without knowing the keys).
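(Editorial sketch of the naive distexec sum from point (a), to make the inflation concrete: with numOwners = 2 every entry is counted on both of its owners, so the raw sum is roughly twice the real size. Written against the 7.x org.infinispan.distexec API; a sketch, not a proposed implementation.)

  import java.io.Serializable;
  import java.util.Set;
  import java.util.concurrent.Future;
  import org.infinispan.Cache;
  import org.infinispan.context.Flag;
  import org.infinispan.distexec.DefaultExecutorService;
  import org.infinispan.distexec.DistributedCallable;

  class NaiveClusterSize {
     static class LocalSize<K, V> implements DistributedCallable<K, V, Integer>, Serializable {
        private transient Cache<K, V> cache;

        @Override
        public void setEnvironment(Cache<K, V> cache, Set<K> inputKeys) {
           this.cache = cache;
        }

        @Override
        public Integer call() {
           // counts primary *and* backup copies held in memory on this node
           return cache.getAdvancedCache().withFlags(Flag.SKIP_CACHE_LOAD).size();
        }
     }

     static <K, V> long rawSum(Cache<K, V> cache) throws Exception {
        DefaultExecutorService des = new DefaultExecutorService(cache);
        long sum = 0;
        for (Future<Integer> f : des.submitEverywhere(new LocalSize<K, V>())) {
           sum += f.get(); // inflated by numOwners, as noted above
        }
        return sum;
     }
  }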
>> >>> c) MR that would count the entries >> This is the only reliable way to do this with MR. And unfortunately >> if a rehash occurs I am not sure if you would get inconsistent numbers >> or an Exception. In the latter at least you should be able to make >> sure that you have the proper number when it does return without >> exception. I can't say how it works with multiple loaders though, my >> guess is that it may process the entry more than once so it depends on >> if your mapper is smart enough to realize it. > > I don't think that reporting incorrect size is *that* harmful - even > ConcurrentMap interface says that it's just a wild guess and when things > are changing, you can't rely on that. ConcurrentMap doesn't say anything about the size method, actually. ConcurrentHashMap has some verbiage saying that it might not be completely correct under concurrent modification, though. It isn't really a wild guess for ConcurrentHashMap: the worst case is that you count a value that was there but has since been removed, or miss a value that was recently added. Really, the guarantee from CHM is that it counts each individual segment properly for a glimpse of time for that segment; the problem is that each segment could change (since they are counted at different times). But the values missed in ConcurrentHashMap are totally different from losing an entire segment due to a rehash. You could theoretically have a rehash occur right after MR started iterating and see no values for that segment, or only a very small subset. There is a much larger margin of error in this case for which values are seen and which are not. > >> >>> d) wrapper on distributed entry iteration with converters set to return >> >>> 0-sized entries >> Entry iterator can't return 0 sized entries (just the values). The >> keys are required to make sure that the count is correct and also to >> ensure that if a rehash happens in the middle it can properly continue >> to operate without having to start over. Entry iterator should work >> properly irrespective of the number of stores/loaders that are >> configured, since it keep track of already seen keys (so duplicates >> are ignored). > > Ok, I was simplifying that a bit. And by the way, I don't really like > the fact that for distributed entry iteration you need to be able to > keep all keys from one segment at one moment in memory. But fine - > distributed entry iteration is probably not the right way. I agree it is annoying to have to keep the keys, but it is one of the few ways to reliably get all the values without losing one. Actually this approach provides a much closer approximation to what ConcurrentHashMap provides for its size implementation, since it can't drop a segment. It is pretty much required to do it this way to do keySet, entrySet, and values where you don't have the luxury of dropping whole swaths of entries like you do when calling the size() method (even if the value(s) was there the entire time). > >> >> >>> And what about nodes with different configuration? >> Hard to know without knowing what the differences are. > > I had in my mind different loaders and passivation configuration (e.g. > some node could use shared store and some don't - do we want to handle > such obscure configs? Can we design that without the need to have > complicated decision trees what to include and what not?). Well the last sentence means we have to use MR or Entry Iterator since we can't call size on the shared loader.
I would think that it should still work irrespective of the loader configuration (except for MR with multiple loaders). The main issue I can think of is that if everyone isn't using the shared loader that you could have stale values in the loader if you don't always have a node using the shared loader up (assuming purge at startup isn't enabled). > > Radim > >> >>> Radim >>> >>> On 10/06/2014 01:57 PM, Sanne Grinovero wrote: >>>> On 6 October 2014 12:44, Tristan Tarrant wrote: >>>>> I think we should provide correct implementations of size() (and others) >>>>> and provide shortcut implementations using our usual Flag API (e.g. >>>>> SKIP_REMOTE_LOOKUP). >>>> Right that would be very nice. Same for CacheStore interaction: all >>>> cachestores should be included unless skipped explicitly. >>>> >>>> Sanne >>>> >>>>> Tristan >>>>> >>>>> On 06/10/14 12:57, Sanne Grinovero wrote: >>>>>> On 3 October 2014 18:38, Dennis Reed wrote: >>>>>>> Since size() is defined by the ConcurrentMap interface, it already has a >>>>>>> precisely defined meaning. The only "correct" implementation is E. >>>>>> +1 >> This is one of the things I have been wanting to do is actually >> implement the other Map methods across the entire cache. However to >> do a lot of these in a memory conscious way they would need to be ran >> ignoring any ongoing transactions. Actually having this requirement >> allows these methods to be implemented quite easily especially in >> conjunction with the EntryIterator. I almost made a PR for it a while >> back, but it seemed a little zealous to do at the same time and it >> didn't seem that people were pushing for it very hard (maybe that was >> a wrong assumption). Also I wasn't quite sure the transactional part >> not being functional anymore would be a deterrent. >> >>>>>>> The current non-correct implementation was just because it's expensive >>>>>>> to calculate correctly. I'm not sure the current impl is really that >>>>>>> useful for anything. >>>>>> +1 >>>>>> >>>>>> And not just size() but many others from ConcurrentMap. >>>>>> The question is if we should drop the interface and all the methods >>>>>> which aren't efficiently implementable, or fix all those methods. >>>>>> >>>>>> In the past I loved that I could inject "Infinispan superpowers" into >>>>>> an application making extensive use of Map and ConcurrentMap without >>>>>> changes, but that has been deceiving and required great care such as >>>>>> verifying that these features would not be used anywhere in the code. >>>>>> I ended up wrapping the Cache implementation in a custom adapter which >>>>>> would also implement ConcurrentMap but would throw a RuntimeException >>>>>> if any of the "unallowed" methods was called, at least I would detect >>>>>> violations safely. >>>>>> >>>>>> I still think that for the time being - until a better solution is >>>>>> planned - we should throw exceptions.. alas that's an old conversation >>>>>> and it was never done. >>>>>> >>>>>> Sanne >>>>>> >>>>>> >>>>>>> -Dennis >>>>>>> >>>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>> realized there are more things that users would like to know. My >>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>> 'checksums' of cache to see if I didn't lost anything. 
>>>>>>>> >>>>>>>> There are those sizes: >>>>>>>> A) number of owned entries >>>>>>>> B) number of entries stored locally in memory >>>>>>>> C) number of entries stored in each local cache store >>>>>>>> D) number of entries stored in each shared cache store >>>>>>>> E) total number of entries in cache >>>>>>>> >>>>>>>> So far, we can get >>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>>> E via distributed iterators / MR >>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>> without cache store >>>>>>>> C or D through >>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>> >>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>> returned E and for the rest we should have special methods on >>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>> >>>>>>>> WDYT? >>>>>>>> >>>>>>>> Radim >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> -- >>> Radim Vansa >>> JBoss DataGrid QA >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Tue Oct 7 09:42:10 2014 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 07 Oct 2014 15:42:10 +0200 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> <5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com> Message-ID: <5433EDB2.8020603@redhat.com> Considering the frequency of "How do I get the number of entries in cache", "How do I get all keys" on all forums, I think that backing to runtime exception would not satisfy the users. On 10/07/2014 03:16 PM, Sanne Grinovero wrote: > Considering all these very valid concerns I'd return on my proposal > for throwing runtime exceptions via an (optional) decorator. > > I'd have such a decorator in place by default, so that we make it very > clear that - while you can remove it - the behaviour of such methods > is "unusual" and that a user would be better off avoiding them unless > he's into the advanced stuff. 
> > As said before, that worked very well for me in the past and it was > great that - even while I did know - I had a safety guard to highlight > unintended refactorings by others on my team who didn't know the black > art of using Infinispan correctly. > > Sanne > > > > On 7 October 2014 13:43, Radim Vansa wrote: >> On 10/07/2014 02:21 PM, William Burns wrote: >>> On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa wrote: >>>> If you have one local and one shared cache store, how should the command >>>> behave? >>>> >>>> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, >>>> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no >>>> SKIP_BACKUP_ENTRIES flag right now), where this method returns >>>> localStore.size() for first non-shared cache store + passivation ? >>>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) >>> Calling the size method in either distexec or MR will give you >>> inflated numbers as you need to pay attention to the numOwners to get >>> a proper count. >> That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be >> able to report only primary-owned entries, or we have to iterate and >> apply the filtering outside. >> >>>> b) distexec/MR sum of sharedStore.size() + passivation ? sum of >>>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 >>> Calling the size on a shared cache actually should work somewhat well >>> (assuming all entries are stored in the shared cache). The problem is >>> if passivation is enabled as you point out because you also have to >>> check the data container which means you can also have an issue with >>> concurrent activations and passivations (which you can't verify >>> properly in either case without knowing the keys). >>> >>>> c) MR that would count the entries >>> This is the only reliable way to do this with MR. And unfortunately >>> if a rehash occurs I am not sure if you would get inconsistent numbers >>> or an Exception. In the latter at least you should be able to make >>> sure that you have the proper number when it does return without >>> exception. I can't say how it works with multiple loaders though, my >>> guess is that it may process the entry more than once so it depends on >>> if your mapper is smart enough to realize it. >> I don't think that reporting incorrect size is *that* harmful - even >> ConcurrentMap interface says that it's just a wild guess and when things >> are changing, you can't rely on that. >> >>>> d) wrapper on distributed entry iteration with converters set to return >>>> 0-sized entries >>> Entry iterator can't return 0 sized entries (just the values). The >>> keys are required to make sure that the count is correct and also to >>> ensure that if a rehash happens in the middle it can properly continue >>> to operate without having to start over. Entry iterator should work >>> properly irrespective of the number of stores/loaders that are >>> configured, since it keep track of already seen keys (so duplicates >>> are ignored). >> Ok, I was simplifying that a bit. And by the way, I don't really like >> the fact that for distributed entry iteration you need to be able to >> keep all keys from one segment at one moment in memory. But fine - >> distributed entry iteration is probably not the right way. >> >>> >>>> And what about nodes with different configuration? >>> Hard to know without knowing what the differences are. >> I had in my mind different loaders and passivation configuration (e.g. 
>> some node could use shared store and some don't - do we want to handle >> such obscure configs? Can we design that without the need to have >> complicated decision trees what to include and what not?). >> >> Radim >> >>>> Radim >>>> >>>> On 10/06/2014 01:57 PM, Sanne Grinovero wrote: >>>>> On 6 October 2014 12:44, Tristan Tarrant wrote: >>>>>> I think we should provide correct implementations of size() (and others) >>>>>> and provide shortcut implementations using our usual Flag API (e.g. >>>>>> SKIP_REMOTE_LOOKUP). >>>>> Right that would be very nice. Same for CacheStore interaction: all >>>>> cachestores should be included unless skipped explicitly. >>>>> >>>>> Sanne >>>>> >>>>>> Tristan >>>>>> >>>>>> On 06/10/14 12:57, Sanne Grinovero wrote: >>>>>>> On 3 October 2014 18:38, Dennis Reed wrote: >>>>>>>> Since size() is defined by the ConcurrentMap interface, it already has a >>>>>>>> precisely defined meaning. The only "correct" implementation is E. >>>>>>> +1 >>> This is one of the things I have been wanting to do is actually >>> implement the other Map methods across the entire cache. However to >>> do a lot of these in a memory conscious way they would need to be ran >>> ignoring any ongoing transactions. Actually having this requirement >>> allows these methods to be implemented quite easily especially in >>> conjunction with the EntryIterator. I almost made a PR for it a while >>> back, but it seemed a little zealous to do at the same time and it >>> didn't seem that people were pushing for it very hard (maybe that was >>> a wrong assumption). Also I wasn't quite sure the transactional part >>> not being functional anymore would be a deterrent. >>> >>>>>>>> The current non-correct implementation was just because it's expensive >>>>>>>> to calculate correctly. I'm not sure the current impl is really that >>>>>>>> useful for anything. >>>>>>> +1 >>>>>>> >>>>>>> And not just size() but many others from ConcurrentMap. >>>>>>> The question is if we should drop the interface and all the methods >>>>>>> which aren't efficiently implementable, or fix all those methods. >>>>>>> >>>>>>> In the past I loved that I could inject "Infinispan superpowers" into >>>>>>> an application making extensive use of Map and ConcurrentMap without >>>>>>> changes, but that has been deceiving and required great care such as >>>>>>> verifying that these features would not be used anywhere in the code. >>>>>>> I ended up wrapping the Cache implementation in a custom adapter which >>>>>>> would also implement ConcurrentMap but would throw a RuntimeException >>>>>>> if any of the "unallowed" methods was called, at least I would detect >>>>>>> violations safely. >>>>>>> >>>>>>> I still think that for the time being - until a better solution is >>>>>>> planned - we should throw exceptions.. alas that's an old conversation >>>>>>> and it was never done. >>>>>>> >>>>>>> Sanne >>>>>>> >>>>>>> >>>>>>>> -Dennis >>>>>>>> >>>>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>>> realized there are more things that users would like to know. My >>>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>>> 'checksums' of cache to see if I didn't lost anything. 
>>>>>>>>> >>>>>>>>> There are those sizes: >>>>>>>>> A) number of owned entries >>>>>>>>> B) number of entries stored locally in memory >>>>>>>>> C) number of entries stored in each local cache store >>>>>>>>> D) number of entries stored in each shared cache store >>>>>>>>> E) total number of entries in cache >>>>>>>>> >>>>>>>>> So far, we can get >>>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>>>> E via distributed iterators / MR >>>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>>> without cache store >>>>>>>>> C or D through >>>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>>> >>>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>>> returned E and for the rest we should have special methods on >>>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>>> >>>>>>>>> WDYT? >>>>>>>>> >>>>>>>>> Radim >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> infinispan-dev mailing list >>>>>>>> infinispan-dev at lists.jboss.org >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> -- >>>> Radim Vansa >>>> JBoss DataGrid QA >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa >> JBoss DataGrid QA >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss DataGrid QA From sanne at infinispan.org Tue Oct 7 10:23:03 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Tue, 7 Oct 2014 15:23:03 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <5433EDB2.8020603@redhat.com> References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> <5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com> <5433EDB2.8020603@redhat.com> Message-ID: On 7 October 2014 14:42, Radim Vansa wrote: > Considering the frequency of "How do I get the number of entries in > cache", "How do I get all keys" on all forums, I think that backing to > runtime exception would not satisfy the users. Correct but when they get to ask the right question, that's usually after several hours of debugging and swearing. 
With an immediate exception, they get immediate advice and hopefully a
hint of where to look in the docs for the special methods like
statistics, how to disable the exception-throwing decorator, or how to
implement their own M/R job with the exact flags they need.

From ttarrant at redhat.com  Tue Oct  7 10:59:25 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Tue, 07 Oct 2014 16:59:25 +0200
Subject: [infinispan-dev] About size()
In-Reply-To: 
References: <542E5E92.7060504@redhat.com>
	<542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com>
	<5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com>
	<5433EDB2.8020603@redhat.com>
Message-ID: <5433FFCD.10409@redhat.com>

I'm not sure an idiot-proof API is what we want to encourage. I'd rather
tell users to RTFM.

Tristan

On 07/10/14 16:23, Sanne Grinovero wrote:
> On 7 October 2014 14:42, Radim Vansa wrote:
>> Considering the frequency of "How do I get the number of entries in
>> cache", "How do I get all keys" on all forums, I think that backing to
>> runtime exception would not satisfy the users.
> Correct but when they get to ask the right question, that's usually
> after several hours of debugging and swearing.
> With an immediate exception, they get immediate advice and hopefully a
> hint of where to look in the docs for the special methods like
> statistics, how to disable the exception-throwing decorator, or how to
> implement their own M/R job with the exact flags they need.
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>

From dan.berindei at gmail.com  Tue Oct 7 12:28:07 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Tue, 7 Oct 2014 19:28:07 +0300
Subject: [infinispan-dev] Infinispan 7.0.0.CR1 is out!
Message-ID: 

Dear Community,

We are gearing up towards a great Infinispan 7.0.0, and we are happy to
announce our first candidate release!

Notable features and improvements in this release:
* Cross-site state transfer now handles failures (ISPN-4025)
* Easier management of Protobuf schemas (ISPN-4357)
* New uberjars-based distribution (ISPN-4728)
* The HotRod protocol and Java client now have a size() operation
(ISPN-4736)
* Cluster listeners' filters and converters can now see the old value and
metadata (ISPN-4753)

See the full announcement here: http://goo.gl/ERslmk

Cheers
Dan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141007/f3cf39ba/attachment.html

From sanne at infinispan.org  Tue Oct 7 12:48:53 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Tue, 7 Oct 2014 17:48:53 +0100
Subject: [infinispan-dev] About size()
In-Reply-To: <5433FFCD.10409@redhat.com>
References: <542E5E92.7060504@redhat.com>
	<542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com>
	<5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com>
	<5433EDB2.8020603@redhat.com> <5433FFCD.10409@redhat.com>
Message-ID: 

On 7 October 2014 15:59, Tristan Tarrant wrote:
> I'm not sure an idiot-proof API is what we want to encourage. I'd rather
> tell users to RTFM.

I'm not thinking about idiots at all. As I said I made such a decorator
for my own sake, as it was handy to spot bad-usage cases I had missed,
or would miss after future refactorings.

I would not underestimate users: if you explain the problem and make
sure people see your message, everyone will be able to find many clever
solutions to get what they need.
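For the record, the shape of that guard is trivial - something along
these lines (a from-memory sketch with made-up names, not the actual
adapter; a real one would implement ConcurrentMap and forward the
remaining methods in the same way):

import java.util.AbstractMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: delegate the point operations, fail fast on
// the bulk ones that cannot be implemented reliably on a distributed cache.
public class GuardedCache<K, V> extends AbstractMap<K, V> {

   private final Map<K, V> delegate; // e.g. an org.infinispan.Cache

   public GuardedCache(Map<K, V> delegate) {
      this.delegate = delegate;
   }

   @Override
   public V get(Object key) {
      return delegate.get(key);
   }

   @Override
   public V put(K key, V value) {
      return delegate.put(key, value);
   }

   @Override
   public V remove(Object key) {
      return delegate.remove(key);
   }

   // size(), keySet(), values() and all iteration funnel through entrySet()
   // in AbstractMap, so one override catches the whole "unallowed" family.
   @Override
   public Set<Map.Entry<K, V>> entrySet() {
      throw new UnsupportedOperationException(
            "Bulk operations are not reliable on a distributed cache - see the docs");
   }
}

The value is simply that a violation blows up immediately at the call
site, with a message you control, instead of silently returning
local-only results.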
On the other hand, relying on users to read these specific javadocs is
foolish: since you're inheriting the ConcurrentMap contract, people are
going to use the ConcurrentMap type in some cases and clients of that
API will have access to the ConcurrentMap javadoc exclusively. Maybe
it's even code which was written before the introduction of Infinispan
in a project, or written by a team which had never heard of
Infinispan... you know, people build expectations on top of type safety.

You could certainly put a link to TFM in the exception message.

Sanne

>
> Tristan
>
> On 07/10/14 16:23, Sanne Grinovero wrote:
>> On 7 October 2014 14:42, Radim Vansa wrote:
>>> Considering the frequency of "How do I get the number of entries in
>>> cache", "How do I get all keys" on all forums, I think that backing to
>>> runtime exception would not satisfy the users.
>> Correct but when they get to ask the right question, that's usually
>> after several hours of debugging and swearing.
>> With an immediate exception, they get immediate advice and hopefully a
>> hint of where to look in the docs for the special methods like
>> statistics, how to disable the exception-throwing decorator, or how to
>> implement their own M/R job with the exact flags they need.
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From dan.berindei at gmail.com  Wed Oct 8 10:02:22 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Wed, 8 Oct 2014 17:02:22 +0300
Subject: [infinispan-dev] About size()
In-Reply-To: 
References: <542E5E92.7060504@redhat.com>
	<542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com>
	<5433CF34.8010209@redhat.com> <5433DFF3.2060100@redhat.com>
Message-ID: 

On Tue, Oct 7, 2014 at 4:17 PM, William Burns wrote:

> On Tue, Oct 7, 2014 at 8:43 AM, Radim Vansa wrote:
> > On 10/07/2014 02:21 PM, William Burns wrote:
> >> On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa wrote:
> >>> If you have one local and one shared cache store, how should the
> command
> >>> behave?
> >>>
> >>> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP,
> >>> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no
> >>> SKIP_BACKUP_ENTRIES flag right now), where this method returns
> >>> localStore.size() for first non-shared cache store + passivation ?
> >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0)
> >> Calling the size method in either distexec or MR will give you
> >> inflated numbers as you need to pay attention to the numOwners to get
> >> a proper count.
> >
> > That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be
> > able to report only primary-owned entries, or we have to iterate and
> > apply the filtering outside.
>
> If we added this functionality then yes it would be promoted up to MR
> counting entries status though it would still have issues with rehash.
> As well as issues with concurrent activations and passivations.
>

I think we can use something like OutdatedTopologyException to make sure
we count each segment once, on the primary owner. But in order to verify
that a particular node is the primary owner we'd have to load each cache
store entry, so performance with cache stores will be pretty bad.

Dealing with concurrent activations/passivations is even trickier.
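For reference, the M/R counting we keep coming back to is roughly the
following (just a sketch, class names made up for the example; it counts
whatever the map phase sees, so it inherits all the rehash and
passivation caveats above):

import java.util.Iterator;
import java.util.Map;

import org.infinispan.Cache;
import org.infinispan.distexec.mapreduce.Collector;
import org.infinispan.distexec.mapreduce.MapReduceTask;
import org.infinispan.distexec.mapreduce.Mapper;
import org.infinispan.distexec.mapreduce.Reducer;

public class MapReduceSize {

   // Emits 1 under a single shared key for every entry seen on each node.
   static class EntryCountMapper<K, V> implements Mapper<K, V, String, Long> {
      @Override
      public void map(K key, V value, Collector<String, Long> collector) {
         collector.emit("count", 1L);
      }
   }

   // Sums up the partial counts produced by the mappers.
   static class EntryCountReducer implements Reducer<String, Long> {
      @Override
      public Long reduce(String reducedKey, Iterator<Long> iter) {
         long total = 0;
         while (iter.hasNext()) {
            total += iter.next();
         }
         return total;
      }
   }

   public static <K, V> long mapReduceSize(Cache<K, V> cache) {
      Map<String, Long> result = new MapReduceTask<K, V, String, Long>(cache)
            .mappedWith(new EntryCountMapper<K, V>())
            .reducedWith(new EntryCountReducer())
            .execute();
      Long count = result.get("count");
      return count == null ? 0 : count;
   }
}

It's cheap on memory, because only the partial sums travel back to the
task originator, but a rehash in the middle of the task can still make
it miss the entries of a whole segment.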
> > > > >> > >>> b) distexec/MR sum of sharedStore.size() + passivation ? sum of > >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 > >> Calling the size on a shared cache actually should work somewhat well > >> (assuming all entries are stored in the shared cache). The problem is > >> if passivation is enabled as you point out because you also have to > >> check the data container which means you can also have an issue with > >> concurrent activations and passivations (which you can't verify > >> properly in either case without knowing the keys). > >> > >>> c) MR that would count the entries > >> This is the only reliable way to do this with MR. And unfortunately > >> if a rehash occurs I am not sure if you would get inconsistent numbers > >> or an Exception. In the latter at least you should be able to make > >> sure that you have the proper number when it does return without > >> exception. I can't say how it works with multiple loaders though, my > >> guess is that it may process the entry more than once so it depends on > >> if your mapper is smart enough to realize it. > > > > I don't think that reporting incorrect size is *that* harmful - even > > ConcurrentMap interface says that it's just a wild guess and when things > > are changing, you can't rely on that. > > ConcurrentMap doesn't say anything about size method actually. > ConcurrentHashMap has some verbage about saying that it might not be > completely correct under concurrent modification though. > > It isn't a wild guess really though for ConcurrentHashMap. The worst > is that you could count a value that was there but it is now removed > or you don't count a value that was recently added. Really the > guarantee from CHM is that it counts each individual segment properly > for a glimpse of time for that segment, the problem is that each > segment could change (since they are counted at different times). But > the values missing in ConcurrentHashMap are totally different than > losing an entire segment due to a rehash. You could theoretically > have a rehash occur right after MR started iterating and see no values > for that segment or a very small subset. There is a much larger > margin of error in this case for what values are seen and which are > not. > > Interesting... the Map javadoc seems to assume linearizability, maybe because the original implementation was Hashtable :) So there is precedent for relaxing the definition of size(). But of course some users will still expect a 0 error margin when there are no concurrent writes, so I agree we don't get a free pass to ignore rehashes and activations during get(). > > > >> > >>> d) wrapper on distributed entry iteration with converters set to return > >>> 0-sized entries > >> Entry iterator can't return 0 sized entries (just the values). The > >> keys are required to make sure that the count is correct and also to > >> ensure that if a rehash happens in the middle it can properly continue > >> to operate without having to start over. Entry iterator should work > >> properly irrespective of the number of stores/loaders that are > >> configured, since it keep track of already seen keys (so duplicates > >> are ignored). > > > > Ok, I was simplifying that a bit. And by the way, I don't really like > > the fact that for distributed entry iteration you need to be able to > > keep all keys from one segment at one moment in memory. But fine - > > distributed entry iteration is probably not the right way. 
> > I agree it is annoying to have to keep the keys, but it is one of the > few ways to reliably get all the values without losing one. Actually > this approach provides a much closer approximation to what > ConcurrentHashMap provides for its size implementation, since it can't > drop a segment. It is pretty much required to do it this way to do > keySet, entrySet, and values where you don't have the luxury of > dropping whole swaths of entries like you do with calling size() > method (even if the value(s) was there the entire time). > If we decide to improve size(), I'd vote to use distributed entry iterators. We may be able to avoid sending all the keys to the originator when the cache doesn't have any stores. But with a store it looks like we can't avoid reading all the keys from the store, so skipping the transfer of the keys wouldn't help that much. > > > > >> > >> > >>> And what about nodes with different configuration? > >> Hard to know without knowing what the differences are. > > > > I had in my mind different loaders and passivation configuration (e.g. > > some node could use shared store and some don't - do we want to handle > > such obscure configs? Can we design that without the need to have > > complicated decision trees what to include and what not?). > > Well the last sentence means we have to use MR or Entry Iterator since > we can't call size on the shared loader. I would think that it should > still work irrespective of the loader configuration (except for MR > with multiple loaders). The main issue I can think of is that if > everyone isn't using the shared loader that you could have stale > values in the loader if you don't always have a node using the shared > loader up (assuming purge at startup isn't enabled). > We really shouldn't support different store/loader configurations on each node, except for minor stuff like paths. > > > > > Radim > > > >> > >>> Radim > >>> > >>> On 10/06/2014 01:57 PM, Sanne Grinovero wrote: > >>>> On 6 October 2014 12:44, Tristan Tarrant wrote: > >>>>> I think we should provide correct implementations of size() (and > others) > >>>>> and provide shortcut implementations using our usual Flag API (e.g. > >>>>> SKIP_REMOTE_LOOKUP). > >>>> Right that would be very nice. Same for CacheStore interaction: all > >>>> cachestores should be included unless skipped explicitly. > >>>> > >>>> Sanne > >>>> > >>>>> Tristan > >>>>> > >>>>> On 06/10/14 12:57, Sanne Grinovero wrote: > >>>>>> On 3 October 2014 18:38, Dennis Reed wrote: > >>>>>>> Since size() is defined by the ConcurrentMap interface, it already > has a > >>>>>>> precisely defined meaning. The only "correct" implementation is E. > >>>>>> +1 > >> This is one of the things I have been wanting to do is actually > >> implement the other Map methods across the entire cache. However to > >> do a lot of these in a memory conscious way they would need to be ran > >> ignoring any ongoing transactions. Actually having this requirement > >> allows these methods to be implemented quite easily especially in > >> conjunction with the EntryIterator. I almost made a PR for it a while > >> back, but it seemed a little zealous to do at the same time and it > >> didn't seem that people were pushing for it very hard (maybe that was > >> a wrong assumption). Also I wasn't quite sure the transactional part > >> not being functional anymore would be a deterrent. > >> > >>>>>>> The current non-correct implementation was just because it's > expensive > >>>>>>> to calculate correctly. 
I'm not sure the current impl is really > that > >>>>>>> useful for anything. > >>>>>> +1 > >>>>>> > >>>>>> And not just size() but many others from ConcurrentMap. > >>>>>> The question is if we should drop the interface and all the methods > >>>>>> which aren't efficiently implementable, or fix all those methods. > >>>>>> > >>>>>> In the past I loved that I could inject "Infinispan superpowers" > into > >>>>>> an application making extensive use of Map and ConcurrentMap without > >>>>>> changes, but that has been deceiving and required great care such as > >>>>>> verifying that these features would not be used anywhere in the > code. > >>>>>> I ended up wrapping the Cache implementation in a custom adapter > which > >>>>>> would also implement ConcurrentMap but would throw a > RuntimeException > >>>>>> if any of the "unallowed" methods was called, at least I would > detect > >>>>>> violations safely. > >>>>>> > >>>>>> I still think that for the time being - until a better solution is > >>>>>> planned - we should throw exceptions.. alas that's an old > conversation > >>>>>> and it was never done. > >>>>>> > >>>>>> Sanne > >>>>>> > >>>>>> > >>>>>>> -Dennis > >>>>>>> > >>>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> recently we had a discussion about what size() returns, but I've > >>>>>>>> realized there are more things that users would like to know. My > >>>>>>>> question is whether you think that they would really appreciate > it, or > >>>>>>>> whether it's just my QA point of view where I sometimes compute > the > >>>>>>>> 'checksums' of cache to see if I didn't lost anything. > >>>>>>>> > >>>>>>>> There are those sizes: > >>>>>>>> A) number of owned entries > >>>>>>>> B) number of entries stored locally in memory > >>>>>>>> C) number of entries stored in each local cache store > >>>>>>>> D) number of entries stored in each shared cache store > >>>>>>>> E) total number of entries in cache > >>>>>>>> > >>>>>>>> So far, we can get > >>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() > >>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() > >>>>>>>> E via distributed iterators / MR > >>>>>>>> A via data container iteration + distribution manager query, but > only > >>>>>>>> without cache store > >>>>>>>> C or D through > >>>>>>>> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >>>>>>>> > >>>>>>>> I think that it would go along with users' expectations if size() > >>>>>>>> returned E and for the rest we should have special methods on > >>>>>>>> AdvancedCache. That would of course change the meaning of size(), > but > >>>>>>>> I'd say that finally to something that has firm meaning. > >>>>>>>> > >>>>>>>> WDYT? 
> >>>>>>>> > >>>>>>>> Radim > >>>>>>>> > >>>>>>> _______________________________________________ > >>>>>>> infinispan-dev mailing list > >>>>>>> infinispan-dev at lists.jboss.org > >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>>>> _______________________________________________ > >>>>>> infinispan-dev mailing list > >>>>>> infinispan-dev at lists.jboss.org > >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>>>> > >>>>>> > >>>>> _______________________________________________ > >>>>> infinispan-dev mailing list > >>>>> infinispan-dev at lists.jboss.org > >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>> _______________________________________________ > >>>> infinispan-dev mailing list > >>>> infinispan-dev at lists.jboss.org > >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> > >>> -- > >>> Radim Vansa > >>> JBoss DataGrid QA > >>> > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev at lists.jboss.org > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > -- > > Radim Vansa > > JBoss DataGrid QA > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141008/0665c484/attachment-0001.html From mmarkus at redhat.com Wed Oct 8 10:03:13 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Wed, 8 Oct 2014 15:03:13 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <542E5E92.7060504@redhat.com> References: <542E5E92.7060504@redhat.com> Message-ID: <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> On Oct 3, 2014, at 9:30, Radim Vansa wrote: > Hi, > > recently we had a discussion about what size() returns, but I've > realized there are more things that users would like to know. My > question is whether you think that they would really appreciate it, or > whether it's just my QA point of view where I sometimes compute the > 'checksums' of cache to see if I didn't lost anything. > > There are those sizes: > A) number of owned entries > B) number of entries stored locally in memory > C) number of entries stored in each local cache store > D) number of entries stored in each shared cache store > E) total number of entries in cache > > So far, we can get > B via withFlags(SKIP_CACHE_LOAD).size() > (passivation ? B : 0) + firstNonZero(C, D) via size() > E via distributed iterators / MR > A via data container iteration + distribution manager query, but only > without cache store > C or D through > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > > I think that it would go along with users' expectations if size() > returned E and for the rest we should have special methods on > AdvancedCache. That would of course change the meaning of size(), but > I'd say that finally to something that has firm meaning. > > WDYT? 
There were a lot of arguments in the past about whether size() and the
other methods that operate over all the elements (keySet, values) are
useful, because:
- they are approximate (data changes during iteration)
- they are very resource consuming and might be misused (this is the
reason we chose to give size() its current local semantic)

These methods (size, keys, values) are useful for people and I think we
were not wise to implement them only on top of the local data: this is
like preferring efficiency over correctness. This also created a lot of
confusion with our users, with questions like "why doesn't size()
return the correct value?" being asked regularly. I totally agree that
size() should return E (i.e. everything that is stored within the grid,
including persistence) and that its performance implications should be
documented accordingly. For keySet and values we should stop
implementing them (throw an exception) and point users to Will's
distributed iterator, which is a nicer way to achieve the desired
behavior.

>
> Radim
>
> --
> Radim Vansa
> JBoss DataGrid QA
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)

From mmarkus at redhat.com  Wed Oct 8 10:09:26 2014
From: mmarkus at redhat.com (Mircea Markus)
Date: Wed, 8 Oct 2014 15:09:26 +0100
Subject: [infinispan-dev] About size()
In-Reply-To: <542EDF2A.7080807@redhat.com>
References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com>
Message-ID: 

On Oct 3, 2014, at 18:38, Dennis Reed wrote:

> Since size() is defined by the ConcurrentMap interface, it already has a
> precisely defined meaning. The only "correct" implementation is E.
>
> The current non-correct implementation was just because it's expensive
> to calculate correctly. I'm not sure the current impl is really that
> useful for anything.

+1

>
> -Dennis
>
> On 10/03/2014 03:30 AM, Radim Vansa wrote:
>> Hi,
>>
>> recently we had a discussion about what size() returns, but I've
>> realized there are more things that users would like to know. My
>> question is whether you think that they would really appreciate it, or
>> whether it's just my QA point of view where I sometimes compute the
>> 'checksums' of cache to see if I didn't lost anything.
>>
>> There are those sizes:
>> A) number of owned entries
>> B) number of entries stored locally in memory
>> C) number of entries stored in each local cache store
>> D) number of entries stored in each shared cache store
>> E) total number of entries in cache
>>
>> So far, we can get
>> B via withFlags(SKIP_CACHE_LOAD).size()
>> (passivation ? B : 0) + firstNonZero(C, D) via size()
>> E via distributed iterators / MR
>> A via data container iteration + distribution manager query, but only
>> without cache store
>> C or D through
>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores()
>>
>> I think that it would go along with users' expectations if size()
>> returned E and for the rest we should have special methods on
>> AdvancedCache. That would of course change the meaning of size(), but
>> I'd say that finally to something that has firm meaning.
>>
>> WDYT?
>> >> Radim >> > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From mmarkus at redhat.com Wed Oct 8 10:11:55 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Wed, 8 Oct 2014 15:11:55 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <543280A8.5040109@redhat.com> References: <542E5E92.7060504@redhat.com> <542EDF2A.7080807@redhat.com> <543280A8.5040109@redhat.com> Message-ID: On Oct 6, 2014, at 12:44, Tristan Tarrant wrote: > I think we should provide correct implementations of size() (and others) > and provide shortcut implementations using our usual Flag API (e.g. > SKIP_REMOTE_LOOKUP). for keySet and values, Will's distributed iteration is a way nicer way of doing it, as it only fetches the data iteratively. Better to throw an exception and point user to the distributed iterator. > > Tristan > > On 06/10/14 12:57, Sanne Grinovero wrote: >> On 3 October 2014 18:38, Dennis Reed wrote: >>> Since size() is defined by the ConcurrentMap interface, it already has a >>> precisely defined meaning. The only "correct" implementation is E. >> +1 >> >>> The current non-correct implementation was just because it's expensive >>> to calculate correctly. I'm not sure the current impl is really that >>> useful for anything. >> +1 >> >> And not just size() but many others from ConcurrentMap. >> The question is if we should drop the interface and all the methods >> which aren't efficiently implementable, or fix all those methods. >> >> In the past I loved that I could inject "Infinispan superpowers" into >> an application making extensive use of Map and ConcurrentMap without >> changes, but that has been deceiving and required great care such as >> verifying that these features would not be used anywhere in the code. >> I ended up wrapping the Cache implementation in a custom adapter which >> would also implement ConcurrentMap but would throw a RuntimeException >> if any of the "unallowed" methods was called, at least I would detect >> violations safely. >> >> I still think that for the time being - until a better solution is >> planned - we should throw exceptions.. alas that's an old conversation >> and it was never done. >> >> Sanne >> >> >>> -Dennis >>> >>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>> Hi, >>>> >>>> recently we had a discussion about what size() returns, but I've >>>> realized there are more things that users would like to know. My >>>> question is whether you think that they would really appreciate it, or >>>> whether it's just my QA point of view where I sometimes compute the >>>> 'checksums' of cache to see if I didn't lost anything. >>>> >>>> There are those sizes: >>>> A) number of owned entries >>>> B) number of entries stored locally in memory >>>> C) number of entries stored in each local cache store >>>> D) number of entries stored in each shared cache store >>>> E) total number of entries in cache >>>> >>>> So far, we can get >>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>> (passivation ? 
B : 0) + firstNonZero(C, D) via size() >>>> E via distributed iterators / MR >>>> A via data container iteration + distribution manager query, but only >>>> without cache store >>>> C or D through >>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>> >>>> I think that it would go along with users' expectations if size() >>>> returned E and for the rest we should have special methods on >>>> AdvancedCache. That would of course change the meaning of size(), but >>>> I'd say that finally to something that has firm meaning. >>>> >>>> WDYT? >>>> >>>> Radim >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From dan.berindei at gmail.com Wed Oct 8 10:11:59 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 8 Oct 2014 17:11:59 +0300 Subject: [infinispan-dev] About size() In-Reply-To: <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: > On Oct 3, 2014, at 9:30, Radim Vansa wrote: > > > Hi, > > > > recently we had a discussion about what size() returns, but I've > > realized there are more things that users would like to know. My > > question is whether you think that they would really appreciate it, or > > whether it's just my QA point of view where I sometimes compute the > > 'checksums' of cache to see if I didn't lost anything. > > > > There are those sizes: > > A) number of owned entries > > B) number of entries stored locally in memory > > C) number of entries stored in each local cache store > > D) number of entries stored in each shared cache store > > E) total number of entries in cache > > > > So far, we can get > > B via withFlags(SKIP_CACHE_LOAD).size() > > (passivation ? B : 0) + firstNonZero(C, D) via size() > > E via distributed iterators / MR > > A via data container iteration + distribution manager query, but only > > without cache store > > C or D through > > > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > > > > I think that it would go along with users' expectations if size() > > returned E and for the rest we should have special methods on > > AdvancedCache. That would of course change the meaning of size(), but > > I'd say that finally to something that has firm meaning. > > > > WDYT? > > There was a lot of arguments in past whether size() and other methods that > operate over all the elements (keySet, values) are useful because: > - they are approximate (data changes during iteration) > - they are very resource consuming and might be miss-used (this is the > reason we chosen to use size() with its current local semantic) > > These methods (size, keys, values) are useful for people and I think we > were not wise to implement them only on top of the local data: this is like > preferring efficiency over correctness. 
This also created a lot of > confusion with our users, question like size() doesn't return the correct > value being asked regularly. I totally agree that size() returns E (i.e. > everything that is stored within the grid, including persistence) and it's > performance implications to be documented accordingly. For keySet and > values - we should stop implementing them (throw exception) and point users > to Will's distributed iterator which is a nicer way to achieve the desired > behavior. > We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. > > > > > Radim > > > > -- > > Radim Vansa > > JBoss DataGrid QA > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > Cheers, > -- > Mircea Markus > Infinispan lead (www.infinispan.org) > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141008/72903689/attachment.html From mmarkus at redhat.com Wed Oct 8 10:13:55 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Wed, 8 Oct 2014 15:13:55 +0100 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> Message-ID: <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> On Oct 8, 2014, at 15:11, Dan Berindei wrote: > > On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: > On Oct 3, 2014, at 9:30, Radim Vansa wrote: > > > Hi, > > > > recently we had a discussion about what size() returns, but I've > > realized there are more things that users would like to know. My > > question is whether you think that they would really appreciate it, or > > whether it's just my QA point of view where I sometimes compute the > > 'checksums' of cache to see if I didn't lost anything. > > > > There are those sizes: > > A) number of owned entries > > B) number of entries stored locally in memory > > C) number of entries stored in each local cache store > > D) number of entries stored in each shared cache store > > E) total number of entries in cache > > > > So far, we can get > > B via withFlags(SKIP_CACHE_LOAD).size() > > (passivation ? B : 0) + firstNonZero(C, D) via size() > > E via distributed iterators / MR > > A via data container iteration + distribution manager query, but only > > without cache store > > C or D through > > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > > > > I think that it would go along with users' expectations if size() > > returned E and for the rest we should have special methods on > > AdvancedCache. That would of course change the meaning of size(), but > > I'd say that finally to something that has firm meaning. > > > > WDYT? 
> > There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: > - they are approximate (data changes during iteration) > - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) > > These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. > > We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. Yes, that's what I meant as well. Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From mudokonman at gmail.com Wed Oct 8 10:42:16 2014 From: mudokonman at gmail.com (William Burns) Date: Wed, 8 Oct 2014 10:42:16 -0400 Subject: [infinispan-dev] About size() In-Reply-To: <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: So it seems we would want to change this for 7.0 if possible since it would be a bigger change for something like 7.1 and 8.0 would be even further out. I should be able to put this together for CR2. It seems that we want to implement keySet, values and entrySet methods using the entry iterator approach. It is however unclear for the size method if we want to use MR entry counting and not worry about the rehash and passivation issues since it is just an estimation anyways. Or if we want to also use the entry iterator which should be closer approximation but will require more network overhead and memory usage. Also we didn't really talk about the fact that these methods would ignore ongoing transactions and if that is a concern or not. - Will On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: > > On Oct 8, 2014, at 15:11, Dan Berindei wrote: > >> >> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >> >> > Hi, >> > >> > recently we had a discussion about what size() returns, but I've >> > realized there are more things that users would like to know. My >> > question is whether you think that they would really appreciate it, or >> > whether it's just my QA point of view where I sometimes compute the >> > 'checksums' of cache to see if I didn't lost anything. >> > >> > There are those sizes: >> > A) number of owned entries >> > B) number of entries stored locally in memory >> > C) number of entries stored in each local cache store >> > D) number of entries stored in each shared cache store >> > E) total number of entries in cache >> > >> > So far, we can get >> > B via withFlags(SKIP_CACHE_LOAD).size() >> > (passivation ? 
B : 0) + firstNonZero(C, D) via size() >> > E via distributed iterators / MR >> > A via data container iteration + distribution manager query, but only >> > without cache store >> > C or D through >> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >> > >> > I think that it would go along with users' expectations if size() >> > returned E and for the rest we should have special methods on >> > AdvancedCache. That would of course change the meaning of size(), but >> > I'd say that finally to something that has firm meaning. >> > >> > WDYT? >> >> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >> - they are approximate (data changes during iteration) >> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >> >> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >> >> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. > > Yes, that's what I meant as well. > > Cheers, > -- > Mircea Markus > Infinispan lead (www.infinispan.org) > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Wed Oct 8 10:57:58 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 8 Oct 2014 17:57:58 +0300 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 5:42 PM, William Burns wrote: > So it seems we would want to change this for 7.0 if possible since it > would be a bigger change for something like 7.1 and 8.0 would be even > further out. I should be able to put this together for CR2. > I'm not 100% convinced that we need it for 7.x. For 8.0 I would recommend removing the size() method altogether, and providing some looser "statistics" instead. > > It seems that we want to implement keySet, values and entrySet methods > using the entry iterator approach. > > It is however unclear for the size method if we want to use MR entry > counting and not worry about the rehash and passivation issues since > it is just an estimation anyways. Or if we want to also use the entry > iterator which should be closer approximation but will require more > network overhead and memory usage. > +1 to use the entry iterator from me, ignoring state transfer we can get some pretty wild fluctuations in the size of the cache. We could use a distributed task for Cache.isEmpty() instead of size() == 0, though. 
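Something along these lines, say (just a sketch with illustrative names;
it only looks at the in-memory data containers, so a fuller version
would also have to consult the stores):

import java.io.Serializable;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Future;

import org.infinispan.Cache;
import org.infinispan.distexec.DefaultExecutorService;
import org.infinispan.distexec.DistributedCallable;
import org.infinispan.distexec.DistributedExecutorService;

public class DistributedIsEmpty {

   // Runs on every node and reports whether that node holds any entry in
   // memory. Entries living only in a cache store are not seen here.
   static class LocallyEmpty<K, V>
         implements DistributedCallable<K, V, Boolean>, Serializable {

      private transient Cache<K, V> cache;

      @Override
      public void setEnvironment(Cache<K, V> cache, Set<K> inputKeys) {
         this.cache = cache;
      }

      @Override
      public Boolean call() {
         return cache.getAdvancedCache().getDataContainer().size() == 0;
      }
   }

   public static <K, V> boolean isEmpty(Cache<K, V> cache) throws Exception {
      DistributedExecutorService des = new DefaultExecutorService(cache);
      try {
         List<Future<Boolean>> results =
               des.submitEverywhere(new LocallyEmpty<K, V>());
         for (Future<Boolean> f : results) {
            if (!f.get()) {
               return false; // some node still holds at least one entry
            }
         }
         return true;
      } finally {
         des.shutdown();
      }
   }
}

Unlike a full count, this can bail out as soon as some node reports an
entry, and it doesn't care about numOwners at all.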
> > Also we didn't really talk about the fact that these methods would > ignore ongoing transactions and if that is a concern or not. > > It might be a concern for the Hibernate 2LC impl, it was their TCK that prompted the last round of discussions about clear(). We haven't talked about what size(), keySet() and values() should return for an invalidation cache either... I forget, does the distributed entry iterator work with invalidation caches? > - Will > > On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: > > > > On Oct 8, 2014, at 15:11, Dan Berindei wrote: > > > >> > >> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus > wrote: > >> On Oct 3, 2014, at 9:30, Radim Vansa wrote: > >> > >> > Hi, > >> > > >> > recently we had a discussion about what size() returns, but I've > >> > realized there are more things that users would like to know. My > >> > question is whether you think that they would really appreciate it, or > >> > whether it's just my QA point of view where I sometimes compute the > >> > 'checksums' of cache to see if I didn't lost anything. > >> > > >> > There are those sizes: > >> > A) number of owned entries > >> > B) number of entries stored locally in memory > >> > C) number of entries stored in each local cache store > >> > D) number of entries stored in each shared cache store > >> > E) total number of entries in cache > >> > > >> > So far, we can get > >> > B via withFlags(SKIP_CACHE_LOAD).size() > >> > (passivation ? B : 0) + firstNonZero(C, D) via size() > >> > E via distributed iterators / MR > >> > A via data container iteration + distribution manager query, but only > >> > without cache store > >> > C or D through > >> > > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >> > > >> > I think that it would go along with users' expectations if size() > >> > returned E and for the rest we should have special methods on > >> > AdvancedCache. That would of course change the meaning of size(), but > >> > I'd say that finally to something that has firm meaning. > >> > > >> > WDYT? > >> > >> There was a lot of arguments in past whether size() and other methods > that operate over all the elements (keySet, values) are useful because: > >> - they are approximate (data changes during iteration) > >> - they are very resource consuming and might be miss-used (this is the > reason we chosen to use size() with its current local semantic) > >> > >> These methods (size, keys, values) are useful for people and I think we > were not wise to implement them only on top of the local data: this is like > preferring efficiency over correctness. This also created a lot of > confusion with our users, question like size() doesn't return the correct > value being asked regularly. I totally agree that size() returns E (i.e. > everything that is stored within the grid, including persistence) and it's > performance implications to be documented accordingly. For keySet and > values - we should stop implementing them (throw exception) and point users > to Will's distributed iterator which is a nicer way to achieve the desired > behavior. > >> > >> We can also implement keySet() and values() on top of the distributed > entry iterator and document that using the iterator directly is better. > > > > Yes, that's what I meant as well. 
> > > > Cheers, > > -- > > Mircea Markus > > Infinispan lead (www.infinispan.org) > > > > > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141008/870e35b8/attachment.html From mudokonman at gmail.com Wed Oct 8 11:14:12 2014 From: mudokonman at gmail.com (William Burns) Date: Wed, 8 Oct 2014 11:14:12 -0400 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 10:57 AM, Dan Berindei wrote: > > > On Wed, Oct 8, 2014 at 5:42 PM, William Burns wrote: >> >> So it seems we would want to change this for 7.0 if possible since it >> would be a bigger change for something like 7.1 and 8.0 would be even >> further out. I should be able to put this together for CR2. > > > I'm not 100% convinced that we need it for 7.x. For 8.0 I would recommend > removing the size() method altogether, and providing some looser > "statistics" instead. Yeah I guess I don't know enough about the demand for these methods or what people wanted to use them for to know what kind of priority they should be given. It sounds like you are talking about decoupling from the Map/ConcurrentMap interface completely then, right? So we would also eliminate the other bulk methods (keySet, values, entrySet)? > >> >> >> It seems that we want to implement keySet, values and entrySet methods >> using the entry iterator approach. >> >> It is however unclear for the size method if we want to use MR entry >> counting and not worry about the rehash and passivation issues since >> it is just an estimation anyways. Or if we want to also use the entry >> iterator which should be closer approximation but will require more >> network overhead and memory usage. > > > +1 to use the entry iterator from me, ignoring state transfer we can get > some pretty wild fluctuations in the size of the cache. That is personally my feeling as well, but I tend to err more on the side of correctness to begin with. > We could use a distributed task for Cache.isEmpty() instead of size() == 0, > though. Yes that should be a good optimization either way. > >> >> >> Also we didn't really talk about the fact that these methods would >> ignore ongoing transactions and if that is a concern or not. >> > > It might be a concern for the Hibernate 2LC impl, it was their TCK that > prompted the last round of discussions about clear(). Although I wonder how much these methods are even used since they only work for Local, Replication or Invalidation caches in their current state (and didn't even use loaders until 6.0). > > We haven't talked about what size(), keySet() and values() should return for > an invalidation cache either... I forget, does the distributed entry > iterator work with invalidation caches? It works the same as a local cache so only the local node contents are returned. Replicated does the same thing, distributed is the only special case. 
This was the only thing that made sense to me, but if you have any ideas that would be great to hear for possibly enhancing Invalidation iteration. > > >> >> - Will >> >> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >> > >> > On Oct 8, 2014, at 15:11, Dan Berindei wrote: >> > >> >> >> >> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus >> >> wrote: >> >> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >> >> >> >> > Hi, >> >> > >> >> > recently we had a discussion about what size() returns, but I've >> >> > realized there are more things that users would like to know. My >> >> > question is whether you think that they would really appreciate it, >> >> > or >> >> > whether it's just my QA point of view where I sometimes compute the >> >> > 'checksums' of cache to see if I didn't lost anything. >> >> > >> >> > There are those sizes: >> >> > A) number of owned entries >> >> > B) number of entries stored locally in memory >> >> > C) number of entries stored in each local cache store >> >> > D) number of entries stored in each shared cache store >> >> > E) total number of entries in cache >> >> > >> >> > So far, we can get >> >> > B via withFlags(SKIP_CACHE_LOAD).size() >> >> > (passivation ? B : 0) + firstNonZero(C, D) via size() >> >> > E via distributed iterators / MR >> >> > A via data container iteration + distribution manager query, but only >> >> > without cache store >> >> > C or D through >> >> > >> >> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >> >> > >> >> > I think that it would go along with users' expectations if size() >> >> > returned E and for the rest we should have special methods on >> >> > AdvancedCache. That would of course change the meaning of size(), but >> >> > I'd say that finally to something that has firm meaning. >> >> > >> >> > WDYT? >> >> >> >> There was a lot of arguments in past whether size() and other methods >> >> that operate over all the elements (keySet, values) are useful because: >> >> - they are approximate (data changes during iteration) >> >> - they are very resource consuming and might be miss-used (this is the >> >> reason we chosen to use size() with its current local semantic) >> >> >> >> These methods (size, keys, values) are useful for people and I think we >> >> were not wise to implement them only on top of the local data: this is like >> >> preferring efficiency over correctness. This also created a lot of confusion >> >> with our users, question like size() doesn't return the correct value being >> >> asked regularly. I totally agree that size() returns E (i.e. everything that >> >> is stored within the grid, including persistence) and it's performance >> >> implications to be documented accordingly. For keySet and values - we should >> >> stop implementing them (throw exception) and point users to Will's >> >> distributed iterator which is a nicer way to achieve the desired behavior. >> >> >> >> We can also implement keySet() and values() on top of the distributed >> >> entry iterator and document that using the iterator directly is better. >> > >> > Yes, that's what I meant as well. 
>> > >> > Cheers,
>> > -- 
>> > Mircea Markus
>> > Infinispan lead (www.infinispan.org)
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > infinispan-dev mailing list
>> > infinispan-dev at lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From rvansa at redhat.com  Wed Oct 8 11:19:42 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Wed, 08 Oct 2014 17:19:42 +0200
Subject: [infinispan-dev] About size()
In-Reply-To: 
References: <542E5E92.7060504@redhat.com>
	<3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com>
	<593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com>
Message-ID: <5435560E.2030206@redhat.com>

Users expect that size() will be constant-time (or linear to cluster
size), and a generally fast operation. I'd prefer to keep it that way.
Though, even the MR way (used for HotRod size() now) needs to crawl
through all the entries locally.

'Heretic, not very well thought out and changing too many things' idea:
what about having the data container segment-aware? Then you'd just
bcast a SizeCommand with a given topologyId and sum up the sizes of
primary-owned segments... It's not a complete solution, but at least
that would enable us to get the number of locally owned entries quite
fast. Though, you can't do that easily with cache stores (without
changing the SPI).

Regarding cache stores, IMO we're damned anyway: when calling
cacheStore.size(), it can report more entries, as some haven't been
expired yet, or it can report fewer entries, as some can be expired due
to [1]. Or, we'll enumerate all the entries, and that's going to be
slow (btw., [1] reminded me that we should enumerate both the
datacontainer AND the cachestores even if passivation is not enabled).

Radim

[1] https://issues.jboss.org/browse/ISPN-3202

On 10/08/2014 04:42 PM, William Burns wrote:
> So it seems we would want to change this for 7.0 if possible since it
> would be a bigger change for something like 7.1 and 8.0 would be even
> further out.  I should be able to put this together for CR2.
>
> It seems that we want to implement keySet, values and entrySet methods
> using the entry iterator approach.
>
> It is however unclear for the size method if we want to use MR entry
> counting and not worry about the rehash and passivation issues since
> it is just an estimation anyways.  Or if we want to also use the entry
> iterator which should be closer approximation but will require more
> network overhead and memory usage.
>
> Also we didn't really talk about the fact that these methods would
> ignore ongoing transactions and if that is a concern or not.
>
> - Will
>
> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote:
>> On Oct 8, 2014, at 15:11, Dan Berindei wrote:
>>
>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote:
>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote:
>>>
>>>> Hi,
>>>>
>>>> recently we had a discussion about what size() returns, but I've
>>>> realized there are more things that users would like to know. My
>>>> question is whether you think that they would really appreciate it, or
>>>> whether it's just my QA point of view where I sometimes compute the
>>>> 'checksums' of cache to see if I didn't lost anything.
>>>> >>>> There are those sizes: >>>> A) number of owned entries >>>> B) number of entries stored locally in memory >>>> C) number of entries stored in each local cache store >>>> D) number of entries stored in each shared cache store >>>> E) total number of entries in cache >>>> >>>> So far, we can get >>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>> E via distributed iterators / MR >>>> A via data container iteration + distribution manager query, but only >>>> without cache store >>>> C or D through >>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>> >>>> I think that it would go along with users' expectations if size() >>>> returned E and for the rest we should have special methods on >>>> AdvancedCache. That would of course change the meaning of size(), but >>>> I'd say that finally to something that has firm meaning. >>>> >>>> WDYT? >>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>> - they are approximate (data changes during iteration) >>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>> >>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>> >>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >> Yes, that's what I meant as well. >> >> Cheers, >> -- >> Mircea Markus >> Infinispan lead (www.infinispan.org) >> >> >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss DataGrid QA From dan.berindei at gmail.com Wed Oct 8 12:23:06 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 8 Oct 2014 19:23:06 +0300 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 6:14 PM, William Burns wrote: > On Wed, Oct 8, 2014 at 10:57 AM, Dan Berindei > wrote: > > > > > > On Wed, Oct 8, 2014 at 5:42 PM, William Burns > wrote: > >> > >> So it seems we would want to change this for 7.0 if possible since it > >> would be a bigger change for something like 7.1 and 8.0 would be even > >> further out. I should be able to put this together for CR2. > > > > > > I'm not 100% convinced that we need it for 7.x. For 8.0 I would recommend > > removing the size() method altogether, and providing some looser > > "statistics" instead. 
> > Yeah I guess I don't know enough about the demand for these methods or > what people wanted to use them for to know what kind of priority they > should be given. > > It sounds like you are talking about decoupling from the > Map/ConcurrentMap interface completely then, right? So we would also > eliminate the other bulk methods (keySet, values, entrySet)? > Yes, I would base the Cache interface on JSR-107's Cache, which doesn't have size() or the other methods. > > > > >> > >> > >> It seems that we want to implement keySet, values and entrySet methods > >> using the entry iterator approach. > >> > >> It is however unclear for the size method if we want to use MR entry > >> counting and not worry about the rehash and passivation issues since > >> it is just an estimation anyways. Or if we want to also use the entry > >> iterator which should be closer approximation but will require more > >> network overhead and memory usage. > > > > > > +1 to use the entry iterator from me, ignoring state transfer we can get > > some pretty wild fluctuations in the size of the cache. > > That is personally my feeling as well, but I tend to err more on the > side of correctness to begin with. > > > We could use a distributed task for Cache.isEmpty() instead of size() == > 0, > > though. > > Yes that should be a good optimization either way. > > > > >> > >> > >> Also we didn't really talk about the fact that these methods would > >> ignore ongoing transactions and if that is a concern or not. > >> > > > > It might be a concern for the Hibernate 2LC impl, it was their TCK that > > prompted the last round of discussions about clear(). > > Although I wonder how much these methods are even used since they only > work for Local, Replication or Invalidation caches in their current > state (and didn't even use loaders until 6.0). > There is some more information about the test in the mailing list discussion [1] There's also a JIRA for clear() [2] I think 2LC almost never uses distribution, so size() being local-only didn't matter, but making it non-tx could cause problems - at least for that particular test. [1] http://lists.jboss.org/pipermail/infinispan-dev/2013-October/013914.html [2] https://issues.jboss.org/browse/ISPN-3656 > > > > We haven't talked about what size(), keySet() and values() should return > for > > an invalidation cache either... I forget, does the distributed entry > > iterator work with invalidation caches? > > It works the same as a local cache so only the local node contents are > returned. Replicated does the same thing, distributed is the only > special case. This was the only thing that made sense to me, but if > you have any ideas that would be great to hear for possibly enhancing > Invalidation iteration. > Sounds good to me. cache.get(k) will search on all the nodes via ClusterLoader, so there is a certain appeal in making the entry iterator do the same. But invalidation caches are used with an external (non-CacheLoader) source of data anyway, so we can never return "all the entries". > > > > > >> > >> - Will > >> > >> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus > wrote: > >> > > >> > On Oct 8, 2014, at 15:11, Dan Berindei > wrote: > >> > > >> >> > >> >> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus > >> >> wrote: > >> >> On Oct 3, 2014, at 9:30, Radim Vansa wrote: > >> >> > >> >> > Hi, > >> >> > > >> >> > recently we had a discussion about what size() returns, but I've > >> >> > realized there are more things that users would like to know. 
My > >> >> > question is whether you think that they would really appreciate it, > >> >> > or > >> >> > whether it's just my QA point of view where I sometimes compute the > >> >> > 'checksums' of cache to see if I didn't lost anything. > >> >> > > >> >> > There are those sizes: > >> >> > A) number of owned entries > >> >> > B) number of entries stored locally in memory > >> >> > C) number of entries stored in each local cache store > >> >> > D) number of entries stored in each shared cache store > >> >> > E) total number of entries in cache > >> >> > > >> >> > So far, we can get > >> >> > B via withFlags(SKIP_CACHE_LOAD).size() > >> >> > (passivation ? B : 0) + firstNonZero(C, D) via size() > >> >> > E via distributed iterators / MR > >> >> > A via data container iteration + distribution manager query, but > only > >> >> > without cache store > >> >> > C or D through > >> >> > > >> >> > > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >> >> > > >> >> > I think that it would go along with users' expectations if size() > >> >> > returned E and for the rest we should have special methods on > >> >> > AdvancedCache. That would of course change the meaning of size(), > but > >> >> > I'd say that finally to something that has firm meaning. > >> >> > > >> >> > WDYT? > >> >> > >> >> There was a lot of arguments in past whether size() and other methods > >> >> that operate over all the elements (keySet, values) are useful > because: > >> >> - they are approximate (data changes during iteration) > >> >> - they are very resource consuming and might be miss-used (this is > the > >> >> reason we chosen to use size() with its current local semantic) > >> >> > >> >> These methods (size, keys, values) are useful for people and I think > we > >> >> were not wise to implement them only on top of the local data: this > is like > >> >> preferring efficiency over correctness. This also created a lot of > confusion > >> >> with our users, question like size() doesn't return the correct > value being > >> >> asked regularly. I totally agree that size() returns E (i.e. > everything that > >> >> is stored within the grid, including persistence) and it's > performance > >> >> implications to be documented accordingly. For keySet and values - > we should > >> >> stop implementing them (throw exception) and point users to Will's > >> >> distributed iterator which is a nicer way to achieve the desired > behavior. > >> >> > >> >> We can also implement keySet() and values() on top of the distributed > >> >> entry iterator and document that using the iterator directly is > better. > >> > > >> > Yes, that's what I meant as well. 
> >> > > >> > Cheers, > >> > -- > >> > Mircea Markus > >> > Infinispan lead (www.infinispan.org) > >> > > >> > > >> > > >> > > >> > > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141008/e3158960/attachment.html From dan.berindei at gmail.com Wed Oct 8 12:41:42 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 8 Oct 2014 19:41:42 +0300 Subject: [infinispan-dev] About size() In-Reply-To: <5435560E.2030206@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 6:19 PM, Radim Vansa wrote: > Users expect that size() will be constant-time (or linear to cluster > size), and generally fast operation. I'd prefer to keep it that way. > Though, even the MR way (used for HotRod size() now) needs to crawl > through all the entries locally. > They might expect that, but there is nothing in the Map API suggesting it. > > 'Heretic, not very well though of and changing too many things' idea: > what about having data container segment-aware? Then you'd just bcast > SizeCommand with given topologyId and sum up sizes of primary-owned > segments... It's not a complete solution, but at least that would enable > to get the number of locally owned entries quite fast. Though, you can't > do that easily with cache stores (without changing SPI). > We could create a separate DataContainer for each segment. But would it really be worth the trouble? I don't know of anyone using size() for something other than checking that their data was properly loaded into the cache, and they don't need a super-fast size() for that. > > Regarding cache stores, IMO we're damned anyway: when calling > cacheStore.size(), it can report more entries as those haven't been > expired yet, it can report less entries as those can be expired due to > [1]. Or, we'll enumerate all the entries, and that's going to be slow > (btw., [1] reminded me that we should enumerate both datacontainer AND > cachestores even if passivation is not enabled). > Exactly, we need to iterate all the entries from the stores if we want something remotely accurate (although I had forgotten about expiration also being a problem). Otherwise we could just leave size() as it is now, it's pretty fast :) > > Radim > > [1] https://issues.jboss.org/browse/ISPN-3202 > > On 10/08/2014 04:42 PM, William Burns wrote: > > So it seems we would want to change this for 7.0 if possible since it > > would be a bigger change for something like 7.1 and 8.0 would be even > > further out. I should be able to put this together for CR2. 
> > > > It seems that we want to implement keySet, values and entrySet methods > > using the entry iterator approach. > > > > It is however unclear for the size method if we want to use MR entry > > counting and not worry about the rehash and passivation issues since > > it is just an estimation anyways. Or if we want to also use the entry > > iterator which should be closer approximation but will require more > > network overhead and memory usage. > > > > Also we didn't really talk about the fact that these methods would > > ignore ongoing transactions and if that is a concern or not. > > > > - Will > > > > On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus > wrote: > >> On Oct 8, 2014, at 15:11, Dan Berindei wrote: > >> > >>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus > wrote: > >>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: > >>> > >>>> Hi, > >>>> > >>>> recently we had a discussion about what size() returns, but I've > >>>> realized there are more things that users would like to know. My > >>>> question is whether you think that they would really appreciate it, or > >>>> whether it's just my QA point of view where I sometimes compute the > >>>> 'checksums' of cache to see if I didn't lost anything. > >>>> > >>>> There are those sizes: > >>>> A) number of owned entries > >>>> B) number of entries stored locally in memory > >>>> C) number of entries stored in each local cache store > >>>> D) number of entries stored in each shared cache store > >>>> E) total number of entries in cache > >>>> > >>>> So far, we can get > >>>> B via withFlags(SKIP_CACHE_LOAD).size() > >>>> (passivation ? B : 0) + firstNonZero(C, D) via size() > >>>> E via distributed iterators / MR > >>>> A via data container iteration + distribution manager query, but only > >>>> without cache store > >>>> C or D through > >>>> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >>>> > >>>> I think that it would go along with users' expectations if size() > >>>> returned E and for the rest we should have special methods on > >>>> AdvancedCache. That would of course change the meaning of size(), but > >>>> I'd say that finally to something that has firm meaning. > >>>> > >>>> WDYT? > >>> There was a lot of arguments in past whether size() and other methods > that operate over all the elements (keySet, values) are useful because: > >>> - they are approximate (data changes during iteration) > >>> - they are very resource consuming and might be miss-used (this is the > reason we chosen to use size() with its current local semantic) > >>> > >>> These methods (size, keys, values) are useful for people and I think > we were not wise to implement them only on top of the local data: this is > like preferring efficiency over correctness. This also created a lot of > confusion with our users, question like size() doesn't return the correct > value being asked regularly. I totally agree that size() returns E (i.e. > everything that is stored within the grid, including persistence) and it's > performance implications to be documented accordingly. For keySet and > values - we should stop implementing them (throw exception) and point users > to Will's distributed iterator which is a nicer way to achieve the desired > behavior. > >>> > >>> We can also implement keySet() and values() on top of the distributed > entry iterator and document that using the iterator directly is better. > >> Yes, that's what I meant as well. 
> >> > >> Cheers, > >> -- > >> Mircea Markus > >> Infinispan lead (www.infinispan.org) > >> > >> > >> > >> > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141008/2e3cacec/attachment-0001.html From mudokonman at gmail.com Wed Oct 8 12:53:42 2014 From: mudokonman at gmail.com (William Burns) Date: Wed, 8 Oct 2014 12:53:42 -0400 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 12:41 PM, Dan Berindei wrote: > > > On Wed, Oct 8, 2014 at 6:19 PM, Radim Vansa wrote: >> >> Users expect that size() will be constant-time (or linear to cluster >> size), and generally fast operation. I'd prefer to keep it that way. >> Though, even the MR way (used for HotRod size() now) needs to crawl >> through all the entries locally. > > > They might expect that, but there is nothing in the Map API suggesting it. > >> >> >> 'Heretic, not very well though of and changing too many things' idea: >> what about having data container segment-aware? Then you'd just bcast >> SizeCommand with given topologyId and sum up sizes of primary-owned >> segments... It's not a complete solution, but at least that would enable >> to get the number of locally owned entries quite fast. Though, you can't >> do that easily with cache stores (without changing SPI). > > > We could create a separate DataContainer for each segment. But would it > really be worth the trouble? I don't know of anyone using size() for > something other than checking that their data was properly loaded into the > cache, and they don't need a super-fast size() for that. Having a DataContainer per segment would actually reduce required memory usage for the distributed iterator as well, since we can query data by segment much more efficiently and close out segments one by one per node instead of having to keep multiple open at once. When I asked about this before it was kind of a we can deal with it later kind thing. I would think this would increase ST operation time as well. > >> >> >> Regarding cache stores, IMO we're damned anyway: when calling >> cacheStore.size(), it can report more entries as those haven't been >> expired yet, it can report less entries as those can be expired due to >> [1]. Or, we'll enumerate all the entries, and that's going to be slow >> (btw., [1] reminded me that we should enumerate both datacontainer AND >> cachestores even if passivation is not enabled). > > > Exactly, we need to iterate all the entries from the stores if we want > something remotely accurate (although I had forgotten about expiration also > being a problem). 
Otherwise we could just leave size() as it is now, it's > pretty fast :) > >> >> >> Radim >> >> [1] https://issues.jboss.org/browse/ISPN-3202 >> >> On 10/08/2014 04:42 PM, William Burns wrote: >> > So it seems we would want to change this for 7.0 if possible since it >> > would be a bigger change for something like 7.1 and 8.0 would be even >> > further out. I should be able to put this together for CR2. >> > >> > It seems that we want to implement keySet, values and entrySet methods >> > using the entry iterator approach. >> > >> > It is however unclear for the size method if we want to use MR entry >> > counting and not worry about the rehash and passivation issues since >> > it is just an estimation anyways. Or if we want to also use the entry >> > iterator which should be closer approximation but will require more >> > network overhead and memory usage. >> > >> > Also we didn't really talk about the fact that these methods would >> > ignore ongoing transactions and if that is a concern or not. >> > >> > - Will >> > >> > On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus >> > wrote: >> >> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >> >> >> >>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus >> >>> wrote: >> >>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >> >>> >> >>>> Hi, >> >>>> >> >>>> recently we had a discussion about what size() returns, but I've >> >>>> realized there are more things that users would like to know. My >> >>>> question is whether you think that they would really appreciate it, >> >>>> or >> >>>> whether it's just my QA point of view where I sometimes compute the >> >>>> 'checksums' of cache to see if I didn't lost anything. >> >>>> >> >>>> There are those sizes: >> >>>> A) number of owned entries >> >>>> B) number of entries stored locally in memory >> >>>> C) number of entries stored in each local cache store >> >>>> D) number of entries stored in each shared cache store >> >>>> E) total number of entries in cache >> >>>> >> >>>> So far, we can get >> >>>> B via withFlags(SKIP_CACHE_LOAD).size() >> >>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >> >>>> E via distributed iterators / MR >> >>>> A via data container iteration + distribution manager query, but only >> >>>> without cache store >> >>>> C or D through >> >>>> >> >>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >> >>>> >> >>>> I think that it would go along with users' expectations if size() >> >>>> returned E and for the rest we should have special methods on >> >>>> AdvancedCache. That would of course change the meaning of size(), but >> >>>> I'd say that finally to something that has firm meaning. >> >>>> >> >>>> WDYT? >> >>> There was a lot of arguments in past whether size() and other methods >> >>> that operate over all the elements (keySet, values) are useful because: >> >>> - they are approximate (data changes during iteration) >> >>> - they are very resource consuming and might be miss-used (this is the >> >>> reason we chosen to use size() with its current local semantic) >> >>> >> >>> These methods (size, keys, values) are useful for people and I think >> >>> we were not wise to implement them only on top of the local data: this is >> >>> like preferring efficiency over correctness. This also created a lot of >> >>> confusion with our users, question like size() doesn't return the correct >> >>> value being asked regularly. I totally agree that size() returns E (i.e. 
>> >>> everything that is stored within the grid, including persistence) and it's >> >>> performance implications to be documented accordingly. For keySet and values >> >>> - we should stop implementing them (throw exception) and point users to >> >>> Will's distributed iterator which is a nicer way to achieve the desired >> >>> behavior. >> >>> >> >>> We can also implement keySet() and values() on top of the distributed >> >>> entry iterator and document that using the iterator directly is better. >> >> Yes, that's what I meant as well. >> >> >> >> Cheers, >> >> -- >> >> Mircea Markus >> >> Infinispan lead (www.infinispan.org) >> >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa >> JBoss DataGrid QA >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Thu Oct 9 02:48:38 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Thu, 9 Oct 2014 09:48:38 +0300 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 7:53 PM, William Burns wrote: > On Wed, Oct 8, 2014 at 12:41 PM, Dan Berindei > wrote: > > > > > > On Wed, Oct 8, 2014 at 6:19 PM, Radim Vansa wrote: > >> > >> Users expect that size() will be constant-time (or linear to cluster > >> size), and generally fast operation. I'd prefer to keep it that way. > >> Though, even the MR way (used for HotRod size() now) needs to crawl > >> through all the entries locally. > > > > > > They might expect that, but there is nothing in the Map API suggesting > it. > > > >> > >> > >> 'Heretic, not very well though of and changing too many things' idea: > >> what about having data container segment-aware? Then you'd just bcast > >> SizeCommand with given topologyId and sum up sizes of primary-owned > >> segments... It's not a complete solution, but at least that would enable > >> to get the number of locally owned entries quite fast. Though, you can't > >> do that easily with cache stores (without changing SPI). > > > > > > We could create a separate DataContainer for each segment. But would it > > really be worth the trouble? I don't know of anyone using size() for > > something other than checking that their data was properly loaded into > the > > cache, and they don't need a super-fast size() for that. > > Having a DataContainer per segment would actually reduce required > memory usage for the distributed iterator as well, since we can query > data by segment much more efficiently and close out segments one by > one per node instead of having to keep multiple open at once. When I > asked about this before it was kind of a we can deal with it later > kind thing. I would think this would increase ST operation time as > well. 
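To make the "segment-aware data container" idea concrete, here is a minimal sketch. The class is hypothetical (it is not the actual DataContainer SPI), and the key-to-segment mapping below is a stand-in for what the real ConsistentHash would provide.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SegmentedDataContainer<K, V> {

   private final List<ConcurrentMap<K, V>> segments;

   SegmentedDataContainer(int numSegments) {
      segments = new ArrayList<ConcurrentMap<K, V>>(numSegments);
      for (int i = 0; i < numSegments; i++) {
         segments.add(new ConcurrentHashMap<K, V>());
      }
   }

   // stand-in for the real ConsistentHash key-to-segment mapping
   private int segmentOf(Object key) {
      return (key.hashCode() & Integer.MAX_VALUE) % segments.size();
   }

   public V put(K key, V value) { return segments.get(segmentOf(key)).put(key, value); }

   public V get(Object key) { return segments.get(segmentOf(key)).get(key); }

   // per-segment size and iteration become cheap: no need to hash every key
   public int size(int segment) { return segments.get(segment).size(); }

   public Iterator<Map.Entry<K, V>> iterator(int segment) {
      return segments.get(segment).entrySet().iterator();
   }
}

A SizeCommand could then sum size(s) over the primary-owned segments, and state transfer or the distributed iterator could stream one segment at a time instead of keeping several open at once.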
>

You mean it would improve ST performance, because it wouldn't have to compute the hash of each key in the data container? I don't think we have ever considered splitting the data container for ST, as it didn't seem worth the trouble.

OTOH we wanted to add a segment-based query to the cache loader SPI ever since we started designing NBST :)

>
> >>
> >> Regarding cache stores, IMO we're damned anyway: when calling
> >> cacheStore.size(), it can report more entries as those haven't been
> >> expired yet, it can report less entries as those can be expired due to
> >> [1]. Or, we'll enumerate all the entries, and that's going to be slow
> >> (btw., [1] reminded me that we should enumerate both datacontainer AND
> >> cachestores even if passivation is not enabled).
> >
> > Exactly, we need to iterate all the entries from the stores if we want
> > something remotely accurate (although I had forgotten about expiration also
> > being a problem). Otherwise we could just leave size() as it is now, it's
> > pretty fast :)
> >
> >>
> >> Radim
> >>
> >> [1] https://issues.jboss.org/browse/ISPN-3202
> >>
> >> On 10/08/2014 04:42 PM, William Burns wrote:
> >> > So it seems we would want to change this for 7.0 if possible since it
> >> > would be a bigger change for something like 7.1 and 8.0 would be even
> >> > further out. I should be able to put this together for CR2.
> >> >
> >> > It seems that we want to implement keySet, values and entrySet methods
> >> > using the entry iterator approach.
> >> >
> >> > It is however unclear for the size method if we want to use MR entry
> >> > counting and not worry about the rehash and passivation issues since
> >> > it is just an estimation anyways. Or if we want to also use the entry
> >> > iterator which should be a closer approximation but will require more
> >> > network overhead and memory usage.
> >> >
> >> > Also we didn't really talk about the fact that these methods would
> >> > ignore ongoing transactions and if that is a concern or not.
> >> >
> >> > - Will
> >> >
> >> > On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus
> >> > wrote:
> >> >> On Oct 8, 2014, at 15:11, Dan Berindei wrote:
> >> >>
> >> >>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus
> >> >>> wrote:
> >> >>> On Oct 3, 2014, at 9:30, Radim Vansa wrote:
> >> >>>
> >> >>>> Hi,
> >> >>>>
> >> >>>> recently we had a discussion about what size() returns, but I've
> >> >>>> realized there are more things that users would like to know. My
> >> >>>> question is whether you think that they would really appreciate it, or
> >> >>>> whether it's just my QA point of view where I sometimes compute the
> >> >>>> 'checksums' of cache to see if I didn't lose anything.
> >> >>>>
> >> >>>> There are those sizes:
> >> >>>> A) number of owned entries
> >> >>>> B) number of entries stored locally in memory
> >> >>>> C) number of entries stored in each local cache store
> >> >>>> D) number of entries stored in each shared cache store
> >> >>>> E) total number of entries in cache
> >> >>>>
> >> >>>> So far, we can get
> >> >>>> B via withFlags(SKIP_CACHE_LOAD).size()
> >> >>>> (passivation ? 
B : 0) + firstNonZero(C, D) via size() > >> >>>> E via distributed iterators / MR > >> >>>> A via data container iteration + distribution manager query, but > only > >> >>>> without cache store > >> >>>> C or D through > >> >>>> > >> >>>> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >> >>>> > >> >>>> I think that it would go along with users' expectations if size() > >> >>>> returned E and for the rest we should have special methods on > >> >>>> AdvancedCache. That would of course change the meaning of size(), > but > >> >>>> I'd say that finally to something that has firm meaning. > >> >>>> > >> >>>> WDYT? > >> >>> There was a lot of arguments in past whether size() and other > methods > >> >>> that operate over all the elements (keySet, values) are useful > because: > >> >>> - they are approximate (data changes during iteration) > >> >>> - they are very resource consuming and might be miss-used (this is > the > >> >>> reason we chosen to use size() with its current local semantic) > >> >>> > >> >>> These methods (size, keys, values) are useful for people and I think > >> >>> we were not wise to implement them only on top of the local data: > this is > >> >>> like preferring efficiency over correctness. This also created a > lot of > >> >>> confusion with our users, question like size() doesn't return the > correct > >> >>> value being asked regularly. I totally agree that size() returns E > (i.e. > >> >>> everything that is stored within the grid, including persistence) > and it's > >> >>> performance implications to be documented accordingly. For keySet > and values > >> >>> - we should stop implementing them (throw exception) and point > users to > >> >>> Will's distributed iterator which is a nicer way to achieve the > desired > >> >>> behavior. > >> >>> > >> >>> We can also implement keySet() and values() on top of the > distributed > >> >>> entry iterator and document that using the iterator directly is > better. > >> >> Yes, that's what I meant as well. > >> >> > >> >> Cheers, > >> >> -- > >> >> Mircea Markus > >> >> Infinispan lead (www.infinispan.org) > >> >> > >> >> > >> >> > >> >> > >> >> > >> >> _______________________________________________ > >> >> infinispan-dev mailing list > >> >> infinispan-dev at lists.jboss.org > >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> -- > >> Radim Vansa > >> JBoss DataGrid QA > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141009/d0a0c200/attachment-0001.html
From rory.odonnell at oracle.com  Thu Oct  9 05:27:16 2014
From: rory.odonnell at oracle.com (Rory O'Donnell Oracle, Dublin Ireland)
Date: Thu, 09 Oct 2014 10:27:16 +0100
Subject: [infinispan-dev] Early Access builds for JDK 9 b33 and JDK 8u40 b09 are available on java.net
Message-ID: <543654F4.1040605@oracle.com>

Hi Galder,

Early Access build for JDK 9 b33 is available on java.net; a summary of changes is listed here.

Early Access build for JDK 8u40 b09 is available on java.net; a summary of changes is listed here.

Rgds, Rory

--
Rgds, Rory O'Donnell
Quality Engineering Manager
Oracle EMEA, Dublin, Ireland

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141009/6d8f30d7/attachment.html
From emmanuel at hibernate.org  Thu Oct  9 08:18:38 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Thu, 9 Oct 2014 15:18:38 +0300
Subject: [infinispan-dev] TopologySafe Map / Reduce
Message-ID:

Pedro and I have been having discussions with the LEADS guys on their experience of Map / Reduce, especially around stability during topology changes.

This ties to the .size() thread you guys have been exchanging on (I only could read it partially).

On the requirements, theirs is pretty straightforward and expected I think from most users.
They are fine with inconsistencies with entries created/updated/deleted between the M/R start and the end.
They are *not* fine with seeing the same key/value several times for the duration of the M/R execution. This AFAIK can happen when a topology change occurs.

Here is a proposal.
Why not run the M/R job not per node but rather per segment?
The point is that segments are stable across topology changes. The M/R tasks would then be about iterating over the keys in a given segment.

The M/R request would send the task per segment to each node where the segment is primary.
(We can imagine interesting things like sending it to one of the backups for workload optimization purposes, or sending it to both primary and backups and doing comparisons.)
The M/R requester would be in an interesting situation. It could detect that a segment M/R never returns and trigger a new computation on a node other than the one initially chosen.

One tricky question around that is when the M/R job stores data in an intermediary state. We need some sort of way to expose the user indirectly to segments so that we can evict per-segment intermediary caches in case of failure or retry.

But before getting ahead of ourselves, what do you think of the general idea? Even without a retry framework, this approach would be more stable than our current per-node approach during topology changes and improve dependability.

Emmanuel
_______________________________________________
infinispan-dev mailing list
infinispan-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

From mudokonman at gmail.com  Thu Oct  9 08:40:12 2014
From: mudokonman at gmail.com (William Burns)
Date: Thu, 9 Oct 2014 08:40:12 -0400
Subject: [infinispan-dev] TopologySafe Map / Reduce
In-Reply-To:
References:
Message-ID:

Actually this was something I was hoping to get to possibly in the near future.

I already have to do https://issues.jboss.org/browse/ISPN-4358 which will require rewriting parts of the distributed entry iterator. In doing so I was planning on breaking this out to a more generic framework where you could run a given operation by segment, guaranteeing it was only run once per entry. 
In doing so I was thinking I could try to move M/R on top of this to allow it to also be resilient to rehash events. Additional comments inline. On Thu, Oct 9, 2014 at 8:18 AM, Emmanuel Bernard wrote: > Pedro and I have been having discussions with the LEADS guys on their experience of Map / Reduce especially around stability during topology changes. > > This ties to the .size() thread you guys have been exchanging on (I only could read it partially). > > On the requirements, theirs is pretty straightforward and expected I think from most users. > They are fine with inconsistencies with entries create/updated/deleted between the M/R start and the end. There is no way we can fix this without adding a very strict isolation level like SERIALIZABLE. > They are *not* fine with seeing the same key/value several time for the duration of the M/R execution. This AFAIK can happen when a topology change occurs. This can happen if it was processed on one node and then rehash migrates the entry to another and runs it there. > > Here is a proposal. > Why not run the M/R job not per node but rather per segment? > The point is that segments are stable across topology changes. The M/R tasks would then be about iterating over the keys in a given segment. > > The M/R request would send the task per segments on each node where the segment is primary. This is exactly what the iterator does today but also watches for rehashes to send the request to a new owner when the segment moves between nodes. > (We can imagine interesting things like sending it to one of the backups for workload optimization purposes or sending it to both primary and backups and to comparisons). > The M/R requester would be in an interesting situation. It could detect that a segment M/R never returns and trigger a new computation on another node than the one initially sent. > > One tricky question around that is when the M/R job store data in an intermediary state. We need some sort of way to expose the user indirectly to segments so that we can evict per segment intermediary caches in case of failure or retry. This was one place I was thinking I would need to take special care to look into when doing a conversion like this. > > But before getting ahead of ourselves, what do you thing of the general idea? Even without retry framework, this approach would be more stable than our current per node approach during topology changes and improve dependability. Doing it solely based on segment would remove the possibility of having duplicates. However without a mechanism to send a new request on rehash it would be possible to only find a subset of values (if a segment is removed while iterating on it). > > Emmanuel > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Thu Oct 9 09:41:39 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Thu, 9 Oct 2014 16:41:39 +0300 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: References: Message-ID: On Thu, Oct 9, 2014 at 3:40 PM, William Burns wrote: > Actually this was something I was hoping to get to possibly in the near > future. > > I already have to do https://issues.jboss.org/browse/ISPN-4358 which > will require rewriting parts of the distributed entry iterator. 
In > doing so I was planning on breaking this out to a more generic > framework where you could run a given operation by segment > guaranteeing it was only ran once per entry. In doing so I was > thinking I could try to move M/R on top of this to allow it to also be > resilient to rehash events. > > Additional comments inline. > > On Thu, Oct 9, 2014 at 8:18 AM, Emmanuel Bernard > wrote: > > Pedro and I have been having discussions with the LEADS guys on their > experience of Map / Reduce especially around stability during topology > changes. > > > > This ties to the .size() thread you guys have been exchanging on (I only > could read it partially). > > > > On the requirements, theirs is pretty straightforward and expected I > think from most users. > > They are fine with inconsistencies with entries create/updated/deleted > between the M/R start and the end. > > There is no way we can fix this without adding a very strict isolation > level like SERIALIZABLE. > > They are *not* fine with seeing the same key/value several time for the > duration of the M/R execution. This AFAIK can happen when a topology change > occurs. > > This can happen if it was processed on one node and then rehash > migrates the entry to another and runs it there. > > > > > Here is a proposal. > > Why not run the M/R job not per node but rather per segment? > > The point is that segments are stable across topology changes. The M/R > tasks would then be about iterating over the keys in a given segment. > > > > The M/R request would send the task per segments on each node where the > segment is primary. > > This is exactly what the iterator does today but also watches for > rehashes to send the request to a new owner when the segment moves > between nodes. > > > (We can imagine interesting things like sending it to one of the backups > for workload optimization purposes or sending it to both primary and > backups and to comparisons). > > The M/R requester would be in an interesting situation. It could detect > that a segment M/R never returns and trigger a new computation on another > node than the one initially sent. > > > > One tricky question around that is when the M/R job store data in an > intermediary state. We need some sort of way to expose the user indirectly > to segments so that we can evict per segment intermediary caches in case of > failure or retry. > > This was one place I was thinking I would need to take special care to > look into when doing a conversion like this. > I'd rather not expose this to the user. Instead, we could split the intermediary values for each key by the source segment, and do the invalidation of the retried segments in our M/R framework (e.g. when we detect that the primary owner at the start of the map/combine phase is not an owner at all at the end). I think we have another problem with the publishing of intermediary values not being idempotent. The default configuration for the intermediate cache is non-transactional, and retrying the put(delta) command after a topology change could add the same intermediate values twice. A transactional intermediary cache should be safe, though, because the tx won't commit on the old owner until the new owner knows about the tx. > > > > But before getting ahead of ourselves, what do you thing of the general > idea? Even without retry framework, this approach would be more stable than > our current per node approach during topology changes and improve > dependability. 
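To illustrate the suggestion above of splitting the intermediary values by their source segment, here is a sketch of the kind of compound key that would allow it. The class is hypothetical, not the actual intermediate-cache layout.

import java.io.Serializable;

final class IntermediateKey implements Serializable {
   final Object reduceKey;  // the intermediate (reduce-phase) key
   final int sourceSegment; // segment of the input entry that produced the values

   IntermediateKey(Object reduceKey, int sourceSegment) {
      this.reduceKey = reduceKey;
      this.sourceSegment = sourceSegment;
   }

   @Override
   public boolean equals(Object o) {
      if (!(o instanceof IntermediateKey)) return false;
      IntermediateKey other = (IntermediateKey) o;
      return sourceSegment == other.sourceSegment && reduceKey.equals(other.reduceKey);
   }

   @Override
   public int hashCode() {
      return 31 * reduceKey.hashCode() + sourceSegment;
   }
}

Invalidating a retried segment s then means removing every IntermediateKey whose sourceSegment == s, while the reduce phase for a given reduceKey iterates over the values of all source segments.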
> > Doing it solely based on segment would remove the possibility of > having duplicates. However without a mechanism to send a new request > on rehash it would be possible to only find a subset of values (if a > segment is removed while iterating on it). > > > > > Emmanuel > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141009/34d4e311/attachment.html From mmarkus at redhat.com Thu Oct 9 11:46:44 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Thu, 9 Oct 2014 16:46:44 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <5435560E.2030206@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> Message-ID: <081C49F0-4B51-4486-BD7C-19926E5A9178@redhat.com> On Oct 8, 2014, at 16:19, Radim Vansa wrote: > Users expect that size() will be constant-time (or linear to cluster > size), and generally fast operation. I'd prefer to keep it that way. > Though, even the MR way (used for HotRod size() now) needs to crawl > through all the entries locally. yes, but first of all they expect size to be correct, then fast. > > 'Heretic, not very well though of and changing too many things' idea: > what about having data container segment-aware? Then you'd just bcast > SizeCommand with given topologyId and sum up sizes of primary-owned > segments... It's not a complete solution, but at least that would enable > to get the number of locally owned entries quite fast. Though, you can't > do that easily with cache stores (without changing SPI). that would help and there were discussions to do this for other reasons as well: e.g. ST would migrate the data without iterating over the state in the DC. Not doable in the scope of ISPN 7.0, though. > > Regarding cache stores, IMO we're damned anyway: when calling > cacheStore.size(), it can report more entries as those haven't been > expired yet, it can report less entries as those can be expired due to > [1]. Or, we'll enumerate all the entries, and that's going to be slow > (btw., [1] reminded me that we should enumerate both datacontainer AND > cachestores even if passivation is not enabled). > > Radim > > [1] https://issues.jboss.org/browse/ISPN-3202 > > On 10/08/2014 04:42 PM, William Burns wrote: >> So it seems we would want to change this for 7.0 if possible since it >> would be a bigger change for something like 7.1 and 8.0 would be even >> further out. I should be able to put this together for CR2. >> >> It seems that we want to implement keySet, values and entrySet methods >> using the entry iterator approach. >> >> It is however unclear for the size method if we want to use MR entry >> counting and not worry about the rehash and passivation issues since >> it is just an estimation anyways. Or if we want to also use the entry >> iterator which should be closer approximation but will require more >> network overhead and memory usage. >> >> Also we didn't really talk about the fact that these methods would >> ignore ongoing transactions and if that is a concern or not. 
>> >> - Will >> >> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>> >>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>> >>>>> Hi, >>>>> >>>>> recently we had a discussion about what size() returns, but I've >>>>> realized there are more things that users would like to know. My >>>>> question is whether you think that they would really appreciate it, or >>>>> whether it's just my QA point of view where I sometimes compute the >>>>> 'checksums' of cache to see if I didn't lost anything. >>>>> >>>>> There are those sizes: >>>>> A) number of owned entries >>>>> B) number of entries stored locally in memory >>>>> C) number of entries stored in each local cache store >>>>> D) number of entries stored in each shared cache store >>>>> E) total number of entries in cache >>>>> >>>>> So far, we can get >>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>> E via distributed iterators / MR >>>>> A via data container iteration + distribution manager query, but only >>>>> without cache store >>>>> C or D through >>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>> >>>>> I think that it would go along with users' expectations if size() >>>>> returned E and for the rest we should have special methods on >>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>> I'd say that finally to something that has firm meaning. >>>>> >>>>> WDYT? >>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>> - they are approximate (data changes during iteration) >>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>> >>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>>> >>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>> Yes, that's what I meant as well. 
>>>
>>> Cheers,
>>> --
>>> Mircea Markus
>>> Infinispan lead (www.infinispan.org)
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

From mmarkus at redhat.com  Thu Oct  9 11:47:25 2014
From: mmarkus at redhat.com (Mircea Markus)
Date: Thu, 9 Oct 2014 16:47:25 +0100
Subject: [infinispan-dev] About size()
In-Reply-To:
References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com>
Message-ID: <45C26FF2-348C-4C3D-B030-AAAB81B725EF@redhat.com>

On Oct 8, 2014, at 15:42, William Burns wrote:

> So it seems we would want to change this for 7.0 if possible since it
> would be a bigger change for something like 7.1 and 8.0 would be even
> further out. I should be able to put this together for CR2.

+1, please go for it.

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

From mmarkus at redhat.com  Thu Oct  9 11:48:49 2014
From: mmarkus at redhat.com (Mircea Markus)
Date: Thu, 9 Oct 2014 16:48:49 +0100
Subject: [infinispan-dev] About size()
In-Reply-To:
References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com>
Message-ID: <5C898CCC-55F0-4556-883B-B5EC4F2A62C5@redhat.com>

On Oct 8, 2014, at 15:57, Dan Berindei wrote:

> On Wed, Oct 8, 2014 at 5:42 PM, William Burns wrote:
> So it seems we would want to change this for 7.0 if possible since it
> would be a bigger change for something like 7.1 and 8.0 would be even
> further out. I should be able to put this together for CR2.
>
> I'm not 100% convinced that we need it for 7.x. For 8.0 I would recommend removing the size() method altogether, and providing some looser "statistics" instead.

8.0 will happen in ~1 year's time; this is a small change for the better, so +1 to have it in for CR2.

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

From pedro at infinispan.org  Thu Oct  9 17:13:13 2014
From: pedro at infinispan.org (Pedro Ruivo)
Date: Fri, 10 Oct 2014 00:13:13 +0300
Subject: [infinispan-dev] TopologySafe Map / Reduce
In-Reply-To:
References:
Message-ID: <5436FA69.6030300@infinispan.org>

On 10/09/2014 03:40 PM, William Burns wrote:
> Actually this was something I was hoping to get to possibly in the near future.
>
> I already have to do https://issues.jboss.org/browse/ISPN-4358 which
> will require rewriting parts of the distributed entry iterator. In
> doing so I was planning on breaking this out to a more generic
> framework where you could run a given operation by segment,
> guaranteeing it was only run once per entry. In doing so I was
> thinking I could try to move M/R on top of this to allow it to also be
> resilient to rehash events.
>
> Additional comments inline. 
>
> On Thu, Oct 9, 2014 at 8:18 AM, Emmanuel Bernard wrote:
> > Pedro and I have been having discussions with the LEADS guys on their experience of Map / Reduce, especially around stability during topology changes.
> >
> > This ties to the .size() thread you guys have been exchanging on (I only could read it partially).
> >
> > On the requirements, theirs is pretty straightforward and expected I think from most users.
> > They are fine with inconsistencies with entries created/updated/deleted between the M/R start and the end.
>
> There is no way we can fix this without adding a very strict isolation
> level like SERIALIZABLE.

Snapshot Isolation should be fine, but I don't want to get into that discussion right now :)

>
> > They are *not* fine with seeing the same key/value several times for the duration of the M/R execution. This AFAIK can happen when a topology change occurs.
>
> This can happen if it was processed on one node and then rehash
> migrates the entry to another and runs it there.
>
> >
> > Here is a proposal.
> > Why not run the M/R job not per node but rather per segment?
> > The point is that segments are stable across topology changes. The M/R tasks would then be about iterating over the keys in a given segment.
> >
> > The M/R request would send the task per segment to each node where the segment is primary.
>
> This is exactly what the iterator does today but also watches for
> rehashes to send the request to a new owner when the segment moves
> between nodes.
>
> > (We can imagine interesting things like sending it to one of the backups for workload optimization purposes, or sending it to both primary and backups and doing comparisons.)
> > The M/R requester would be in an interesting situation. It could detect that a segment M/R never returns and trigger a new computation on a node other than the one initially chosen.
> >
> > One tricky question around that is when the M/R job stores data in an intermediary state. We need some sort of way to expose the user indirectly to segments so that we can evict per-segment intermediary caches in case of failure or retry.
>
> This was one place I was thinking I would need to take special care to
> look into when doing a conversion like this.
>
> >
> > But before getting ahead of ourselves, what do you think of the general idea? Even without a retry framework, this approach would be more stable than our current per-node approach during topology changes and improve dependability.
>
> Doing it solely based on segment would remove the possibility of
> having duplicates. However, without a mechanism to send a new request
> on rehash it would be possible to only find a subset of values (if a
> segment is removed while iterating on it).

True. I think the retry mechanism is the best approach (sketched below). Another alternative would be to implement a Map getBySegment(int) operation that goes remote if the segment is not local. 
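The shape of that retry mechanism, as a sketch: every name below is hypothetical, and it shows the control flow only, not a real API.

import java.util.Set;
import org.infinispan.remoting.transport.Address;

abstract class PerSegmentMapPhase {

   abstract Set<Integer> segments();                     // all segments of the cache
   abstract Address primaryOwner(int segment);           // assumed topology lookup
   abstract void mapSegmentOn(Address node, int segment) throws SegmentMovedException;
   abstract void discardIntermediateValues(int segment);

   void run() {
      for (int segment : segments()) {
         for (;;) {
            try {
               mapSegmentOn(primaryOwner(segment), segment);
               break; // this segment has now been processed exactly once
            } catch (SegmentMovedException moved) {
               // a rehash moved the segment: drop partial results and retry
               // on whoever the new primary owner is
               discardIntermediateValues(segment);
            }
         }
      }
   }
}

// hypothetical signal that a topology change moved the segment mid-task
class SegmentMovedException extends Exception {
}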
> >> >> Emmanuel >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From pedro at infinispan.org Thu Oct 9 17:16:06 2014 From: pedro at infinispan.org (Pedro Ruivo) Date: Fri, 10 Oct 2014 00:16:06 +0300 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: References: Message-ID: <5436FB16.3000003@infinispan.org> On 10/09/2014 04:41 PM, Dan Berindei wrote: > > > On Thu, Oct 9, 2014 at 3:40 PM, William Burns > wrote: > > Actually this was something I was hoping to get to possibly in the > near future. > > I already have to do https://issues.jboss.org/browse/ISPN-4358 which > will require rewriting parts of the distributed entry iterator. In > doing so I was planning on breaking this out to a more generic > framework where you could run a given operation by segment > guaranteeing it was only ran once per entry. In doing so I was > thinking I could try to move M/R on top of this to allow it to also be > resilient to rehash events. > > Additional comments inline. > > On Thu, Oct 9, 2014 at 8:18 AM, Emmanuel Bernard > > wrote: > > Pedro and I have been having discussions with the LEADS guys on their experience of Map / Reduce especially around stability during topology changes. > > > > This ties to the .size() thread you guys have been exchanging on (I only could read it partially). > > > > On the requirements, theirs is pretty straightforward and expected I think from most users. > > They are fine with inconsistencies with entries create/updated/deleted between the M/R start and the end. > > There is no way we can fix this without adding a very strict isolation > level like SERIALIZABLE. > > > > They are *not* fine with seeing the same key/value several time for the duration of the M/R execution. This AFAIK can happen when a topology change occurs. > > This can happen if it was processed on one node and then rehash > migrates the entry to another and runs it there. > > > > > Here is a proposal. > > Why not run the M/R job not per node but rather per segment? > > The point is that segments are stable across topology changes. The M/R tasks would then be about iterating over the keys in a given segment. > > > > The M/R request would send the task per segments on each node where the segment is primary. > > This is exactly what the iterator does today but also watches for > rehashes to send the request to a new owner when the segment moves > between nodes. > > > (We can imagine interesting things like sending it to one of the backups for workload optimization purposes or sending it to both primary and backups and to comparisons). > > The M/R requester would be in an interesting situation. It could detect that a segment M/R never returns and trigger a new computation on another node than the one initially sent. > > > > One tricky question around that is when the M/R job store data in an intermediary state. We need some sort of way to expose the user indirectly to segments so that we can evict per segment intermediary caches in case of failure or retry. > > This was one place I was thinking I would need to take special care to > look into when doing a conversion like this. > > > I'd rather not expose this to the user. 
Instead, we could split the > intermediary values for each key by the source segment, and do the > invalidation of the retried segments in our M/R framework (e.g. when we > detect that the primary owner at the start of the map/combine phase is > not an owner at all at the end). > > I think we have another problem with the publishing of intermediary > values not being idempotent. The default configuration for the > intermediate cache is non-transactional, and retrying the put(delta) > command after a topology change could add the same intermediate values > twice. A transactional intermediary cache should be safe, though, > because the tx won't commit on the old owner until the new owner knows > about the tx. can you elaborate on it? anyway, I think the retry mechanism should solve it. If we detect a topology change (during the iteration of segment _i_) and the segment _i_ is moved, then we can cancel the iteration, remove all the intermediate values generated in segment _i_ and restart (on the primary owner). > > > > > > But before getting ahead of ourselves, what do you thing of the general idea? Even without retry framework, this approach would be more stable than our current per node approach during topology changes and improve dependability. > > Doing it solely based on segment would remove the possibility of > having duplicates. However without a mechanism to send a new request > on rehash it would be possible to only find a subset of values (if a > segment is removed while iterating on it). > > > > > Emmanuel > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From dan.berindei at gmail.com Fri Oct 10 03:03:37 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 10 Oct 2014 10:03:37 +0300 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: <5436FB16.3000003@infinispan.org> References: <5436FB16.3000003@infinispan.org> Message-ID: On Fri, Oct 10, 2014 at 12:16 AM, Pedro Ruivo wrote: > > > On 10/09/2014 04:41 PM, Dan Berindei wrote: > > > > > > On Thu, Oct 9, 2014 at 3:40 PM, William Burns > > wrote: > > > > Actually this was something I was hoping to get to possibly in the > > near future. > > > > I already have to do https://issues.jboss.org/browse/ISPN-4358 which > > will require rewriting parts of the distributed entry iterator. In > > doing so I was planning on breaking this out to a more generic > > framework where you could run a given operation by segment > > guaranteeing it was only ran once per entry. In doing so I was > > thinking I could try to move M/R on top of this to allow it to also > be > > resilient to rehash events. > > > > Additional comments inline. > > > > On Thu, Oct 9, 2014 at 8:18 AM, Emmanuel Bernard > > > wrote: > > > Pedro and I have been having discussions with the LEADS guys on > their experience of Map / Reduce especially around stability during > topology changes. > > > > > > This ties to the .size() thread you guys have been exchanging on > (I only could read it partially). 
> > > > > > On the requirements, theirs is pretty straightforward and expected > I think from most users. > > > They are fine with inconsistencies with entries > create/updated/deleted between the M/R start and the end. > > > > There is no way we can fix this without adding a very strict > isolation > > level like SERIALIZABLE. > > > > > > > They are *not* fine with seeing the same key/value several time > for the duration of the M/R execution. This AFAIK can happen when a > topology change occurs. > > > > This can happen if it was processed on one node and then rehash > > migrates the entry to another and runs it there. > > > > > > > > Here is a proposal. > > > Why not run the M/R job not per node but rather per segment? > > > The point is that segments are stable across topology changes. The > M/R tasks would then be about iterating over the keys in a given segment. > > > > > > The M/R request would send the task per segments on each node > where the segment is primary. > > > > This is exactly what the iterator does today but also watches for > > rehashes to send the request to a new owner when the segment moves > > between nodes. > > > > > (We can imagine interesting things like sending it to one of the > backups for workload optimization purposes or sending it to both primary > and backups and to comparisons). > > > The M/R requester would be in an interesting situation. It could > detect that a segment M/R never returns and trigger a new computation on > another node than the one initially sent. > > > > > > One tricky question around that is when the M/R job store data in > an intermediary state. We need some sort of way to expose the user > indirectly to segments so that we can evict per segment intermediary caches > in case of failure or retry. > > > > This was one place I was thinking I would need to take special care > to > > look into when doing a conversion like this. > > > > > > I'd rather not expose this to the user. Instead, we could split the > > intermediary values for each key by the source segment, and do the > > invalidation of the retried segments in our M/R framework (e.g. when we > > detect that the primary owner at the start of the map/combine phase is > > not an owner at all at the end). > > > > I think we have another problem with the publishing of intermediary > > values not being idempotent. The default configuration for the > > intermediate cache is non-transactional, and retrying the put(delta) > > command after a topology change could add the same intermediate values > > twice. A transactional intermediary cache should be safe, though, > > because the tx won't commit on the old owner until the new owner knows > > about the tx. > > can you elaborate on it? >

say we have a cache with numOwners=2, owners(k) = [A, B]

C will become the primary owner of k, but for now owners(k) = [A, B, C]
O sends put(delta) to A (the primary)
A sends put(delta) to B, C
B sees a topology change (owners(k) = [C, B]), doesn't apply the delta and replies with an OutdatedTopologyException
C applies the delta
A resends put(delta) to C (new primary)
C sends put(delta) to B, applies the delta again

I think it could be solved with versions, I just wanted to point out that we don't do that now.

> > anyway, I think the retry mechanism should solve it. If we detect a > topology change (during the iteration of segment _i_) and the segment > _i_ is moved, then we can cancel the iteration, remove all the > intermediate values generated in segment _i_ and restart (on the primary > owner).
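Just to make sure we mean the same thing, a rough sketch of that per-segment retry loop (all the helper names below - entriesInSegment, stillPrimaryOwner, invalidateIntermediateValues, currentTopologyId - are hypothetical, this is not the current API):

    // Sketch only: iterate one segment, and if its ownership changed while
    // we were mapping, drop everything the segment produced and start over.
    <KIn, VIn, KOut, VOut> void mapSegment(int segment,
            Mapper<KIn, VIn, KOut, VOut> mapper,
            Collector<KOut, VOut> collector) {   // collector tags values with 'segment'
        for (;;) {
            int topologyAtStart = currentTopologyId();                 // hypothetical
            for (InternalCacheEntry entry : entriesInSegment(segment)) {  // hypothetical
                mapper.map((KIn) entry.getKey(), (VIn) entry.getValue(), collector);
            }
            if (stillPrimaryOwner(segment, topologyAtStart)) {         // hypothetical
                return;          // segment stayed put, its intermediate values are good
            }
            // topology changed under us: invalidate everything this segment
            // produced and loop
            invalidateIntermediateValues(segment);                     // hypothetical
        }
    }

In the sketch the retry happens in place; in practice the requester would cancel the task and re-send it to the new primary owner of the segment.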
> The problem is that the intermediate keys aren't in the same segment: we want the reduce phase to access only keys local to the reducing node, and keys in different input segments can yield values for the same intermediate key. So like you say, we'd have to retry on every topology change in the intermediary cache, not just the ones affecting segment _i_. There's another complication: in the scenario above, O may only get the topology update with owners(k) = [C, B] after the map/combine phase completed. So the originator of the M/R job would have to watch for topology changes seen by any node, and invalidate/retry any input segments that could have been affected. All that without slowing down the no-topology-change case too much... > > > > > > > > > But before getting ahead of ourselves, what do you thing of the > general idea? Even without retry framework, this approach would be more > stable than our current per node approach during topology changes and > improve dependability. > > > > Doing it solely based on segment would remove the possibility of > > having duplicates. However without a mechanism to send a new request > > on rehash it would be possible to only find a subset of values (if a > > segment is removed while iterating on it). > > > > > > > > Emmanuel > > > _______________________________________________ > > > infinispan-dev mailing list > > > infinispan-dev at lists.jboss.org > > > > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org infinispan-dev at lists.jboss.org> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/b324ae41/attachment-0001.html From rory.odonnell at oracle.com Fri Oct 10 06:01:36 2014 From: rory.odonnell at oracle.com (Rory O'Donnell Oracle, Dublin Ireland) Date: Fri, 10 Oct 2014 11:01:36 +0100 Subject: [infinispan-dev] Analysis of infinispan-6.0.2 dependency on JDK-Internal APIs In-Reply-To: <54227EF9.6090403@oracle.com> References: <54227901.8020509@oracle.com> <54227EF9.6090403@oracle.com> Message-ID: <5437AE80.6050302@oracle.com> Hi Galder, Did you have time to review the report, any feedback ? Rgds,Rory On 24/09/2014 09:21, Rory O'Donnell Oracle, Dublin Ireland wrote: > Below is a text output of the report for infinispan-6.0.2. > > Rgds,Rory > > ------------------------------------------------------------------------ > > > JDK Internal API Usage Report for infinispan-6.0.2.Final-all > > The OpenJDK Quality Outreach campaign has run a compatibility report > to identify usage of JDK-internal APIs. Usage of these JDK-internal > APIs could pose compatibility issues, as the Java team explained in > 1996 > . > We have created this report to help you identify which JDK-internal > APIs your project uses, what to use instead, and where those changes > should go. Making these changes will improve your compatibility, and > in some cases give better performance. 
> > Migrating away from the JDK-internal APIs now will give your team > adequate time for testing before the release of JDK 9. If you are > unable to migrate away from an internal API, please provide us with an > explanation below to help us understand it better. As a reminder, > supported APIs are determined by the OpenJDK's Java Community Process > and not by Oracle. > > This report was generated by jdeps > > through static analysis of artifacts: it does not identify any usage > of those APIs through reflection or dynamic bytecode. You may also run > jdeps on your own > > if you would prefer. > > Summary of the analysis of the jar files within > infinispan-6.0.2.Final-all: > > * Numer of jar files depending on JDK-internal APIs: 10 > * Internal APIs that have known replacements: 0 > * Internal APIs that have no supported replacements: 73 > > > APIs that have known replacements > : > > ID Replace Usage of With Inside > > > JDK-internal APIs without supported replacements: > > ID Internal APIs (do not use) Used by > 1 com.sun.org.apache.xml.internal.utils.PrefixResolver > > * lib/freemarker-2.3.11.jar > > Explanation... > 2 com.sun.org.apache.xpath.internal.XPath > > * lib/freemarker-2.3.11.jar > > Explanation... > 3 com.sun.org.apache.xpath.internal.XPathContext > > * lib/freemarker-2.3.11.jar > > Explanation... > 4 com.sun.org.apache.xpath.internal.objects.XBoolean > > * lib/freemarker-2.3.11.jar > > Explanation... > 5 com.sun.org.apache.xpath.internal.objects.XNodeSet > > * lib/freemarker-2.3.11.jar > > Explanation... > 6 com.sun.org.apache.xpath.internal.objects.XNull > > * lib/freemarker-2.3.11.jar > > Explanation... > 7 com.sun.org.apache.xpath.internal.objects.XNumber > > * lib/freemarker-2.3.11.jar > > Explanation... > 8 com.sun.org.apache.xpath.internal.objects.XObject > > * lib/freemarker-2.3.11.jar > > Explanation... > 9 com.sun.org.apache.xpath.internal.objects.XString > > * lib/freemarker-2.3.11.jar > > Explanation... > 10 org.w3c.dom.html.HTMLAnchorElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 11 org.w3c.dom.html.HTMLAppletElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 12 org.w3c.dom.html.HTMLAreaElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 13 org.w3c.dom.html.HTMLBRElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 14 org.w3c.dom.html.HTMLBaseElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 15 org.w3c.dom.html.HTMLBaseFontElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 16 org.w3c.dom.html.HTMLBodyElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 17 org.w3c.dom.html.HTMLButtonElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 18 org.w3c.dom.html.HTMLCollection > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 19 org.w3c.dom.html.HTMLDListElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 20 org.w3c.dom.html.HTMLDirectoryElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 21 org.w3c.dom.html.HTMLDivElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 22 org.w3c.dom.html.HTMLDocument > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 23 org.w3c.dom.html.HTMLElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 24 org.w3c.dom.html.HTMLFieldSetElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 25 org.w3c.dom.html.HTMLFontElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 26 org.w3c.dom.html.HTMLFormElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... 
> 27 org.w3c.dom.html.HTMLFrameElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 28 org.w3c.dom.html.HTMLFrameSetElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 29 org.w3c.dom.html.HTMLHRElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 30 org.w3c.dom.html.HTMLHeadElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 31 org.w3c.dom.html.HTMLHeadingElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 32 org.w3c.dom.html.HTMLHtmlElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 33 org.w3c.dom.html.HTMLIFrameElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 34 org.w3c.dom.html.HTMLImageElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 35 org.w3c.dom.html.HTMLInputElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 36 org.w3c.dom.html.HTMLIsIndexElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 37 org.w3c.dom.html.HTMLLIElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 38 org.w3c.dom.html.HTMLLabelElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 39 org.w3c.dom.html.HTMLLegendElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 40 org.w3c.dom.html.HTMLLinkElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 41 org.w3c.dom.html.HTMLMapElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 42 org.w3c.dom.html.HTMLMenuElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 43 org.w3c.dom.html.HTMLMetaElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 44 org.w3c.dom.html.HTMLModElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 45 org.w3c.dom.html.HTMLOListElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 46 org.w3c.dom.html.HTMLObjectElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 47 org.w3c.dom.html.HTMLOptGroupElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 48 org.w3c.dom.html.HTMLOptionElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 49 org.w3c.dom.html.HTMLParagraphElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 50 org.w3c.dom.html.HTMLParamElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 51 org.w3c.dom.html.HTMLPreElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 52 org.w3c.dom.html.HTMLQuoteElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 53 org.w3c.dom.html.HTMLScriptElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 54 org.w3c.dom.html.HTMLSelectElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 55 org.w3c.dom.html.HTMLStyleElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 56 org.w3c.dom.html.HTMLTableCaptionElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 57 org.w3c.dom.html.HTMLTableCellElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 58 org.w3c.dom.html.HTMLTableColElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 59 org.w3c.dom.html.HTMLTableElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 60 org.w3c.dom.html.HTMLTableRowElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 61 org.w3c.dom.html.HTMLTableSectionElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 62 org.w3c.dom.html.HTMLTextAreaElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 63 org.w3c.dom.html.HTMLTitleElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 64 org.w3c.dom.html.HTMLUListElement > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 65 org.w3c.dom.ranges.DocumentRange > > * lib/xercesImpl-2.9.1.jar > > Explanation... 
> 66 org.w3c.dom.ranges.Range > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 67 org.w3c.dom.ranges.RangeException > > * lib/xercesImpl-2.9.1.jar > > Explanation... > 68 sun.misc.Signal > > * lib/aesh-0.33.7.jar > > Explanation... > 69 sun.misc.SignalHandler > > * lib/aesh-0.33.7.jar > > Explanation... > 70 sun.misc.Unsafe > > * lib/avro-1.7.5.jar > * lib/guava-12.0.jar > * lib/infinispan-commons-6.0.2.Final.jar > * lib/mvel2-2.0.12.jar > * lib/scala-library-2.10.2.jar > > Explanation... > 71 sun.nio.ch.FileChannelImpl > > * lib/leveldb-0.5.jar > > Explanation... > 72 sun.reflect.ReflectionFactory > > * lib/jboss-marshalling-1.4.4.Final.jar > > Explanation... > 73 sun.reflect.ReflectionFactory$GetReflectionFactoryAction > > * lib/jboss-marshalling-1.4.4.Final.jar > > Explanation... > > > Identify External Replacements > > You should use a separate third-party library that performs this > functionality. > > ID Internal API (grouped by package) Used By Identify External > Replacement > > > ------------------------------------------------------------------------ > > > On 24/09/2014 08:55, Rory O'Donnell Oracle, Dublin Ireland wrote: >> Hi Galder, >> >> As part of the preparations for JDK 9, Oracle?s engineers have been >> analyzing open source projects like yours to understand usage. One >> area of concern involves identifying compatibility problems, such as >> reliance on JDK-internal APIs. >> >> Our engineers have already prepared guidance on migrating some of the >> more common usage patterns of JDK-internal APIs to supported public >> interfaces. The list is on the OpenJDK wiki [0], along with >> instructions on how to run the jdeps analysis tool yourself . >> >> As part of the ongoing development of JDK 9, I would like to >> encourage migration from JDK-internal APIs towards the supported Java >> APIs. I have prepared a report for your project rele ase >> infinispan-6.0.2 based on the jdeps output. >> >> The report is attached to this e-mail. >> >> For anything where your migration path is unclear, I would appreciate >> comments on the JDK-internal API usage patterns in the attached jdeps >> report - in particular comments elaborating on the rationale for them >> - either to me or on this mailing list. >> >> Finding suitable replacements for unsupported interfaces is not >> always straightforward, which is why I am reaching out to you early >> in the JDK 9 development cycle so you can give feedback about new >> APIs that may be needed to facilitate this exercise. >> >> Thank you in advance for any efforts and feedback helping us make JDK >> 9 better. >> >> Rgds,Rory >> >> [0] >> https://wiki.openjdk.java.net/display/JDK8/Java+Dependency+Analysis+Tool >> >> >> -- >> Rgds,Rory O'Donnell >> Quality Engineering Manager >> Oracle EMEA , Dublin, Ireland >> >> >> >> > > -- > Rgds,Rory O'Donnell > Quality Engineering Manager > Oracle EMEA , Dublin, Ireland > > > > -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/68d38cac/attachment-0001.html From dan.berindei at gmail.com Fri Oct 10 07:37:00 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 10 Oct 2014 14:37:00 +0300 Subject: [infinispan-dev] Analysis of infinispan-6.0.2 dependency on JDK-Internal APIs In-Reply-To: <5437AE80.6050302@oracle.com> References: <54227901.8020509@oracle.com> <54227EF9.6090403@oracle.com> <5437AE80.6050302@oracle.com> Message-ID: Hi Rory Galder is on PTO for another week, so I'll try to answer instead. We only use sun.misc.Unsafe directly, in order to implement a variation of Doug Lea's ConcurrentHashMapV8 that accepts a custom Equivalence (implementation of equality/hashCode). I guess we'll have to switch to AtomicFieldUpdaters if we want it to work with JDK 9, and possibly move to the volatile extensions once they are implemented. The rest of the internal class usages seem to be from our dependencies on WildFly, JBoss Marshalling, LevelDB, Smooks, and JBoss MicroContainer. Smooks and JBoss MicroContainer likely won't see any updates for JDK 9, but they're only used in the demos so they're not critical. JBoss Marshalling is used in the core, however, so we'll need a release from them before we can run anything on JDK 9. Cheers Dan On Fri, Oct 10, 2014 at 1:01 PM, Rory O'Donnell Oracle, Dublin Ireland < rory.odonnell at oracle.com> wrote: > Hi Galder, > > Did you have time to review the report, any feedback ? > > Rgds,Rory > > On 24/09/2014 09:21, Rory O'Donnell Oracle, Dublin Ireland wrote: > > Below is a text output of the report for infinispan-6.0.2. > > Rgds,Rory > > ------------------------------ > JDK Internal API Usage Report for infinispan-6.0.2.Final-all > > The OpenJDK Quality Outreach campaign has run a compatibility report to > identify usage of JDK-internal APIs. Usage of these JDK-internal APIs could > pose compatibility issues, as the Java team explained in 1996 > . We > have created this report to help you identify which JDK-internal APIs your > project uses, what to use instead, and where those changes should go. > Making these changes will improve your compatibility, and in some cases > give better performance. > > Migrating away from the JDK-internal APIs now will give your team adequate > time for testing before the release of JDK 9. If you are unable to migrate > away from an internal API, please provide us with an explanation below to > help us understand it better. As a reminder, supported APIs are determined > by the OpenJDK's Java Community Process and not by Oracle. > > This report was generated by jdeps > > through static analysis of artifacts: it does not identify any usage of > those APIs through reflection or dynamic bytecode. You may also run jdeps > on your own > > if you would prefer. > > Summary of the analysis of the jar files within > infinispan-6.0.2.Final-all: > > - Numer of jar files depending on JDK-internal APIs: 10 > - Internal APIs that have known replacements: 0 > - Internal APIs that have no supported replacements: 73 > > APIs that have known replacements > > : ID Replace Usage of With Inside JDK-internal APIs without supported > replacements: ID Internal APIs (do not use) Used by 1 > com.sun.org.apache.xml.internal.utils.PrefixResolver > > - lib/freemarker-2.3.11.jar > > Explanation... 2 com.sun.org.apache.xpath.internal.XPath > > - lib/freemarker-2.3.11.jar > > Explanation... 3 com.sun.org.apache.xpath.internal.XPathContext > > - lib/freemarker-2.3.11.jar > > Explanation... 
> [... items 4-42 of the quoted report snipped; identical to the list quoted in full in Rory's message above ...]
43 org.w3c.dom.html.HTMLMetaElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 44 org.w3c.dom.html.HTMLModElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 45 org.w3c.dom.html.HTMLOListElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 46 org.w3c.dom.html.HTMLObjectElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 47 org.w3c.dom.html.HTMLOptGroupElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 48 org.w3c.dom.html.HTMLOptionElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 49 org.w3c.dom.html.HTMLParagraphElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 50 org.w3c.dom.html.HTMLParamElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 51 org.w3c.dom.html.HTMLPreElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 52 org.w3c.dom.html.HTMLQuoteElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 53 org.w3c.dom.html.HTMLScriptElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 54 org.w3c.dom.html.HTMLSelectElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 55 org.w3c.dom.html.HTMLStyleElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 56 org.w3c.dom.html.HTMLTableCaptionElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 57 org.w3c.dom.html.HTMLTableCellElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 58 org.w3c.dom.html.HTMLTableColElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 59 org.w3c.dom.html.HTMLTableElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 60 org.w3c.dom.html.HTMLTableRowElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 61 org.w3c.dom.html.HTMLTableSectionElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 62 org.w3c.dom.html.HTMLTextAreaElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 63 org.w3c.dom.html.HTMLTitleElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 64 org.w3c.dom.html.HTMLUListElement > > - lib/xercesImpl-2.9.1.jar > > Explanation... 65 org.w3c.dom.ranges.DocumentRange > > - lib/xercesImpl-2.9.1.jar > > Explanation... 66 org.w3c.dom.ranges.Range > > - lib/xercesImpl-2.9.1.jar > > Explanation... 67 org.w3c.dom.ranges.RangeException > > - lib/xercesImpl-2.9.1.jar > > Explanation... 68 sun.misc.Signal > > - lib/aesh-0.33.7.jar > > Explanation... 69 sun.misc.SignalHandler > > - lib/aesh-0.33.7.jar > > Explanation... 70 sun.misc.Unsafe > > - lib/avro-1.7.5.jar > - lib/guava-12.0.jar > - lib/infinispan-commons-6.0.2.Final.jar > - lib/mvel2-2.0.12.jar > - lib/scala-library-2.10.2.jar > > Explanation... 71 sun.nio.ch.FileChannelImpl > > - lib/leveldb-0.5.jar > > Explanation... 72 sun.reflect.ReflectionFactory > > - lib/jboss-marshalling-1.4.4.Final.jar > > Explanation... 73 > sun.reflect.ReflectionFactory$GetReflectionFactoryAction > > - lib/jboss-marshalling-1.4.4.Final.jar > > Explanation... Identify External Replacements > > You should use a separate third-party library that performs this > functionality. > ID Internal API (grouped by package) Used By Identify External > Replacement > ------------------------------ > > > On 24/09/2014 08:55, Rory O'Donnell Oracle, Dublin Ireland wrote: > > Hi Galder, > > As part of the preparations for JDK 9, Oracle?s engineers have been > analyzing open source projects like yours to understand usage. One area of > concern involves identifying compatibility problems, such as reliance on > JDK-internal APIs. > > Our engineers have already prepared guidance on migrating some of the more > common usage patterns of JDK-internal APIs to supported public interfaces. 
> The list is on the OpenJDK wiki [0], along with instructions on how to run > the jdeps analysis tool yourself . > > As part of the ongoing development of JDK 9, I would like to encourage > migration from JDK-internal APIs towards the supported Java APIs. I have > prepared a report for your project rele ase infinispan-6.0.2 based on the > jdeps output. > > The report is attached to this e-mail. > > For anything where your migration path is unclear, I would appreciate > comments on the JDK-internal API usage patterns in the attached jdeps > report - in particular comments elaborating on the rationale for them - > either to me or on this mailing list. > > Finding suitable replacements for unsupported interfaces is not always > straightforward, which is why I am reaching out to you early in the JDK 9 > development cycle so you can give feedback about new APIs that may be > needed to facilitate this exercise. > > Thank you in advance for any efforts and feedback helping us make JDK 9 > better. > > Rgds,Rory > > [0] > https://wiki.openjdk.java.net/display/JDK8/Java+Dependency+Analysis+Tool > > > -- > Rgds,Rory O'Donnell > Quality Engineering Manager > Oracle EMEA , Dublin, Ireland > > > > > > > -- > Rgds,Rory O'Donnell > Quality Engineering Manager > Oracle EMEA , Dublin, Ireland > > > > > > > -- > Rgds,Rory O'Donnell > Quality Engineering Manager > Oracle EMEA , Dublin, Ireland > > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/8dbe5058/attachment-0001.html From rory.odonnell at oracle.com Fri Oct 10 08:26:33 2014 From: rory.odonnell at oracle.com (Rory O'Donnell Oracle, Dublin Ireland) Date: Fri, 10 Oct 2014 13:26:33 +0100 Subject: [infinispan-dev] Analysis of infinispan-6.0.2 dependency on JDK-Internal APIs In-Reply-To: References: <54227901.8020509@oracle.com> <54227EF9.6090403@oracle.com> <5437AE80.6050302@oracle.com> Message-ID: <5437D079.5090902@oracle.com> Hi Dan, Thank you for the feedback. Rgds,Rory On 10/10/2014 12:37, Dan Berindei wrote: > Hi Rory > > Galder is on PTO for another week, so I'll try to answer instead. > > We only use sun.misc.Unsafe directly, in order to implement a > variation of Doug Lea's ConcurrentHashMapV8 that accepts a custom > Equivalence (implementation of equality/hashCode). I guess we'll have > to switch to AtomicFieldUpdaters if we want it to work with JDK 9, and > possibly move to the volatile extensions once they are implemented. > > The rest of the internal class usages seem to be from our dependencies > on WildFly, JBoss Marshalling, LevelDB, Smooks, and JBoss > MicroContainer. Smooks and JBoss MicroContainer likely won't see any > updates for JDK 9, but they're only used in the demos so they're not > critical. JBoss Marshalling is used in the core, however, so we'll > need a release from them before we can run anything on JDK 9. > > Cheers > Dan > > > On Fri, Oct 10, 2014 at 1:01 PM, Rory O'Donnell Oracle, Dublin Ireland > > wrote: > > Hi Galder, > > Did you have time to review the report, any feedback ? > > Rgds,Rory > > On 24/09/2014 09:21, Rory O'Donnell Oracle, Dublin Ireland wrote: >> Below is a text output of the report for infinispan-6.0.2. 
>> >> Rgds,Rory >> >> ------------------------------------------------------------------------ >> >> >> JDK Internal API Usage Report for infinispan-6.0.2.Final-all >> >> The OpenJDK Quality Outreach campaign has run a compatibility >> report to identify usage of JDK-internal APIs. Usage of these >> JDK-internal APIs could pose compatibility issues, as the Java >> team explained in 1996 >> . >> We have created this report to help you identify which >> JDK-internal APIs your project uses, what to use instead, and >> where those changes should go. Making these changes will improve >> your compatibility, and in some cases give better performance. >> >> Migrating away from the JDK-internal APIs now will give your team >> adequate time for testing before the release of JDK 9. If you are >> unable to migrate away from an internal API, please provide us >> with an explanation below to help us understand it better. As a >> reminder, supported APIs are determined by the OpenJDK's Java >> Community Process and not by Oracle. >> >> This report was generated by jdeps >> >> through static analysis of artifacts: it does not identify any >> usage of those APIs through reflection or dynamic bytecode. You >> may also run jdeps on your own >> >> if you would prefer. >> >> Summary of the analysis of the jar files within >> infinispan-6.0.2.Final-all: >> >> * Numer of jar files depending on JDK-internal APIs: 10 >> * Internal APIs that have known replacements: 0 >> * Internal APIs that have no supported replacements: 73 >> >> >> APIs that have known replacements >> : >> >> ID Replace Usage of With Inside >> >> >> JDK-internal APIs without supported replacements: >> >> ID Internal APIs (do not use) Used by >> 1 com.sun.org.apache.xml.internal.utils.PrefixResolver >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 2 com.sun.org.apache.xpath.internal.XPath >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 3 com.sun.org.apache.xpath.internal.XPathContext >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 4 com.sun.org.apache.xpath.internal.objects.XBoolean >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 5 com.sun.org.apache.xpath.internal.objects.XNodeSet >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 6 com.sun.org.apache.xpath.internal.objects.XNull >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 7 com.sun.org.apache.xpath.internal.objects.XNumber >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 8 com.sun.org.apache.xpath.internal.objects.XObject >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 9 com.sun.org.apache.xpath.internal.objects.XString >> >> * lib/freemarker-2.3.11.jar >> >> Explanation... >> 10 org.w3c.dom.html.HTMLAnchorElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 11 org.w3c.dom.html.HTMLAppletElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 12 org.w3c.dom.html.HTMLAreaElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 13 org.w3c.dom.html.HTMLBRElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 14 org.w3c.dom.html.HTMLBaseElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 15 org.w3c.dom.html.HTMLBaseFontElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 16 org.w3c.dom.html.HTMLBodyElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 17 org.w3c.dom.html.HTMLButtonElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 18 org.w3c.dom.html.HTMLCollection >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... 
>> [... items 19-55 of the quoted report snipped; identical to the list quoted in full earlier in the thread ...]
>> 56 org.w3c.dom.html.HTMLTableCaptionElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 57 org.w3c.dom.html.HTMLTableCellElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 58 org.w3c.dom.html.HTMLTableColElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 59 org.w3c.dom.html.HTMLTableElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 60 org.w3c.dom.html.HTMLTableRowElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 61 org.w3c.dom.html.HTMLTableSectionElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 62 org.w3c.dom.html.HTMLTextAreaElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 63 org.w3c.dom.html.HTMLTitleElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 64 org.w3c.dom.html.HTMLUListElement >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 65 org.w3c.dom.ranges.DocumentRange >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 66 org.w3c.dom.ranges.Range >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 67 org.w3c.dom.ranges.RangeException >> >> * lib/xercesImpl-2.9.1.jar >> >> Explanation... >> 68 sun.misc.Signal >> >> * lib/aesh-0.33.7.jar >> >> Explanation... >> 69 sun.misc.SignalHandler >> >> * lib/aesh-0.33.7.jar >> >> Explanation... >> 70 sun.misc.Unsafe >> >> * lib/avro-1.7.5.jar >> * lib/guava-12.0.jar >> * lib/infinispan-commons-6.0.2.Final.jar >> * lib/mvel2-2.0.12.jar >> * lib/scala-library-2.10.2.jar >> >> Explanation... >> 71 sun.nio.ch.FileChannelImpl >> >> * lib/leveldb-0.5.jar >> >> Explanation... >> 72 sun.reflect.ReflectionFactory >> >> * lib/jboss-marshalling-1.4.4.Final.jar >> >> Explanation... >> 73 sun.reflect.ReflectionFactory$GetReflectionFactoryAction >> >> * lib/jboss-marshalling-1.4.4.Final.jar >> >> Explanation... >> >> >> Identify External Replacements >> >> You should use a separate third-party library that performs this >> functionality. >> >> ID Internal API (grouped by package) Used By Identify External >> Replacement >> >> >> ------------------------------------------------------------------------ >> >> >> On 24/09/2014 08:55, Rory O'Donnell Oracle, Dublin Ireland wrote: >>> Hi Galder, >>> >>> As part of the preparations for JDK 9, Oracle?s engineers have >>> been analyzing open source projects like yours to understand >>> usage. One area of concern involves identifying compatibility >>> problems, such as reliance on JDK-internal APIs. >>> >>> Our engineers have already prepared guidance on migrating some >>> of the more common usage patterns of JDK-internal APIs to >>> supported public interfaces. The list is on the OpenJDK wiki >>> [0], along with instructions on how to run the jdeps analysis >>> tool yourself . >>> >>> As part of the ongoing development of JDK 9, I would like to >>> encourage migration from JDK-internal APIs towards the supported >>> Java APIs. I have prepared a report for your project rele ase >>> infinispan-6.0.2 based on the jdeps output. >>> >>> The report is attached to this e-mail. >>> >>> For anything where your migration path is unclear, I would >>> appreciate comments on the JDK-internal API usage patterns in >>> the attached jdeps report - in particular comments elaborating >>> on the rationale for them - either to me or on this mailing list. 
>>> >>> Finding suitable replacements for unsupported interfaces is not >>> always straightforward, which is why I am reaching out to you >>> early in the JDK 9 development cycle so you can give feedback >>> about new APIs that may be needed to facilitate this exercise. >>> >>> Thank you in advance for any efforts and feedback helping us >>> make JDK 9 better. >>> >>> Rgds,Rory >>> >>> [0] >>> https://wiki.openjdk.java.net/display/JDK8/Java+Dependency+Analysis+Tool >>> >>> >>> >>> -- >>> Rgds,Rory O'Donnell >>> Quality Engineering Manager >>> Oracle EMEA , Dublin, Ireland >>> >>> >>> >>> >> >> -- >> Rgds,Rory O'Donnell >> Quality Engineering Manager >> Oracle EMEA , Dublin, Ireland >> >> >> >> > > -- > Rgds,Rory O'Donnell > Quality Engineering Manager > Oracle EMEA , Dublin, Ireland > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/af8ed8aa/attachment-0001.html From mudokonman at gmail.com Fri Oct 10 08:38:04 2014 From: mudokonman at gmail.com (William Burns) Date: Fri, 10 Oct 2014 08:38:04 -0400 Subject: [infinispan-dev] About size() In-Reply-To: <5435560E.2030206@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: > Users expect that size() will be constant-time (or linear to cluster > size), and generally fast operation. I'd prefer to keep it that way. > Though, even the MR way (used for HotRod size() now) needs to crawl > through all the entries locally. Many in memory collections require O(n) to do size such as ConcurrentLinkedQueue, so I wouldn't say size should always be expected to be constant time or O(c) where c is # of nodes. Granted a user can expect anything they want. > > 'Heretic, not very well though of and changing too many things' idea: > what about having data container segment-aware? Then you'd just bcast > SizeCommand with given topologyId and sum up sizes of primary-owned > segments... It's not a complete solution, but at least that would enable > to get the number of locally owned entries quite fast. Though, you can't > do that easily with cache stores (without changing SPI). > > Regarding cache stores, IMO we're damned anyway: when calling > cacheStore.size(), it can report more entries as those haven't been > expired yet, it can report less entries as those can be expired due to > [1]. Or, we'll enumerate all the entries, and that's going to be slow > (btw., [1] reminded me that we should enumerate both datacontainer AND > cachestores even if passivation is not enabled). This is precisely what the distributed iterator does. And also support for expired entries was recently integrated as I missed that in the original implementation [a] [a] https://issues.jboss.org/browse/ISPN-4643 > > Radim > > [1] https://issues.jboss.org/browse/ISPN-3202 > > On 10/08/2014 04:42 PM, William Burns wrote: >> So it seems we would want to change this for 7.0 if possible since it >> would be a bigger change for something like 7.1 and 8.0 would be even >> further out. I should be able to put this together for CR2. 
>> >> It seems that we want to implement keySet, values and entrySet methods >> using the entry iterator approach. >> >> It is however unclear for the size method if we want to use MR entry >> counting and not worry about the rehash and passivation issues since >> it is just an estimation anyways. Or if we want to also use the entry >> iterator which should be closer approximation but will require more >> network overhead and memory usage. >> >> Also we didn't really talk about the fact that these methods would >> ignore ongoing transactions and if that is a concern or not. >> >> - Will >> >> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>> >>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>> >>>>> Hi, >>>>> >>>>> recently we had a discussion about what size() returns, but I've >>>>> realized there are more things that users would like to know. My >>>>> question is whether you think that they would really appreciate it, or >>>>> whether it's just my QA point of view where I sometimes compute the >>>>> 'checksums' of cache to see if I didn't lost anything. >>>>> >>>>> There are those sizes: >>>>> A) number of owned entries >>>>> B) number of entries stored locally in memory >>>>> C) number of entries stored in each local cache store >>>>> D) number of entries stored in each shared cache store >>>>> E) total number of entries in cache >>>>> >>>>> So far, we can get >>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>> E via distributed iterators / MR >>>>> A via data container iteration + distribution manager query, but only >>>>> without cache store >>>>> C or D through >>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>> >>>>> I think that it would go along with users' expectations if size() >>>>> returned E and for the rest we should have special methods on >>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>> I'd say that finally to something that has firm meaning. >>>>> >>>>> WDYT? >>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>> - they are approximate (data changes during iteration) >>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>> >>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>>> >>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>> Yes, that's what I meant as well. 
>>> >>> Cheers, >>> -- >>> Mircea Markus >>> Infinispan lead (www.infinispan.org) >>> >>> >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Fri Oct 10 09:53:49 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 10 Oct 2014 15:53:49 +0200 Subject: [infinispan-dev] Clustering standalone Infinispan w/ WF running Infinispan In-Reply-To: <5437E23E.4050508@jboss.com> References: <54257C74.2050600@redhat.com> <1898996727.37561681.1411744742097.JavaMail.zimbra@redhat.com> <54297D4F.9060009@jboss.com> <133530692.31414088.1412009840434.JavaMail.zimbra@redhat.com> <542A5583.4030908@redhat.com> <851849146.38574524.1412064092251.JavaMail.zimbra@redhat.com> <542D5143.3070006@redhat.com> <5437E23E.4050508@jboss.com> Message-ID: <5437E4ED.6020109@redhat.com> Markdown chewed on my markup :) https://raw.githubusercontent.com/tristantarrant/infinispan-playground-hybrid/master/README.md On 10/10/14 15:42, Kurt T Stam wrote: > Hi Tristan, > > I'm trying to follow your instructions but am I bit confused by the > following: > > "You will also need to modify the following file: > > modules/system/layers/base/org/jboss/as/clustering/infinispan/main/module.xml > > > by adding the following line to its dependencies:" > > What do I have to add? > > Thx, > > --Kurt > > > > On 10/2/14, 9:21 AM, Tristan Tarrant wrote: >> I have successfully created a "hybrid" cluster between an application >> using Infinispan in embedded mode and an Infinispan server by doing >> the following on the embedded side: >> >> - use a JGroups Channel wrapped in a MuxHandler >> - use a custom class resolver which simulates (or rather... hacks) >> the behaviour of the ModularClassResolver when not using modules >> >> You can find the code at my personal GitHub repo: >> >> https://github.com/tristantarrant/infinispan-playground/tree/master/src/main/java/net/dataforte/infinispan/hybrid >> >> >> suggestions and improvements are welcome. >> >> Tristan >> >> On 30/09/14 10:01, Stelios Koussouris wrote: >>> Hi, >>> >>> To give a bit of context on this. We are doing a POC where the >>> customer wishes to utilize JDG to speed up their application. We >>> need (due to some customer requirements) to cluster >>> EMBEDDED JDG (infinispan library mode) with REMOTE JDG (Infinispan >>> Server) nodes. The infinispan jars should be the same as they are >>> only libraries and they >>> are on the same version. However, during "clustering" of the caches >>> we started seeing errors which looked like there were due to the >>> fact that the clustering of the caches contained different >>> info between the 2 types of cache instantiation (embedded vs server). 
>>> >>> The result was to for a suggestion to create our own MuxChannel (I >>> don't know if we have any other alternatives at this stage to >>> cluster embedded with server infinispan caches) but at the moment we >>> are facing https://gist.github.com/skoussou/5edc5689446b67f85ae8 >>> >>> Regards, >>> >>> Stylianos Kousouris >>> Red Hat Middleware Consultant >>> >>> ----- Original Message ----- >>> From: "Tristan Tarrant" >>> To: "infinispan -Dev List" , "Kurt T >>> Stam" >>> Cc: "Stelios Koussouris" , "Richard >>> Achmatowicz" >>> Sent: Tuesday, 30 September, 2014 8:02:27 AM >>> Subject: Re: [infinispan-dev] Clustering standalone Infinispan w/ WF >>> running Infinispan >>> >>> I don't know what Kurt is doing, but Stelios is attempting to >>> cluster an >>> application using embedded Infinispan deployed within WF together with >>> an Infinispan Server instance. >>> The application is managing its own caches, and therefore it is not >>> interacting with the underlying Infinispan and JGroups subsystems in >>> WF. >>> Infinispan Server uses its Infinispan and JGroups subsystems (which are >>> forked from WF's) and therefore are using MuxChannels. >>> >>> I told Stelios to use a MuxChannel-wrapped Channel in his application >>> and it solved part of the issue (he was initially importing the one >>> included in the WF's jgroups subsystem, but now he's using his local >>> copy), but now he has run into further problems and I believe what Paul >>> & Dennis have written might be correct. >>> >>> The code that configures this is in >>> EmbeddedCacheManagerConfigurationService: >>> >>> GlobalConfigurationBuilder builder = new GlobalConfigurationBuilder(); >>> ModuleLoader moduleLoader = this.dependencies.getModuleLoader(); >>> builder.serialization().classResolver(ModularClassResolver.getInstance(moduleLoader)); >>> >>> >>> I don't know how you'd get a ModuleLoader from within a WF deployment, >>> but I'm sure it can be done. >>> >>> Tristan >>> >>> On 29/09/14 18:57, Paul Ferraro wrote: >>>> You should not need to use a MuxChannel. This would only be >>>> necessary if there are other EAP services sharing the channel. >>>> Using a MuxChannel allows your standalone Infinispan instance to >>>> filter these irrelevant messages. However, in JDG, there should be >>>> no other services other than Infinispan using the channel - hence >>>> the MuxChannel stuff is unnecessary. >>>> >>>> I think Dennis earlier response was spot on. EAP/JDG configures >>>> it's cache managers using a ModularClassResolver (which includes a >>>> module name along with the class name when marshalling). Your >>>> standalone Infinispan instances do not use this and therefore >>>> cannot make sense of the message body. >>>> >>>> Paul >>>> >>>> ----- Original Message ----- >>>>> From: "Kurt T Stam" >>>>> To: "Stelios Koussouris" , "Radoslav Husar" >>>>> >>>>> Cc: "Galder Zamarre?o" , "Paul Ferraro" >>>>> , "Richard Achmatowicz" >>>>> , "infinispan -Dev List" >>>>> >>>>> Sent: Monday, September 29, 2014 11:39:59 AM >>>>> Subject: Re: Clustering standalone Infinispan w/ WF running >>>>> Infinispan >>>>> >>>>> Thanks for following up Stelios, I think Galder is traveling the >>>>> next 2 >>>>> weeks. >>>>> >>>>> So - do we need fixes on both ends then so that the boot order >>>>> does not >>>>> matter? In which project(s) would we apply >>>>> there changes? Or can they be applied in the end-user's code? 
>>>>> >>>>> Thx, >>>>> >>>>> --Kurt >>>>> >>>>> >>>>> >>>>> On 9/26/14, 11:19 AM, Stelios Koussouris wrote: >>>>>> Hi, >>>>>> >>>>>> Rado: It is both ways. ie. if I start first the JDG Server I get >>>>>> the issue >>>>>> on the library mode side when I start that one. If reverse the >>>>>> order of >>>>>> startup I get it in the JDG Server side. >>>>>> >>>>>> Question: >>>>>> ----------------------------------------------------------------------------------------------------------------------- >>>>>> >>>>>> ...IMO the channel needs to be wrapped as >>>>>> org.jboss.as.clustering.jgroups.MuxChannel before passing to >>>>>> infinispan. >>>>>> ... >>>>>> ----------------------------------------------------------------------------------------------------------------------- >>>>>> >>>>>> For now that this is not being done. If I wanted to do it >>>>>> manually on the >>>>>> library side where I can create the protocol programmatically we are >>>>>> talking about something like this? >>>>>> >>>>>> ProtocolStackConfigurator configurator = >>>>>> ConfiguratorFactory.getStackConfigurator("jgroups-udp.xml"); >>>>>> MuxChannel channel = new MuxChannel(configurator); >>>>>> org.infinispan.remoting.transport.Transport transport = new >>>>>> org.infinispan.remoting.transport.jgroups.JGroupsTransport(channel); >>>>>> >>>>>> .... >>>>>> then replace the below >>>>>> new >>>>>> GlobalConfigurationBuilder().clusteredDefault().globalJmxStatistics().cacheManagerName("RDSCacheManager").allowDuplicateDomains(true).enable().transport().clusterName("UDM-CLUSTER").addProperty("configurationFile", >>>>>> >>>>>> "jgroups-udp.xml") >>>>>> WITH >>>>>> new >>>>>> GlobalConfigurationBuilder().clusteredDefault().globalJmxStatistics().cacheManagerName("RDSCacheManager").allowDuplicateDomains(true).enable().transport(Transport).clusterName("UDM-CLUSTER") >>>>>> >>>>>> >>>>>> Btw, someone mentioned that if I follow this method I need to to >>>>>> know the >>>>>> assigned mux ids, but that is not quite clear what it means with >>>>>> regards >>>>>> to the JGroupsTransport configuration >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Stylianos Kousouris >>>>>> Red Hat Middleware Consultant >>>>>> >>>>>> ----- Original Message ----- >>>>>> From: "Radoslav Husar" >>>>>> To: "Galder Zamarre?o" , "Paul Ferraro" >>>>>> >>>>>> Cc: "Richard Achmatowicz" , "infinispan -Dev >>>>>> List" >>>>>> , "Stelios Koussouris" >>>>>> , "Kurt T Stam" >>>>>> Sent: Friday, 26 September, 2014 3:47:16 PM >>>>>> Subject: Re: Clustering standalone Infinispan w/ WF running >>>>>> Infinispan >>>>>> >>>>>> From what Stelios is telling me the question is a little bit >>>>>> other way >>>>>> round: he is using library mode infinispan and jgroups in EAP and >>>>>> connecting to JDG. So the question is what JDG is doing with the >>>>>> stack, >>>>>> not AS/WF as its infinispan/jgroups subsystem is not used. >>>>>> >>>>>> Unfortunately I don't have access to the JDG repo so I don't know >>>>>> what >>>>>> changes have been made there but if you are using the same jgroups >>>>>> logic, IMO the channel needs to be wrapped as >>>>>> org.jboss.as.clustering.jgroups.MuxChannel before passing to >>>>>> infinispan. >>>>>> >>>>>> Rado >>>>>> >>>>>> On 26/09/14 15:03, Galder Zamarre?o wrote: >>>>>>> Hey Paul, >>>>>>> >>>>>>> In the last couple of days, a couple of people have encountered the >>>>>>> exception in [1] when trying to cluster a standalone Infinispan >>>>>>> app with >>>>>>> its own JGroups configuration file with a AS/WF running >>>>>>> Infinispan cache. 
>>>>>>> >>>>>>> From my POV, 3 possible causes: >>>>>>> >>>>>>> 1. Dependency mismatches between AS/WF and the standalone app. >>>>>>> Having done >>>>>>> some quick study of Kurt's case, apart from micro version >>>>>>> changes, all >>>>>>> looks good. >>>>>>> >>>>>>> 2. Mismatch in the Infinispan and/or JGroups configuration file. >>>>>>> >>>>>>> 3. AS/WF puts something on the clustered wire that standalone >>>>>>> Infinispan >>>>>>> does not expect. Are you still doing multiplexing? Could you be >>>>>>> adding >>>>>>> extra info to the wire? >>>>>>> >>>>>>> With this email, I'm trying to get some clarification from you >>>>>>> on whether the >>>>>>> issue could be due to the 3rd option. If it's either of the first >>>>>>> two, it's a >>>>>>> matter of digging and finding the difference, but if it's the 3rd >>>>>>> one, it's >>>>>>> more problematic. >>>>>>> >>>>>>> Any ideas? >>>>>>> >>>>>>> [1] https://gist.github.com/skoussou/92f062f2d0bd17168e01 >>>>>>> -- >>>>>>> Galder Zamarreño >>>>>>> galder at redhat.com >>>>>>> twitter.com/galderz >>>>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> > From rvansa at redhat.com Fri Oct 10 10:03:16 2014 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 10 Oct 2014 16:03:16 +0200 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> Message-ID: <5437E724.4040705@redhat.com> On 10/10/2014 02:38 PM, William Burns wrote: > On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: >> Users expect that size() will be a constant-time (or linear to cluster >> size), generally fast operation. I'd prefer to keep it that way. >> Though, even the MR way (used for HotRod size() now) needs to crawl >> through all the entries locally. > Many in-memory collections require O(n) to do size, such as > ConcurrentLinkedQueue, so I wouldn't say size should always be > expected to be constant time or O(c) where c is # of nodes. Granted, a > user can expect anything they want. OK, I stand corrected. Moreover, I was generalizing myself to all users, a common mistake :) Anyway, monitoring tools love nice charts, and I can imagine monitoring software polling every second to update that cool chart with cache size. Do we want a fast but imprecise variant of this operation in some statistics class? Radim > >> 'Heretic, not very well thought of and changing too many things' idea: >> what about having the data container segment-aware? Then you'd just bcast >> SizeCommand with a given topologyId and sum up sizes of primary-owned >> segments... It's not a complete solution, but at least that would enable >> getting the number of locally owned entries quite fast. Though, you can't >> do that easily with cache stores (without changing the SPI). >> >> Regarding cache stores, IMO we're damned anyway: when calling >> cacheStore.size(), it can report more entries, as some haven't been >> expired yet, or it can report fewer entries, as some can be expired due to >> [1]. Or, we'll enumerate all the entries, and that's going to be slow >> (btw., [1] reminded me that we should enumerate both the datacontainer AND >> the cachestores even if passivation is not enabled). > This is precisely what the distributed iterator does.
And also > support for expired entries was recently integrated as I missed that > in the original implementation [a] > > [a] https://issues.jboss.org/browse/ISPN-4643 > >> Radim >> >> [1] https://issues.jboss.org/browse/ISPN-3202 >> >> On 10/08/2014 04:42 PM, William Burns wrote: >>> So it seems we would want to change this for 7.0 if possible since it >>> would be a bigger change for something like 7.1 and 8.0 would be even >>> further out. I should be able to put this together for CR2. >>> >>> It seems that we want to implement keySet, values and entrySet methods >>> using the entry iterator approach. >>> >>> It is however unclear for the size method if we want to use MR entry >>> counting and not worry about the rehash and passivation issues since >>> it is just an estimation anyways. Or if we want to also use the entry >>> iterator which should be closer approximation but will require more >>> network overhead and memory usage. >>> >>> Also we didn't really talk about the fact that these methods would >>> ignore ongoing transactions and if that is a concern or not. >>> >>> - Will >>> >>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>>> >>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> recently we had a discussion about what size() returns, but I've >>>>>> realized there are more things that users would like to know. My >>>>>> question is whether you think that they would really appreciate it, or >>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>> >>>>>> There are those sizes: >>>>>> A) number of owned entries >>>>>> B) number of entries stored locally in memory >>>>>> C) number of entries stored in each local cache store >>>>>> D) number of entries stored in each shared cache store >>>>>> E) total number of entries in cache >>>>>> >>>>>> So far, we can get >>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>> E via distributed iterators / MR >>>>>> A via data container iteration + distribution manager query, but only >>>>>> without cache store >>>>>> C or D through >>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>> >>>>>> I think that it would go along with users' expectations if size() >>>>>> returned E and for the rest we should have special methods on >>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>> I'd say that finally to something that has firm meaning. >>>>>> >>>>>> WDYT? >>>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>>> - they are approximate (data changes during iteration) >>>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>>> >>>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. 
everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>>>> >>>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>>> Yes, that's what I meant as well. >>>> >>>> Cheers, >>>> -- >>>> Mircea Markus >>>> Infinispan lead (www.infinispan.org) >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa >> JBoss DataGrid QA >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss DataGrid QA From ttarrant at redhat.com Fri Oct 10 10:06:18 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 10 Oct 2014 16:06:18 +0200 Subject: [infinispan-dev] About size() In-Reply-To: <5437E724.4040705@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> Message-ID: <5437E7DA.6060101@redhat.com> What's wrong with sum(Datacontainer.size())/numOwners ? Tristan On 10/10/14 16:03, Radim Vansa wrote: > On 10/10/2014 02:38 PM, William Burns wrote: >> On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: >>> Users expect that size() will be constant-time (or linear to cluster >>> size), and generally fast operation. I'd prefer to keep it that way. >>> Though, even the MR way (used for HotRod size() now) needs to crawl >>> through all the entries locally. >> Many in memory collections require O(n) to do size such as >> ConcurrentLinkedQueue, so I wouldn't say size should always be >> expected to be constant time or O(c) where c is # of nodes. Granted a >> user can expect anything they want. > OK, I stand corrected. Moreover, I was generalizing myself to all users, > a common mistake :) > > Anyway, monitoring tools love nice charts, and I can imagine monitoring > software polling every 1 second to update that cool chart with cache > size. Do we want a fast but imprecise variant of this operation in some > statistics class? > > Radim > >>> 'Heretic, not very well though of and changing too many things' idea: >>> what about having data container segment-aware? Then you'd just bcast >>> SizeCommand with given topologyId and sum up sizes of primary-owned >>> segments... It's not a complete solution, but at least that would enable >>> to get the number of locally owned entries quite fast. Though, you can't >>> do that easily with cache stores (without changing SPI). 
>>> >>> Regarding cache stores, IMO we're damned anyway: when calling >>> cacheStore.size(), it can report more entries as those haven't been >>> expired yet, it can report less entries as those can be expired due to >>> [1]. Or, we'll enumerate all the entries, and that's going to be slow >>> (btw., [1] reminded me that we should enumerate both datacontainer AND >>> cachestores even if passivation is not enabled). >> This is precisely what the distributed iterator does. And also >> support for expired entries was recently integrated as I missed that >> in the original implementation [a] >> >> [a] https://issues.jboss.org/browse/ISPN-4643 >> >>> Radim >>> >>> [1] https://issues.jboss.org/browse/ISPN-3202 >>> >>> On 10/08/2014 04:42 PM, William Burns wrote: >>>> So it seems we would want to change this for 7.0 if possible since it >>>> would be a bigger change for something like 7.1 and 8.0 would be even >>>> further out. I should be able to put this together for CR2. >>>> >>>> It seems that we want to implement keySet, values and entrySet methods >>>> using the entry iterator approach. >>>> >>>> It is however unclear for the size method if we want to use MR entry >>>> counting and not worry about the rehash and passivation issues since >>>> it is just an estimation anyways. Or if we want to also use the entry >>>> iterator which should be closer approximation but will require more >>>> network overhead and memory usage. >>>> >>>> Also we didn't really talk about the fact that these methods would >>>> ignore ongoing transactions and if that is a concern or not. >>>> >>>> - Will >>>> >>>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>>>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>>>> >>>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>> realized there are more things that users would like to know. My >>>>>>> question is whether you think that they would really appreciate it, or >>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>> >>>>>>> There are those sizes: >>>>>>> A) number of owned entries >>>>>>> B) number of entries stored locally in memory >>>>>>> C) number of entries stored in each local cache store >>>>>>> D) number of entries stored in each shared cache store >>>>>>> E) total number of entries in cache >>>>>>> >>>>>>> So far, we can get >>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>> E via distributed iterators / MR >>>>>>> A via data container iteration + distribution manager query, but only >>>>>>> without cache store >>>>>>> C or D through >>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>> >>>>>>> I think that it would go along with users' expectations if size() >>>>>>> returned E and for the rest we should have special methods on >>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>> I'd say that finally to something that has firm meaning. >>>>>>> >>>>>>> WDYT? 
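To make the workarounds above concrete, a minimal sketch against the public API; only B has a plain public call, the others need the internals already quoted:

    import org.infinispan.AdvancedCache;
    import org.infinispan.context.Flag;

    public final class SizeSamples {
        // B: entries held locally in memory, skipping any cache store
        static int localInMemorySize(AdvancedCache<?, ?> cache) {
            return cache.withFlags(Flag.SKIP_CACHE_LOAD).size();
        }

        // Roughly the same number (modulo expired entries), read straight
        // off the data container without passing the interceptor chain
        static int dataContainerSize(AdvancedCache<?, ?> cache) {
            return cache.getDataContainer().size();
        }
    }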
>>>>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>>>> - they are approximate (data changes during iteration) >>>>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>>>> >>>>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>>>>> >>>>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>>>> Yes, that's what I meant as well. >>>>> >>>>> Cheers, >>>>> -- >>>>> Mircea Markus >>>>> Infinispan lead (www.infinispan.org) >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> -- >>> Radim Vansa >>> JBoss DataGrid QA >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > From rvansa at redhat.com Fri Oct 10 10:18:27 2014 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 10 Oct 2014 16:18:27 +0200 Subject: [infinispan-dev] About size() In-Reply-To: <5437E7DA.6060101@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> Message-ID: <5437EAB3.8030202@redhat.com> That we should expose that as one method, not forcing people to implement the sum() themselves. And possibly cachestores, again. Radim On 10/10/2014 04:06 PM, Tristan Tarrant wrote: > What's wrong with sum(Datacontainer.size())/numOwners ? > > Tristan > > On 10/10/14 16:03, Radim Vansa wrote: >> On 10/10/2014 02:38 PM, William Burns wrote: >>> On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: >>>> Users expect that size() will be constant-time (or linear to cluster >>>> size), and generally fast operation. I'd prefer to keep it that way. >>>> Though, even the MR way (used for HotRod size() now) needs to crawl >>>> through all the entries locally. >>> Many in memory collections require O(n) to do size such as >>> ConcurrentLinkedQueue, so I wouldn't say size should always be >>> expected to be constant time or O(c) where c is # of nodes. Granted a >>> user can expect anything they want. >> OK, I stand corrected. 
Moreover, I was generalizing myself to all users, >> a common mistake :) >> >> Anyway, monitoring tools love nice charts, and I can imagine monitoring >> software polling every 1 second to update that cool chart with cache >> size. Do we want a fast but imprecise variant of this operation in some >> statistics class? >> >> Radim >> >>>> 'Heretic, not very well though of and changing too many things' idea: >>>> what about having data container segment-aware? Then you'd just bcast >>>> SizeCommand with given topologyId and sum up sizes of primary-owned >>>> segments... It's not a complete solution, but at least that would enable >>>> to get the number of locally owned entries quite fast. Though, you can't >>>> do that easily with cache stores (without changing SPI). >>>> >>>> Regarding cache stores, IMO we're damned anyway: when calling >>>> cacheStore.size(), it can report more entries as those haven't been >>>> expired yet, it can report less entries as those can be expired due to >>>> [1]. Or, we'll enumerate all the entries, and that's going to be slow >>>> (btw., [1] reminded me that we should enumerate both datacontainer AND >>>> cachestores even if passivation is not enabled). >>> This is precisely what the distributed iterator does. And also >>> support for expired entries was recently integrated as I missed that >>> in the original implementation [a] >>> >>> [a] https://issues.jboss.org/browse/ISPN-4643 >>> >>>> Radim >>>> >>>> [1] https://issues.jboss.org/browse/ISPN-3202 >>>> >>>> On 10/08/2014 04:42 PM, William Burns wrote: >>>>> So it seems we would want to change this for 7.0 if possible since it >>>>> would be a bigger change for something like 7.1 and 8.0 would be even >>>>> further out. I should be able to put this together for CR2. >>>>> >>>>> It seems that we want to implement keySet, values and entrySet methods >>>>> using the entry iterator approach. >>>>> >>>>> It is however unclear for the size method if we want to use MR entry >>>>> counting and not worry about the rehash and passivation issues since >>>>> it is just an estimation anyways. Or if we want to also use the entry >>>>> iterator which should be closer approximation but will require more >>>>> network overhead and memory usage. >>>>> >>>>> Also we didn't really talk about the fact that these methods would >>>>> ignore ongoing transactions and if that is a concern or not. >>>>> >>>>> - Will >>>>> >>>>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>>>>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>>>>> >>>>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>> realized there are more things that users would like to know. My >>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>>> >>>>>>>> There are those sizes: >>>>>>>> A) number of owned entries >>>>>>>> B) number of entries stored locally in memory >>>>>>>> C) number of entries stored in each local cache store >>>>>>>> D) number of entries stored in each shared cache store >>>>>>>> E) total number of entries in cache >>>>>>>> >>>>>>>> So far, we can get >>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>> (passivation ? 
B : 0) + firstNonZero(C, D) via size() >>>>>>>> E via distributed iterators / MR >>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>> without cache store >>>>>>>> C or D through >>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>> >>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>> returned E and for the rest we should have special methods on >>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>> >>>>>>>> WDYT? >>>>>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>>>>> - they are approximate (data changes during iteration) >>>>>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>>>>> >>>>>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>>>>>> >>>>>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>>>>> Yes, that's what I meant as well. 
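As an illustration of the sum-and-divide idea, a sketch over the distributed executor; LocalSizeTask is our own class, not an Infinispan one, and the result is only an estimate (rehashes in flight skew it):

    import java.io.Serializable;
    import java.util.Set;
    import java.util.concurrent.Future;
    import org.infinispan.Cache;
    import org.infinispan.distexec.DefaultExecutorService;
    import org.infinispan.distexec.DistributedCallable;
    import org.infinispan.distexec.DistributedExecutorService;

    public class LocalSizeTask implements DistributedCallable<Object, Object, Integer>, Serializable {
        private transient Cache<Object, Object> cache;

        @Override
        public void setEnvironment(Cache<Object, Object> cache, Set<Object> inputKeys) {
            this.cache = cache;
        }

        @Override
        public Integer call() {
            // local, in-memory entries only; no cache store is consulted here
            return cache.getAdvancedCache().getDataContainer().size();
        }

        static long estimatedClusterSize(Cache<Object, Object> cache) throws Exception {
            DistributedExecutorService des = new DefaultExecutorService(cache);
            long sum = 0;
            for (Future<Integer> f : des.submitEverywhere(new LocalSizeTask()))
                sum += f.get();
            int numOwners = cache.getCacheConfiguration().clustering().hash().numOwners();
            return sum / numOwners; // each entry is counted once per owner
        }
    }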
>>>>>> >>>>>> Cheers, >>>>>> -- >>>>>> Mircea Markus >>>>>> Infinispan lead (www.infinispan.org) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> -- >>>> Radim Vansa >>>> JBoss DataGrid QA >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss DataGrid QA From mmarkus at redhat.com Fri Oct 10 10:20:52 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Fri, 10 Oct 2014 15:20:52 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <5437EAB3.8030202@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> <5437EAB3.8030202@redhat.com> Message-ID: <6F085B91-2882-4636-BDC4-1E47815B05D0@redhat.com> On Oct 10, 2014, at 15:18, Radim Vansa wrote: > That we should expose that as one method, not forcing people to > implement the sum() themselves. Hmm, isn't the method you mention cache.size() ? :-) > > And possibly cachestores, again. AdvancedCacheLoader has size. > > Radim > > On 10/10/2014 04:06 PM, Tristan Tarrant wrote: >> What's wrong with sum(Datacontainer.size())/numOwners ? >> >> Tristan >> >> On 10/10/14 16:03, Radim Vansa wrote: >>> On 10/10/2014 02:38 PM, William Burns wrote: >>>> On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: >>>>> Users expect that size() will be constant-time (or linear to cluster >>>>> size), and generally fast operation. I'd prefer to keep it that way. >>>>> Though, even the MR way (used for HotRod size() now) needs to crawl >>>>> through all the entries locally. >>>> Many in memory collections require O(n) to do size such as >>>> ConcurrentLinkedQueue, so I wouldn't say size should always be >>>> expected to be constant time or O(c) where c is # of nodes. Granted a >>>> user can expect anything they want. >>> OK, I stand corrected. Moreover, I was generalizing myself to all users, >>> a common mistake :) >>> >>> Anyway, monitoring tools love nice charts, and I can imagine monitoring >>> software polling every 1 second to update that cool chart with cache >>> size. Do we want a fast but imprecise variant of this operation in some >>> statistics class? >>> >>> Radim >>> >>>>> 'Heretic, not very well though of and changing too many things' idea: >>>>> what about having data container segment-aware? Then you'd just bcast >>>>> SizeCommand with given topologyId and sum up sizes of primary-owned >>>>> segments... It's not a complete solution, but at least that would enable >>>>> to get the number of locally owned entries quite fast. 
Though, you can't >>>>> do that easily with cache stores (without changing SPI). >>>>> >>>>> Regarding cache stores, IMO we're damned anyway: when calling >>>>> cacheStore.size(), it can report more entries as those haven't been >>>>> expired yet, it can report less entries as those can be expired due to >>>>> [1]. Or, we'll enumerate all the entries, and that's going to be slow >>>>> (btw., [1] reminded me that we should enumerate both datacontainer AND >>>>> cachestores even if passivation is not enabled). >>>> This is precisely what the distributed iterator does. And also >>>> support for expired entries was recently integrated as I missed that >>>> in the original implementation [a] >>>> >>>> [a] https://issues.jboss.org/browse/ISPN-4643 >>>> >>>>> Radim >>>>> >>>>> [1] https://issues.jboss.org/browse/ISPN-3202 >>>>> >>>>> On 10/08/2014 04:42 PM, William Burns wrote: >>>>>> So it seems we would want to change this for 7.0 if possible since it >>>>>> would be a bigger change for something like 7.1 and 8.0 would be even >>>>>> further out. I should be able to put this together for CR2. >>>>>> >>>>>> It seems that we want to implement keySet, values and entrySet methods >>>>>> using the entry iterator approach. >>>>>> >>>>>> It is however unclear for the size method if we want to use MR entry >>>>>> counting and not worry about the rehash and passivation issues since >>>>>> it is just an estimation anyways. Or if we want to also use the entry >>>>>> iterator which should be closer approximation but will require more >>>>>> network overhead and memory usage. >>>>>> >>>>>> Also we didn't really talk about the fact that these methods would >>>>>> ignore ongoing transactions and if that is a concern or not. >>>>>> >>>>>> - Will >>>>>> >>>>>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>>>>>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>>>>>> >>>>>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>>>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>>> realized there are more things that users would like to know. My >>>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>>>> >>>>>>>>> There are those sizes: >>>>>>>>> A) number of owned entries >>>>>>>>> B) number of entries stored locally in memory >>>>>>>>> C) number of entries stored in each local cache store >>>>>>>>> D) number of entries stored in each shared cache store >>>>>>>>> E) total number of entries in cache >>>>>>>>> >>>>>>>>> So far, we can get >>>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>>>> E via distributed iterators / MR >>>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>>> without cache store >>>>>>>>> C or D through >>>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>>> >>>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>>> returned E and for the rest we should have special methods on >>>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>>> >>>>>>>>> WDYT? 
>>>>>>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>>>>>> - they are approximate (data changes during iteration) >>>>>>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>>>>>> >>>>>>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. >>>>>>>> >>>>>>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>>>>>> Yes, that's what I meant as well. >>>>>>> >>>>>>> Cheers, >>>>>>> -- >>>>>>> Mircea Markus >>>>>>> Infinispan lead (www.infinispan.org) >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> -- >>>>> Radim Vansa >>>>> JBoss DataGrid QA >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From dan.berindei at gmail.com Fri Oct 10 10:23:35 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 10 Oct 2014 17:23:35 +0300 Subject: [infinispan-dev] About size() In-Reply-To: <5437E7DA.6060101@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> Message-ID: Exactly, in a monitoring application you wouldn't need the exact number of key-value mappings in the cache. The number of entries in memory and/or on disk should be much more interesting, and we wouldn't have to worry about duplicated/missing/expired entries to show that. On Fri, Oct 10, 2014 at 5:06 PM, Tristan Tarrant wrote: > What's wrong with sum(Datacontainer.size())/numOwners ? 
> > Tristan > > On 10/10/14 16:03, Radim Vansa wrote: > > On 10/10/2014 02:38 PM, William Burns wrote: > >> On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: > >>> Users expect that size() will be constant-time (or linear to cluster > >>> size), and generally fast operation. I'd prefer to keep it that way. > >>> Though, even the MR way (used for HotRod size() now) needs to crawl > >>> through all the entries locally. > >> Many in memory collections require O(n) to do size such as > >> ConcurrentLinkedQueue, so I wouldn't say size should always be > >> expected to be constant time or O(c) where c is # of nodes. Granted a > >> user can expect anything they want. > > OK, I stand corrected. Moreover, I was generalizing myself to all users, > > a common mistake :) > > > > Anyway, monitoring tools love nice charts, and I can imagine monitoring > > software polling every 1 second to update that cool chart with cache > > size. Do we want a fast but imprecise variant of this operation in some > > statistics class? > > > > Radim > > > >>> 'Heretic, not very well though of and changing too many things' idea: > >>> what about having data container segment-aware? Then you'd just bcast > >>> SizeCommand with given topologyId and sum up sizes of primary-owned > >>> segments... It's not a complete solution, but at least that would > enable > >>> to get the number of locally owned entries quite fast. Though, you > can't > >>> do that easily with cache stores (without changing SPI). > >>> > >>> Regarding cache stores, IMO we're damned anyway: when calling > >>> cacheStore.size(), it can report more entries as those haven't been > >>> expired yet, it can report less entries as those can be expired due to > >>> [1]. Or, we'll enumerate all the entries, and that's going to be slow > >>> (btw., [1] reminded me that we should enumerate both datacontainer AND > >>> cachestores even if passivation is not enabled). > >> This is precisely what the distributed iterator does. And also > >> support for expired entries was recently integrated as I missed that > >> in the original implementation [a] > >> > >> [a] https://issues.jboss.org/browse/ISPN-4643 > >> > >>> Radim > >>> > >>> [1] https://issues.jboss.org/browse/ISPN-3202 > >>> > >>> On 10/08/2014 04:42 PM, William Burns wrote: > >>>> So it seems we would want to change this for 7.0 if possible since it > >>>> would be a bigger change for something like 7.1 and 8.0 would be even > >>>> further out. I should be able to put this together for CR2. > >>>> > >>>> It seems that we want to implement keySet, values and entrySet methods > >>>> using the entry iterator approach. > >>>> > >>>> It is however unclear for the size method if we want to use MR entry > >>>> counting and not worry about the rehash and passivation issues since > >>>> it is just an estimation anyways. Or if we want to also use the entry > >>>> iterator which should be closer approximation but will require more > >>>> network overhead and memory usage. > >>>> > >>>> Also we didn't really talk about the fact that these methods would > >>>> ignore ongoing transactions and if that is a concern or not. 
> >>>> > >>>> - Will > >>>> > >>>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus > wrote: > >>>>> On Oct 8, 2014, at 15:11, Dan Berindei > wrote: > >>>>> > >>>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus > wrote: > >>>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: > >>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> recently we had a discussion about what size() returns, but I've > >>>>>>> realized there are more things that users would like to know. My > >>>>>>> question is whether you think that they would really appreciate > it, or > >>>>>>> whether it's just my QA point of view where I sometimes compute the > >>>>>>> 'checksums' of cache to see if I didn't lost anything. > >>>>>>> > >>>>>>> There are those sizes: > >>>>>>> A) number of owned entries > >>>>>>> B) number of entries stored locally in memory > >>>>>>> C) number of entries stored in each local cache store > >>>>>>> D) number of entries stored in each shared cache store > >>>>>>> E) total number of entries in cache > >>>>>>> > >>>>>>> So far, we can get > >>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() > >>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() > >>>>>>> E via distributed iterators / MR > >>>>>>> A via data container iteration + distribution manager query, but > only > >>>>>>> without cache store > >>>>>>> C or D through > >>>>>>> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >>>>>>> > >>>>>>> I think that it would go along with users' expectations if size() > >>>>>>> returned E and for the rest we should have special methods on > >>>>>>> AdvancedCache. That would of course change the meaning of size(), > but > >>>>>>> I'd say that finally to something that has firm meaning. > >>>>>>> > >>>>>>> WDYT? > >>>>>> There was a lot of arguments in past whether size() and other > methods that operate over all the elements (keySet, values) are useful > because: > >>>>>> - they are approximate (data changes during iteration) > >>>>>> - they are very resource consuming and might be miss-used (this is > the reason we chosen to use size() with its current local semantic) > >>>>>> > >>>>>> These methods (size, keys, values) are useful for people and I > think we were not wise to implement them only on top of the local data: > this is like preferring efficiency over correctness. This also created a > lot of confusion with our users, question like size() doesn't return the > correct value being asked regularly. I totally agree that size() returns E > (i.e. everything that is stored within the grid, including persistence) and > it's performance implications to be documented accordingly. For keySet and > values - we should stop implementing them (throw exception) and point users > to Will's distributed iterator which is a nicer way to achieve the desired > behavior. > >>>>>> > >>>>>> We can also implement keySet() and values() on top of the > distributed entry iterator and document that using the iterator directly is > better. > >>>>> Yes, that's what I meant as well. 
> >>>>> > >>>>> Cheers, > >>>>> -- > >>>>> Mircea Markus > >>>>> Infinispan lead (www.infinispan.org) > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> infinispan-dev mailing list > >>>>> infinispan-dev at lists.jboss.org > >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>> _______________________________________________ > >>>> infinispan-dev mailing list > >>>> infinispan-dev at lists.jboss.org > >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>> -- > >>> Radim Vansa > >>> JBoss DataGrid QA > >>> > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev at lists.jboss.org > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/09687daf/attachment-0001.html From dan.berindei at gmail.com Fri Oct 10 10:25:55 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 10 Oct 2014 17:25:55 +0300 Subject: [infinispan-dev] About size() In-Reply-To: <6F085B91-2882-4636-BDC4-1E47815B05D0@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> <5437EAB3.8030202@redhat.com> <6F085B91-2882-4636-BDC4-1E47815B05D0@redhat.com> Message-ID: On Fri, Oct 10, 2014 at 5:20 PM, Mircea Markus wrote: > > On Oct 10, 2014, at 15:18, Radim Vansa wrote: > > > That we should expose that as one method, not forcing people to > > implement the sum() themselves. > > Hmm, isn't the method you mention cache.size() ? :-) > Nope, because we decided to make cache.size() precise-but-slow :) > > > > > And possibly cachestores, again. > > AdvancedCacheLoader has size. > > > > > Radim > > > > On 10/10/2014 04:06 PM, Tristan Tarrant wrote: > >> What's wrong with sum(Datacontainer.size())/numOwners ? > >> > >> Tristan > >> > >> On 10/10/14 16:03, Radim Vansa wrote: > >>> On 10/10/2014 02:38 PM, William Burns wrote: > >>>> On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa > wrote: > >>>>> Users expect that size() will be constant-time (or linear to cluster > >>>>> size), and generally fast operation. I'd prefer to keep it that way. > >>>>> Though, even the MR way (used for HotRod size() now) needs to crawl > >>>>> through all the entries locally. > >>>> Many in memory collections require O(n) to do size such as > >>>> ConcurrentLinkedQueue, so I wouldn't say size should always be > >>>> expected to be constant time or O(c) where c is # of nodes. Granted a > >>>> user can expect anything they want. > >>> OK, I stand corrected. Moreover, I was generalizing myself to all > users, > >>> a common mistake :) > >>> > >>> Anyway, monitoring tools love nice charts, and I can imagine monitoring > >>> software polling every 1 second to update that cool chart with cache > >>> size. Do we want a fast but imprecise variant of this operation in some > >>> statistics class? 
> >>> > >>> Radim > >>> > >>>>> 'Heretic, not very well though of and changing too many things' idea: > >>>>> what about having data container segment-aware? Then you'd just bcast > >>>>> SizeCommand with given topologyId and sum up sizes of primary-owned > >>>>> segments... It's not a complete solution, but at least that would > enable > >>>>> to get the number of locally owned entries quite fast. Though, you > can't > >>>>> do that easily with cache stores (without changing SPI). > >>>>> > >>>>> Regarding cache stores, IMO we're damned anyway: when calling > >>>>> cacheStore.size(), it can report more entries as those haven't been > >>>>> expired yet, it can report less entries as those can be expired due > to > >>>>> [1]. Or, we'll enumerate all the entries, and that's going to be slow > >>>>> (btw., [1] reminded me that we should enumerate both datacontainer > AND > >>>>> cachestores even if passivation is not enabled). > >>>> This is precisely what the distributed iterator does. And also > >>>> support for expired entries was recently integrated as I missed that > >>>> in the original implementation [a] > >>>> > >>>> [a] https://issues.jboss.org/browse/ISPN-4643 > >>>> > >>>>> Radim > >>>>> > >>>>> [1] https://issues.jboss.org/browse/ISPN-3202 > >>>>> > >>>>> On 10/08/2014 04:42 PM, William Burns wrote: > >>>>>> So it seems we would want to change this for 7.0 if possible since > it > >>>>>> would be a bigger change for something like 7.1 and 8.0 would be > even > >>>>>> further out. I should be able to put this together for CR2. > >>>>>> > >>>>>> It seems that we want to implement keySet, values and entrySet > methods > >>>>>> using the entry iterator approach. > >>>>>> > >>>>>> It is however unclear for the size method if we want to use MR entry > >>>>>> counting and not worry about the rehash and passivation issues since > >>>>>> it is just an estimation anyways. Or if we want to also use the > entry > >>>>>> iterator which should be closer approximation but will require more > >>>>>> network overhead and memory usage. > >>>>>> > >>>>>> Also we didn't really talk about the fact that these methods would > >>>>>> ignore ongoing transactions and if that is a concern or not. > >>>>>> > >>>>>> - Will > >>>>>> > >>>>>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus > wrote: > >>>>>>> On Oct 8, 2014, at 15:11, Dan Berindei > wrote: > >>>>>>> > >>>>>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus > wrote: > >>>>>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: > >>>>>>>> > >>>>>>>>> Hi, > >>>>>>>>> > >>>>>>>>> recently we had a discussion about what size() returns, but I've > >>>>>>>>> realized there are more things that users would like to know. My > >>>>>>>>> question is whether you think that they would really appreciate > it, or > >>>>>>>>> whether it's just my QA point of view where I sometimes compute > the > >>>>>>>>> 'checksums' of cache to see if I didn't lost anything. > >>>>>>>>> > >>>>>>>>> There are those sizes: > >>>>>>>>> A) number of owned entries > >>>>>>>>> B) number of entries stored locally in memory > >>>>>>>>> C) number of entries stored in each local cache store > >>>>>>>>> D) number of entries stored in each shared cache store > >>>>>>>>> E) total number of entries in cache > >>>>>>>>> > >>>>>>>>> So far, we can get > >>>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() > >>>>>>>>> (passivation ? 
B : 0) + firstNonZero(C, D) via size() > >>>>>>>>> E via distributed iterators / MR > >>>>>>>>> A via data container iteration + distribution manager query, but > only > >>>>>>>>> without cache store > >>>>>>>>> C or D through > >>>>>>>>> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() > >>>>>>>>> > >>>>>>>>> I think that it would go along with users' expectations if size() > >>>>>>>>> returned E and for the rest we should have special methods on > >>>>>>>>> AdvancedCache. That would of course change the meaning of > size(), but > >>>>>>>>> I'd say that finally to something that has firm meaning. > >>>>>>>>> > >>>>>>>>> WDYT? > >>>>>>>> There was a lot of arguments in past whether size() and other > methods that operate over all the elements (keySet, values) are useful > because: > >>>>>>>> - they are approximate (data changes during iteration) > >>>>>>>> - they are very resource consuming and might be miss-used (this > is the reason we chosen to use size() with its current local semantic) > >>>>>>>> > >>>>>>>> These methods (size, keys, values) are useful for people and I > think we were not wise to implement them only on top of the local data: > this is like preferring efficiency over correctness. This also created a > lot of confusion with our users, question like size() doesn't return the > correct value being asked regularly. I totally agree that size() returns E > (i.e. everything that is stored within the grid, including persistence) and > it's performance implications to be documented accordingly. For keySet and > values - we should stop implementing them (throw exception) and point users > to Will's distributed iterator which is a nicer way to achieve the desired > behavior. > >>>>>>>> > >>>>>>>> We can also implement keySet() and values() on top of the > distributed entry iterator and document that using the iterator directly is > better. > >>>>>>> Yes, that's what I meant as well. 
> >>>>>>> > >>>>>>> Cheers, > >>>>>>> -- > >>>>>>> Mircea Markus > >>>>>>> Infinispan lead (www.infinispan.org) > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> _______________________________________________ > >>>>>>> infinispan-dev mailing list > >>>>>>> infinispan-dev at lists.jboss.org > >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>>>> _______________________________________________ > >>>>>> infinispan-dev mailing list > >>>>>> infinispan-dev at lists.jboss.org > >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>>> -- > >>>>> Radim Vansa > >>>>> JBoss DataGrid QA > >>>>> > >>>>> _______________________________________________ > >>>>> infinispan-dev mailing list > >>>>> infinispan-dev at lists.jboss.org > >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >>>> _______________________________________________ > >>>> infinispan-dev mailing list > >>>> infinispan-dev at lists.jboss.org > >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > -- > > Radim Vansa > > JBoss DataGrid QA > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > Cheers, > -- > Mircea Markus > Infinispan lead (www.infinispan.org) > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/cb48fd01/attachment.html From vblagoje at redhat.com Fri Oct 10 11:13:05 2014 From: vblagoje at redhat.com (Vladimir Blagojevic) Date: Fri, 10 Oct 2014 11:13:05 -0400 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: References: <5436FB16.3000003@infinispan.org> Message-ID: <5437F781.8020808@redhat.com> On 2014-10-10, 3:03 AM, Dan Berindei wrote: > > The problem is that the intermediate keys aren't in the same segment: > we want the reduce phase to access only keys local to the reducing > node, and keys in different input segments can yield values for the > same intermediate key. So like you say, we'd have to retry on every > topology change in the intermediary cache, not just the ones affecting > segment _i_. > If we have to retry for all segments on every topology change then I am not sure why it would make sense to work on this optimization and topology handling mechanism at all. We have to handle the cases where one node might have completed the map phase and inserted deltas, while another has only started inserting deltas, and a third is still in the map phase and has not inserted any deltas at all. The same applies to the reduce portion. It seems to me that in the end any algorithm we come up with will not be much better than: detect topology change, retry the map/reduce job.
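For reference, the baseline Vladimir describes (detect topology change, retry the whole job) can be sketched as a small wrapper; topologyId() is a hypothetical hook standing in for whatever the DistributionManager exposes, not an existing API:

    import java.util.concurrent.Callable;

    abstract class TopologyRetry {
        abstract int topologyId(); // hypothetical: the current cache topology id

        <R> R run(Callable<R> wholeJob, int maxAttempts) throws Exception {
            for (int i = 0; i < maxAttempts; i++) {
                int before = topologyId();
                R result = wholeJob.call();     // the complete map/reduce job
                if (topologyId() == before)
                    return result;              // no rehash overlapped the job
                // a rehash may have left partial deltas in the intermediate
                // cache, so discard everything and run the whole job again
            }
            throw new IllegalStateException("Topology kept changing; giving up");
        }
    }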
Vladimir From ttarrant at redhat.com Fri Oct 10 11:22:40 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 10 Oct 2014 17:22:40 +0200 Subject: [infinispan-dev] About size() In-Reply-To: <5437EAB3.8030202@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> <5437EAB3.8030202@redhat.com> Message-ID: <5437F9C0.8070200@redhat.com> I meant as a quick implementation of that, not that we want to tell users to do it. Tristan On 10/10/14 16:18, Radim Vansa wrote: > That we should expose that as one method, not forcing people to > implement the sum() themselves. > > And possibly cachestores, again. > > Radim > > On 10/10/2014 04:06 PM, Tristan Tarrant wrote: >> What's wrong with sum(Datacontainer.size())/numOwners ? >> >> Tristan >> >> On 10/10/14 16:03, Radim Vansa wrote: >>> On 10/10/2014 02:38 PM, William Burns wrote: >>>> On Wed, Oct 8, 2014 at 11:19 AM, Radim Vansa wrote: >>>>> Users expect that size() will be constant-time (or linear to cluster >>>>> size), and generally fast operation. I'd prefer to keep it that way. >>>>> Though, even the MR way (used for HotRod size() now) needs to crawl >>>>> through all the entries locally. >>>> Many in memory collections require O(n) to do size such as >>>> ConcurrentLinkedQueue, so I wouldn't say size should always be >>>> expected to be constant time or O(c) where c is # of nodes. Granted a >>>> user can expect anything they want. >>> OK, I stand corrected. Moreover, I was generalizing myself to all users, >>> a common mistake :) >>> >>> Anyway, monitoring tools love nice charts, and I can imagine monitoring >>> software polling every 1 second to update that cool chart with cache >>> size. Do we want a fast but imprecise variant of this operation in some >>> statistics class? >>> >>> Radim >>> >>>>> 'Heretic, not very well though of and changing too many things' idea: >>>>> what about having data container segment-aware? Then you'd just bcast >>>>> SizeCommand with given topologyId and sum up sizes of primary-owned >>>>> segments... It's not a complete solution, but at least that would enable >>>>> to get the number of locally owned entries quite fast. Though, you can't >>>>> do that easily with cache stores (without changing SPI). >>>>> >>>>> Regarding cache stores, IMO we're damned anyway: when calling >>>>> cacheStore.size(), it can report more entries as those haven't been >>>>> expired yet, it can report less entries as those can be expired due to >>>>> [1]. Or, we'll enumerate all the entries, and that's going to be slow >>>>> (btw., [1] reminded me that we should enumerate both datacontainer AND >>>>> cachestores even if passivation is not enabled). >>>> This is precisely what the distributed iterator does. And also >>>> support for expired entries was recently integrated as I missed that >>>> in the original implementation [a] >>>> >>>> [a] https://issues.jboss.org/browse/ISPN-4643 >>>> >>>>> Radim >>>>> >>>>> [1] https://issues.jboss.org/browse/ISPN-3202 >>>>> >>>>> On 10/08/2014 04:42 PM, William Burns wrote: >>>>>> So it seems we would want to change this for 7.0 if possible since it >>>>>> would be a bigger change for something like 7.1 and 8.0 would be even >>>>>> further out. I should be able to put this together for CR2. 
>>>>>> >>>>>> It seems that we want to implement keySet, values and entrySet methods >>>>>> using the entry iterator approach. >>>>>> >>>>>> It is however unclear for the size method if we want to use MR entry >>>>>> counting and not worry about the rehash and passivation issues since >>>>>> it is just an estimation anyways. Or if we want to also use the entry >>>>>> iterator which should be closer approximation but will require more >>>>>> network overhead and memory usage. >>>>>> >>>>>> Also we didn't really talk about the fact that these methods would >>>>>> ignore ongoing transactions and if that is a concern or not. >>>>>> >>>>>> - Will >>>>>> >>>>>> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus wrote: >>>>>>> On Oct 8, 2014, at 15:11, Dan Berindei wrote: >>>>>>> >>>>>>>> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus wrote: >>>>>>>> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>>> realized there are more things that users would like to know. My >>>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>>>> >>>>>>>>> There are those sizes: >>>>>>>>> A) number of owned entries >>>>>>>>> B) number of entries stored locally in memory >>>>>>>>> C) number of entries stored in each local cache store >>>>>>>>> D) number of entries stored in each shared cache store >>>>>>>>> E) total number of entries in cache >>>>>>>>> >>>>>>>>> So far, we can get >>>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>>>> E via distributed iterators / MR >>>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>>> without cache store >>>>>>>>> C or D through >>>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>>> >>>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>>> returned E and for the rest we should have special methods on >>>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>>> >>>>>>>>> WDYT? >>>>>>>> There was a lot of arguments in past whether size() and other methods that operate over all the elements (keySet, values) are useful because: >>>>>>>> - they are approximate (data changes during iteration) >>>>>>>> - they are very resource consuming and might be miss-used (this is the reason we chosen to use size() with its current local semantic) >>>>>>>> >>>>>>>> These methods (size, keys, values) are useful for people and I think we were not wise to implement them only on top of the local data: this is like preferring efficiency over correctness. This also created a lot of confusion with our users, question like size() doesn't return the correct value being asked regularly. I totally agree that size() returns E (i.e. everything that is stored within the grid, including persistence) and it's performance implications to be documented accordingly. For keySet and values - we should stop implementing them (throw exception) and point users to Will's distributed iterator which is a nicer way to achieve the desired behavior. 
>>>>>>>> >>>>>>>> We can also implement keySet() and values() on top of the distributed entry iterator and document that using the iterator directly is better. >>>>>>> Yes, that's what I meant as well. >>>>>>> >>>>>>> Cheers, >>>>>>> -- >>>>>>> Mircea Markus >>>>>>> Infinispan lead (www.infinispan.org) >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> -- >>>>> Radim Vansa >>>>> JBoss DataGrid QA >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > From dan.berindei at gmail.com Fri Oct 10 12:06:53 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 10 Oct 2014 19:06:53 +0300 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: <5437F781.8020808@redhat.com> References: <5436FB16.3000003@infinispan.org> <5437F781.8020808@redhat.com> Message-ID: On Fri, Oct 10, 2014 at 6:13 PM, Vladimir Blagojevic wrote: > On 2014-10-10, 3:03 AM, Dan Berindei wrote: > > > > The problem is that the intermediate keys aren't in the same segment: > > we want the reduce phase to access only keys local to the reducing > > node, and keys in different input segments can yield values for the > > same intermediate key. So like you say, we'd have to retry on every > > topology change in the intermediary cache, not just the ones affecting > > segment _i_. > > > If we have to retry for all segments on every topology change, then I am > not sure why it would make sense to work on this optimization and > topology handling mechanism at all. We have to handle the cases where > one node might have completed the map phase and inserted deltas, while another > has only started inserting deltas, and a third one is still doing the > map phase and has not inserted any deltas at all. The same goes for the > reduce portion. It seems to me that in the end any algorithm we come up > with will not be much better than: detect topology change, retry > map/reduce job. > Initially that was my thinking as well. But if the originator invokes the map/combine phase for only one segment at a time, it will have to retry only one segment per cluster node, not all the segments. And each node would write to separate keys in the intermediate cache, making it easy to clean up only one node's work. So it would still be worth it, as usually numSegments >> clusterSize. Plus we don't need this broad retry strategy if the intermediate cache is transactional (I think). The biggest downside I see is that it would be horribly slow if the cache store doesn't support efficient iteration of a single segment. So we might want to implement a full retry strategy as well, if some cache stores can't support that.
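To make the per-segment retry concrete, here is a rough sketch of the originator-side loop I have in mind. Every type and name below is invented purely for illustration; none of this is existing Infinispan API:

interface SegmentMapTasks {
   // Runs map/combine for one segment on its current primary owner;
   // throws if the owner changed (or died) while the task was running.
   void runMapPhase(int segment) throws OwnershipChangedException;

   // Wipes only the intermediate values produced for this segment.
   void discardIntermediateValues(int segment);
}

class OwnershipChangedException extends Exception {}

final class PerSegmentCoordinator {
   private final SegmentMapTasks tasks;

   PerSegmentCoordinator(SegmentMapTasks tasks) {
      this.tasks = tasks;
   }

   // Retries individual segments instead of restarting the whole job.
   void mapAllSegments(int numSegments) {
      java.util.BitSet remaining = new java.util.BitSet(numSegments);
      remaining.set(0, numSegments);
      while (!remaining.isEmpty()) {
         for (int s = remaining.nextSetBit(0); s >= 0; s = remaining.nextSetBit(s + 1)) {
            try {
               tasks.runMapPhase(s);
               remaining.clear(s); // this segment is done for good
            } catch (OwnershipChangedException e) {
               tasks.discardIntermediateValues(s); // redo only this segment
            }
         }
      }
   }
}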
Cheers Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141010/4191fde3/attachment.html From mudokonman at gmail.com Fri Oct 10 12:30:54 2014 From: mudokonman at gmail.com (William Burns) Date: Fri, 10 Oct 2014 12:30:54 -0400 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: On Wed, Oct 8, 2014 at 12:23 PM, Dan Berindei wrote: > > > On Wed, Oct 8, 2014 at 6:14 PM, William Burns wrote: >> >> On Wed, Oct 8, 2014 at 10:57 AM, Dan Berindei >> wrote: >> > >> > >> > On Wed, Oct 8, 2014 at 5:42 PM, William Burns >> > wrote: >> >> >> >> So it seems we would want to change this for 7.0 if possible since it >> >> would be a bigger change for something like 7.1 and 8.0 would be even >> >> further out. I should be able to put this together for CR2. >> > >> > >> > I'm not 100% convinced that we need it for 7.x. For 8.0 I would >> > recommend >> > removing the size() method altogether, and providing some looser >> > "statistics" instead. >> >> Yeah I guess I don't know enough about the demand for these methods or >> what people wanted to use them for to know what kind of priority they >> should be given. >> >> It sounds like you are talking about decoupling from the >> Map/ConcurrentMap interface completely then, right? So we would also >> eliminate the other bulk methods (keySet, values, entrySet)? > > > Yes, I would base the Cache interface on JSR-107's Cache, which doesn't have > size() or the other methods. > >> >> >> > >> >> >> >> >> >> It seems that we want to implement keySet, values and entrySet methods >> >> using the entry iterator approach. >> >> >> >> It is however unclear for the size method if we want to use MR entry >> >> counting and not worry about the rehash and passivation issues since >> >> it is just an estimation anyways. Or if we want to also use the entry >> >> iterator which should be closer approximation but will require more >> >> network overhead and memory usage. >> > >> > >> > +1 to use the entry iterator from me, ignoring state transfer we can get >> > some pretty wild fluctuations in the size of the cache. >> >> That is personally my feeling as well, but I tend to err more on the >> side of correctness to begin with. >> >> > We could use a distributed task for Cache.isEmpty() instead of size() == >> > 0, >> > though. >> >> Yes that should be a good optimization either way. >> >> > >> >> >> >> >> >> Also we didn't really talk about the fact that these methods would >> >> ignore ongoing transactions and if that is a concern or not. >> >> >> > >> > It might be a concern for the Hibernate 2LC impl, it was their TCK that >> > prompted the last round of discussions about clear(). >> >> Although I wonder how much these methods are even used since they only >> work for Local, Replication or Invalidation caches in their current >> state (and didn't even use loaders until 6.0). > > > There is some more information about the test in the mailing list discussion > [1] > There's also a JIRA for clear() [2] > > I think 2LC almost never uses distribution, so size() being local-only > didn't matter, but making it non-tx could cause problems - at least for that > particular test. 
I had toyed around with the following idea before, though I never thought of it solely in the scope of the size method: I have a solution that would work mostly for transactional caches. Essentially the size method would always operate in a READ_COMMITTED-like state; using REPEATABLE_READ doesn't seem feasible since we can't keep all the contents in memory. Essentially the iterator would be run, and for each key that is found it would check the context to see if the key is there. If the context entry is marked as removed, it doesn't count the key; if the key is in the context, it marks the key as found and counts it; and if it is not in the context, it counts it. Then after iteration it finds all the keys in the context that were not found and adds them to the count as well. This way it doesn't need additional memory (besides iteration costs), as all the context information is already in memory. My original thought was to also make the EntryIterator transactional in the same way, which also means the keySet, entrySet and values methods could do the same things. The main stumbling block I had was the fact that the iterator and the various collections returned could be used outside of the ongoing transaction, which didn't seem to make much sense to me. But maybe these should be changed to be more like the backing views that HashMap, ConcurrentHashMap etc. use for their methods, where instead they would pick up the transaction if there is one in the current thread, and if there is no transaction just start an implicit one. This however would be a big change from how these collections currently work, in that today they are in-memory copies only. What do you guys think? > > [1] http://lists.jboss.org/pipermail/infinispan-dev/2013-October/013914.html > [2] https://issues.jboss.org/browse/ISPN-3656 > >> >> > >> > We haven't talked about what size(), keySet() and values() should return >> > for >> > an invalidation cache either... I forget, does the distributed entry >> > iterator work with invalidation caches? >> >> It works the same as a local cache, so only the local node contents are >> returned. Replicated does the same thing; distributed is the only >> special case. This was the only thing that made sense to me, but if >> you have any ideas it would be great to hear them, for possibly enhancing >> invalidation iteration. > > Sounds good to me. > > cache.get(k) will search on all the nodes via ClusterLoader, so there is a > certain appeal in making the entry iterator do the same. But invalidation > caches are used with an external (non-CacheLoader) source of data anyway, so > we can never return "all the entries". > >> >> > >> > >> >> >> >> - Will >> >> >> >> On Wed, Oct 8, 2014 at 10:13 AM, Mircea Markus >> >> wrote: >> >> > >> >> > On Oct 8, 2014, at 15:11, Dan Berindei >> >> > wrote: >> >> > >> >> >> >> >> >> On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus >> >> >> wrote: >> >> >> On Oct 3, 2014, at 9:30, Radim Vansa wrote: >> >> >> >> >> >> > Hi, >> >> >> > >> >> >> > recently we had a discussion about what size() returns, but I've >> >> >> > realized there are more things that users would like to know. My >> >> >> > question is whether you think that they would really appreciate >> >> >> > it, >> >> >> > or >> >> >> > whether it's just my QA point of view where I sometimes compute >> >> >> > the >> >> >> > 'checksums' of cache to see if I didn't lose anything.
>> >> >> > >> >> >> > There are those sizes: >> >> >> > A) number of owned entries >> >> >> > B) number of entries stored locally in memory >> >> >> > C) number of entries stored in each local cache store >> >> >> > D) number of entries stored in each shared cache store >> >> >> > E) total number of entries in cache >> >> >> > >> >> >> > So far, we can get >> >> >> > B via withFlags(SKIP_CACHE_LOAD).size() >> >> >> > (passivation ? B : 0) + firstNonZero(C, D) via size() >> >> >> > E via distributed iterators / MR >> >> >> > A via data container iteration + distribution manager query, but >> >> >> > only >> >> >> > without cache store >> >> >> > C or D through >> >> >> > >> >> >> > >> >> >> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >> >> >> > >> >> >> > I think that it would go along with users' expectations if size() >> >> >> > returned E and for the rest we should have special methods on >> >> >> > AdvancedCache. That would of course change the meaning of size(), >> >> >> > but >> >> >> > I'd say that finally to something that has firm meaning. >> >> >> > >> >> >> > WDYT? >> >> >> >> >> >> There was a lot of arguments in past whether size() and other >> >> >> methods >> >> >> that operate over all the elements (keySet, values) are useful >> >> >> because: >> >> >> - they are approximate (data changes during iteration) >> >> >> - they are very resource consuming and might be miss-used (this is >> >> >> the >> >> >> reason we chosen to use size() with its current local semantic) >> >> >> >> >> >> These methods (size, keys, values) are useful for people and I think >> >> >> we >> >> >> were not wise to implement them only on top of the local data: this >> >> >> is like >> >> >> preferring efficiency over correctness. This also created a lot of >> >> >> confusion >> >> >> with our users, question like size() doesn't return the correct >> >> >> value being >> >> >> asked regularly. I totally agree that size() returns E (i.e. >> >> >> everything that >> >> >> is stored within the grid, including persistence) and it's >> >> >> performance >> >> >> implications to be documented accordingly. For keySet and values - >> >> >> we should >> >> >> stop implementing them (throw exception) and point users to Will's >> >> >> distributed iterator which is a nicer way to achieve the desired >> >> >> behavior. >> >> >> >> >> >> We can also implement keySet() and values() on top of the >> >> >> distributed >> >> >> entry iterator and document that using the iterator directly is >> >> >> better. >> >> > >> >> > Yes, that's what I meant as well. 
>> >> > >> >> > Cheers, >> >> > -- >> >> > Mircea Markus >> >> > Infinispan lead (www.infinispan.org) >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > _______________________________________________ >> >> > infinispan-dev mailing list >> >> > infinispan-dev at lists.jboss.org >> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev at lists.jboss.org >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > >> > >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From mmarkus at redhat.com Fri Oct 10 13:51:24 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Fri, 10 Oct 2014 18:51:24 +0100 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> <5437EAB3.8030202@redhat.com> <6F085B91-2882-4636-BDC4-1E47815B05D0@redhat.com> Message-ID: <82DC8B96-DAE7-42D0-976A-1CEB1EF7A212@redhat.com> On Oct 10, 2014, at 15:25, Dan Berindei wrote: > On Fri, Oct 10, 2014 at 5:20 PM, Mircea Markus wrote: > > On Oct 10, 2014, at 15:18, Radim Vansa wrote: > > > That we should expose that as one method, not forcing people to > > implement the sum() themselves. > > Hmm, isn't the method you mention cache.size() ? :-) > > Nope, because we decided to make cache.size() precise-but-slow :) It's not possible to make it precise unless we provide snapshot isolation /MVCC support. IMO the formula Tristan provides is a good enough approximation of the size of the data. And definitely way better than what we currently have. (Looking at CHM.size() they offer an "accurate" size of the map by counting it in a loop and making sure that the size is reproducible. I don't think that's accurate in the general case, though, as you might count intermediate sizes in that loop). Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From mmarkus at redhat.com Fri Oct 10 13:52:19 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Fri, 10 Oct 2014 18:52:19 +0100 Subject: [infinispan-dev] About size() In-Reply-To: <5437F9C0.8070200@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> <5437EAB3.8030202@redhat.com> <5437F9C0.8070200@redhat.com> Message-ID: <12870F52-4C81-400F-B2A0-382A690DA94A@redhat.com> On Oct 10, 2014, at 16:22, Tristan Tarrant wrote: > I meant as a quick implementation of that, not that we want to tell > users to do it. +1. That would be accurate enough for practical reasons. 
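For the record, that quick estimate is trivial to prototype on top of the distributed executor we already ship. A minimal sketch (untested, and of course only as good as the formula itself):

import java.io.Serializable;
import java.util.Set;
import java.util.concurrent.Future;
import org.infinispan.Cache;
import org.infinispan.distexec.DefaultExecutorService;
import org.infinispan.distexec.DistributedCallable;

public class SizeEstimator {

   // Returns the number of entries in the local data container of a node.
   static class LocalContainerSize
         implements DistributedCallable<Object, Object, Integer>, Serializable {
      private transient Cache<Object, Object> cache;

      @Override
      public void setEnvironment(Cache<Object, Object> cache, Set<Object> inputKeys) {
         this.cache = cache;
      }

      @Override
      public Integer call() {
         return cache.getAdvancedCache().getDataContainer().size();
      }
   }

   // sum(DataContainer.size()) / numOwners, as discussed above.
   public static long estimatedSize(Cache<Object, Object> cache) throws Exception {
      DefaultExecutorService distExec = new DefaultExecutorService(cache);
      try {
         long sum = 0;
         for (Future<Integer> f : distExec.submitEverywhere(new LocalContainerSize())) {
            sum += f.get();
         }
         int numOwners = cache.getCacheConfiguration().clustering().hash().numOwners();
         return sum / numOwners;
      } finally {
         distExec.shutdown();
      }
   }
}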
Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From mmarkus at redhat.com Fri Oct 10 14:01:52 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Fri, 10 Oct 2014 19:01:52 +0100 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: On Oct 10, 2014, at 17:30, William Burns wrote: >>>>> Also we didn't really talk about the fact that these methods would >>>>> ignore ongoing transactions and if that is a concern or not. >>>>> >>>> >>>> It might be a concern for the Hibernate 2LC impl, it was their TCK that >>>> prompted the last round of discussions about clear(). >>> >>> Although I wonder how much these methods are even used since they only >>> work for Local, Replication or Invalidation caches in their current >>> state (and didn't even use loaders until 6.0). >> >> >> There is some more information about the test in the mailing list discussion >> [1] >> There's also a JIRA for clear() [2] >> >> I think 2LC almost never uses distribution, so size() being local-only >> didn't matter, but making it non-tx could cause problems - at least for that >> particular test. > > I had toyed around with the following idea before, but I never thought > of it in the scope of the size method solely, but I have a solution > that would work mostly for transactional caches. Essentially the size > method would always operate in a READ_COMMITTED like state, using > REPEATABLE_READ doesn't seem feasible since we can't keep all the > contents in memory. Essentially the iterator would be ran and for > each key that is found it checks the context to see if it is there. > If the context entry is marked as removed it doesn't count the key, if > the key is there it marks the key as found and counts it, and if it is > not found it counts it. Then after iteration it finds all the keys in > the context that were not found and also adds them to the count. This > way it doesn't need to store additional memory (besides iteration > costs) as all the context information is in memory. sounds good to me. > > My original thought was to also make the EntryIterator transactional > in the same way which also means the keySet, entrySet and values > methods could do the same things. The main reason stumbling block I > had was the fact that the iterator and various collections returned > could be used outside of the ongoing transaction which didn't seem to > make much sense to me. But maybe these should be changed to be more > like backing maps which HashMap, ConcurrentHashMap etc use for their > methods, where instead it would pick up the transaction if there is > one in the current thread and if there is no transaction just start an > implicit one. or if they are outside of a transaction to deny progress > This however was a big change from how these > collections work currently in that they are in memory copies only. > > What do you guys think? I think that keeping track of the context entries is a better way of iterating so +1. As you mentioned, we should also make it clear that RC semantic applies. 
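To spell out the bookkeeping Will describes, in sketch form (all names invented; the real implementation would read this information from the invocation context):

import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

final class TxAwareSize {
   // contextWrites: key -> TRUE if written in this tx, FALSE if removed in it.
   static long size(Iterator<Object> iteratedKeys, Map<Object, Boolean> contextWrites) {
      Set<Object> seenInContext = new HashSet<Object>();
      long count = 0;
      while (iteratedKeys.hasNext()) {
         Object key = iteratedKeys.next();
         Boolean state = contextWrites.get(key);
         if (state == null) {
            count++; // untouched by the transaction
         } else if (state.booleanValue()) {
            count++; // overwritten in the tx, still present
            seenInContext.add(key);
         } else {
            seenInContext.add(key); // removed in the tx: not counted
         }
      }
      // keys created by the tx that the iterator could not see yet
      for (Map.Entry<Object, Boolean> e : contextWrites.entrySet()) {
         if (e.getValue().booleanValue() && !seenInContext.contains(e.getKey())) {
            count++;
         }
      }
      return count;
   }
}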
Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) From emmanuel at hibernate.org Fri Oct 10 11:49:48 2014 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Fri, 10 Oct 2014 18:49:48 +0300 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: References: <5436FB16.3000003@infinispan.org> Message-ID: <20141010154948.GD5052@hibernate.org> When wrestling with the subject, here is what I had in mind. The M/R coordinator node sends the M task per segment to the node where the segment is primary. Each "per-segment" M task is executed and is offered a way to push intermediary results into a temp cache. The intermediary results are stored with a composite key [intermKey-i, seg-j]. The M/R coordinator waits for all M tasks to return. If one does not (timeout, rehash), the following happens: - delete [intermKey-i, seg-i] (that operation could be handled by the new per-segment M task before the map task is effectively started) - ship the M task for that segment-i to the new primary owner of segment-i When all M tasks have returned, the Reduce phase will read all [intermKey-i, *] keys and reduce them. Note that if the reduction phase is itself distributed, we could apply the same per-segment keying and task-shipping split for it. Again, the tricky part is to expose the ability to write to intermediary caches per segment without exposing segments per se, as well as to let someone see a concatenated view of intermKey-i from all segment subkeys during reduction. Thoughts? Dan, I did not quite get what alternative approach you wanted to propose. Care to respin it for a slow brain? :) Emmanuel On Fri 2014-10-10 10:03, Dan Berindei wrote: > > > I'd rather not expose this to the user. Instead, we could split the > > > intermediary values for each key by the source segment, and do the > > > invalidation of the retried segments in our M/R framework (e.g. when we > > > detect that the primary owner at the start of the map/combine phase is > > > not an owner at all at the end). > > > > > > I think we have another problem with the publishing of intermediary > > > values not being idempotent. The default configuration for the > > > intermediate cache is non-transactional, and retrying the put(delta) > > > command after a topology change could add the same intermediate values > > > twice. A transactional intermediary cache should be safe, though, > > > because the tx won't commit on the old owner until the new owner knows > > > about the tx. > > > > can you elaborate on it? > > > > say we have a cache with numOwners=2, owners(k) = [A, B] > C will become the primary owner of k, but for now owners(k) = [A, B, C] > O sends put(delta) to A (the primary) > A sends put(delta) to B, C > B sees a topology change (owners(k) = [C, B]), doesn't apply the delta and > replies with an OutdatedTopologyException > C applies the delta > A resends put(delta) to C (new primary) > C sends put(delta) to B, applies the delta again > > I think it could be solved with versions, I just wanted to point out that > we don't do that now. > > > > > > anyway, I think the retry mechanism should solve it. If we detect a > > topology change (during the iteration of segment _i_) and the segment > > _i_ is moved, then we can cancel the iteration, remove all the > > intermediate values generated in segment _i_ and restart (on the primary > > owner).
> > > > The problem is that the intermediate keys aren't in the same segment: we > want the reduce phase to access only keys local to the reducing node, and > keys in different input segments can yield values for the same intermediate > key. So like you say, we'd have to retry on every topology change in the > intermediary cache, not just the ones affecting segment _i_. > > There's another complication: in the scenario above, O may only get the > topology update with owners(k) = [C, B] after the map/combine phase > completed. So the originator of the M/R job would have to watch for > topology changes seen by any node, and invalidate/retry any input segments > that could have been affected. All that without slowing down the > no-topology-change case too much... > > > > > > > > > > > > > > But before getting ahead of ourselves, what do you thing of the > > general idea? Even without retry framework, this approach would be more > > stable than our current per node approach during topology changes and > > improve dependability. > > > > > > Doing it solely based on segment would remove the possibility of > > > having duplicates. However without a mechanism to send a new request > > > on rehash it would be possible to only find a subset of values (if a > > > segment is removed while iterating on it). From sanne at infinispan.org Fri Oct 10 16:36:35 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 10 Oct 2014 21:36:35 +0100 Subject: [infinispan-dev] Naming of project modules Message-ID: All, I occasionally have to hard-reset my whole workspace, delete the Eclipse projects, and re-import them, especially when I switch between branches. I have lots of projects, and they are all nicely "grouped" as Eclipse shows projects in alphabetical order, and all projects use a consistent prefix like "hibernate-ogm-" or "wildfly-", etc.. But Infinispan often manages to fool me, as most modules have an "infinispan-" prefix, but not all of them follow this rule so some get to hide out of sight (I literally have hundreds of projects in my primary workspace). Could we please make sure they all have a name starting with "infinispan-" ? If you agree I'm happy to send a PR to fix the couple of exotic ones. Sanne From sanne at infinispan.org Sat Oct 11 12:54:57 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Sat, 11 Oct 2014 17:54:57 +0100 Subject: [infinispan-dev] On ParserRegistry and classloaders In-Reply-To: <53D945B4.7070508@redhat.com> References: <9A22CC58-5B20-4D6B-BA6B-B4A23493979F@redhat.com> <53831239.30601@redhat.com> <114950B5-AA9F-485C-8E3C-AC36299974AB@redhat.com> <53D945B4.7070508@redhat.com> Message-ID: Hi all, sorry it took me quite some time to return on this, but it's still troublesome and quite critical to allow using Infinispan as a module. I'm again trying to use Infinispan's JBoss Modules as a (library) for deployed application, and it fails because of: Caused by: java.lang.ClassNotFoundException: org.infinispan.remoting.transport.jgroups.JGroupsTransport from [Module \"deployment.ModuleMemberRegistrationIT.war:main\" from Service Module Loader]"}} Debugging the application server, the stack is pointing to this method : public TransportConfigurationBuilder defaultTransport() { Transport transport = Util.getInstance(DEFAULT_TRANSPORT, this.getGlobalConfig().getClassLoader()); <------- Failure here transport(transport); return this; } The getClassLoader() function returns a reference to the application deployment (as you can see in the error above). 
This code makes sense I think, but inspecting the Util.getInstance method we see that it's not strictly using the classloaders we pass: it's going to look in a specific sequence as defined by this function: public static ClassLoader[] getClassLoaders(ClassLoader appClassLoader) { return new ClassLoader[] { appClassLoader, // User defined classes OsgiClassLoader.getInstance(), // OSGi bundle context needs to be on top of TCCL, system CL, etc. Util.class.getClassLoader(), // Infinispan classes (not always on TCCL [modular env]) ClassLoader.getSystemClassLoader(), // Used when load time instrumentation is in effect Thread.currentThread().getContextClassLoader() //Used by jboss-as stuff }; } Now to clarify one thing: the deployment does NOT have direct visibility to the Infinispan modules. The application depends on Hibernate Search, and I can't allow Infinispan to leak visibility to the application. So going through the list, none of the classloaders actually have the class from Infinispan Core, because: appClassLoader -> doesn't have access to any Infinispan jar OsgiClassLoader.getInstance() -> OSGi only (right? I'm not too clear on this one.. BTW the OsgiClassLoader.getInstance() is clearly not threadsafe) Util.class.getClassLoader(), // Infinispan classes (not always on TCCL [modular env]) <- (That's the comment in the code) Well NO, this comment was probably correct when Util.class was included in infinispan-core... but now this function just returns the modular classloader of Infinispan Commons! It doesn't expose the JGroupsTransport. I hoped I could configure the ClassLoader, but I've already done that: 1 ClassLoader ispnClassLoader = ParserRegistry.class.getClassLoader(); 2 configurationParser = new ParserRegistry( ispnClassLoader ); // Make sure the Parser is using the module having access to Infinispan Core 3 ConfigurationBuilderHolder builderHolder = configurationParser.parse( is ); // I have to pass a stream so that I can load the resource from the user module 4 patchInfinispanClassLoader( builderHolder ); // AFTER I have the builderHolder, I can use a utility method to correct the classloaders But the above exception is triggered at line *3*, so I have no opportunity to amend any classloader. In summary: defaultTransport() is triggered during XML parsing and ignores any programmatically set classloader. A patch proposal would be to change the following line from the GlobalConfigurationBuilder() constructor: ClassLoader defaultCL = Util.isOSGiContext() ? GlobalConfigurationBuilder.class.getClassLoader() : Thread.currentThread().getContextClassLoader(); to a simple: ClassLoader defaultCL = GlobalConfigurationBuilder.class.getClassLoader(); But the function generating the chain of classloaders probably also needs to be fixed. I suspect it would be much easier if we could agree to: A) accept classloaders on the methods which need them B) stick to the given classloaders, without applying unrequested and surprising overrides My workaround in Search is going to be to set the ContextClassLoader; that resolves my problem, but it's piling up on other workarounds related to Infinispan initialization and classloaders. I'm not opening JIRAs here as I'm not clear on which of these are bugs and how many are intentional... please make that judgement call.
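For reference, the workaround boils down to something like this (the ParserRegistry constructor and parse(InputStream) are the existing API, the rest is plain JDK):

import java.io.InputStream;
import org.infinispan.configuration.parsing.ConfigurationBuilderHolder;
import org.infinispan.configuration.parsing.ParserRegistry;

public class TcclWorkaround {

   // Parses the configuration with the TCCL temporarily switched to the
   // module that can actually see infinispan-core, then restores it.
   public static ConfigurationBuilderHolder parseWithInfinispanTccl(InputStream is) {
      Thread thread = Thread.currentThread();
      ClassLoader previous = thread.getContextClassLoader();
      ClassLoader ispnLoader = ParserRegistry.class.getClassLoader();
      thread.setContextClassLoader(ispnLoader);
      try {
         return new ParserRegistry(ispnLoader).parse(is);
      } finally {
         thread.setContextClassLoader(previous);
      }
   }
}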
Thanks, Sanne On 30 July 2014 20:21, Ion Savin wrote: > Hi Sanne, > > I don't see any changes in the ParserRegistry which would have removed the > behavior you describe (at least looking at the OSGi changes). Can you point > me please to some code which used to work in the past? > > I've found two classes which have some reference to Hibernate in comments > and the factory was removed part of the OSGi changes. Are these perhaps the > changes which you are missing? > > https://github.com/infinispan/infinispan/blob/6.0.x/core/src/main/java/org/infinispan/util/FileLookup.java > > https://github.com/infinispan/infinispan/blob/6.0.x/core/src/main/java/org/infinispan/util/FileLookupFactory.java > > -- > Ion Savin > > > On 07/30/2014 09:17 PM, Mircea Markus wrote: >> >> Ion, Martin - what are your thoughts? >> >> On Jul 29, 2014, at 16:34, Sanne Grinovero wrote: >> >>> All, >>> in Search we wrap the Parser in a decorator which workarounds the >>> classloader limitation. >>> I still think you should fix this, it doesn't matter how/why it was >>> changed. >>> >>> Sanne >>> >>> On 26 May 2014 11:06, Ion Savin wrote: >>>> >>>> Hi Sanne, Galder, >>>> >>>> On 05/23/2014 07:08 PM, Sanne Grinovero wrote: >>>>> >>>>> On 23 May 2014 08:03, Galder Zamarre?o wrote: >>>>>>> >>>>>>> Hey Sanne, >>>>>>> >>>>>>> I?ve looked at ParserRegistry and not sure I see the changes you are >>>>>>> referring to? >>>>>>> >>>>>>>> From what I?ve seen, ParserRegistry has taken class loader in the >>>>>>>> constructor since the start. >>>>> >>>>> Yes, and that was good as we've been using it: it might need >>>>> directions to be pointed at the right modules to load extension >>>>> points. >>>>> >>>>> My problem is not that the constructor takes a ClassLoader, but that >>>>> other options have been removed; essentially in my scenario the module >>>>> containing the extension points does not contain the configuration >>>>> file I want it to load, and the actual classLoader I want the >>>>> CacheManager to use is yet a different one. As explained below, >>>>> assembling a single "catch all" ClassLoader to delegate to all doesn't >>>>> work as some of these actually need to be strictly isolated to prevent >>>>> ambiguities. >>>>> >>>>>>> I suspect you might be referring to classloader related changes as a >>>>>>> result of OSGI integration? >>>>> >>>>> I didn't check but that sounds like a reasonable estimate. >>>> >>>> >>>> I had a look at the OSGi-related changes done for this class and they >>>> don't alter the class interface in any way. The implementation changes >>>> related to FileLookup seem to maintain the same behavior for non-OSGi >>>> contexts also. >>>> >>>> Regards, >>>> Ion Savin >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> Cheers, >> > From dan.berindei at gmail.com Mon Oct 13 03:13:10 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 13 Oct 2014 10:13:10 +0300 Subject: [infinispan-dev] Naming of project modules In-Reply-To: References: Message-ID: +1 On Fri, Oct 10, 2014 at 11:36 PM, Sanne Grinovero wrote: > All, > > I occasionally have to hard-reset my whole workspace, delete the > Eclipse projects, and re-import them, especially when I switch between > branches. 
> > I have lots of projects, and they are all nicely "grouped" as Eclipse > shows projects in alphabetical order, and all projects use a > consistent prefix like "hibernate-ogm-" or "wildfly-", etc.. > > But Infinispan often manages to fool me, as most modules have an > "infinispan-" prefix, but not all of them follow this rule so some get > to hide out of sight (I literally have hundreds of projects in my > primary workspace). > > Could we please make sure they all have a name starting with "infinispan-" > ? > > If you agree I'm happy to send a PR to fix the couple of exotic ones. > > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141013/3dbd8f16/attachment.html From dan.berindei at gmail.com Mon Oct 13 04:45:42 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 13 Oct 2014 11:45:42 +0300 Subject: [infinispan-dev] TopologySafe Map / Reduce In-Reply-To: <20141010154948.GD5052@hibernate.org> References: <5436FB16.3000003@infinispan.org> <20141010154948.GD5052@hibernate.org> Message-ID: On Fri, Oct 10, 2014 at 6:49 PM, Emmanuel Bernard wrote: > When wrestling with the subject, here is what I had in mind. > > The M/R coordinator node sends the M task per segment on the node where > the segment is primary. > What's M? Is it just a shorthand for "map", or is it a new parameter that controls the number of map/combine tasks sent at once? > Each "per-segment" M task is executed and is offered the way to push > intermediary results in a temp cache. > Just to be clear, the user-provided mapper and combiner don't know anything about the intermediary cache (which doesn't have to be temporary, if it's shared by all M/R tasks). They only interact with the Collector interface. The map/combine task on the other hand is our code, and it deals with the intermediary cache directly. > The intermediary results are stored with a composite key [imtermKey-i, > seg-j]. > The M/R coordinator waits for all M tasks to return. If one does not > (timeout, rehash), the following happens: > We can't allow time out map tasks, or they will keep writing to the intermediate cache in parallel with the retried tasks. So the originator has to wait for a response from each node to which it sent a map task. > - delete [intermKey-i, seg-i] (that operation could be handled by the > new per-segment M before the map task is effectively started) > - ship the M task for that segment-i to the new primary owner of > segment-i > > When all M tasks are received the Reduce phase will read all [intermKey-i, > *] > keys and reduce them. > Note that if the reduction phase is itself distributed, we could apply > the same key per segment and shipping split for these. > Sure, we have to retry reduce tasks when the primary owner changes, and it makes sense to retry as little as possible. > > Again the tricky part is to expose the ability to write to intermediary > caches per segment without exposing segments per se as well as let > someone see a concatenated view if intermKey-i from all segments subkeys > during reduction. > Writing to and reading from the intermediate cache is already abstracted from user code (in the Mapper and Reducer interfaces). So we don't need to worry about exposing extra details to the user. > > Thoughts? 
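By the way, the composite intermediary key itself is the easy part; something along these lines (illustration only, not existing code):

import java.io.Serializable;

// Values produced for intermKey while mapping segment seg live under
// [intermKey, seg], so one segment's output can be wiped and regenerated
// without touching the output of the other segments.
final class IntermediateKey implements Serializable {
   private final Object intermKey;
   private final int segment;

   IntermediateKey(Object intermKey, int segment) {
      this.intermKey = intermKey;
      this.segment = segment;
   }

   @Override
   public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof IntermediateKey)) return false;
      IntermediateKey other = (IntermediateKey) o;
      return segment == other.segment && intermKey.equals(other.intermKey);
   }

   @Override
   public int hashCode() {
      return 31 * intermKey.hashCode() + segment;
   }
}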
> > Dan, I did not quite get what alternative approach you wanted to > propose. Care to respin it for a slow brain? :) > I think where we differ is that I don't think user code needs to know about how we store the intermediate values and what we retry, as long as their mappers/combiners/reducers don't have side effects. Otherwise I was thinking on the same lines: send 1 map/combine task for each segment (maybe with a cap on the number of segments being processed at the same time on each node), split the intermediate values per input segment, cancel+retry each map task if the topology changes and the executing node is no longer an owner. If the reduce phase is distributed, run 1 reduce task per segment as well, and cancel+retry the reduce task if the executing node is no longer an owner. I had some ideas about assigning each map/combine phase a UUID and making the intermediate keys [intermKey, seg, mctask] to allow the originator to retry a map/combine task without waiting for the previous one to finish, but I don't think I mentioned that before :) There are also some details that I'm worried about: 1) If the reduce phase is distributed, and the intermediate cache is non-transactional, any topology change in the intermediate cache will require us to retry all the map/combine tasks that were running at the time on any node (even if some nodes did not detect the topology change yet). So it would make sense to limit the number of map/combine tasks that are processed at one time, in order to limit the amount of tasks we retry (OR require the intermediate cache to be transactional). 2) Running a separate map/combine task for each segment is not really an option until we implement the the segment-aware data container and cache stores. Without that change, it will make everything much slower, because of all the extra iterations for each segment. 3) And finally, all this will be overkill when the input cache is small, and the time needed to process the data is comparable to the time needed to send all those extra RPCs. So I'm thinking it might be better to adopt Vladimir's suggestion to retry everything if we detect a topology change in the input and/or intermediate cache at the end of the M/R task, at least in the first phase. Cheers Dan > > Emmanuel > > On Fri 2014-10-10 10:03, Dan Berindei wrote: > > > > I'd rather not expose this to the user. Instead, we could split the > > > > intermediary values for each key by the source segment, and do the > > > > invalidation of the retried segments in our M/R framework (e.g. when > we > > > > detect that the primary owner at the start of the map/combine phase > is > > > > not an owner at all at the end). > > > > > > > > I think we have another problem with the publishing of intermediary > > > > values not being idempotent. The default configuration for the > > > > intermediate cache is non-transactional, and retrying the put(delta) > > > > command after a topology change could add the same intermediate > values > > > > twice. A transactional intermediary cache should be safe, though, > > > > because the tx won't commit on the old owner until the new owner > knows > > > > about the tx. > > > > > > can you elaborate on it? 
> > > > > > > say we have a cache with numOwners=2, owners(k) = [A, B] > > C will become the primary owner of k, but for now owners(k) = [A, B, C] > > O sends put(delta) to A (the primary) > > A sends put(delta) to B, C > > B sees a topology change (owners(k) = [C, B]), doesn't apply the delta > and > > replies with an OutdatedTopologyException > > C applies the delta > > A resends put(delta) to C (new primary) > > C sends put(delta) to B, applies the delta again > > > > I think it could be solved with versions, I just wanted to point out that > > we don't do that now. > > > > > > > > > > anyway, I think the retry mechanism should solve it. If we detect a > > > topology change (during the iteration of segment _i_) and the segment > > > _i_ is moved, then we can cancel the iteration, remove all the > > > intermediate values generated in segment _i_ and restart (on the > primary > > > owner). > > > > > > > The problem is that the intermediate keys aren't in the same segment: we > > want the reduce phase to access only keys local to the reducing node, and > > keys in different input segments can yield values for the same > intermediate > > key. So like you say, we'd have to retry on every topology change in the > > intermediary cache, not just the ones affecting segment _i_. > > > > There's another complication: in the scenario above, O may only get the > > topology update with owners(k) = [C, B] after the map/combine phase > > completed. So the originator of the M/R job would have to watch for > > topology changes seen by any node, and invalidate/retry any input > segments > > that could have been affected. All that without slowing down the > > no-topology-change case too much... > > > > > > > > > > > > > > > > > > > But before getting ahead of ourselves, what do you thing of the > > > general idea? Even without retry framework, this approach would be more > > > stable than our current per node approach during topology changes and > > > improve dependability. > > > > > > > > Doing it solely based on segment would remove the possibility of > > > > having duplicates. However without a mechanism to send a new > request > > > > on rehash it would be possible to only find a subset of values > (if a > > > > segment is removed while iterating on it). > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141013/c244ab87/attachment-0001.html From dan.berindei at gmail.com Mon Oct 13 06:14:17 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 13 Oct 2014 13:14:17 +0300 Subject: [infinispan-dev] About size() In-Reply-To: <82DC8B96-DAE7-42D0-976A-1CEB1EF7A212@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <5435560E.2030206@redhat.com> <5437E724.4040705@redhat.com> <5437E7DA.6060101@redhat.com> <5437EAB3.8030202@redhat.com> <6F085B91-2882-4636-BDC4-1E47815B05D0@redhat.com> <82DC8B96-DAE7-42D0-976A-1CEB1EF7A212@redhat.com> Message-ID: On Fri, Oct 10, 2014 at 8:51 PM, Mircea Markus wrote: > On Oct 10, 2014, at 15:25, Dan Berindei wrote: > > > On Fri, Oct 10, 2014 at 5:20 PM, Mircea Markus > wrote: > > > > On Oct 10, 2014, at 15:18, Radim Vansa wrote: > > > > > That we should expose that as one method, not forcing people to > > > implement the sum() themselves. > > > > Hmm, isn't the method you mention cache.size() ? :-) > > > > Nope, because we decided to make cache.size() precise-but-slow :) > > It's not possible to make it precise unless we provide snapshot isolation > /MVCC support. > IMO the formula Tristan provides is a good enough approximation of the size > of the data. And definitely way better than what we currently have. > (Looking at CHM.size() they offer an "accurate" size of the map by > counting it in a loop and making sure that the size is reproducible. I > don't think that's accurate in the general case, though, as you might count > intermediate sizes in that loop). > CHM.size() actually tracks the modCount of each segment and locks all the segments in the final retry, so the result should be accurate. CHMV8 doesn't do that; instead it keeps a striped counter and doesn't try very hard to get a reproducible sum from the counter cells. I thought we had concluded size() should be stable (and accurate) when there is no write activity, and the way to implement that is with the entry iterator. The result of Tristan's formula can change without any write activity on the cache, just because there is a state transfer in progress. For monitoring tools I'd rather have separate methods entriesInMemory() -> sum(dataContainer.size()) and entriesInStores() -> sum(cacheStore.size()) that are allowed to rise and fall as nodes join and leave the cache. Cheers Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141013/217c8164/attachment.html From sanne at infinispan.org Mon Oct 13 07:23:05 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 13 Oct 2014 12:23:05 +0100 Subject: [infinispan-dev] On ParserRegistry and classloaders In-Reply-To: References: <9A22CC58-5B20-4D6B-BA6B-B4A23493979F@redhat.com> <53831239.30601@redhat.com> <114950B5-AA9F-485C-8E3C-AC36299974AB@redhat.com> <53D945B4.7070508@redhat.com> Message-ID: So it seems I can work around most of the below issues by temporarily setting the ContextClassLoader, but then there is an additional problem: since the JGroups configuration is referenced by resource name in the Infinispan configuration and can't be passed as an InputStream, Infinispan is unable to load the correct resource.
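To illustrate: programmatically, the transport configuration only accepts a resource name, which Infinispan then resolves against classloaders of its own choosing (the file name below is just a placeholder):

import org.infinispan.configuration.global.GlobalConfiguration;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;

public class TransportConfigExample {
   public static GlobalConfiguration build() {
      GlobalConfigurationBuilder global = new GlobalConfigurationBuilder();
      // The JGroups stack can only be referenced by resource name; there is
      // no variant taking an InputStream resolved from the caller's module.
      global.transport()
            .defaultTransport()
            .addProperty("configurationFile", "my-jgroups-udp.xml");
      return global.build();
   }
}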
I'd consider this quite bad for modular usage, and the following issue makes it more confusing as the error message reported to the user reports instead an NPE regarding the Transaction Table: https://issues.jboss.org/browse/ISPN-2145 I hope that improving the error message at least could be prioritized for version 7.0 Final. Sanne On 11 October 2014 17:54, Sanne Grinovero wrote: > Hi all, > sorry it took me quite some time to return on this, but it's still > troublesome and quite critical to allow using Infinispan as a module. > > I'm again trying to use Infinispan's JBoss Modules as a (library) for > deployed application, and it fails because of: > > Caused by: java.lang.ClassNotFoundException: > org.infinispan.remoting.transport.jgroups.JGroupsTransport from > [Module \"deployment.ModuleMemberRegistrationIT.war:main\" from > Service Module Loader]"}} > > Debugging the application server, the stack is pointing to this method : > > public TransportConfigurationBuilder defaultTransport() { > Transport transport = Util.getInstance(DEFAULT_TRANSPORT, > this.getGlobalConfig().getClassLoader()); <------- Failure here > transport(transport); > return this; > } > > The getClassLoader() function returns a reference to the application > deployment (as you can see in the error above). > This code makes sense I think, but then inspecting in the > "Util.getInstance" method we see that it's not strictly using the > classloaders we pass, but it's going to look in a specific sequence as > defined by this function: > > public static ClassLoader[] getClassLoaders(ClassLoader appClassLoader) { > return new ClassLoader[] { > appClassLoader, // User defined classes > OsgiClassLoader.getInstance(), // OSGi bundle context > needs to be on top of TCCL, system CL, etc. > Util.class.getClassLoader(), // Infinispan classes (not > always on TCCL [modular env]) > ClassLoader.getSystemClassLoader(), // Used when load time > instrumentation is in effect > Thread.currentThread().getContextClassLoader() //Used by > jboss-as stuff > }; > } > > > Now to clarify one thing: the deployment does NOT have direct > visibility to the Infinispan modules. The application is depending on > Hibernate Search, and I can't allow Infinispan to leak visibility to > the application. > So going through the list, none of the classloaders actually have the > class from Infinispan Core because: > > appClassLoader -> doesn't have access to any Infinispan jar > > OsgiClassLoader.getInstance() -> OSGi only (right? I'm not too clear > on this one.. BTW the OsgiClassLoader.getInstance() is clearly not > threadsafe) > > Util.class.getClassLoader(), // Infinispan classes (not always on > TCCL [modular env]) <- (That's the comment in the code) > > Well NO, this comment was probably correct when Util.class was > included into infinispan-core.. but now this function just returns the > modular classloader of Infinispan Commons ! It doesn't expose the > JGroupsTransport. 
> > I hoped I could configure the ClassLoader, but I've already done that: > > 1 ClassLoader ispnClassLoadr = ParserRegistry.class.getClassLoader(); > 2 configurationParser = new ParserRegistry( ispnClassLoadr ); // Make > sure the Parser is using the module having access to Infinispan Core > 3 ConfigurationBuilderHolder builderHolder = > configurationParser.parse( is ); // I have to pass a stream so that I > can load the resource from the user module > 4 patchInfinispanClassLoader( builderHolder ); // AFTER I have the > builderHolder, I can use an utility method to correct the classloaders > > But the above exception is triggered at line *3* so I have no > opportunity to amend any classloader. > > > In summary: the defaultTransport() is triggered during XML parsing and > it's going to ignore any programmatically set classloader. > > A patch proposal would be to change the following line from the > GlobalConfigurationBuilder() constructor: > > ClassLoader defaultCL = Util.isOSGiContext() ? > GlobalConfigurationBuilder.class.getClassLoader() : > Thread.currentThread().getContextClassLoader(); > > to a simple: > > ClassLoader defaultCL = GlobalConfigurationBuilder.class.getClassLoader(); > > But the function generating the chain of classloaders should probably > also need to be fixed.. > I suspect it would be much easier if we could agree to: > A) accept classloaders on the methods which need it > B) Stick to the given classloaders without applying unrequested and > surprising overrides > > My workaround in Search is going to be to set the ContextClassloader; > that resolved my problem but it's piling up on other workarounds > related to Infinispan initialization and classloaders. I'm not opening > JIRA's here as I'm not clear on which ones are bugs, and how many of > these are intentional.. please take that judgement task. > > Thanks, > Sanne > > > > > On 30 July 2014 20:21, Ion Savin wrote: >> Hi Sanne, >> >> I don't see any changes in the ParserRegistry which would have removed the >> behavior you describe (at least looking at the OSGi changes). Can you point >> me please to some code which used to work in the past? >> >> I've found two classes which have some reference to Hibernate in comments >> and the factory was removed part of the OSGi changes. Are these perhaps the >> changes which you are missing? >> >> https://github.com/infinispan/infinispan/blob/6.0.x/core/src/main/java/org/infinispan/util/FileLookup.java >> >> https://github.com/infinispan/infinispan/blob/6.0.x/core/src/main/java/org/infinispan/util/FileLookupFactory.java >> >> -- >> Ion Savin >> >> >> On 07/30/2014 09:17 PM, Mircea Markus wrote: >>> >>> Ion, Martin - what are your thoughts? >>> >>> On Jul 29, 2014, at 16:34, Sanne Grinovero wrote: >>> >>>> All, >>>> in Search we wrap the Parser in a decorator which workarounds the >>>> classloader limitation. >>>> I still think you should fix this, it doesn't matter how/why it was >>>> changed. >>>> >>>> Sanne >>>> >>>> On 26 May 2014 11:06, Ion Savin wrote: >>>>> >>>>> Hi Sanne, Galder, >>>>> >>>>> On 05/23/2014 07:08 PM, Sanne Grinovero wrote: >>>>>> >>>>>> On 23 May 2014 08:03, Galder Zamarre?o wrote: >>>>>>>> >>>>>>>> Hey Sanne, >>>>>>>> >>>>>>>> I?ve looked at ParserRegistry and not sure I see the changes you are >>>>>>>> referring to? >>>>>>>> >>>>>>>>> From what I?ve seen, ParserRegistry has taken class loader in the >>>>>>>>> constructor since the start. 
>>>>>> >>>>>> Yes, and that was good as we've been using it: it might need >>>>>> directions to be pointed at the right modules to load extension >>>>>> points. >>>>>> >>>>>> My problem is not that the constructor takes a ClassLoader, but that >>>>>> other options have been removed; essentially in my scenario the module >>>>>> containing the extension points does not contain the configuration >>>>>> file I want it to load, and the actual classLoader I want the >>>>>> CacheManager to use is yet a different one. As explained below, >>>>>> assembling a single "catch all" ClassLoader to delegate to all doesn't >>>>>> work as some of these actually need to be strictly isolated to prevent >>>>>> ambiguities. >>>>>> >>>>>>>> I suspect you might be referring to classloader related changes as a >>>>>>>> result of OSGI integration? >>>>>> >>>>>> I didn't check but that sounds like a reasonable estimate. >>>>> >>>>> >>>>> I had a look at the OSGi-related changes done for this class and they >>>>> don't alter the class interface in any way. The implementation changes >>>>> related to FileLookup seem to maintain the same behavior for non-OSGi >>>>> contexts also. >>>>> >>>>> Regards, >>>>> Ion Savin >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> Cheers, >>> >> From isavin at redhat.com Mon Oct 13 08:40:43 2014 From: isavin at redhat.com (Ion Savin) Date: Mon, 13 Oct 2014 15:40:43 +0300 Subject: [infinispan-dev] On ParserRegistry and classloaders In-Reply-To: References: <9A22CC58-5B20-4D6B-BA6B-B4A23493979F@redhat.com> <53831239.30601@redhat.com> <114950B5-AA9F-485C-8E3C-AC36299974AB@redhat.com> <53D945B4.7070508@redhat.com> Message-ID: <543BC84B.7090807@redhat.com> Hi Sanne, > OsgiClassLoader.getInstance() -> OSGi only (right? I'm not too clear Yes, this CL is OSGi-only. Won't load anything outside an OSGi container. > on this one.. BTW the OsgiClassLoader.getInstance() is clearly not > threadsafe) Thanks for catching this! Opened: https://issues.jboss.org/browse/ISPN-4829 -- Ion Savin From dan.berindei at gmail.com Mon Oct 13 09:06:30 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 13 Oct 2014 16:06:30 +0300 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: On Fri, Oct 10, 2014 at 9:01 PM, Mircea Markus wrote: > > On Oct 10, 2014, at 17:30, William Burns wrote: > > >>>>> Also we didn't really talk about the fact that these methods would > >>>>> ignore ongoing transactions and if that is a concern or not. > >>>>> > >>>> > >>>> It might be a concern for the Hibernate 2LC impl, it was their TCK > that > >>>> prompted the last round of discussions about clear(). > >>> > >>> Although I wonder how much these methods are even used since they only > >>> work for Local, Replication or Invalidation caches in their current > >>> state (and didn't even use loaders until 6.0). 
> >> > >> > >> There is some more information about the test in the mailing list > discussion > >> [1] > >> There's also a JIRA for clear() [2] > >> > >> I think 2LC almost never uses distribution, so size() being local-only > >> didn't matter, but making it non-tx could cause problems - at least for > that > >> particular test. > > > > I had toyed around with the following idea before, but I never thought > > of it in the scope of the size method solely, but I have a solution > > that would work mostly for transactional caches. Essentially the size > > method would always operate in a READ_COMMITTED like state, using > > REPEATABLE_READ doesn't seem feasible since we can't keep all the > > contents in memory. Essentially the iterator would be ran and for > > each key that is found it checks the context to see if it is there. > > If the context entry is marked as removed it doesn't count the key, if > > the key is there it marks the key as found and counts it, and if it is > > not found it counts it. Then after iteration it finds all the keys in > > the context that were not found and also adds them to the count. This > > way it doesn't need to store additional memory (besides iteration > > costs) as all the context information is in memory. > > sounds good to me. > Mircea, you have to decide whether you want the precise estimation using the entry iterator or the loose estimation using dataContainer.size() :) I guess we can't make size() read everything into the invocation context, so READ_COMMITTED is all we can provide if we want to keep size() transactional. Maybe we don't really need it though... Will, could you investigate the failing test that started the clear() thread [1] to see if it really needs size() to be transactional? > > > > My original thought was to also make the EntryIterator transactional > > in the same way which also means the keySet, entrySet and values > > methods could do the same things. The main reason stumbling block I > > had was the fact that the iterator and various collections returned > > could be used outside of the ongoing transaction which didn't seem to > > make much sense to me. But maybe these should be changed to be more > > like backing maps which HashMap, ConcurrentHashMap etc use for their > > methods, where instead it would pick up the transaction if there is > > one in the current thread and if there is no transaction just start an > > implicit one. > > or if they are outside of a transaction to deny progress > I don't think it's fair to require an explicit transaction for every entrySet(). It should be possible to start an iteration without a transaction, and only to invalidate an iteration started from an explicit transaction the moment the transaction is committed/rolled back (although it would complicate rules a bit). And what happens if the user writes to the cache while it's iterating through the cache-backed collection? Should the user see the new entry in the iteration, or not? I don't think you can figure out at the end of the iteration which keys were included without keeping all the keys on the originator. > > > This however was a big change from how these > > collections work currently in that they are in memory copies only. > > > > What do you guys think? > > I think that keeping track of the context entries is a better way of > iterating so +1. As you mentioned, we should also make it clear that RC > semantic applies. 
> > Cheers, > -- > Mircea Markus > Infinispan lead (www.infinispan.org) > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141013/4116000a/attachment-0001.html From mmarkus at redhat.com Mon Oct 13 11:55:02 2014 From: mmarkus at redhat.com (Mircea Markus) Date: Mon, 13 Oct 2014 16:55:02 +0100 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> Message-ID: <8BAD2C28-ADC3-4D1E-8B5D-F0D17B35C83C@redhat.com> On Oct 13, 2014, at 14:06, Dan Berindei wrote: > > > On Fri, Oct 10, 2014 at 9:01 PM, Mircea Markus wrote: > > On Oct 10, 2014, at 17:30, William Burns wrote: > > >>>>> Also we didn't really talk about the fact that these methods would > >>>>> ignore ongoing transactions and if that is a concern or not. > >>>>> > >>>> > >>>> It might be a concern for the Hibernate 2LC impl, it was their TCK that > >>>> prompted the last round of discussions about clear(). > >>> > >>> Although I wonder how much these methods are even used since they only > >>> work for Local, Replication or Invalidation caches in their current > >>> state (and didn't even use loaders until 6.0). > >> > >> > >> There is some more information about the test in the mailing list discussion > >> [1] > >> There's also a JIRA for clear() [2] > >> > >> I think 2LC almost never uses distribution, so size() being local-only > >> didn't matter, but making it non-tx could cause problems - at least for that > >> particular test. > > > > I had toyed around with the following idea before, but I never thought > > of it in the scope of the size method solely, but I have a solution > > that would work mostly for transactional caches. Essentially the size > > method would always operate in a READ_COMMITTED like state, using > > REPEATABLE_READ doesn't seem feasible since we can't keep all the > > contents in memory. Essentially the iterator would be ran and for > > each key that is found it checks the context to see if it is there. > > If the context entry is marked as removed it doesn't count the key, if > > the key is there it marks the key as found and counts it, and if it is > > not found it counts it. Then after iteration it finds all the keys in > > the context that were not found and also adds them to the count. This > > way it doesn't need to store additional memory (besides iteration > > costs) as all the context information is in memory. > > sounds good to me. > > Mircea, you have to decide whether you want the precise estimation using the entry iterator or the loose estimation using dataContainer.size() :) > > I guess we can't make size() read everything into the invocation context, so READ_COMMITTED is all we can provide if we want to keep size() transactional. Maybe we don't really need it though... Will, could you investigate the failing test that started the clear() thread [1] to see if it really needs size() to be transactional? I'm okay with both approaches TBH, both are much better than what we currently have. The accurate one is more costly but seems to be the solution of choice so let's go for it. 
> > > > > > My original thought was to also make the EntryIterator transactional
> > in the same way which also means the keySet, entrySet and values
> > methods could do the same things. The main reason stumbling block I
> > had was the fact that the iterator and various collections returned
> > could be used outside of the ongoing transaction which didn't seem to
> > make much sense to me. But maybe these should be changed to be more
> > like backing maps which HashMap, ConcurrentHashMap etc use for their
> > methods, where instead it would pick up the transaction if there is
> > one in the current thread and if there is no transaction just start an
> > implicit one.
> > or if they are outside of a transaction to deny progress
> > I don't think it's fair to require an explicit transaction for every entrySet(). It should be possible to start an iteration without a transaction, and only to invalidate an iteration started from an explicit transaction the moment the transaction is committed/rolled back (although it would complicate rules a bit).
> > And what happens if the user writes to the cache while it's iterating through the cache-backed collection? Should the user see the new entry in the iteration, or not? I don't think you can figure out at the end of the iteration which keys were included without keeping all the keys on the originator.
If the modification is done outside the iterator one might expect a ConcurrentModificationException, as is the case with some JDK iterators.
> >
> >
> > This however was a big change from how these
> > collections work currently in that they are in memory copies only.
> >
> > What do you guys think?
> I think that keeping track of the context entries is a better way of iterating so +1. As you mentioned, we should also make it clear that RC semantic applies.
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

From rvansa at redhat.com  Tue Oct 14 02:55:49 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Tue, 14 Oct 2014 08:55:49 +0200
Subject: [infinispan-dev] About size()
In-Reply-To: <8BAD2C28-ADC3-4D1E-8B5D-F0D17B35C83C@redhat.com>
References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <8BAD2C28-ADC3-4D1E-8B5D-F0D17B35C83C@redhat.com>
Message-ID: <543CC8F5.9000403@redhat.com>

On 10/13/2014 05:55 PM, Mircea Markus wrote:
> On Oct 13, 2014, at 14:06, Dan Berindei wrote:
>
>>
>> On Fri, Oct 10, 2014 at 9:01 PM, Mircea Markus wrote:
>>
>> On Oct 10, 2014, at 17:30, William Burns wrote:
>>
>>>>>>> Also we didn't really talk about the fact that these methods would
>>>>>>> ignore ongoing transactions and if that is a concern or not.
>>>>>>>
>>>>>> It might be a concern for the Hibernate 2LC impl, it was their TCK that
>>>>>> prompted the last round of discussions about clear().
>>>>> Although I wonder how much these methods are even used since they only
>>>>> work for Local, Replication or Invalidation caches in their current
>>>>> state (and didn't even use loaders until 6.0).
>>>> >>>> There is some more information about the test in the mailing list discussion >>>> [1] >>>> There's also a JIRA for clear() [2] >>>> >>>> I think 2LC almost never uses distribution, so size() being local-only >>>> didn't matter, but making it non-tx could cause problems - at least for that >>>> particular test. >>> I had toyed around with the following idea before, but I never thought >>> of it in the scope of the size method solely, but I have a solution >>> that would work mostly for transactional caches. Essentially the size >>> method would always operate in a READ_COMMITTED like state, using >>> REPEATABLE_READ doesn't seem feasible since we can't keep all the >>> contents in memory. Essentially the iterator would be ran and for >>> each key that is found it checks the context to see if it is there. >>> If the context entry is marked as removed it doesn't count the key, if >>> the key is there it marks the key as found and counts it, and if it is >>> not found it counts it. Then after iteration it finds all the keys in >>> the context that were not found and also adds them to the count. This >>> way it doesn't need to store additional memory (besides iteration >>> costs) as all the context information is in memory. >> sounds good to me. >> >> Mircea, you have to decide whether you want the precise estimation using the entry iterator or the loose estimation using dataContainer.size() :) >> >> I guess we can't make size() read everything into the invocation context, so READ_COMMITTED is all we can provide if we want to keep size() transactional. Maybe we don't really need it though... Will, could you investigate the failing test that started the clear() thread [1] to see if it really needs size() to be transactional? > I'm okay with both approaches TBH, both are much better than what we currently have. The accurate one is more costly but seems to be the solution of choice so let's go for it. > >> >>> My original thought was to also make the EntryIterator transactional >>> in the same way which also means the keySet, entrySet and values >>> methods could do the same things. The main reason stumbling block I >>> had was the fact that the iterator and various collections returned >>> could be used outside of the ongoing transaction which didn't seem to >>> make much sense to me. But maybe these should be changed to be more >>> like backing maps which HashMap, ConcurrentHashMap etc use for their >>> methods, where instead it would pick up the transaction if there is >>> one in the current thread and if there is no transaction just start an >>> implicit one. >> or if they are outside of a transaction to deny progress >> >> I don't think it's fair to require an explicit transaction for every entrySet(). It should be possible to start an iteration without a transaction, and only to invalidate an iteration started from an explicit transaction the moment the transaction is committed/rolled back (although it would complicate rules a bit). >> >> And what happens if the user writes to the cache while it's iterating through the cache-backed collection? Should the user see the new entry in the iteration, or not? I don't think you can figure out at the end of the iteration which keys were included without keeping all the keys on the originator. > If the modification is done outside the iterator one might expect an ConcurrentModificationException, as it is the case with some JDK iterators. -1 We're aiming at high performance cache with a lot of changes while the operation is executed. 
This way, the iteration would never complete, unless you explicitly switch the cache to read only mode (either through Infinispan operation or in application). I think that adding isCacheModified() or isTopologyChanged() to the iterator would make sense, if that's not too complicated to implement. Though, if we want non-disturbed iteration, snapshot isolation is the only answer. Radim > >> >> >>> This however was a big change from how these >>> collections work currently in that they are in memory copies only. >>> >>> What do you guys think? >> I think that keeping track of the context entries is a better way of iterating so +1. As you mentioned, we should also make it clear that RC semantic applies. >> >> Cheers, >> -- >> Mircea Markus >> Infinispan lead (www.infinispan.org) >> >> >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > Cheers, -- Radim Vansa JBoss DataGrid QA From dan.berindei at gmail.com Tue Oct 14 03:33:14 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 14 Oct 2014 10:33:14 +0300 Subject: [infinispan-dev] About size() In-Reply-To: <543CC8F5.9000403@redhat.com> References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <8BAD2C28-ADC3-4D1E-8B5D-F0D17B35C83C@redhat.com> <543CC8F5.9000403@redhat.com> Message-ID: On Tue, Oct 14, 2014 at 9:55 AM, Radim Vansa wrote: > On 10/13/2014 05:55 PM, Mircea Markus wrote: > > On Oct 13, 2014, at 14:06, Dan Berindei wrote: > > > >> > >> On Fri, Oct 10, 2014 at 9:01 PM, Mircea Markus > wrote: > >> > >> On Oct 10, 2014, at 17:30, William Burns wrote: > >> > >>>>>>> Also we didn't really talk about the fact that these methods would > >>>>>>> ignore ongoing transactions and if that is a concern or not. > >>>>>>> > >>>>>> It might be a concern for the Hibernate 2LC impl, it was their TCK > that > >>>>>> prompted the last round of discussions about clear(). > >>>>> Although I wonder how much these methods are even used since they > only > >>>>> work for Local, Replication or Invalidation caches in their current > >>>>> state (and didn't even use loaders until 6.0). > >>>> > >>>> There is some more information about the test in the mailing list > discussion > >>>> [1] > >>>> There's also a JIRA for clear() [2] > >>>> > >>>> I think 2LC almost never uses distribution, so size() being local-only > >>>> didn't matter, but making it non-tx could cause problems - at least > for that > >>>> particular test. > >>> I had toyed around with the following idea before, but I never thought > >>> of it in the scope of the size method solely, but I have a solution > >>> that would work mostly for transactional caches. Essentially the size > >>> method would always operate in a READ_COMMITTED like state, using > >>> REPEATABLE_READ doesn't seem feasible since we can't keep all the > >>> contents in memory. Essentially the iterator would be ran and for > >>> each key that is found it checks the context to see if it is there. > >>> If the context entry is marked as removed it doesn't count the key, if > >>> the key is there it marks the key as found and counts it, and if it is > >>> not found it counts it. 
Then after iteration it finds all the keys in > >>> the context that were not found and also adds them to the count. This > >>> way it doesn't need to store additional memory (besides iteration > >>> costs) as all the context information is in memory. > >> sounds good to me. > >> > >> Mircea, you have to decide whether you want the precise estimation > using the entry iterator or the loose estimation using dataContainer.size() > :) > >> > >> I guess we can't make size() read everything into the invocation > context, so READ_COMMITTED is all we can provide if we want to keep size() > transactional. Maybe we don't really need it though... Will, could you > investigate the failing test that started the clear() thread [1] to see if > it really needs size() to be transactional? > > I'm okay with both approaches TBH, both are much better than what we > currently have. The accurate one is more costly but seems to be the > solution of choice so let's go for it. > > > >> > >>> My original thought was to also make the EntryIterator transactional > >>> in the same way which also means the keySet, entrySet and values > >>> methods could do the same things. The main reason stumbling block I > >>> had was the fact that the iterator and various collections returned > >>> could be used outside of the ongoing transaction which didn't seem to > >>> make much sense to me. But maybe these should be changed to be more > >>> like backing maps which HashMap, ConcurrentHashMap etc use for their > >>> methods, where instead it would pick up the transaction if there is > >>> one in the current thread and if there is no transaction just start an > >>> implicit one. > >> or if they are outside of a transaction to deny progress > >> > >> I don't think it's fair to require an explicit transaction for every > entrySet(). It should be possible to start an iteration without a > transaction, and only to invalidate an iteration started from an explicit > transaction the moment the transaction is committed/rolled back (although > it would complicate rules a bit). > >> > >> And what happens if the user writes to the cache while it's iterating > through the cache-backed collection? Should the user see the new entry in > the iteration, or not? I don't think you can figure out at the end of the > iteration which keys were included without keeping all the keys on the > originator. > > If the modification is done outside the iterator one might expect an > ConcurrentModificationException, as it is the case with some JDK iterators. > > -1 We're aiming at high performance cache with a lot of changes while > the operation is executed. This way, the iteration would never complete, > unless you explicitly switch the cache to read only mode (either through > Infinispan operation or in application). > I was referring only to changes made in the same transaction, not changes made by other transactions. But you make a good point, we can't throw a ConcurrentModificationException if the user for writes in the same transaction and ignore other transactions. > > I think that adding isCacheModified() or isTopologyChanged() to the > iterator would make sense, if that's not too complicated to implement. > Though, if we want non-disturbed iteration, snapshot isolation is the > only answer. > isCacheModified() is probably too costly to implement. isTopologyChanged() could be done, but I'm not sure what's the use case, as the entry iterator abstracts topology changes from the user. 
I don't think we want undisturbed iteration, at least not at this point. Personally, I just want to have a good story on why the iteration behaves in a certain way. By my standards, explaining that changes made by other transactions may completely/partially/not at all be visible in the iteration is fine, explaining that changes made by the same transaction may or may not be visible is not. Cheers Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141014/f69e85b5/attachment-0001.html From mudokonman at gmail.com Tue Oct 14 08:11:42 2014 From: mudokonman at gmail.com (William Burns) Date: Tue, 14 Oct 2014 08:11:42 -0400 Subject: [infinispan-dev] About size() In-Reply-To: References: <542E5E92.7060504@redhat.com> <3EA0122E-8293-49EB-8CB7-F67FA2E58532@redhat.com> <593A31AE-2C9C-4B90-9048-0EDBADCA1ADF@redhat.com> <8BAD2C28-ADC3-4D1E-8B5D-F0D17B35C83C@redhat.com> <543CC8F5.9000403@redhat.com> Message-ID: On Tue, Oct 14, 2014 at 3:33 AM, Dan Berindei wrote: > > > On Tue, Oct 14, 2014 at 9:55 AM, Radim Vansa wrote: >> >> On 10/13/2014 05:55 PM, Mircea Markus wrote: >> > On Oct 13, 2014, at 14:06, Dan Berindei wrote: >> > >> >> >> >> On Fri, Oct 10, 2014 at 9:01 PM, Mircea Markus >> >> wrote: >> >> >> >> On Oct 10, 2014, at 17:30, William Burns wrote: >> >> >> >>>>>>> Also we didn't really talk about the fact that these methods would >> >>>>>>> ignore ongoing transactions and if that is a concern or not. >> >>>>>>> >> >>>>>> It might be a concern for the Hibernate 2LC impl, it was their TCK >> >>>>>> that >> >>>>>> prompted the last round of discussions about clear(). >> >>>>> Although I wonder how much these methods are even used since they >> >>>>> only >> >>>>> work for Local, Replication or Invalidation caches in their current >> >>>>> state (and didn't even use loaders until 6.0). >> >>>> >> >>>> There is some more information about the test in the mailing list >> >>>> discussion >> >>>> [1] >> >>>> There's also a JIRA for clear() [2] >> >>>> >> >>>> I think 2LC almost never uses distribution, so size() being >> >>>> local-only >> >>>> didn't matter, but making it non-tx could cause problems - at least >> >>>> for that >> >>>> particular test. >> >>> I had toyed around with the following idea before, but I never thought >> >>> of it in the scope of the size method solely, but I have a solution >> >>> that would work mostly for transactional caches. Essentially the size >> >>> method would always operate in a READ_COMMITTED like state, using >> >>> REPEATABLE_READ doesn't seem feasible since we can't keep all the >> >>> contents in memory. Essentially the iterator would be ran and for >> >>> each key that is found it checks the context to see if it is there. >> >>> If the context entry is marked as removed it doesn't count the key, if >> >>> the key is there it marks the key as found and counts it, and if it is >> >>> not found it counts it. Then after iteration it finds all the keys in >> >>> the context that were not found and also adds them to the count. This >> >>> way it doesn't need to store additional memory (besides iteration >> >>> costs) as all the context information is in memory. >> >> sounds good to me. 
>> >> >> >> Mircea, you have to decide whether you want the precise estimation >> >> using the entry iterator or the loose estimation using dataContainer.size() >> >> :) >> >> >> >> I guess we can't make size() read everything into the invocation >> >> context, so READ_COMMITTED is all we can provide if we want to keep size() >> >> transactional. Maybe we don't really need it though... Will, could you >> >> investigate the failing test that started the clear() thread [1] to see if >> >> it really needs size() to be transactional? >> > I'm okay with both approaches TBH, both are much better than what we >> > currently have. The accurate one is more costly but seems to be the solution >> > of choice so let's go for it. >> > >> >> >> >>> My original thought was to also make the EntryIterator transactional >> >>> in the same way which also means the keySet, entrySet and values >> >>> methods could do the same things. The main reason stumbling block I >> >>> had was the fact that the iterator and various collections returned >> >>> could be used outside of the ongoing transaction which didn't seem to >> >>> make much sense to me. But maybe these should be changed to be more >> >>> like backing maps which HashMap, ConcurrentHashMap etc use for their >> >>> methods, where instead it would pick up the transaction if there is >> >>> one in the current thread and if there is no transaction just start an >> >>> implicit one. >> >> or if they are outside of a transaction to deny progress >> >> >> >> I don't think it's fair to require an explicit transaction for every >> >> entrySet(). It should be possible to start an iteration without a >> >> transaction, and only to invalidate an iteration started from an explicit >> >> transaction the moment the transaction is committed/rolled back (although it >> >> would complicate rules a bit). >> >> >> >> And what happens if the user writes to the cache while it's iterating >> >> through the cache-backed collection? Should the user see the new entry in >> >> the iteration, or not? I don't think you can figure out at the end of the >> >> iteration which keys were included without keeping all the keys on the >> >> originator. >> > If the modification is done outside the iterator one might expect an >> > ConcurrentModificationException, as it is the case with some JDK iterators. >> >> -1 We're aiming at high performance cache with a lot of changes while >> the operation is executed. This way, the iteration would never complete, >> unless you explicitly switch the cache to read only mode (either through >> Infinispan operation or in application). > > > I was referring only to changes made in the same transaction, not changes > made by other transactions. But you make a good point, we can't throw a > ConcurrentModificationException if the user for writes in the same > transaction and ignore other transactions. > >> >> >> I think that adding isCacheModified() or isTopologyChanged() to the >> iterator would make sense, if that's not too complicated to implement. >> Though, if we want non-disturbed iteration, snapshot isolation is the >> only answer. > > > isCacheModified() is probably too costly to implement. > isTopologyChanged() could be done, but I'm not sure what's the use case, as > the entry iterator abstracts topology changes from the user. > > I don't think we want undisturbed iteration, at least not at this point. > Personally, I just want to have a good story on why the iteration behaves in > a certain way. 
By my standards, explaining that changes made by other
> transactions may completely/partially/not at all be visible in the iteration
> is fine, explaining that changes made by the same transaction may or may not
> be visible is not.

Sorry I didn't respond earlier. But these commands would check the transaction context before returning the value to the user. This requires user interaction for this to occur, so we can guarantee they will always see their updated value if they have one in the transaction (even if one is run in between the iteration). The big thing is whether or not another transaction's update is seen when we don't have an update for that key (which depends on whether the segment is completed before the update or not).

There should be no need to tell if the cache was modified or the topology changed (the former would have a very high performance impact to detect with a DIST cache).

To be honest the wrapper classes would just be delegating to the Cache for the vast majority of operations (get, remove, contains etc.). It would only be when someone specifically uses the iterator on the various collections that the distributed iterator would even be used. This way the various collections would be backing maps, like HashMap and ConcurrentHashMap have, except that they have to check the transaction as well. The values collection would be extremely limited in its supported methods though, pretty much only to iteration and size.

>
> Cheers
> Dan
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From slaskawi at redhat.com  Tue Oct 14 09:23:40 2014
From: slaskawi at redhat.com (=?UTF-8?B?U2ViYXN0aWFuIMWBYXNrYXdpZWM=?=)
Date: Tue, 14 Oct 2014 15:23:40 +0200
Subject: [infinispan-dev] Multiple Spring modules
Message-ID: <543D23DC.3050206@redhat.com>

Hey!

Currently I'm working on Spring 3 and 4 support and because these versions are not compatible (in terms of Cache API), we probably would need to have 2 modules for Spring.

Now the question is - how to maintain them? Here are the options which come to my mind:

1. Create copy of Spring 3 module and put everything into newly created Spring 4, then update versions and implement new methods in Cache interface.
Pros:
- 1 OSGi bundle - transparent upgrade - just replace spring bundle
- Easy to maintain Spring 4 only fixes
Cons:
- Code duplication
2. Extract common part and create 2 modules which depend on it - very hard because Cache interface is logically at the bottom of the structure. Everything depends on it.
Pros:
- No code duplication
Cons:
- Increased code complexity
- 2 bundles needed - common + spring 3/4
3. Make Spring 4 module depend on Spring 3 and replace Cache implementations, run Maven Shade plugin to put everything together
Pros:
- No code duplication
Cons:
- Hacking into code, no intuitive design
- Will probably work in this specific case, further maintenance might be hard.
4. Implement 2 missing methods in Spring module without the @Override annotation. This way it should work against both Spring 3 and 4 (a sketch of this trick follows at the end of this message).
Pros:
- Really small change and single jar will support both spring 3 and 4
Cons:
- Spring version ranges in pom (not sure if it fits into Infinispan design and BOMs)
- Not intuitive

I like option #1 - much easier maintenance + we might start using Spring 4 features without breaking Spring 3 module. Option #4 is also not that bad...

Which option would you prefer?
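To illustrate option #4, a minimal sketch of the trick, assuming the two missing methods are the putIfAbsent and typed get additions of the Spring 4.x Cache interface (the class and its wiring are hypothetical, not the actual Infinispan Spring module code). The two methods at the bottom deliberately carry no @Override, so the class still compiles against the Spring 3 jar yet satisfies the Spring 4 interface at runtime:

    import org.infinispan.commons.api.BasicCache;
    import org.springframework.cache.Cache;
    import org.springframework.cache.support.SimpleValueWrapper;

    public class HypotheticalSpringCache implements Cache {

        private final BasicCache<Object, Object> nativeCache;

        public HypotheticalSpringCache(BasicCache<Object, Object> nativeCache) {
            this.nativeCache = nativeCache;
        }

        // Spring 3 contract
        @Override public String getName() { return nativeCache.getName(); }
        @Override public Object getNativeCache() { return nativeCache; }
        @Override public ValueWrapper get(Object key) {
            Object value = nativeCache.get(key);
            return value != null ? new SimpleValueWrapper(value) : null;
        }
        @Override public void put(Object key, Object value) { nativeCache.put(key, value); }
        @Override public void evict(Object key) { nativeCache.remove(key); }
        @Override public void clear() { nativeCache.clear(); }

        // Spring 4 additions: no @Override, so this still compiles vs Spring 3
        public ValueWrapper putIfAbsent(Object key, Object value) {
            Object existing = nativeCache.putIfAbsent(key, value);
            return existing != null ? new SimpleValueWrapper(existing) : null;
        }

        @SuppressWarnings("unchecked")
        public <T> T get(Object key, Class<T> type) {
            Object value = nativeCache.get(key);
            if (value != null && type != null && !type.isInstance(value)) {
                throw new IllegalStateException("Cached value is not of type " + type.getName());
            }
            return (T) value;
        }
    }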
Best regards
Sebastian

From gustavonalle at gmail.com  Tue Oct 14 09:41:37 2014
From: gustavonalle at gmail.com (Gustavo Fernandes)
Date: Tue, 14 Oct 2014 14:41:37 +0100
Subject: [infinispan-dev] Multiple Spring modules
In-Reply-To: <543D23DC.3050206@redhat.com>
References: <543D23DC.3050206@redhat.com>
Message-ID:

I prefer #1, since it decouples Spring 3 from Spring 4. For example, Spring 4.1 is bringing many improvements on Cache [1], which I'm not sure will be available on the 3.2.x maintenance branch.

[1] http://spring.io/blog/2014/06/16/further-cache-improvements-in-spring-4-1

Gustavo

On Tue, Oct 14, 2014 at 2:23 PM, Sebastian Łaskawiec wrote:
> Hey!
>
> Currently I'm working on Spring 3 and 4 support and because these
> versions are not compatible (in terms of Cache API), we probably would
> need to have 2 modules for Spring.
>
> Now the question is - how to maintain them? Here are the options which
> comes into my mind:
>
> 1. Create copy of Spring 3 module and put everything into newly created
> Spring 4, then update versions and implement new methods in Cache
> interface.
> Pros:
> - 1 OSGi bundle - transparent upgrade - just replace spring bundle
> - Easy to maintain Spring 4 only fixes
> Cons:
> - Code duplication
> 2. Extract common part and create 2 modules which depend on it - very
> hard because Cache interface is logically at the bottom of the
> structure. Everything depends on it.
> Pros:
> - No code duplication
> Cons:
> - Increased code complexity
> - 2 bundles needed - common + spring 3/4
> 3. Make Spring 4 module depend on Spring 3 and replace Cache
> implementations, run Maven Shade plugin to put everything together
> Pros:
> - No code duplication
> Cons:
> - Hacking into code, no intuitive design
> - Will probably work in this specific case, further maintenance
> might be hard.
> 4. Implement 2 missing methods in Spring module without @override
> annotation. This way it should work against Spring 3 and 4
> Pros:
> - Really small change and single jar will support both spring 3
> and 4
> Cons:
> - Spring version ranges in pom (not sure if it fits into
> Infinispan design and BOMs)
> - Not intuitive
>
> I like option #1 - much easier maintenance + we might start using Spring
> 4 features without breaking Spring 3 module. Option #4 is also not that
> bad...
>
> Which option would you prefer?
>
> Best regards
> Sebastian
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From emmanuel at hibernate.org  Wed Oct 15 07:54:32 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Wed, 15 Oct 2014 13:54:32 +0200
Subject: [infinispan-dev] Feedback and requests on clustered and remote listeners
In-Reply-To:
References: <1C064841-B234-4EC8-AEE3-467B1E52ED0B@hibernate.org> <2FD61F5D-67A6-4D6C-A0A0-D924914BAD14@hibernate.org> <21CE8853-8A48-4061-BD6C-55EE93BECE33@hibernate.org> <06B6FC9A-9886-4C02-B821-EC5C0864A948@redhat.com> <6FDDA588-82B0-4DB3-95C9-527CDD8B5219@hibernate.org> <821068D0-D857-4006-AC70-A0798C4756C1@redhat.com> <4FA0A74E-4DA4-42F1-BE0C-C07E401633A0@hibernate.org>
Message-ID:

Sorry for the long delay.
I looked at your PR early last week and it looked good to me.
I think it might be slightly more efficient to offer a single contract that does both the filtering and the conversion (especially when one considers reading the value in raw protobuf and doing the parsing only once). But your approach solves the old / new value feature.
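For the single filter-plus-conversion contract suggested above, a sketch of what it could look like; the interface and method names are illustrative only (Infinispan later tracked a contract along these lines under ISPN-4850, mentioned further down):

    import org.infinispan.metadata.Metadata;

    public interface FilterConverter<K, V, C> {
        // Returning null drops the event; any other value is the converted
        // payload shipped to the listener. A single callback means the raw
        // (e.g. protobuf) value only has to be read and parsed once.
        C filterAndConvert(K key, V oldValue, Metadata oldMetadata,
                           V newValue, Metadata newMetadata);
    }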
Emmanuel On 30 Sep 2014, at 18:13, William Burns wrote: > I have put it on a branch on github and you can try it out and let me > know what you think. > > I still have a few things I may want to change though: > > 1. I don't like how pre events are yet as they don't give you the > previous value and new value as post events do > 2. The enum to tell the type has become a bit more complicated and I > think I am going to change it to a class > 3. I also have some internal changes that should require less memory > allocations I wanted to clean up. > > https://github.com/wburns/infinispan/tree/ISPN-4753 > > Thanks, > > - Will > > On Fri, Sep 26, 2014 at 4:06 AM, Emmanuel Bernard > wrote: >> You lost me at actually ;) but if you have some code or even a gist showing how a user would use and interact with these changes, I can give you some feedback on the use cases I had in mind and if they fit. >> >> >>> On 25 sept. 2014, at 15:20, William Burns wrote: >>> >>> Actually while working and thinking on this it seems it may be easiest >>> to exclude the usage of KeyValueFilter in the listener pieces >>> completely and instead leave the annotation as it is now. Instead the >>> provided CacheEventFilter would be wrapped by a KeyValueFilter >>> implement that just called the new method as if it was a create event >>> for each value while iterating on them. I am thinking this is the >>> cleanest. Do you guys have any opinions? It would also keep intact a >>> lot of existing code and APIs. >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From slaskawi at redhat.com Wed Oct 15 08:43:07 2014 From: slaskawi at redhat.com (=?UTF-8?B?U2ViYXN0aWFuIMWBYXNrYXdpZWM=?=) Date: Wed, 15 Oct 2014 14:43:07 +0200 Subject: [infinispan-dev] Multiple Spring modules In-Reply-To: References: <543D23DC.3050206@redhat.com> Message-ID: <543E6BDB.7050803@redhat.com> Hey! After several discussions we decided to create 2 separate modules for Spring 3 and Spring 4 support. As Gustavo mentioned, there are a lot of new things in Spring 4.1 which are connected to caching. Once we start supporting Spring 3 and 4 integration using the same jar - new features might be hard (if not impossible) to introduce. Having 2 separate jars gives us flexibility which might be useful in the future. The code might be found here: *https://github.com/infinispan/infinispan/pull/2957* Best regards Sebastian On 10/14/2014 03:41 PM, Gustavo Fernandes wrote: > I prefer #1, since it decouples Spring 3 from Spring 4. For example, > Spring 4.1 is bringing many improvements on Cache [1], which I'm not > sure if it will available on 3.2.x maintenance branch. > > > [1] http://spring.io/blog/2014/06/16/further-cache-improvements-in-spring-4-1 > > wrote: >> >> 1. Create copy of Spring 3 module and put everything into newly created >> Spring 4, then update versions and implement new methods in Cache >> interface. >> Pros: >> - 1 OSGi bundle - transparent upgrade - just replace spring bundle >> - Easy to maintain Spring 4 only fixes >> Cons: >> - Code duplication >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141015/d55b108f/attachment.html From mudokonman at gmail.com Wed Oct 15 09:40:10 2014 From: mudokonman at gmail.com (William Burns) Date: Wed, 15 Oct 2014 09:40:10 -0400 Subject: [infinispan-dev] Feedback and requests on clustered and remote listeners In-Reply-To: References: <1C064841-B234-4EC8-AEE3-467B1E52ED0B@hibernate.org> <2FD61F5D-67A6-4D6C-A0A0-D924914BAD14@hibernate.org> <21CE8853-8A48-4061-BD6C-55EE93BECE33@hibernate.org> <06B6FC9A-9886-4C02-B821-EC5C0864A948@redhat.com> <6FDDA588-82B0-4DB3-95C9-527CDD8B5219@hibernate.org> <821068D0-D857-4006-AC70-A0798C4756C1@redhat.com> <4FA0A74E-4DA4-42F1-BE0C-C07E401633A0@hibernate.org> Message-ID: On Wed, Oct 15, 2014 at 7:54 AM, Emmanuel Bernard wrote: > Sorry for the long delay. > I looked at your PR early last week and it looked good to me. > I think it might be slighly more efficient to offer a single contract that do both the filtering and the conversion (esp when one consider reading the value in raw protobuf and do the parsing once). But your approach solves the old / new value feature. I agree, that is something that will still need to be added to the listeners. Most likely it will be implemented in a similar fashion as KeyValueFilterConverter was. I have created [1] to track it. [1] https://issues.jboss.org/browse/ISPN-4850 > > Emmanuel > > On 30 Sep 2014, at 18:13, William Burns wrote: > >> I have put it on a branch on github and you can try it out and let me >> know what you think. >> >> I still have a few things I may want to change though: >> >> 1. I don't like how pre events are yet as they don't give you the >> previous value and new value as post events do >> 2. The enum to tell the type has become a bit more complicated and I >> think I am going to change it to a class >> 3. I also have some internal changes that should require less memory >> allocations I wanted to clean up. >> >> https://github.com/wburns/infinispan/tree/ISPN-4753 >> >> Thanks, >> >> - Will >> >> On Fri, Sep 26, 2014 at 4:06 AM, Emmanuel Bernard >> wrote: >>> You lost me at actually ;) but if you have some code or even a gist showing how a user would use and interact with these changes, I can give you some feedback on the use cases I had in mind and if they fit. >>> >>> >>>> On 25 sept. 2014, at 15:20, William Burns wrote: >>>> >>>> Actually while working and thinking on this it seems it may be easiest >>>> to exclude the usage of KeyValueFilter in the listener pieces >>>> completely and instead leave the annotation as it is now. Instead the >>>> provided CacheEventFilter would be wrapped by a KeyValueFilter >>>> implement that just called the new method as if it was a create event >>>> for each value while iterating on them. I am thinking this is the >>>> cleanest. Do you guys have any opinions? It would also keep intact a >>>> lot of existing code and APIs. 
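A sketch of the wrapping Will describes above: presenting every iterated entry to the CacheEventFilter as if it were a creation event. The signatures approximate the 7.0-era SPI, and how a CREATE-typed EventType is obtained is left to the caller, so treat this as illustrative rather than the actual implementation:

    import org.infinispan.filter.KeyValueFilter;
    import org.infinispan.metadata.Metadata;
    import org.infinispan.notifications.cachelistener.filter.CacheEventFilter;
    import org.infinispan.notifications.cachelistener.filter.EventType;

    public class EventFilterAsKeyValueFilter<K, V> implements KeyValueFilter<K, V> {

        private final CacheEventFilter<K, V> eventFilter;
        private final EventType createEvent; // assumption: a CREATE-typed EventType

        public EventFilterAsKeyValueFilter(CacheEventFilter<K, V> eventFilter,
                                           EventType createEvent) {
            this.eventFilter = eventFilter;
            this.createEvent = createEvent;
        }

        @Override
        public boolean accept(K key, V value, Metadata metadata) {
            // During iteration there is no previous value, so each entry is
            // offered to the event filter as a creation: old value and old
            // metadata are null.
            return eventFilter.accept(key, null, null, value, metadata, createEvent);
        }
    }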
>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

From emmanuel at hibernate.org  Wed Oct 15 12:21:55 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Wed, 15 Oct 2014 18:21:55 +0200
Subject: [infinispan-dev] TopologySafe Map / Reduce
In-Reply-To:
References: <5436FB16.3000003@infinispan.org> <5437F781.8020808@redhat.com>
Message-ID:

On 10 Oct 2014, at 18:06, Dan Berindei wrote:
>
> The biggest downside I see is that it would be horribly slow if the cache store doesn't support efficient iteration of a single segment. So we might want to implement a full retry strategy as well, if some cache stores can't support that.
>
My understanding from a discussion with Pedro (in a hard, cold and sinister place but that's another story) is that *today* M/R is kinda horrible for global cache stores anyway, which have to do the key per node filtering dance. So it's not significantly worse. Plus, I said we should do the work per segment, but in reality if you send 5 per-segment Map tasks to the same node, you can optimize and run them in a single loop, making them separate pieces of work in appearance only.
-------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141015/ea9e06eb/attachment.html

From emmanuel at hibernate.org  Wed Oct 15 12:41:21 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Wed, 15 Oct 2014 18:41:21 +0200
Subject: [infinispan-dev] TopologySafe Map / Reduce
In-Reply-To:
References: <5436FB16.3000003@infinispan.org> <20141010154948.GD5052@hibernate.org>
Message-ID: <7A961E49-612C-47FF-ACC4-64F0B4821022@hibernate.org>

On 13 Oct 2014, at 10:45, Dan Berindei wrote:
>
> On Fri, Oct 10, 2014 at 6:49 PM, Emmanuel Bernard wrote:
> When wrestling with the subject, here is what I had in mind.
>
> The M/R coordinator node sends the M task per segment on the node where
> the segment is primary.
>
> What's M? Is it just a shorthand for "map", or is it a new parameter that controls the number of map/combine tasks sent at once?
M is short for Map. Sorry.
>
> Each "per-segment" M task is executed and is offered the way to push
> intermediary results in a temp cache.
>
> Just to be clear, the user-provided mapper and combiner don't know anything about the intermediary cache (which doesn't have to be temporary, if it's shared by all M/R tasks). They only interact with the Collector interface.
> The map/combine task on the other hand is our code, and it deals with the intermediary cache directly.
Interesting, Evangelos, do you actually use the collector interface or actual explicit intermediary caches in your approach? If that's the collector interface, I guess that's easier to hide that sharding business.
>
> The intermediary results are stored with a composite key [intermKey-i, seg-j].
> The M/R coordinator waits for all M tasks to return. If one does not
> (timeout, rehash), the following happens:
>
> We can't allow time out map tasks, or they will keep writing to the intermediate cache in parallel with the retried tasks. So the originator has to wait for a response from each node to which it sent a map task.
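To make the [intermKey-i, seg-j] scheme concrete, here is a hypothetical composite key for the intermediate cache (pure illustration, not Infinispan code); the task id field anticipates the [intermKey, seg, mctask] refinement Dan describes below, which lets a retried map task write under a fresh id while the results of a stale attempt are deleted or simply ignored:

    import java.io.Serializable;
    import java.util.Objects;
    import java.util.UUID;

    public final class IntermediateKey implements Serializable {

        private final Object intermKey; // the key emitted by the mapper/combiner
        private final int segment;      // the input segment the values came from
        private final UUID mapTaskId;   // identifies one map/combine attempt

        public IntermediateKey(Object intermKey, int segment, UUID mapTaskId) {
            this.intermKey = intermKey;
            this.segment = segment;
            this.mapTaskId = mapTaskId;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof IntermediateKey)) return false;
            IntermediateKey other = (IntermediateKey) o;
            return segment == other.segment
                  && Objects.equals(intermKey, other.intermKey)
                  && Objects.equals(mapTaskId, other.mapTaskId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(intermKey, segment, mapTaskId);
        }
    }

The reduce phase would then fold together all entries sharing the same intermKey across segments, keeping only the latest task id per segment.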
So the originator has to wait for a response from each node to which it sent a map task. OK. I guess the originator can see that a node is out of the cluster though and act accordingly. > > - delete [intermKey-i, seg-i] (that operation could be handled by the > new per-segment M before the map task is effectively started) > - ship the M task for that segment-i to the new primary owner of > segment-i > > When all M tasks are received the Reduce phase will read all [intermKey-i, *] > keys and reduce them. > Note that if the reduction phase is itself distributed, we could apply > the same key per segment and shipping split for these. > > Sure, we have to retry reduce tasks when the primary owner changes, and it makes sense to retry as little as possible. > > > Again the tricky part is to expose the ability to write to intermediary > caches per segment without exposing segments per se as well as let > someone see a concatenated view if intermKey-i from all segments subkeys > during reduction. > > Writing to and reading from the intermediate cache is already abstracted from user code (in the Mapper and Reducer interfaces). So we don't need to worry about exposing extra details to the user. > > > Thoughts? > > Dan, I did not quite get what alternative approach you wanted to > propose. Care to respin it for a slow brain? :) > > I think where we differ is that I don't think user code needs to know about how we store the intermediate values and what we retry, as long as their mappers/combiners/reducers don't have side effects. Right but my understanding from the LEADS guys was that they had side effects on their M/Rs. Waiting for Evangelos to speak up. > > Otherwise I was thinking on the same lines: send 1 map/combine task for each segment (maybe with a cap on the number of segments being processed at the same time on each node), split the intermediate values per input segment, cancel+retry each map task if the topology changes and the executing node is no longer an owner. If the reduce phase is distributed, run 1 reduce task per segment as well, and cancel+retry the reduce task if the executing node is no longer an owner. > > I had some ideas about assigning each map/combine phase a UUID and making the intermediate keys [intermKey, seg, mctask] to allow the originator to retry a map/combine task without waiting for the previous one to finish, but I don't think I mentioned that before :) Nice touch, that fixes the rogue node / timeout problem. > There are also some details that I'm worried about: > > 1) If the reduce phase is distributed, and the intermediate cache is non-transactional, any topology change in the intermediate cache will require us to retry all the map/combine tasks that were running at the time on any node (even if some nodes did not detect the topology change yet). So it would make sense to limit the number of map/combine tasks that are processed at one time, in order to limit the amount of tasks we retry (OR require the intermediate cache to be transactional). I am not fully following that. What matters in the end it seems is for the originator to detect a topology change and discard things accordingly, no? If the other nodes are slaves of that originator for the purpose of that M/R, we are good. > > 2) Running a separate map/combine task for each segment is not really an option until we implement the the segment-aware data container and cache stores. Without that change, it will make everything much slower, because of all the extra iterations for each segment. 
> See my other email about physically merging the per-segment work into per-node work when you ship that work.
> 3) And finally, all this will be overkill when the input cache is small, and the time needed to process the data is comparable to the time needed to send all those extra RPCs.
>
> So I'm thinking it might be better to adopt Vladimir's suggestion to retry everything if we detect a topology change in the input and/or intermediate cache at the end of the M/R task, at least in the first phase.
You half lost me, but I think that with my proposal to physically merge the RPC calls per node instead of per segment, that problem would be alleviated.

Emmanuel
-------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141015/476af049/attachment.html

From emmanuel at hibernate.org  Thu Oct 16 05:23:54 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Thu, 16 Oct 2014 11:23:54 +0200
Subject: [infinispan-dev] JAR distribution once the grid has been deployed
Message-ID: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>

Hi all,

I know this has been discussed in the past (by Tristan I think), but I don't know how concrete the plans have become since then.

One major issue with all the distributed execution code interfaces we have is that they require each node to have in its classpath both the implementation of these interfaces and the class files corresponding to the key and value being processed. My understanding is that this is true of distexec, Map / Reduce and (clustered) listeners.

Evangelos from the LEADS project sort of worked around this problem by creating specialized versions of his distexec that load the necessary JARs from the grid itself (in a set of keys) and create a classloader that references these JARs.

In a sequence, it conceptually looks like this:

- have the generic classloader distexec version in each grid node's classpath at start time
- when a new remote execution is required, load each necessary JAR into a specific key in a specific cache
- the generic distexec receives the necessary keys, loads each JAR and creates a classloader out of them
- the generic distexec loads and launches the specific code that needs to be executed (based on the FQCN of the code to execute) from the created classloader (a rough sketch follows at the end of this message)

There are a few problems with that including:

- it requires a lot of manual work from the user
- big JARs make the key / value per JAR logic explode a bit. The algorithms LEADS use have 300 MB sized JARs
- god knows what security leaks this can lead to

So I wondered whether we have a better alternative planned, and whether there is a wiki page discussing the needs and potential approaches. As an intermediary step we could turn this approach into a tutorial, or side classes that people can borrow from for each of the use cases.
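A rough sketch of steps 2-4 of the sequence above, assuming each JAR is stored as a byte[] value in a dedicated cache and the code to execute implements Runnable; all names are hypothetical, and a real version would need to address the security and cleanup concerns listed:

    import java.io.File;
    import java.io.IOException;
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.nio.file.Files;
    import java.util.ArrayList;
    import java.util.List;
    import org.infinispan.Cache;

    public class GridClassLoading {

        // Build a classloader from JARs stored in the grid: each key in
        // jarKeys maps to the raw bytes of one JAR in jarCache.
        static ClassLoader classLoaderFromGrid(Cache<String, byte[]> jarCache,
                                               List<String> jarKeys,
                                               ClassLoader parent) throws IOException {
            List<URL> urls = new ArrayList<>();
            for (String key : jarKeys) {
                byte[] jarBytes = jarCache.get(key);
                // URLClassLoader needs a URL, so spill the bytes to a temp file
                File tmp = File.createTempFile("grid-", ".jar");
                tmp.deleteOnExit();
                Files.write(tmp.toPath(), jarBytes);
                urls.add(tmp.toURI().toURL());
            }
            return new URLClassLoader(urls.toArray(new URL[0]), parent);
        }

        // Load and instantiate the task to execute by its FQCN.
        static Runnable loadTask(ClassLoader cl, String fqcn) throws Exception {
            return (Runnable) Class.forName(fqcn, true, cl)
                                   .getDeclaredConstructor().newInstance();
        }
    }

The 300 MB JAR problem mentioned above would hit the single get() here, which is one reason chunking the bytes across several keys gets tempting and messy.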
Emmanuel From isavin at redhat.com Fri Oct 17 12:11:31 2014 From: isavin at redhat.com (Ion Savin) Date: Fri, 17 Oct 2014 19:11:31 +0300 Subject: [infinispan-dev] On ParserRegistry and classloaders In-Reply-To: References: <9A22CC58-5B20-4D6B-BA6B-B4A23493979F@redhat.com> <53831239.30601@redhat.com> <114950B5-AA9F-485C-8E3C-AC36299974AB@redhat.com> <53D945B4.7070508@redhat.com> Message-ID: <54413FB3.3020409@redhat.com> Hi Sanne, > Caused by: java.lang.ClassNotFoundException: > org.infinispan.remoting.transport.jgroups.JGroupsTransport from > [Module \"deployment.ModuleMemberRegistrationIT.war:main\" from > Service Module Loader]"}} Can you please share also the full stack for this exception? Thanks! -- Ion Savin From sanne at infinispan.org Fri Oct 17 14:15:58 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 17 Oct 2014 19:15:58 +0100 Subject: [infinispan-dev] Improving the performance of index writers Message-ID: Hi all, we have been breaking down the problem of latency during Index Writing into smaller manageable tasks, you can find the general overview JIRA here : - https://issues.jboss.org/browse/ISPN-4847 As you can see some minor improvements have been fixed already, and while each of them provides only minor 10% to 30% improvements, some provide more and combined the composite ratio is getting interesting. While these minor issues (even combined) won't give us the many orders of magnitude performance improvements that we'd like to see, they are important as they are paving the road to the more significant efficiency improvements. I documented the main idea here, as it belongs into the Hibernate Search engine: https://hibernate.atlassian.net/browse/HSEARCH-1699 I don't expect that to be implemented overnight, but Gustavo already sent a PR for the ASYNC case, which is based on the same principle of avoiding the commits but is simpler to implement: https://hibernate.atlassian.net/browse/HSEARCH-1693 We expect this one to be a proof of concept for the performance that we'll get from HSEARCH-1699, and also I think it's very useful on its own: previously users of ASYNC indexing were forced into a "very async" architecture which might have been a bit too hard to manage, while now being able to set a maximum delay for the async operation I also expect that to be an acceptable compromise for a much wider range of use cases. Essentially this will decouple the achievable throughput of indexed caches from the RPC latency, although obviously this latency will still be the limiting factor for some dimensions, especially the response time for a single synchronous indexed write will still be affected primarily by the ability of Infinispan to improve the number of blocking RPCs needed for a single write. Feedback very welcome! Sanne From galder at redhat.com Mon Oct 20 03:16:43 2014 From: galder at redhat.com (=?windows-1252?Q?Galder_Zamarre=F1o?=) Date: Mon, 20 Oct 2014 09:16:43 +0200 Subject: [infinispan-dev] Feedback and requests on clustered and remote listeners In-Reply-To: References: <1C064841-B234-4EC8-AEE3-467B1E52ED0B@hibernate.org> <2FD61F5D-67A6-4D6C-A0A0-D924914BAD14@hibernate.org> <21CE8853-8A48-4061-BD6C-55EE93BECE33@hibernate.org> <06B6FC9A-9886-4C02-B821-EC5C0864A948@redhat.com> <6FDDA588-82B0-4DB3-95C9-527CDD8B5219@hibernate.org> <821068D0-D857-4006-AC70-A0798C4756C1@redhat.com> <4FA0A74E-4DA4-42F1-BE0C-C07E401633A0@hibernate.org> Message-ID: Hi all, Thanks Will for implementing this. 
I've created [1] to investigate whether any changes would be required for Hot Rod remote listeners to take advantage of this.

Cheers,

[1] https://issues.jboss.org/browse/ISPN-4857

On 30 Sep 2014, at 18:13, William Burns wrote:

> I have put it on a branch on github and you can try it out and let me
> know what you think.
>
> I still have a few things I may want to change though:
>
> 1. I don't like how pre events are yet as they don't give you the
> previous value and new value as post events do
> 2. The enum to tell the type has become a bit more complicated and I
> think I am going to change it to a class
> 3. I also have some internal changes that should require less memory
> allocations I wanted to clean up.
>
> https://github.com/wburns/infinispan/tree/ISPN-4753
>
> Thanks,
>
> - Will
>
> On Fri, Sep 26, 2014 at 4:06 AM, Emmanuel Bernard
> wrote:
>> You lost me at actually ;) but if you have some code or even a gist showing how a user would use and interact with these changes, I can give you some feedback on the use cases I had in mind and if they fit.
>>
>>
>>> On 25 sept. 2014, at 15:20, William Burns wrote:
>>>
>>> Actually while working and thinking on this it seems it may be easiest
>>> to exclude the usage of KeyValueFilter in the listener pieces
>>> completely and instead leave the annotation as it is now. Instead the
>>> provided CacheEventFilter would be wrapped by a KeyValueFilter
>>> implement that just called the new method as if it was a create event
>>> for each value while iterating on them. I am thinking this is the
>>> cleanest. Do you guys have any opinions? It would also keep intact a
>>> lot of existing code and APIs.
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Galder Zamarreño
galder at redhat.com
twitter.com/galderz

From isavin at redhat.com  Mon Oct 20 05:13:06 2014
From: isavin at redhat.com (Ion Savin)
Date: Mon, 20 Oct 2014 12:13:06 +0300
Subject: [infinispan-dev] my status
Message-ID: <5444D222.6040807@redhat.com>

Hi all,

I'll be missing the meeting today so here is my status:

Last week:
* resolved ISPN-4784 and ISPN-4251
* spent a good amount of time studying JBoss Modules and how we package infinispan as AS modules

This week still:
* ISPN-3836 TxCleanupService can cause TCCL leak
* HRCPP-173 The HotRod client should support a separate CH for each cache

--
Ion Savin

From emmanuel at hibernate.org  Mon Oct 20 07:59:37 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Mon, 20 Oct 2014 13:59:37 +0200
Subject: [infinispan-dev] Improving the performance of index writers
In-Reply-To:
References:
Message-ID: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org>

HSEARCH-1699 looks good. A few comments.

Maybe from a user point of view we want to expose the number of ms the user is ok to delay a commit due to indexing, which would mean that you can wait up to that number before calling it a day and emptying the queue. The big question I have, which you allude to, is whether this mechanism should have some kind of back pressure mechanism by also capping the queue size.

BTW, in the following paragraph, either you lost me or you are talking nonsense:

> Systems with a high degree of parallelism will benefit from this, and the performance should converge to the performance you would have without ever doing a commit; however if the frequency of commits is approaching zero, it also means that the average latency of each operation will get significantly higher. Still, in such situations assuming we are for example stacking up a million changesets between each commit, that implies this solution would be approximately a million times faster than the existing design (A million would not be realistic of course as it implies a million of parallel requests).

I think you can only converge to an average of 1/2 * (commit + configured delay time) latency wise. I am assuming latency is what people are interested in, not the average CPU / memory load of indexing.
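To put hypothetical numbers on that 1/2 * (commit + delay) estimate: with a 10 ms commit cost and a 50 ms configured delay, requests arriving at uniformly random points in the window would see on average 1/2 * (10 + 50) = 30 ms of added latency, so the convergence is in throughput (changesets per commit), not in the response time of an individual synchronous write.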
BTW, in the following paragraph, either you lost me or you are talking non sense: > Systems with an high degree of parallelism will benefit from this, and the performance should converge to the performance you would have without every doing a commit; however if the frequency of commits is apparoching to zero, it also means that the average latency of each operation will get significantly higher. Still, in such situations assuming we are for example stacking up a million changesets between each commit, that implies this solution would be approximately a million times faster than the existing design (A million would not be realistic of course as it implies a million of parallel requests). I think you can only converge to an average of 1/2 * (commit + configured delay time) latency wise. I am assuming latency is what people are interested in, not the average CPU / memory load of indexing. Emmanuel On 17 Oct 2014, at 20:15, Sanne Grinovero wrote: > Hi all, > we have been breaking down the problem of latency during Index > Writing into smaller manageable tasks, you can find the general > overview JIRA here : > > - https://issues.jboss.org/browse/ISPN-4847 > > As you can see some minor improvements have been fixed already, and > while each of them provides only minor 10% to 30% improvements, some > provide more and combined the composite ratio is getting interesting. > > While these minor issues (even combined) won't give us the many orders > of magnitude performance improvements that we'd like to see, they are > important as they are paving the road to the more significant > efficiency improvements. > > I documented the main idea here, as it belongs into the Hibernate Search engine: > > https://hibernate.atlassian.net/browse/HSEARCH-1699 > > I don't expect that to be implemented overnight, but Gustavo already > sent a PR for the ASYNC case, which is based on the same principle of > avoiding the commits but is simpler to implement: > > https://hibernate.atlassian.net/browse/HSEARCH-1693 > > We expect this one to be a proof of concept for the performance that > we'll get from HSEARCH-1699, and also I think it's very useful on its > own: previously users of ASYNC indexing were forced into a "very > async" architecture which might have been a bit too hard to manage, > while now being able to set a maximum delay for the async operation I > also expect that to be an acceptable compromise for a much wider range > of use cases. > > Essentially this will decouple the achievable throughput of indexed > caches from the RPC latency, although obviously this latency will > still be the limiting factor for some dimensions, especially the > response time for a single synchronous indexed write will still be > affected primarily by the ability of Infinispan to improve the number > of blocking RPCs needed for a single write. > > Feedback very welcome! > > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From sanne at infinispan.org Mon Oct 20 08:10:47 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 20 Oct 2014 13:10:47 +0100 Subject: [infinispan-dev] Improving the performance of index writers In-Reply-To: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org> References: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org> Message-ID: On 20 October 2014 12:59, Emmanuel Bernard wrote: > HSEARCH-1699 looks good. A few comments. 
> > Maybe from a user point of you we want to expose the number of ms the user is ok to delay a commit due to indexing. Which would mean that you can wait up to that number before calling it a day and emptying the queue. The big question I have which you elude too is whether this mechanism should have some kind of back pressure mechanism by also caping the queue size.

Gustavo is implementing that for the ASYNC backend, but the SYNC
backend will always block the user thread until the commit is done
(and some commit is going to be done ASAP).
About to write a mail to hibernate-dev to discuss the ASYNC backend
property name and exact semantics.

> > BTW, in the following paragraph, either you lost me or you are talking non sense:
>
>> Systems with an high degree of parallelism will benefit from this, and the performance should converge to the performance you would have without every doing a commit; however if the frequency of commits is apparoching to zero, it also means that the average latency of each operation will get significantly higher. Still, in such situations assuming we are for example stacking up a million changesets between each commit, that implies this solution would be approximately a million times faster than the existing design (A million would not be realistic of course as it implies a million of parallel requests).
>
> I think you can only converge to an average of 1/2 * (commit + configured delay time) latency wise. I am assuming latency is what people are interested in, not the average CPU / memory load of indexing.

I'm sorry, I'm confused. There is no configured delay time for the SYNC
backend discussed on HSEARCH-1699, are you talking about the Async
one? But my paragraph above is strictly referring to the strategy
meant to be applied to the Sync one.

Thanks for the feedback! BTW I didn't cross-post to hibernate-dev as
this was meant as a heads-up for the Infinispan team, which otherwise
would not have visibility into what we're planning, but I should really
start a discussion thread for the details on hibernate-dev.
Infinispan developers: if you're interested in following this subject,
please comment on the JIRAs or join the hibernate-dev mailing list.

Sanne

>
> Emmanuel
>
> On 17 Oct 2014, at 20:15, Sanne Grinovero wrote:
>
>> Hi all,
>> we have been breaking down the problem of latency during Index
>> Writing into smaller manageable tasks, you can find the general
>> overview JIRA here :
>>
>> - https://issues.jboss.org/browse/ISPN-4847
>>
>> As you can see some minor improvements have been fixed already, and
>> while each of them provides only minor 10% to 30% improvements, some
>> provide more and combined the composite ratio is getting interesting.
>>
>> While these minor issues (even combined) won't give us the many orders
>> of magnitude performance improvements that we'd like to see, they are
>> important as they are paving the road to the more significant
>> efficiency improvements.
>> >> I documented the main idea here, as it belongs into the Hibernate Search engine: >> >> https://hibernate.atlassian.net/browse/HSEARCH-1699 >> >> I don't expect that to be implemented overnight, but Gustavo already >> sent a PR for the ASYNC case, which is based on the same principle of >> avoiding the commits but is simpler to implement: >> >> https://hibernate.atlassian.net/browse/HSEARCH-1693 >> >> We expect this one to be a proof of concept for the performance that >> we'll get from HSEARCH-1699, and also I think it's very useful on its >> own: previously users of ASYNC indexing were forced into a "very >> async" architecture which might have been a bit too hard to manage, >> while now being able to set a maximum delay for the async operation I >> also expect that to be an acceptable compromise for a much wider range >> of use cases. >> >> Essentially this will decouple the achievable throughput of indexed >> caches from the RPC latency, although obviously this latency will >> still be the limiting factor for some dimensions, especially the >> response time for a single synchronous indexed write will still be >> affected primarily by the ability of Infinispan to improve the number >> of blocking RPCs needed for a single write. >> >> Feedback very welcome! >> >> Sanne >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From gustavonalle at gmail.com Mon Oct 20 08:40:16 2014 From: gustavonalle at gmail.com (Gustavo Fernandes) Date: Mon, 20 Oct 2014 13:40:16 +0100 Subject: [infinispan-dev] Improving the performance of index writers In-Reply-To: References: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org> Message-ID: On Mon, Oct 20, 2014 at 1:10 PM, Sanne Grinovero wrote: > On 20 October 2014 12:59, Emmanuel Bernard wrote: >> HSEARCH-1699 looks good. A few comments. >> >> Maybe from a user point of you we want to expose the number of ms the user is ok to delay a commit due to indexing. Which would mean that you can wait up to that number before calling it a day and emptying the queue. The big question I have which you elude too is whether this mechanism should have some kind of back pressure mechanism by also caping the queue size. > > Gustavo is implementing that for the ASYNC backend, but the SYNC > backend will always block the user thread until the commit is done > (and some commit is going to be done ASAP). > About to write a mail to hibernate-dev to discuss the ASYNC backend > property name and exact semantics. > Current ASYNC proposal [1] involves a refresh interval to explicitly flush the indexes. There's still a queue involved though; the queue is filled with indexing work to be applied and will block if flooded with multiple producers. 
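A rough sketch of what such a refresh-interval consumer could look like - purely illustrative names, not the actual HSEARCH-1693 code: index work is applied as soon as it arrives, but the expensive commit only happens when the interval elapses, and the bounded queue is what blocks flooding producers.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class TimedFlushLoop implements Runnable {

    // Bounded queue: a flood of producers blocks here instead of exhausting memory
    private final BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>(10_000);
    private final long refreshIntervalNanos;
    private long lastCommit = System.nanoTime();
    private boolean dirty = false;

    public TimedFlushLoop(long refreshIntervalMs) {
        this.refreshIntervalNanos = TimeUnit.MILLISECONDS.toNanos(refreshIntervalMs);
    }

    // Called by application threads; blocks when the queue is full
    public void submit(Runnable indexWork) throws InterruptedException {
        workQueue.put(indexWork);
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                long remaining = lastCommit + refreshIntervalNanos - System.nanoTime();
                if (remaining <= 0) {
                    if (dirty) {
                        commit();       // periodic, explicit flush of the index
                        dirty = false;
                    }
                    lastCommit = System.nanoTime();
                    continue;
                }
                Runnable work = workQueue.poll(remaining, TimeUnit.NANOSECONDS);
                if (work != null) {
                    work.run();         // apply index work; no commit yet
                    dirty = true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void commit() { /* IndexWriter.commit() would go here */ }
}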
The difference resides in the consumption side: instead of the flow
{apply, commit, apply, commit, ...} it will do
{apply(1..*), commit, apply(1..*), commit, ...}

[1] https://github.com/hibernate/hibernate-search/pull/681

Gustavo

From emmanuel at hibernate.org Mon Oct 20 09:55:58 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Mon, 20 Oct 2014 15:55:58 +0200
Subject: [infinispan-dev] Improving the performance of index writers
In-Reply-To:
References: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org>
Message-ID:

On 20 Oct 2014, at 14:10, Sanne Grinovero wrote:

> On 20 October 2014 12:59, Emmanuel Bernard wrote:
>> HSEARCH-1699 looks good. A few comments.
>>
>> Maybe from a user point of you we want to expose the number of ms the user
>> is ok to delay a commit due to indexing. Which would mean that you can wait
>> up to that number before calling it a day and emptying the queue. The big
>> question I have which you elude too is whether this mechanism should have
>> some kind of back pressure mechanism by also caping the queue size.
>
> Gustavo is implementing that for the ASYNC backend, but the SYNC
> backend will always block the user thread until the commit is done
> (and some commit is going to be done ASAP).
> About to write a mail to hibernate-dev to discuss the ASYNC backend
> property name and exact semantics.

I understand that the sync mode will block until the commit is done. What I am saying is that for HSEARCH-1699 (SYNC) (and probably also for the ASYNC mode), you can ask the user "how much more" he is willing to wait for the index to be committed compared to "as fast as possible". That becomes your window of aggregation. Does that make sense?

> BTW, in the following paragraph, either you lost me or you are talking non
> sense:
>
>> Systems with an high degree of parallelism will benefit from this, and the
>> performance should converge to the performance you would have without every
>> doing a commit; however if the frequency of commits is apparoching to zero,
>> it also means that the average latency of each operation will get
>> significantly higher. Still, in such situations assuming we are for example
>> stacking up a million changesets between each commit, that implies this
>> solution would be approximately a million times faster than the existing
>> design (A million would not be realistic of course as it implies a million
>> of parallel requests).
>
> I think you can only converge to an average of 1/2 * (commit + configured
> delay time) latency wise. I am assuming latency is what people are
> interested in, not the average CPU / memory load of indexing.
>
> I'm sorry I'm confused. There is no configured delay time for the SYNC
> backend discussed on HSEARCH-1699, are you talking about the Async
> one? But my paragraph above is strictly referring tot the strategy
> meant to be applied for the Sync one.

There is a delay: it is what you call the "target frequency of commits". And the alternative that I proposed is not so much a frequency as how much longer you delay a flush in the hope of getting more work in.

In your model of a fixed frequency, then the average delay is 1/2 * 1/frequency + commit time.
Or do you have something different in mind?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141020/7d32ff75/attachment-0001.html

From ttarrant at redhat.com Mon Oct 20 10:47:31 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Mon, 20 Oct 2014 16:47:31 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
Message-ID: <54452083.4020502@redhat.com>

Hi guys,

with the imminent release of 7.0.0.CR2 we are reaching the end of this
release cycle. There have been a ton of improvements (maybe too many)
and a lot of time has passed since the previous version (maybe too much).
Following up on my previous e-mail about future plans, here's a recap of
a plan which I believe will allow us to move at a much quicker pace.

For the next minor releases I would like to suggest the following strategy:
- use a 3-month timebox where we strive to maintain master in an "always releasable" state
- complex feature work will need to happen on dedicated feature branches, using the usual GitHub pull-request workflow
- only when a feature is complete (code, tests, docs, reviewed, CI-checked) will it be merged back into master
- if a feature is running late it will be postponed to the following minor release so as not to hinder other development

I am also going to suggest dropping the cherry-picking approach and going with git merge. In order to achieve this we need CI to be always in top form with 0 failures in master. This will allow merging a PR directly from GitHub's interface. We obviously need to trust our tools and our existing code base.

This is the plan for 7.1.0:

13 November 7.1.0.Alpha1
18 December 7.1.0.Beta1
15 January 7.1.0.CR1
30 January 7.1.0.Final


Tristan

From ttarrant at redhat.com Mon Oct 20 10:48:52 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Mon, 20 Oct 2014 16:48:52 +0200
Subject: [infinispan-dev] Weekly Infinispan IRC meeting 2014-10-20
Message-ID: <544520D4.9050100@redhat.com>

Get the minutes from here:

http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2014/infinispan.2014-10-20-14.02.log.html

Tristan

From sanne at infinispan.org Mon Oct 20 10:57:04 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Mon, 20 Oct 2014 15:57:04 +0100
Subject: [infinispan-dev] Improving the performance of index writers
In-Reply-To:
References: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org>
Message-ID:

On 20 October 2014 14:55, Emmanuel Bernard wrote:
>
> On 20 Oct 2014, at 14:10, Sanne Grinovero wrote:
>
> On 20 October 2014 12:59, Emmanuel Bernard wrote:
>
> HSEARCH-1699 looks good. A few comments.
>
> Maybe from a user point of you we want to expose the number of ms the user
> is ok to delay a commit due to indexing. Which would mean that you can wait
> up to that number before calling it a day and emptying the queue. The big
> question I have which you elude too is whether this mechanism should have
> some kind of back pressure mechanism by also caping the queue size.
>
>
> Gustavo is implementing that for the ASYNC backend, but the SYNC
> backend will always block the user thread until the commit is done
> (and some commit is going to be done ASAP).
> About to write a mail to hibernate-dev to discuss the ASYNC backend
> property name and exact semantics.
>
>
> I understand that the sync mode will block until the commit is done. what I
> am saying is that for HSEARCH-1699 (SYNC) (and probably also for the ASYNC
> mode), you can ask the user ?how much more? is he willing to wait for the
> index to be committed compared to ?as fast as possible?. That becomes your
> window of aggregation.
> Does that make sense?
>
>
> BTW, in the following paragraph, either you lost me or you are talking non
> sense:
>
>> Systems with an high degree of parallelism will benefit from this, and the
>> performance should converge to the performance you would have without every
>> doing a commit; however if the frequency of commits is apparoching to zero,
>> it also means that the average latency of each operation will get
>> significantly higher. Still, in such situations assuming we are for example
>> stacking up a million changesets between each commit, that implies this
>> solution would be approximately a million times faster than the existing
>> design (A million would not be realistic of course as it implies a million
>> of parallel requests).
>
>
> I think you can only converge to an average of 1/2 * (commit + configured
> delay time) latency wise. I am assuming latency is what people are
> interested in, not the average CPU / memory load of indexing.
>
>
> I'm sorry I'm confused. There is no configured delay time for the SYNC
> backend discussed on HSEARCH-1699, are you talking about the Async
> one? But my paragraph above is strictly referring tot the strategy
> meant to be applied for the Sync one.
>
>
> There is a delay. it is what you call the "target frequency of commits?. And
> my alternative that i proposed is not su much a frequency rather than how
> much more you delay a flush in the hope of getting more work in.

No, there is no delay: in case there is a constant flow of incoming
write operations, the write loop will degenerate into something like
(pseudo code and overly simplified):

while (true) {
   apply(getNextChangeset());
   commit();
}

So it's a busy loop with no waits: the "target frequency of commits"
will naturally match the maximum frequency of commits which the
storage can handle, and since (as we've said) applying the changes is
not a cost, it's essentially the same as

while (true) {
   commit();
}

That code will loop faster if the commits are quick. The point being
that the number of changes which we can apply in period T does not
depend on the time it takes to do commit operations on the underlying
storage.

The real code will need to be a bit more complex, for example to
handle this case:

while (true) {
   changeset = getNextChangeset();
   if (changeset.isEmpty) {
      waitWithoutBurningCPU();
   }
   else {
      apply(all pending changes);
      commit();
   }
}

> In your model of a fixed frequency, they the average delay is 1/2 *
> 1/frequency + commit time.
> Or do you have something different in mind?

I hope the above example clarifies. It's not a fixed frequency, it's
"as fast as it can", but with latency not better than what can be
performed by a single commit. What I'm attempting to explain when
comparing "frequency" is that this is the optimal speed for each
situation, especially compared to the current solution, and regardless
of queueing up.
There is an inherent form of back pressure: it's limited by the cost
of the single commit, which will delay further changesets in the
queue, but the queue depth doesn't get larger than 1 and we don't
risk running out of space as it blocks producers, blocking the
application.
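Spelling that pseudocode out, a minimal self-contained sketch of the batching loop could look like the code below. This is illustrative only - the names are invented, it is not the actual Hibernate Search backend, and it leaves out notifying each blocked producer once its changeset has been committed, which a real sync backend must do.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchingWriterLoop implements Runnable {

    interface Changeset { }

    // Bounded handoff: producers block when the writer falls behind,
    // which is the inherent back pressure described above.
    private final BlockingQueue<Changeset> queue = new LinkedBlockingQueue<>(1024);

    public void enqueue(Changeset changeset) throws InterruptedException {
        queue.put(changeset); // blocks the application thread if full
    }

    @Override
    public void run() {
        List<Changeset> batch = new ArrayList<>();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                batch.add(queue.take()); // wait for work without burning CPU
                queue.drainTo(batch);    // grab everything that piled up during the last commit
                for (Changeset changeset : batch) {
                    apply(changeset);    // cheap, in-memory
                }
                commit();                // expensive; amortized over the whole batch
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void apply(Changeset changeset) { /* IndexWriter updates */ }
    private void commit() { /* IndexWriter.commit() */ }
}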
Sanne

>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From emmanuel at hibernate.org Mon Oct 20 12:21:58 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Mon, 20 Oct 2014 18:21:58 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <54452083.4020502@redhat.com>
References: <54452083.4020502@redhat.com>
Message-ID:

There is a difference between cherry-picking and rebasing when it comes to reapplying work on top of a branch. Do you dislike both equally compared to a merge (aka the railroad nexus git history approach)?

On 20 Oct 2014, at 16:47, Tristan Tarrant wrote:

> Hi guys,
>
> with the imminent release of 7.0.0.CR2 we are reaching the end of this
> release cycle. There have been a ton of improvements (maybe too many)
> and a lot of time has passed since the previous version (maybe to much).
> Following up on my previous e-mail about future plans, here's a recap of
> a plan which I believe will allow us to move at a much quicker pace:
>
> For the next minor releases I would like to suggest the following strategy:
> - use a 3 month timebox where we strive to maintain master in an "always releasable" state
> - complex feature work will need to happen onto dedicated feature branches, using the usual GitHub pull-request workflow
> - only when a feature is complete (code, tests, docs, reviewed, CI-checked) it will be merged back into master
> - if a feature is running late it will be postponed to the following minor release so as not to hinder other development
>
> I am also going to suggest dropping the cherry-picking approach and going with git merge. In order to achieve this we need CI to be always in top form with 0 failures in master. This will allow merging a PR directly from GitHub's interface. We obviously need to trust our tools and our existing code base.
>
> This is the plan for 7.1.0:
>
> 13 November 7.1.0.Alpha1
> 18 December 7.1.0.Beta1
> 15 January 7.1.0.CR1
> 30 January 7.1.0.Final
>
>
> Tristan
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From mmarkus at redhat.com Mon Oct 20 12:28:14 2014
From: mmarkus at redhat.com (Mircea Markus)
Date: Mon, 20 Oct 2014 17:28:14 +0100
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To:
References: <54452083.4020502@redhat.com>
Message-ID: <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com>

On Oct 20, 2014, at 17:21, Emmanuel Bernard wrote:

> There is a difference between cherry picking and rebasing when it comes to reapply a work on top of a branch.

What is the difference? :-)

> Do you dislike both equally compared to a merge (aka railroad nexus git history approach)?

Using GitHub's "merge" button is pretty convenient IMO, even though the history is not as nice as with a rebase (or cherry-pick; I miss the difference for now)

>
>
> On 20 Oct 2014, at 16:47, Tristan Tarrant wrote:
>
>> Hi guys,
>>
>> with the imminent release of 7.0.0.CR2 we are reaching the end of this
>> release cycle. There have been a ton of improvements (maybe too many)
>> and a lot of time has passed since the previous version (maybe to much).
>> Following up on my previous e-mail about future plans, here's a recap of
>> a plan which I believe will allow us to move at a much quicker pace:
>>
>> For the next minor releases I would like to suggest the following strategy:
>> - use a 3 month timebox where we strive to maintain master in an "always releasable" state
>> - complex feature work will need to happen onto dedicated feature branches, using the usual GitHub pull-request workflow
>> - only when a feature is complete (code, tests, docs, reviewed, CI-checked) it will be merged back into master
>> - if a feature is running late it will be postponed to the following minor release so as not to hinder other development
>>
>> I am also going to suggest dropping the cherry-picking approach and going with git merge. In order to achieve this we need CI to be always in top form with 0 failures in master. This will allow merging a PR directly from GitHub's interface. We obviously need to trust our tools and our existing code base.
>>
>> This is the plan for 7.1.0:
>>
>> 13 November 7.1.0.Alpha1
>> 18 December 7.1.0.Beta1
>> 15 January 7.1.0.CR1
>> 30 January 7.1.0.Final
>>
>>
>> Tristan
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

From emmanuel at hibernate.org Mon Oct 20 12:37:18 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Mon, 20 Oct 2014 18:37:18 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com>
References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com>
Message-ID:

A rebase is a one-liner op per branch you want to reapply, whereas cherry-picking requires you to manually select the commits you want. Underneath, in git's guts, it probably does the same thing.

I have to admit I have barely had the occasion to want to click the GitHub UI button: except for simple documentation, reviewing code almost always requires fetching the branch and looking at it in an IDE of sorts for a proper review. The documentation bit actually even requires a local run, since Markdown / Asciidoc and the like tend to fail silently on a syntax mistake.

On 20 Oct 2014, at 18:28, Mircea Markus wrote:

>
> On Oct 20, 2014, at 17:21, Emmanuel Bernard wrote:
>
>> There is a difference between cherry picking and rebasing when it comes to reapply a work on top of a branch.
>
> What is the difference? :-)
>
>> Do you dislike both equally compared to a merge (aka railroad nexus git history approach)?
>
> Using github's "merge" button is pretty convenient imo, even though the history is not as nice as with a rebase (or cherry-pick, I miss the difference for now )
>
>>
>>
>> On 20 Oct 2014, at 16:47, Tristan Tarrant wrote:
>>
>>> Hi guys,
>>>
>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this
>>> release cycle. There have been a ton of improvements (maybe too many)
>>> and a lot of time has passed since the previous version (maybe to much).
>>> Following up on my previous e-mail about future plans, here's a recap of >>> a plan which I believe will allow us to move at a much quicker pace: >>> >>> For the next minor releases I would like to suggest the following strategy: >>> - use a 3 month timebox where we strive to maintain master in an "always releasable" state >>> - complex feature work will need to happen onto dedicated feature branches, using the usual GitHub pull-request workflow >>> - only when a feature is complete (code, tests, docs, reviewed, CI-checked) it will be merged back into master >>> - if a feature is running late it will be postponed to the following minor release so as not to hinder other development >>> >>> I am also going to suggest dropping the cherry-picking approach and going with git merge. In order to achieve this we need CI to be always in top form with 0 failures in master. This will allow merging a PR directly from GitHub's interface. We obviously need to trust our tools and our existing code base. >>> >>> This is the plan for 7.1.0: >>> >>> 13 November 7.1.0.Alpha1 >>> 18 December 7.1.0.Beta1 >>> 15 January 7.1.0.CR1 >>> 30 January 7.1.0.Final >>> >>> >>> Tristan >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > Cheers, > -- > Mircea Markus > Infinispan lead (www.infinispan.org) > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141020/b61fadeb/attachment-0001.html From ttarrant at redhat.com Mon Oct 20 12:40:54 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 20 Oct 2014 18:40:54 +0200 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> Message-ID: <54453B16.6050202@redhat.com> Sure, you still want to review it in your IDE, and maybe run local tests, but ultimately merging via the GitHub UI. Tristan On 20/10/14 18:37, Emmanuel Bernard wrote: > rebase is a oneliner op per branch you want to reapply whereas cherry > picking requires to manually select the commits you want. Underneath > in git guts it probably does the same. > > I have to admit I barely had the occasion to want to click the GitHub > UI button as except for simple documentation, reviewing code almost > always require to fetch the branch and look at it in an IDE of sort > for proper review. The documentation bit is actually even requiring > local run since Markdown / Asciidoc and all tend to silently fail a > syntax mistake. > > On 20 Oct 2014, at 18:28, Mircea Markus > wrote: > >> >> On Oct 20, 2014, at 17:21, Emmanuel Bernard > > wrote: >> >>> There is a difference between cherry picking and rebasing when it >>> comes to reapply a work on top of a branch. >> >> What is the difference? :-) >> >>> Do you dislike both equally compared to a merge (aka railroad nexus >>> git history approach)? 
>> >> Using github's "merge" button is pretty convenient imo, even though >> the history is not as nice as with a rebase (or cherry-pick, I miss >> the difference for now ) >> >>> >>> >>> On 20 Oct 2014, at 16:47, Tristan Tarrant >> > wrote: >>> >>>> Hi guys, >>>> >>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this >>>> release cycle. There have been a ton of improvements (maybe too many) >>>> and a lot of time has passed since the previous version (maybe to >>>> much). >>>> Following up on my previous e-mail about future plans, here's a >>>> recap of >>>> a plan which I believe will allow us to move at a much quicker pace: >>>> >>>> For the next minor releases I would like to suggest the following >>>> strategy: >>>> - use a 3 month timebox where we strive to maintain master in an >>>> "always releasable" state >>>> - complex feature work will need to happen onto dedicated feature >>>> branches, using the usual GitHub pull-request workflow >>>> - only when a feature is complete (code, tests, docs, reviewed, >>>> CI-checked) it will be merged back into master >>>> - if a feature is running late it will be postponed to the >>>> following minor release so as not to hinder other development >>>> >>>> I am also going to suggest dropping the cherry-picking approach and >>>> going with git merge. In order to achieve this we need CI to be >>>> always in top form with 0 failures in master. This will allow >>>> merging a PR directly from GitHub's interface. We obviously need to >>>> trust our tools and our existing code base. >>>> >>>> This is the plan for 7.1.0: >>>> >>>> 13 November 7.1.0.Alpha1 >>>> 18 December 7.1.0.Beta1 >>>> 15 January 7.1.0.CR1 >>>> 30 January 7.1.0.Final >>>> >>>> >>>> Tristan >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> Cheers, >> -- >> Mircea Markus >> Infinispan lead (www.infinispan.org ) >> >> >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From sanne at infinispan.org Mon Oct 20 12:45:54 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 20 Oct 2014 17:45:54 +0100 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <54453B16.6050202@redhat.com> References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com> Message-ID: On 20 October 2014 17:40, Tristan Tarrant wrote: > Sure, you still want to review it in your IDE, and maybe run local > tests, but ultimately merging via the GitHub UI. If you do one thing locally, and then "ultimately" press a button there you didn't test the same thing. Sanne > > Tristan > > On 20/10/14 18:37, Emmanuel Bernard wrote: >> rebase is a oneliner op per branch you want to reapply whereas cherry >> picking requires to manually select the commits you want. Underneath >> in git guts it probably does the same. 
>> >> I have to admit I barely had the occasion to want to click the GitHub >> UI button as except for simple documentation, reviewing code almost >> always require to fetch the branch and look at it in an IDE of sort >> for proper review. The documentation bit is actually even requiring >> local run since Markdown / Asciidoc and all tend to silently fail a >> syntax mistake. >> >> On 20 Oct 2014, at 18:28, Mircea Markus > > wrote: >> >>> >>> On Oct 20, 2014, at 17:21, Emmanuel Bernard >> > wrote: >>> >>>> There is a difference between cherry picking and rebasing when it >>>> comes to reapply a work on top of a branch. >>> >>> What is the difference? :-) >>> >>>> Do you dislike both equally compared to a merge (aka railroad nexus >>>> git history approach)? >>> >>> Using github's "merge" button is pretty convenient imo, even though >>> the history is not as nice as with a rebase (or cherry-pick, I miss >>> the difference for now ) >>> >>>> >>>> >>>> On 20 Oct 2014, at 16:47, Tristan Tarrant >>> > wrote: >>>> >>>>> Hi guys, >>>>> >>>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this >>>>> release cycle. There have been a ton of improvements (maybe too many) >>>>> and a lot of time has passed since the previous version (maybe to >>>>> much). >>>>> Following up on my previous e-mail about future plans, here's a >>>>> recap of >>>>> a plan which I believe will allow us to move at a much quicker pace: >>>>> >>>>> For the next minor releases I would like to suggest the following >>>>> strategy: >>>>> - use a 3 month timebox where we strive to maintain master in an >>>>> "always releasable" state >>>>> - complex feature work will need to happen onto dedicated feature >>>>> branches, using the usual GitHub pull-request workflow >>>>> - only when a feature is complete (code, tests, docs, reviewed, >>>>> CI-checked) it will be merged back into master >>>>> - if a feature is running late it will be postponed to the >>>>> following minor release so as not to hinder other development >>>>> >>>>> I am also going to suggest dropping the cherry-picking approach and >>>>> going with git merge. In order to achieve this we need CI to be >>>>> always in top form with 0 failures in master. This will allow >>>>> merging a PR directly from GitHub's interface. We obviously need to >>>>> trust our tools and our existing code base. 
>>>>> >>>>> This is the plan for 7.1.0: >>>>> >>>>> 13 November 7.1.0.Alpha1 >>>>> 18 December 7.1.0.Beta1 >>>>> 15 January 7.1.0.CR1 >>>>> 30 January 7.1.0.Final >>>>> >>>>> >>>>> Tristan >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> Cheers, >>> -- >>> Mircea Markus >>> Infinispan lead (www.infinispan.org ) >>> >>> >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From emmanuel at hibernate.org Mon Oct 20 12:49:24 2014 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Mon, 20 Oct 2014 18:49:24 +0200 Subject: [infinispan-dev] Improving the performance of index writers In-Reply-To: References: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org> Message-ID: <3176738B-7837-46D1-AD33-C36EF9E755F3@hibernate.org> So assuming an idle index loop, the first commit would lead to the execution of the indexing work and flush. If two or more commits come during that flush time, then the queue would be > 1 and ?batching? would occur. Correct? On 20 Oct 2014, at 16:57, Sanne Grinovero wrote: > On 20 October 2014 14:55, Emmanuel Bernard wrote: >> >> On 20 Oct 2014, at 14:10, Sanne Grinovero wrote: >> >> On 20 October 2014 12:59, Emmanuel Bernard wrote: >> >> HSEARCH-1699 looks good. A few comments. >> >> Maybe from a user point of you we want to expose the number of ms the user >> is ok to delay a commit due to indexing. Which would mean that you can wait >> up to that number before calling it a day and emptying the queue. The big >> question I have which you elude too is whether this mechanism should have >> some kind of back pressure mechanism by also caping the queue size. >> >> >> Gustavo is implementing that for the ASYNC backend, but the SYNC >> backend will always block the user thread until the commit is done >> (and some commit is going to be done ASAP). >> About to write a mail to hibernate-dev to discuss the ASYNC backend >> property name and exact semantics. >> >> >> I understand that the sync mode will block until the commit is done. what I >> am saying is that for HSEARCH-1699 (SYNC) (and probably also for the ASYNC >> mode), you can ask the user ?how much more? is he willing to wait for the >> index to be committed compared to ?as fast as possible?. That becomes your >> window of aggregation. Does that make sense? >> >> >> >> >> BTW, in the following paragraph, either you lost me or you are talking non >> sense: >> >> Systems with an high degree of parallelism will benefit from this, and the >> performance should converge to the performance you would have without every >> doing a commit; however if the frequency of commits is apparoching to zero, >> it also means that the average latency of each operation will get >> significantly higher. 
Still, in such situations assuming we are for example >> stacking up a million changesets between each commit, that implies this >> solution would be approximately a million times faster than the existing >> design (A million would not be realistic of course as it implies a million >> of parallel requests). >> >> >> I think you can only converge to an average of 1/2 * (commit + configured >> delay time) latency wise. I am assuming latency is what people are >> interested in, not the average CPU / memory load of indexing. >> >> >> I'm sorry I'm confused. There is no configured delay time for the SYNC >> backend discussed on HSEARCH-1699, are you talking about the Async >> one? But my paragraph above is strictly referring tot the strategy >> meant to be applied for the Sync one. >> >> >> There is a delay. it is what you call the "target frequency of commits?. And >> my alternative that i proposed is not su much a frequency rather than how >> much more you delay a flush in the hope of getting more work in. > > No there is no delay, in case there is a constant flow of incoming > write operations, the write loop will degenerate in something like > (pseudo code and overly simplified): > > while (true) { > apply(getNextChangeset()) > commit(); > } > > So it's a busy loop with no waits: the "target frequency of commits" > will naturally match the maximum frequency of commits which the > storage can handle, as we've said that applying the changes is not a > cost, it's essentially the same as > > while (true) { > commit(); > } > > That code will loop faster if the commits are quick. The point being > that the number of changes which we can apply in period T, does not > depend on the time it taks to do commit operations on the underlying > storage. > > The real code will need to be a bit more complex, for example to > handle this case: > > while (true) { > changeset = getNextChangeset(); > if (changeset.isEmpty) { > waitWithoutBurningCPU(); > } > else { > apply(all pending changes) > commit(); > } > > >> >> In your model of a fixed frequency, they the average delay is 1/2 * >> 1/frequency + commit time. >> Or do you have something different in mind? > > I hope the above example clarifies. It's not a fixed frequency, it's > "as fast as it can", but with latency not better than what can be > performed by a single commit. What I'm attempting to explain when > comparing "frequency" is that this is the optimal speed for each > situation, especially compared to current solution, and regardless of > queueing up. > There is an inherent form of back pressure: it's limited by the cost > of the single commit, which will delay further changesets in the > queue.. but the queue depth doesn't get larger than 1 and we don't > risk running out of space as it blocks producers, blocking the > application. 
> > Sanne > > >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Oct 20 12:51:58 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 20 Oct 2014 18:51:58 +0200 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com> Message-ID: <54453DAE.8000707@redhat.com> My assumption is that the test is run by CI. Tristan On 20/10/14 18:45, Sanne Grinovero wrote: > On 20 October 2014 17:40, Tristan Tarrant wrote: >> Sure, you still want to review it in your IDE, and maybe run local >> tests, but ultimately merging via the GitHub UI. > If you do one thing locally, and then "ultimately" press a button > there you didn't test the same thing. > > Sanne > >> Tristan >> >> On 20/10/14 18:37, Emmanuel Bernard wrote: >>> rebase is a oneliner op per branch you want to reapply whereas cherry >>> picking requires to manually select the commits you want. Underneath >>> in git guts it probably does the same. >>> >>> I have to admit I barely had the occasion to want to click the GitHub >>> UI button as except for simple documentation, reviewing code almost >>> always require to fetch the branch and look at it in an IDE of sort >>> for proper review. The documentation bit is actually even requiring >>> local run since Markdown / Asciidoc and all tend to silently fail a >>> syntax mistake. >>> >>> On 20 Oct 2014, at 18:28, Mircea Markus >> > wrote: >>> >>>> On Oct 20, 2014, at 17:21, Emmanuel Bernard >>> > wrote: >>>> >>>>> There is a difference between cherry picking and rebasing when it >>>>> comes to reapply a work on top of a branch. >>>> What is the difference? :-) >>>> >>>>> Do you dislike both equally compared to a merge (aka railroad nexus >>>>> git history approach)? >>>> Using github's "merge" button is pretty convenient imo, even though >>>> the history is not as nice as with a rebase (or cherry-pick, I miss >>>> the difference for now ) >>>> >>>>> >>>>> On 20 Oct 2014, at 16:47, Tristan Tarrant >>>> > wrote: >>>>> >>>>>> Hi guys, >>>>>> >>>>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this >>>>>> release cycle. There have been a ton of improvements (maybe too many) >>>>>> and a lot of time has passed since the previous version (maybe to >>>>>> much). >>>>>> Following up on my previous e-mail about future plans, here's a >>>>>> recap of >>>>>> a plan which I believe will allow us to move at a much quicker pace: >>>>>> >>>>>> For the next minor releases I would like to suggest the following >>>>>> strategy: >>>>>> - use a 3 month timebox where we strive to maintain master in an >>>>>> "always releasable" state >>>>>> - complex feature work will need to happen onto dedicated feature >>>>>> branches, using the usual GitHub pull-request workflow >>>>>> - only when a feature is complete (code, tests, docs, reviewed, >>>>>> CI-checked) it will be merged back into master >>>>>> - if a feature is running late it will be postponed to the >>>>>> following minor release so as not to hinder other development >>>>>> >>>>>> I am also going to suggest dropping the cherry-picking approach and >>>>>> going with git merge. 
>>>>>> In order to achieve this we need CI to be
>>>>>> always in top form with 0 failures in master. This will allow
>>>>>> merging a PR directly from GitHub's interface. We obviously need to
>>>>>> trust our tools and our existing code base.
>>>>>>
>>>>>> This is the plan for 7.1.0:
>>>>>>
>>>>>> 13 November 7.1.0.Alpha1
>>>>>> 18 December 7.1.0.Beta1
>>>>>> 15 January 7.1.0.CR1
>>>>>> 30 January 7.1.0.Final
>>>>>>
>>>>>>
>>>>>> Tristan
>>>>>>
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev at lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> Cheers,
>>>> --
>>>> Mircea Markus
>>>> Infinispan lead (www.infinispan.org)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From emmanuel at hibernate.org Mon Oct 20 12:55:47 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Mon, 20 Oct 2014 18:55:47 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <54453B16.6050202@redhat.com>
References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com>
Message-ID:

So you review locally and potentially run locally, and then you switch from your terminal console or IDE to wherever the button is in your 350 opened tabs, because it's faster than git push upstream master. I am having a hard time seeing the convenience unless you do browser-only reviews.

On 20 Oct 2014, at 18:40, Tristan Tarrant wrote:

> Sure, you still want to review it in your IDE, and maybe run local
> tests, but ultimately merging via the GitHub UI.
>
> Tristan
>
> On 20/10/14 18:37, Emmanuel Bernard wrote:
>> rebase is a oneliner op per branch you want to reapply whereas cherry
>> picking requires to manually select the commits you want. Underneath
>> in git guts it probably does the same.
>>> >>> Using github's "merge" button is pretty convenient imo, even though >>> the history is not as nice as with a rebase (or cherry-pick, I miss >>> the difference for now ) >>> >>>> >>>> >>>> On 20 Oct 2014, at 16:47, Tristan Tarrant >>> > wrote: >>>> >>>>> Hi guys, >>>>> >>>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this >>>>> release cycle. There have been a ton of improvements (maybe too many) >>>>> and a lot of time has passed since the previous version (maybe to >>>>> much). >>>>> Following up on my previous e-mail about future plans, here's a >>>>> recap of >>>>> a plan which I believe will allow us to move at a much quicker pace: >>>>> >>>>> For the next minor releases I would like to suggest the following >>>>> strategy: >>>>> - use a 3 month timebox where we strive to maintain master in an >>>>> "always releasable" state >>>>> - complex feature work will need to happen onto dedicated feature >>>>> branches, using the usual GitHub pull-request workflow >>>>> - only when a feature is complete (code, tests, docs, reviewed, >>>>> CI-checked) it will be merged back into master >>>>> - if a feature is running late it will be postponed to the >>>>> following minor release so as not to hinder other development >>>>> >>>>> I am also going to suggest dropping the cherry-picking approach and >>>>> going with git merge. In order to achieve this we need CI to be >>>>> always in top form with 0 failures in master. This will allow >>>>> merging a PR directly from GitHub's interface. We obviously need to >>>>> trust our tools and our existing code base. >>>>> >>>>> This is the plan for 7.1.0: >>>>> >>>>> 13 November 7.1.0.Alpha1 >>>>> 18 December 7.1.0.Beta1 >>>>> 15 January 7.1.0.CR1 >>>>> 30 January 7.1.0.Final >>>>> >>>>> >>>>> Tristan >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> Cheers, >>> -- >>> Mircea Markus >>> Infinispan lead (www.infinispan.org ) >>> >>> >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From sanne at infinispan.org Mon Oct 20 13:09:57 2014 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 20 Oct 2014 18:09:57 +0100 Subject: [infinispan-dev] Improving the performance of index writers In-Reply-To: <3176738B-7837-46D1-AD33-C36EF9E755F3@hibernate.org> References: <403215A0-6143-4EB4-B77E-553DB04FD587@hibernate.org> <3176738B-7837-46D1-AD33-C36EF9E755F3@hibernate.org> Message-ID: On 20 October 2014 17:49, Emmanuel Bernard wrote: > So assuming an idle index loop, the first commit would lead to the execution of the indexing work and flush. If two or more commits come during that flush time, then the queue would be > 1 and ?batching? would occur. Correct? Yes that's the idea. 
As soon as the IndexWriter thread is done with that first commit, it
gets back to check whether more work was queued up in the meantime, and
takes *all* of it for processing.

The interesting part is why taking *all* of it should not be a scary
concept here:
Memory-wise it's not a problem, as all those changesets already are on
the stack, and they still prevent further work from being created (as
producers are blocked), so there is no additional memory cost (compared
to the previous approach).
Time-wise, applying one or "all" makes no (significant) difference, as
we've seen that applying each changeset is very efficient, while the
commit is what introduces a more significant delay (orders of magnitude
difference).
Only if we had millions of changesets in the batch would there be some
noticeable change in latency, as the first threads to have enqueued
something would have to wait slightly longer than normal; but even
then, the average latency of all producers would be lower than with the
current approach, as all other producers are not waiting in a long
line.

Sanne

>
>
>
> On 20 Oct 2014, at 16:57, Sanne Grinovero wrote:
>
>> On 20 October 2014 14:55, Emmanuel Bernard wrote:
>>>
>>> On 20 Oct 2014, at 14:10, Sanne Grinovero wrote:
>>>
>>> On 20 October 2014 12:59, Emmanuel Bernard wrote:
>>>
>>> HSEARCH-1699 looks good. A few comments.
>>>
>>> Maybe from a user point of you we want to expose the number of ms the user
>>> is ok to delay a commit due to indexing. Which would mean that you can wait
>>> up to that number before calling it a day and emptying the queue. The big
>>> question I have which you elude too is whether this mechanism should have
>>> some kind of back pressure mechanism by also caping the queue size.
>>>
>>>
>>> Gustavo is implementing that for the ASYNC backend, but the SYNC
>>> backend will always block the user thread until the commit is done
>>> (and some commit is going to be done ASAP).
>>> About to write a mail to hibernate-dev to discuss the ASYNC backend
>>> property name and exact semantics.
>>>
>>>
>>> I understand that the sync mode will block until the commit is done. what I
>>> am saying is that for HSEARCH-1699 (SYNC) (and probably also for the ASYNC
>>> mode), you can ask the user ?how much more? is he willing to wait for the
>>> index to be committed compared to ?as fast as possible?. That becomes your
>>> window of aggregation. Does that make sense?
>>>
>>>
>>>
>>>
>>> BTW, in the following paragraph, either you lost me or you are talking non
>>> sense:
>>>
>>> Systems with an high degree of parallelism will benefit from this, and the
>>> performance should converge to the performance you would have without every
>>> doing a commit; however if the frequency of commits is apparoching to zero,
>>> it also means that the average latency of each operation will get
>>> significantly higher. Still, in such situations assuming we are for example
>>> stacking up a million changesets between each commit, that implies this
>>> solution would be approximately a million times faster than the existing
>>> design (A million would not be realistic of course as it implies a million
>>> of parallel requests).
>>>
>>>
>>> I think you can only converge to an average of 1/2 * (commit + configured
>>> delay time) latency wise. I am assuming latency is what people are
>>> interested in, not the average CPU / memory load of indexing.
>>>
>>>
>>> I'm sorry I'm confused. There is no configured delay time for the SYNC
>>> backend discussed on HSEARCH-1699, are you talking about the Async
>>> one?
But my paragraph above is strictly referring tot the strategy >>> meant to be applied for the Sync one. >>> >>> >>> There is a delay. it is what you call the "target frequency of commits?. And >>> my alternative that i proposed is not su much a frequency rather than how >>> much more you delay a flush in the hope of getting more work in. >> >> No there is no delay, in case there is a constant flow of incoming >> write operations, the write loop will degenerate in something like >> (pseudo code and overly simplified): >> >> while (true) { >> apply(getNextChangeset()) >> commit(); >> } >> >> So it's a busy loop with no waits: the "target frequency of commits" >> will naturally match the maximum frequency of commits which the >> storage can handle, as we've said that applying the changes is not a >> cost, it's essentially the same as >> >> while (true) { >> commit(); >> } >> >> That code will loop faster if the commits are quick. The point being >> that the number of changes which we can apply in period T, does not >> depend on the time it taks to do commit operations on the underlying >> storage. >> >> The real code will need to be a bit more complex, for example to >> handle this case: >> >> while (true) { >> changeset = getNextChangeset(); >> if (changeset.isEmpty) { >> waitWithoutBurningCPU(); >> } >> else { >> apply(all pending changes) >> commit(); >> } >> >> >>> >>> In your model of a fixed frequency, they the average delay is 1/2 * >>> 1/frequency + commit time. >>> Or do you have something different in mind? >> >> I hope the above example clarifies. It's not a fixed frequency, it's >> "as fast as it can", but with latency not better than what can be >> performed by a single commit. What I'm attempting to explain when >> comparing "frequency" is that this is the optimal speed for each >> situation, especially compared to current solution, and regardless of >> queueing up. >> There is an inherent form of back pressure: it's limited by the cost >> of the single commit, which will delay further changesets in the >> queue.. but the queue depth doesn't get larger than 1 and we don't >> risk running out of space as it blocks producers, blocking the >> application. >> >> Sanne >> >> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From an1310 at hotmail.com Mon Oct 20 14:35:35 2014 From: an1310 at hotmail.com (Erik Salter) Date: Mon, 20 Oct 2014 14:35:35 -0400 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <54452083.4020502@redhat.com> References: <54452083.4020502@redhat.com> Message-ID: With this more agile release cycle, can users expect minor releases to be compatible with the previous release? Or will we still need to use the RollingUpgrade path? Regards, Erik On 10/20/14, 10:47 AM, "Tristan Tarrant" wrote: >Hi guys, > >with the imminent release of 7.0.0.CR2 we are reaching the end of this >release cycle. There have been a ton of improvements (maybe too many) >and a lot of time has passed since the previous version (maybe to much). 
>Following up on my previous e-mail about future plans, here's a recap of
>a plan which I believe will allow us to move at a much quicker pace:
>
>For the next minor releases I would like to suggest the following
>strategy:
>- use a 3 month timebox where we strive to maintain master in an "always
>releasable" state
>- complex feature work will need to happen onto dedicated feature
>branches, using the usual GitHub pull-request workflow
>- only when a feature is complete (code, tests, docs, reviewed,
>CI-checked) it will be merged back into master
>- if a feature is running late it will be postponed to the following
>minor release so as not to hinder other development
>
>I am also going to suggest dropping the cherry-picking approach and going
>with git merge. In order to achieve this we need CI to be always in top
>form with 0 failures in master. This will allow merging a PR directly
>from GitHub's interface. We obviously need to trust our tools and our
>existing code base.
>
>This is the plan for 7.1.0:
>
>13 November 7.1.0.Alpha1
>18 December 7.1.0.Beta1
>15 January 7.1.0.CR1
>30 January 7.1.0.Final
>
>
>Tristan
>
>_______________________________________________
>infinispan-dev mailing list
>infinispan-dev at lists.jboss.org
>https://lists.jboss.org/mailman/listinfo/infinispan-dev

From slaskawi at redhat.com Tue Oct 21 03:41:05 2014
From: slaskawi at redhat.com (=?UTF-8?B?U2ViYXN0aWFuIMWBYXNrYXdpZWM=?=)
Date: Tue, 21 Oct 2014 09:41:05 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <54452083.4020502@redhat.com>
References: <54452083.4020502@redhat.com>
Message-ID: <54460E11.2000705@redhat.com>

+1000, I think that's a big step in a good direction.

As Tristan said, having 0 test failures is essential here. I would say
even more: Pull Requests without a green tick from CI shouldn't be
considered "ready for review".

The 0-test-failures rule has one additional side effect: if for some
reason a test failure gets into the repo (e.g. by merging 2 similar PRs
which change test data, especially in text files), all other Pull
Requests will start to fail from that point. This will oblige us to fix
the failure before merging further Pull Requests, and it makes tracking
down failure-introducing commits really easy (or at least much easier
than it is now).

+1 for merging using the GitHub UI - it simply saves us time. Searching
through the railroad history is a bit harder, but on the other hand
"git log --merges --graph" shows you a very nice history of merged
features (not individual commits). It might be really useful in some
cases.

Thanks!
Sebastian

On 10/20/2014 04:47 PM, Tristan Tarrant wrote:
> I am also going to suggest dropping the cherry-picking approach and going with git merge. In order to achieve this we need CI to be always in top form with 0 failures in master. This will allow merging a PR directly from GitHub's interface. We obviously need to trust our tools and our existing code base.

From sanne at infinispan.org Tue Oct 21 04:27:11 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Tue, 21 Oct 2014 09:27:11 +0100
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <54460E11.2000705@redhat.com>
References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com>
Message-ID:

On 21 October 2014 08:41, Sebastian Łaskawiec wrote:
> +1000, I think that's a big step in a good direction.
>
> As Tristan said, having 0 test failures is essential here. I would say
> even more: Pull Requests without a green tick from CI shouldn't be
> considered "ready for review".
>
> Having the 0 test failures rule has one additional side effect - if for
> some reason a test failure gets into the repo (e.g. merging 2 similar
> PRs which change test data - especially in text files) - all other Pull
> Requests will start to fail from that point. This will oblige us to fix
> the failure before merging further Pull Requests. This makes tracking
> down failure commits really easy (or at least much easier than it is
> now).

I totally agree here, but it never worked: people regularly ignore
failing tests, for various reasons.
We've had similar good intentions expressed many times, but I simply
have no reason to believe that this time it's going to work out.

> +1 for merging using the Github UI - it simply saves us time. Searching
> through railroad history is a bit harder - but on the other hand "git
> log --merges --graph" shows you a very nice history of merged features
> (not individual commits). It might be really useful in some cases.

I'm sceptical but we can try. I should admit that part of my scepticism
comes from having had bad experiences with merge, but that's probably
caused by the fact that I generally don't use them at all in my
workflows.
That said, I did face major horror stories caused by merge - especially
from occasional contributors who might get confused by it - and I
wouldn't try it on projects I care for.

Good luck.
Sanne

>
> Thanks!
> Sebastian
>
> On 10/20/2014 04:47 PM, Tristan Tarrant wrote:
>> I am also going to suggest dropping the cherry-picking approach and
>> going with git merge. In order to achieve this we need CI to be always
>> in top form with 0 failures in master. This will allow merging a PR
>> directly from GitHub's interface. We obviously need to trust our tools
>> and our existing code base.
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From slaskawi at redhat.com  Tue Oct 21 04:59:19 2014
From: slaskawi at redhat.com (=?UTF-8?B?U2ViYXN0aWFuIMWBYXNrYXdpZWM=?=)
Date: Tue, 21 Oct 2014 10:59:19 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: 
References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com>
Message-ID: <54462067.3050501@redhat.com>

I think we can work on this one...

First of all - contributors need to know about this rule, perhaps
updating [1] might be a good idea. An official announcement on the
mailing list might also be helpful (this email thread is already pretty
long, so it might be missed by many folks).

Secondly - we need to stop integrating Pull Requests with new failures.
It's a bit harder when we have some existing failures, because there is
always an excuse (this failure is not related, it's just an unstable
test etc). But once we have a clean build - it's a "binary" decision.
I think we might also add some descriptive comment to a Pull Request
when the build is unstable - something like "This Pull Request won't be
integrated, because it's unstable. Fix it first." (a sketch of how such
a gate could be automated follows this message).

[1] http://infinispan.org/docs/7.0.x/contributing/contributing.html

On 10/21/2014 10:27 AM, Sanne Grinovero wrote:
> I totally agree here, but it never worked: people regularly ignore
> failing tests, for various reasons.
> We've had similar good intentions expressed many times, but I simply
> have no reason to believe that this time it's going to work out.
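The comment gate Sebastian describes above maps naturally onto GitHub's
commit status API. The following is only an illustrative sketch, not
anything the project has in place: it assumes a TeamCity webhook hands
over the head commit of the pull request, and that a token with
repo:status scope is available in the GITHUB_TOKEN environment variable.
Posting a "failure" status makes the red mark show up on the pull
request page, so nobody has to police unstable PRs by hand:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class UnstableBuildGate {

   // 'sha' is the head commit of the pull request, as reported by CI.
   static void markUnstable(String sha) throws Exception {
      URL url = new URL("https://api.github.com/repos/infinispan/infinispan/statuses/" + sha);
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      conn.setRequestMethod("POST");
      // Token with repo:status scope; read from the environment, never hard-coded.
      conn.setRequestProperty("Authorization", "token " + System.getenv("GITHUB_TOKEN"));
      conn.setRequestProperty("Content-Type", "application/json");
      conn.setDoOutput(true);
      String body = "{\"state\":\"failure\","
            + "\"context\":\"teamcity/testsuite\","
            + "\"description\":\"This Pull Request won't be integrated, the build is unstable. Fix it first.\"}";
      try (OutputStream out = conn.getOutputStream()) {
         out.write(body.getBytes(StandardCharsets.UTF_8));
      }
      if (conn.getResponseCode() != 201) {   // the statuses endpoint answers 201 Created
         throw new IllegalStateException("GitHub replied " + conn.getResponseCode());
      }
   }
}

The POST /repos/:owner/:repo/statuses/:sha call is the standard GitHub
API; everything around it - who invokes this and when - is exactly the
part the thread still has to agree on.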
From rory.odonnell at oracle.com Tue Oct 21 06:29:03 2014 From: rory.odonnell at oracle.com (Rory O'Donnell) Date: Tue, 21 Oct 2014 11:29:03 +0100 Subject: [infinispan-dev] Early Access builds for JDK 9 b35 and JDK 8u40 b10 are available on java.net Message-ID: <5446356F.4080709@oracle.com> Hi Galder, Early Access build for JDK 9 b35 is available on java.net, summary of changes are listed here Early Access build for JDK 8u40 b10 is available on java.net, summary of changes are listed here. Rgds,Rory -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141021/a335a393/attachment.html From dan.berindei at gmail.com Tue Oct 21 06:47:06 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 21 Oct 2014 13:47:06 +0300 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <54462067.3050501@redhat.com> References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com> <54462067.3050501@redhat.com> Message-ID: On Tue, Oct 21, 2014 at 11:59 AM, Sebastian ?askawiec wrote: > I think we can work on this one... > > First of all - contributors need to know about this rule, perhaps > updating [1] might be a good idea. Official announcement on mailing list > might be also helpful (this email thread is already pretty long, so it > might be missed by many folks). > Like Sanne said, we did decide to go for a green test suite a great many times on this list, so I don't think lack of awareness is an issue. In fact, I was volunteered to monitor the TeamCity test results and create a blocker issue for each failing test some time ago, but finding the proper owner for bugs proved to be quite time consuming so I haven't been sticking to it. This thread did motivate me to create a few new blocker issues, however :) > > Secondly - we need to stop integrating Pull Requests with new failures. > It's a bit harder when we have some existing failures, because there is > always an excuse (this failure is not related, it's just an unstable > test etc). But once we have clean build - it's a "binary" decision. > I think we might also add some descriptive comment to Pull Request when > the build is unstable - something like "This Pull Request won't be > integrated, because it's unstable. Fix it first.". > Of course, the question is how we are going to achieve that magical clean build status... At one point we moved the random failing tests to the unstable group/category to remove them from the main build, but I've noticed that we've never really brought back an unstable test, at least in the core, so I've stopped doing it. I figured having the test failures in every email from TeamCity would motivate people to look into those failures, but I'm not sure anyone reads the build failure emails from TeamCity :) And it's not enough to have one master build with 0 failures. We had that before, and it didn't really help. We have to have at least a week with 0 failures on master (with JDK7, with JDK8, and with TRACE enabled, on the slow EC2 agent, on the fast OpenStack agents etc.) before we can say the build is really clean. I'm not saying we can't do it, but I share Sanne's pessimism: this time is not that different from the last time we committed to a clean build. 
> > [1] http://infinispan.org/docs/7.0.x/contributing/contributing.html > > On 10/21/2014 10:27 AM, Sanne Grinovero wrote: > > I totally agree here, but it never worked: people regularly ignore > > failing tests, for various reasons. > > We've had similar good intentions expressed many times, but I simply > > have no reason to believe that this time it's going to work out. > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141021/0d3c859a/attachment-0001.html From rvansa at redhat.com Tue Oct 21 06:53:27 2014 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 21 Oct 2014 12:53:27 +0200 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <54462067.3050501@redhat.com> References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com> <54462067.3050501@redhat.com> Message-ID: <54463B27.2080505@redhat.com> We might also want to define reproducer test commits, and how should these be integrated. Is the recommended workflow to have one commit with issue reproducer test (so that we can see that it was broken prior to the fix my checking out just this commit), and following commit fixing the issue? Should those two commits be squashed when pulling the PR (so that any master commit is always green), or cherry-picked one-by-one (having only the actual HEAD on master green any time)? I miss the guideline. Thanks Radim On 10/21/2014 10:59 AM, Sebastian ?askawiec wrote: > I think we can work on this one... > > First of all - contributors need to know about this rule, perhaps > updating [1] might be a good idea. Official announcement on mailing list > might be also helpful (this email thread is already pretty long, so it > might be missed by many folks). > > Secondly - we need to stop integrating Pull Requests with new failures. > It's a bit harder when we have some existing failures, because there is > always an excuse (this failure is not related, it's just an unstable > test etc). But once we have clean build - it's a "binary" decision. > I think we might also add some descriptive comment to Pull Request when > the build is unstable - something like "This Pull Request won't be > integrated, because it's unstable. Fix it first.". > > [1] http://infinispan.org/docs/7.0.x/contributing/contributing.html > > On 10/21/2014 10:27 AM, Sanne Grinovero wrote: >> I totally agree here, but it never worked: people regularly ignore >> failing tests, for various reasons. >> We've had similar good intentions expressed many times, but I simply >> have no reason to believe that this time it's going to work out. 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss DataGrid QA

From slaskawi at redhat.com  Tue Oct 21 07:35:31 2014
From: slaskawi at redhat.com (=?UTF-8?B?U2ViYXN0aWFuIMWBYXNrYXdpZWM=?=)
Date: Tue, 21 Oct 2014 13:35:31 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: 
References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com>
	<54462067.3050501@redhat.com>
Message-ID: <54464503.7020903@redhat.com>

On 10/21/2014 12:47 PM, Dan Berindei wrote:
> In fact, I was volunteered to monitor the TeamCity test results and
> create a blocker issue for each failing test some time ago, but
> finding the proper owner for bugs proved to be quite time consuming so
> I haven't been sticking to it. This thread did motivate me to create a
> few new blocker issues, however :)

I believe we need to change our strategy at this point. We don't want to
create new issues - we want to motivate everybody to fix them (and fix
them fast). As I said - when a failure gets into our repo - all
successive Pull Requests will start to fail. Nobody will be able to
integrate his changes and everybody (not everybody - some guys who are
in a hurry) will probably want to unblock themselves... The easiest way
to do that is to fix the build...

This is the main idea... To make a failing test a serious problem and
not just another "easy to ignore" issue...

> Of course, the question is how we are going to achieve that magical
> clean build status...

I've got some idea - it's pretty controversial, but maybe you will like
it :)

* Remove every failing test from our code base - just delete it (no
ignoring, no adding to a separate testsuite - just delete).
* Create a separate branch and place all those tests there - simply
revert the commit which removed them from master.
* Organize a failed-test-bounty with our Community - ask them to fix as
many as possible during a fixed amount of time (a month or two? maybe
shorter?).
* Every contributor in the failed-test-bounty will be listed in the
"Thanks" section of the release notes
* After the bounty is over, we'll just delete the tests which were not
fixed...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141021/8c816e8c/attachment.html

From sanne at infinispan.org  Tue Oct 21 07:46:42 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Tue, 21 Oct 2014 12:46:42 +0100
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <54464503.7020903@redhat.com>
References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com>
	<54462067.3050501@redhat.com> <54464503.7020903@redhat.com>
Message-ID: 

Hi Sebastian,
I'm not against the idea at all, I would really love it and I agree
that this is the biggest pain and waste of time in trying to make
progress on Infinispan. I'm just trying to be realistic and warn you
that these great intentions didn't work in the past.
Now my hope is that maybe we now have more people agreeing on how
important this is, so maybe we're in a better position, and I'm always
willing to try this again.

I'm sceptical though about embarking on a PR process which assumes that
we're going to be able to keep the testsuite green just because we
decide so; let's first work on a green testsuite, and then prove that
we can keep it that way for a reasonable time... then we can talk about
building something on reliable foundations.
[BTW I can't answer inline as you're sending HTML formatted email and gmail isn't properly separating your quote from Dan's previous words] Why would you delete the tests and not simply ignore them? You can revert the change as well. Both strategies would have the same effect, but ignoring them you don't need to take notes of what disappeared; best part, is that code using APIs get refactored together with other changes, and you minimize conflicts as you minimize the number of lines being changed. Sanne On 21 October 2014 12:35, Sebastian ?askawiec wrote: > On 10/21/2014 12:47 PM, Dan Berindei wrote: > > In fact, I was volunteered to monitor the TeamCity test results and create a > blocker issue for each failing test some time ago, but finding the proper > owner for bugs proved to be quite time consuming so I haven't been sticking > to it. This thread did motivate me to create a few new blocker issues, > however :) > > I believe we need to change our strategy in this point. We don't want to > create new issues - we want to motivate everybody to fix it (and fix it > fast). As I said - when the failure gets into our repo - all successive Pull > Requests will start to fail. Nobody will be able to integrate his changes > and everybody (not everybody - some guys which are in hurry) will probably > want to unblock themselves... The easiest way to do that is to fix the > build... > > This is the main idea... To make failing test a serious problem and not just > another "easy to ignore" issue... > > Of course, the question is how we are going to achieve that magical clean > build status... > > I've got some idea - it's pretty controversial, but maybe you will like it > :) > > Remove every failing test from our code base - just delete it (no ignoring, > no adding to separate testsuite - just delete). > Create separate branch and place all those tests there - simply revert > commit which removed them from master. > Organize failed-test-bounty with our Community - ask them to fix as many as > possible during fixed amount of time (a month or two? maybe shorter?). > Every contributor in failed-test-bounty will be listed in "Thanks" section > of the release notes > After the bounty is over, we'll just delete tests which were not fixed... > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From gustavonalle at gmail.com Tue Oct 21 08:05:52 2014 From: gustavonalle at gmail.com (Gustavo Fernandes) Date: Tue, 21 Oct 2014 13:05:52 +0100 Subject: [infinispan-dev] Infinispan 7.0.0.CR2 released! Message-ID: Dear all, We are proud to announce the second release candidate for Infinispan 7.0 Release details on http://blog.infinispan.org/2014/10/infinispan-7.html Thanks everyone for their contributions! Cheers, Gustavo From emmanuel at hibernate.org Tue Oct 21 09:13:01 2014 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Tue, 21 Oct 2014 15:13:01 +0200 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <54464503.7020903@redhat.com> References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com> <54462067.3050501@redhat.com> <54464503.7020903@redhat.com> Message-ID: <20141021131301.GK3066@hibernate.org> No shower and no beer until the next fully green result (including tests left over as unstable). 
That should be motivating enough ;) On Tue 2014-10-21 13:35, Sebastian ?askawiec wrote: > On 10/21/2014 12:47 PM, Dan Berindei wrote: > >In fact, I was volunteered to monitor the TeamCity test results and create > >a blocker issue for each failing test some time ago, but finding the > >proper owner for bugs proved to be quite time consuming so I haven't been > >sticking to it. This thread did motivate me to create a few new blocker > >issues, however :) > I believe we need to change our strategy in this point. We don't want to > create new issues - we want to motivate everybody to fix it (and fix it > fast). As I said - when the failure gets into our repo - all successive Pull > Requests will start to fail. Nobody will be able to integrate his changes > and everybody (not everybody - some guys which are in hurry) will probably > want to unblock themselves... The easiest way to do that is to fix the > build... > > This is the main idea... To make failing test a serious problem and not just > another "easy to ignore" issue... > >Of course, the question is how we are going to achieve that magical clean > >build status... > I've got some idea - it's pretty controversial, but maybe you will like it > :) > > * Remove every failing test from our code base - just delete it (no > ignoring, no adding to separate testsuite - just delete). > * Create separate branch and place all those tests there - simply > revert commit which removed them from master. > * Organize failed-test-bounty with our Community - ask them to fix as > many as possible during fixed amount of time (a month or two? maybe > shorter?). > * Every contributor in failed-test-bounty will be listed in "Thanks" > section of the release notes > * After the bounty is over, we'll just delete tests which were not > fixed... > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Tue Oct 21 09:27:57 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 21 Oct 2014 16:27:57 +0300 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com> <54462067.3050501@redhat.com> <54464503.7020903@redhat.com> Message-ID: On Tue, Oct 21, 2014 at 2:46 PM, Sanne Grinovero wrote: > Hi Sebastian, > I'm not against the idea at all, I would really love it and I agree > that this is the biggest pain and waste of time in trying to make > progress on Infinispan. I'm just trying to be realistic and warn you > that these great intentions didn't work in the past. > Now my hope is that maybe we now have more people agreeing on how > important this is, so maybe we're in a better position, and I'm always > willing to try this again. > > I'm sceptical though on embarking into a PR process which assumes that > we're going to be able to keep the testsuite green just because we > decide so; let's first work on a green testsuite, and then proof that > we can keep it that way for a resonsable time.. then we can talk about > building something on reliable foundations. > +1 > > [BTW I can't answer inline as you're sending HTML formatted email and > gmail isn't properly separating your quote from Dan's previous words] > > Why would you delete the tests and not simply ignore them? You can > revert the change as well. 
> Both strategies would have the same effect, but ignoring them you > don't need to take notes of what disappeared; best part, is that code > using APIs get refactored together with other changes, and you > minimize conflicts as you minimize the number of lines being changed. > Hey, if fixing compilation errors would have been enough, we wouldn't have so many tests in the unstable group that *always* fail. So Sebastian has a point, if we don't have a clear deadline for bringing the failing tests back into the main build, we might as well delete them. Except I don't want to delete tests, I want to make them work. And I didn't see anything in his proposal about tests that fail only once a week, or only on a certain agent, or only if TRACE logging is enabled... Cheers Dan > > Sanne > > On 21 October 2014 12:35, Sebastian ?askawiec wrote: > > On 10/21/2014 12:47 PM, Dan Berindei wrote: > > > > In fact, I was volunteered to monitor the TeamCity test results and > create a > > blocker issue for each failing test some time ago, but finding the proper > > owner for bugs proved to be quite time consuming so I haven't been > sticking > > to it. This thread did motivate me to create a few new blocker issues, > > however :) > > > > I believe we need to change our strategy in this point. We don't want to > > create new issues - we want to motivate everybody to fix it (and fix it > > fast). As I said - when the failure gets into our repo - all successive > Pull > > Requests will start to fail. Nobody will be able to integrate his changes > > and everybody (not everybody - some guys which are in hurry) will > probably > > want to unblock themselves... The easiest way to do that is to fix the > > build... > > > > This is the main idea... To make failing test a serious problem and not > just > > another "easy to ignore" issue... > > > > Of course, the question is how we are going to achieve that magical clean > > build status... > > > > I've got some idea - it's pretty controversial, but maybe you will like > it > > :) > > > > Remove every failing test from our code base - just delete it (no > ignoring, > > no adding to separate testsuite - just delete). > > Create separate branch and place all those tests there - simply revert > > commit which removed them from master. > > Organize failed-test-bounty with our Community - ask them to fix as many > as > > possible during fixed amount of time (a month or two? maybe shorter?). > > Every contributor in failed-test-bounty will be listed in "Thanks" > section > > of the release notes > > After the bounty is over, we'll just delete tests which were not fixed... > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141021/59459b95/attachment-0001.html

From gustavonalle at gmail.com  Tue Oct 21 10:08:40 2014
From: gustavonalle at gmail.com (Gustavo Fernandes)
Date: Tue, 21 Oct 2014 15:08:40 +0100
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <54464503.7020903@redhat.com>
References: <54452083.4020502@redhat.com> <54460E11.2000705@redhat.com>
	<54462067.3050501@redhat.com> <54464503.7020903@redhat.com>
Message-ID: 

>
> * Remove every failing test from our code base - just delete it (no
> ignoring, no adding to a separate testsuite - just delete).
> * Create a separate branch and place all those tests there - simply
> revert the commit which removed them from master.
> * Organize a failed-test-bounty with our Community - ask them to fix as
> many as possible during a fixed amount of time (a month or two? maybe
> shorter?).
> * Every contributor in the failed-test-bounty will be listed in the
> "Thanks" section of the release notes
> * After the bounty is over, we'll just delete the tests which were not
> fixed...

-1 to delete tests, unless it's clear they are rubbish.

A failing test, especially one that fails some times, on some specific
machines, when it rains, is potentially exposing a race condition in the
code it is testing: I can't see any other way to handle it than
understanding why it fails and fixing either the test or the feature.

Gustavo

>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From sanne at infinispan.org  Tue Oct 21 17:46:48 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Tue, 21 Oct 2014 22:46:48 +0100
Subject: [infinispan-dev] On ParserRegistry and classloaders
In-Reply-To: <54413FB3.3020409@redhat.com>
References: <9A22CC58-5B20-4D6B-BA6B-B4A23493979F@redhat.com>
	<53831239.30601@redhat.com> <114950B5-AA9F-485C-8E3C-AC36299974AB@redhat.com>
	<53D945B4.7070508@redhat.com> <54413FB3.3020409@redhat.com>
Message-ID: 

On 17 October 2014 17:11, Ion Savin wrote:
> Hi Sanne,
>
>> Caused by: java.lang.ClassNotFoundException:
>> org.infinispan.remoting.transport.jgroups.JGroupsTransport from
>> [Module \"deployment.ModuleMemberRegistrationIT.war:main\" from
>> Service Module Loader]"}}
>
>
> Can you please share also the full stack for this exception?
> Thanks!

Hi Ion,
I'm currently unable to find time to reassemble the failing experiment
as I've moved on to alternatives.
I will try to give you directions so that we can have a proper test
failing for this in the Infinispan project.

Sanne

From vjuranek at redhat.com  Wed Oct 22 03:43:15 2014
From: vjuranek at redhat.com (Vojtech Juranek)
Date: Wed, 22 Oct 2014 09:43:15 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: 
References: <54452083.4020502@redhat.com> <54464503.7020903@redhat.com>
Message-ID: <4999398.dViRhNXb2H@localhost>

On Tuesday 21 October 2014 12:46:42 Sanne Grinovero wrote:
> I'm sceptical though about embarking on a PR process which assumes that
> we're going to be able to keep the testsuite green just because we
> decide so; let's first work on a green testsuite, and then prove that
> we can keep it that way for a reasonable time...

+1, a green testsuite is a mandatory condition for any other action.
Let's start with smaller steps - I'd propose to make the Master JDK 7
build green first.
Currently [1] the following tests fail:
* ExampleConfigsIT.testXsiteConfig - should be fixed by PR #2973
* HotRodRemoteCacheIT.testPutAsync - I investigated it a few days ago;
it's IMHO not a random failure but a regular bug - ISPN-4813
* HotRodRemoteCacheIT.testCustom*, testEven* - I'll volunteer to
investigate this issue
* StateTransferSuppressIT.testRebalanceWith* - any volunteer for
investigating this one?

Once the root causes are investigated, we can fix the issues and try to
keep this build green.

> then we can talk about
> building something on reliable foundations.

actually, if there's a failing test and it's clear which commit has
caused it, we can stop merging PRs from the developer who has introduced
the regression until he fixes it. This should be a clear message that
the first thing he should work on is to fix the test. IMHO we can start
with it now and we don't need a green testsuite for it

[1] http://ci.infinispan.org/viewLog.html?buildId=13323&tab=buildResultsDiv&buildTypeId=bt8&logTab=
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 473 bytes
Desc: This is a digitally signed message part.
Url : http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141022/2caee779/attachment.bin

From dan.berindei at gmail.com  Wed Oct 22 04:39:49 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Wed, 22 Oct 2014 11:39:49 +0300
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To: <4999398.dViRhNXb2H@localhost>
References: <54452083.4020502@redhat.com> <54464503.7020903@redhat.com>
	<4999398.dViRhNXb2H@localhost>
Message-ID: 

On Wed, Oct 22, 2014 at 10:43 AM, Vojtech Juranek wrote:

> On Tuesday 21 October 2014 12:46:42 Sanne Grinovero wrote:
> > I'm sceptical though about embarking on a PR process which assumes that
> > we're going to be able to keep the testsuite green just because we
> > decide so; let's first work on a green testsuite, and then prove that
> > we can keep it that way for a reasonable time...
>
> +1, a green testsuite is a mandatory condition for any other action.
> Let's start with smaller steps - I'd propose to make the Master JDK 7
> build green first.
>

+1

> Currently [1] the following tests fail:
> * ExampleConfigsIT.testXsiteConfig - should be fixed by PR #2973
>

Integrated

> * HotRodRemoteCacheIT.testPutAsync - I investigated it a few days ago;
> it's IMHO not a random failure but a regular bug - ISPN-4813
>

Does anyone volunteer for ISPN-4813?

> * HotRodRemoteCacheIT.testCustom*, testEven* - I'll volunteer to
> investigate this issue
> * StateTransferSuppressIT.testRebalanceWith* - any volunteer for
> investigating this one?
>

I'll get this one, it might be related to the partition handling work.

There are a couple more failures in the last 5 days that we should
probably look at:
http://ci.infinispan.org/project.html?projectId=Infinispan&buildTypeId=bt8&page=1&tab=tests

> Once the root causes are investigated, we can fix the issues and try to
> keep this build green.
>
> > then we can talk about
> > building something on reliable foundations.
>
> actually, if there's a failing test and it's clear which commit has
> caused it, we can stop merging PRs from the developer who has introduced
> the regression until he fixes it. This should be a clear message that
> the first thing he should work on is to fix the test. IMHO we can start
> with it now and we don't need a green testsuite for it
>

In theory that sounds good, but who do we stop merging for on account of
the current failures? :)

> [1]
> http://ci.infinispan.org/viewLog.html?buildId=13323&tab=buildResultsDiv&buildTypeId=bt8&logTab=
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141022/86883021/attachment.html

From galder at redhat.com  Wed Oct 22 04:58:53 2014
From: galder at redhat.com (=?windows-1252?Q?Galder_Zamarre=F1o?=)
Date: Wed, 22 Oct 2014 10:58:53 +0200
Subject: [infinispan-dev] Analysis of infinispan-6.0.2 dependency on
	JDK-Internal APIs
In-Reply-To: 
References: <54227901.8020509@oracle.com> <54227EF9.6090403@oracle.com>
	<5437AE80.6050302@oracle.com>
Message-ID: <722448FF-6E48-414A-AE09-C3737A799315@redhat.com>

On 10 Oct 2014, at 13:37, Dan Berindei wrote:

> Hi Rory
>
> Galder is on PTO for another week, so I'll try to answer instead.
>
> We only use sun.misc.Unsafe directly, in order to implement a variation
> of Doug Lea's ConcurrentHashMapV8 that accepts a custom Equivalence
> (implementation of equality/hashCode). I guess we'll have to switch to
> AtomicFieldUpdaters if we want it to work with JDK 9, and possibly move
> to the volatile extensions once they are implemented. By that point,
> Doug might even have an updated ConcurrentHashMapV8 version that can be
> run with JDK 9.

We could ask on the concurrency-interest list, but I'd give them some
time yet.

> The rest of the internal class usages seem to be from our dependencies
> on WildFly, JBoss Marshalling, LevelDB, Smooks, and JBoss
> MicroContainer. Smooks and JBoss MicroContainer likely won't see any
> updates for JDK 9, but they're only used in the demos so they're not
> critical. JBoss Marshalling is used in the core, however, so we'll need
> a release from them before we can run anything on JDK 9.
>
> Cheers
> Dan
>
>
> On Fri, Oct 10, 2014 at 1:01 PM, Rory O'Donnell Oracle, Dublin Ireland wrote:
> Hi Galder,
>
> Did you have time to review the report, any feedback ?
>
> Rgds,Rory
>
> On 24/09/2014 09:21, Rory O'Donnell Oracle, Dublin Ireland wrote:
>> Below is a text output of the report for infinispan-6.0.2.
>>
>> Rgds,Rory
>>
>> JDK Internal API Usage Report for infinispan-6.0.2.Final-all
>>
>> The OpenJDK Quality Outreach campaign has run a compatibility report
>> to identify usage of JDK-internal APIs. Usage of these JDK-internal
>> APIs could pose compatibility issues, as the Java team explained in
>> 1996. We have created this report to help you identify which
>> JDK-internal APIs your project uses, what to use instead, and where
>> those changes should go. Making these changes will improve your
>> compatibility, and in some cases give better performance.
>>
>> Migrating away from the JDK-internal APIs now will give your team
>> adequate time for testing before the release of JDK 9. If you are
>> unable to migrate away from an internal API, please provide us with
>> an explanation below to help us understand it better. As a reminder,
>> supported APIs are determined by the OpenJDK's Java Community Process
>> and not by Oracle.
>>
>> This report was generated by jdeps through static analysis of
>> artifacts: it does not identify any usage of those APIs through
>> reflection or dynamic bytecode. You may also run jdeps on your own if
>> you would prefer.
>>
>> Summary of the analysis of the jar files within
>> infinispan-6.0.2.Final-all:
>>
>> - Number of jar files depending on JDK-internal APIs: 10
>> - Internal APIs that have known replacements: 0
>> - Internal APIs that have no supported replacements: 73
>>
>> APIs that have known replacements: none.
>>
>> JDK-internal APIs without supported replacements, grouped by the jar
>> that uses them (the original report lists these as 73 numbered rows):
>>
>> lib/freemarker-2.3.11.jar:
>>   com.sun.org.apache.xml.internal.utils.PrefixResolver
>>   com.sun.org.apache.xpath.internal.XPath
>>   com.sun.org.apache.xpath.internal.XPathContext
>>   com.sun.org.apache.xpath.internal.objects.XBoolean, XNodeSet, XNull,
>>   XNumber, XObject, XString
>>
>> lib/xercesImpl-2.9.1.jar:
>>   org.w3c.dom.html.HTMLAnchorElement, HTMLAppletElement,
>>   HTMLAreaElement, HTMLBRElement, HTMLBaseElement, HTMLBaseFontElement,
>>   HTMLBodyElement, HTMLButtonElement, HTMLCollection, HTMLDListElement,
>>   HTMLDirectoryElement, HTMLDivElement, HTMLDocument, HTMLElement,
>>   HTMLFieldSetElement, HTMLFontElement, HTMLFormElement,
>>   HTMLFrameElement, HTMLFrameSetElement, HTMLHRElement, HTMLHeadElement,
>>   HTMLHeadingElement, HTMLHtmlElement, HTMLIFrameElement,
>>   HTMLImageElement, HTMLInputElement, HTMLIsIndexElement, HTMLLIElement,
>>   HTMLLabelElement, HTMLLegendElement, HTMLLinkElement, HTMLMapElement,
>>   HTMLMenuElement, HTMLMetaElement, HTMLModElement, HTMLOListElement,
>>   HTMLObjectElement, HTMLOptGroupElement, HTMLOptionElement,
>>   HTMLParagraphElement, HTMLParamElement, HTMLPreElement,
>>   HTMLQuoteElement, HTMLScriptElement, HTMLSelectElement,
>>   HTMLStyleElement, HTMLTableCaptionElement, HTMLTableCellElement,
>>   HTMLTableColElement, HTMLTableElement, HTMLTableRowElement,
>>   HTMLTableSectionElement, HTMLTextAreaElement, HTMLTitleElement,
>>   HTMLUListElement (all in org.w3c.dom.html)
>>   org.w3c.dom.ranges.DocumentRange, Range, RangeException
>>
>> lib/aesh-0.33.7.jar:
>>   sun.misc.Signal, sun.misc.SignalHandler
>>
>> lib/avro-1.7.5.jar, lib/guava-12.0.jar,
>> lib/infinispan-commons-6.0.2.Final.jar, lib/mvel2-2.0.12.jar,
>> lib/scala-library-2.10.2.jar:
>>   sun.misc.Unsafe
>>
>> lib/leveldb-0.5.jar:
>>   sun.nio.ch.FileChannelImpl
>>
>> lib/jboss-marshalling-1.4.4.Final.jar:
>>   sun.reflect.ReflectionFactory
>>   sun.reflect.ReflectionFactory$GetReflectionFactoryAction
>>
>> Identify External Replacements
>>
>> You should use a separate third-party library that performs this
>> functionality.
>> >> ID Internal API (grouped by package) Used By Identify External Replacement >> >> >> On 24/09/2014 08:55, Rory O'Donnell Oracle, Dublin Ireland wrote: >>> Hi Galder, >>> >>> As part of the preparations for JDK 9, Oracle?s engineers have been analyzing open source projects like yours to understand usage. One area of concern involves identifying compatibility problems, such as reliance on JDK-internal APIs. >>> >>> Our engineers have already prepared guidance on migrating some of the more common usage patterns of JDK-internal APIs to supported public interfaces. The list is on the OpenJDK wiki [0], along with instructions on how to run the jdeps analysis tool yourself . >>> >>> As part of the ongoing development of JDK 9, I would like to encourage migration from JDK-internal APIs towards the supported Java APIs. I have prepared a report for your project rele ase infinispan-6.0.2 based on the jdeps output. >>> >>> The report is attached to this e-mail. >>> >>> For anything where your migration path is unclear, I would appreciate comments on the JDK-internal API usage patterns in the attached jdeps report - in particular comments elaborating on the rationale for them - either to me or on this mailing list. >>> >>> Finding suitable replacements for unsupported interfaces is not always straightforward, which is why I am reaching out to you early in the JDK 9 development cycle so you can give feedback about new APIs that may be needed to facilitate this exercise. >>> >>> Thank you in advance for any efforts and feedback helping us make JDK 9 better. >>> >>> Rgds,Rory >>> >>> [0] https://wiki.openjdk.java.net/display/JDK8/Java+Dependency+Analysis+Tool >>> >>> >>> -- >>> Rgds,Rory O'Donnell >>> Quality Engineering Manager >>> Oracle EMEA , Dublin, Ireland >>> >>> >>> >>> >>> >> >> -- >> Rgds,Rory O'Donnell >> Quality Engineering Manager >> Oracle EMEA , Dublin, Ireland >> >> >> >> >> > > -- > Rgds,Rory O'Donnell > Quality Engineering Manager > Oracle EMEA , Dublin, Ireland > > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Galder Zamarre?o galder at redhat.com twitter.com/galderz From vjuranek at redhat.com Wed Oct 22 04:59:57 2014 From: vjuranek at redhat.com (Vojtech Juranek) Date: Wed, 22 Oct 2014 10:59:57 +0200 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: References: <54452083.4020502@redhat.com> <4999398.dViRhNXb2H@localhost> Message-ID: <56561795.2zjSlQ0m7Z@localhost> > Does anyone volunteer for ISPN-4813? me not:-) as this one is IMHO little bit tricky - it's (at least partially) related to the question "which size is the correct size?". Was there any clear conclusion in recent " About size()" thread? > > actually, if there's a failing test and it's clear which commit has caused > > it, we can stop merging PRs from the developer > > who has introduced the regression until he fix it. This should be clear > > message, that the first thing he should > > work on is to fix the test. IMHO we can start with it now and don't need > > green testsuite for it > > In theory that sounds good, but who do we stop merging for on account the > current failures? :) probably nobody, we need to find volunteers in this case. 
I meant it only for issues where it's clear who introduced the regression (e.g. me for recently failing NodeAuth*IT tests). As above, not a complete solution, but rather a small step to move to the state where we would like to be -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: This is a digitally signed message part. Url : http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141022/d7d2c8bf/attachment.bin From galder at redhat.com Wed Oct 22 06:32:04 2014 From: galder at redhat.com (=?windows-1252?Q?Galder_Zamarre=F1o?=) Date: Wed, 22 Oct 2014 12:32:04 +0200 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com> Message-ID: <5F4D7312-0B6C-4C30-B5E9-F75F1973FEE5@redhat.com> Guys, Jason from Wildfly provided some interesting information a while back on the benefits of ?merge? approach vs cherry-pick. To paraphrase: > I used to be anti-merge because I thought it made things harder for users to grok. That was back when git wasn?t mainstream though. > > Now that everyone uses git I think its a good thing. There are some really nice benefits to it: > > 1. The original history from the author is preserved > 2. The author does not have to toss their branch to avoid a conflict introduced by a pull after their PR is merged > 3. Changes introduced by conflict resolution are kept separate from the authors. So you know if the problem was caused by the merge or the change > itself > 4. The person who merged the change is recorded in the git history, so you have an audit record of who allowed the change in if you have multiple mergers > 5. PRs sometimes include multiple commits, and a merge commit allows you to see which commits encompass the overall change > 6. Due to 5, bisecting is quicker > 7. It?s easier to revert a merge commit > 8. Github PRs automatically close when you perform a merge > 9. You can use the big green button with automated CI > > There are however some drawbacks: > > 1. If you revert a merge, you need to create a new merge to bring it back. This can be a little confusing if you do it wrong > 2. You have to know how to interact with merge commits in the tools (e.g. revert requires -m 1) > 3. git log, for whatever reason defaults to date ordering instead of topographical ordering, which can look confusing since it doesn?t represent when > changes were actually merged. Thats solvable using git log ?topo-order > Merging merge commits is of course nasty, but you don?t have to allow it. You can just require that authors rebase their history when they need it to > be more current vs a git pull. Merging then follows a nice clear one level nesting.? The key thing IMO is: Avoiding merge commits but making sure that everyone rebases their changes :) So, +1 to Tristan?s suggestion but making sure we avoid merge commits! On 20 Oct 2014, at 18:55, Emmanuel Bernard wrote: > So you review locally and potentially run locally and then you switch from your terminal console or IDE to wherever the button is in your 350 opened tabs because it?s faster than git push upstream master. I am having a hard time to see the convenience unless you do browser only reviews. > > > On 20 Oct 2014, at 18:40, Tristan Tarrant wrote: > >> Sure, you still want to review it in your IDE, and maybe run local >> tests, but ultimately merging via the GitHub UI. 
>> >> Tristan >> >> On 20/10/14 18:37, Emmanuel Bernard wrote: >>> rebase is a oneliner op per branch you want to reapply whereas cherry >>> picking requires to manually select the commits you want. Underneath >>> in git guts it probably does the same. >>> >>> I have to admit I barely had the occasion to want to click the GitHub >>> UI button as except for simple documentation, reviewing code almost >>> always require to fetch the branch and look at it in an IDE of sort >>> for proper review. The documentation bit is actually even requiring >>> local run since Markdown / Asciidoc and all tend to silently fail a >>> syntax mistake. >>> >>> On 20 Oct 2014, at 18:28, Mircea Markus >> > wrote: >>> >>>> >>>> On Oct 20, 2014, at 17:21, Emmanuel Bernard >>> > wrote: >>>> >>>>> There is a difference between cherry picking and rebasing when it >>>>> comes to reapply a work on top of a branch. >>>> >>>> What is the difference? :-) >>>> >>>>> Do you dislike both equally compared to a merge (aka railroad nexus >>>>> git history approach)? >>>> >>>> Using github's "merge" button is pretty convenient imo, even though >>>> the history is not as nice as with a rebase (or cherry-pick, I miss >>>> the difference for now ) >>>> >>>>> >>>>> >>>>> On 20 Oct 2014, at 16:47, Tristan Tarrant >>>> > wrote: >>>>> >>>>>> Hi guys, >>>>>> >>>>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this >>>>>> release cycle. There have been a ton of improvements (maybe too many) >>>>>> and a lot of time has passed since the previous version (maybe to >>>>>> much). >>>>>> Following up on my previous e-mail about future plans, here's a >>>>>> recap of >>>>>> a plan which I believe will allow us to move at a much quicker pace: >>>>>> >>>>>> For the next minor releases I would like to suggest the following >>>>>> strategy: >>>>>> - use a 3 month timebox where we strive to maintain master in an >>>>>> "always releasable" state >>>>>> - complex feature work will need to happen onto dedicated feature >>>>>> branches, using the usual GitHub pull-request workflow >>>>>> - only when a feature is complete (code, tests, docs, reviewed, >>>>>> CI-checked) it will be merged back into master >>>>>> - if a feature is running late it will be postponed to the >>>>>> following minor release so as not to hinder other development >>>>>> >>>>>> I am also going to suggest dropping the cherry-picking approach and >>>>>> going with git merge. In order to achieve this we need CI to be >>>>>> always in top form with 0 failures in master. This will allow >>>>>> merging a PR directly from GitHub's interface. We obviously need to >>>>>> trust our tools and our existing code base. 
>>>>>> >>>>>> This is the plan for 7.1.0: >>>>>> >>>>>> 13 November 7.1.0.Alpha1 >>>>>> 18 December 7.1.0.Beta1 >>>>>> 15 January 7.1.0.CR1 >>>>>> 30 January 7.1.0.Final >>>>>> >>>>>> >>>>>> Tristan >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> Cheers, >>>> -- >>>> Mircea Markus >>>> Infinispan lead (www.infinispan.org ) >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Galder Zamarre?o galder at redhat.com twitter.com/galderz From dan.berindei at gmail.com Wed Oct 22 07:04:52 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 22 Oct 2014 14:04:52 +0300 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <56561795.2zjSlQ0m7Z@localhost> References: <54452083.4020502@redhat.com> <4999398.dViRhNXb2H@localhost> <56561795.2zjSlQ0m7Z@localhost> Message-ID: On Wed, Oct 22, 2014 at 11:59 AM, Vojtech Juranek wrote: > > > Does anyone volunteer for ISPN-4813? > > me not:-) as this one is IMHO little bit tricky - it's (at least partially) > related to the question "which size is the correct size?". Was there any > clear > conclusion in recent " About size()" thread? > > > > > actually, if there's a failing test and it's clear which commit has > caused > > > it, we can stop merging PRs from the developer > > > who has introduced the regression until he fix it. This should be clear > > > message, that the first thing he should > > > work on is to fix the test. IMHO we can start with it now and don't > need > > > green testsuite for it > > > > In theory that sounds good, but who do we stop merging for on account the > > current failures? :) > > probably nobody, we need to find volunteers in this case. I meant it only > for > issues where it's clear who introduced the regression (e.g. me for recently > failing NodeAuth*IT tests). As above, not a complete solution, but rather a > small step to move to the state where we would like to be > Yeah, that's what I meant, most of the time it's hard to track down who caused a specific failure. BTW, I forgot to mention something in the previous email: when you investigate a test failure please mark the test as investigated in TeamCity (preferably with a link to an issue in JIRA). And if it's an intermittent failure, make sure it's not marked as fixed automatically (TC is a little greedy here, it thinks the test is fixed the moment it doesn't fail in a build). Cheers Dan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141022/7cd0a27f/attachment.html From dan.berindei at gmail.com Wed Oct 22 07:29:59 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 22 Oct 2014 14:29:59 +0300 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: <5F4D7312-0B6C-4C30-B5E9-F75F1973FEE5@redhat.com> References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com> <5F4D7312-0B6C-4C30-B5E9-F75F1973FEE5@redhat.com> Message-ID: On Wed, Oct 22, 2014 at 1:32 PM, Galder Zamarre?o wrote: > Guys, Jason from Wildfly provided some interesting information a while > back on the benefits of ?merge? approach vs cherry-pick. To paraphrase: > > > I used to be anti-merge because I thought it made things harder for > users to grok. That was back when git wasn?t mainstream though. > > > > Now that everyone uses git I think its a good thing. There are some > really nice benefits to it: > > > > 1. The original history from the author is preserved > TBH most of the time I don't care about my history, I always have stupid commit messages until I squash my commits and "prettify" the commit messages. > > 2. The author does not have to toss their branch to avoid a conflict > introduced by a pull after their PR is merged > I think this can only happen if the branch had conflicts and the committer resolved them, but we require authors to rebase so I don't think this has been a problem for us. > > 3. Changes introduced by conflict resolution are kept separate from the > authors. So you know if the problem was caused by the merge or the change > > itself > 4. The person who merged the change is recorded in the git history, so > you have an audit record of who allowed the change in if you have multiple > mergers > Git records the committer as well. You just have to do a "git rebase -f master" locally to make sure the committer field is updated before you push into upstream. > > 5. PRs sometimes include multiple commits, and a merge commit allows you > to see which commits encompass the overall change > > 6. Due to 5, bisecting is quicker > +1, bisecting is cool, even though I've never been able to use it on Infinispan :) > > 7. It?s easier to revert a merge commit > > 8. Github PRs automatically close when you perform a merge > > 9. You can use the big green button with automated CI > > > > There are however some drawbacks: > > > > 1. If you revert a merge, you need to create a new merge to bring it > back. This can be a little confusing if you do it wrong > > 2. You have to know how to interact with merge commits in the tools > (e.g. revert requires -m 1) > 3. git log, for whatever reason defaults to date ordering instead of > topographical ordering, which can look confusing since it doesn?t represent > when > changes were actually merged. Thats solvable using git log > ?topo-order > > > Merging merge commits is of course nasty, but you don?t have to allow > it. You can just require that authors rebase their history when they need > it to > be more current vs a git pull. Merging then follows a nice clear > one level nesting.? > > The key thing IMO is: Avoiding merge commits but making sure that everyone > rebases their changes :) > > So, +1 to Tristan?s suggestion but making sure we avoid merge commits! > Sorry, you've lost me here :) Doesn't every merge have a merge commit? 
I have seen recommendations elsewhere to use the GitHub Merge button but still force the PR authors to rebase anyway; however, I'm not sure how we could enforce that. I don't think there is any cue in the GitHub UI that the PR could/should be rebased. Maybe we could add a git hook to do that check, and force the integrator to rebase locally if the PR is not yet rebased?

> [snip: the rest of the quoted thread is reproduced in full in Galder's message below]

From galder at redhat.com Wed Oct 22 06:32:04 2014
From: galder at redhat.com (Galder Zamarreño)
Date: Wed, 22 Oct 2014 12:32:04 +0200
Subject: [infinispan-dev] Infinispan 7.1 plan
In-Reply-To:
References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com>
Message-ID: <5F4D7312-0B6C-4C30-B5E9-F75F1973FEE5@redhat.com>

Guys, Jason from Wildfly provided some interesting information a while back on the benefits of the "merge" approach vs cherry-pick. To paraphrase:

> I used to be anti-merge because I thought it made things harder for users to grok. That was back when git wasn't mainstream though.
>
> Now that everyone uses git I think it's a good thing. There are some really nice benefits to it:
>
> 1. The original history from the author is preserved
> 2. The author does not have to toss their branch to avoid a conflict introduced by a pull after their PR is merged
> 3. Changes introduced by conflict resolution are kept separate from the author's. So you know if the problem was caused by the merge or the change itself
> 4.
The person who merged the change is recorded in the git history, so you have an audit record of who allowed the change in if you have multiple mergers > 5. PRs sometimes include multiple commits, and a merge commit allows you to see which commits encompass the overall change > 6. Due to 5, bisecting is quicker > 7. It?s easier to revert a merge commit > 8. Github PRs automatically close when you perform a merge > 9. You can use the big green button with automated CI > > There are however some drawbacks: > > 1. If you revert a merge, you need to create a new merge to bring it back. This can be a little confusing if you do it wrong > 2. You have to know how to interact with merge commits in the tools (e.g. revert requires -m 1) > 3. git log, for whatever reason defaults to date ordering instead of topographical ordering, which can look confusing since it doesn?t represent when > changes were actually merged. Thats solvable using git log ?topo-order > Merging merge commits is of course nasty, but you don?t have to allow it. You can just require that authors rebase their history when they need it to > be more current vs a git pull. Merging then follows a nice clear one level nesting.? The key thing IMO is: Avoiding merge commits but making sure that everyone rebases their changes :) So, +1 to Tristan?s suggestion but making sure we avoid merge commits! On 20 Oct 2014, at 18:55, Emmanuel Bernard wrote: > So you review locally and potentially run locally and then you switch from your terminal console or IDE to wherever the button is in your 350 opened tabs because it?s faster than git push upstream master. I am having a hard time to see the convenience unless you do browser only reviews. > > > On 20 Oct 2014, at 18:40, Tristan Tarrant wrote: > >> Sure, you still want to review it in your IDE, and maybe run local >> tests, but ultimately merging via the GitHub UI. >> >> Tristan >> >> On 20/10/14 18:37, Emmanuel Bernard wrote: >>> rebase is a oneliner op per branch you want to reapply whereas cherry >>> picking requires to manually select the commits you want. Underneath >>> in git guts it probably does the same. >>> >>> I have to admit I barely had the occasion to want to click the GitHub >>> UI button as except for simple documentation, reviewing code almost >>> always require to fetch the branch and look at it in an IDE of sort >>> for proper review. The documentation bit is actually even requiring >>> local run since Markdown / Asciidoc and all tend to silently fail a >>> syntax mistake. >>> >>> On 20 Oct 2014, at 18:28, Mircea Markus >> > wrote: >>> >>>> >>>> On Oct 20, 2014, at 17:21, Emmanuel Bernard >>> > wrote: >>>> >>>>> There is a difference between cherry picking and rebasing when it >>>>> comes to reapply a work on top of a branch. >>>> >>>> What is the difference? :-) >>>> >>>>> Do you dislike both equally compared to a merge (aka railroad nexus >>>>> git history approach)? >>>> >>>> Using github's "merge" button is pretty convenient imo, even though >>>> the history is not as nice as with a rebase (or cherry-pick, I miss >>>> the difference for now ) >>>> >>>>> >>>>> >>>>> On 20 Oct 2014, at 16:47, Tristan Tarrant >>>> > wrote: >>>>> >>>>>> Hi guys, >>>>>> >>>>>> with the imminent release of 7.0.0.CR2 we are reaching the end of this >>>>>> release cycle. There have been a ton of improvements (maybe too many) >>>>>> and a lot of time has passed since the previous version (maybe to >>>>>> much). 
>>>>>> Following up on my previous e-mail about future plans, here's a >>>>>> recap of >>>>>> a plan which I believe will allow us to move at a much quicker pace: >>>>>> >>>>>> For the next minor releases I would like to suggest the following >>>>>> strategy: >>>>>> - use a 3 month timebox where we strive to maintain master in an >>>>>> "always releasable" state >>>>>> - complex feature work will need to happen onto dedicated feature >>>>>> branches, using the usual GitHub pull-request workflow >>>>>> - only when a feature is complete (code, tests, docs, reviewed, >>>>>> CI-checked) it will be merged back into master >>>>>> - if a feature is running late it will be postponed to the >>>>>> following minor release so as not to hinder other development >>>>>> >>>>>> I am also going to suggest dropping the cherry-picking approach and >>>>>> going with git merge. In order to achieve this we need CI to be >>>>>> always in top form with 0 failures in master. This will allow >>>>>> merging a PR directly from GitHub's interface. We obviously need to >>>>>> trust our tools and our existing code base. >>>>>> >>>>>> This is the plan for 7.1.0: >>>>>> >>>>>> 13 November 7.1.0.Alpha1 >>>>>> 18 December 7.1.0.Beta1 >>>>>> 15 January 7.1.0.CR1 >>>>>> 30 January 7.1.0.Final >>>>>> >>>>>> >>>>>> Tristan >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> Cheers, >>>> -- >>>> Mircea Markus >>>> Infinispan lead (www.infinispan.org ) >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Galder Zamarre?o galder at redhat.com twitter.com/galderz From ttarrant at redhat.com Thu Oct 23 05:51:41 2014 From: ttarrant at redhat.com (Tristan Tarrant) Date: Thu, 23 Oct 2014 10:51:41 +0100 Subject: [infinispan-dev] Infinispan 7.1 plan In-Reply-To: References: <54452083.4020502@redhat.com> <1D5EA9BC-D392-4B08-B134-1EC579CB25DE@redhat.com> <54453B16.6050202@redhat.com> <5F4D7312-0B6C-4C30-B5E9-F75F1973FEE5@redhat.com> Message-ID: <5448CFAD.7050907@redhat.com> On 22/10/14 12:29, Dan Berindei wrote: > On Wed, Oct 22, 2014 at 1:32 PM, Galder Zamarre?o > wrote: > > Guys, Jason from Wildfly provided some interesting information a > while back on the benefits of ?merge? approach vs cherry-pick. To > paraphrase: > > > 1. The original history from the author is preserved > > > TBH most of the time I don't care about my history, I always have > stupid commit messages until I squash my commits and "prettify" the > commit messages. Indeed: I commit and squash all the time. 
I'm only interested in seeing multiple commits in the following cases:
- they are actually subtasks of the main PR
- the pull request is for a feature which was developed collaboratively by multiple developers

> > 2. The author does not have to toss their branch to avoid a conflict introduced by a pull after their PR is merged
>
> I think this can only happen if the branch had conflicts and the committer resolved them, but we require authors to rebase so I don't think this has been a problem for us.

So the best approach seems to actually be rebase/pull, which gets us what we want.

Tristan

From sanne at infinispan.org Thu Oct 23 07:39:40 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Thu, 23 Oct 2014 12:39:40 +0100
Subject: [infinispan-dev] Retro-weaving Java8 bytecode to Java7 compatibility (and Java6!)
Message-ID:

I just found this project:
https://github.com/orfjackal/retrolambda

I have no idea how reliable the output code could be; maybe someone would be interested in exploring the option?

Sanne

From galder at redhat.com Thu Oct 23 11:58:13 2014
From: galder at redhat.com (Galder Zamarreño)
Date: Thu, 23 Oct 2014 17:58:13 +0200
Subject: [infinispan-dev] Should JMX size stat consider expired entries?
Message-ID: <28B724E9-29AB-4DEE-9C2E-CC466F855C47@redhat.com>

Hi all,

The reason [1] was failing was due to a set of circumstances that essentially meant that the JMX size statistic was counting expired entries. We had a brief discussion on IRC and there's some divergence on how precise the JMX size stat should be [2].

I'm OK with expired entries being counted, since that means that size is fast to retrieve that way. If you need to start calculating whether each entry is expired, etc., it would slow it down. However, if we go down that route, we need to change all tests that rely on the JMX size stat for their assertions. There are quite a few of those in the Server integration testsuite, e.g. AbstractRemoteCacheIT.

Thoughts?

[1] https://issues.jboss.org/browse/ISPN-4813
[2] https://gist.github.com/galderz/62ae5120f5ac50ceabef
--
Galder Zamarreño
galder at redhat.com
twitter.com/galderz

From andreas.kruthoff at nexustelecom.com Fri Oct 24 03:55:14 2014
From: andreas.kruthoff at nexustelecom.com (Andreas Kruthoff)
Date: Fri, 24 Oct 2014 09:55:14 +0200
Subject: [infinispan-dev] ClassNotFoundException infinispan-embedded-7.0.0.CR2.jar
Message-ID: <544A05E2.8070101@nexustelecom.com>

Hi

I was running a cache in distributed mode and suddenly got the following ClassNotFoundException (see below).

My classpath:
infinispan-embedded-7.0.0.CR2.jar
jboss-transaction-api_1.1_spec-1.0.1.Final.jar

Am I missing something?
Exception in thread "main" org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.start() throws java.lang.Exception on object of type StateTransferManagerImpl at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:170) at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869) at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638) at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627) at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530) at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:216) at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:764) at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:584) at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:539) at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:416) at ch.nexustelecom.lbd.engine.ImsiCache.init(ImsiCache.java:49) at ch.nexustelecom.dexclient.engine.DefaultDexClientEngine.init(DefaultDexClientEngine.java:120) at ch.nexustelecom.dexclient.DexClient.initClient(DexClient.java:169) at ch.nexustelecom.dexclient.tool.DexClientManager.startup(DexClientManager.java:196) at ch.nexustelecom.dexclient.tool.DexClientManager.main(DexClientManager.java:83) Caused by: org.infinispan.commons.CacheException: java.lang.ClassNotFoundException: org.infinispan.partionhandling.impl.AvailabilityMode at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536) at org.infinispan.topology.LocalTopologyManagerImpl.executeOnCoordinator(LocalTopologyManagerImpl.java:388) at org.infinispan.topology.LocalTopologyManagerImpl.join(LocalTopologyManagerImpl.java:102) at org.infinispan.statetransfer.StateTransferManagerImpl.start(StateTransferManagerImpl.java:108) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168) ... 
14 more Caused by: java.lang.ClassNotFoundException: org.infinispan.partionhandling.impl.AvailabilityMode at java.net.URLClassLoader$1.run(Unknown Source) at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.jboss.marshalling.AbstractClassResolver.loadClass(AbstractClassResolver.java:131) at org.jboss.marshalling.AbstractClassResolver.resolveClass(AbstractClassResolver.java:112) at org.jboss.marshalling.river.RiverUnmarshaller.doReadClassDescriptor(RiverUnmarshaller.java:1002) at org.jboss.marshalling.river.RiverUnmarshaller.doReadNewObject(RiverUnmarshaller.java:1239) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:272) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209) at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:41) at org.infinispan.topology.CacheStatusResponse$Externalizer.readObject(CacheStatusResponse.java:76) at org.infinispan.topology.CacheStatusResponse$Externalizer.readObject(CacheStatusResponse.java:62) at org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.readObject(ExternalizerTable.java:424) at org.infinispan.marshall.core.ExternalizerTable.readObject(ExternalizerTable.java:221) at org.infinispan.marshall.core.JBossMarshaller$ExternalizerTableProxy.readObject(JBossMarshaller.java:148) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:351) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209) at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:41) at org.infinispan.remoting.responses.SuccessfulResponse$Externalizer.readObject(SuccessfulResponse.java:79) at org.infinispan.remoting.responses.SuccessfulResponse$Externalizer.readObject(SuccessfulResponse.java:64) at org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.readObject(ExternalizerTable.java:424) at org.infinispan.marshall.core.ExternalizerTable.readObject(ExternalizerTable.java:221) at org.infinispan.marshall.core.JBossMarshaller$ExternalizerTableProxy.readObject(JBossMarshaller.java:148) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:351) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209) at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:41) at org.infinispan.commons.marshall.jboss.AbstractJBossMarshaller.objectFromObjectStream(AbstractJBossMarshaller.java:135) at org.infinispan.marshall.core.VersionAwareMarshaller.objectFromByteBuffer(VersionAwareMarshaller.java:101) at org.infinispan.commons.marshall.AbstractDelegatingMarshaller.objectFromByteBuffer(AbstractDelegatingMarshaller.java:80) at org.infinispan.remoting.transport.jgroups.MarshallerAdapter.objectFromBuffer(MarshallerAdapter.java:28) at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390) at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:250) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:674) at org.jgroups.JChannel.up(JChannel.java:733) at 
org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030) at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:146) at org.jgroups.protocols.RSVP.up(RSVP.java:190) at org.jgroups.protocols.FRAG2.up(FRAG2.java:165) at org.jgroups.protocols.FlowControl.up(FlowControl.java:390) at org.jgroups.protocols.FlowControl.up(FlowControl.java:379) at org.jgroups.protocols.pbcast.GMS.up(GMS.java:1042) at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234) at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1034) at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:752) at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:399) at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:610) at org.jgroups.protocols.BARRIER.up(BARRIER.java:152) at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:155) at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:200) at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:297) at org.jgroups.protocols.MERGE3.up(MERGE3.java:288) at org.jgroups.protocols.Discovery.up(Discovery.java:277) at org.jgroups.protocols.TP.passMessageUp(TP.java:1568) at org.jgroups.protocols.TP$MyHandler.run(TP.java:1787) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) This email and any attachment may contain confidential information which is intended for use only by the addressee(s) named above. If you received this email by mistake, please notify the sender immediately, and delete the email from your system. You are prohibited from copying, disseminating or otherwise using the email or any attachment. From dan.berindei at gmail.com Fri Oct 24 04:27:42 2014 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 24 Oct 2014 11:27:42 +0300 Subject: [infinispan-dev] ClassNotFoundException infinispan-embedded-7.0.0.CR2.jar In-Reply-To: <544A05E2.8070101@nexustelecom.com> References: <544A05E2.8070101@nexustelecom.com> Message-ID: Hi Andreas The AvailabilityMode enum moved between 7.0.0.CR1 and 7.0.0.CR2, so my guess is that you have one node running CR2, and another node running CR1 (or a previous version). We don't support running two different versions of Infinispan in the same cluster, so this is expected. I admit we need work a bit on the error message, though, so I've created [1] Cheers Dan [1] https://issues.jboss.org/browse/ISPN-4879 On Fri, Oct 24, 2014 at 10:55 AM, Andreas Kruthoff < andreas.kruthoff at nexustelecom.com> wrote: > Hi > > I was running a cache in distributed mode and suddenly got the following > ClassNotFoundException (see below). > > My classapth: > infinispan-embedded-7.0.0.CR2.jar > jboss-transaction-api_1.1_spec-1.0.1.Final.jar > > Am I missing something? 
> [snip: the full stack trace from the original message, trimmed]

From ttarrant at redhat.com Fri Oct 24 04:32:39 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Fri, 24 Oct 2014 09:32:39 +0100
Subject: [infinispan-dev] ClassNotFoundException infinispan-embedded-7.0.0.CR2.jar
In-Reply-To: <544A05E2.8070101@nexustelecom.com>
References: <544A05E2.8070101@nexustelecom.com>
Message-ID: <544A0EA7.5090701@redhat.com>

Hi Andreas,

that class was moved from org.infinispan.partitionhandling.impl to org.infinispan.partitionhandling between CR1 and CR2.

Tristan

On 24/10/14 08:55, Andreas Kruthoff wrote:
> Hi
>
> I was running a cache in distributed mode and suddenly got the following ClassNotFoundException (see below).
>
> My classpath:
> infinispan-embedded-7.0.0.CR2.jar
> jboss-transaction-api_1.1_spec-1.0.1.Final.jar
>
> Am I missing something?
> [snip: the same stack trace quoted again, trimmed]

From sanne at infinispan.org Sun Oct 26 05:27:10 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Sun, 26 Oct 2014 10:27:10 +0100
Subject: [infinispan-dev] Docker images now available for Infinispan Server
Message-ID:

https://twitter.com/marekgoldmann/status/526060068945817601

Thanks Marek!

From emmanuel at hibernate.org Sun Oct 26 06:13:26 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Sun, 26 Oct 2014 11:13:26 +0100
Subject: [infinispan-dev] Docker images now available for Infinispan Server
In-Reply-To:
References:
Message-ID:

We should add a link to the download page of the website.

> On 26 oct. 2014, at 10:27, Sanne Grinovero wrote:
>
> https://twitter.com/marekgoldmann/status/526060068945817601
>
> Thanks Marek!
> _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From gustavonalle at gmail.com Sun Oct 26 07:08:08 2014 From: gustavonalle at gmail.com (Gustavo Fernandes) Date: Sun, 26 Oct 2014 11:08:08 +0000 Subject: [infinispan-dev] Docker images now available for Infinispan Server In-Reply-To: References: Message-ID: <0A0997F0-315D-4D22-8386-415BE3E1C505@gmail.com> I don't think a manual download is needed, since docker will pull the image automatically from the registry which is in sync with https://github.com/jboss-dockerfiles/infinispan It'd be nice to put a chapter in the doc, specially on how to create a cluster, since AFAICT the containers in the image are being launched with bin/standalone.sh by default. Gustavo On 26 Oct 2014, at 10:13, Emmanuel Bernard wrote: > We should add a link to the download page of the website. > > >> On 26 oct. 2014, at 10:27, Sanne Grinovero wrote: >> >> https://twitter.com/marekgoldmann/status/526060068945817601 >> >> Thanks Marek! >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141026/e137b1df/attachment.html From vjuranek at redhat.com Sun Oct 26 16:17:11 2014 From: vjuranek at redhat.com (Vojtech Juranek) Date: Sun, 26 Oct 2014 21:17:11 +0100 Subject: [infinispan-dev] Docker images now available for Infinispan Server In-Reply-To: <0A0997F0-315D-4D22-8386-415BE3E1C505@gmail.com> References: <0A0997F0-315D-4D22-8386-415BE3E1C505@gmail.com> Message-ID: <15211174.nyA0uKPGbk@localhost> On Sunday 26 October 2014 11:08:08 Gustavo Fernandes wrote: > It'd be nice to put a chapter in the doc, specially on how to create a > cluster I'm going to prepare one more Dockerfile for library mode (wildfly + ispn modules). Once done, I'll prepare some small doc chapter how to use it Vojta -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: This is a digitally signed message part. Url : http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141026/ccebd629/attachment.bin From emmanuel at hibernate.org Sun Oct 26 17:29:21 2014 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Sun, 26 Oct 2014 22:29:21 +0100 Subject: [infinispan-dev] Docker images now available for Infinispan Server In-Reply-To: <0A0997F0-315D-4D22-8386-415BE3E1C505@gmail.com> References: <0A0997F0-315D-4D22-8386-415BE3E1C505@gmail.com> Message-ID: <13D8ADEB-B64A-4406-813D-9BAC5ACCECC5@hibernate.org> Not necessarily to ?download? the docker image. Rather point in the dl options that you can use Docker and point to the hub page. On 26 Oct 2014, at 12:08, Gustavo Fernandes wrote: > I don't think a manual download is needed, since docker will pull the image automatically from the registry which is in sync with https://github.com/jboss-dockerfiles/infinispan > It'd be nice to put a chapter in the doc, specially on how to create a cluster, since AFAICT the containers in the image are being launched with bin/standalone.sh by default. 
>
> Gustavo
>
> [snip: quoted thread and mailing list footers trimmed]

From gustavonalle at gmail.com Sun Oct 26 18:46:02 2014
From: gustavonalle at gmail.com (Gustavo Fernandes)
Date: Sun, 26 Oct 2014 22:46:02 +0000
Subject: [infinispan-dev] Docker images now available for Infinispan Server
In-Reply-To: <13D8ADEB-B64A-4406-813D-9BAC5ACCECC5@hibernate.org>
References: <0A0997F0-315D-4D22-8386-415BE3E1C505@gmail.com> <13D8ADEB-B64A-4406-813D-9BAC5ACCECC5@hibernate.org>
Message-ID: <193673DB-2C19-4BD2-971B-182CDE108EE4@gmail.com>

On 26 Oct 2014, at 21:29, Emmanuel Bernard wrote:
> Not necessarily to "download" the docker image. Rather point in the dl options that you can use Docker and point to the hub page.

Makes much more sense, apologies for my misunderstanding :)

Gustavo

From isavin at redhat.com Mon Oct 27 06:39:48 2014
From: isavin at redhat.com (Ion Savin)
Date: Mon, 27 Oct 2014 12:39:48 +0200
Subject: [infinispan-dev] Infinispan HotRod C++ Client 7.0.0.CR2 released!
Message-ID: <544E20F4.70105@redhat.com>

Hi all,

Infinispan HotRod C++ Client 7.0.0.CR2 is now available! Thanks to everyone involved for the changes and bug reports contributed!

Release details on http://blog.infinispan.org/2014/10/infinispan-hotrod-c-client-700cr2.html

--
Ion Savin

From mgencur at redhat.com Mon Oct 27 11:47:45 2014
From: mgencur at redhat.com (Martin Gencur)
Date: Mon, 27 Oct 2014 16:47:45 +0100
Subject: [infinispan-dev] Docker images now available for Infinispan Server
In-Reply-To:
References:
Message-ID: <544E6921.1000507@redhat.com>

Thanks Vojtech Juranek for actually creating the Docker file :)

And thanks Marek for integrating it and tweeting about it.

Martin

On 26.10.2014 10:27, Sanne Grinovero wrote:
> https://twitter.com/marekgoldmann/status/526060068945817601
>
> Thanks Marek!

From sanne at infinispan.org Mon Oct 27 17:24:40 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Mon, 27 Oct 2014 22:24:40 +0100
Subject: [infinispan-dev] Docker images now available for Infinispan Server
In-Reply-To: <544E6921.1000507@redhat.com>
References: <544E6921.1000507@redhat.com>
Message-ID:

On 27 October 2014 16:47, Martin Gencur wrote:
> Thanks Vojtech Juranek for actually creating the Docker file :)

Thanks Vojtech! I had no idea you were working on that :)

Sanne

> [snip: rest of the quoted message trimmed]
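For readers who want to try the image: assuming it is published on the Docker Hub as jboss/infinispan-server (check the hub page for the actual name) and that it ships the server's standard configuration files, usage looks roughly like this:

    # pull and run with the default standalone configuration
    docker pull jboss/infinispan-server
    docker run -it jboss/infinispan-server

    # Gustavo's clustering point: override the default command so the
    # server starts with the clustered profile instead
    docker run -it jboss/infinispan-server ./bin/standalone.sh -c clustered.xml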
From an1310 at hotmail.com Mon Oct 27 18:00:46 2014
From: an1310 at hotmail.com (Erik Salter)
Date: Mon, 27 Oct 2014 18:00:46 -0400
Subject: [infinispan-dev] Rebalancing flag as part of the CacheStatusResponse
Message-ID:

Hi all,

This topic came up in a separate discussion with Mircea, and he suggested I post something on the mailing list for a wider audience.

I have a business case where I need the value of the rebalancing flag read by the joining nodes. Let's say we have a TACH where we want our keys striped across machines, racks, etc. Due to how NBST works, if we start a bunch of nodes on one side of the topology marker, we'll end up with all keys dog-piling on the first node that joins before being disseminated to the other nodes. In other words, the first joining node on the other side of the topology acts as a "pivot". That's bad, especially if the key is marked as DELTA_WRITE, where the receiving node must pull the key from the readCH before applying the changelog. So not only do we have a single choke-point, but it's made worse by the initial burst of every write requiring numOwner threads for remote reads.

If we disable rebalancing and start up the nodes on the other side of the topology, we can process this in a single view change. But there's a catch -- and this is the reason I added the state of the flag. We've run into a case where the current coordinator changed (crash or a MERGE) as the other nodes were starting up, and the new coordinator was elected from the new side of the topology. So we had two separate but balanced CHs on both sides of the topology, and data integrity went out the window. Hence the flag. Note also that this deployment requires the awaitInitialTransfer flag to be false.

In a real production environment, this has saved me more times than I can count. Node failover/failback is now reasonably deterministic, with a simple operational procedure for our customer(s) to follow.

The question is whether this feature would be useful for the community. Even with the new partition handling, I think this implementation is still viable and may warrant inclusion in 7.0 (or 7.1). What does the team think? I welcome any and all feedback.

Regards,

Erik Salter
Cisco Systems, SPVTG
(404) 317-0693
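For context, the flag Erik is describing is the one exposed through JMX on the LocalTopologyManager component. A rough sketch of reading and clearing it from within the JVM - the ObjectName below assumes the default cache manager name and the 7.x naming layout, and that the MBeans are registered in the platform MBean server, so adjust it for your deployment:

    import java.lang.management.ManagementFactory;
    import javax.management.Attribute;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class RebalancingFlag {
       public static void main(String[] args) throws Exception {
          MBeanServer server = ManagementFactory.getPlatformMBeanServer();
          ObjectName topo = new ObjectName(
                "org.infinispan:type=CacheManager,name=\"DefaultCacheManager\","
                + "component=LocalTopologyManager");
          // read the cluster-wide rebalancing flag
          boolean enabled = (Boolean) server.getAttribute(topo, "RebalancingEnabled");
          System.out.println("rebalancing enabled: " + enabled);
          // disable rebalancing before starting the nodes on the other side
          server.setAttribute(topo, new Attribute("RebalancingEnabled", false));
       }
    }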
From vblagoje at redhat.com Mon Oct 27 19:08:04 2014
From: vblagoje at redhat.com (Vladimir Blagojevic)
Date: Mon, 27 Oct 2014 19:08:04 -0400
Subject: [infinispan-dev] JAR distribution once the grid has been deployed
In-Reply-To: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>
References: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>
Message-ID: <544ED054.9070004@redhat.com>

Emmanuel,

I don't think we have any plans in place. I agree with you - we should at least provide hooks for these classloaders, and possibly implement a simple approach as a proof-of-concept/tutorial for others to hook in their own mechanism of class loading. We can reference Evangelos' approach as one example of how this could be done.

Vladimir

On 2014-10-16, 5:23 AM, Emmanuel Bernard wrote:
> Hi all,
>
> I know this has been discussed in the past (by Tristan I think), but I don't know how concrete the plans have become since then.
>
> One major issue with all the distributed execution code interfaces we have is that they require having in the classpath of each node both the implementation of these interfaces and the class files corresponding to the key and value being processed. My understanding is that this is true of distexec, Map/Reduce and (clustered) listeners.
>
> Evangelos from the LEADS project sort of worked around this problem by creating specialized versions of his distexec that load the necessary JARs from the grid itself (from a set of keys) and create a classloader that references these JARs. In sequence, it conceptually looks like this:
>
> - have the generic classloading distexec version in each of the grid nodes' classpaths at start time
> - when a new remote execution is required, load each necessary JAR into a specific key in a specific cache
> - the generic distexec receives the necessary keys, loads each JAR and creates a classloader out of them
> - the generic distexec loads and launches the specific code that needs to be executed (based on the FQCN of the code to execute) from the created classloader
>
> There are a few problems with that, including:
> - it requires a lot of manual work from the user
> - big JARs make the key/value-per-JAR logic explode a bit. The algorithms LEADS use have 300 MB sized JARs
> - god knows what security leaks this can lead to
>
> So I wondered if we have a better alternative and plans, and if there was a wiki page discussing the needs and potential approaches.
> As an intermediary step we could make this approach a tutorial or side classes that people can borrow from for each of the use cases.
>
> Emmanuel

From ttarrant at redhat.com Tue Oct 28 04:39:06 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Tue, 28 Oct 2014 09:39:06 +0100
Subject: [infinispan-dev] JAR distribution once the grid has been deployed
In-Reply-To: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>
References: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>
Message-ID: <544F562A.40707@redhat.com>

Hi Emmanuel,

the plan is to leverage both the domain management and deployment features that are part of the server. Galder has already introduced deployment of custom filters/converters, and this code can be extended to support all of the other extension points we support (distexec, mapreduce, entities, custom cache loaders/stores, etc).

The other alternative I am going to explore is server-side scripting based on JSR-223, which follows the general philosophy of Infinispan Server of being language and protocol independent. I have some initial POC code for this @ [1].

Tristan

[1] https://github.com/tristantarrant/infinispan/commit/3e5ec12a071ff489447a611c9da0657d9641d306

On 16/10/14 11:23, Emmanuel Bernard wrote:
> [snip: Emmanuel's message, quoted in full in Vladimir's reply above]
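To illustrate what a JSR-223 based approach buys: the server can evaluate user-supplied scripts against bindings it controls, without any user classes on its classpath. A minimal sketch of the mechanism (plain javax.script, not the POC's actual API - the binding name is made up):

    import javax.script.ScriptEngine;
    import javax.script.ScriptEngineManager;
    import javax.script.SimpleBindings;

    public class ScriptingSketch {
       public static void main(String[] args) throws Exception {
          ScriptEngine js = new ScriptEngineManager().getEngineByName("javascript");
          SimpleBindings bindings = new SimpleBindings();
          // a real server would bind caches, keys, parameters, etc. here
          bindings.put("greeting", "Hello");
          Object result = js.eval("greeting + ', Infinispan'", bindings);
          System.out.println(result); // Hello, Infinispan
       }
    }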
From ttarrant at redhat.com Tue Oct 28 04:39:06 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Tue, 28 Oct 2014 09:39:06 +0100
Subject: [infinispan-dev] JAR distribution once the grid has been deployed
In-Reply-To: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>
References: <77B40530-3309-420B-B720-DAC56410A102@hibernate.org>
Message-ID: <544F562A.40707@redhat.com>

Hi Emmanuel,

the plan is to leverage both the domain management and deployment
features that are part of the server. Galder has already introduced
deployment of custom filters/converters, and this code can be extended
to support all of the other extension points (distexec, mapreduce,
entities, custom cache loaders/stores, etc.).

The other alternative I am going to explore is server-side scripting
based on JSR-223, which follows the general philosophy of Infinispan
Server of being language- and protocol-independent. I have some initial
POC code for this at [1].

Tristan

[1] https://github.com/tristantarrant/infinispan/commit/3e5ec12a071ff489447a611c9da0657d9641d306

On 16/10/14 11:23, Emmanuel Bernard wrote:
> Hi all,
>
> I know this has been discussed in the past (by Tristan I think), but I
> don't know how concrete the plans have become since then.
> [...]
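For readers unfamiliar with JSR-223, the general shape of the scripting
idea can be sketched with nothing but the standard javax.script API and
an embedded cache. This is an illustration of the concept, not the POC
code at [1]; in the server, the script text would arrive from a remote
client rather than be inlined.

    import javax.script.Bindings;
    import javax.script.ScriptEngine;
    import javax.script.ScriptEngineManager;
    import org.infinispan.Cache;
    import org.infinispan.manager.DefaultCacheManager;

    public class ScriptingSketch {
        public static void main(String[] args) throws Exception {
            DefaultCacheManager cm = new DefaultCacheManager();
            try {
                Cache<String, String> cache = cm.getCache();
                cache.put("greeting", "Hello");

                // The JDK ships a JavaScript engine (Nashorn on Java 8),
                // so a script can be evaluated against the cache
                ScriptEngine js =
                      new ScriptEngineManager().getEngineByName("javascript");
                Bindings bindings = js.createBindings();
                bindings.put("cache", cache);
                // Inlined here for illustration only
                Object result =
                      js.eval("cache.get('greeting') + ', world'", bindings);
                System.out.println(result); // prints: Hello, world
            } finally {
                cm.stop();
            }
        }
    }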
From isavin at redhat.com Tue Oct 28 07:41:46 2014
From: isavin at redhat.com (Ion Savin)
Date: Tue, 28 Oct 2014 13:41:46 +0200
Subject: [infinispan-dev] Infinispan HotRod .NET Client 7.0.0.CR2 released!
Message-ID: <544F80FA.8080703@redhat.com>

Hi all,

Infinispan HotRod .NET Client 7.0.0.CR2 is now available! Thanks to
everyone involved for the changes and bug reports contributed!

Release details at
http://blog.infinispan.org/2014/10/infinispan-hotrod-c-client-700cr2.html

--
Ion Savin

From belaran at gmail.com Wed Oct 29 07:38:23 2014
From: belaran at gmail.com (Romain Pelisse)
Date: Wed, 29 Oct 2014 12:38:23 +0100
Subject: [infinispan-dev] Taking over Mongo DB Cache Store ?
Message-ID:

Hi all,

I've been looking into a piece of ISPN I could contribute to, and I think
I could help support the MongoDB Cache Store [1]. I've checked it out and
none of the forks seem to be ahead of this version, and it seems it needs
to be ported to ISPN 7.x - which I'm inclined to work on.

However, before doing so, I wanted to check with the community whether
anybody has such plans...

[1] https://github.com/infinispan/infinispan-cachestore-mongodb

--
Romain PELISSE,
"The trouble with having an open mind, of course, is that people will
insist on coming along and trying to put things in it" -- Terry Pratchett
Belaran ins Prussia (blog) (... finally up and running !)

From ttarrant at redhat.com Wed Oct 29 09:47:32 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Wed, 29 Oct 2014 14:47:32 +0100
Subject: [infinispan-dev] Infinispan tutorial
Message-ID: <5450EFF4.6050103@redhat.com>

Hi guys,

I've been working on how to spruce up our website, docs and code samples.
While quickstarts are ok, they come as monolithic blobs which tell you
nothing about how you got there. For this reason I believe a step-by-step
tutorial approach is better, and I've been looking at the AngularJS
tutorials [0] as good examples of how to achieve this.

I have created a repo [1] on my GitHub user where each commit is a step
in the tutorial. I have tagged the commits 'step-n' so that you can check
out any of the steps and run them:

    git checkout step-1
    mvn clean package exec:java

The GitHub web interface can be used to show the diff between steps, so
that it can be linked from the docs [2].

Currently I'm not aiming to build a real application (although
suggestions are welcome in this sense), but just going through the
basics, adding features one by one, etc.

Comments are welcome.

Tristan

---
[0] https://docs.angularjs.org/tutorial/step_00
[1] https://github.com/tristantarrant/infinispan-embedded-tutorial
[2] https://github.com/tristantarrant/infinispan-embedded-tutorial/compare/step-0...step-1?diff=unified
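For readers who want a taste without cloning the repo, the very first
step of an embedded tutorial would plausibly look like the sketch below
(this is a guess at the shape of step-1, not a copy of the actual
commit): start a cache manager, obtain the default cache, and use it like
a map.

    import org.infinispan.Cache;
    import org.infinispan.manager.DefaultCacheManager;

    public class Step1 {
        public static void main(String[] args) {
            // An embedded cache manager with default configuration
            DefaultCacheManager cacheManager = new DefaultCacheManager();
            try {
                // The default cache behaves like a ConcurrentMap
                Cache<String, String> cache = cacheManager.getCache();
                cache.put("key", "value");
                System.out.println(cache.get("key"));
            } finally {
                cacheManager.stop();
            }
        }
    }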
From jholusa at redhat.com Thu Oct 30 06:26:56 2014
From: jholusa at redhat.com (Jiri Holusa)
Date: Thu, 30 Oct 2014 06:26:56 -0400 (EDT)
Subject: [infinispan-dev] Infinispan 7 documentation
In-Reply-To: <105852793.2425953.1414663764767.JavaMail.zimbra@redhat.com>
Message-ID: <1151447816.2432147.1414664816470.JavaMail.zimbra@redhat.com>

Hi guys,

I wanted to share some user experience feedback with you. At university,
I had a lecture about NoSQL datastores and Infinispan was also mentioned.
The lecturer also showed some code examples. To my surprise, he used
Infinispan 6. So after the lecture I asked him why version 6, not 7, and
his answer was quite surprising. He told me that he got angry at the
Infinispan 7 documentation, because many code snippet examples were from
the old version 6 and he was basically unable to configure it in a
reasonable time. So he threw it away and switched back to Infinispan 6.

I just wanted to start a little discussion about this, because I think
this is quite a big issue. I noticed that part of this issue was fixed
just recently (18 hours ago, nice coincidence :)) by [1] (+10000
Gustavo), but there are still some out-of-date examples. The message I
want to convey is that we should pay attention to this (I know, boring)
stuff, because we're basically discouraging users and the community from
using the newest version. Every customer/user will start playing with the
community version, and if he's not able to set it up in a few moments, he
will move on to another product. And we don't want that, right? :)

I also applaud Tristan's effort with the step-by-step tutorial; that's
exactly what users want, and I would be happy to help in any way
(verifying, keeping it up to date, whatever).

Conclusion: let's pay more attention to documentation. It's the entry
point for every newcomer, and we want to make the best first impression
possible :)

Thanks,
Jirka

P.S.: I don't see the changes from [1] in the Infinispan User Guide [2],
am I missing something or will it appear there later?

[1] https://github.com/infinispan/infinispan/pull/3011/
[2] http://infinispan.org/docs/7.0.x/user_guide/user_guide.html

From bertrama at umich.edu Thu Oct 30 06:42:54 2014
From: bertrama at umich.edu (Albert Bertram)
Date: Thu, 30 Oct 2014 06:42:54 -0400
Subject: [infinispan-dev] PHP hot rod client
Message-ID:

Hi,

A couple of years ago there were a few messages on this list about a
potential PHP Hot Rod client. I haven't seen any further discussion of
it, but I find myself in the same situation described before: I want to
have a Drupal installation write cache data to Infinispan, and I'd prefer
it to go via the Hot Rod protocol rather than the memcached protocol.

I haven't seen any further evidence of a Hot Rod client native to PHP out
on the open web, so I wrote a small wrapper around the Hot Rod C++ client
which works for my purposes so far. The code is at
https://github.com/bertrama/php-hotrod

I wanted to send a note to the list to ask a couple of questions: Would
anyone else be interested in this PHP extension? Are there
client-oriented benchmarks I should run? I looked around for some, but
didn't find any. Specifically, I want to compare the performance of this
PHP Hot Rod client to the PHP memcached client when talking to the same
Infinispan server.

Thanks!

Albert Bertram

From gustavonalle at gmail.com Thu Oct 30 07:03:50 2014
From: gustavonalle at gmail.com (Gustavo Fernandes)
Date: Thu, 30 Oct 2014 11:03:50 +0000
Subject: [infinispan-dev] Infinispan 7 documentation
In-Reply-To: <1151447816.2432147.1414664816470.JavaMail.zimbra@redhat.com>
References: <105852793.2425953.1414663764767.JavaMail.zimbra@redhat.com>
	<1151447816.2432147.1414664816470.JavaMail.zimbra@redhat.com>
Message-ID:

> I noticed that part of this issue was fixed just recently (18 hours
> ago, nice coincidence :)) by [1] (+10000 Gustavo), but there are still
> some out-of-date examples.

Could you open a JIRA for it?

> P.S.: I don't see the changes from [1] in the Infinispan User Guide
> [2], am I missing something or will it appear there later?

It will appear very soon with the 7.0.0.Final release.

> [1] https://github.com/infinispan/infinispan/pull/3011/
> [2] http://infinispan.org/docs/7.0.x/user_guide/user_guide.html

From belaran at gmail.com Thu Oct 30 08:08:08 2014
From: belaran at gmail.com (Romain Pelisse)
Date: Thu, 30 Oct 2014 13:08:08 +0100
Subject: [infinispan-dev] Taking over Mongo DB Cache Store ?
In-Reply-To:
References:
Message-ID:

Really? Nobody?

Ok, I therefore claim the mongodb extension for my own! AH AH AH
(a diabolical laugh, something like a James Bond villain would do)

More seriously, I'll go on with porting the code to the 7.x base and see
if I can increase the test coverage a bit. If anybody has feature
requests, or other wishes, please let me know (or assign the JIRA to me
if there is one).

On 29 October 2014 12:38, Romain Pelisse wrote:
> Hi all,
>
> I've been looking into a piece of ISPN I could contribute to, and I
> think I could help support the MongoDB Cache Store [1].
> [...]

--
Romain PELISSE,
"The trouble with having an open mind, of course, is that people will
insist on coming along and trying to put things in it" -- Terry Pratchett
Belaran ins Prussia (blog) (... finally up and running !)
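For anyone wondering what the port to 7.x involves: the current store SPI
lives in org.infinispan.persistence.spi. A skeleton along the lines below
is roughly what the ported MongoDB store would implement; raw types are
used for brevity, the MongoDB calls are left as stubs, and the exact
method signatures should be checked against the 7.x javadoc.

    import org.infinispan.marshall.core.MarshalledEntry;
    import org.infinispan.persistence.spi.CacheLoader;
    import org.infinispan.persistence.spi.CacheWriter;
    import org.infinispan.persistence.spi.InitializationContext;

    public class MongoDBStoreSkeleton implements CacheLoader, CacheWriter {

        private InitializationContext ctx;

        @Override
        public void init(InitializationContext ctx) {
            // Store configuration and marshaller are available here
            this.ctx = ctx;
        }

        @Override
        public void start() {
            // Open the MongoDB connection (stub)
        }

        @Override
        public void stop() {
            // Close the MongoDB connection (stub)
        }

        @Override
        public MarshalledEntry load(Object key) {
            // Look the key up in MongoDB and rebuild an entry via
            // ctx.getMarshalledEntryFactory() (stub)
            return null;
        }

        @Override
        public boolean contains(Object key) {
            return load(key) != null;
        }

        @Override
        public void write(MarshalledEntry entry) {
            // Map entry.getKey()/getValue() to a MongoDB document (stub)
        }

        @Override
        public boolean delete(Object key) {
            // Remove the document; return whether it existed (stub)
            return false;
        }
    }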
From ttarrant at redhat.com Thu Oct 30 08:18:08 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Thu, 30 Oct 2014 13:18:08 +0100
Subject: [infinispan-dev] Taking over Mongo DB Cache Store ?
In-Reply-To:
References:
Message-ID: <54522C80.1040608@redhat.com>

Hi Romain,

you just reminded me that I wanted to reply, but I got sidetracked.
Ideally I would also like to bring the Cassandra cachestore back from the
dead, since there has been interest in that direction.

Tristan

On 30/10/14 13:08, Romain Pelisse wrote:
> Really? Nobody?
>
> Ok, I therefore claim the mongodb extension for my own! AH AH AH
> (a diabolical laugh, something like a James Bond villain would do)
>
> More seriously, I'll go on with porting the code to the 7.x base and
> see if I can increase the test coverage a bit.
> [...]
From belaran at gmail.com Thu Oct 30 08:29:08 2014
From: belaran at gmail.com (Romain Pelisse)
Date: Thu, 30 Oct 2014 13:29:08 +0100
Subject: [infinispan-dev] Taking over Mongo DB Cache Store ?
In-Reply-To: <54522C80.1040608@redhat.com>
References: <54522C80.1040608@redhat.com>
Message-ID:

Hi Tristan,

Well, my first idea was to look into Cassandra, but the thing is, I have
zero knowledge of Cassandra (while I have like 0.3 % knowledge of
MongoDB). But maybe once I've maintained the MongoDB CacheStore for a bit
I'll take a stab at the Cassandra one.

On 30 October 2014 13:18, Tristan Tarrant wrote:
> Hi Romain,
>
> you just reminded me that I wanted to reply, but I got sidetracked.
> Ideally I would also like to bring the Cassandra cachestore back from
> the dead, since there has been interest in that direction.
>
> Tristan
> [...]

--
Romain PELISSE,
"The trouble with having an open mind, of course, is that people will
insist on coming along and trying to put things in it" -- Terry Pratchett
Belaran ins Prussia (blog) (... finally up and running !)

From sebastian.laskawiec at gmail.com Thu Oct 30 13:49:02 2014
From: sebastian.laskawiec at gmail.com (Sebastian Łaskawiec)
Date: Thu, 30 Oct 2014 18:49:02 +0100
Subject: [infinispan-dev] Infinispan tutorial
In-Reply-To: <5450EFF4.6050103@redhat.com>
References: <5450EFF4.6050103@redhat.com>
Message-ID:

Hi Tristan!

I really like this idea! Recently I've been studying the Developer
Materials for EAP and I noticed that the page layout is pretty much the
same as Angular's (CDI example: [1] - navigation on the left and the
topic in the top bar).

I'm just thinking - maybe we could place this tutorial in the JBoss Data
Grid Quickstarts section [2]? It seems a perfect place for this kind of
material.

Best regards,
Sebastian

[1] http://www.jboss.org/quickstarts/eap/payment-cdi-event/index.html
[2] http://www.jboss.org/quickstarts/datagrid/

2014-10-29 14:47 GMT+01:00 Tristan Tarrant:
> Hi guys,
>
> I've been working on how to spruce up our website, docs and code
> samples. While quickstarts are ok, they come as monolithic blobs which
> tell you nothing about how you got there.
> [...]
--
Sebastian Łaskawiec