From ttarrant at redhat.com Mon Dec 1 11:21:24 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Mon, 01 Dec 2014 17:21:24 +0100
Subject: [infinispan-dev] Weekly IRC Meeting logs 2014-12-01
Message-ID: <547C9584.8030906@redhat.com>

Hi all,

we had our weekly IRC meeting today. The logs are at

http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2014/infinispan.2014-12-01-15.02.log.html

Thanks

Tristan
--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From rvansa at redhat.com Tue Dec 2 09:33:21 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Tue, 02 Dec 2014 15:33:21 +0100
Subject: [infinispan-dev] DeltaAware: different local/remote behaviour
Message-ID: <547DCDB1.3040200@redhat.com>

Hi,

I was trying to implement efficient atomic counters [1] in Infinispan using the DeltaAware interface, but while trying to use it I've spotted an unexpected behaviour: I wanted to have a Delta for a getAndIncrement() method that would simply increment the value without knowing the previous value ahead of time, and return this previous value. Therefore, I was inserting a fake DeltaAware object into the cache that generates this relative Delta.

This works as long as the originator != primary owner, as the delta is generated during marshalling. However, if I store that object locally, the fake object is not used to generate the delta and reapply it on the current instance in the data container; it is stored directly.

Is such a difference in local/remote behaviour a bug or a feature? (This is the main question in this mail.)

It seems to me that there are two reasons to use deltas: reducing the size of RPCs and reducing their total number. So the design should optimize both.

I have further doubts about the usefulness of the DeltaAware interface, tracked in ISPN-5035 [2] - while it reduces bandwidth from originator to primary owner, the response from primary owner to originator carries the full value. I also find it quite inconvenient that only PutKeyValueCommand somehow works with deltas, but ReplaceCommand does not.

I've also noticed that the backup carries the full value [3], not quite a good idea when we're trying to reduce bandwidth.

Generally, I think that an EntryProcessor-like interface would be more useful than DeltaAware.

Radim

[1] https://github.com/rvansa/infinispan/tree/t_objects
[2] https://issues.jboss.org/browse/ISPN-5035
[3] https://issues.jboss.org/browse/ISPN-5037

--
Radim Vansa
JBoss DataGrid QA

From an1310 at hotmail.com Tue Dec 2 10:04:29 2014
From: an1310 at hotmail.com (Erik Salter)
Date: Tue, 2 Dec 2014 10:04:29 -0500
Subject: [infinispan-dev] DeltaAware: different local/remote behaviour
In-Reply-To: <547DCDB1.3040200@redhat.com>
References: <547DCDB1.3040200@redhat.com>
Message-ID:

Hi Radim,

We may be doing something similar. I was implementing something along the lines of a queue of operations that resolve into a single value. This implementation uses Total Order and CRDTs. I also want a changelog to send to the backups.

I already use DeltaAware quite liberally in my production environment. I've always looked at it as an implementation detail if the originator == primary owner. While this does make for some inefficiencies, like increased memory utilization (I have a lot of keys for very large objects), it's worth it to me from a simplicity standpoint.

I always use DeltaAware with SKIP_REMOTE_LOOKUP.

The real fun with DeltaAware comes in the cases where a backup receives a DeltaAware instance and the key isn't in its data container.
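(For concreteness, a minimal sketch of the relative-Delta counter Radim describes above. The Delta and DeltaAware interfaces are Infinispan's; the counter classes and their behaviour are a hypothetical illustration, not the code from his branch.)

import java.io.Serializable;

import org.infinispan.atomic.Delta;
import org.infinispan.atomic.DeltaAware;

public class CounterSketch {

    // The "fake" DeltaAware inserted by the originator: it never knows the
    // stored value, it only produces a relative Delta.
    public static class IncrementOnly implements DeltaAware, Serializable {
        @Override
        public Delta delta() {
            return new Increment(1);
        }

        @Override
        public void commit() {
            // nothing to reset; this object carries no state of its own
        }
    }

    // The relative Delta: merged against whatever value the owner holds.
    public static class Increment implements Delta, Serializable {
        private final long by;

        public Increment(long by) {
            this.by = by;
        }

        @Override
        public DeltaAware merge(DeltaAware other) {
            Counter c = (other instanceof Counter) ? (Counter) other : new Counter();
            c.value += by;
            return c;
        }
    }

    // The value actually kept in the data container.
    public static class Counter implements DeltaAware, Serializable {
        long value;

        @Override
        public Delta delta() {
            return new Increment(0); // a real implementation would track its changes
        }

        @Override
        public void commit() {
        }
    }
}

As Radim notes, such a sketch only behaves as intended when the Delta is actually marshalled, i.e. when originator != primary owner.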
In that case, the backup will issue a remote get to pull the complete context before applying the delta. During state transfer, this will lead to increased thread utilization on the joining nodes. I have a use case where I must restart half my cluster while there are 100K DeltaAware keys being written at a high data rate. With numOwners == 2, there are 3 nodes in the union CH. A new backup will issue 2 remote GetKeyValueCommands. I have a hack to stagger the gets to reduce bandwidth, but if we're rethinking the implementation this should be an additional consideration.

Regards,

Erik

On 12/2/14, 9:33 AM, "Radim Vansa" wrote:

> Hi,
>
> I was trying to implement efficient atomic counters [1] in Infinispan
> using the DeltaAware interface, but while trying to use it I've
> spotted an unexpected behaviour: I wanted to have a Delta for a
> getAndIncrement() method that would simply increment the value without
> knowing the previous value ahead of time, and return this previous value.
> Therefore, I was inserting a fake DeltaAware object into the cache that
> generates this relative Delta.
>
> This works as long as the originator != primary owner, as the delta is
> generated during marshalling. However, if I store that object locally,
> the fake object is not used to generate the delta and reapply it on
> the current instance in the data container; it is stored directly.
>
> Is such a difference in local/remote behaviour a bug or a feature? (This is
> the main question in this mail.)
>
> It seems to me that there are two reasons to use deltas: reducing the size
> of RPCs and reducing their total number. So the design should optimize both.
>
> I have further doubts about the usefulness of the DeltaAware interface,
> tracked in ISPN-5035 [2] - while it reduces bandwidth from originator to
> primary owner, the response from primary owner to originator carries the
> full value. I also find it quite inconvenient that only PutKeyValueCommand
> somehow works with deltas, but ReplaceCommand does not.
>
> I've also noticed that the backup carries the full value [3], not quite
> a good idea when we're trying to reduce bandwidth.
>
> Generally, I think that an EntryProcessor-like interface would be more
> useful than DeltaAware.
>
> Radim
>
> [1] https://github.com/rvansa/infinispan/tree/t_objects
> [2] https://issues.jboss.org/browse/ISPN-5035
> [3] https://issues.jboss.org/browse/ISPN-5037
>
> --
> Radim Vansa
> JBoss DataGrid QA
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From rvansa at redhat.com Tue Dec 2 11:17:38 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Tue, 02 Dec 2014 17:17:38 +0100
Subject: [infinispan-dev] DeltaAware: different local/remote behaviour
In-Reply-To:
References: <547DCDB1.3040200@redhat.com>
Message-ID: <547DE622.6000905@redhat.com>

Hi Erik,

it's great to get community (users') feedback on API :)

Comments inline

On 12/02/2014 04:04 PM, Erik Salter wrote:
> Hi Radim,
>
> We may be doing something similar. I was implementing something along the
> lines of a queue of operations that resolve into a single value. This
> implementation uses Total Order and CRDTs. I also want a changelog to
> send to the backups.
>
> I already use DeltaAware quite liberally in my production environment.
> I've always looked at it as an implementation detail if the originator ==
> primary owner.
While this does make for some inefficiencies, like > increased memory utilization (I have a lot of keys for very large > objects), it's worth it to me from a simplicity standpoint. Yes, and that's what I'd like to do :) When designing the object, I was expecting that all updates will be in the delta-way, and therefore I report that it's not this way and I have to adapt the code in a hackish way. > > I always use DeltaAware with SKIP_REMOTE_LOOKUP. I see. But in some use cases you'd want some condensed report what was the result of applying delta. > > The real fun with DeltaAware are the cases where a backup receives a > DeltaAware instance and the key isn't in its data container. It will > issue a remote get to pull the complete context before applying the delta. > During state transfer, this will lead to increased thread utilization on > the joining nodes. I have a use case where I must restart half my cluster > while there's 100K DeltaAware keys being written at a high data rate. > With numOwners == 2, there are 3 nodes in the union CH. A new backup will > issue 2 remote GetKeyValueCommands. Hmm, does not sound really convenient but I don't see what other could be done when the delta-updated entry is not in place yet. > I have a hack to stagger the gets to > reduce bandwidth, but if we're rethinking the implementation this should > be an additional consideration. Nobody said we're rethinking this - I was just providing the feedback from my POV after first starting to play with DeltaAware. Radim > > Regards, > > Erik > > > On 12/2/14, 9:33 AM, "Radim Vansa" wrote: > >> Hi, >> >> I was trying to implement an effective atomic counters [1] in Infinispan >> using the DeltaAware interface, but trying to use DeltaAware I've >> spotted an unexpected behaviour; I wanted to have a Delta for >> getAndIncrement() method, that would simply increment the value without >> knowing the previous value ahead, and return this previous value. >> Therefore, I was inserting a fake DeltaAware object into the cache that >> generates this relative Delta. >> >> This works as long as the originator != primary owner, as the delta is >> generated during marshalling. However, if I store that object locally, >> the fake object is not used to generate the delta and reapply it on >> current instance in data container, but it is stored directly. >> >> Is such difference in local/remote behaviour bug or feature? (this is >> the main question in this mail) >> >> It seems to me that there are two reasons to use deltas: reducing size >> of RPCs and reduce their total number. So the design should optimize both. >> >> I have another doubts about DeltaAware interface usefulness, tracked in >> ISPN-5035 [2] - while it reduces bandwith from originator to primary >> owner, the response from primary owner to originator carries the full >> value. I also find quite inconvenient that only PutKeyValueCommand >> somehow works with deltas, but ReplaceCommand does not. >> >> I've also noticed that the backup carries the full value [3], not quite >> a good idea when we're trying to reduce bandwith. >> >> Generally, I think that EntryProcessor-like interface would be more >> useful than DeltaAware. 
>>
>> Radim
>>
>> [1] https://github.com/rvansa/infinispan/tree/t_objects
>> [2] https://issues.jboss.org/browse/ISPN-5035
>> [3] https://issues.jboss.org/browse/ISPN-5037
>>
>> --
>> Radim Vansa
>> JBoss DataGrid QA
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss DataGrid QA

From periyasamy.palanisamy at ericsson.com Wed Dec 3 05:49:07 2014
From: periyasamy.palanisamy at ericsson.com (Periyasamy Palanisamy)
Date: Wed, 3 Dec 2014 10:49:07 +0000
Subject: [infinispan-dev] ISPN000196: Failed to recover cluster state after the current node became the coordinator
Message-ID: <7280D3BDF6E559489E711F2AF13500441A4CFE9E@ESESSMB105.ericsson.se>

Hi,

We are using Infinispan 5.3.0 version in Opendaylight based controller product in 2-node clustered environment. But when we tried to reboot a node in the cluster, the error "ISPN000196: Failed to recover cluster state after the current node became the coordinator" is thrown continuously, which leads to the cluster becoming unusable. I see there is a bug already raised against Infinispan 5.3.0 (https://issues.jboss.org/browse/ISPN-3395), but there is no information on which version this issue got fixed in.

Please confirm in which Infinispan version this issue is fixed.

Thanks,
Periyasamy

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141203/08eeb559/attachment.html

From dan.berindei at gmail.com Thu Dec 4 03:41:30 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Thu, 4 Dec 2014 10:41:30 +0200
Subject: [infinispan-dev] DeltaAware: different local/remote behaviour
In-Reply-To: <547DE622.6000905@redhat.com>
References: <547DCDB1.3040200@redhat.com> <547DE622.6000905@redhat.com>
Message-ID:

Hi Radim,

I'm afraid the DeltaAware javadoc is quite clear:

> * Implementations of DeltaAware automatically gain the ability to perform fine-grained replication in Infinispan,
> * since Infinispan's data container is able to detect these types and only serialize and transport Deltas around
> * the network rather than the entire, serialized object.

So deltas are only used for minimizing the replication cost, not for local operations.

> * Using DeltaAware makes sense if your custom object is large in size and often only sees small portions of the
> * object being updated in a transaction. Implementations would need to be able to track these changes during the
> * course of a transaction though, to be able to produce a {@link Delta} instance, so this too is a consideration
> * for implementations.

Meaning put(K, DeltaAware) expects the originator to have the previous value, not just a Delta. And that's why AtomicHashMap requires transactions.

Further comments inline.

On Tue, Dec 2, 2014 at 6:17 PM, Radim Vansa wrote:
> Hi Erik,
>
> it's great to get community (users') feedback on API :)
>
> Comments inline
>
> On 12/02/2014 04:04 PM, Erik Salter wrote:
>> Hi Radim,
>>
>> We may be doing something similar. I was implementing something along the
>> lines of a queue of operations that resolve into a single value. This
>> implementation uses Total Order and CRDTs. I also want a changelog to
>> send to the backups.
>> >> I already use DeltaAware quite liberally in my production environment. >> I've always looked at it as an implementation detail if the originator == >> primary owner. While this does make for some inefficiencies, like >> increased memory utilization (I have a lot of keys for very large >> objects), it's worth it to me from a simplicity standpoint. > > Yes, and that's what I'd like to do :) When designing the object, I was > expecting that all updates will be in the delta-way, and therefore I > report that it's not this way and I have to adapt the code in a hackish way. > I think you already have a hack in your code when you create a fake DeltaAware when you only have a Delta :) Your scenario is useful, indeed our Map/Reduce implementation uses it, but it's not what DeltaAware was intended for. I'd rather add a new API (like JCache's EntryProcessor) instead of "fixing" DeltaAware to do things it wasn't intended for. >> >> I always use DeltaAware with SKIP_REMOTE_LOOKUP. > > I see. But in some use cases you'd want some condensed report what was > the result of applying delta. > Again, this would be a hack: the put operation should return the previous value, not an arbitrary value. >> >> The real fun with DeltaAware are the cases where a backup receives a >> DeltaAware instance and the key isn't in its data container. It will >> issue a remote get to pull the complete context before applying the delta. >> During state transfer, this will lead to increased thread utilization on >> the joining nodes. I have a use case where I must restart half my cluster >> while there's 100K DeltaAware keys being written at a high data rate. >> With numOwners == 2, there are 3 nodes in the union CH. A new backup will >> issue 2 remote GetKeyValueCommands. > > Hmm, does not sound really convenient but I don't see what other could > be done when the delta-updated entry is not in place yet. > >> I have a hack to stagger the gets to >> reduce bandwidth, but if we're rethinking the implementation this should >> be an additional consideration. > > Nobody said we're rethinking this - I was just providing the feedback > from my POV after first starting to play with DeltaAware. > I've created ISPN-5042 [1] to keep track of this, but we might implement ISPN-825 [2] sooner. [1] https://issues.jboss.org/browse/ISPN-5042 [2] https://issues.jboss.org/browse/ISPN-825 > Radim > >> >> Regards, >> >> Erik >> >> >> On 12/2/14, 9:33 AM, "Radim Vansa" wrote: >> >>> Hi, >>> >>> I was trying to implement an effective atomic counters [1] in Infinispan >>> using the DeltaAware interface, but trying to use DeltaAware I've >>> spotted an unexpected behaviour; I wanted to have a Delta for >>> getAndIncrement() method, that would simply increment the value without >>> knowing the previous value ahead, and return this previous value. >>> Therefore, I was inserting a fake DeltaAware object into the cache that >>> generates this relative Delta. >>> >>> This works as long as the originator != primary owner, as the delta is >>> generated during marshalling. However, if I store that object locally, >>> the fake object is not used to generate the delta and reapply it on >>> current instance in data container, but it is stored directly. >>> >>> Is such difference in local/remote behaviour bug or feature? (this is >>> the main question in this mail) >>> >>> It seems to me that there are two reasons to use deltas: reducing size >>> of RPCs and reduce their total number. So the design should optimize both. 
>>> >>> I have another doubts about DeltaAware interface usefulness, tracked in >>> ISPN-5035 [2] - while it reduces bandwith from originator to primary >>> owner, the response from primary owner to originator carries the full >>> value. I also find quite inconvenient that only PutKeyValueCommand >>> somehow works with deltas, but ReplaceCommand does not. >>> >>> I've also noticed that the backup carries the full value [3], not quite >>> a good idea when we're trying to reduce bandwith. >>> >>> Generally, I think that EntryProcessor-like interface would be more >>> useful than DeltaAware. >>> >>> Radim >>> >>> [1] https://github.com/rvansa/infinispan/tree/t_objects >>> [2] https://issues.jboss.org/browse/ISPN-5035 >>> [3] https://issues.jboss.org/browse/ISPN-5037 >>> >>> -- >>> Radim Vansa >>> JBoss DataGrid QA >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From an1310 at hotmail.com Thu Dec 4 10:22:47 2014 From: an1310 at hotmail.com (Erik Salter) Date: Thu, 4 Dec 2014 10:22:47 -0500 Subject: [infinispan-dev] JIRAs for a 7.0.3 release Message-ID: Hi all, I was asked to vote on a list of JIRAs for 7.0.3 and send it to the mailing list. The next iteration of my application is migrating from 5.2.x to 7.0.x, so I'm really focused on hardening and stability, especially WRT state transfer. Here are the ones I was looking at, mostly related to state transfer: - ISPN-5000 - ISPN-4949 (and related ISPN-5030) - ISPN-4975 - ISPN-5027 Notes on a few others I was looking at: - ISPN-4444, from the description, looks serious enough to include. I haven't looked at the commit in-depth; appears to be limited to keys in L1? - ISPN-4979 appears to be a substantial change. I would defer to the team about how risky of a change it is. And if ISPN-3561 makes it into a 7.0.3, I'd consider it a personal favor. Regards, Erik From periyasamy.palanisamy at ericsson.com Thu Dec 4 23:19:47 2014 From: periyasamy.palanisamy at ericsson.com (Periyasamy Palanisamy) Date: Fri, 5 Dec 2014 04:19:47 +0000 Subject: [infinispan-dev] ISPN000196: Failed to recover cluster state after the current node became the coordinator In-Reply-To: <7280D3BDF6E559489E711F2AF13500441A4CFE9E@ESESSMB105.ericsson.se> References: <7280D3BDF6E559489E711F2AF13500441A4CFE9E@ESESSMB105.ericsson.se> Message-ID: <7280D3BDF6E559489E711F2AF13500441A4D4951@ESESSMB105.ericsson.se> Hi Everyone, I am planning to upgrade infinispan from 5.3.0 to 6.0.2 to solve this issue. Please let me know whether infinispan 6.0.2 has fix for it or not. Thanks, Periyasamy From: infinispan-dev-bounces at lists.jboss.org [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Periyasamy Palanisamy Sent: Wednesday, December 03, 2014 4:19 PM To: infinispan-dev at lists.jboss.org Subject: [infinispan-dev] ISPN000196: Failed to recover cluster state after the current node became the coordinator Hi, We are using Infinispan 5.3.0 version in Opendaylight based controller product in 2-node clustered environment. 
But when we tried to reboot a node in the cluster, the error "ISPN000196: Failed to recover cluster state after the current node became the coordinator" is thrown continuously, which leads to the cluster becoming unusable. I see there is a bug already raised against Infinispan 5.3.0 (https://issues.jboss.org/browse/ISPN-3395), but there is no information on which version this issue got fixed in.

Please confirm in which Infinispan version this issue is fixed.

Thanks,
Periyasamy

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141205/3c9787b1/attachment.html

From ttarrant at redhat.com Fri Dec 5 04:28:00 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Fri, 05 Dec 2014 10:28:00 +0100
Subject: [infinispan-dev] ISPN000196: Failed to recover cluster state after the current node became the coordinator
In-Reply-To: <7280D3BDF6E559489E711F2AF13500441A4D4951@ESESSMB105.ericsson.se>
References: <7280D3BDF6E559489E711F2AF13500441A4CFE9E@ESESSMB105.ericsson.se> <7280D3BDF6E559489E711F2AF13500441A4D4951@ESESSMB105.ericsson.se>
Message-ID: <54817AA0.1020003@redhat.com>

You should really upgrade to 7.0.x, as 6.0.x is not developed any more.

Tristan

On 05/12/2014 05:19, Periyasamy Palanisamy wrote:
>
> Hi Everyone,
>
> I am planning to upgrade infinispan from 5.3.0 to 6.0.2 to solve this
> issue. Please let me know whether infinispan 6.0.2 has fix for it or not.
>
> Thanks,
>
> Periyasamy
>
> *From:* infinispan-dev-bounces at lists.jboss.org
> [mailto:infinispan-dev-bounces at lists.jboss.org] *On Behalf Of
> *Periyasamy Palanisamy
> *Sent:* Wednesday, December 03, 2014 4:19 PM
> *To:* infinispan-dev at lists.jboss.org
> *Subject:* [infinispan-dev] ISPN000196: Failed to recover cluster
> state after the current node became the coordinator
>
> Hi,
>
> We are using Infinispan 5.3.0 version in Opendaylight based controller
> product in 2-node clustered environment. But when we tried to reboot a
> node in the cluster, the error "ISPN000196: Failed to recover cluster
> state after the current node became the coordinator" is thrown
> continuously, which leads to the cluster becoming unusable. I see there is a
> bug already raised against Infinispan 5.3.0
> (https://issues.jboss.org/browse/ISPN-3395), but there is no
> information on which version this issue got fixed in.
>
> Please confirm in which Infinispan version this issue is fixed.
>
> Thanks,
>
> Periyasamy
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From ttarrant at redhat.com Fri Dec 5 09:41:19 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Fri, 05 Dec 2014 15:41:19 +0100
Subject: [infinispan-dev] Infinispan Management Console project
Message-ID: <5481C40F.2080806@redhat.com>

Hi all,

I have created a dedicated repository [1] for the Infinispan Management Console on GitHub. Since the project is built on pure HTML/Javascript/CSS using tooling which is familiar in that world (NodeJS, Gulp, Bower, Angular, etc.) it makes sense for it to live by itself with its own release cycle. The pom.xml file in there is not a mistake: releases will happen as jar files pushed to JBoss's Nexus so that the main Infinispan project can pull it in.
Contributions are welcome.

Tristan

[1] https://github.com/infinispan/infinispan-management-console

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From rory.odonnell at oracle.com Tue Dec 9 03:41:01 2014
From: rory.odonnell at oracle.com (Rory O'Donnell)
Date: Tue, 09 Dec 2014 08:41:01 +0000
Subject: [infinispan-dev] JDK 9 images are now modular with JDK 9 Early Access build 41
Message-ID: <5486B59D.4070304@oracle.com>

Hi Galder,

The initial changesets for JEP 220: Modular Run-Time Images [1] are available with JDK 9 early-access build 41 [2]. To summarize (please see the JEP for details):

- The "jre" subdirectory is no longer present in JDK images.
- The user-editable configuration files in the "lib" subdirectory have been moved to the new "conf" directory.
- The endorsed-standards override mechanism has been removed.
- The extension mechanism has been removed.
- rt.jar, tools.jar, and dt.jar have been removed.
- A new URI scheme for naming stored modules, classes, and resources has been defined.
- For tools that previously accessed rt.jar directly, a built-in NIO file-system provider has been defined to provide access to the class and resource files within a run-time image.

More details are available at Mark Reinhold's latest blog entry [3].

Rgds, Rory

[1] http://openjdk.java.net/jeps/220
[2] https://jdk9.java.net/download/
[3] http://mreinhold.org/blog/jigsaw-modular-images

--
Rgds, Rory O'Donnell
Quality Engineering Manager
Oracle EMEA, Dublin, Ireland

From mohammedisaa.khan at subex.com Thu Dec 11 07:17:54 2014
From: mohammedisaa.khan at subex.com (mohammedisaa.khan at subex.com)
Date: Thu, 11 Dec 2014 05:17:54 -0700 (MST)
Subject: [infinispan-dev] Jgroups - One or more nodes have left exception while querying(get, replaceWithVersion) on the cache
Message-ID: <1418300274885-4030028.post@n3.nabble.com>

Hi,

We are using Infinispan 6.0.2.Final with the hotrod client in our application. We have 3 nodes and are running a test with about 30 million entries in the cache and about 300 million requests being processed.

During the execution, after a few hours, we get the following errors:

1) Failed to recover cluster state after the current node became the coordinator
2) org.infinispan.remoting.transport.jgroups.SuspectException: One or more nodes have left the cluster while replicating command PrepareCommand
3) Message Send failed due to time out
4) Suspect messages - although the nodes were active.

There were no crashes and all the nodes are active! But it seems like some node appeared to leave the cluster (deduced from error #2) and after that the cluster misbehaves. Most requests return null for a cache query although the data is present in the nodes and the nodes are up and active. We have written a debug script which individually queries the cache and the caches respond, but when we run the hotrod client with all node IPs/ports, only one node seems to respond and the other 2 nodes do not.

Could you tell me why errors 2 and 3 occur? Are these identified? Have they been fixed in 7.x?

This appears to break the system quite often. Kindly reach out with solutions.

Regards,
Isaa

--
View this message in context: http://infinispan-developer-list.980875.n3.nabble.com/Jgroups-One-or-more-nodes-have-left-exception-while-querying-get-replaceWithVersion-on-the-cache-tp4030028.html
Sent from the Infinispan Developer List mailing list archive at Nabble.com.
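(For reference, the multi-server client setup being described would look roughly like this with the 6.0.x HotRod client API; host names and ports are placeholders.)

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class HotRodClientSketch {
    public static void main(String[] args) {
        // List all three nodes so the client can balance requests and fail over
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.addServers("node1:11222;node2:11222;node3:11222");
        RemoteCacheManager manager = new RemoteCacheManager(builder.build());
        try {
            RemoteCache<String, String> cache = manager.getCache();
            cache.put("key", "value");
            System.out.println(cache.get("key"));
        } finally {
            manager.stop();
        }
    }
}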
From dan.berindei at gmail.com Thu Dec 11 09:20:02 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Thu, 11 Dec 2014 16:20:02 +0200
Subject: [infinispan-dev] Jgroups - One or more nodes have left exception while querying(get, replaceWithVersion) on the cache
In-Reply-To: <1418300274885-4030028.post@n3.nabble.com>
References: <1418300274885-4030028.post@n3.nabble.com>
Message-ID:

Hi Isaa,

We definitely recommend that you try upgrading to 7.0.2.Final, since we don't support older versions.

That being said, the suspect exceptions and communication timeouts are a sign of a flaky network, or more likely of excessive garbage collections. Have you tried enabling GC logging to see how big the pauses are?

7.0.x has some fixes in this area, e.g. suspect exceptions are no longer propagated to the application and instead the client retries the operation. But it won't help much if the application is really running out of memory.

Cheers
Dan

On Thu, Dec 11, 2014 at 2:17 PM, mohammedisaa.khan at subex.com wrote:
> Hi,
>
> We are using Infinispan 6.0.2.Final with the hotrod client in our
> application. We have 3 nodes and are running a test with about 30 million
> entries in the cache and about 300 million requests being processed.
>
> During the execution, after a few hours, we get the following errors:
>
> 1) Failed to recover cluster state after the current node became the
> coordinator
> 2) org.infinispan.remoting.transport.jgroups.SuspectException: One or more
> nodes have left the cluster while replicating command PrepareCommand
> 3) Message Send failed due to time out
> 4) Suspect messages - although the nodes were active.
>
> There were no crashes and all the nodes are active! But it seems like some
> node appeared to leave the cluster (deduced from error #2) and after that the
> cluster misbehaves. Most requests return null for a cache query although the
> data is present in the nodes and the nodes are up and active. We have
> written a debug script which individually queries the cache and the caches
> respond, but when we run the hotrod client with all node IPs/ports, only one
> node seems to respond and the other 2 nodes do not.
>
> Could you tell me why errors 2 and 3 occur? Are these identified? Have they
> been fixed in 7.x?
>
> This appears to break the system quite often. Kindly reach out with
> solutions.
>
> Regards,
> Isaa
>
> --
> View this message in context: http://infinispan-developer-list.980875.n3.nabble.com/Jgroups-One-or-more-nodes-have-left-exception-while-querying-get-replaceWithVersion-on-the-cache-tp4030028.html
> Sent from the Infinispan Developer List mailing list archive at Nabble.com.
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From jholusa at redhat.com Mon Dec 15 06:35:53 2014
From: jholusa at redhat.com (Jiri Holusa)
Date: Mon, 15 Dec 2014 06:35:53 -0500 (EST)
Subject: [infinispan-dev] Clustered queries and custom indexes
In-Reply-To: <1935121773.13323972.1418638356511.JavaMail.zimbra@redhat.com>
References: <1935121773.13323972.1418638356511.JavaMail.zimbra@redhat.com>
Message-ID: <721545948.13345812.1418643353665.JavaMail.zimbra@redhat.com>

Hi,

there is interesting research around similarity search at my university, driven by David Novák (CC-ed). If anyone is interested, see [1][2][3].

Shortly: they basically achieved similarity search on any data (images, songs, etc...)
by creating some sort of custom index that stores a "similarity vector" for each object in the database. This index can solve queries like "give me the most similar images to this example". So why am I posting this here?

The architecture is designed on top of Infinispan and they want to use it to speed it up. Basically, they would like to distribute the entries across the cluster; each node would have the similarity index of its entries. Then, when a query comes, it would be distributed to all the nodes, a custom search would be performed on the nodes' indexes and the result returned. This is approximately what Index.LOCAL and ClusteredQuery could do.

The difference is that the indexing and searching mechanism must be custom. So I wanted to ask what you think about implementing such a feature in Infinispan. I was thinking about somehow extracting a general API for indexing/searching; then e.g. our Lucene search would become one of its implementations.

I would be happy to take this on as a contribution, since I find this an extremely interesting topic and could also create a diploma thesis out of it.

So here are some questions:
1) Is it doable?
2) Do we want this feature?
3) How to design it/where to start?

Any input is more than welcome :)

Cheers,
Jiri

[1] https://drive.google.com/file/d/0B4sztQSfpi3rRlJBQjJHMkR2LXc/view
[2] https://drive.google.com/file/d/0B4sztQSfpi3rU2p2MV9jRE9iTUk/view
[3] https://drive.google.com/file/d/0B4sztQSfpi3rZUpld24ydzJNclk/view

_______________________________________________
infinispan-dev mailing list
infinispan-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

From sanne at infinispan.org Mon Dec 15 07:41:31 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Mon, 15 Dec 2014 12:41:31 +0000
Subject: [infinispan-dev] Clustered queries and custom indexes
In-Reply-To: <721545948.13345812.1418643353665.JavaMail.zimbra@redhat.com>
References: <1935121773.13323972.1418638356511.JavaMail.zimbra@redhat.com> <721545948.13345812.1418643353665.JavaMail.zimbra@redhat.com>
Message-ID:

Hi Jiri, David,

I only briefly skimmed through the attachments to get an idea of the content, but it looks great at first sight! I'll read it in depth during the upcoming holidays. But I think I can answer some of your questions already:

1) Yes, it's certainly doable, if you have enough time for it :) But you got our attention; we're certainly interested to help.

2) It's probably good value for Infinispan to work on an abstraction from the specific indexing engine, although a poorly implemented abstraction would cost us in terms of performance, so we should get that right. Users' configuration complexity is also a frequent concern, so let's try to keep that in mind too. Once we have a proper separation from the current indexing/query engine we can certainly add this as an alternative implementation; this can live as an experimental module for a while and be integrated depending on how far we get and how people like the additional features.

3) In terms of design, I should probably read those papers in depth first, but these are my early doubts:

# to Lucene / not to Lucene

I see in the presentation that Lucene is referred to as a good solution for full-text, but while that's true, it is actually just an encoder/decoder/query engine for a vector space model. People have built more than just text-based Similarity on top of it. Would this implementation be possible to run on top of Lucene indexes, or is it required to use a completely different index management solution?
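(As a concrete reference point for the scatter-gather flow Jiri describes - query every node's local index and merge the partial results - here is a rough sketch on top of Infinispan's existing distributed executor. The local index registry and its search method are placeholders, not a real API.)

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

import org.infinispan.Cache;
import org.infinispan.distexec.DefaultExecutorService;
import org.infinispan.distexec.DistributedExecutorService;

public class SimilarityQuerySketch {

    // Runs on each node and searches only that node's local similarity index
    static class LocalSimilaritySearch implements Callable<List<String>>, Serializable {
        private final float[] queryVector;

        LocalSimilaritySearch(float[] queryVector) {
            this.queryVector = queryVector;
        }

        @Override
        public List<String> call() {
            return LocalIndexRegistry.get().search(queryVector, 10);
        }
    }

    // Placeholder for however the node-local index would be obtained
    static class LocalIndexRegistry {
        static LocalIndexRegistry get() { return new LocalIndexRegistry(); }
        List<String> search(float[] vector, int k) { return new ArrayList<>(); }
    }

    public static List<String> query(Cache<String, byte[]> cache, float[] vector) throws Exception {
        DistributedExecutorService executor = new DefaultExecutorService(cache);
        List<String> merged = new ArrayList<>();
        // Scatter the task to every node, then gather and merge the partial results
        for (Future<List<String>> partial : executor.submitEverywhere(new LocalSimilaritySearch(vector))) {
            merged.addAll(partial.get());
        }
        return merged; // a real implementation would re-rank the merged candidates
    }
}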
# to Hibernate Search / not to Hibernate Search

Most of the current indexing/query code in Infinispan is based on Hibernate Search, which handles the complexity of Lucene's resource management and query execution, and makes it easier for developers to map their domain model. We're working on Hibernate Search to improve its flexibility on dynamic models (more suited to Infinispan users), and also to not necessarily work on Lucene in embedded mode but to delegate to "Lucene like" services. That means it will probably always assume it has some form of Similarity-capable, vector space model based engine to delegate the hard work to, but not necessarily the Lucene project; we're looking at alternatives like Apache Solr and ElasticSearch for now - so essentially still Lucene based, but typically running on separate dedicated cluster node(s).

You could think of integrating the index handling code into Hibernate Search, whose functionality is automatically inherited by Infinispan, or bypass Hibernate Search and integrate with Infinispan directly. Depending on the "Lucene" question, be aware that Hibernate Search is already able to provide functionality like Spatial queries and indexing of PDF/Office files; although this last one is text based, the Spatial integration works on numeric distance; the benefit is that we can combine distance criteria with text criteria. I don't think it would be hard to extend this model to support other implementations of Similarity like the mentioned images and songs; in fact that would probably be a relatively easy task if you already know which Similarity implementation you want to use. The benefit of integrating with Hibernate Search is that you would address the needs of a much larger user base: the same functionality is usable by Hibernate users (Java developers using relational databases: we provide indexing and Similarity based queries on your database stored data).

I'm just listing some options but don't intend to recommend any without further details. While I'm leading the Hibernate Search project, I see good value in a proper abstraction from Infinispan to a pluggable (alternative) query strategy, although considering how many details it takes to get right, I doubt we'll ever be able to make an effective competitor for the current one; so to answer the two points we'd need a better understanding of what exactly you would need to store in the "index" and how you think this can be maintained in sync with the data.

Generally speaking, I think all newcomers will be tempted to avoid both Lucene and Hibernate Search to not need to learn too much, but let's keep in mind that, not having unlimited manpower, we need to be smart: these two engines do a lot of heavy work and are constantly evolving in terms of performance. So unless the requirements don't fit at all, I'd rather help to see what could be reused from these.

I haven't done much advanced research using Lucene myself, but I've heard that several researchers use it as a "toolbox" to experiment with new kinds of vector space based analytics, so I expect it should be useful to keep around even in an alternative implementation.

Thanks,
Sanne

On 15 December 2014 at 11:35, Jiri Holusa wrote:
> Hi,
>
> there is interesting research around similarity search at my university, driven by David Novák (CC-ed). If anyone is interested, see [1][2][3].
>
> Shortly: they basically achieved similarity search on any data (images, songs, etc...)
by creating some sort of custom index that stores a "similarity vector" for
> each object in the database. This index can solve queries like "give me the
> most similar images to this example". So why am I posting this here?
>
> The architecture is designed on top of Infinispan and they want to use it to
> speed it up. Basically, they would like to distribute the entries across the
> cluster; each node would have the similarity index of its entries. Then, when
> a query comes, it would be distributed to all the nodes, a custom search would
> be performed on the nodes' indexes and the result returned. This is
> approximately what Index.LOCAL and ClusteredQuery could do.
>
> The difference is that the indexing and searching mechanism must be custom.
> So I wanted to ask what you think about implementing such a feature in
> Infinispan. I was thinking about somehow extracting a general API for
> indexing/searching; then e.g. our Lucene search would become one of its
> implementations.
>
> I would be happy to take this on as a contribution, since I find this an
> extremely interesting topic and could also create a diploma thesis out of it.
> So here are some questions:
> 1) Is it doable?
> 2) Do we want this feature?
> 3) How to design it/where to start?
>
> Any input is more than welcome :)
>
> Cheers,
> Jiri
>
> [1] https://drive.google.com/file/d/0B4sztQSfpi3rRlJBQjJHMkR2LXc/view
> [2] https://drive.google.com/file/d/0B4sztQSfpi3rU2p2MV9jRE9iTUk/view
> [3] https://drive.google.com/file/d/0B4sztQSfpi3rZUpld24ydzJNclk/view
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From linjunru at huawei.com Mon Dec 15 07:54:37 2014
From: linjunru at huawei.com (linjunru)
Date: Mon, 15 Dec 2014 12:54:37 +0000
Subject: [infinispan-dev] Performance gap between different value sizes and between key locations
Message-ID: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com>

Hi, all:

I have tested infinispan in distributed mode in terms of latency of the put(k,v) operation. The own_num is 1 and the key we put/write locates in the same node as the put operation occurs (in the table, "1+0" represents this scenario); the results indicate that the latency increases as the size of the value increases. However the increments seem to be a little "unreasonable" to me, because the bandwidth of the memory system is quite huge, and the number of keys (10000) remains the same during the experiment. So here are the questions: which operations inside Infinispan depend strongly on the size of the value, and why do they cost so much as the size increases?

We have also tested infinispan in the scenario in which the key and the put/write(key, value) operation reside on different nodes (we noted it as "0+1"). Compared with "1+0", "0+1" triggers network communications; however, the network latency is much smaller than the performance gap between the two scenarios. Why does this happen? For example, with a 25K bytes ping packet, the RTT is about 0.713ms, while the performance gap between the two scenarios is about 8.4ms - what operations inside Infinispan used the other 7.6ms?

UDP is utilized as the transport protocol, the Infinispan version we used is 7.0 and there are 4 nodes in the cluster; each has 10000 keys, all of them have memory bigger than 32G, and all of them have Xeon CPU E5-2407 x2.
Value size    250B (us)   2.5K (us)   25k (us)   250k (us)   2.5M (us)   25M (us)
1+0           463         726         3 236      26 560      354 454     3 979 830
0+1           1 807       2 829       11 635     87 540      1 035 133   11 653 389

Thanks!

Best Regards,
JR

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141215/bb00b565/attachment-0001.html

From rvansa at redhat.com Mon Dec 15 08:34:08 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Mon, 15 Dec 2014 14:34:08 +0100
Subject: [infinispan-dev] Performance gap between different value sizes and between key locations
In-Reply-To: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com>
References: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com>
Message-ID: <548EE350.8070807@redhat.com>

Hi JR,

thanks for those findings! I was benchmarking the dependency of achieved throughput on entry size in the past, and I found the sweet spot at 8k values (likely because our machines had 9k MTU). Regrettably, we were focusing on throughput rather than on latency.

I think that the increased latency could be on the account of:
a) marshalling - this is the top suspect
b) when receiving the data from network (in JGroups), those are copied from the socket to a buffer
c) general GC activity - with larger data flow you're about to trigger GC sooner

Though, I am quite surprised by such linear scaling, usually RPC latency or waiting for locks is the villain. Unless you set storeAsBinary in the cache configuration, Infinispan treats values as references and there should be no overhead involved.

Could you set up a sampling-mode profiler and check what it reports? All the above are just slightly educated guesses.

Radim

On 12/15/2014 01:54 PM, linjunru wrote:
>
> Hi, all:
>
> I have tested infinispan in distributed mode in terms of latency of the
> put(k,v) operation. The own_num is 1 and the key we put/write locates
> in the same node as the put operation occurs (in the table, "1+0"
> represents this scenario); the results indicate that the latency
> increases as the size of the value increases. However the increments
> seem to be a little "unreasonable" to me, because the bandwidth of the
> memory system is quite huge, and the number of keys (10000) remains
> the same during the experiment. So here are the questions: which
> operations inside Infinispan depend strongly on the size of the
> value, and why do they cost so much as the size increases?
>
> We have also tested infinispan in the scenario in which the key and the
> put/write(key, value) operation reside on different nodes (we noted it
> as "0+1"). Compared with "1+0", "0+1" triggers network communications;
> however, the network latency is much smaller than the
> performance gap between the two scenarios. Why does this happen?
> For example, with a 25K bytes ping packet, the RTT is about 0.713ms,
> while the performance gap between the two scenarios is about 8.4ms - what
> operations inside Infinispan used the other 7.6ms?
>
> UDP is utilized as the transport protocol, the Infinispan version we
> used is 7.0 and there are 4 nodes in the cluster; each has 10000 keys,
> all of them have memory bigger than 32G, and all of them have Xeon CPU
> E5-2407 x2.
>
> Value size    250B (us)   2.5K (us)   25k (us)   250k (us)   2.5M (us)   25M (us)
> 1+0           463         726         3 236      26 560      354 454     3 979 830
> 0+1           1 807       2 829       11 635     87 540      1 035 133   11 653 389
>
> Thanks!
>
> Best Regards,
>
> JR
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss DataGrid QA

From mudokonman at gmail.com Mon Dec 15 09:55:46 2014
From: mudokonman at gmail.com (William Burns)
Date: Mon, 15 Dec 2014 09:55:46 -0500
Subject: [infinispan-dev] My weekly status update 12/15
Message-ID:

Last week I submitted PRs for the following:

ISPN-5078
ISPN-5072
ISPN-4491

I also had to port 3 issues to product (conflict nightmare as usual).

I have also been working on:

ISPN-4445 - It seems the problem is that a cache cannot be operated upon when it is set to LAZY initialization, as everything seems to work fine if it is EAGER.
ISPN-4973 - I haven't been able to figure this one out yet, but will be looking at it closer after 4445.

I should be online in the afternoon EST.

- Will

From dan.berindei at gmail.com Mon Dec 15 10:43:41 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Mon, 15 Dec 2014 17:43:41 +0200
Subject: [infinispan-dev] Performance gap between different value sizes and between key locations
In-Reply-To: <548EE350.8070807@redhat.com>
References: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com> <548EE350.8070807@redhat.com>
Message-ID:

JR, could you share your test, or at least the configuration you used and what key/value types you used?

Like Radim said, in your 1+0 scenario with storeAsBinary disabled and no cache store attached, I would expect the latency to be exactly the same for all value sizes.

Cheers
Dan

On Mon, Dec 15, 2014 at 3:34 PM, Radim Vansa wrote:
> Hi JR,
>
> thanks for those findings! I was benchmarking the dependency of achieved
> throughput on entry size in the past, and I found the sweet spot
> at 8k values (likely because our machines had 9k MTU). Regrettably, we
> were focusing on throughput rather than on latency.
>
> I think that the increased latency could be on the account of:
> a) marshalling - this is the top suspect
> b) when receiving the data from network (in JGroups), those are copied
> from the socket to a buffer
> c) general GC activity - with larger data flow you're about to trigger
> GC sooner
>
> Though, I am quite surprised by such linear scaling, usually RPC latency
> or waiting for locks is the villain. Unless you set storeAsBinary in the
> cache configuration, Infinispan treats values as references
> and there should be no overhead involved.
>
> Could you set up a sampling-mode profiler and check what it reports? All
> the above are just slightly educated guesses.
>
> Radim
>
> On 12/15/2014 01:54 PM, linjunru wrote:
>>
>> Hi, all:
>>
>> I have tested infinispan in distributed mode in terms of latency of the
>> put(k,v) operation. The own_num is 1 and the key we put/write locates
>> in the same node as the put operation occurs (in the table, "1+0"
>> represents this scenario); the results indicate that the latency
>> increases as the size of the value increases. However the increments
>> seem to be a little "unreasonable"
to me, because the bandwidth of the
>> memory system is quite huge, and the number of keys (10000) remains
>> the same during the experiment. So here are the questions: which
>> operations inside Infinispan depend strongly on the size of the
>> value, and why do they cost so much as the size increases?
>>
>> We have also tested infinispan in the scenario in which the key and the
>> put/write(key, value) operation reside on different nodes (we noted it
>> as "0+1"). Compared with "1+0", "0+1" triggers network communications;
>> however, the network latency is much smaller than the
>> performance gap between the two scenarios. Why does this happen?
>> For example, with a 25K bytes ping packet, the RTT is about 0.713ms,
>> while the performance gap between the two scenarios is about 8.4ms - what
>> operations inside Infinispan used the other 7.6ms?
>>
>> UDP is utilized as the transport protocol, the Infinispan version we
>> used is 7.0 and there are 4 nodes in the cluster; each has 10000 keys,
>> all of them have memory bigger than 32G, and all of them have Xeon CPU
>> E5-2407 x2.
>>
>> Value size    250B (us)   2.5K (us)   25k (us)   250k (us)   2.5M (us)   25M (us)
>> 1+0           463         726         3 236      26 560      354 454     3 979 830
>> 0+1           1 807       2 829       11 635     87 540      1 035 133   11 653 389
>>
>> Thanks!
>>
>> Best Regards,
>>
>> JR
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Radim Vansa
> JBoss DataGrid QA
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From ttarrant at redhat.com Mon Dec 15 10:46:55 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Mon, 15 Dec 2014 16:46:55 +0100
Subject: [infinispan-dev] Weekly Infinispan IRC meeting 2014-12-15
Message-ID: <548F026F.8090407@redhat.com>

Hi all,

get the logs at http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2014/infinispan.2014-12-15-15.02.log.html

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From an1310 at hotmail.com Mon Dec 15 15:50:40 2014
From: an1310 at hotmail.com (Erik Salter)
Date: Mon, 15 Dec 2014 15:50:40 -0500
Subject: [infinispan-dev] JIRAs for a 7.0.3 release
Message-ID:

Hi all,

I have a few more that might be candidates for the 7.0.x branch. The critical ones (for me) have to do with state transfer and locks. Some of these are still pending, though.

ISPN-5076
ISPN-5030
ISPN-5000
ISPN-4546

There are a few optimizations. ISPN-5042, in particular, seems like a quick win. For ISPN-5037 and ISPN-5032, I'll probably implement on my own.

Also under the heading of optimizations -- one thing that's a real problem with my cache configuration is the verbose nature of OutdatedTopologyExceptions. They're thrown as RemoteExceptions, which cause stack traces all over the place when a NonTx cache retries its operation. (These actually crashed an indexer on my analytics engine during a performance state transfer test.) Also, if the key ownership hasn't changed, why throw them at all?

ISPN-4695
ISPN-4586

(If I'm on Santa's naughty list, I may have a crack at implementing them.)

There are a few minor cosmetic things that are easily ported.
ISPN-4989 ISPN-5040 ISPN-5052 Thanks all, Erik On 12/4/14, 10:22 AM, "Erik Salter" wrote: >Hi all, > >I was asked to vote on a list of JIRAs for 7.0.3 and send it to the >mailing list. The next iteration of my application is migrating from >5.2.x to 7.0.x, so I'm really focused on hardening and stability, >especially WRT state transfer. Here are the ones I was looking at, mostly >related to state transfer: > >- ISPN-5000 > >- ISPN-4949 (and related ISPN-5030) >- ISPN-4975 >- ISPN-5027 > >Notes on a few others I was looking at: >- ISPN-4444, from the description, looks serious enough to include. I >haven't looked at the commit in-depth; appears to be limited to keys in >L1? >- ISPN-4979 appears to be a substantial change. I would defer to the team >about how risky of a change it is. > >And if ISPN-3561 makes it into a 7.0.3, I'd consider it a personal favor. > >Regards, > >Erik > > >_______________________________________________ >infinispan-dev mailing list >infinispan-dev at lists.jboss.org >https://lists.jboss.org/mailman/listinfo/infinispan-dev From rory.odonnell at oracle.com Tue Dec 16 07:27:51 2014 From: rory.odonnell at oracle.com (Rory O'Donnell) Date: Tue, 16 Dec 2014 12:27:51 +0000 Subject: [infinispan-dev] Early Access builds for JDK 9 b42, JDK 8 b18 & JDK 7 b03 are available on java.net Message-ID: <54902547.2070802@oracle.com> Hi Galder, Now that JDK 9 Early Access build images are modular [1], there is a fresh Early Access build for JDK 9 b42 available on java.net. The summary of changes are listed here In addition, there are new Early Access builds for the ongoing update releases. The Early Access build for JDK 8u40 b18 is available on java.net, with the summary of changes listed here. Finally, the Early Access build for JDK 7u80 b03 is available on java.net, with the summary of changes listed here. As we enter the later phases of development for JDK 7u80 & JDK 8u40, please log any show stoppers as soon as possible. Rgds,Rory [1] http://mreinhold.org/blog/jigsaw-modular-images -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141216/08d94fd4/attachment.html From linjunru at huawei.com Tue Dec 16 07:37:29 2014 From: linjunru at huawei.com (linjunru) Date: Tue, 16 Dec 2014 12:37:29 +0000 Subject: [infinispan-dev] Performance gap between different value sizes and between key loactions In-Reply-To: References: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com> <548EE350.8070807@redhat.com> Message-ID: <92F95A318014D342B5C277097274F2E5076A1F@szxeml555-mbs.china.huawei.com> Dan & Radim, Thanks! I have attempted to disable storeAsBinary with the followed infinispan configurations, but the results don't show much differences. - - The infinispan configurations utilized by the previous experiments is: - - Best Regards, JR > -----Original Message----- > From: infinispan-dev-bounces at lists.jboss.org > [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Dan Berindei > Sent: Monday, December 15, 2014 11:44 PM > To: infinispan -Dev List > Subject: Re: [infinispan-dev] Performance gap between different value sizes and > between key loactions > > JR, could you share your test, or at least the configuration you used and what > key/value types you used? 
>
> Like Radim said, in your 1+0 scenario with storeAsBinary disabled and
> no cache store attached, I would expect the latency to be exactly the
> same for all value sizes.
>
> Cheers
> Dan
>
> On Mon, Dec 15, 2014 at 3:34 PM, Radim Vansa wrote:
> > Hi JR,
> >
> > thanks for those findings! I was benchmarking the dependency of
> > achieved throughput on entry size in the past, and I found the
> > sweet spot at 8k values (likely because our machines had 9k MTU).
> > Regrettably, we were focusing on throughput rather than on latency.
> >
> > I think that the increased latency could be on the account of:
> > a) marshalling - this is the top suspect
> > b) when receiving the data from network (in JGroups), those are copied
> > from the socket to a buffer
> > c) general GC activity - with larger data flow you're about to trigger
> > GC sooner
> >
> > Though, I am quite surprised by such linear scaling, usually RPC
> > latency or waiting for locks is the villain. Unless you set storeAsBinary
> > in the cache configuration, Infinispan treats values as references
> > and there should be no overhead involved.
> >
> > Could you set up a sampling-mode profiler and check what it reports? All
> > the above are just slightly educated guesses.
> >
> > Radim
> >
> > On 12/15/2014 01:54 PM, linjunru wrote:
> >>
> >> Hi, all:
> >>
> >> I have tested infinispan in distributed mode in terms of latency of the
> >> put(k,v) operation. The own_num is 1 and the key we put/write locates
> >> in the same node as the put operation occurs (in the table, "1+0"
> >> represents this scenario); the results indicate that the latency
> >> increases as the size of the value increases. However the increments
> >> seem to be a little "unreasonable" to me, because the bandwidth of
> >> the memory system is quite huge, and the number of keys (10000)
> >> remains the same during the experiment. So here are the questions:
> >> which operations inside Infinispan depend strongly on the size
> >> of the value, and why do they cost so much as the size increases?
> >>
> >> We have also tested infinispan in the scenario in which the key and the
> >> put/write(key, value) operation reside on different nodes (we noted it
> >> as "0+1"). Compared with "1+0", "0+1" triggers network
> >> communications; however, the network latency is much smaller than
> >> the performance gap between the two scenarios. Why does this happen?
> >> For example, with a 25K bytes ping packet, the RTT is about 0.713ms,
> >> while the performance gap between the two scenarios is about 8.4ms - what
> >> operations inside Infinispan used the other 7.6ms?
> >>
> >> UDP is utilized as the transport protocol, the Infinispan version we
> >> used is 7.0 and there are 4 nodes in the cluster; each has 10000
> >> keys, all of them have memory bigger than 32G, and all of them have
> >> Xeon CPU E5-2407 x2.
> >>
> >> Value size    250B (us)   2.5K (us)   25k (us)   250k (us)   2.5M (us)   25M (us)
> >> 1+0           463         726         3 236      26 560      354 454     3 979 830
> >> 0+1           1 807       2 829       11 635     87 540      1 035 133   11 653 389
> >>
> >> Thanks!
> >>
> >> Best Regards,
> >>
> >> JR
> >>
> >>
> >> _______________________________________________
> >> infinispan-dev mailing list
> >> infinispan-dev at lists.jboss.org
> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >
> >
> > --
> > Radim Vansa
> > JBoss DataGrid QA
> >
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jgroups.xml
Type: application/xml
Size: 4393 bytes
Desc: jgroups.xml
Url : http://lists.jboss.org/pipermail/infinispan-dev/attachments/20141216/7d87de4e/attachment-0001.rdf

From emmanuel at hibernate.org Tue Dec 16 05:03:43 2014
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Tue, 16 Dec 2014 11:03:43 +0100
Subject: [infinispan-dev] JIRAs for a 7.0.3 release
In-Reply-To: 
References: 
Message-ID: 

While we are on a wish list for 7.0.3: we have found a regression from
Infinispan 6 on (FineGrained)AtomicMap that is seriously impacting
Hibernate OGM.

ISPN-5088

We should get this one in quickly to avoid people tripping over the
concurrency concern the workaround implies.

Emmanuel

> On 15 Dec 2014, at 21:50, Erik Salter wrote:
>
> Hi all,
>
> I have a few more that might be candidates for the 7.0.x branch. The
> critical ones (for me) have to do with state transfer and locks. Some of
> these are still pending, though.
>
> ISPN-5076
> ISPN-5030
> ISPN-5000
> ISPN-4546
>
> There are a few optimizations. ISPN-5042, in particular, seems like a
> quick win. For ISPN-5037 and ISPN-5032, I'll probably implement on my own.
>
> Also under the heading of optimizations -- one thing that's a real problem
> with my cache configuration is the verbose nature of
> OutdatedTopologyExceptions. They're thrown as RemoteExceptions, which
> cause stack traces all over the place when a NonTx cache retries its
> operation. (These actually crashed an indexer on my analytics engine
> during a performance state transfer test.) Also, if the key ownership
> hasn't changed, why throw them at all?
>
> ISPN-4695
> ISPN-4586
>
> (If I'm on Santa's naughty list, I may have a crack at implementing them.)
>
> There are a few minor cosmetic things that are easily ported.
>
> ISPN-4989
> ISPN-5040
> ISPN-5052
>
>
> Thanks all,
>
> Erik
>
> On 12/4/14, 10:22 AM, "Erik Salter" wrote:
>
>> Hi all,
>>
>> I was asked to vote on a list of JIRAs for 7.0.3 and send it to the
>> mailing list. The next iteration of my application is migrating from
>> 5.2.x to 7.0.x, so I'm really focused on hardening and stability,
>> especially WRT state transfer. Here are the ones I was looking at, mostly
>> related to state transfer:
>>
>> - ISPN-5000
>>
>> - ISPN-4949 (and related ISPN-5030)
>> - ISPN-4975
>> - ISPN-5027
>>
>> Notes on a few others I was looking at:
>> - ISPN-4444, from the description, looks serious enough to include. I
>> haven't looked at the commit in-depth; appears to be limited to keys in
>> L1?
>> - ISPN-4979 appears to be a substantial change. I would defer to the team
>> about how risky of a change it is.
>>
>> And if ISPN-3561 makes it into a 7.0.3, I'd consider it a personal favor.
>>
>> Regards,
>>
>> Erik
>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
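[For readers who haven't used the API mentioned above: a minimal sketch of
how code drives a fine-grained atomic map and where the ISPN-5088
regression bites. It assumes Infinispan 7.x's AtomicMapLookup API and a
transactional cache; the configuration file, cache name and keys are
made-up illustrations, not taken from Hibernate OGM.]

    import java.util.Map;

    import javax.transaction.TransactionManager;

    import org.infinispan.Cache;
    import org.infinispan.atomic.AtomicMapLookup;
    import org.infinispan.manager.DefaultCacheManager;
    import org.infinispan.manager.EmbeddedCacheManager;

    public class FineGrainedAtomicMapSketch {
        public static void main(String[] args) throws Exception {
            // "infinispan.xml" is a placeholder; the cache must be
            // transactional for fine-grained atomic maps to work.
            EmbeddedCacheManager cm = new DefaultCacheManager("infinispan.xml");
            Cache<String, Object> cache = cm.getCache("tx-cache");
            TransactionManager tm =
                cache.getAdvancedCache().getTransactionManager();

            tm.begin();
            Map<String, String> entity =
                AtomicMapLookup.getFineGrainedAtomicMap(cache, "entity#42");
            entity.put("name", "foo");
            entity.remove("obsoleteField");
            tm.commit();

            // ISPN-5088: in 7.0.2 the entry removed above could reappear
            // when the map is read back in a subsequent transaction.
            tm.begin();
            boolean stillThere = AtomicMapLookup
                .getFineGrainedAtomicMap(cache, "entity#42")
                .containsKey("obsoleteField"); // expected: false
            tm.commit();

            cm.stop();
        }
    }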
From linjunru at huawei.com Wed Dec 17 09:17:57 2014
From: linjunru at huawei.com (linjunru)
Date: Wed, 17 Dec 2014 14:17:57 +0000
Subject: [infinispan-dev] Performance gap between different value sizes and between key locations
In-Reply-To: <92F95A318014D342B5C277097274F2E5076A1F@szxeml555-mbs.china.huawei.com>
References: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com>
	<548EE350.8070807@redhat.com>
	<92F95A318014D342B5C277097274F2E5076A1F@szxeml555-mbs.china.huawei.com>
Message-ID: <92F95A318014D342B5C277097274F2E5076B15@szxeml555-mbs.china.huawei.com>

Hi, all:

I tested infinispan again with a 10GbE L2 network. The write latency of
the "1+0" scenario remains almost the same, and the latency of the "0+1"
scenario shows little improvement, especially when the size of the "value"
increases. Do these results indicate that the network has little impact on
Infinispan's write latency in these scenarios?

As Radim mentioned, marshalling and general GC activity may be the other
two reasons for the high latency. I have tried to disable storeAsBinary,
but there is not much difference. I'm not sure whether I configured
Infinispan in the right way, so I list my configuration at the end of the
email, again (^-^); if there is anything wrong with the configuration,
please point it out.

Regarding the GC activity, are there any configurations that can optimize
it?

At last, is there anybody/any company using Infinispan to store media such
as images, videos or big files?

[Scrubbed <infinispan> XML configuration]

PS: only the distributed cache "dist" is utilized and tested.

Thanks!

Best Regards,
JR

> -----Original Message-----
> From: infinispan-dev-bounces at lists.jboss.org
> [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of linjunru
> Sent: Tuesday, December 16, 2014 8:37 PM
> To: infinispan -Dev List
> Subject: Re: [infinispan-dev] Performance gap between different value sizes and
> between key locations
>
> Dan & Radim, Thanks!
>
> I have attempted to disable storeAsBinary with the following Infinispan
> configurations, but the results don't show much difference.
>
> [Scrubbed <infinispan> XML configuration; only the root attributes
> survived: xsi:schemaLocation="urn:infinispan:config:7.0
> http://www.infinispan.org/schemas/infinispan-config-7.0.xsd"
> xmlns="urn:infinispan:config:7.0"]
>
> The Infinispan configuration utilized by the previous experiments was:
>
> [Scrubbed <infinispan> XML configuration, with the same root attributes
> as above]
>
> Best Regards,
> JR
>
>
> > -----Original Message-----
> > From: infinispan-dev-bounces at lists.jboss.org
> > [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Dan
> > Berindei
> > Sent: Monday, December 15, 2014 11:44 PM
> > To: infinispan -Dev List
> > Subject: Re: [infinispan-dev] Performance gap between different value
> > sizes and between key locations
> >
> > JR, could you share your test, or at least the configuration you used
> > and what key/value types you used?
> >
> > Like Radim said, in your 1+0 scenario with storeAsBinary disabled and
> > no cache store attached, I would expect the latency to be exactly the
> > same for all value sizes.
> >
> > Cheers
> > Dan
> >
> >
> > On Mon, Dec 15, 2014 at 3:34 PM, Radim Vansa wrote:
> > > Hi JR,
> > >
> > > thanks for those findings! I was benchmarking how the achieved
> > > throughput depends on the entry size in the past, and I found the
> > > sweet spot at 8k values (likely because our machines had 9k MTU).
> > > Regrettably, we were focusing on throughput rather than on latency.
> > >
> > > I think that the increased latency could be on account of:
> > > a) marshalling - this is the top suspect
> > > b) when receiving data from the network (in JGroups), it is copied
> > > from the socket to a buffer
> > > c) general GC activity - with larger data flow you're about to
> > > trigger GC sooner
> > >
> > > Though, I am quite surprised by such linear scaling, usually RPC
> > > latency or waiting for locks is the villain. Unless you set the cache
> > > configuration to storeAsBinary, Infinispan treats values as
> > > references and there should be no overhead involved.
> > >
> > > Could you set up a sampling-mode profiler and check what it reports?
> > > All the above are just slightly educated guesses.
> > >
> > > Radim
> > >
> > > On 12/15/2014 01:54 PM, linjunru wrote:
> > >>
> > >> Hi, all:
> > >>
> > >> I have tested infinispan in distributed mode in terms of latency of
> > >> the put(k,v) operation. The own_num is 1 and the key we put/write
> > >> locates in the same node as the put operation occurs (in the table,
> > >> "1+0" represents this scenario); the results indicate that the
> > >> latency increases as the size of the value increases. However, the
> > >> increments seem to be a little "unreasonable" to me, because the
> > >> bandwidth of the memory system is quite huge, and the number of
> > >> keys (10000) remains the same during the experiment. So, here are
> > >> the questions: which operations inside Infinispan depend strongly on
> > >> the size of the value, and why do they cost so much as the size
> > >> increases?
> > >>
> > >> We have also tested infinispan in the scenario in which the key and
> > >> the put/write(key,value) operation reside in different nodes (we
> > >> noted it as "0+1"). Compared with "1+0", "0+1" triggers network
> > >> communications; however, the network latency is much smaller
> > >> compared to the performance gap between the two scenarios. Why does
> > >> this situation happen? For example, with a 25K-byte ping packet, the
> > >> RTT is about 0.713ms while the performance gap between the two
> > >> scenarios is about 8.4ms; what operations inside Infinispan used the
> > >> other 7.6ms?
> > >>
> > >> UDP is utilized as the transport protocol, the infinispan version we
> > >> used is 7.0 and there are 4 nodes in the cluster, each has 10000
> > >> keys, all of them have memory bigger than 32G, and all of them have
> > >> Xeon E5-2407 CPUs x2.
> > >>
> > >> Value size    250B(us)   2.5K(us)   25k(us)   250k(us)    2.5M(us)     25M(us)
> > >> 1+0                463        726    3 236     26 560     354 454   3 979 830
> > >> 0+1              1 807      2 829   11 635     87 540   1 035 133  11 653 389
> > >>
> > >> Thanks!
> > >>
> > >> Best Regards,
> > >>
> > >> JR
> > >>
> > >>
> > >> _______________________________________________
> > >> infinispan-dev mailing list
> > >> infinispan-dev at lists.jboss.org
> > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> > >
> > >
> > > --
> > > Radim Vansa
> > > JBoss DataGrid QA
> > >
> > > _______________________________________________
> > > infinispan-dev mailing list
> > > infinispan-dev at lists.jboss.org
> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
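[Since the XML configurations in this thread were lost to the HTML
scrubbing, here is a minimal sketch of the two setups being compared,
written against the Infinispan 7.x programmatic API instead. The
numOwners(1) value mirrors the own_num=1 test above and the cache name
"dist" follows the PS in the mail; this is an illustration, not the
poster's actual configuration.]

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    public class DistConfigSketch {
        // Builds the "dist" cache configuration; pass true/false to compare
        // the storeAsBinary and plain (store-by-reference) variants.
        public static Configuration dist(boolean storeAsBinary) {
            ConfigurationBuilder builder = new ConfigurationBuilder();
            builder.clustering()
                   .cacheMode(CacheMode.DIST_SYNC)
                   .hash().numOwners(1);     // own_num = 1, as in the experiments
            if (storeAsBinary) {
                // Keep the marshalled form in the data container instead of
                // the object reference (one marshalling call happens either
                // way, as Radim notes in the reply below).
                builder.storeAsBinary().enable();
            }
            return builder.build();
        }
    }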
From rvansa at redhat.com Wed Dec 17 10:17:58 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Wed, 17 Dec 2014 16:17:58 +0100
Subject: [infinispan-dev] Performance gap between different value sizes and between key locations
In-Reply-To: <92F95A318014D342B5C277097274F2E5076B15@szxeml555-mbs.china.huawei.com>
References: <92F95A318014D342B5C277097274F2E5076938@szxeml555-mbs.china.huawei.com>
	<548EE350.8070807@redhat.com>
	<92F95A318014D342B5C277097274F2E5076A1F@szxeml555-mbs.china.huawei.com>
	<92F95A318014D342B5C277097274F2E5076B15@szxeml555-mbs.china.huawei.com>
Message-ID: <54919EA6.1080309@redhat.com>

I think that you're configuring the storeAsBinary correctly - however, one
marshalling call is required anyway; this value just means that you store
the marshalled form in the cluster (more suitable when reading the value
remotely, but it requires unmarshalling every time you read it locally).

Could you try to use a profiler? Especially with larger values (and
response times in the order of hundreds of milliseconds) it has quite a
good chance to hit the hot spot. Just stick to sampling; instrumentation
would skew the results too much (at least I was never able to get any
reasonable readings when using instrumentation).

Radim

On 12/17/2014 03:17 PM, linjunru wrote:
> Hi, all:
>
> I tested infinispan again with a 10GbE L2 network. The write latency of
> the "1+0" scenario remains almost the same, and the latency of the "0+1"
> scenario shows little improvement, especially when the size of the
> "value" increases. Do these results indicate that the network has little
> impact on Infinispan's write latency in these scenarios?
>
> As Radim mentioned, marshalling and general GC activity may be the other
> two reasons for the high latency. I have tried to disable storeAsBinary,
> but there is not much difference. I'm not sure whether I configured
> Infinispan in the right way, so I list my configuration at the end of
> the email, again (^-^); if there is anything wrong with the
> configuration, please point it out.
>
> Regarding the GC activity, are there any configurations that can
> optimize it?
> At last, is there anybody/any company using Infinispan to store media
> such as images, videos or big files?
>
> [Scrubbed <infinispan> XML configuration]
>
> PS: only the distributed cache "dist" is utilized and tested.
>
> Thanks!
>
> Best Regards,
> JR
>
>
>> -----Original Message-----
>> From: infinispan-dev-bounces at lists.jboss.org
>> [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of linjunru
>> Sent: Tuesday, December 16, 2014 8:37 PM
>> To: infinispan -Dev List
>> Subject: Re: [infinispan-dev] Performance gap between different value sizes and
>> between key locations
>>
>> Dan & Radim, Thanks!
>>
>> I have attempted to disable storeAsBinary with the following Infinispan
>> configurations, but the results don't show much difference.
>>
>> [Scrubbed <infinispan> XML configuration; only the root attributes
>> survived: xsi:schemaLocation="urn:infinispan:config:7.0
>> http://www.infinispan.org/schemas/infinispan-config-7.0.xsd"
>> xmlns="urn:infinispan:config:7.0"]
>>
>> The Infinispan configuration utilized by the previous experiments was:
>>
>> [Scrubbed <infinispan> XML configuration, with the same root attributes
>> as above]
>>
>> Best Regards,
>> JR
>>
>>
>>> -----Original Message-----
>>> From: infinispan-dev-bounces at lists.jboss.org
>>> [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Dan
>>> Berindei
>>> Sent: Monday, December 15, 2014 11:44 PM
>>> To: infinispan -Dev List
>>> Subject: Re: [infinispan-dev] Performance gap between different value
>>> sizes and between key locations
>>>
>>> JR, could you share your test, or at least the configuration you used
>>> and what key/value types you used?
>>>
>>> Like Radim said, in your 1+0 scenario with storeAsBinary disabled and
>>> no cache store attached, I would expect the latency to be exactly the
>>> same for all value sizes.
>>>
>>> Cheers
>>> Dan
>>>
>>>
>>> On Mon, Dec 15, 2014 at 3:34 PM, Radim Vansa wrote:
>>>> Hi JR,
>>>>
>>>> thanks for those findings! I was benchmarking how the achieved
>>>> throughput depends on the entry size in the past, and I found the
>>>> sweet spot at 8k values (likely because our machines had 9k MTU).
>>>> Regrettably, we were focusing on throughput rather than on latency.
>>>>
>>>> I think that the increased latency could be on account of:
>>>> a) marshalling - this is the top suspect
>>>> b) when receiving data from the network (in JGroups), it is copied
>>>> from the socket to a buffer
>>>> c) general GC activity - with larger data flow you're about to
>>>> trigger GC sooner
>>>>
>>>> Though, I am quite surprised by such linear scaling, usually RPC
>>>> latency or waiting for locks is the villain. Unless you set the cache
>>>> configuration to storeAsBinary, Infinispan treats values as
>>>> references and there should be no overhead involved.
>>>>
>>>> Could you set up a sampling-mode profiler and check what it reports?
>>>> All the above are just slightly educated guesses.
>>>>
>>>> Radim
>>>>
>>>> On 12/15/2014 01:54 PM, linjunru wrote:
>>>>> Hi, all:
>>>>>
>>>>> I have tested infinispan in distributed mode in terms of latency of
>>>>> the put(k,v) operation. The own_num is 1 and the key we put/write
>>>>> locates in the same node as the put operation occurs (in the table,
>>>>> "1+0" represents this scenario); the results indicate that the
>>>>> latency increases as the size of the value increases. However, the
>>>>> increments seem to be a little "unreasonable"
to me, because the
>>>>> bandwidth of the memory system is quite huge, and the number of
>>>>> keys (10000) remains the same during the experiment. So, here are
>>>>> the questions: which operations inside Infinispan depend strongly
>>>>> on the size of the value, and why do they cost so much as the size
>>>>> increases?
>>>>>
>>>>> We have also tested infinispan in the scenario in which the key and
>>>>> the put/write(key,value) operation reside in different nodes (we
>>>>> noted it as "0+1"). Compared with "1+0", "0+1" triggers network
>>>>> communications; however, the network latency is much smaller
>>>>> compared to the performance gap between the two scenarios. Why does
>>>>> this situation happen? For example, with a 25K-byte ping packet,
>>>>> the RTT is about 0.713ms while the performance gap between the two
>>>>> scenarios is about 8.4ms; what operations inside Infinispan used
>>>>> the other 7.6ms?
>>>>>
>>>>> UDP is utilized as the transport protocol, the infinispan version
>>>>> we used is 7.0 and there are 4 nodes in the cluster, each has 10000
>>>>> keys, all of them have memory bigger than 32G, and all of them have
>>>>> Xeon E5-2407 CPUs x2.
>>>>>
>>>>> Value size    250B(us)   2.5K(us)   25k(us)   250k(us)    2.5M(us)     25M(us)
>>>>> 1+0                463        726    3 236     26 560     354 454   3 979 830
>>>>> 0+1              1 807      2 829   11 635     87 540   1 035 133  11 653 389
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> JR
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> --
>>>> Radim Vansa
>>>> JBoss DataGrid QA
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss DataGrid QA

From rvansa at redhat.com Wed Dec 17 12:58:40 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Wed, 17 Dec 2014 18:58:40 +0100
Subject: [infinispan-dev] Indexing deadlock (solution suggestion)
Message-ID: <5491C450.80208@redhat.com>

Hi,

what I was suggesting in the call in order to get rid of the indexing
deadlock:

Currently we're doing this:

1. thread on primary owner executes the write and sends the indexing
request (synchronous RPC) to the index master, waits for the response
2. remote/OOB thread on indexing master enqueues the indexing request and
waits
3. indexing thread (on indexing master) retrieves the request, processes
it and wakes up the waiting remote/OOB thread
4. remote/OOB thread sends the RPC response
5. primary owner receives the RPC response (in OOB thread, inside JGroups)
and wakes up the thread sending the RPC

What I suggest is that:

1. thread on primary owner executes the write and sends the indexing
request as an asynchronous RPC (single message) to the index master, and
waits on a custom synchronization primitive
2. remote/OOB thread on indexing master enqueues the indexing request and
returns to the thread pool
3. indexing thread (on indexing master) retrieves the request, processes
it and sends an asynchronous RPC (again a single message) back to the
primary owner
4. primary owner (in OOB thread) receives the message and wakes up the
thread waiting on the custom synchronization primitive (in Infinispan)

My 2c

Radim

--
Radim Vansa
JBoss DataGrid QA
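[What the "custom synchronization primitive" in step 1 could look like: a
sketch that correlates the asynchronous request with the asynchronous
completion message, so that no remote/OOB thread stays blocked while the
indexing thread works. This is not Infinispan code; the class and method
names are hypothetical.]

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    public final class IndexRequestRegistry {
        private final AtomicLong sequence = new AtomicLong();
        private final ConcurrentHashMap<Long, CompletableFuture<Void>> pending =
            new ConcurrentHashMap<>();

        public long nextId() {
            return sequence.incrementAndGet();
        }

        // Step 1: the originator registers an id, ships it inside the async
        // indexing request, and parks on the returned future instead of on
        // a synchronous RPC.
        public CompletableFuture<Void> register(long id) {
            CompletableFuture<Void> future = new CompletableFuture<>();
            pending.put(id, future);
            return future;
        }

        // Step 4: the OOB thread that receives the async "done" message from
        // the index master completes the future and returns to the pool
        // immediately.
        public void complete(long id) {
            CompletableFuture<Void> future = pending.remove(id);
            if (future != null) {
                future.complete(null);
            }
        }
    }

The point of the design is that each leg of the exchange is a single
one-way message, so neither side ever ties up an OOB thread waiting for
the indexing work to finish.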
From galder at redhat.com Thu Dec 18 11:05:33 2014
From: galder at redhat.com (=?iso-8859-1?Q?Galder_Zamarre=F1o?=)
Date: Thu, 18 Dec 2014 17:05:33 +0100
Subject: [infinispan-dev] Infinispan 7.0.2.Final is a certified JSR-107 1.0 implementation
Message-ID: 

Hi all,

The infinispan-jcache module in Infinispan 7.0.2.Final has been certified
as a compatible JSR-107 1.0 specification implementation.

Find out more about it in
http://blog.infinispan.org/2014/12/infinispan-702final-is-certified-jsr.html

Cheers,
--
Galder Zamarreño
galder at redhat.com
twitter.com/galderz
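[A minimal, provider-agnostic JSR-107 usage sketch: with the javax.cache
API and infinispan-jcache on the classpath, the standard Caching bootstrap
below resolves Infinispan's CachingProvider via the JCache SPI. The cache
name and types are arbitrary examples.]

    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;
    import javax.cache.spi.CachingProvider;

    public class JCacheSketch {
        public static void main(String[] args) {
            CachingProvider provider = Caching.getCachingProvider();
            // CacheManager is Closeable in JSR-107 1.0, so
            // try-with-resources works.
            try (CacheManager manager = provider.getCacheManager()) {
                Cache<String, String> cache = manager.createCache(
                    "example", new MutableConfiguration<String, String>());
                cache.put("hello", "world");
                System.out.println(cache.get("hello")); // prints "world"
            }
        }
    }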
From ttarrant at redhat.com Fri Dec 19 08:52:53 2014
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Fri, 19 Dec 2014 14:52:53 +0100
Subject: [infinispan-dev] JIRAs for a 7.0.3 release
In-Reply-To: 
References: 
Message-ID: <54942DB5.80001@redhat.com>

Hi all,

this is the current status of the 7.0.x branch:

ISPN-5027 OutOfMemoryError in entry retriever when state transfer chunk size is Integer.MAX_VALUE
ISPN-5053 Modules inheriting directly from the BOM use Java 1.5
ISPN-5008 7.0.x missing cachestore-remote and extended-statistics modules
ISPN-5007 Enhance the distribution script to detect missing artifacts
ISPN-5052 Lock timeout details prints out null for local locks
ISPN-4989 infinispan-transport thread name is undefined
ISPN-5030 NPE during node rebalance after a leave
ISPN-4975 Cross site state transfer - status of push gets stuck at "SENDING" after being cancelled
ISPN-4444 After state transfer, a node is able to read keys it no longer owns from its data container
ISPN-4979 CacheStatusResponse map uses too much memory
ISPN-4949 Split brain: inconsistent data after merge
ISPN-3561 A joining cache should receive the rebalancedEnabled flag from the coordinator.
ISPN-5000 Cleanup rebalance confirmation collector when node is not coord
ISPN-5040 Upgrade to JGroups 3.6.1.Final
ISPN-5011 CacheManager not stopping when search factory not initialized
ISPN-5048 Relocate some imported packages in uberjars and remove any javax.* classes
ISPN-4948 Package embedded CLI as uberjar
ISPN-5017 Include the CLI uberjar in the distribution zip
ISPN-5029 Infinispan 7.0.2 not fully backwards compatible with 6.0.x
ISPN-5026 The Infinispan 7.0.2's GUI demo cannot be properly launched in Windows 7
ISPN-5018 Add test for protobuf marshalling of primitives
ISPN-5006 Upgrade to Hibernate Search 5.0.0.Beta3
ISPN-5005 Upgrade to Hibernate HQL Parser 1.1.0.Beta1
ISPN-5032 Create dedicated GetCacheEntryCommand to simplify code and save memory on GetKeyValueCommand

The remaining wishlist items are:

ISPN-4546 Possible stale lock when the primary owner leaves during rebalance
ISPN-4586 Too many OutdatedTopologyExceptions in non-transactional caches
ISPN-5037 Do not replicate value from backup owner
ISPN-5042 Remote gets caused by writes could be replicated only to the primary owner
ISPN-5076 Pessimistic transactions can lose their locks when the primary owner changes
ISPN-5088 Deleted entries from (FineGrained)AtomicMap reappear in subsequent transaction

which I have tagged with the label "7.0".
It seems like ISPN-5088 has a chance to be fixed quickly, and since I'd
like to do a 7.0.3 sooner rather than later we'll wait for that and then
release. Anything else can be postponed to 7.0.4 or 7.1.0.Final (which is
due by Jan 31).

Tristan


On 16/12/2014 11:03, Emmanuel Bernard wrote:
> While we are on a wish list for 7.0.3: we have found a regression from
> Infinispan 6 on (FineGrained)AtomicMap that is seriously impacting
> Hibernate OGM.
>
> ISPN-5088
>
> We should get this one in quickly to avoid people tripping over the
> concurrency concern the workaround implies.
>
> Emmanuel
>
>> On 15 Dec 2014, at 21:50, Erik Salter wrote:
>>
>> Hi all,
>>
>> I have a few more that might be candidates for the 7.0.x branch. The
>> critical ones (for me) have to do with state transfer and locks. Some
>> of these are still pending, though.
>>
>> ISPN-5076
>> ISPN-5030
>> ISPN-5000
>> ISPN-4546
>>
>> There are a few optimizations. ISPN-5042, in particular, seems like a
>> quick win. For ISPN-5037 and ISPN-5032, I'll probably implement on my
>> own.
>>
>> Also under the heading of optimizations -- one thing that's a real
>> problem with my cache configuration is the verbose nature of
>> OutdatedTopologyExceptions. They're thrown as RemoteExceptions, which
>> cause stack traces all over the place when a NonTx cache retries its
>> operation. (These actually crashed an indexer on my analytics engine
>> during a performance state transfer test.) Also, if the key ownership
>> hasn't changed, why throw them at all?
>>
>> ISPN-4695
>> ISPN-4586
>>
>> (If I'm on Santa's naughty list, I may have a crack at implementing
>> them.)
>>
>> There are a few minor cosmetic things that are easily ported.
>>
>> ISPN-4989
>> ISPN-5040
>> ISPN-5052
>>
>>
>> Thanks all,
>>
>> Erik
>>
>> On 12/4/14, 10:22 AM, "Erik Salter" wrote:
>>
>>> Hi all,
>>>
>>> I was asked to vote on a list of JIRAs for 7.0.3 and send it to the
>>> mailing list. The next iteration of my application is migrating from
>>> 5.2.x to 7.0.x, so I'm really focused on hardening and stability,
>>> especially WRT state transfer.
>>> Here are the ones I was looking at, mostly related to state transfer:
>>>
>>> - ISPN-5000
>>>
>>> - ISPN-4949 (and related ISPN-5030)
>>> - ISPN-4975
>>> - ISPN-5027
>>>
>>> Notes on a few others I was looking at:
>>> - ISPN-4444, from the description, looks serious enough to include. I
>>> haven't looked at the commit in-depth; appears to be limited to keys
>>> in L1?
>>> - ISPN-4979 appears to be a substantial change. I would defer to the
>>> team about how risky of a change it is.
>>>
>>> And if ISPN-3561 makes it into a 7.0.3, I'd consider it a personal
>>> favor.
>>>
>>> Regards,
>>>
>>> Erik
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From sanne at infinispan.org Fri Dec 19 09:55:49 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Fri, 19 Dec 2014 14:55:49 +0000
Subject: [infinispan-dev] JIRAs for a 7.0.3 release
In-Reply-To: <54942DB5.80001@redhat.com>
References: <54942DB5.80001@redhat.com>
Message-ID: 

It's possible I could get you an upgrade to Hibernate Search 5.0.0.Final
& related HQL Parser 1.1.0.Final soon, but I'm currently in trouble with
regressions of the Infinispan testsuite.

@Pedro, that draft commit you had sent me worked wonders. You think you
could finish that and get it integrated?
Until that's solved I'm not going to bother spending time on Infinispan ;-)

On 19 December 2014 at 13:52, Tristan Tarrant wrote:
> Hi all,
>
> this is the current status of the 7.0.x branch:
>
> ISPN-5027 OutOfMemoryError in entry retriever when state transfer chunk size is Integer.MAX_VALUE
> ISPN-5053 Modules inheriting directly from the BOM use Java 1.5
> ISPN-5008 7.0.x missing cachestore-remote and extended-statistics modules
> ISPN-5007 Enhance the distribution script to detect missing artifacts
> ISPN-5052 Lock timeout details prints out null for local locks
> ISPN-4989 infinispan-transport thread name is undefined
> ISPN-5030 NPE during node rebalance after a leave
> ISPN-4975 Cross site state transfer - status of push gets stuck at "SENDING" after being cancelled
> ISPN-4444 After state transfer, a node is able to read keys it no longer owns from its data container
> ISPN-4979 CacheStatusResponse map uses too much memory
> ISPN-4949 Split brain: inconsistent data after merge
> ISPN-3561 A joining cache should receive the rebalancedEnabled flag from the coordinator.
> ISPN-5000 Cleanup rebalance confirmation collector when node is not coord
> ISPN-5040 Upgrade to JGroups 3.6.1.Final
> ISPN-5011 CacheManager not stopping when search factory not initialized
> ISPN-5048 Relocate some imported packages in uberjars and remove any javax.* classes
> ISPN-4948 Package embedded CLI as uberjar
> ISPN-5017 Include the CLI uberjar in the distribution zip
> ISPN-5029 Infinispan 7.0.2 not fully backwards compatible with 6.0.x
> ISPN-5026 The Infinispan 7.0.2's GUI demo cannot be properly launched in Windows 7
> ISPN-5018 Add test for protobuf marshalling of primitives
> ISPN-5006 Upgrade to Hibernate Search 5.0.0.Beta3
> ISPN-5005 Upgrade to Hibernate HQL Parser 1.1.0.Beta1
> ISPN-5032 Create dedicated GetCacheEntryCommand to simplify code and save memory on GetKeyValueCommand
>
> The remaining wishlist items are:
>
> ISPN-4546 Possible stale lock when the primary owner leaves during rebalance
> ISPN-4586 Too many OutdatedTopologyExceptions in non-transactional caches
> ISPN-5037 Do not replicate value from backup owner
> ISPN-5042 Remote gets caused by writes could be replicated only to the primary owner
> ISPN-5076 Pessimistic transactions can lose their locks when the primary owner changes
> ISPN-5088 Deleted entries from (FineGrained)AtomicMap reappear in subsequent transaction
>
> which I have tagged with the label "7.0".
> It seems like ISPN-5088 has a chance to be fixed quickly, and since I'd
> like to do a 7.0.3 sooner rather than later we'll wait for that and then
> release. Anything else can be postponed to 7.0.4 or 7.1.0.Final (which
> is due by Jan 31).
>
> Tristan
>
>
> On 16/12/2014 11:03, Emmanuel Bernard wrote:
>> While we are on a wish list for 7.0.3: we have found a regression from
>> Infinispan 6 on (FineGrained)AtomicMap that is seriously impacting
>> Hibernate OGM.
>>
>> ISPN-5088
>>
>> We should get this one in quickly to avoid people tripping over the
>> concurrency concern the workaround implies.
>>
>> Emmanuel
>>
>>> On 15 Dec 2014, at 21:50, Erik Salter wrote:
>>>
>>> Hi all,
>>>
>>> I have a few more that might be candidates for the 7.0.x branch. The
>>> critical ones (for me) have to do with state transfer and locks. Some
>>> of these are still pending, though.
>>>
>>> ISPN-5076
>>> ISPN-5030
>>> ISPN-5000
>>> ISPN-4546
>>>
>>> There are a few optimizations. ISPN-5042, in particular, seems like a
>>> quick win. For ISPN-5037 and ISPN-5032, I'll probably implement on my
>>> own.
>>>
>>> Also under the heading of optimizations -- one thing that's a real
>>> problem with my cache configuration is the verbose nature of
>>> OutdatedTopologyExceptions. They're thrown as RemoteExceptions, which
>>> cause stack traces all over the place when a NonTx cache retries its
>>> operation. (These actually crashed an indexer on my analytics engine
>>> during a performance state transfer test.) Also, if the key ownership
>>> hasn't changed, why throw them at all?
>>>
>>> ISPN-4695
>>> ISPN-4586
>>>
>>> (If I'm on Santa's naughty list, I may have a crack at implementing
>>> them.)
>>>
>>> There are a few minor cosmetic things that are easily ported.
>>>
>>> ISPN-4989
>>> ISPN-5040
>>> ISPN-5052
>>>
>>>
>>> Thanks all,
>>>
>>> Erik
>>>
>>> On 12/4/14, 10:22 AM, "Erik Salter" wrote:
>>>
>>>> Hi all,
>>>>
>>>> I was asked to vote on a list of JIRAs for 7.0.3 and send it to the
>>>> mailing list. The next iteration of my application is migrating from
>>>> 5.2.x to 7.0.x, so I'm really focused on hardening and stability,
>>>> especially WRT state transfer.
>>>> Here are the ones I was looking at, mostly related to state transfer:
>>>>
>>>> - ISPN-5000
>>>>
>>>> - ISPN-4949 (and related ISPN-5030)
>>>> - ISPN-4975
>>>> - ISPN-5027
>>>>
>>>> Notes on a few others I was looking at:
>>>> - ISPN-4444, from the description, looks serious enough to include. I
>>>> haven't looked at the commit in-depth; appears to be limited to keys
>>>> in L1?
>>>> - ISPN-4979 appears to be a substantial change. I would defer to the
>>>> team about how risky of a change it is.
>>>>
>>>> And if ISPN-3561 makes it into a 7.0.3, I'd consider it a personal
>>>> favor.
>>>>
>>>> Regards,
>>>>
>>>> Erik
>>>>
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From pedro at infinispan.org Fri Dec 19 10:04:29 2014
From: pedro at infinispan.org (Pedro Ruivo)
Date: Fri, 19 Dec 2014 15:04:29 +0000
Subject: [infinispan-dev] JIRAs for a 7.0.3 release
In-Reply-To: 
References: <54942DB5.80001@redhat.com>
Message-ID: <54943E7D.6030106@infinispan.org>

On 12/19/2014 02:55 PM, Sanne Grinovero wrote:
> It's possible I could get you an upgrade to Hibernate Search
> 5.0.0.Final & related HQL Parser 1.1.0.Final soon, but I'm currently
> in trouble with regressions of the Infinispan testsuite.
>
> @Pedro, that draft commit you had sent me worked wonders. You think you
> could finish that and get it integrated?
> Until that's solved I'm not going to bother spending time on Infinispan ;-)

Which commit? The performance regression because of the remote refactor?
That commit was already integrated :)

If you found another one, let me know :P

Pedro

>
> On 19 December 2014 at 13:52, Tristan Tarrant wrote:
>> Hi all,
>>
>> this is the current status of the 7.0.x branch:
>>
>> ISPN-5027 OutOfMemoryError in entry retriever when state transfer chunk size is Integer.MAX_VALUE
>> ISPN-5053 Modules inheriting directly from the BOM use Java 1.5
>> ISPN-5008 7.0.x missing cachestore-remote and extended-statistics modules
>> ISPN-5007 Enhance the distribution script to detect missing artifacts
>> ISPN-5052 Lock timeout details prints out null for local locks
>> ISPN-4989 infinispan-transport thread name is undefined
>> ISPN-5030 NPE during node rebalance after a leave
>> ISPN-4975 Cross site state transfer - status of push gets stuck at "SENDING" after being cancelled
>> ISPN-4444 After state transfer, a node is able to read keys it no longer owns from its data container
>> ISPN-4979 CacheStatusResponse map uses too much memory
>> ISPN-4949 Split brain: inconsistent data after merge
>> ISPN-3561 A joining cache should receive the rebalancedEnabled flag from the coordinator.
>> ISPN-5000 Cleanup rebalance confirmation collector when node is not coord
>> ISPN-5040 Upgrade to JGroups 3.6.1.Final
>> ISPN-5011 CacheManager not stopping when search factory not initialized
>> ISPN-5048 Relocate some imported packages in uberjars and remove any javax.* classes
>> ISPN-4948 Package embedded CLI as uberjar
>> ISPN-5017 Include the CLI uberjar in the distribution zip
>> ISPN-5029 Infinispan 7.0.2 not fully backwards compatible with 6.0.x
>> ISPN-5026 The Infinispan 7.0.2's GUI demo cannot be properly launched in Windows 7
>> ISPN-5018 Add test for protobuf marshalling of primitives
>> ISPN-5006 Upgrade to Hibernate Search 5.0.0.Beta3
>> ISPN-5005 Upgrade to Hibernate HQL Parser 1.1.0.Beta1
>> ISPN-5032 Create dedicated GetCacheEntryCommand to simplify code and save memory on GetKeyValueCommand
>>
>> The remaining wishlist items are:
>>
>> ISPN-4546 Possible stale lock when the primary owner leaves during rebalance
>> ISPN-4586 Too many OutdatedTopologyExceptions in non-transactional caches
>> ISPN-5037 Do not replicate value from backup owner
>> ISPN-5042 Remote gets caused by writes could be replicated only to the primary owner
>> ISPN-5076 Pessimistic transactions can lose their locks when the primary owner changes
>> ISPN-5088 Deleted entries from (FineGrained)AtomicMap reappear in subsequent transaction
>>
>> which I have tagged with the label "7.0".
>> It seems like ISPN-5088 has a chance to be fixed quickly, and since I'd
>> like to do a 7.0.3 sooner rather than later we'll wait for that and then
>> release. Anything else can be postponed to 7.0.4 or 7.1.0.Final (which
>> is due by Jan 31).
>>
>> Tristan
>>
>>
>> On 16/12/2014 11:03, Emmanuel Bernard wrote:
>>> While we are on a wish list for 7.0.3: we have found a regression from
>>> Infinispan 6 on (FineGrained)AtomicMap that is seriously impacting
>>> Hibernate OGM.
>>>
>>> ISPN-5088
>>>
>>> We should get this one in quickly to avoid people tripping over the
>>> concurrency concern the workaround implies.
>>>
>>> Emmanuel
>>>
>>>> On 15 Dec 2014, at 21:50, Erik Salter wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have a few more that might be candidates for the 7.0.x branch. The
>>>> critical ones (for me) have to do with state transfer and locks. Some
>>>> of these are still pending, though.
>>>>
>>>> ISPN-5076
>>>> ISPN-5030
>>>> ISPN-5000
>>>> ISPN-4546
>>>>
>>>> There are a few optimizations. ISPN-5042, in particular, seems like a
>>>> quick win. For ISPN-5037 and ISPN-5032, I'll probably implement on my
>>>> own.
>>>>
>>>> Also under the heading of optimizations -- one thing that's a real
>>>> problem with my cache configuration is the verbose nature of
>>>> OutdatedTopologyExceptions. They're thrown as RemoteExceptions, which
>>>> cause stack traces all over the place when a NonTx cache retries its
>>>> operation. (These actually crashed an indexer on my analytics engine
>>>> during a performance state transfer test.) Also, if the key ownership
>>>> hasn't changed, why throw them at all?
>>>>
>>>> ISPN-4695
>>>> ISPN-4586
>>>>
>>>> (If I'm on Santa's naughty list, I may have a crack at implementing
>>>> them.)
>>>>
>>>> There are a few minor cosmetic things that are easily ported.
>>>>
>>>> ISPN-4989
>>>> ISPN-5040
>>>> ISPN-5052
>>>>
>>>>
>>>> Thanks all,
>>>>
>>>> Erik
>>>>
>>>> On 12/4/14, 10:22 AM, "Erik Salter" wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I was asked to vote on a list of JIRAs for 7.0.3 and send it to the
>>>>> mailing list.
>>>>> The next iteration of my application is migrating from 5.2.x to
>>>>> 7.0.x, so I'm really focused on hardening and stability, especially
>>>>> WRT state transfer. Here are the ones I was looking at, mostly
>>>>> related to state transfer:
>>>>>
>>>>> - ISPN-5000
>>>>>
>>>>> - ISPN-4949 (and related ISPN-5030)
>>>>> - ISPN-4975
>>>>> - ISPN-5027
>>>>>
>>>>> Notes on a few others I was looking at:
>>>>> - ISPN-4444, from the description, looks serious enough to include. I
>>>>> haven't looked at the commit in-depth; appears to be limited to keys
>>>>> in L1?
>>>>> - ISPN-4979 appears to be a substantial change. I would defer to the
>>>>> team about how risky of a change it is.
>>>>>
>>>>> And if ISPN-3561 makes it into a 7.0.3, I'd consider it a personal
>>>>> favor.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Erik
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>>
>>
>>
>> --
>> Tristan Tarrant
>> Infinispan Lead
>> JBoss, a division of Red Hat
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From isavin at redhat.com Tue Dec 23 04:05:17 2014
From: isavin at redhat.com (Ion Savin)
Date: Tue, 23 Dec 2014 11:05:17 +0200
Subject: [infinispan-dev] my status
Message-ID: <5499304D.3090601@redhat.com>

Hi all,

My status for last week:
* continued to work on JSR107 remote
* a few more build/ide-related fixes:
https://github.com/infinispan/infinispan/commit/106aa0fec84b9719c4036a10d1007f1f6a1a25f3
https://github.com/infinispan/infinispan/commit/5f43619bb3865d90300296b40e287a26f987fdca
https://github.com/infinispan/infinispan/pull/3168

This week:
* 22/23 - JSR107/product work
* 24-26 - PTO

From rvansa at redhat.com Tue Dec 23 08:16:13 2014
From: rvansa at redhat.com (Radim Vansa)
Date: Tue, 23 Dec 2014 14:16:13 +0100
Subject: [infinispan-dev] Dan's wiki page on cache consistency
Message-ID: <54996B1D.2050402@redhat.com>

Hi guys,

since not everyone is watching ISPN-5016, I wanted to spread the audience
for $SUBJECT [1]. A few details need more attention yet, but this is
really the most comprehensive information on Infinispan Cache API
guarantees and I'd recommend anyone to spend the time to read this
carefully (although it's not a one-coffee article).

Thanks a lot, Dan, for compiling this.

Radim

[1]
https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan

--
Radim Vansa
JBoss DataGrid QA

From bban at redhat.com Tue Dec 23 08:38:57 2014
From: bban at redhat.com (Bela Ban)
Date: Tue, 23 Dec 2014 14:38:57 +0100
Subject: [infinispan-dev] Dan's wiki page on cache consistency
In-Reply-To: <54996B1D.2050402@redhat.com>
References: <54996B1D.2050402@redhat.com>
Message-ID: <54997071.5030101@redhat.com>

Added the link to the Berlin agenda.
An overview by Dan on this would be nice, so that everyone's on the same
page. Please read the document before the meeting.
Cheers,

On 23/12/14 14:16, Radim Vansa wrote:
> Hi guys,
>
> since not everyone is watching ISPN-5016, I wanted to spread the
> audience for $SUBJECT [1]. A few details need more attention yet, but
> this is really the most comprehensive information on Infinispan Cache
> API guarantees and I'd recommend anyone to spend the time to read this
> carefully (although it's not a one-coffee article).
>
> Thanks a lot, Dan, for compiling this.
>
> Radim
>
> [1]
> https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan
>

--
Bela Ban, JGroups lead (http://www.jgroups.org)

From sanne at infinispan.org Tue Dec 23 14:39:09 2014
From: sanne at infinispan.org (Sanne Grinovero)
Date: Tue, 23 Dec 2014 19:39:09 +0000
Subject: [infinispan-dev] Dan's wiki page on cache consistency
In-Reply-To: <54997071.5030101@redhat.com>
References: <54996B1D.2050402@redhat.com>
	<54997071.5030101@redhat.com>
Message-ID: 

That's great, thanks a lot to Dan for writing it.

But it's a very long document, and how would you all prefer to handle
feedback, challenges and improvement proposals?

While this is probably the most faithful description of the (current)
implementation - which is extremely helpful to users of the current
version - I'd love to also start a debate on how we should improve it,
so a clear distinction between:
- A: cases which match what Infinispan aims to be, and therefore people
  should either live with it or look for a different solution
- B: cases for which we agree they are a current limitation which we
  have hopes to eventually improve on

To be clear, I think some of the described semantics are unacceptable
in practice, but I'm unsure what would be the best platform to have
such a debate; I'm afraid that by email we'd be focusing on the first
raised points while we should identify priorities, and since this is a
long document there are going to be lots of items to discuss.

It might help a lot to classify A vs B cases to try and define a
mission for the project. A selfish example of such a mission sounds
like "be an efficient and reliable way to store Lucene indexes for
Hibernate Search", or "aim to become the most efficient cache strategy
for Hibernate".
These are just examples, but they would be helpful to define "current
known bug" vs "unavoidable limitation".
I realize that my very own view of "acceptable limitations" vs.
"unacceptable" is defined by these use cases I have in mind.
Especially the "key/value store" could use some clarification to see
what we're aiming at.

Just for the sake of an example, the "limitation" pointed out when
using an Invalidating Cache with a shared cachestore - that invalidated
entries might return from the cachestore - seems a critical concern to
me.

But again thanks a lot for writing this, a great step in the always
challenging direction of improvement!
Sanne

On 23 December 2014 at 13:38, Bela Ban wrote:
> Added the link to the Berlin agenda. An overview by Dan on this would
> be nice, so that everyone's on the same page. Please read the document
> before the meeting.
> Cheers,
>
> On 23/12/14 14:16, Radim Vansa wrote:
>> Hi guys,
>>
>> since not everyone is watching ISPN-5016, I wanted to spread the
>> audience for $SUBJECT [1].
>> A few details need more attention yet, but this is really the most
>> comprehensive information on Infinispan Cache API guarantees and I'd
>> recommend anyone to spend the time to read this carefully (although
>> it's not a one-coffee article).
>>
>> Thanks a lot, Dan, for compiling this.
>>
>> Radim
>>
>> [1]
>> https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan
>>
>
> --
> Bela Ban, JGroups lead (http://www.jgroups.org)
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
infinispan-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

From dan.berindei at gmail.com Wed Dec 31 04:59:54 2014
From: dan.berindei at gmail.com (Dan Berindei)
Date: Wed, 31 Dec 2014 11:59:54 +0200
Subject: [infinispan-dev] Dan's wiki page on cache consistency
In-Reply-To: 
References: <54996B1D.2050402@redhat.com>
	<54997071.5030101@redhat.com>
Message-ID: 

On Tue, Dec 23, 2014 at 9:39 PM, Sanne Grinovero wrote:
> That's great, thanks a lot to Dan for writing it.
>
> But it's a very long document, and how would you all prefer to handle
> feedback, challenges and improvement proposals?

I think the best option is to clone the wiki repository on your machine
and add your comments inline with your favorite editor (IntelliJ also has
a nice plugin that lets you preview live in a split editor). Just make
sure to keep each sentence on a separate line to minimize conflicts!

I tried moving it to GoogleDocs, but I had to go through HTML and I lost
all the formatting, so I gave up. Radim has posted some comments in JIRA,
but that doesn't look scalable. Let me know if you have other suggestions!

> While this is probably the most faithful description of the (current)
> implementation - which is extremely helpful to users of the current
> version - I'd love to also start a debate on how we should improve it,
> so a clear distinction between:
> - A: cases which match what Infinispan aims to be, and therefore people
>   should either live with it or look for a different solution
> - B: cases for which we agree they are a current limitation which we
>   have hopes to eventually improve on
>
> To be clear, I think some of the described semantics are unacceptable
> in practice, but I'm unsure what would be the best platform to have
> such a debate; I'm afraid that by email we'd be focusing on the first
> raised points while we should identify priorities, and since this is a
> long document there are going to be lots of items to discuss.
>
> It might help a lot to classify A vs B cases to try and define a
> mission for the project. A selfish example of such a mission sounds
> like "be an efficient and reliable way to store Lucene indexes for
> Hibernate Search", or "aim to become the most efficient cache strategy
> for Hibernate".
> These are just examples, but they would be helpful to define "current
> known bug" vs "unavoidable limitation".
> I realize that my very own view of "acceptable limitations" vs.
> "unacceptable" is defined by these use cases I have in mind.
> Especially the "key/value store" could use some clarification to see
> what we're aiming at.

Right, this wiki page isn't a design document as much as a description of
current limitations. We should definitely have a proper design wiki
page/google doc stating what we want to achieve. Perhaps Lucene/Hibernate
have a document with their consistency requirements that we can start
from?
One more thing I want to try with the current document is to switch
heading levels to make the different scenarios more prominent and the
different configurations less so - right now I have a lot of links from
one configuration to another. Radim was suggesting a summary table with
just a checkmark indicating if the cache stays consistent; I'll have a go
at that as well.

> Just for the sake of an example, the "limitation" pointed out when
> using an Invalidating Cache with a shared cachestore - that invalidated
> entries might return from the cachestore - seems a critical concern to
> me.

Yes, this looks very similar to the L1 invalidation problems that Will
fixed with his L1WriteSynchronizer - after all, the L1 cache is just like
an invalidating cache. But you missed the bigger (I think) problem with a
read operation undoing a write even in local mode :)

At first I wanted to create issues in JIRA for all the new problems that
I was seeing, but I wasn't able to keep up for long... As you read it,
feel free to add a TODO when you see something that's obvious to you (or
even better, add a link to a new/existing issue).

> But again thanks a lot for writing this, a great step in the always
> challenging direction of improvement!
> Sanne

Thanks Sanne!

>
>
> On 23 December 2014 at 13:38, Bela Ban wrote:
>> Added the link to the Berlin agenda. An overview by Dan on this would
>> be nice, so that everyone's on the same page. Please read the document
>> before the meeting.
>> Cheers,
>>
>> On 23/12/14 14:16, Radim Vansa wrote:
>>> Hi guys,
>>>
>>> since not everyone is watching ISPN-5016, I wanted to spread the
>>> audience for $SUBJECT [1]. A few details need more attention yet, but
>>> this is really the most comprehensive information on Infinispan Cache
>>> API guarantees and I'd recommend anyone to spend the time to read this
>>> carefully (although it's not a one-coffee article).
>>>
>>> Thanks a lot, Dan, for compiling this.
>>>
>>> Radim
>>>
>>> [1]
>>> https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan
>>>
>>
>> --
>> Bela Ban, JGroups lead (http://www.jgroups.org)
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev