From ttarrant at redhat.com Thu Dec 1 02:30:09 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Thu, 1 Dec 2016 08:30:09 +0100 Subject: [infinispan-dev] Default cache In-Reply-To: References: <6664a2bc-a7aa-4fbb-4386-76916d2f6813@redhat.com> <53632d56-e75b-13c8-5d6f-1ad61f5dddc0@redhat.com> Message-ID: <2059e8d7-c9c3-2116-e627-9001795d7bf8@redhat.com> If we go with your approach, there would be little incentive for users to actually fix their application. The problem here is that the "inheritance from the default cache" anti-feature is more often than not the cause of misunderstanding. The real fix is to use configuration templates, obviously. Tristan On 30/11/16 16:24, Sebastian Laskawiec wrote: > Hey Tristan, > > Comments inlined. > > Thanks > Sebastian > > On Wed, Nov 30, 2016 at 3:13 PM, Tristan Tarrant > wrote: > > Some additional notes: > > - currently the XSD specifies the default-cache attribute on the > cache-container element as required, but the parser doesn't enforce it. > - A default ConfigurationBuilder is created for the default cache if one > has not been specified > > My questions: > > 1. do we want the default cache to be optional or actually require it in > the declarative configuration ? > > ** A: no enforcement. In this case requesting the default cache should > print a warning about falling back to a "default" empty configuration. > > ** B: we don't require the user to specify a default cache in the > configuration, but invoking getCache() will throw an exception. > > ** C: enforce it, although this will break all those XML files who > haven't specified it. > > My preference is to use the namespace version and go for the A approach > for < 9.0 and the B approach otherwise. > > > I generally don't like the option B, since it frustrates developers and > it might make the 8.x -> 9.x migration painful. > > However I really like your proposal for a GlobalConfigurationManager > with implicitCacheCreation. However I would set it to true as our > default. Effectively this would results in option A being implemented > (somewhat). > > > > 2. currently, requesting a named cache for which a configuration hasn't > been defined implicitly creates the cache by using the default > configuration as a template. > > ** A: continue as is > > ** B: continue to implicitly create a cache, but use an empty > configuration instead of using the default configuration, as this has > been the source of confusion among users. Also print a warning. > > ** C: do not create caches unless a configuration has been explicitly > provided. > > My preference is to use the namespace version and go for the A approach > for < 9.0 and the C approach otherwise. > > Unfortunately the namespace version trick doesn't work for programmatic > configurations. Probably we should add a boolean flag on the > GlobalConfigurationManager (e.g. implicitCacheCreation) which defaults > to false (because that's the "new order") but allows switching to the > old behaviour if needed. > > > Again A. The same arguments as the above. 
> > > > In any case I'd like to also introduce a JCache-like createCache() API > > Tristan > > On 10/11/16 13:20, Paul Ferraro wrote: > > +1000 > > > > This is precisely how we've setup cache manager semantics in WildFly > > (since AS7): > > > https://github.com/wildfly/wildfly/blob/master/clustering/infinispan/spi/src/main/java/org/wildfly/clustering/infinispan/spi/CacheContainer.java > > > > https://github.com/wildfly/wildfly/blob/master/clustering/infinispan/extension/src/main/java/org/jboss/as/clustering/infinispan/DefaultCacheContainer.java > > > > > I'd love to be able to drop this. > > > > Paul > > > > On Thu, Nov 10, 2016 at 3:38 AM, Tristan Tarrant > > wrote: > >> In the discussion for [1] the subject of the default cache and > the way > >> it affects configuration inheritance came up. > >> > >> My proposal is: > >> - remove the default cache as a special cache altogether > >> - CacheManager.getCache() should return the named cache specified as > >> default in the configuration. > >> - the programmatic GlobalConfigurationBuilder/GlobalConfiguration > should > >> have the notion of the default named cache (currently this is > handled in > >> the parser) > >> - Retrieving the cache named "___defaultcache" should actually > retrieve > >> the above named cache > >> > >> Opinions ? > >> > >> Tristan > >> > >> [1] https://github.com/infinispan/infinispan/pull/4631 > > >> -- > >> Tristan Tarrant > >> Infinispan Lead > >> JBoss, a division of Red Hat > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From galder at redhat.com Fri Dec 2 11:26:42 2016 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Fri, 2 Dec 2016 17:26:42 +0100 Subject: [infinispan-dev] Default cache In-Reply-To: <2059e8d7-c9c3-2116-e627-9001795d7bf8@redhat.com> References: <6664a2bc-a7aa-4fbb-4386-76916d2f6813@redhat.com> <53632d56-e75b-13c8-5d6f-1ad61f5dddc0@redhat.com> <2059e8d7-c9c3-2116-e627-9001795d7bf8@redhat.com> Message-ID: There is a valid point to Paul's suggestion: Caches created via createCache() should be managed by the app itself, which would make it more symmetrical. It'd be asymmetrical for a cache created by the user code to be managed by the server itself. If you create something, you should destroy it yourself. However, we can't really change the behaviour of getCache() to not return caches created via createCache(). If we did that in the JCache impl, we'd break the API. We could have JCache impl adhere to that, but have Infinispan getCache() behave the way Paul suggests. Since we would be adding createCache() for the first time to Infinispan Cache, in theory we're free to define how to deal with caches created via createCache() the way we want. 
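For reference, the JCache (JSR-107) API whose createCache()/getCache() split is being discussed looks roughly like the sketch below (illustrative only; the cache name and configuration are made up):

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.configuration.MutableConfiguration;

public class JCacheCreateExample {
    public static void main(String[] args) {
        // The provider's default CacheManager (whichever JCache provider is on the classpath).
        CacheManager cacheManager = Caching.getCachingProvider().getCacheManager();

        // createCache() registers a new cache from an explicit configuration and returns it...
        MutableConfiguration<String, String> cfg =
                new MutableConfiguration<String, String>().setTypes(String.class, String.class);
        Cache<String, String> books = cacheManager.createCache("books", cfg);

        // ...while getCache() only looks up a cache that has already been defined.
        Cache<String, String> lookedUp = cacheManager.getCache("books", String.class, String.class);

        books.put("isbn-1", "Infinispan in Action");
        System.out.println(lookedUp.get("isbn-1"));
    }
}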
That we'd deviate from JCache would be unfortunate but the right thing. Thoughts? -- Galder Zamarre?o Infinispan, Red Hat > On 1 Dec 2016, at 08:30, Tristan Tarrant wrote: > > If we go with your approach, there would be little incentive for users > to actually fix their application. The problem here is that the > "inheritance from the default cache" anti-feature is more often than not > the cause of misunderstanding. The real fix is to use configuration > templates, obviously. > > Tristan > > On 30/11/16 16:24, Sebastian Laskawiec wrote: >> Hey Tristan, >> >> Comments inlined. >> >> Thanks >> Sebastian >> >> On Wed, Nov 30, 2016 at 3:13 PM, Tristan Tarrant > > wrote: >> >> Some additional notes: >> >> - currently the XSD specifies the default-cache attribute on the >> cache-container element as required, but the parser doesn't enforce it. >> - A default ConfigurationBuilder is created for the default cache if one >> has not been specified >> >> My questions: >> >> 1. do we want the default cache to be optional or actually require it in >> the declarative configuration ? >> >> ** A: no enforcement. In this case requesting the default cache should >> print a warning about falling back to a "default" empty configuration. >> >> ** B: we don't require the user to specify a default cache in the >> configuration, but invoking getCache() will throw an exception. >> >> ** C: enforce it, although this will break all those XML files who >> haven't specified it. >> >> My preference is to use the namespace version and go for the A approach >> for < 9.0 and the B approach otherwise. >> >> >> I generally don't like the option B, since it frustrates developers and >> it might make the 8.x -> 9.x migration painful. >> >> However I really like your proposal for a GlobalConfigurationManager >> with implicitCacheCreation. However I would set it to true as our >> default. Effectively this would results in option A being implemented >> (somewhat). >> >> >> >> 2. currently, requesting a named cache for which a configuration hasn't >> been defined implicitly creates the cache by using the default >> configuration as a template. >> >> ** A: continue as is >> >> ** B: continue to implicitly create a cache, but use an empty >> configuration instead of using the default configuration, as this has >> been the source of confusion among users. Also print a warning. >> >> ** C: do not create caches unless a configuration has been explicitly >> provided. >> >> My preference is to use the namespace version and go for the A approach >> for < 9.0 and the C approach otherwise. >> >> Unfortunately the namespace version trick doesn't work for programmatic >> configurations. Probably we should add a boolean flag on the >> GlobalConfigurationManager (e.g. implicitCacheCreation) which defaults >> to false (because that's the "new order") but allows switching to the >> old behaviour if needed. >> >> >> Again A. The same arguments as the above. 
>> >> >> >> In any case I'd like to also introduce a JCache-like createCache() API >> >> Tristan >> >> On 10/11/16 13:20, Paul Ferraro wrote: >>> +1000 >>> >>> This is precisely how we've setup cache manager semantics in WildFly >>> (since AS7): >>> >> https://github.com/wildfly/wildfly/blob/master/clustering/infinispan/spi/src/main/java/org/wildfly/clustering/infinispan/spi/CacheContainer.java >> >>> >> https://github.com/wildfly/wildfly/blob/master/clustering/infinispan/extension/src/main/java/org/jboss/as/clustering/infinispan/DefaultCacheContainer.java >> >>> >>> I'd love to be able to drop this. >>> >>> Paul >>> >>> On Thu, Nov 10, 2016 at 3:38 AM, Tristan Tarrant >> > wrote: >>>> In the discussion for [1] the subject of the default cache and >> the way >>>> it affects configuration inheritance came up. >>>> >>>> My proposal is: >>>> - remove the default cache as a special cache altogether >>>> - CacheManager.getCache() should return the named cache specified as >>>> default in the configuration. >>>> - the programmatic GlobalConfigurationBuilder/GlobalConfiguration >> should >>>> have the notion of the default named cache (currently this is >> handled in >>>> the parser) >>>> - Retrieving the cache named "___defaultcache" should actually >> retrieve >>>> the above named cache >>>> >>>> Opinions ? >>>> >>>> Tristan >>>> >>>> [1] https://github.com/infinispan/infinispan/pull/4631 >> >>>> -- >>>> Tristan Tarrant >>>> Infinispan Lead >>>> JBoss, a division of Red Hat >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >> >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Dec 5 16:58:56 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 5 Dec 2016 22:58:56 +0100 Subject: [infinispan-dev] Infinispan 9.0.0.Beta1 Message-ID: <91554e90-6487-67b5-9926-a14213f71cd7@redhat.com> Dear all, we have just released Infinispan 9.0.0.Beta1. More information is available at [1] Enjoy ! The Infinispan team [1] http://blog.infinispan.org/2016/12/infinispan-900beta1-ruppaner.html -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From gustavo at infinispan.org Wed Dec 7 11:55:15 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Wed, 7 Dec 2016 16:55:15 +0000 Subject: [infinispan-dev] Where is the server? Message-ID: Apparently it's not on maven anymore [1] after 9.0.0.Beta1 [1] https://mvnrepository.com/artifact/org.infinispan.server/infinispan-server-build -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161207/7397a1df/attachment.html From slaskawi at redhat.com Wed Dec 7 13:12:17 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Wed, 7 Dec 2016 19:12:17 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: Yes, I can confirm this - it hasn't been uploaded to Nexus [1]. [1] https://origin-repository.jboss.org/nexus/content/repositories/public-jboss/org/infinispan/server/infinispan-server-build/ On Wed, Dec 7, 2016 at 5:55 PM, Gustavo Fernandes wrote: > Apparently it's not on maven anymore [1] after 9.0.0.Beta1 > > [1] https://mvnrepository.com/artifact/org.infinispan. > server/infinispan-server-build > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161207/53869af7/attachment.html From ttarrant at redhat.com Wed Dec 7 14:55:35 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Wed, 7 Dec 2016 20:55:35 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: Probably because of the use of the staging plugin ? On 07/12/16 19:12, Sebastian Laskawiec wrote: > Yes, I can confirm this - it hasn't been uploaded to Nexus [1]. > > [1] https://origin-repository.jboss.org/nexus/content/repositories/public-jboss/org/infinispan/server/infinispan-server-build/ > > On Wed, Dec 7, 2016 at 5:55 PM, Gustavo Fernandes > > wrote: > > Apparently it's not on maven anymore [1] after 9.0.0.Beta1 > > [1] > https://mvnrepository.com/artifact/org.infinispan.server/infinispan-server-build > > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From rhauch at redhat.com Wed Dec 7 16:20:01 2016 From: rhauch at redhat.com (Randall Hauch) Date: Wed, 7 Dec 2016 15:20:01 -0600 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> Message-ID: <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Reviving this old thread, and as before I appreciate any help the Infinispan community might provide. There definitely is interest in Debezium capturing the changes being made to an Infinispan cluster. This isn?t as important when Infinispan is used as a cache, but when Infinispan is used as a store then it is important for other apps/services to be able to accurately keep up with the changes being made in the store. > On Jul 29, 2016, at 8:47 AM, Galder Zamarre?o wrote: > > > -- > Galder Zamarre?o > Infinispan, Red Hat > >> On 11 Jul 2016, at 16:41, Randall Hauch wrote: >> >>> >>> On Jul 11, 2016, at 3:42 AM, Adrian Nistor wrote: >>> >>> Hi Randall, >>> >>> Infinispan supports both push and pull access models. The push model is supported by events (and listeners), which are cluster wide and are available in both library and remote mode (hotrod). 
The notification system is pretty advanced as there is a filtering mechanism available that can use a hand coded filter / converter or one specified in jpql (experimental atm). Getting a snapshot of the initial data is also possible. But infinispan does not produce a transaction log to be used for determining all changes that happened since a previous connection time, so you'll always have to get a new full snapshot when re-connecting. >>> >>> So if Infinispan is the data store I would base the Debezium connector implementation on Infinispan's event notification system. Not sure about the other use case though. >>> >> >> Thanks, Adrian, for the feedback. A couple of questions. >> >> You mentioned Infinispan has a pull model ? is this just using the normal API to read the entries? >> >> With event listeners, a single connection will receive all of the events that occur in the cluster, correct? Is it possible (e.g., a very unfortunately timed crash) for a change to be made to the cache without an event being produced and sent to listeners? > > ^ Yeah, that can happen due to async nature of remote events. However, there's the possibility for clients, upon receiving a new topology, to receive the current state of the server as events, see [1] and [2] > > [1] http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_state_consumption > [2] http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_failure_handling It is critical that any change event stream is consistent with the store, and the change event stream is worthless without it. Only when the change event stream is an accurate representation of what changed can downstream consumers use the stream to rebuild their own perfect copy of the upstream store and to keep those copies consistent with the upstream store. So, given that the events are handled asynchronously, in a cluster how are multiple changes to a single entry handled. For example, if a client sets entry , then a short time after that (or another) client sets entry , is it guaranteed that a client listening to events will see first and some time later? Or is it possible that a client listening might first see and then ? > >> What happens if the network fails or partitions? How does cross site replication address this? > > In terms of cross-site, depends what the client is connected to. Clients can now failover between sites, so they should be able to deal with events too in the same as explained above. > >> >> Has there been any thought about adding to Infinispan a write ahead log or transaction log to each node or, better yet, for the whole cluster? > > Not that I'm aware of but we've recently added security audit log, so a transaction log might make sense too. Without a transaction log, Debezium would have to use a client listener with includeCurrentState=true to obtain the state every time it reconnects. If Debezium just included all of this state in the event stream, then the stream might contain lots of superfluous or unnecessary events, then this impacts all downstream consumers by forcing them to spend a lot of time processing changes that never really happened. So the only way to avoid that would be for Debezium to use an external store to track the changes it has seen so far so that it doesn?t include unnecessary events in the change event stream. It?d be a shame to have to require this much infrastructure. A transaction log would really be a great way to solve this problem. 
Has there been any more thought about Infinispan using and exposing a transaction log? Or perhaps Infinispan could record the changes in a Kafka topic directly? (I guess if the Infinispan cache used relational database(s) as a cache store(s), then Debezium could just capture the changes from there. That seems like a big constraint, though.) Thoughts? > > Cheers, > >> >> Thanks again! >> >>> Adrian >>> >>> On 07/09/2016 04:38 PM, Randall Hauch wrote: >>>> The Debezium project [1] is working on building change data capture connectors for a variety of databases. MySQL is available now, MongoDB will be soon, and PostgreSQL and Oracle are next on our roadmap. >>>> >>>> One way in which Debezium and Infinispan can be used together is when Infinispan is being used as a cache for data stored in a database. In this case, Debezium can capture the changes to the database and produce a stream of events; a separate process can consume these change and evict entries from an Infinispan cache. >>>> >>>> If Infinispan is to be used as a data store, then it would be useful for Debezium to be able to capture those changes so other apps/services can consume the changes. First of all, does this make sense? Secondly, if it does, then Debezium would need an Infinispan connector, and it?s not clear to me how that connector might capture the changes from Infinispan. >>>> >>>> Debezium typically monitors the log of transactions/changes that are committed to a database. Of course how this works varies for each type of database. For example, MySQL internally produces a transaction log that contains information about every committed row change, and MySQL ensures that every committed change is included and that non-committed changes are excluded. The MySQL mechanism is actually part of the replication mechanism, so slaves update their internal state by reading the master?s log. The Debezium MySQL connector [2] simply reads the same log. >>>> >>>> Infinispan has several mechanisms that may be useful: >>>> >>>> ? Interceptors - See [3]. This seems pretty straightforward and IIUC provides access to all internal operations. However, it?s not clear to me whether a single interceptor will see all the changes in a cluster (perhaps in local and replicated modes) or only those changes that happen on that particular node (in distributed mode). It?s also not clear whether this interceptor is called within the context of the cache?s transaction, so if a failure happens just at the wrong time whether a change might be made to the cache but is not seen by the interceptor (or vice versa). >>>> ? Cross-site replication - See [4][5]. A potential advantage of this mechanism appears to be that it is defined (more) globally, and it appears to function if the remote backup comes back online after being offline for a period of time. >>>> ? State transfer - is it possible to participate as a non-active member of the cluster, and to effectively read all state transfer activities that occur within the cluster? >>>> ? Cache store - tie into the cache store mechanism, perhaps by wrapping an existing cache store and sitting between the cache and the cache store >>>> ? Monitor the cache store - don?t monitor Infinispan at all, and instead monitor the store in which Infinispan is storing entries. (This is probably the least attractive, since some stores can?t be monitored, or because the store is persisting an opaque binary value.) >>>> >>>> Are there other mechanism that might be used? 
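As a point of reference for the listener-based option mentioned above, a minimal sketch of a remote change-capture listener, assuming the Hot Rod client listener API with includeCurrentState enabled (class and cache names are illustrative; a real connector would hand the events to its own pipeline rather than print them):

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.annotation.ClientCacheEntryCreated;
import org.infinispan.client.hotrod.annotation.ClientCacheEntryModified;
import org.infinispan.client.hotrod.annotation.ClientCacheEntryRemoved;
import org.infinispan.client.hotrod.annotation.ClientListener;
import org.infinispan.client.hotrod.event.ClientCacheEntryCreatedEvent;
import org.infinispan.client.hotrod.event.ClientCacheEntryModifiedEvent;
import org.infinispan.client.hotrod.event.ClientCacheEntryRemovedEvent;

// includeCurrentState = true replays the existing entries as "created" events when the
// listener is (re)registered, before live events start flowing.
@ClientListener(includeCurrentState = true)
public class ChangeCaptureListener {

    @ClientCacheEntryCreated
    public void created(ClientCacheEntryCreatedEvent<String> event) {
        // remote events carry the key and a server-side version, not the value
        System.out.println("created: " + event.getKey() + " v" + event.getVersion());
    }

    @ClientCacheEntryModified
    public void modified(ClientCacheEntryModifiedEvent<String> event) {
        System.out.println("modified: " + event.getKey() + " v" + event.getVersion());
    }

    @ClientCacheEntryRemoved
    public void removed(ClientCacheEntryRemovedEvent<String> event) {
        System.out.println("removed: " + event.getKey());
    }

    public static void register(RemoteCache<String, String> cache) {
        cache.addClientListener(new ChangeCaptureListener());
    }
}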
>>>> >>>> There are a couple of important requirements for change data capture to be able to work correctly: >>>> >>>> ? Upon initial connection, the CDC connector must be able to obtain a snapshot of all existing data, followed by seeing all changes to data that may have occurred since the snapshot was started. If the connector is stopped/fails, upon restart it needs to be able to reconnect and either see all changes that occurred since it last was capturing changes, or perform a snapshot. (Performing a snapshot upon restart is very inefficient and undesirable.) This works as follows: the CDC connector only records the ?offset? in the source?s sequence of events; what this ?offset? entails depends on the source. Upon restart, the connector can use this offset information to coordinate with the source where it wants to start reading. (In MySQL and PostgreSQL, every event includes the filename of the log and position in that file. MongoDB includes in each event the monotonically increasing timestamp of the transaction. >>>> ? No change can be missed, even when things go wrong and components crash. >>>> ? When a new entry is added, the ?after? state of the entity will be included. When an entry is updated, the ?after? state will be included in the event; if possible, the event should also include the ?before? state. When an entry is removed, the ?before? state should be included in the event. >>>> >>>> Any thoughts or advice would be greatly appreciated. >>>> >>>> Best regards, >>>> >>>> Randall >>>> >>>> >>>> [1] http://debezium.io >>>> [2] http://debezium.io/docs/connectors/mysql/ >>>> [3] http://infinispan.org/docs/stable/user_guide/user_guide.html#_custom_interceptors_chapter >>>> [4] http://infinispan.org/docs/stable/user_guide/user_guide.html#CrossSiteReplication >>>> [5] https://github.com/infinispan/infinispan/wiki/Design-For-Cross-Site-Replication >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161207/88bc8939/attachment-0001.html From slaskawi at redhat.com Thu Dec 8 01:46:29 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Thu, 8 Dec 2016 07:46:29 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: Yes, it is possible. I've created ISPN-7299 to check this. On Wed, Dec 7, 2016 at 8:55 PM, Tristan Tarrant wrote: > Probably because of the use of the staging plugin ? > > On 07/12/16 19:12, Sebastian Laskawiec wrote: > > Yes, I can confirm this - it hasn't been uploaded to Nexus [1]. > > > > [1] https://origin-repository.jboss.org/nexus/content/ > repositories/public-jboss/org/infinispan/server/infinispan-server-build/ > > > > On Wed, Dec 7, 2016 at 5:55 PM, Gustavo Fernandes > > > wrote: > > > > Apparently it's not on maven anymore [1] after 9.0.0.Beta1 > > > > [1] > > https://mvnrepository.com/artifact/org.infinispan. 
> server/infinispan-server-build > > server/infinispan-server-build> > > > > > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org jboss.org> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161208/e877f35c/attachment.html From gustavo at infinispan.org Thu Dec 8 04:13:28 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Thu, 8 Dec 2016 09:13:28 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On Wed, Dec 7, 2016 at 9:20 PM, Randall Hauch wrote: > Reviving this old thread, and as before I appreciate any help the > Infinispan community might provide. There definitely is interest in > Debezium capturing the changes being made to an Infinispan cluster. This > isn?t as important when Infinispan is used as a cache, but when Infinispan > is used as a store then it is important for other apps/services to be able > to accurately keep up with the changes being made in the store. > > On Jul 29, 2016, at 8:47 AM, Galder Zamarre?o wrote: > > > -- > Galder Zamarre?o > Infinispan, Red Hat > > On 11 Jul 2016, at 16:41, Randall Hauch wrote: > > > On Jul 11, 2016, at 3:42 AM, Adrian Nistor wrote: > > Hi Randall, > > Infinispan supports both push and pull access models. The push model is > supported by events (and listeners), which are cluster wide and are > available in both library and remote mode (hotrod). The notification system > is pretty advanced as there is a filtering mechanism available that can use > a hand coded filter / converter or one specified in jpql (experimental > atm). Getting a snapshot of the initial data is also possible. But > infinispan does not produce a transaction log to be used for determining > all changes that happened since a previous connection time, so you'll > always have to get a new full snapshot when re-connecting. > > So if Infinispan is the data store I would base the Debezium connector > implementation on Infinispan's event notification system. Not sure about > the other use case though. > > > Thanks, Adrian, for the feedback. A couple of questions. > > You mentioned Infinispan has a pull model ? is this just using the normal > API to read the entries? > > With event listeners, a single connection will receive all of the events > that occur in the cluster, correct? Is it possible (e.g., a very > unfortunately timed crash) for a change to be made to the cache without an > event being produced and sent to listeners? > > > ^ Yeah, that can happen due to async nature of remote events. 
However, > there's the possibility for clients, upon receiving a new topology, to > receive the current state of the server as events, see [1] and [2] > > [1] http://infinispan.org/docs/dev/user_guide/user_ > guide.html#client_event_listener_state_consumption > [2] http://infinispan.org/docs/dev/user_guide/user_ > guide.html#client_event_listener_failure_handling > > > It is critical that any change event stream is consistent with the store, > and the change event stream is worthless without it. Only when the change > event stream is an accurate representation of what changed can downstream > consumers use the stream to rebuild their own perfect copy of the upstream > store and to keep those copies consistent with the upstream store. > > So, given that the events are handled asynchronously, in a cluster how are > multiple changes to a single entry handled. For example, if a client sets > entry , then a short time after that (or another) client sets entry > , is it guaranteed that a client listening to events will see > first and some time later? Or is it possible that a client > listening might first see and then ? > > > What happens if the network fails or partitions? How does cross site > replication address this? > > > In terms of cross-site, depends what the client is connected to. Clients > can now failover between sites, so they should be able to deal with events > too in the same as explained above. > > > Has there been any thought about adding to Infinispan a write ahead log or > transaction log to each node or, better yet, for the whole cluster? > > > Not that I'm aware of but we've recently added security audit log, so a > transaction log might make sense too. > > > Without a transaction log, Debezium would have to use a client listener > with includeCurrentState=true to obtain the state every time it reconnects. > If Debezium just included all of this state in the event stream, then the > stream might contain lots of superfluous or unnecessary events, then this > impacts all downstream consumers by forcing them to spend a lot of time > processing changes that never really happened. So the only way to avoid > that would be for Debezium to use an external store to track the changes it > has seen so far so that it doesn?t include unnecessary events in the change > event stream. It?d be a shame to have to require this much infrastructure. > > A transaction log would really be a great way to solve this problem. Has > there been any more thought about Infinispan using and exposing a > transaction log? Or perhaps Infinispan could record the changes in a Kafka > topic directly? > > (I guess if the Infinispan cache used relational database(s) as a cache > store(s), then Debezium could just capture the changes from there. That > seems like a big constraint, though.) > > Thoughts? > I recently updated a proposal [1] based on several discussions we had in the past that is essentially about introducing an event storage mechanism (write ahead log) in order to improve reliability, failover and "replayability" for the remote listeners, any feedback greatly appreciated. [1] https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal Thanks, Gustavo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161208/a17cf773/attachment-0001.html From slaskawi at redhat.com Thu Dec 8 09:17:02 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Thu, 8 Dec 2016 15:17:02 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: Fixed: https://github.com/infinispan/infinispan/pull/4712 I managed to upload server zip but the rest of server modules are missing (and will need to wait for Beta2). The reason the staging plugin didn't work was because we have 2 BOMs (bom/pom.xml and server/integration/versions/pom.xml). The staging plugin should go to those 2 modules instead of parent. BTW - why do we have 2 BOMs? It's a bit weird.... Thanks Sebastian On Thu, Dec 8, 2016 at 7:46 AM, Sebastian Laskawiec wrote: > Yes, it is possible. I've created ISPN-7299 to check this. > > On Wed, Dec 7, 2016 at 8:55 PM, Tristan Tarrant > wrote: > >> Probably because of the use of the staging plugin ? >> >> On 07/12/16 19:12, Sebastian Laskawiec wrote: >> > Yes, I can confirm this - it hasn't been uploaded to Nexus [1]. >> > >> > [1] https://origin-repository.jboss.org/nexus/content/repositori >> es/public-jboss/org/infinispan/server/infinispan-server-build/ >> > >> > On Wed, Dec 7, 2016 at 5:55 PM, Gustavo Fernandes >> > > wrote: >> > >> > Apparently it's not on maven anymore [1] after 9.0.0.Beta1 >> > >> > [1] >> > https://mvnrepository.com/artifact/org.infinispan.server/ >> infinispan-server-build >> > > infinispan-server-build> >> > >> > >> > >> > >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org > boss.org> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > >> > >> > >> > >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > >> >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161208/fef85f30/attachment.html From anistor at redhat.com Thu Dec 8 10:50:11 2016 From: anistor at redhat.com (Adrian Nistor) Date: Thu, 8 Dec 2016 17:50:11 +0200 Subject: [infinispan-dev] New blog post Message-ID: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> Hi all, I've just published a new blog post that briefly introduces Ickle, the query language of Infinispan [1]. This will be followed soon by another one on defining domain model schemas, configuring model indexing and analysis. Cheers, Adrian [1]http://blog.infinispan.org/2016/12/meet-ickle.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161208/95c7fbdb/attachment.html From anistor at redhat.com Thu Dec 8 10:57:17 2016 From: anistor at redhat.com (Adrian Nistor) Date: Thu, 8 Dec 2016 17:57:17 +0200 Subject: [infinispan-dev] New blog post In-Reply-To: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> Message-ID: Wrong link? 
Here is the correct one: http://blog.infinispan.org/2016/12/meet-ickle.html On 12/08/2016 05:50 PM, Adrian Nistor wrote: > Hi all, > > I've just published a new blog post that briefly introduces Ickle, the query language of Infinispan [1]. This will be followed soon by another one on defining domain model schemas, configuring model indexing and analysis. > > Cheers, > Adrian > > [1] http://blog.infinispan.org/2016/12/meet-ickle.html > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161208/1aefa46e/attachment.html From sanne at infinispan.org Thu Dec 8 11:11:24 2016 From: sanne at infinispan.org (Sanne Grinovero) Date: Thu, 8 Dec 2016 16:11:24 +0000 Subject: [infinispan-dev] New blog post In-Reply-To: References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> Message-ID: Thank you so much and congratulations Adrian! That's a huge leap forward -- Sanne On 8 December 2016 at 15:57, Adrian Nistor wrote: > Wrong link? > Here is the correct one: http://blog.infinispan.org/2016/12/meet-ickle.html > > > On 12/08/2016 05:50 PM, Adrian Nistor wrote: > > Hi all, > > I've just published a new blog post that briefly introduces Ickle, the query > language of Infinispan [1]. This will be followed soon by another one on > defining domain model schemas, configuring model indexing and analysis. > > Cheers, > Adrian > > [1] http://blog.infinispan.org/2016/12/meet-ickle.html > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Thu Dec 8 11:20:19 2016 From: rvansa at redhat.com (Radim Vansa) Date: Thu, 8 Dec 2016 17:20:19 +0100 Subject: [infinispan-dev] New blog post In-Reply-To: References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> Message-ID: Nice! I wonder when we'll find out that we need prepared statements, though. R. On 12/08/2016 05:11 PM, Sanne Grinovero wrote: > Thank you so much and congratulations Adrian! That's a huge leap forward > > -- Sanne > > On 8 December 2016 at 15:57, Adrian Nistor wrote: >> Wrong link? >> Here is the correct one: http://blog.infinispan.org/2016/12/meet-ickle.html >> >> >> On 12/08/2016 05:50 PM, Adrian Nistor wrote: >> >> Hi all, >> >> I've just published a new blog post that briefly introduces Ickle, the query >> language of Infinispan [1]. This will be followed soon by another one on >> defining domain model schemas, configuring model indexing and analysis. 
>> >> Cheers, >> Adrian >> >> [1] http://blog.infinispan.org/2016/12/meet-ickle.html >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From rhauch at redhat.com Thu Dec 8 14:02:19 2016 From: rhauch at redhat.com (Randall Hauch) Date: Thu, 8 Dec 2016 13:02:19 -0600 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: > On Dec 8, 2016, at 3:13 AM, Gustavo Fernandes wrote: > > On Wed, Dec 7, 2016 at 9:20 PM, Randall Hauch > wrote: > Reviving this old thread, and as before I appreciate any help the Infinispan community might provide. There definitely is interest in Debezium capturing the changes being made to an Infinispan cluster. This isn?t as important when Infinispan is used as a cache, but when Infinispan is used as a store then it is important for other apps/services to be able to accurately keep up with the changes being made in the store. > >> On Jul 29, 2016, at 8:47 AM, Galder Zamarre?o > wrote: >> >> >> -- >> Galder Zamarre?o >> Infinispan, Red Hat >> >>> On 11 Jul 2016, at 16:41, Randall Hauch > wrote: >>> >>>> >>>> On Jul 11, 2016, at 3:42 AM, Adrian Nistor > wrote: >>>> >>>> Hi Randall, >>>> >>>> Infinispan supports both push and pull access models. The push model is supported by events (and listeners), which are cluster wide and are available in both library and remote mode (hotrod). The notification system is pretty advanced as there is a filtering mechanism available that can use a hand coded filter / converter or one specified in jpql (experimental atm). Getting a snapshot of the initial data is also possible. But infinispan does not produce a transaction log to be used for determining all changes that happened since a previous connection time, so you'll always have to get a new full snapshot when re-connecting. >>>> >>>> So if Infinispan is the data store I would base the Debezium connector implementation on Infinispan's event notification system. Not sure about the other use case though. >>>> >>> >>> Thanks, Adrian, for the feedback. A couple of questions. >>> >>> You mentioned Infinispan has a pull model ? is this just using the normal API to read the entries? >>> >>> With event listeners, a single connection will receive all of the events that occur in the cluster, correct? Is it possible (e.g., a very unfortunately timed crash) for a change to be made to the cache without an event being produced and sent to listeners? >> >> ^ Yeah, that can happen due to async nature of remote events. 
However, there's the possibility for clients, upon receiving a new topology, to receive the current state of the server as events, see [1] and [2] >> >> [1] http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_state_consumption >> [2] http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_failure_handling > > It is critical that any change event stream is consistent with the store, and the change event stream is worthless without it. Only when the change event stream is an accurate representation of what changed can downstream consumers use the stream to rebuild their own perfect copy of the upstream store and to keep those copies consistent with the upstream store. > > So, given that the events are handled asynchronously, in a cluster how are multiple changes to a single entry handled. For example, if a client sets entry , then a short time after that (or another) client sets entry , is it guaranteed that a client listening to events will see first and some time later? Or is it possible that a client listening might first see and then ? > >> >>> What happens if the network fails or partitions? How does cross site replication address this? >> >> In terms of cross-site, depends what the client is connected to. Clients can now failover between sites, so they should be able to deal with events too in the same as explained above. >> >>> >>> Has there been any thought about adding to Infinispan a write ahead log or transaction log to each node or, better yet, for the whole cluster? >> >> Not that I'm aware of but we've recently added security audit log, so a transaction log might make sense too. > > Without a transaction log, Debezium would have to use a client listener with includeCurrentState=true to obtain the state every time it reconnects. If Debezium just included all of this state in the event stream, then the stream might contain lots of superfluous or unnecessary events, then this impacts all downstream consumers by forcing them to spend a lot of time processing changes that never really happened. So the only way to avoid that would be for Debezium to use an external store to track the changes it has seen so far so that it doesn?t include unnecessary events in the change event stream. It?d be a shame to have to require this much infrastructure. > > A transaction log would really be a great way to solve this problem. Has there been any more thought about Infinispan using and exposing a transaction log? Or perhaps Infinispan could record the changes in a Kafka topic directly? > > (I guess if the Infinispan cache used relational database(s) as a cache store(s), then Debezium could just capture the changes from there. That seems like a big constraint, though.) > > Thoughts? > > > I recently updated a proposal [1] based on several discussions we had in the past that is essentially about introducing an event storage mechanism (write ahead log) in order to improve reliability, failover and "replayability" for the remote listeners, any feedback greatly appreciated. > > > [1] https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal > Hi, Gustavo. Thanks for the response. I like the proposal a lot, and have a few specific comments and questions. Let me know if there is a better forum for this feedback. It is smart to require the application using the HotRod client to know/manage the id of the latest event it has seen. 
This allows an application to restart from where it left off, but it also allows the application to replay some events if needed. For example, an application may "fully-process" an event asynchronously from the "handle" method (e.g., the event handler method just puts the event into a queue and immediately returns), so only the application knows which ids it has fully processed. If anything goes wrong, the client is in full control over where it wants to restart.

When a client first registers, it should always obtain the id of the most recent event in the log. When using "includeState=true", the client will first receive the state of all entries, and then needs to start reading events from the point at which the state transfer started (this is the only way to ensure that every change is seen at least once).

It must be possible to enable this logging on an existing cache, and doing this will likely mean the log starts out capturing only the changes made since the log was enabled. This should be acceptable, since clients that want all entries can optionally start out with a snapshot (e.g., "includeState=true").

Is the log guaranteed to have the same order of changes as was applied to the cache?

Will the log be configured with a TTL for the events or a fixed size? TTLs are easy to understand but require a variable amount of storage; a capped storage size is easy to manage but harder to understand.

Will the log store the "before" state of the entry? This increases the size of the events and therefore the log, but it means client applications can do a lot more with the events without storing (as much) state.

It is very useful for the HotRod client to automatically fail over when it loses its connectivity with Infinispan. I presume this is based upon the id of the event successfully provided and handled by the listener method.

Will the log include transaction boundaries, or at least a transaction id/number in each event?

Do/will the events include the entry version numbers? Are the versions included in events when "includeCurrentState=true" is set?

I hope this helps; let me know if you want clarification on any of these. I can't wait to have this feature!

Best regards,

Randall

> Thanks,
> Gustavo
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161208/3da88bbf/attachment-0001.html

From rvansa at redhat.com Fri Dec 9 04:13:52 2016
From: rvansa at redhat.com (Radim Vansa)
Date: Fri, 9 Dec 2016 10:13:52 +0100
Subject: [infinispan-dev] Infinispan and change data capture
In-Reply-To:
References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com>
Message-ID:

On 12/08/2016 10:13 AM, Gustavo Fernandes wrote:
>
> I recently updated a proposal [1] based on several discussions we had
> in the past that is essentially about introducing an event storage
> mechanism (write ahead log) in order to improve reliability, failover
> and "replayability" for the remote listeners, any feedback greatly
> appreciated.
Hi Gustavo,

while I really like the pull-style architecture and reliable events, I see some problematic parts here:

1) 'cache that would persist the events with a monotonically increasing id'

I assume that you mean globally (for all entries) monotonic. How will you obtain such an ID? Currently, commands have unique IDs whose number part is monotonic per node. That's easy to achieve. But introducing a globally monotonic counter means that there will be a single contention point. (You can introduce more contention points by adding backups, but this is probably unnecessary as you can find out the last id from the indexed cache data.) Per-segment monotonic ids would probably be more scalable, though that increases complexity.

2) 'The write to the event log would be async in order to not affect normal data writes'

Who should write to the cache?

a) originator - what if the originator crashes (even though the change has been applied)? Besides, the originator would have to do an (async) RPC to the primary owner (which will be the primary owner of the event, too).

b) primary owner - with triangle, the primary does not really know if the change has been written on the backup. Piggybacking that info won't be trivial - we don't want to send another message explicitly. But even if we get the confirmation, since the write to the event cache is async, if the primary owner crashes before replicating the event to the backup, we lose the event.

c) all owners, but locally - that will require more complex reconciliation to work out whether the event really happened on all surviving nodes or not. And backups could have some trouble resolving order, too. IIUC clustered listeners are called from the primary owner before the change is really confirmed on backups (@Pedro correct me if I am wrong, please), but for this reliable event cache you need a higher level of consistency.

3) The log will also have to filter out retried operations (based on command ID - though this can be indexed, too). Though, I would prefer to see a per-event command-id log to deal with retries properly.

4) Clients should pull data, but I would keep push notifications that 'something happened' (throttled on the server). There could be a use case for rarely updated caches, where polling the servers would be excessive.

Radim

>
>
> [1]
> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal
>
> Thanks,
> Gustavo
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss Performance Team

From rory.odonnell at oracle.com Fri Dec 9 05:28:21 2016
From: rory.odonnell at oracle.com (Rory O'Donnell)
Date: Fri, 9 Dec 2016 10:28:21 +0000
Subject: [infinispan-dev] JDK 9 b148 including a refresh of the module system is available on java.net
Message-ID: <43c9d625-8879-943a-d6ad-265a71408dbe@oracle.com>

Hi Galder,

JDK 9 build b148 includes an important refresh of the module system [1]; a summary of changes is listed here.

This refresh includes a disruptive change that is important to understand. For those that have been trying out modules with regular JDK 9 builds, be aware that `requires public` changes to `requires transitive`. In addition, the binary representation of the module declaration (module-info.class) has changed, so that you need to recompile any modules that were compiled with previous JDK 9 builds.
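For illustration, a module declaration using the new syntax might look like this (a sketch with made-up module names):

// module-info.java -- 'requires transitive' replaces the earlier 'requires public',
// re-exporting the dependency to anything that requires this module.
module com.example.query.api {
    requires transitive com.example.core;  // consumers of this module can read com.example.core too
    requires java.logging;                 // ordinary, non-transitive dependency
    exports com.example.query.api;
}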
As things stand today in JDK 9 then you use setAccessible to break into non-public elements of any type in exported packages. However, it cannot be used to break into any type in non-exported package. The current specified behavior was a compromise for the initial integration of the module system. It is of course not very satisfactory, hence the #AwkwardStrongEncapsulation issue [2] on the JSR 376 issues list. With the updated proposal in the JSR, this refresh changes setAccessible further so that it cannot be used to break into non-public types, or non-public elements of public types, in exported packages. Code that uses setAccessible to hack into the private constructor of java.lang.invoke.MethodHandles.Lookup will be disappointed for example. This change will expose hacks in many existing libraries and tools. As a workaround then a new command line option `--add-opens` can be used to open specific packages for "deep reflection". For example, a really popular build tool fails with this refresh because it uses setAccessible + core reflection to hack into a private field of an unmodifiable collection so that it can mutate it, facepalm! This code will continue to work as before when run with `--add-opens java.base/java.util=ALL-UNNAMED` to open the package java.util in module java.base to "all unnamed modules" (think class path). *Any help reporting issues to popular tools and libraries would be appreciated. * A debugging aid that is useful to identify issues is to run with -Dsun.reflect.debugModuleAccessChecks=true to get a stack trace when setAccessible fails, this is particularly useful when code swallows exceptions without any logging. Rgds,Rory [1] http://mail.openjdk.java.net/pipermail/jdk9-dev/2016-November/005276.html [2] http://openjdk.java.net/projects/jigsaw/spec/issues/#AwkwardStrongEncapsulation -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161209/c41a2b02/attachment.html From slaskawi at redhat.com Fri Dec 9 09:38:27 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Fri, 9 Dec 2016 15:38:27 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: Hey guys! Just to clarify the situation. I created the staging plugin fix [1] but unfortunately some of the artifacts are missing from our repository (including BOM). So my vote would be to merge it and release Beta2 ASAP. Of course I volunteer for doing the release. Thanks Sebastian [1] https://github.com/infinispan/infinispan/pull/4712 On Thu, Dec 8, 2016 at 3:17 PM, Sebastian Laskawiec wrote: > Fixed: https://github.com/infinispan/infinispan/pull/4712 > > I managed to upload server zip but the rest of server modules are missing > (and will need to wait for Beta2). > > The reason the staging plugin didn't work was because we have 2 BOMs > (bom/pom.xml and server/integration/versions/pom.xml). The staging plugin > should go to those 2 modules instead of parent. > > BTW - why do we have 2 BOMs? It's a bit weird.... > > Thanks > Sebastian > > On Thu, Dec 8, 2016 at 7:46 AM, Sebastian Laskawiec > wrote: > >> Yes, it is possible. I've created ISPN-7299 to check this. >> >> On Wed, Dec 7, 2016 at 8:55 PM, Tristan Tarrant >> wrote: >> >>> Probably because of the use of the staging plugin ? >>> >>> On 07/12/16 19:12, Sebastian Laskawiec wrote: >>> > Yes, I can confirm this - it hasn't been uploaded to Nexus [1]. 
>>> > >>> > [1] https://origin-repository.jboss.org/nexus/content/repositori >>> es/public-jboss/org/infinispan/server/infinispan-server-build/ >>> > >>> > On Wed, Dec 7, 2016 at 5:55 PM, Gustavo Fernandes >>> > > wrote: >>> > >>> > Apparently it's not on maven anymore [1] after 9.0.0.Beta1 >>> > >>> > [1] >>> > https://mvnrepository.com/artifact/org.infinispan.server/in >>> finispan-server-build >>> > >> nfinispan-server-build> >>> > >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > infinispan-dev mailing list >>> > infinispan-dev at lists.jboss.org >> boss.org> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> > >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > infinispan-dev mailing list >>> > infinispan-dev at lists.jboss.org >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> > >>> >>> -- >>> Tristan Tarrant >>> Infinispan Lead >>> JBoss, a division of Red Hat >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161209/78ddfb3a/attachment.html From anistor at redhat.com Fri Dec 9 10:29:59 2016 From: anistor at redhat.com (Adrian Nistor) Date: Fri, 9 Dec 2016 17:29:59 +0200 Subject: [infinispan-dev] New blog post In-Reply-To: References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> Message-ID: <44314f5e-27f1-5049-7190-29169b6c5ff2@redhat.com> Hi Radim, We already need them and almost have them. QueryFactory.create(String queryString) creates a Query object that can be executed multiple times with different params. The Query object could be considered 'prepared'. In theory. In reality this does not work right now because the internals are only implemented half way. Thanks for reminding me to finish it :) Adrian On 12/08/2016 06:20 PM, Radim Vansa wrote: > Nice! I wonder when we'll find out that we need prepared statements, though. > > R. > > On 12/08/2016 05:11 PM, Sanne Grinovero wrote: >> Thank you so much and congratulations Adrian! That's a huge leap forward >> >> -- Sanne >> >> On 8 December 2016 at 15:57, Adrian Nistor wrote: >>> Wrong link? >>> Here is the correct one: http://blog.infinispan.org/2016/12/meet-ickle.html >>> >>> >>> On 12/08/2016 05:50 PM, Adrian Nistor wrote: >>> >>> Hi all, >>> >>> I've just published a new blog post that briefly introduces Ickle, the query >>> language of Infinispan [1]. This will be followed soon by another one on >>> defining domain model schemas, configuring model indexing and analysis. 
>>> >>> Cheers, >>> Adrian >>> >>> [1] http://blog.infinispan.org/2016/12/meet-ickle.html >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > From rhauch at redhat.com Fri Dec 9 11:08:09 2016 From: rhauch at redhat.com (Randall Hauch) Date: Fri, 9 Dec 2016 10:08:09 -0600 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: > On Dec 9, 2016, at 3:13 AM, Radim Vansa wrote: > > On 12/08/2016 10:13 AM, Gustavo Fernandes wrote: >> >> I recently updated a proposal [1] based on several discussions we had >> in the past that is essentially about introducing an event storage >> mechanism (write ahead log) in order to improve reliability, failover >> and "replayability" for the remote listeners, any feedback greatly >> appreciated. > > Hi Gustavo, > > while I really like the pull-style architecture and reliable events, I > see some problematic parts here: > > 1) 'cache that would persist the events with a monotonically increasing id' > > I assume that you mean globally (for all entries) monotonous. How will > you obtain such ID? Currently, commands have unique IDs that are > where the number part is monotonous per node. That's > easy to achieve. But introducing globally monotonous counter means that > there will be a single contention point. (you can introduce another > contention points by adding backups, but this is probably unnecessary as > you can find out the last id from the indexed cache data). Per-segment > monotonous would be probably more scalabe, though that increases complexity. It is complicated, but one way to do this is to have one ?primary? node maintain the log and to have other replicate from it. The cluster does need to use consensus to agree which is the primary, and to know which secondary becomes the primary if the primary is failing. Consensus is not trivial, but JGroups Raft (http://belaban.github.io/jgroups-raft/ ) may be an option. However, this approach ensures that the replica logs are identical to the primary since they are simply recording the primary?s log as-is. Of course, another challenge is what happens during a failure of the primary log node, and can any transactions be performed/completed while the primary is unavailable. Another option is to have each node maintain their own log, and to have an aggregator log that merges/combines the various logs into one. Not sure how feasible it is to merge logs by getting rid of duplicates and determining a total order, but if it is then it may have better fault tolerance characteristics. Of course, it is possible to have node-specific monotonic IDs. For example, MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and then GTIDs consists of the node?s UUID plus a monotonically-increasing value (e.g., ?31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001?). 
The transaction log contains a mix of GTIDs, and MySQL replication uses a ?GTID set? to describe the ranges of transactions known by a server (e.g., ?u1:1-100,u2:1-10000,u3:3-5? where ?u1?, ?u2?, and ?u3? are actually UUIDs). So, when a MySQL replica connects, it says ?I know about this GTID set", and this tells the master where that client wants to start reading. > > 2) 'The write to the event log would be async in order to not affect > normal data writes' > > Who should write to the cache? > a) originator - what if originator crashes (despite the change has been > added)? Besides, originator would have to do (async) RPC to primary > owner (which will be the primary owner of the event, too). > b) primary owner - with triangle, primary does not really know if the > change has been written on backup. Piggybacking that info won't be > trivial - we don't want to send another message explicitly. But even if > we get the confirmation, since the write to event cache is async, if the > primary owner crashes before replicating the event to backup, we lost > the event > c) all owners, but locally - that will require more complex > reconciliation if the event did really happen on all surviving nodes or > not. And backups could have some trouble to resolve order, too. > > IIUC clustered listeners are called from primary owner before the change > is really confirmed on backups (@Pedro correct me if I am wrong, > please), but for this reliable event cache you need higher level of > consistency. This could be handled by writing a confirmation or ?commit? event to the log when the write is confirmed or the transaction is committed. Then, only those confirmed events/transactions would be exposed to client listeners. This requires some buffering, but this could be done in each HotRod client. > > 3) The log will also have to filter out retried operations (based on > command ID - though this can be indexed, too). Though, I would prefer to > see per-event command-id log to deal with retries properly. IIUC, a ?commit? event would work here, too. > > 4) Client should pull data, but I would keep push notifications that > 'something happened' (throttled on server). There could be use case for > rarely updated caches, and polling the servers would be excessive there. IMO the clients should poll, but if the server has nothing to return it blocks until there is something or until a timeout occurs. This makes it easy for clients and actually reduces network traffic compared to constantly polling. BTW, a lot of this is replicating the functionality of Kafka, which is already quite mature and feature rich. It?s actually possible to *embed* Kafka to simplify operations, but I don?t think that?s recommended. And, it introduces a very complex codebase that would need to be supported. > > Radim > >> >> >> [1] >> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal >> >> Thanks, >> Gustavo >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161209/e8731039/attachment.html
From emmanuel at hibernate.org  Fri Dec  9 12:25:56 2016
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Fri, 9 Dec 2016 18:25:56 +0100
Subject: [infinispan-dev] Infinispan and change data capture
In-Reply-To:
References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com>
Message-ID: <20161209172556.GD48509@hibernate.org>

Randall and I had a chat on $subject. Here is a proposal worth exploring, as it is very lightweight on Infinispan's code.

Does an operation have a unique id shared by the master and replicas? If not, could we add that?

The proposal itself:

The total order would not be global but per key.
Each node has a Debezium connector instance embedded that listens to the operations happening (primary and replicas alike).
All of this process happens async compared to the operation.
Per key, a log of operations is kept in memory (it contains the key, the operation, the operation's unique id and an ack status).
If on the key owner, the operation is written by the Debezium connector to Kafka when it has been acked (whatever that means is where I'm less knowledgeable - too many bi-cache, tri-cache and quadri latency mixed in my brain).
On a replica, the Kafka partition is read regularly to clear the in-memory log of operations already stored in Kafka.
If the replica becomes the owner, it reads the Kafka partition to see what operations are already in and writes the missing ones.

There are a few cool things:
- few to no changes in what Infinispan does
- no global ordering simplifies things and frankly is fine for most Debezium cases. In the end a global order could be defined after the fact (by not partitioning, for example). But that's a pure downstream concern.
- everything is async compared to the Infinispan ops
- the in-memory log can remain in memory as it is protected by replicas
- the in-memory log is self-cleaning thanks to the state in Kafka

Everyone wins. But it does require some sort of globally unique id per operation to dedup.

Emmanuel

On Fri 16-12-09 10:08, Randall Hauch wrote:
>> On Dec 9, 2016, at 3:13 AM, Radim Vansa wrote:
>>
>> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote:
>>> I recently updated a proposal [1] based on several discussions we had in the past that is essentially about introducing an event storage mechanism (write ahead log) in order to improve reliability, failover and "replayability" for the remote listeners, any feedback greatly appreciated.
>>
>> Hi Gustavo,
>>
>> while I really like the pull-style architecture and reliable events, I see some problematic parts here:
>>
>> 1) 'cache that would persist the events with a monotonically increasing id'
>>
>> I assume that you mean globally (for all entries) monotonous. How will you obtain such ID? Currently, commands have unique IDs that are node + number pairs, where the number part is monotonous per node. That's easy to achieve. But introducing globally monotonous counter means that there will be a single contention point. (you can introduce another contention points by adding backups, but this is probably unnecessary as you can find out the last id from the indexed cache data). Per-segment monotonous would be probably more scalable, though that increases complexity.
>
> It is complicated, but one way to do this is to have one "primary"
node maintain the log and to have other replicate from it. The cluster does need to use consensus to agree which is the primary, and to know which secondary becomes the primary if the primary is failing. Consensus is not trivial, but JGroups Raft (http://belaban.github.io/jgroups-raft/ ) may be an option. However, this approach ensures that the replica logs are identical to the primary since they are simply recording the primary?s log as-is. Of course, another challenge is what happens during a failure of the primary log node, and can any transactions be performed/completed while the primary is unavailable. > >Another option is to have each node maintain their own log, and to have an aggregator log that merges/combines the various logs into one. Not sure how feasible it is to merge logs by getting rid of duplicates and determining a total order, but if it is then it may have better fault tolerance characteristics. > >Of course, it is possible to have node-specific monotonic IDs. For example, MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and then GTIDs consists of the node?s UUID plus a monotonically-increasing value (e.g., ?31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001?). The transaction log contains a mix of GTIDs, and MySQL replication uses a ?GTID set? to describe the ranges of transactions known by a server (e.g., ?u1:1-100,u2:1-10000,u3:3-5? where ?u1?, ?u2?, and ?u3? are actually UUIDs). So, when a MySQL replica connects, it says ?I know about this GTID set", and this tells the master where that client wants to start reading. > >> >> 2) 'The write to the event log would be async in order to not affect >> normal data writes' >> >> Who should write to the cache? >> a) originator - what if originator crashes (despite the change has been >> added)? Besides, originator would have to do (async) RPC to primary >> owner (which will be the primary owner of the event, too). >> b) primary owner - with triangle, primary does not really know if the >> change has been written on backup. Piggybacking that info won't be >> trivial - we don't want to send another message explicitly. But even if >> we get the confirmation, since the write to event cache is async, if the >> primary owner crashes before replicating the event to backup, we lost >> the event >> c) all owners, but locally - that will require more complex >> reconciliation if the event did really happen on all surviving nodes or >> not. And backups could have some trouble to resolve order, too. >> >> IIUC clustered listeners are called from primary owner before the change >> is really confirmed on backups (@Pedro correct me if I am wrong, >> please), but for this reliable event cache you need higher level of >> consistency. > >This could be handled by writing a confirmation or ?commit? event to the log when the write is confirmed or the transaction is committed. Then, only those confirmed events/transactions would be exposed to client listeners. This requires some buffering, but this could be done in each HotRod client. > >> >> 3) The log will also have to filter out retried operations (based on >> command ID - though this can be indexed, too). Though, I would prefer to >> see per-event command-id log to deal with retries properly. > >IIUC, a ?commit? event would work here, too. > >> >> 4) Client should pull data, but I would keep push notifications that >> 'something happened' (throttled on server). There could be use case for >> rarely updated caches, and polling the servers would be excessive there. 
> >IMO the clients should poll, but if the server has nothing to return it blocks until there is something or until a timeout occurs. This makes it easy for clients and actually reduces network traffic compared to constantly polling. > >BTW, a lot of this is replicating the functionality of Kafka, which is already quite mature and feature rich. It?s actually possible to *embed* Kafka to simplify operations, but I don?t think that?s recommended. And, it introduces a very complex codebase that would need to be supported. > >> >> Radim >> >>> >>> >>> [1] >>> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal >>> >>> Thanks, >>> Gustavo >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >_______________________________________________ >infinispan-dev mailing list >infinispan-dev at lists.jboss.org >https://lists.jboss.org/mailman/listinfo/infinispan-dev From rhauch at redhat.com Fri Dec 9 12:30:07 2016 From: rhauch at redhat.com (Randall Hauch) Date: Fri, 9 Dec 2016 11:30:07 -0600 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <7907E3BF-0C05-448F-AB24-7C5214B775A7@redhat.com> > On Dec 9, 2016, at 10:08 AM, Randall Hauch wrote: > >> >> On Dec 9, 2016, at 3:13 AM, Radim Vansa > wrote: >> >> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote: >>> >>> I recently updated a proposal [1] based on several discussions we had >>> in the past that is essentially about introducing an event storage >>> mechanism (write ahead log) in order to improve reliability, failover >>> and "replayability" for the remote listeners, any feedback greatly >>> appreciated. >> >> Hi Gustavo, >> >> while I really like the pull-style architecture and reliable events, I >> see some problematic parts here: >> >> 1) 'cache that would persist the events with a monotonically increasing id' >> >> I assume that you mean globally (for all entries) monotonous. How will >> you obtain such ID? Currently, commands have unique IDs that are >> where the number part is monotonous per node. That's >> easy to achieve. But introducing globally monotonous counter means that >> there will be a single contention point. (you can introduce another >> contention points by adding backups, but this is probably unnecessary as >> you can find out the last id from the indexed cache data). Per-segment >> monotonous would be probably more scalabe, though that increases complexity. > > It is complicated, but one way to do this is to have one ?primary? node maintain the log and to have other replicate from it. The cluster does need to use consensus to agree which is the primary, and to know which secondary becomes the primary if the primary is failing. Consensus is not trivial, but JGroups Raft (http://belaban.github.io/jgroups-raft/ ) may be an option. However, this approach ensures that the replica logs are identical to the primary since they are simply recording the primary?s log as-is. 
Of course, another challenge is what happens during a failure of the primary log node, and can any transactions be performed/completed while the primary is unavailable.
>
> Another option is to have each node maintain their own log, and to have an aggregator log that merges/combines the various logs into one. Not sure how feasible it is to merge logs by getting rid of duplicates and determining a total order, but if it is then it may have better fault tolerance characteristics.
>
> Of course, it is possible to have node-specific monotonic IDs. For example, MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and then GTIDs consist of the node's UUID plus a monotonically-increasing value (e.g., "31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001"). The transaction log contains a mix of GTIDs, and MySQL replication uses a "GTID set" to describe the ranges of transactions known by a server (e.g., "u1:1-100,u2:1-10000,u3:3-5" where "u1", "u2", and "u3" are actually UUIDs). So, when a MySQL replica connects, it says "I know about this GTID set", and this tells the master where that client wants to start reading.

Emmanuel and I were talking offline. Another approach entirely is to have each node (optionally) write the changes it is making as a leader directly to Kafka, meaning that Kafka becomes the event log and delivery mechanism. Upon failure of that node, the node that becomes the new leader would write any of its events not already written by the former leader, and then continue writing new changes it is making as a leader. Thus, Infinispan would not be producing a single log with a total order of all changes to a cache (there isn't one in Infinispan), but rather the total order of each key. (Kafka does this very nicely via topic partitions, where all changes for each key always get written to the same partition, and each partition has a total order.) This approach may still need separate "commit" events to reflect how Infinispan currently works internally. Obviously Infinispan wouldn't require this to be done, but when it's enabled it might provide a much simpler way of capturing the history of changes to the events in an Infinispan cache. The HotRod client could consume the events directly from Kafka, or that could be left to a completely different client/utility. It does add a dependency on Kafka, but it means the Infinispan community doesn't need to build much of the same functionality. (A rough sketch of this per-key publishing idea follows below.)

>> 2) 'The write to the event log would be async in order to not affect normal data writes'
>>
>> Who should write to the cache?
>> a) originator - what if originator crashes (despite the change has been added)? Besides, originator would have to do (async) RPC to primary owner (which will be the primary owner of the event, too).
>> b) primary owner - with triangle, primary does not really know if the change has been written on backup. Piggybacking that info won't be trivial - we don't want to send another message explicitly. But even if we get the confirmation, since the write to event cache is async, if the primary owner crashes before replicating the event to backup, we lost the event
>> c) all owners, but locally - that will require more complex reconciliation if the event did really happen on all surviving nodes or not. And backups could have some trouble to resolve order, too.
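As a very rough illustration of the per-key publishing idea sketched above: the snippet below uses only the plain Kafka producer API, nothing Infinispan-specific; the "cache-changes" topic name, the String serializers and the JSON payload are assumptions made up for the example. Keying each record by the cache key is what makes Kafka preserve a total order per key (per partition), not per cache.

   import java.util.Properties;

   import org.apache.kafka.clients.producer.KafkaProducer;
   import org.apache.kafka.clients.producer.ProducerRecord;

   public class PerKeyChangePublisher implements AutoCloseable {

      private final KafkaProducer<String, String> producer;

      public PerKeyChangePublisher(String bootstrapServers) {
         Properties props = new Properties();
         props.put("bootstrap.servers", bootstrapServers);
         props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
         props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
         props.put("acks", "all"); // wait for the topic's replicas before treating an event as durable
         this.producer = new KafkaProducer<>(props);
      }

      // Using the cache key as the record key routes every change event for that key
      // to the same partition, so consumers observe a per-key total order.
      public void publish(String cacheKey, String changeEventJson) {
         producer.send(new ProducerRecord<>("cache-changes", cacheKey, changeEventJson));
      }

      @Override
      public void close() {
         producer.close();
      }
   }

A node taking over leadership would keep using the same keyed producer after first writing whatever events the failed leader had not yet published, which is the failover step described above.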
>> >> IIUC clustered listeners are called from primary owner before the change >> is really confirmed on backups (@Pedro correct me if I am wrong, >> please), but for this reliable event cache you need higher level of >> consistency. > > This could be handled by writing a confirmation or ?commit? event to the log when the write is confirmed or the transaction is committed. Then, only those confirmed events/transactions would be exposed to client listeners. This requires some buffering, but this could be done in each HotRod client. > >> >> 3) The log will also have to filter out retried operations (based on >> command ID - though this can be indexed, too). Though, I would prefer to >> see per-event command-id log to deal with retries properly. > > IIUC, a ?commit? event would work here, too. > >> >> 4) Client should pull data, but I would keep push notifications that >> 'something happened' (throttled on server). There could be use case for >> rarely updated caches, and polling the servers would be excessive there. > > IMO the clients should poll, but if the server has nothing to return it blocks until there is something or until a timeout occurs. This makes it easy for clients and actually reduces network traffic compared to constantly polling. > > BTW, a lot of this is replicating the functionality of Kafka, which is already quite mature and feature rich. It?s actually possible to *embed* Kafka to simplify operations, but I don?t think that?s recommended. And, it introduces a very complex codebase that would need to be supported. > >> >> Radim >> >>> >>> >>> [1] >>> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal >>> >>> Thanks, >>> Gustavo >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa > >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161209/1a37861f/attachment-0001.html From rvansa at redhat.com Fri Dec 9 15:30:43 2016 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 9 Dec 2016 21:30:43 +0100 Subject: [infinispan-dev] New blog post In-Reply-To: <44314f5e-27f1-5049-7190-29169b6c5ff2@redhat.com> References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> <44314f5e-27f1-5049-7190-29169b6c5ff2@redhat.com> Message-ID: <69cdc77b-3109-e495-ec5f-89efa5ee2e99@redhat.com> If that's a remote query, it will send over the wire the Ickle string, right? Then it's not a prepared statement as I see it, since the server will have to parse that string again. By prepared statement I would expect sending only identifier (+params), and the server could only look-up a table of prepared statements to get the underlying (Lucene?) representation, and maybe a recipe for more effective unmarshalling of parameters. If any of my assumptions are wrong, please correct me, I haven't played with querying for a long time. Radim On 12/09/2016 04:29 PM, Adrian Nistor wrote: > Hi Radim, > > We already need them and almost have them. 
QueryFactory.create(String > queryString) creates a Query object that can be executed multiple times > with different params. The Query object could be considered 'prepared'. > In theory. > > In reality this does not work right now because the internals are only > implemented half way. Thanks for reminding me to finish it :) > > Adrian > > On 12/08/2016 06:20 PM, Radim Vansa wrote: >> Nice! I wonder when we'll find out that we need prepared statements, though. >> >> R. >> >> On 12/08/2016 05:11 PM, Sanne Grinovero wrote: >>> Thank you so much and congratulations Adrian! That's a huge leap forward >>> >>> -- Sanne >>> >>> On 8 December 2016 at 15:57, Adrian Nistor wrote: >>>> Wrong link? >>>> Here is the correct one: http://blog.infinispan.org/2016/12/meet-ickle.html >>>> >>>> >>>> On 12/08/2016 05:50 PM, Adrian Nistor wrote: >>>> >>>> Hi all, >>>> >>>> I've just published a new blog post that briefly introduces Ickle, the query >>>> language of Infinispan [1]. This will be followed soon by another one on >>>> defining domain model schemas, configuring model indexing and analysis. >>>> >>>> Cheers, >>>> Adrian >>>> >>>> [1] http://blog.infinispan.org/2016/12/meet-ickle.html >>>> >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From rvansa at redhat.com Fri Dec 9 16:12:27 2016 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 9 Dec 2016 22:12:27 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <0e85fd3d-2f1a-57a2-4344-25522840d771@redhat.com> On 12/09/2016 05:08 PM, Randall Hauch wrote: > >> On Dec 9, 2016, at 3:13 AM, Radim Vansa > > wrote: >> >> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote: >>> >>> I recently updated a proposal [1] based on several discussions we had >>> in the past that is essentially about introducing an event storage >>> mechanism (write ahead log) in order to improve reliability, failover >>> and "replayability" for the remote listeners, any feedback greatly >>> appreciated. >> >> Hi Gustavo, >> >> while I really like the pull-style architecture and reliable events, I >> see some problematic parts here: >> >> 1) 'cache that would persist the events with a monotonically >> increasing id' >> >> I assume that you mean globally (for all entries) monotonous. How will >> you obtain such ID? Currently, commands have unique IDs that are >> where the number part is monotonous per node. That's >> easy to achieve. But introducing globally monotonous counter means that >> there will be a single contention point. 
(you can introduce another >> contention points by adding backups, but this is probably unnecessary as >> you can find out the last id from the indexed cache data). Per-segment >> monotonous would be probably more scalabe, though that increases >> complexity. > > It is complicated, but one way to do this is to have one ?primary? > node maintain the log and to have other replicate from it. The cluster > does need to use consensus to agree which is the primary, and to know > which secondary becomes the primary if the primary is failing. > Consensus is not trivial, but JGroups Raft > (http://belaban.github.io/jgroups-raft/) may be an option. However, > this approach ensures that the replica logs are identical to the > primary since they are simply recording the primary?s log as-is. Of > course, another challenge is what happens during a failure of the > primary log node, and can any transactions be performed/completed > while the primary is unavailable. I am not sure here if you propose to store all events in log on one node, use RAFT for the monotonic counter, or just for some node selection that will source the ids. In either case, you introduce a bottleneck - RAFT does not scale performance-wise, as any solution that uses single node for each operation, no matter how simple that operation is. > > Another option is to have each node maintain their own log, and to > have an aggregator log that merges/combines the various logs into one. > Not sure how feasible it is to merge logs by getting rid of duplicates > and determining a total order, but if it is then it may have better > fault tolerance characteristics. > > Of course, it is possible to have node-specific monotonic IDs. For > example, MySQL Global Transaction IDs (GTIDs) use a unique UUID for > each node, and then GTIDs consists of the node?s UUID plus a > monotonically-increasing value (e.g., > ?31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001?). The transaction log > contains a mix of GTIDs, and MySQL replication uses a ?GTID set? to > describe the ranges of transactions known by a server (e.g., > ?u1:1-100,u2:1-10000,u3:3-5? where ?u1?, ?u2?, and ?u3? are actually > UUIDs). So, when a MySQL replica connects, it says ?I know about this > GTID set", and this tells the master where that client wants to start > reading. Yes, similar node + monotonous id is used both for transactions and for non-transactional commands in Infinispan. I would say that in complexity, it's similar to per-segment counters, but so far we have a constant number of segments as opposed to varying number of nodes. Node-specific monotonic ids do not give you monotonic order of commits, just unique ids: If a NodeA does operation 1 and 2, this does not say that 1 will be comitted before 2; 2 can be finished (and pushed to log) before 1. But I don't think you really need a monotonic sequence. In Infinispan, all the nodes should push the events in the same order, though, so the log will know where to start from if a client asks for all messages after op 1. As long as duplicates are properly filtered out. > >> >> 2) 'The write to the event log would be async in order to not affect >> normal data writes' >> >> Who should write to the cache? >> a) originator - what if originator crashes (despite the change has been >> added)? Besides, originator would have to do (async) RPC to primary >> owner (which will be the primary owner of the event, too). >> b) primary owner - with triangle, primary does not really know if the >> change has been written on backup. 
Piggybacking that info won't be >> trivial - we don't want to send another message explicitly. But even if >> we get the confirmation, since the write to event cache is async, if the >> primary owner crashes before replicating the event to backup, we lost >> the event >> c) all owners, but locally - that will require more complex >> reconciliation if the event did really happen on all surviving nodes or >> not. And backups could have some trouble to resolve order, too. >> >> IIUC clustered listeners are called from primary owner before the change >> is really confirmed on backups (@Pedro correct me if I am wrong, >> please), but for this reliable event cache you need higher level of >> consistency. > > This could be handled by writing a confirmation or ?commit? event to > the log when the write is confirmed or the transaction is committed. > Then, only those confirmed events/transactions would be exposed to > client listeners. This requires some buffering, but this could be done > in each HotRod client. I would put this under "originator". So, if the node that writes the "commit" event crashes, the data is changed (and consistent) in the cluster but nobody will be notified about that. Note that Infinispan does not guarantee that data being written by a crashing node will end up consistent on all owners, because it is the originator who retries the operation if one of the owners crashed (or generally, when a topology changes during the command). So it's not that bad solution after all, if you're okay by missing an effectively committed operation on node crash. > >> >> 3) The log will also have to filter out retried operations (based on >> command ID - though this can be indexed, too). Though, I would prefer to >> see per-event command-id log to deal with retries properly. > > IIUC, a ?commit? event would work here, too. > >> >> 4) Client should pull data, but I would keep push notifications that >> 'something happened' (throttled on server). There could be use case for >> rarely updated caches, and polling the servers would be excessive there. > > IMO the clients should poll, but if the server has nothing to return > it blocks until there is something or until a timeout occurs. This > makes it easy for clients and actually reduces network traffic > compared to constantly polling. I would say that client waiting on a blocked connection is a push (maybe there's a method to implement push otherwise on TCP connection but I am not aware of it - please forgive my ignorance). > > BTW, a lot of this is replicating the functionality of Kafka, which is > already quite mature and feature rich. It?s actually possible to > *embed* Kafka to simplify operations, but I don?t think that?s > recommended. And, it introduces a very complex codebase that would > need to be supported. I wouldn't use complex third party project on a similar tier as JGroups/Infinispan to implement basic functionality (which remote listeners are), but for Debezium it could be a fit. Let's discuss your Kafka based proposal in the follow-up mail thread. R. 
> >> >> Radim >> >>> >>> >>> [1] >>> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal >>> >>> Thanks, >>> Gustavo >>> >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa > >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From rvansa at redhat.com Fri Dec 9 16:43:15 2016 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 9 Dec 2016 22:43:15 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <20161209172556.GD48509@hibernate.org> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> Message-ID: <2bde4d84-4e61-5663-2cb2-3fe2afe3908d@redhat.com> On 12/09/2016 06:25 PM, Emmanuel Bernard wrote: > Randall and I had a chat on $subject. Here is a proposal worth > exploring as it is very lightweight on Infinispan's code. > > Does an operation has a unique id sahred by the master and replicas? > If not could we add that? Yes, each modification has unique CommandInvocationId in non-tx cache, and there are GlobalTransaction ids in tx-caches. > > The proposal itself: > > The total order would not be global but per key. > Each node has a Debezium connector instance embedded that listens to the > operations happening (primary and replicas alike). > All of this process is happening async compared to the operation. > Per key, a log of operations is kept in memory (it contains the key, the > operation, the operation unique id and a ack status. > If on the key owner, the operation is written by the Debezium connector > to Kafka when it has been acked (whatever that means is where I'm less > knowledgable - too many bi-cache, tri-cache and quadri latency mixed in > my brain). And the ack is what you have to define. All Infinispan gives you is operation was confirmed on originator => all owners (primary + backups) have stored the value If you send the ack from originator to primary, it could be lost (when originator crashes). If you write Kafka on originator, you don't have any order, and the update could be lost by crashing before somehow replicating to Kafka. If you write Kafka on primary, you need the ack from all backups (minor technical difficulty), and if primary crashes after it has sent the update to all backups, data is effectively modified but Kafka is not. The originator has to detect primary crashing to retry - so probably the primary could only send the ack to originator after it gets ack from all backups AND updates Kafka. But this is exactly what Triangle eliminated. And you still have the problem when originator crashes as well, but at least you're resilient to single node (primary) failure. So you probably intend to forget any "acks" and as soon as primary executes the write locally, just push it to Kafka. No matter the actual "outcome" of the operation. E.g. 
with putIfAbsent there could be topology change during replication to backup, which will cause the operation to be retried (from the originator). In the meantime, there would be another write, and the retried putIfAbsent will fail. You will have one successful and one unsuccessful putIfAbsent in the log, with the same ID. > On a replica, the kafka partition is read regularly to clear the > in-memory log from operations stored in Kafka > If the replica becomes the owner, it reads the kafka partition to see > what operations are already in and writes the missing ones. Backup owner (becoming primary owner) having the write locally logged does not mean that operation was successfully finished on all owners. > > There are a few cool things: > - few to no change in what Infinispan does > - no global ordering simplifies things and frankly is fine for most > Debezium cases. In the end a global order could be defined after the > fact (by not partitioning for example). But that's a pure downstream > concern. > - everything is async compared to the Infinispan ops > - the in-memory log can remain in memory as it is protected by replicas > - the in-memory log is self cleaning thanks to the state in Kafka > > Everyone wins. But it does require some sort of globally unique id per > operation to dedup. And a suitable definition for Debezium if the operation "happened" or not. Radim > > Emmanuel > > > On Fri 16-12-09 10:08, Randall Hauch wrote: >>> On Dec 9, 2016, at 3:13 AM, Radim Vansa wrote: >>> >>> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote: >>>> I recently updated a proposal [1] based on several discussions we had >>>> in the past that is essentially about introducing an event storage >>>> mechanism (write ahead log) in order to improve reliability, failover >>>> and "replayability" for the remote listeners, any feedback greatly >>>> appreciated. >>> Hi Gustavo, >>> >>> while I really like the pull-style architecture and reliable events, I >>> see some problematic parts here: >>> >>> 1) 'cache that would persist the events with a monotonically increasing id' >>> >>> I assume that you mean globally (for all entries) monotonous. How will >>> you obtain such ID? Currently, commands have unique IDs that are >>> where the number part is monotonous per node. That's >>> easy to achieve. But introducing globally monotonous counter means that >>> there will be a single contention point. (you can introduce another >>> contention points by adding backups, but this is probably unnecessary as >>> you can find out the last id from the indexed cache data). Per-segment >>> monotonous would be probably more scalabe, though that increases complexity. >> It is complicated, but one way to do this is to have one ?primary? node maintain the log and to have other replicate from it. The cluster does need to use consensus to agree which is the primary, and to know which secondary becomes the primary if the primary is failing. Consensus is not trivial, but JGroups Raft (http://belaban.github.io/jgroups-raft/ ) may be an option. However, this approach ensures that the replica logs are identical to the primary since they are simply recording the primary?s log as-is. Of course, another challenge is what happens during a failure of the primary log node, and can any transactions be performed/completed while the primary is unavailable. >> >> Another option is to have each node maintain their own log, and to have an aggregator log that merges/combines the various logs into one. 
Not sure how feasible it is to merge logs by getting rid of duplicates and determining a total order, but if it is then it may have better fault tolerance characteristics. >> >> Of course, it is possible to have node-specific monotonic IDs. For example, MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and then GTIDs consists of the node?s UUID plus a monotonically-increasing value (e.g., ?31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001?). The transaction log contains a mix of GTIDs, and MySQL replication uses a ?GTID set? to describe the ranges of transactions known by a server (e.g., ?u1:1-100,u2:1-10000,u3:3-5? where ?u1?, ?u2?, and ?u3? are actually UUIDs). So, when a MySQL replica connects, it says ?I know about this GTID set", and this tells the master where that client wants to start reading. >> >>> 2) 'The write to the event log would be async in order to not affect >>> normal data writes' >>> >>> Who should write to the cache? >>> a) originator - what if originator crashes (despite the change has been >>> added)? Besides, originator would have to do (async) RPC to primary >>> owner (which will be the primary owner of the event, too). >>> b) primary owner - with triangle, primary does not really know if the >>> change has been written on backup. Piggybacking that info won't be >>> trivial - we don't want to send another message explicitly. But even if >>> we get the confirmation, since the write to event cache is async, if the >>> primary owner crashes before replicating the event to backup, we lost >>> the event >>> c) all owners, but locally - that will require more complex >>> reconciliation if the event did really happen on all surviving nodes or >>> not. And backups could have some trouble to resolve order, too. >>> >>> IIUC clustered listeners are called from primary owner before the change >>> is really confirmed on backups (@Pedro correct me if I am wrong, >>> please), but for this reliable event cache you need higher level of >>> consistency. >> This could be handled by writing a confirmation or ?commit? event to the log when the write is confirmed or the transaction is committed. Then, only those confirmed events/transactions would be exposed to client listeners. This requires some buffering, but this could be done in each HotRod client. >> >>> 3) The log will also have to filter out retried operations (based on >>> command ID - though this can be indexed, too). Though, I would prefer to >>> see per-event command-id log to deal with retries properly. >> IIUC, a ?commit? event would work here, too. >> >>> 4) Client should pull data, but I would keep push notifications that >>> 'something happened' (throttled on server). There could be use case for >>> rarely updated caches, and polling the servers would be excessive there. >> IMO the clients should poll, but if the server has nothing to return it blocks until there is something or until a timeout occurs. This makes it easy for clients and actually reduces network traffic compared to constantly polling. >> >> BTW, a lot of this is replicating the functionality of Kafka, which is already quite mature and feature rich. It?s actually possible to *embed* Kafka to simplify operations, but I don?t think that?s recommended. And, it introduces a very complex codebase that would need to be supported. 
>> >>> Radim >>> >>>> >>>> [1] >>>> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal >>>> >>>> Thanks, >>>> Gustavo >>>> >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> -- >>> Radim Vansa >>> JBoss Performance Team >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From slaskawi at redhat.com Sat Dec 10 16:16:38 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Sat, 10 Dec 2016 22:16:38 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: One more update... I experimented with [1] a lot and polluting BOMs seems to be the only option (see my comment on PR). I also uploaded missing artifacts manually. Let me know if you still encounter a problem with artifacts downloads. Thanks Sebastian [1] https://github.com/infinispan/infinispan/pull/4712 On Fri, Dec 9, 2016 at 3:38 PM, Sebastian Laskawiec wrote: > Hey guys! > > Just to clarify the situation. I created the staging plugin fix [1] but > unfortunately some of the artifacts are missing from our repository > (including BOM). > > So my vote would be to merge it and release Beta2 ASAP. Of course I > volunteer for doing the release. > > Thanks > Sebastian > > [1] https://github.com/infinispan/infinispan/pull/4712 > > > On Thu, Dec 8, 2016 at 3:17 PM, Sebastian Laskawiec > wrote: > >> Fixed: https://github.com/infinispan/infinispan/pull/4712 >> >> I managed to upload server zip but the rest of server modules are missing >> (and will need to wait for Beta2). >> >> The reason the staging plugin didn't work was because we have 2 BOMs >> (bom/pom.xml and server/integration/versions/pom.xml). The staging >> plugin should go to those 2 modules instead of parent. >> >> BTW - why do we have 2 BOMs? It's a bit weird.... >> >> Thanks >> Sebastian >> >> On Thu, Dec 8, 2016 at 7:46 AM, Sebastian Laskawiec >> wrote: >> >>> Yes, it is possible. I've created ISPN-7299 to check this. >>> >>> On Wed, Dec 7, 2016 at 8:55 PM, Tristan Tarrant >>> wrote: >>> >>>> Probably because of the use of the staging plugin ? >>>> >>>> On 07/12/16 19:12, Sebastian Laskawiec wrote: >>>> > Yes, I can confirm this - it hasn't been uploaded to Nexus [1]. 
>>>> > >>>> > [1] https://origin-repository.jboss.org/nexus/content/repositori >>>> es/public-jboss/org/infinispan/server/infinispan-server-build/ >>>> > >>>> > On Wed, Dec 7, 2016 at 5:55 PM, Gustavo Fernandes >>>> > > wrote: >>>> > >>>> > Apparently it's not on maven anymore [1] after 9.0.0.Beta1 >>>> > >>>> > [1] >>>> > https://mvnrepository.com/artifact/org.infinispan.server/in >>>> finispan-server-build >>>> > >>> nfinispan-server-build> >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > infinispan-dev mailing list >>>> > infinispan-dev at lists.jboss.org >>> boss.org> >>>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > infinispan-dev mailing list >>>> > infinispan-dev at lists.jboss.org >>>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> > >>>> >>>> -- >>>> Tristan Tarrant >>>> Infinispan Lead >>>> JBoss, a division of Red Hat >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161210/0b7ad01f/attachment-0001.html From bban at redhat.com Sun Dec 11 05:32:41 2016 From: bban at redhat.com (Bela Ban) Date: Sun, 11 Dec 2016 11:32:41 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <2bde4d84-4e61-5663-2cb2-3fe2afe3908d@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <2bde4d84-4e61-5663-2cb2-3fe2afe3908d@redhat.com> Message-ID: Could BiCache help here? It does establish total order per key (essentially established by the primary) and since multiple node (or all) have ordering information, modifications can always be rolled forward... On 09/12/16 22:43, Radim Vansa wrote: > On 12/09/2016 06:25 PM, Emmanuel Bernard wrote: >> Randall and I had a chat on $subject. Here is a proposal worth >> exploring as it is very lightweight on Infinispan's code. >> >> Does an operation has a unique id sahred by the master and replicas? >> If not could we add that? > > Yes, each modification has unique CommandInvocationId in non-tx cache, > and there are GlobalTransaction ids in tx-caches. > >> >> The proposal itself: >> >> The total order would not be global but per key. >> Each node has a Debezium connector instance embedded that listens to the >> operations happening (primary and replicas alike). >> All of this process is happening async compared to the operation. >> Per key, a log of operations is kept in memory (it contains the key, the >> operation, the operation unique id and a ack status. >> If on the key owner, the operation is written by the Debezium connector >> to Kafka when it has been acked (whatever that means is where I'm less >> knowledgable - too many bi-cache, tri-cache and quadri latency mixed in >> my brain). > > And the ack is what you have to define. All Infinispan gives you is > > operation was confirmed on originator => all owners (primary + backups) > have stored the value > > If you send the ack from originator to primary, it could be lost (when > originator crashes). 
> If you write Kafka on originator, you don't have any order, and the > update could be lost by crashing before somehow replicating to Kafka. > If you write Kafka on primary, you need the ack from all backups (minor > technical difficulty), and if primary crashes after it has sent the > update to all backups, data is effectively modified but Kafka is not. > The originator has to detect primary crashing to retry - so probably the > primary could only send the ack to originator after it gets ack from all > backups AND updates Kafka. But this is exactly what Triangle eliminated. > And you still have the problem when originator crashes as well, but at > least you're resilient to single node (primary) failure. > > So you probably intend to forget any "acks" and as soon as primary > executes the write locally, just push it to Kafka. No matter the actual > "outcome" of the operation. E.g. with putIfAbsent there could be > topology change during replication to backup, which will cause the > operation to be retried (from the originator). In the meantime, there > would be another write, and the retried putIfAbsent will fail. You will > have one successful and one unsuccessful putIfAbsent in the log, with > the same ID. > >> On a replica, the kafka partition is read regularly to clear the >> in-memory log from operations stored in Kafka >> If the replica becomes the owner, it reads the kafka partition to see >> what operations are already in and writes the missing ones. > > Backup owner (becoming primary owner) having the write locally logged > does not mean that operation was successfully finished on all owners. > >> >> There are a few cool things: >> - few to no change in what Infinispan does >> - no global ordering simplifies things and frankly is fine for most >> Debezium cases. In the end a global order could be defined after the >> fact (by not partitioning for example). But that's a pure downstream >> concern. >> - everything is async compared to the Infinispan ops >> - the in-memory log can remain in memory as it is protected by replicas >> - the in-memory log is self cleaning thanks to the state in Kafka >> >> Everyone wins. But it does require some sort of globally unique id per >> operation to dedup. > > And a suitable definition for Debezium if the operation "happened" or not. > > Radim > >> >> Emmanuel >> >> >> On Fri 16-12-09 10:08, Randall Hauch wrote: >>>> On Dec 9, 2016, at 3:13 AM, Radim Vansa wrote: >>>> >>>> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote: >>>>> I recently updated a proposal [1] based on several discussions we had >>>>> in the past that is essentially about introducing an event storage >>>>> mechanism (write ahead log) in order to improve reliability, failover >>>>> and "replayability" for the remote listeners, any feedback greatly >>>>> appreciated. >>>> Hi Gustavo, >>>> >>>> while I really like the pull-style architecture and reliable events, I >>>> see some problematic parts here: >>>> >>>> 1) 'cache that would persist the events with a monotonically increasing id' >>>> >>>> I assume that you mean globally (for all entries) monotonous. How will >>>> you obtain such ID? Currently, commands have unique IDs that are >>>> where the number part is monotonous per node. That's >>>> easy to achieve. But introducing globally monotonous counter means that >>>> there will be a single contention point. (you can introduce another >>>> contention points by adding backups, but this is probably unnecessary as >>>> you can find out the last id from the indexed cache data). 
Per-segment >>>> monotonous would be probably more scalable, though that increases complexity. >>> It is complicated, but one way to do this is to have one 'primary' node maintain the log and to have the others replicate from it. The cluster does need to use consensus to agree which is the primary, and to know which secondary becomes the primary if the primary is failing. Consensus is not trivial, but JGroups Raft (http://belaban.github.io/jgroups-raft/) may be an option. However, this approach ensures that the replica logs are identical to the primary since they are simply recording the primary's log as-is. Of course, another challenge is what happens during a failure of the primary log node, and whether any transactions can be performed/completed while the primary is unavailable. >>> >>> Another option is to have each node maintain its own log, and to have an aggregator log that merges/combines the various logs into one. Not sure how feasible it is to merge logs by getting rid of duplicates and determining a total order, but if it is then it may have better fault tolerance characteristics. >>> >>> Of course, it is possible to have node-specific monotonic IDs. For example, MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and a GTID consists of the node's UUID plus a monotonically-increasing value (e.g., '31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001'). The transaction log contains a mix of GTIDs, and MySQL replication uses a 'GTID set' to describe the ranges of transactions known by a server (e.g., 'u1:1-100,u2:1-10000,u3:3-5' where 'u1', 'u2', and 'u3' are actually UUIDs). So, when a MySQL replica connects, it says 'I know about this GTID set', and this tells the master where that client wants to start reading. >>> >>>> 2) 'The write to the event log would be async in order to not affect >>>> normal data writes' >>>> >>>> Who should write to the cache? >>>> a) originator - what if originator crashes (despite the change has been >>>> added)? Besides, originator would have to do (async) RPC to primary >>>> owner (which will be the primary owner of the event, too). >>>> b) primary owner - with triangle, primary does not really know if the >>>> change has been written on backup. Piggybacking that info won't be >>>> trivial - we don't want to send another message explicitly. But even if >>>> we get the confirmation, since the write to event cache is async, if the >>>> primary owner crashes before replicating the event to backup, we lost >>>> the event >>>> c) all owners, but locally - that will require more complex >>>> reconciliation if the event did really happen on all surviving nodes or >>>> not. And backups could have some trouble to resolve order, too. >>>> >>>> IIUC clustered listeners are called from primary owner before the change >>>> is really confirmed on backups (@Pedro correct me if I am wrong, >>>> please), but for this reliable event cache you need higher level of >>>> consistency. >>> This could be handled by writing a confirmation or 'commit' event to the log when the write is confirmed or the transaction is committed. Then, only those confirmed events/transactions would be exposed to client listeners. This requires some buffering, but this could be done in each HotRod client. >>> >>>> 3) The log will also have to filter out retried operations (based on >>>> command ID - though this can be indexed, too). Though, I would prefer to >>>> see per-event command-id log to deal with retries properly. >>> IIUC, a 'commit' event would work here, too.
>>> >>>> 4) Client should pull data, but I would keep push notifications that >>>> 'something happened' (throttled on server). There could be use case for >>>> rarely updated caches, and polling the servers would be excessive there. >>> IMO the clients should poll, but if the server has nothing to return it blocks until there is something or until a timeout occurs. This makes it easy for clients and actually reduces network traffic compared to constantly polling. >>> >>> BTW, a lot of this is replicating the functionality of Kafka, which is already quite mature and feature rich. It?s actually possible to *embed* Kafka to simplify operations, but I don?t think that?s recommended. And, it introduces a very complex codebase that would need to be supported. >>> >>>> Radim >>>> >>>>> >>>>> [1] >>>>> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal >>>>> >>>>> Thanks, >>>>> Gustavo >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> -- >>>> Radim Vansa >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- Bela Ban, JGroups lead (http://www.jgroups.org) From ttarrant at redhat.com Sun Dec 11 13:55:56 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Sun, 11 Dec 2016 19:55:56 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: Message-ID: <4037684a-c846-7180-d440-b426f9ad1451@redhat.com> On 08/12/16 15:17, Sebastian Laskawiec wrote: > Fixed: https://github.com/infinispan/infinispan/pull/4712 > > I managed to upload server zip but the rest of server modules are > missing (and will need to wait for Beta2). > > The reason the staging plugin didn't work was because we have 2 BOMs > (bom/pom.xml and server/integration/versions/pom.xml). The staging > plugin should go to those 2 modules instead of parent. > > BTW - why do we have 2 BOMs? It's a bit weird.... Because they have separate parents. The server needs to be a descendant of org.jboss:jboss-parent. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From anistor at redhat.com Sun Dec 11 15:38:13 2016 From: anistor at redhat.com (Adrian Nistor) Date: Sun, 11 Dec 2016 22:38:13 +0200 Subject: [infinispan-dev] New blog post In-Reply-To: <69cdc77b-3109-e495-ec5f-89efa5ee2e99@redhat.com> References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> <44314f5e-27f1-5049-7190-29169b6c5ff2@redhat.com> <69cdc77b-3109-e495-ec5f-89efa5ee2e99@redhat.com> Message-ID: We have an internal 'query cache' that stores the parsed query objects so next time it sees the same query string it will not parse it again. The query cache has eviction and expiration configured so it will not grow without control. On 12/09/2016 10:30 PM, Radim Vansa wrote: > If that's a remote query, it will send over the wire the Ickle string, > right? 
Then it's not a prepared statement as I see it, since the server > will have to parse that string again. By prepared statement I would > expect sending only identifier (+params), and the server could only > look-up a table of prepared statements to get the underlying (Lucene?) > representation, and maybe a recipe for more effective unmarshalling of > parameters. > > If any of my assumptions are wrong, please correct me, I haven't played > with querying for a long time. > > Radim > > On 12/09/2016 04:29 PM, Adrian Nistor wrote: >> Hi Radim, >> >> We already need them and almost have them. QueryFactory.create(String >> queryString) creates a Query object that can be executed multiple times >> with different params. The Query object could be considered 'prepared'. >> In theory. >> >> In reality this does not work right now because the internals are only >> implemented half way. Thanks for reminding me to finish it :) >> >> Adrian >> >> On 12/08/2016 06:20 PM, Radim Vansa wrote: >>> Nice! I wonder when we'll find out that we need prepared statements, though. >>> >>> R. >>> >>> On 12/08/2016 05:11 PM, Sanne Grinovero wrote: >>>> Thank you so much and congratulations Adrian! That's a huge leap forward >>>> >>>> -- Sanne >>>> >>>> On 8 December 2016 at 15:57, Adrian Nistor wrote: >>>>> Wrong link? >>>>> Here is the correct one: http://blog.infinispan.org/2016/12/meet-ickle.html >>>>> >>>>> >>>>> On 12/08/2016 05:50 PM, Adrian Nistor wrote: >>>>> >>>>> Hi all, >>>>> >>>>> I've just published a new blog post that briefly introduces Ickle, the query >>>>> language of Infinispan [1]. This will be followed soon by another one on >>>>> defining domain model schemas, configuring model indexing and analysis. >>>>> >>>>> Cheers, >>>>> Adrian >>>>> >>>>> [1] http://blog.infinispan.org/2016/12/meet-ickle.html >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > From slaskawi at redhat.com Mon Dec 12 00:56:38 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Mon, 12 Dec 2016 06:56:38 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: <4037684a-c846-7180-d440-b426f9ad1451@redhat.com> References: <4037684a-c846-7180-d440-b426f9ad1451@redhat.com> Message-ID: And why it needs to use this specific parent (and not our infinispan BOM for the instance)? On Sun, Dec 11, 2016 at 7:55 PM, Tristan Tarrant wrote: > > > On 08/12/16 15:17, Sebastian Laskawiec wrote: > > Fixed: https://github.com/infinispan/infinispan/pull/4712 > > > > I managed to upload server zip but the rest of server modules are > > missing (and will need to wait for Beta2). > > > > The reason the staging plugin didn't work was because we have 2 BOMs > > (bom/pom.xml and server/integration/versions/pom.xml). The staging > > plugin should go to those 2 modules instead of parent. 
> > > > BTW - why do we have 2 BOMs? It's a bit weird.... > > > Because they have separate parents. The server needs to be a descendant > of org.jboss:jboss-parent. > > Tristan > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161212/e146ac2e/attachment.html From ttarrant at redhat.com Mon Dec 12 02:24:27 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 12 Dec 2016 08:24:27 +0100 Subject: [infinispan-dev] Where is the server? In-Reply-To: References: <4037684a-c846-7180-d440-b426f9ad1451@redhat.com> Message-ID: Because the server is more WildFly than it is Infinispan, although this might now be irrelevant since the introduction of feature packs. If you find a way to ensure that the server modules are built with all of the correct dependencies just by import of the WildFly BOM, then go ahead. Tristan On 12/12/16 06:56, Sebastian Laskawiec wrote: > And why it needs to use this specific parent (and not our infinispan BOM > for the instance)? > > On Sun, Dec 11, 2016 at 7:55 PM, Tristan Tarrant > wrote: > > > > On 08/12/16 15:17, Sebastian Laskawiec wrote: > > Fixed: https://github.com/infinispan/infinispan/pull/4712 > > > > > I managed to upload server zip but the rest of server modules are > > missing (and will need to wait for Beta2). > > > > The reason the staging plugin didn't work was because we have 2 BOMs > > (bom/pom.xml and server/integration/versions/pom.xml). The staging > > plugin should go to those 2 modules instead of parent. > > > > BTW - why do we have 2 BOMs? It's a bit weird.... > > > Because they have separate parents. The server needs to be a descendant > of org.jboss:jboss-parent. > > Tristan > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From gustavo at infinispan.org Mon Dec 12 08:19:53 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Mon, 12 Dec 2016 13:19:53 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: > 1) 'cache that would persist the events with a monotonically increasing id' > > I assume that you mean globally (for all entries) monotonous. How will > you obtain such ID? Currently, commands have unique IDs that are > where the number part is monotonous per node. That's > easy to achieve. But introducing globally monotonous counter means that > there will be a single contention point. (you can introduce another > contention points by adding backups, but this is probably unnecessary as > you can find out the last id from the indexed cache data). 
Per-segment > monotonous would be probably more scalabe, though that increases > complexity. > Having it per segment would imply only operations involving the same key would be ordered, probably it's fine for most cases. Could this order be affected during topology changes though? As I could observe, there is a small window where there is more than 1 primary owner for a given key due to the fact that the CH propagation is not complete. > > 2) 'The write to the event log would be async in order to not affect > normal data writes' > > Who should write to the cache? > a) originator - what if originator crashes (despite the change has been > added)? Besides, originator would have to do (async) RPC to primary > owner (which will be the primary owner of the event, too). > b) primary owner - with triangle, primary does not really know if the > change has been written on backup. Piggybacking that info won't be > trivial - we don't want to send another message explicitly. But even if > we get the confirmation, since the write to event cache is async, if the > primary owner crashes before replicating the event to backup, we lost > the event > c) all owners, but locally - that will require more complex > reconciliation if the event did really happen on all surviving nodes or > not. And backups could have some trouble to resolve order, too. > > IIUC clustered listeners are called from primary owner before the change > is really confirmed on backups (@Pedro correct me if I am wrong, > please), but for this reliable event cache you need higher level of > consistency. > Async writes to a cache event log would not provide the best of guarantees, agreed. OTOH, to have the writes done synchronously, it'd be hard to avoid extra RPCs. Some can be prevented by using a KeyPartitioner similar to the one used on the AffinityIndexManager [1], so that Segment(K) = Segment(KE), being K the key and KE the related event log key. Still RPCs would happen to replicate events, and as you pointed out, it is not trivial to piggyback this on the triangle'd data RPCs. I'm starting to think that an extra cache to store events is overkill. An alternative could be to bypass the event log cache altogether and store the events on the Lucene index directly. For this a custom interceptor would write them to a local index when it's "safe" to do so, similar to what the QueryInterceptor does with the Index.ALL flag, but only writing on primary + backup, more like to a hypothetical "Index.OWNER" setup. This index does not necessarily need to be stored in extra caches (like the Infinispan directory does) but can use a local MMap based directory, making it OS cache friendly. At event consumption time, though, broadcast queries to the primary owners would be needed to collect the events on each of the nodes and merge them before serving to the clients. [1] https://github.com/infinispan/infinispan/blob/master/core/sr c/main/java/org/infinispan/distribution/ch/impl/AffinityPartitioner.java > 3) The log will also have to filter out retried operations (based on > command ID - though this can be indexed, too). Though, I would prefer to > see per-event command-id log to deal with retries properly. > > 4) Client should pull data, but I would keep push notifications that > 'something happened' (throttled on server). There could be use case for > rarely updated caches, and polling the servers would be excessive there. 
> > Radim > Makes sense, the push could be a notification that the event log changed and the client would then proceed with its normal pull. > > > > > > [1] > > https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal > > > > Thanks, > > Gustavo > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161212/61bcd98f/attachment.html From gustavo at infinispan.org Mon Dec 12 08:58:02 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Mon, 12 Dec 2016 13:58:02 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: > But introducing globally monotonous counter means that > there will be a single contention point. > I wonder if the trade-off of Flake Ids [1] could be acceptable for this use case. [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161212/79d4e1fc/attachment.html From sanne at infinispan.org Mon Dec 12 10:13:53 2016 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 12 Dec 2016 15:13:53 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: I'm reading many clever suggestions for various aspects of such a system, but I fail to see a clear definition of the goal. From Randall's opening email I understand how MySQL does this, but it's an example and I'm not sure which aspects are implementation details of how MySQL happens to accomplish this, and which aspects are requirements for the Infinispan enhancement proposals. I remember a meeting with Manik Surtani, Jonathan Halliday and Mark Little, whose outcome was a general agreement that Infinispan would eventually need both tombstones and versioned entries, not just for change data capture but to improve several other aspects; unfortunately that was in December 2010 and never became a priority, but the benefits are clear. The complexities which have put off such plans lie in the "garbage collection", aka the need to not grow the history without bounds, and have to drop or compact history. So I'm definitely sold on the need to add a certain amount of history, but we need to define how much of this history is expected to be held. In short, what's the ultimate goal?
I see two main but different options intertwined: - allow to synchronize the *final state* of a replica - inspect specific changes For the first case, it would be enough for us to be able to provide a "squashed history" (as in Git squash), but we'd need to keep versioned shapshots around and someone needs to tell you which ones can be garbage collected. For example when a key is: written, updated, updated, deleted since the snapshot, we'll send only "deleted" as the intermediary states are irrelevant. For the second case, say the goal is to inspect fluctuations of price variations of some item, then the intermediary states are not irrelevant. Which one will we want to solve? Both? Personally the attempt of solving the second one seems like a huge pivot of the project, the current data-structures and storage are not designed for this. I see the value of such benefits, but maybe Infinispan is not the right tool for such a problem. I'd prefer to focus on the benefits of the squashed history, and have versioned entries soon, but even in that case we need to define which versions need to be kept around, and how garbage collection / vacuuming is handled. This can be designed to be transparent to the client: handled as an internal implementation detail which we use to improve performance of Infinispan itself, or it can be exposed to clients to implement change data capture, but in this case we need to track which clients are still going to need an older snapshot; this has an impact as clients would need to be registered, and has a significant impact on the storage strategies. Within Kafka the log compaction strategies are configurable; I have no experience with Kafka but the docs seem to suggest that it's most often used to provide the last known value of each key. That would be doable for us, but Kafka also does allow optionally for wider scope retention strategies: can we agree that that would not be an option with Infinispan? If not, these goals need to be clarified. My main concern is that if we don't limit the scope of which capabilities we want Infinispan to provide, it risks to become the same thing as Kafka, rather than integrating with it. I don't think we want to pivot all our storage design into efficiently treating large scale logs. In short, I'd like to see an agreement that analyzing e.g. fluctuations in stock prices would be a non-goal, if these are stored as {"stock name", value} key/value pairs. One could still implement such a thing by using a more sophisticated model, just don't expect to be able to see all intermediary values each entry has ever had since the key was first used. # Commenting on specific proposals On ID generation: I'd definitely go with IDs per segment rather than IDs per key for the purpose of change data capture. If you go with independent IDs per key, the client would need to keep track of each version of each entry, which has an high overhead and degree of complexity for the clients. On the other hand, we already guarantee that each segment is managed by a single primary owner, so attaching the "segment transaction id" to each internal entry being changed can be implemented efficiently by Infinispan. Segment ownership handoff needs to be highly consistent during cluster topology changes, but that requirement already exists; we'd just need to make sure that this monotonic counter is included during the handoff of the responsibility as primary owner of a segment. 
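To make the per-segment idea concrete, here is a rough sketch of what such a generator could look like (the class and method names are made up; nothing like this exists in Infinispan today, it is just an illustration):

import java.util.concurrent.atomic.AtomicLongArray;

/**
 * Hypothetical sketch: one monotonic sequence per segment, advanced by the
 * segment's primary owner. A change id is then the pair (segment, sequence),
 * so a consumer only tracks one offset per segment instead of one per key.
 */
public class SegmentChangeIdGenerator {

   private final AtomicLongArray sequences;

   public SegmentChangeIdGenerator(int numSegments) {
      this.sequences = new AtomicLongArray(numSegments);
   }

   /** Called by the primary owner when it applies a modification in a segment. */
   public long nextSequence(int segment) {
      return sequences.incrementAndGet(segment);
   }

   /**
    * Called when this node becomes primary owner of a segment after a topology
    * change: the counter is primed with the last sequence handed over by the
    * previous owner, so per-segment ordering stays monotonic across handoff.
    */
   public void primeSegment(int segment, long lastKnownSequence) {
      long current;
      do {
         current = sequences.get(segment);
      } while (current < lastKnownSequence
            && !sequences.compareAndSet(segment, current, lastKnownSequence));
   }
}

A consumer would then only have to remember one last-seen sequence per segment - conceptually similar to the MySQL GTID sets mentioned earlier in the thread - instead of tracking a version per key.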
Thanks, Sanne On 12 December 2016 at 13:58, Gustavo Fernandes wrote: > > > On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: >> >> But introducing globally monotonous counter means that >> there will be a single contention point. > > > I wonder if the trade-off of Flake Ids [1] could be acceptable for this use > case. > > [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From galder at redhat.com Mon Dec 12 11:55:59 2016 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Mon, 12 Dec 2016 17:55:59 +0100 Subject: [infinispan-dev] New blog post In-Reply-To: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> Message-ID: <2CCC1676-3F5A-4731-AB6F-F9820CB33E7E@redhat.com> Remember to tweet after blogging under @infinispan account. If anyone doesn't know the credentials, I can point you in the right direction ;) Cheers, -- Galder Zamarre?o Infinispan, Red Hat > On 8 Dec 2016, at 16:50, Adrian Nistor wrote: > > Hi all, > > I've just published a new blog post that briefly introduces Ickle, the query language of Infinispan [1]. This will be followed soon by another one on defining domain model schemas, configuring model indexing and analysis. > > Cheers, > Adrian > > [1] > http://blog.infinispan.org/2016/12/meet-ickle.html > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From gustavo at infinispan.org Mon Dec 12 12:56:51 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Mon, 12 Dec 2016 17:56:51 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On Mon, Dec 12, 2016 at 3:13 PM, Sanne Grinovero wrote: > In short, what's the ultimate goal? I see two main but different > options intertwined: > - allow to synchronize the *final state* of a replica > I'm assuming this case is already in place when using remote listeners and includeCurrentState=true and we are discussing how to improve it, as described in the proposal in the wiki and on the 5th email of this thread. > - inspect specific changes > > For the first case, it would be enough for us to be able to provide a > "squashed history" (as in Git squash), but we'd need to keep versioned > shapshots around and someone needs to tell you which ones can be > garbage collected. > For example when a key is: written, updated, updated, deleted since > the snapshot, we'll send only "deleted" as the intermediary states are > irrelevant. > For the second case, say the goal is to inspect fluctuations of price > variations of some item, then the intermediary states are not > irrelevant. > > Which one will we want to solve? Both? > Looking at http://debezium.io/, it implies the second case. "[...] Start it up, point it at your databases, and your apps can start responding to all of the inserts, updates, and deletes that other apps commit to your databases. [...] your apps can respond quickly and never miss an event, even when things go wrong." 
IMO the choice between squashed/full history, and even retention time is highly application specific. Deletes might not even be involved, one may be interested on answering "what is the peak value of a certain key during the day?" > Personally the attempt of solving the second one seems like a huge > pivot of the project, the current data-structures and storage are not > designed for this. +1, as I wrote earlier about ditching the idea of event cache storage in favor of Lucene. > I see the value of such benefits, but maybe > Infinispan is not the right tool for such a problem. > > I'd prefer to focus on the benefits of the squashed history, and have > versioned entries soon, but even in that case we need to define which > versions need to be kept around, and how garbage collection / > vacuuming is handled. > Is that proposal written/recorded somewhere? It'd be interesting to know how a client interested on data changes would consume those multi-versioned entries (push/pull with offset?, sorted/unsorted?, client tracking/per key/per version?), as it seems there is some storage impedance as well. > > In short, I'd like to see an agreement that analyzing e.g. > fluctuations in stock prices would be a non-goal, if these are stored > as {"stock name", value} key/value pairs. One could still implement > such a thing by using a more sophisticated model, just don't expect to > be able to see all intermediary values each entry has ever had since > the key was first used. > Continuous Queries listens to data key/value data using a query, should it not be expected to see all the intermediary values when changes in the server causes an entry to start/stop matching the query? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161212/3b37404b/attachment.html From anistor at redhat.com Tue Dec 13 03:27:05 2016 From: anistor at redhat.com (Adrian Nistor) Date: Tue, 13 Dec 2016 10:27:05 +0200 Subject: [infinispan-dev] New blog post In-Reply-To: <2CCC1676-3F5A-4731-AB6F-F9820CB33E7E@redhat.com> References: <10cef484-8bfd-1fc7-d9cb-f3b48b55fe38@redhat.com> <2CCC1676-3F5A-4731-AB6F-F9820CB33E7E@redhat.com> Message-ID: Thanks for reminding me that. I see you tweeted it yesterday. On 12/12/2016 06:55 PM, Galder Zamarre?o wrote: > Remember to tweet after blogging under @infinispan account. > > If anyone doesn't know the credentials, I can point you in the right direction ;) > > Cheers, > -- > Galder Zamarre?o > Infinispan, Red Hat > >> On 8 Dec 2016, at 16:50, Adrian Nistor wrote: >> >> Hi all, >> >> I've just published a new blog post that briefly introduces Ickle, the query language of Infinispan [1]. This will be followed soon by another one on defining domain model schemas, configuring model indexing and analysis. 
>> >> Cheers, >> Adrian >> >> [1] >> http://blog.infinispan.org/2016/12/meet-ickle.html >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Tue Dec 13 03:33:53 2016 From: rvansa at redhat.com (Radim Vansa) Date: Tue, 13 Dec 2016 09:33:53 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <94a61a07-cf8e-13c5-6c7f-c8a3a48dacee@redhat.com> On 12/12/2016 06:56 PM, Gustavo Fernandes wrote: > On Mon, Dec 12, 2016 at 3:13 PM, Sanne Grinovero > wrote: > > In short, what's the ultimate goal? I see two main but different > options intertwined: > - allow to synchronize the *final state* of a replica > > > I'm assuming this case is already in place when using remote listeners > and includeCurrentState=true and we are > discussing how to improve it, as described in the proposal in the wiki > and on the 5th email of this thread. I don't think the guarantees for any listeners are explicitly stated anywhere in docs. There are two parts of it: - ideal state: I assume that in ideal state we don't want to miss any committed operation, but we have to define committed. And mention that events can be received multiple times (we aim at at-least-once semantics) - current limitations: behaviour that does not resonate with the ideal but we were not able to fix it so far. Even [1] does not mention listeners (and it would be outdated). [1] https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan > - inspect specific changes > > For the first case, it would be enough for us to be able to provide a > "squashed history" (as in Git squash), but we'd need to keep versioned > shapshots around and someone needs to tell you which ones can be > garbage collected. > For example when a key is: written, updated, updated, deleted since > the snapshot, we'll send only "deleted" as the intermediary states are > irrelevant. > For the second case, say the goal is to inspect fluctuations of price > variations of some item, then the intermediary states are not > irrelevant. > > Which one will we want to solve? Both? > > > Looking at http://debezium.io/, it implies the second case. > > "[...] Start it up, point it at your databases, and your apps can > start responding to all of the inserts, updates, > and deletes that other apps commit to your databases. [...] your apps > can respond quickly and never miss an event, > even when things go wrong." > > IMO the choice between squashed/full history, and even retention time > is highly application specific. Deletes might > not even be involved, one may be interested on answering "what is the > peak value of a certain key during the day?" > > Personally the attempt of solving the second one seems like a huge > pivot of the project, the current data-structures and storage are not > designed for this. > > > > +1, as I wrote earlier about ditching the idea of event cache storage > in favor of Lucene. > > I see the value of such benefits, but maybe > Infinispan is not the right tool for such a problem. 
> > I'd prefer to focus on the benefits of the squashed history, and have > versioned entries soon, but even in that case we need to define which > versions need to be kept around, and how garbage collection / > vacuuming is handled. > > > Is that proposal written/recorded somewhere? It'd be interesting to > know how a client interested on data > changes would consume those multi-versioned entries (push/pull with > offset?, sorted/unsorted?, client tracking/per key/per version?), > as it seems there is some storage impedance as well. > > > In short, I'd like to see an agreement that analyzing e.g. > fluctuations in stock prices would be a non-goal, if these are stored > as {"stock name", value} key/value pairs. One could still implement > such a thing by using a more sophisticated model, just don't expect to > be able to see all intermediary values each entry has ever had since > the key was first used. > > > > Continuous Queries listens to data key/value data using a query, > should it not be expected to > see all the intermediary values when changes in the server causes an > entry to start/stop matching > the query? In Konstanz we were discussing listeners with Dan and later with Adrian and found out that CQ expects listeners to be much more reliable than these actually are. So, CQ is already broken and people can live with that; Theoretically Debezium can do the same, boldly claim that "your apps can respond quickly and never miss an event, even when things go wrong" and push the blame to Infinispan :) Radim > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From slaskawi at redhat.com Wed Dec 14 07:33:01 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Wed, 14 Dec 2016 13:33:01 +0100 Subject: [infinispan-dev] Long term Hot Rod Client refactoring goals Message-ID: Hey guys, Over past few days we've been discussing some long-term goals for Hot Rod client refactoring. The document is here [1]. I would appreciate any feedback. Thanks Sebastian [1] https://goo.gl/jKAGcL -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161214/72301814/attachment.html From galder at redhat.com Wed Dec 14 08:19:46 2016 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Wed, 14 Dec 2016 14:19:46 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <84CFF0BA-B728-4C63-B7F4-DA81EDA13B9C@redhat.com> -- Galder Zamarre?o Infinispan, Red Hat > On 12 Dec 2016, at 14:58, Gustavo Fernandes wrote: > > > > On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: > But introducing globally monotonous counter means that > there will be a single contention point. > > I wonder if the trade-off of Flake Ids [1] could be acceptable for this use case. Not exactly the same, but org.infinispan.container.versioning.NumericVersionGenerator uses a view id + node rank + local counter combo for generating version numbers. 
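For illustration only, the general shape of that combination could look like the sketch below (the class name and bit widths are invented for the example and are not the actual NumericVersionGenerator layout):

import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative sketch only: pack a cluster view id, the node's rank within
 * that view and a node-local counter into a single long, so that ids from
 * different nodes cannot collide and ids from one node are monotonic.
 * The bit widths are arbitrary here and not Infinispan's actual layout.
 */
public class CompositeVersionSketch {

   private final AtomicLong counter = new AtomicLong();
   private volatile long prefix; // view id + rank part, recomputed on view change

   public void onViewChange(int viewId, int nodeRank) {
      // 16 bits for the view id, 16 bits for the rank, 32 bits left for the counter
      prefix = ((long) (viewId & 0xFFFF) << 48) | ((long) (nodeRank & 0xFFFF) << 32);
   }

   public long nextVersion() {
      return prefix | (counter.incrementAndGet() & 0xFFFFFFFFL);
   }
}

As with the flake ids linked below, uniqueness comes from the node-scoped prefix rather than from any cluster-wide coordination, so there is no single contention point.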
> > [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From sanne at infinispan.org Wed Dec 14 08:58:10 2016 From: sanne at infinispan.org (Sanne Grinovero) Date: Wed, 14 Dec 2016 13:58:10 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On 12 December 2016 at 17:56, Gustavo Fernandes wrote: > On Mon, Dec 12, 2016 at 3:13 PM, Sanne Grinovero > wrote: > >> >> In short, what's the ultimate goal? I see two main but different >> options intertwined: >> - allow to synchronize the *final state* of a replica > > > I'm assuming this case is already in place when using remote listeners and > includeCurrentState=true and we are > discussing how to improve it, as described in the proposal in the wiki and > on the 5th email of this thread. > >> >> - inspect specific changes >> >> For the first case, it would be enough for us to be able to provide a >> "squashed history" (as in Git squash), but we'd need to keep versioned >> shapshots around and someone needs to tell you which ones can be >> garbage collected. >> For example when a key is: written, updated, updated, deleted since >> the snapshot, we'll send only "deleted" as the intermediary states are >> irrelevant. >> For the second case, say the goal is to inspect fluctuations of price >> variations of some item, then the intermediary states are not >> irrelevant. >> >> Which one will we want to solve? Both? > > > > Looking at http://debezium.io/, it implies the second case. That's what I'm asking which needs to be clarified. If it's the second case, then while I appreciate the value of such a system I don't see it as a good fit for Infinispan. > > "[...] Start it up, point it at your databases, and your apps can start > responding to all of the inserts, updates, > and deletes that other apps commit to your databases. [...] your apps can > respond quickly and never miss an event, > even when things go wrong." > > IMO the choice between squashed/full history, and even retention time is > highly application specific. Deletes might > not even be involved, one may be interested on answering "what is the peak > value of a certain key during the day?" Absolutely. And Infinispan might need to draw a line and clarify which problems it is meant to solve, and which problems are better solved with a different solution. >> Personally the attempt of solving the second one seems like a huge >> pivot of the project, the current data-structures and storage are not >> designed for this. > > > +1, as I wrote earlier about ditching the idea of event cache storage in > favor of Lucene. Yes that's a great idea, but I'd like to discuss first were we want to get. >> I see the value of such benefits, but maybe >> Infinispan is not the right tool for such a problem. >> >> I'd prefer to focus on the benefits of the squashed history, and have >> versioned entries soon, but even in that case we need to define which >> versions need to be kept around, and how garbage collection / >> vacuuming is handled. > > > Is that proposal written/recorded somewhere? 
It'd be interesting to know how > a client interested on data > changes would consume those multi-versioned entries (push/pull with offset?, > sorted/unsorted?, client tracking/per key/per version?), > as it seems there is some storage impedance as well. > >> >> >> In short, I'd like to see an agreement that analyzing e.g. >> fluctuations in stock prices would be a non-goal, if these are stored >> as {"stock name", value} key/value pairs. One could still implement >> such a thing by using a more sophisticated model, just don't expect to >> be able to see all intermediary values each entry has ever had since >> the key was first used. > > > > Continuous Queries listens to data key/value data using a query, should it > not be expected to > see all the intermediary values when changes in the server causes an entry > to start/stop matching > the query? That's exactly the doubt I'm raising: I'm not sure we set that expectations, and if we did then I don't agree with that choice, and I remember voicing concerns on feasibility of such aspects of CQ during early design. I might be a minority, but whatever the decision was I don't think this is now clear nor properly documented. If one needs to store a significant sequence of values, organised by "keys" (aka partitions), that pretty much suggests the need for Kafka itself, rather than an integration with Kafka, or perhaps depending on the use case a time-series database. Kafka is more evolved in this area, and yet even in that case I'm confident that an unbounded history would not be a reasonable expectation; Kafka's however treats the managing of such boundaries - and history compaction policies - as first class concepts both on APIs and integration / extension points. That's not to say we don't need any form of history; we discussed loads of improved protocols over the years which would benefit from versioned entries and tombstones, but we've always assumed to manage the control of "history boundaries" and compaction strategies as internal implementation details, at most to help defining ordering of operations but never promising to expose a fine grained representation of all versions an entry had within a range. BTW I'm not at all against integrating with Debezium, that looks like a very good idea. Just checking if we can agree on the limitations this should have, so we can clearly describe this feature: when it's useful, and when it's not. Thanks, Sanne From rhauch at redhat.com Wed Dec 14 12:21:58 2016 From: rhauch at redhat.com (Randall Hauch) Date: Wed, 14 Dec 2016 11:21:58 -0600 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: > On Dec 14, 2016, at 7:58 AM, Sanne Grinovero wrote: > > On 12 December 2016 at 17:56, Gustavo Fernandes > wrote: >> On Mon, Dec 12, 2016 at 3:13 PM, Sanne Grinovero >> wrote: >> >>> >>> In short, what's the ultimate goal? I see two main but different >>> options intertwined: >>> - allow to synchronize the *final state* of a replica >> >> >> I'm assuming this case is already in place when using remote listeners and >> includeCurrentState=true and we are >> discussing how to improve it, as described in the proposal in the wiki and >> on the 5th email of this thread. 
>> >>> >>> - inspect specific changes >>> >>> For the first case, it would be enough for us to be able to provide a >>> "squashed history" (as in Git squash), but we'd need to keep versioned >>> shapshots around and someone needs to tell you which ones can be >>> garbage collected. >>> For example when a key is: written, updated, updated, deleted since >>> the snapshot, we'll send only "deleted" as the intermediary states are >>> irrelevant. >>> For the second case, say the goal is to inspect fluctuations of price >>> variations of some item, then the intermediary states are not >>> irrelevant. >>> >>> Which one will we want to solve? Both? >> >> >> >> Looking at http://debezium.io/, it implies the second case. > > That's what I'm asking which needs to be clarified. > > If it's the second case, then while I appreciate the value of such a > system I don't see it as a good fit for Infinispan. If Infinispan were to allow a client to consume (within a reasonable amount of time) an event for every change, then Debezium would certainly then be able to capture these into a stream that is persisted for a much longer period of time. OTOH, I think it?s reasonable for Infinispan to squash history as long as this doesn?t reorder changes and at least the last change is kept. Debezium can still work with this. > >> >> "[...] Start it up, point it at your databases, and your apps can start >> responding to all of the inserts, updates, >> and deletes that other apps commit to your databases. [...] your apps can >> respond quickly and never miss an event, >> even when things go wrong." >> >> IMO the choice between squashed/full history, and even retention time is >> highly application specific. Deletes might >> not even be involved, one may be interested on answering "what is the peak >> value of a certain key during the day?" > > Absolutely. And Infinispan might need to draw a line and clarify which > problems it is meant to solve, and which problems are better solved > with a different solution. +1. Just be clear in what the listeners will see and what they won?t see. And I guess we need to clarify what ?never miss an event? means for Debezium: we capture every event that a source system exposes to us and will not lose any of them, but if using Kafka compaction then when replaying you?re guaranteed to see at least the most recent change for every key. > > >>> Personally the attempt of solving the second one seems like a huge >>> pivot of the project, the current data-structures and storage are not >>> designed for this. >> >> >> +1, as I wrote earlier about ditching the idea of event cache storage in >> favor of Lucene. > > Yes that's a great idea, but I'd like to discuss first were we want to get. > >>> I see the value of such benefits, but maybe >>> Infinispan is not the right tool for such a problem. >>> >>> I'd prefer to focus on the benefits of the squashed history, and have >>> versioned entries soon, but even in that case we need to define which >>> versions need to be kept around, and how garbage collection / >>> vacuuming is handled. >> >> >> Is that proposal written/recorded somewhere? It'd be interesting to know how >> a client interested on data >> changes would consume those multi-versioned entries (push/pull with offset?, >> sorted/unsorted?, client tracking/per key/per version?), >> as it seems there is some storage impedance as well. >> >>> >>> >>> In short, I'd like to see an agreement that analyzing e.g. 
>>> fluctuations in stock prices would be a non-goal, if these are stored >>> as {"stock name", value} key/value pairs. One could still implement >>> such a thing by using a more sophisticated model, just don't expect to >>> be able to see all intermediary values each entry has ever had since >>> the key was first used. >> >> >> >> Continuous Queries listens to data key/value data using a query, should it >> not be expected to >> see all the intermediary values when changes in the server causes an entry >> to start/stop matching >> the query? > > That's exactly the doubt I'm raising: I'm not sure we set that > expectations, and if we did then I don't agree with that choice, and I > remember voicing concerns on feasibility of such aspects of CQ during > early design. > I might be a minority, but whatever the decision was I don't think > this is now clear nor properly documented. > > If one needs to store a significant sequence of values, organised by > "keys" (aka partitions), that pretty much suggests the need for Kafka > itself, rather than an integration with Kafka, or perhaps depending on > the use case a time-series database. > > Kafka is more evolved in this area, and yet even in that case I'm > confident that an unbounded history would not be a reasonable > expectation; Kafka's however treats the managing of such boundaries - > and history compaction policies - as first class concepts both on APIs > and integration / extension points. > > That's not to say we don't need any form of history; we discussed > loads of improved protocols over the years which would benefit from > versioned entries and tombstones, but we've always assumed to manage > the control of "history boundaries" and compaction strategies as > internal implementation details, at most to help defining ordering of > operations but never promising to expose a fine grained representation > of all versions an entry had within a range. > > BTW I'm not at all against integrating with Debezium, that looks like > a very good idea. Just checking if we can agree on the limitations > this should have, so we can clearly describe this feature: when it's > useful, and when it's not. > > Thanks, > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161214/ae83934b/attachment-0001.html From emmanuel at hibernate.org Thu Dec 15 04:54:38 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Thu, 15 Dec 2016 10:54:38 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: The goal is as followed: allow to collect all changes to push them to Debezium and thus Kafka. This need does not require to remember all changes since the beginning of time in Infinispan. Just enough to: - let Kafka catchup assuming it is the bottleneck - let us not lose a change in Kafka when it happened in Infinispan (coordinator, owner, replicas dying) The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself. Check my email on this tread from Dec 9th. 
> On 12 Dec 2016, at 16:13, Sanne Grinovero wrote: > > I'm reading many clever suggestions for various aspects of such a > system, but I fail to see a clear definition of the goal. > >> From Randall's opening email I understand how MySQL does this, but > it's an example and I'm not sure which aspects are implementation > details of how MySQL happens to accomplish this, and which aspects are > requirements for the Infinispan enhancement proposals. > > I remember a meeting with Manik Surtani, Jonathan Halliday and Mark > Little, whose outcome was a general agreement that Infinispan would > eventually need both tombstones and versioned entries, not just for > change data capture but to improve several other aspects; > unfortunately that was in December 2010 and never became a priority, > but the benefits are clear. > The complexities which have put off such plans lie in the "garbage > collection", aka the need to not grow the history without bounds, and > have to drop or compact history. > > So I'm definitely sold on the need to add a certain amount of history, > but we need to define how much of this history is expected to be held. > > In short, what's the ultimate goal? I see two main but different > options intertwined: > - allow to synchronize the *final state* of a replica > - inspect specific changes > > For the first case, it would be enough for us to be able to provide a > "squashed history" (as in Git squash), but we'd need to keep versioned > shapshots around and someone needs to tell you which ones can be > garbage collected. > For example when a key is: written, updated, updated, deleted since > the snapshot, we'll send only "deleted" as the intermediary states are > irrelevant. > For the second case, say the goal is to inspect fluctuations of price > variations of some item, then the intermediary states are not > irrelevant. > > Which one will we want to solve? Both? > Personally the attempt of solving the second one seems like a huge > pivot of the project, the current data-structures and storage are not > designed for this. I see the value of such benefits, but maybe > Infinispan is not the right tool for such a problem. > > I'd prefer to focus on the benefits of the squashed history, and have > versioned entries soon, but even in that case we need to define which > versions need to be kept around, and how garbage collection / > vacuuming is handled. > This can be designed to be transparent to the client: handled as an > internal implementation detail which we use to improve performance of > Infinispan itself, or it can be exposed to clients to implement change > data capture, but in this case we need to track which clients are > still going to need an older snapshot; this has an impact as clients > would need to be registered, and has a significant impact on the > storage strategies. > > Within Kafka the log compaction strategies are configurable; I have no > experience with Kafka but the docs seem to suggest that it's most > often used to provide the last known value of each key. That would be > doable for us, but Kafka also does allow optionally for wider scope > retention strategies: can we agree that that would not be an option > with Infinispan? If not, these goals need to be clarified. > > My main concern is that if we don't limit the scope of which > capabilities we want Infinispan to provide, it risks to become the > same thing as Kafka, rather than integrating with it. 
I don't think we > want to pivot all our storage design into efficiently treating large > scale logs. > > In short, I'd like to see an agreement that analyzing e.g. > fluctuations in stock prices would be a non-goal, if these are stored > as {"stock name", value} key/value pairs. One could still implement > such a thing by using a more sophisticated model, just don't expect to > be able to see all intermediary values each entry has ever had since > the key was first used. > > # Commenting on specific proposals > > On ID generation: I'd definitely go with IDs per segment rather than > IDs per key for the purpose of change data capture. If you go with > independent IDs per key, the client would need to keep track of each > version of each entry, which has an high overhead and degree of > complexity for the clients. > On the other hand, we already guarantee that each segment is managed > by a single primary owner, so attaching the "segment transaction id" > to each internal entry being changed can be implemented efficiently by > Infinispan. > Segment ownership handoff needs to be highly consistent during cluster > topology changes, but that requirement already exists; we'd just need > to make sure that this monotonic counter is included during the > handoff of the responsibility as primary owner of a segment. > > Thanks, > > Sanne > > > > > > On 12 December 2016 at 13:58, Gustavo Fernandes wrote: >> >> >> On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: >>> >>> But introducing globally monotonous counter means that >>> there will be a single contention point. >> >> >> I wonder if the trade-off of Flake Ids [1] could be acceptable for this use >> case. >> >> [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From gustavo at infinispan.org Thu Dec 15 05:18:37 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Thu, 15 Dec 2016 10:18:37 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On Thu, Dec 15, 2016 at 9:54 AM, Emmanuel Bernard wrote: > The goal is as followed: allow to collect all changes to push them to > Debezium and thus Kafka. > > This need does not require to remember all changes since the beginning of > time in Infinispan. Just enough to: > - let Kafka catchup assuming it is the bottleneck > - let us not lose a change in Kafka when it happened in Infinispan > (coordinator, owner, replicas dying) > > The ability to read back history would then be handled by the Debezium / > Kafka tail, not infinispan itself. > > Having an embedded Debezium connector pushing everything to Kafka sounds cool, but what impact would it bring to the other stream consumers: * Remote listeners, which is supported in several clients apart from Java * Continuous Queries (the same) * Spark Stream * Other eventual 3rd party stream processors: Apache Flick, Storm, etc. > Check my email on this tread from Dec 9th. 
> > > On 12 Dec 2016, at 16:13, Sanne Grinovero wrote: > > > > I'm reading many clever suggestions for various aspects of such a > > system, but I fail to see a clear definition of the goal. > > > >> From Randall's opening email I understand how MySQL does this, but > > it's an example and I'm not sure which aspects are implementation > > details of how MySQL happens to accomplish this, and which aspects are > > requirements for the Infinispan enhancement proposals. > > > > I remember a meeting with Manik Surtani, Jonathan Halliday and Mark > > Little, whose outcome was a general agreement that Infinispan would > > eventually need both tombstones and versioned entries, not just for > > change data capture but to improve several other aspects; > > unfortunately that was in December 2010 and never became a priority, > > but the benefits are clear. > > The complexities which have put off such plans lie in the "garbage > > collection", aka the need to not grow the history without bounds, and > > have to drop or compact history. > > > > So I'm definitely sold on the need to add a certain amount of history, > > but we need to define how much of this history is expected to be held. > > > > In short, what's the ultimate goal? I see two main but different > > options intertwined: > > - allow to synchronize the *final state* of a replica > > - inspect specific changes > > > > For the first case, it would be enough for us to be able to provide a > > "squashed history" (as in Git squash), but we'd need to keep versioned > > shapshots around and someone needs to tell you which ones can be > > garbage collected. > > For example when a key is: written, updated, updated, deleted since > > the snapshot, we'll send only "deleted" as the intermediary states are > > irrelevant. > > For the second case, say the goal is to inspect fluctuations of price > > variations of some item, then the intermediary states are not > > irrelevant. > > > > Which one will we want to solve? Both? > > Personally the attempt of solving the second one seems like a huge > > pivot of the project, the current data-structures and storage are not > > designed for this. I see the value of such benefits, but maybe > > Infinispan is not the right tool for such a problem. > > > > I'd prefer to focus on the benefits of the squashed history, and have > > versioned entries soon, but even in that case we need to define which > > versions need to be kept around, and how garbage collection / > > vacuuming is handled. > > This can be designed to be transparent to the client: handled as an > > internal implementation detail which we use to improve performance of > > Infinispan itself, or it can be exposed to clients to implement change > > data capture, but in this case we need to track which clients are > > still going to need an older snapshot; this has an impact as clients > > would need to be registered, and has a significant impact on the > > storage strategies. > > > > Within Kafka the log compaction strategies are configurable; I have no > > experience with Kafka but the docs seem to suggest that it's most > > often used to provide the last known value of each key. That would be > > doable for us, but Kafka also does allow optionally for wider scope > > retention strategies: can we agree that that would not be an option > > with Infinispan? If not, these goals need to be clarified. 
> > > > My main concern is that if we don't limit the scope of which > > capabilities we want Infinispan to provide, it risks to become the > > same thing as Kafka, rather than integrating with it. I don't think we > > want to pivot all our storage design into efficiently treating large > > scale logs. > > > > In short, I'd like to see an agreement that analyzing e.g. > > fluctuations in stock prices would be a non-goal, if these are stored > > as {"stock name", value} key/value pairs. One could still implement > > such a thing by using a more sophisticated model, just don't expect to > > be able to see all intermediary values each entry has ever had since > > the key was first used. > > > > # Commenting on specific proposals > > > > On ID generation: I'd definitely go with IDs per segment rather than > > IDs per key for the purpose of change data capture. If you go with > > independent IDs per key, the client would need to keep track of each > > version of each entry, which has an high overhead and degree of > > complexity for the clients. > > On the other hand, we already guarantee that each segment is managed > > by a single primary owner, so attaching the "segment transaction id" > > to each internal entry being changed can be implemented efficiently by > > Infinispan. > > Segment ownership handoff needs to be highly consistent during cluster > > topology changes, but that requirement already exists; we'd just need > > to make sure that this monotonic counter is included during the > > handoff of the responsibility as primary owner of a segment. > > > > Thanks, > > > > Sanne > > > > > > > > > > > > On 12 December 2016 at 13:58, Gustavo Fernandes > wrote: > >> > >> > >> On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: > >>> > >>> But introducing globally monotonous counter means that > >>> there will be a single contention point. > >> > >> > >> I wonder if the trade-off of Flake Ids [1] could be acceptable for this > use > >> case. > >> > >> [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html > >> > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161215/710faace/attachment.html From ttarrant at redhat.com Thu Dec 15 08:36:44 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Thu, 15 Dec 2016 14:36:44 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <90de527b-5f64-2d2e-0730-26a3689dd932@redhat.com> On 12/12/16 16:13, Sanne Grinovero wrote: > In short, what's the ultimate goal? I think that is the most important question, and we really need to make sure that the requirements are such that they fit with Infinispan's model, or they can achieved without fundamentally turning Infinispan into something it is not. 
Ultimately this is about making a scalable transaction log with variable durability which can be regularly rebased if we can be certain that consumers have kept up. Does Debezium provide a checkpoint notification mechanism to implement this ? Such a log needs to reliably survive both "normal" rebalancing (+/- 1 node) as well as downscaling of the cluster (i.e. the remaining nodes will still need to hold the log for when the cluster was bigger until rebase time). Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From emmanuel at hibernate.org Thu Dec 15 09:52:03 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Thu, 15 Dec 2016 15:52:03 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <90de527b-5f64-2d2e-0730-26a3689dd932@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <90de527b-5f64-2d2e-0730-26a3689dd932@redhat.com> Message-ID: Read my email from Dec 9th and the one from today and let me know which question is not sufficiently answered. I can try and clarify or refine. > On 15 Dec 2016, at 14:36, Tristan Tarrant wrote: > > On 12/12/16 16:13, Sanne Grinovero wrote: >> In short, what's the ultimate goal? > > I think that is the most important question, and we really need to make > sure that the requirements are such that they fit with Infinispan's > model, or they can achieved without fundamentally turning Infinispan > into something it is not. > > Ultimately this is about making a scalable transaction log with variable > durability which can be regularly rebased if we can be certain that > consumers have kept up. Does Debezium provide a checkpoint notification > mechanism to implement this ? > Such a log needs to reliably survive both "normal" rebalancing (+/- 1 > node) as well as downscaling of the cluster (i.e. the remaining nodes > will still need to hold the log for when the cluster was bigger until > rebase time). > > Tristan > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From emmanuel at hibernate.org Thu Dec 15 09:53:27 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Thu, 15 Dec 2016 15:53:27 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: > On 15 Dec 2016, at 11:18, Gustavo Fernandes wrote: > > On Thu, Dec 15, 2016 at 9:54 AM, Emmanuel Bernard > wrote: > The goal is as followed: allow to collect all changes to push them to Debezium and thus Kafka. > > This need does not require to remember all changes since the beginning of time in Infinispan. Just enough to: > - let Kafka catchup assuming it is the bottleneck > - let us not lose a change in Kafka when it happened in Infinispan (coordinator, owner, replicas dying) > > The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself. 
> > > Having an embedded Debezium connector pushing everything to Kafka sounds cool, but what impact would it bring to the other stream consumers: > > * Remote listeners, which is supported in several clients apart from Java > * Continuous Queries (the same) > * Spark Stream > * Other eventual 3rd party stream processors: Apache Flick, Storm, etc. > > Impact as in perf impact? Potential redesign impact? Or are you thinking of another question? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161215/01243028/attachment.html From gustavo at infinispan.org Thu Dec 15 09:59:06 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Thu, 15 Dec 2016 14:59:06 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: On Thu, Dec 15, 2016 at 2:53 PM, Emmanuel Bernard wrote: > > On 15 Dec 2016, at 11:18, Gustavo Fernandes > wrote: > > On Thu, Dec 15, 2016 at 9:54 AM, Emmanuel Bernard > wrote: > >> The goal is as followed: allow to collect all changes to push them to >> Debezium and thus Kafka. >> >> This need does not require to remember all changes since the beginning of >> time in Infinispan. Just enough to: >> - let Kafka catchup assuming it is the bottleneck >> - let us not lose a change in Kafka when it happened in Infinispan >> (coordinator, owner, replicas dying) >> >> The ability to read back history would then be handled by the Debezium / >> Kafka tail, not infinispan itself. >> >> > Having an embedded Debezium connector pushing everything to Kafka sounds > cool, but what impact would it bring to the other stream consumers: > > * Remote listeners, which is supported in several clients apart from Java > * Continuous Queries (the same) > * Spark Stream > * Other eventual 3rd party stream processors: Apache Flick, Storm, etc. > > > > > Impact as in perf impact? Potential redesign impact? Or are you thinking > of another question? > You mentioned that "The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself", my question was how the other consumers would get access to that history. > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161215/5634b3f6/attachment.html From ttarrant at redhat.com Thu Dec 15 10:06:25 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Thu, 15 Dec 2016 16:06:25 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <31bd2833-92cc-f421-f91e-fe38d9b31dc9@redhat.com> On 15/12/16 15:59, Gustavo Fernandes wrote: > You mentioned that "The ability to read back history would then be > handled by the Debezium / Kafka tail, not infinispan itself", my question > was how the other consumers would get access to that history. At that point it is out of the source's control. 
Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Thu Dec 15 10:09:07 2016 From: sanne at infinispan.org (Sanne Grinovero) Date: Thu, 15 Dec 2016 15:09:07 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: Thanks Randall, those clarifications have been great. Emmanuel: some of your statements conflict with Randall's clarifications and with the feasibility points I've been pointing at. You say "collect *all* changes". I've been questioning that Infinispan can not keep *all* changes around for a given single key; I understand we'd allow clients to retrieve streams of changes persisted into Kafka, but we need to be clear that we won't be handling *all* changes to Kafka (nor to Debezium), so the magic these can do is somewhat limited. They can certainly expand on the capabilities that Infinispan would provide on its own, but some of the use cases which Gustavo mentioned would not be suitable. I don't think this is a big problem in practice though; take the example of monitoring fluctuations of value of some stock symbol for example: it wouldn't be possible to investigate derivative numbers from these fluctuations just from the Key/Value pair "stock name" / "value", however people can store such events in a different way, for example by using a composite key "stock name" + "timestamp". People just need clarity on how this works, including us to model the storage appropriately. Thanks, Sanne On 15 December 2016 at 09:54, Emmanuel Bernard wrote: > The goal is as followed: allow to collect all changes to push them to Debezium and thus Kafka. > > This need does not require to remember all changes since the beginning of time in Infinispan. Just enough to: > - let Kafka catchup assuming it is the bottleneck > - let us not lose a change in Kafka when it happened in Infinispan (coordinator, owner, replicas dying) > > The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself. > > Check my email on this tread from Dec 9th. > >> On 12 Dec 2016, at 16:13, Sanne Grinovero wrote: >> >> I'm reading many clever suggestions for various aspects of such a >> system, but I fail to see a clear definition of the goal. >> >>> From Randall's opening email I understand how MySQL does this, but >> it's an example and I'm not sure which aspects are implementation >> details of how MySQL happens to accomplish this, and which aspects are >> requirements for the Infinispan enhancement proposals. >> >> I remember a meeting with Manik Surtani, Jonathan Halliday and Mark >> Little, whose outcome was a general agreement that Infinispan would >> eventually need both tombstones and versioned entries, not just for >> change data capture but to improve several other aspects; >> unfortunately that was in December 2010 and never became a priority, >> but the benefits are clear. >> The complexities which have put off such plans lie in the "garbage >> collection", aka the need to not grow the history without bounds, and >> have to drop or compact history. >> >> So I'm definitely sold on the need to add a certain amount of history, >> but we need to define how much of this history is expected to be held. >> >> In short, what's the ultimate goal? 
I see two main but different >> options intertwined: >> - allow to synchronize the *final state* of a replica >> - inspect specific changes >> >> For the first case, it would be enough for us to be able to provide a >> "squashed history" (as in Git squash), but we'd need to keep versioned >> shapshots around and someone needs to tell you which ones can be >> garbage collected. >> For example when a key is: written, updated, updated, deleted since >> the snapshot, we'll send only "deleted" as the intermediary states are >> irrelevant. >> For the second case, say the goal is to inspect fluctuations of price >> variations of some item, then the intermediary states are not >> irrelevant. >> >> Which one will we want to solve? Both? >> Personally the attempt of solving the second one seems like a huge >> pivot of the project, the current data-structures and storage are not >> designed for this. I see the value of such benefits, but maybe >> Infinispan is not the right tool for such a problem. >> >> I'd prefer to focus on the benefits of the squashed history, and have >> versioned entries soon, but even in that case we need to define which >> versions need to be kept around, and how garbage collection / >> vacuuming is handled. >> This can be designed to be transparent to the client: handled as an >> internal implementation detail which we use to improve performance of >> Infinispan itself, or it can be exposed to clients to implement change >> data capture, but in this case we need to track which clients are >> still going to need an older snapshot; this has an impact as clients >> would need to be registered, and has a significant impact on the >> storage strategies. >> >> Within Kafka the log compaction strategies are configurable; I have no >> experience with Kafka but the docs seem to suggest that it's most >> often used to provide the last known value of each key. That would be >> doable for us, but Kafka also does allow optionally for wider scope >> retention strategies: can we agree that that would not be an option >> with Infinispan? If not, these goals need to be clarified. >> >> My main concern is that if we don't limit the scope of which >> capabilities we want Infinispan to provide, it risks to become the >> same thing as Kafka, rather than integrating with it. I don't think we >> want to pivot all our storage design into efficiently treating large >> scale logs. >> >> In short, I'd like to see an agreement that analyzing e.g. >> fluctuations in stock prices would be a non-goal, if these are stored >> as {"stock name", value} key/value pairs. One could still implement >> such a thing by using a more sophisticated model, just don't expect to >> be able to see all intermediary values each entry has ever had since >> the key was first used. >> >> # Commenting on specific proposals >> >> On ID generation: I'd definitely go with IDs per segment rather than >> IDs per key for the purpose of change data capture. If you go with >> independent IDs per key, the client would need to keep track of each >> version of each entry, which has an high overhead and degree of >> complexity for the clients. >> On the other hand, we already guarantee that each segment is managed >> by a single primary owner, so attaching the "segment transaction id" >> to each internal entry being changed can be implemented efficiently by >> Infinispan. 
>> Segment ownership handoff needs to be highly consistent during cluster >> topology changes, but that requirement already exists; we'd just need >> to make sure that this monotonic counter is included during the >> handoff of the responsibility as primary owner of a segment. >> >> Thanks, >> >> Sanne >> >> >> >> >> >> On 12 December 2016 at 13:58, Gustavo Fernandes wrote: >>> >>> >>> On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: >>>> >>>> But introducing globally monotonous counter means that >>>> there will be a single contention point. >>> >>> >>> I wonder if the trade-off of Flake Ids [1] could be acceptable for this use >>> case. >>> >>> [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html >>> >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rhauch at redhat.com Thu Dec 15 11:02:23 2016 From: rhauch at redhat.com (Randall Hauch) Date: Thu, 15 Dec 2016 10:02:23 -0600 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <72E2FB84-357C-45DC-ACB6-C18F0BF0464E@redhat.com> > On Dec 15, 2016, at 9:09 AM, Sanne Grinovero wrote: > > Thanks Randall, > those clarifications have been great. > > Emmanuel: some of your statements conflict with Randall's > clarifications and with the feasibility points I've been pointing at. > You say "collect *all* changes". I've been questioning that Infinispan > can not keep *all* changes around for a given single key; I understand > we'd allow clients to retrieve streams of changes persisted into > Kafka, but we need to be clear that we won't be handling *all* changes > to Kafka (nor to Debezium), so the magic these can do is somewhat > limited. They can certainly expand on the capabilities that Infinispan > would provide on its own, but some of the use cases which Gustavo > mentioned would not be suitable. I?m not sure we were saying conflicting things. I was saying what is possible: Debezium would capture whatever it can from Infinispan via a client listener API and record it as a stream of change events. I think Emmanuel was arguing that the stream will be (far?) more useful to a wider range of consumers if it has every change made by Infinispan, compared to a stream that contains only some of the changes made in Infinispan. > > I don't think this is a big problem in practice though; take the > example of monitoring fluctuations of value of some stock symbol for > example: it wouldn't be possible to investigate derivative numbers > from these fluctuations just from the Key/Value pair "stock name" / > "value", however people can store such events in a different way, for > example by using a composite key "stock name" + "timestamp". People > just need clarity on how this works, including us to model the storage > appropriately. 
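To make that modelling suggestion concrete, a rough sketch of the composite-key variant (cache name, symbol and price are purely illustrative): each observation is written under its own key, so intermediate values are first-class entries rather than overwrites of a single {symbol -> value} pair.

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;

public class PriceTickSketch {
   public static void main(String[] args) {
      Cache<String, Double> ticks = new DefaultCacheManager().getCache("price-ticks");

      // One entry per tick: the timestamp is part of the key, so successive
      // values for the same symbol never replace each other.
      long now = System.currentTimeMillis();
      ticks.put("RHT:" + now, 68.42);

      // A plain {symbol -> latest price} cache can still be kept alongside
      // for the "current state" view.
   }
}

The point is that change data capture then only has to guarantee "you will eventually see every entry", not "you will see every intermediate value of every entry".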
> > Thanks, > Sanne > > > On 15 December 2016 at 09:54, Emmanuel Bernard wrote: >> The goal is as followed: allow to collect all changes to push them to Debezium and thus Kafka. >> >> This need does not require to remember all changes since the beginning of time in Infinispan. Just enough to: >> - let Kafka catchup assuming it is the bottleneck >> - let us not lose a change in Kafka when it happened in Infinispan (coordinator, owner, replicas dying) >> >> The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself. >> >> Check my email on this tread from Dec 9th. >> >>> On 12 Dec 2016, at 16:13, Sanne Grinovero wrote: >>> >>> I'm reading many clever suggestions for various aspects of such a >>> system, but I fail to see a clear definition of the goal. >>> >>>> From Randall's opening email I understand how MySQL does this, but >>> it's an example and I'm not sure which aspects are implementation >>> details of how MySQL happens to accomplish this, and which aspects are >>> requirements for the Infinispan enhancement proposals. >>> >>> I remember a meeting with Manik Surtani, Jonathan Halliday and Mark >>> Little, whose outcome was a general agreement that Infinispan would >>> eventually need both tombstones and versioned entries, not just for >>> change data capture but to improve several other aspects; >>> unfortunately that was in December 2010 and never became a priority, >>> but the benefits are clear. >>> The complexities which have put off such plans lie in the "garbage >>> collection", aka the need to not grow the history without bounds, and >>> have to drop or compact history. >>> >>> So I'm definitely sold on the need to add a certain amount of history, >>> but we need to define how much of this history is expected to be held. >>> >>> In short, what's the ultimate goal? I see two main but different >>> options intertwined: >>> - allow to synchronize the *final state* of a replica >>> - inspect specific changes >>> >>> For the first case, it would be enough for us to be able to provide a >>> "squashed history" (as in Git squash), but we'd need to keep versioned >>> shapshots around and someone needs to tell you which ones can be >>> garbage collected. >>> For example when a key is: written, updated, updated, deleted since >>> the snapshot, we'll send only "deleted" as the intermediary states are >>> irrelevant. >>> For the second case, say the goal is to inspect fluctuations of price >>> variations of some item, then the intermediary states are not >>> irrelevant. >>> >>> Which one will we want to solve? Both? >>> Personally the attempt of solving the second one seems like a huge >>> pivot of the project, the current data-structures and storage are not >>> designed for this. I see the value of such benefits, but maybe >>> Infinispan is not the right tool for such a problem. >>> >>> I'd prefer to focus on the benefits of the squashed history, and have >>> versioned entries soon, but even in that case we need to define which >>> versions need to be kept around, and how garbage collection / >>> vacuuming is handled. 
>>> This can be designed to be transparent to the client: handled as an >>> internal implementation detail which we use to improve performance of >>> Infinispan itself, or it can be exposed to clients to implement change >>> data capture, but in this case we need to track which clients are >>> still going to need an older snapshot; this has an impact as clients >>> would need to be registered, and has a significant impact on the >>> storage strategies. >>> >>> Within Kafka the log compaction strategies are configurable; I have no >>> experience with Kafka but the docs seem to suggest that it's most >>> often used to provide the last known value of each key. That would be >>> doable for us, but Kafka also does allow optionally for wider scope >>> retention strategies: can we agree that that would not be an option >>> with Infinispan? If not, these goals need to be clarified. >>> >>> My main concern is that if we don't limit the scope of which >>> capabilities we want Infinispan to provide, it risks to become the >>> same thing as Kafka, rather than integrating with it. I don't think we >>> want to pivot all our storage design into efficiently treating large >>> scale logs. >>> >>> In short, I'd like to see an agreement that analyzing e.g. >>> fluctuations in stock prices would be a non-goal, if these are stored >>> as {"stock name", value} key/value pairs. One could still implement >>> such a thing by using a more sophisticated model, just don't expect to >>> be able to see all intermediary values each entry has ever had since >>> the key was first used. >>> >>> # Commenting on specific proposals >>> >>> On ID generation: I'd definitely go with IDs per segment rather than >>> IDs per key for the purpose of change data capture. If you go with >>> independent IDs per key, the client would need to keep track of each >>> version of each entry, which has an high overhead and degree of >>> complexity for the clients. >>> On the other hand, we already guarantee that each segment is managed >>> by a single primary owner, so attaching the "segment transaction id" >>> to each internal entry being changed can be implemented efficiently by >>> Infinispan. >>> Segment ownership handoff needs to be highly consistent during cluster >>> topology changes, but that requirement already exists; we'd just need >>> to make sure that this monotonic counter is included during the >>> handoff of the responsibility as primary owner of a segment. >>> >>> Thanks, >>> >>> Sanne >>> >>> >>> >>> >>> >>> On 12 December 2016 at 13:58, Gustavo Fernandes wrote: >>>> >>>> >>>> On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa wrote: >>>>> >>>>> But introducing globally monotonous counter means that >>>>> there will be a single contention point. >>>> >>>> >>>> I wonder if the trade-off of Flake Ids [1] could be acceptable for this use >>>> case. 
>>>> >>>> [1] http://yellerapp.com/posts/2015-02-09-flake-ids.html >>>> >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From emmanuel at hibernate.org Thu Dec 15 12:15:49 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Thu, 15 Dec 2016 18:15:49 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <8725EAA0-CE2F-4628-A548-4A3FFF748959@hibernate.org> > On 15 Dec 2016, at 15:59, Gustavo Fernandes wrote: > > > > On Thu, Dec 15, 2016 at 2:53 PM, Emmanuel Bernard > wrote: > >> On 15 Dec 2016, at 11:18, Gustavo Fernandes > wrote: >> >> On Thu, Dec 15, 2016 at 9:54 AM, Emmanuel Bernard > wrote: >> The goal is as followed: allow to collect all changes to push them to Debezium and thus Kafka. >> >> This need does not require to remember all changes since the beginning of time in Infinispan. Just enough to: >> - let Kafka catchup assuming it is the bottleneck >> - let us not lose a change in Kafka when it happened in Infinispan (coordinator, owner, replicas dying) >> >> The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself. >> >> >> Having an embedded Debezium connector pushing everything to Kafka sounds cool, but what impact would it bring to the other stream consumers: >> >> * Remote listeners, which is supported in several clients apart from Java >> * Continuous Queries (the same) >> * Spark Stream >> * Other eventual 3rd party stream processors: Apache Flick, Storm, etc. >> >> > > Impact as in perf impact? Potential redesign impact? Or are you thinking of another question? > > > You mentioned that "The ability to read back history would then be handled by the Debezium / Kafka tail, not infinispan itself", my question > was how the other consumers would get access to that history. Yes that?s an interesting point. First off here we are describing an ad-hoc model where we push changes to Debezium and then Kafka. But the underlying temp queue mechanism I described on the Dec 9th email might be used to harden the code pushing changes to the sources you describe and that even improve the continuous queries engine and the Spark DStream integration I suppose. Maybe we want a more generic mechanism relying on that temp queue system to plug a list of consumers. And focus on Spark Stream, Continuous queries and Debezium as a first set of ?clients?. For the ability to read back in history, I am happy to force consumers to go through a Kafka queue. As others pointed out, if we make Infinispan a durable queue system, we are making a different Infinispan than what it is today and this is probably undesirable. 
Emmanuel -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161215/4d2a5bc2/attachment.html From emmanuel at hibernate.org Thu Dec 15 12:47:19 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Thu, 15 Dec 2016 18:47:19 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> Message-ID: <380D3CAD-2AA2-4B2E-ACE8-2225ECD32E25@hibernate.org> > On 15 Dec 2016, at 16:09, Sanne Grinovero wrote: > > Thanks Randall, > those clarifications have been great. > > Emmanuel: some of your statements conflict with Randall's > clarifications and with the feasibility points I've been pointing at. > You say "collect *all* changes". I've been questioning that Infinispan > can not keep *all* changes around for a given single key; Yes I want a stronger contract than what Randall said we could do at minimum. But again, it?s not Infinispan keeping all changes around. It Infinispan keeping changes around long enough for an external system to collect them. My email from Dec 9th offers a back-pressure mechanism. I rephrased it in my reply to Gustavo today, so hopefully that clears things up. > I understand > we'd allow clients to retrieve streams of changes persisted into > Kafka, but we need to be clear that we won't be handling *all* changes > to Kafka (nor to Debezium), Why not all changes again? Remember Debezium will be embedded so it will only miss a change iif the node crashes or does not see the change itself. And that?s where the temp queue and back-pressure system on the replicas come into play. > so the magic these can do is somewhat > limited. They can certainly expand on the capabilities that Infinispan > would provide on its own, but some of the use cases which Gustavo > mentioned would not be suitable. OK, I?m catching up and I?ve just read Gustavo?s proposal of https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal and I understand better some of the comments in the thread. So the Emmanuel, Dec 9th proposal does not address all of the use cases Gustavo had in mind. I?m neutral to the native event store idea, I had assumed that was a no-go from a team choice PoV. Gustavo describes the necessary back-pressure mechanism that all event store consumers would have to use. It feels like a longer shot and I?m concerned about the ability to cap the event store size. In the Emmanuel, Dec 9th proposal, at least the caping is addressed and it is expected that the in-memory structure will remain small. If we?re aiming at use cases with cap-less stores, then I think Kafka is a better long term storage than what Infinispan could offer. 
Emmanuel From gustavo at infinispan.org Thu Dec 15 13:19:45 2016 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Thu, 15 Dec 2016 18:19:45 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <380D3CAD-2AA2-4B2E-ACE8-2225ECD32E25@hibernate.org> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <380D3CAD-2AA2-4B2E-ACE8-2225ECD32E25@hibernate.org> Message-ID: On Thu, Dec 15, 2016 at 5:47 PM, Emmanuel Bernard wrote: > > > On 15 Dec 2016, at 16:09, Sanne Grinovero wrote: > > > > Thanks Randall, > > those clarifications have been great. > > > > Emmanuel: some of your statements conflict with Randall's > > clarifications and with the feasibility points I've been pointing at. > > You say "collect *all* changes". I've been questioning that Infinispan > > can not keep *all* changes around for a given single key; > > Yes I want a stronger contract than what Randall said we could do at > minimum. > But again, it?s not Infinispan keeping all changes around. It Infinispan > keeping changes around long enough for an external system to collect them. > My email from Dec 9th offers a back-pressure mechanism. > I rephrased it in my reply to Gustavo today, so hopefully that clears > things up. > > > I understand > > we'd allow clients to retrieve streams of changes persisted into > > Kafka, but we need to be clear that we won't be handling *all* changes > > to Kafka (nor to Debezium), > > Why not all changes again? Remember Debezium will be embedded so it will > only miss a change iif the node crashes or does not see the change itself. > And that?s where the temp queue and back-pressure system on the replicas > come into play. > > > so the magic these can do is somewhat > > limited. They can certainly expand on the capabilities that Infinispan > > would provide on its own, but some of the use cases which Gustavo > > mentioned would not be suitable. > > OK, I?m catching up and I?ve just read Gustavo?s proposal of > https://github.com/infinispan/infinispan/wiki/Remote- > Listeners-improvement-proposal and I understand better some of the > comments in the thread. > > So the Emmanuel, Dec 9th proposal does not address all of the use cases > Gustavo had in mind. I?m neutral to the native event store idea, I had > assumed that was a no-go from a team choice PoV. > Gustavo describes the necessary back-pressure mechanism that all event > store consumers would have to use. It feels like a longer shot and I?m > concerned about the ability to cap the event store size. In the Emmanuel, > Dec 9th proposal, at least the caping is addressed and it is expected that > the in-memory structure will remain small. If we?re aiming at use cases > with cap-less stores, then I think Kafka is a better long term storage than > what Infinispan could offer. > I mentioned "off-heap *bounded* event storage" in the proposal. Whether this is backed by a cache or a Lucene index, up for discussion. The idea is basically what you are describing: the native event storage time span is configurable. Even 30min worth of events would improve things: a client that disconnects for 2 seconds and comes back later would not need to either 1) loose events that happened in between or 2) get the whole data of the cache again. 
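Purely to illustrate the "bounded by time span" idea (a sketch only; the event shape, cache name and sequence counter are made up): if the store were itself cache-backed, the retention window could simply be the entries' lifespan.

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;

public class BoundedEventStoreSketch {

   // Hypothetical event shape; a real one would carry operation type, value, version, etc.
   public static class ChangeEvent {
      public final String key;
      public final String operation;
      public ChangeEvent(String key, String operation) {
         this.key = key;
         this.operation = operation;
      }
   }

   private static final AtomicLong SEQUENCE = new AtomicLong();

   public static void main(String[] args) {
      Cache<Long, ChangeEvent> events =
            new DefaultCacheManager().getCache("___event_store");

      // Each captured change is kept for 30 minutes and then expired, so the
      // store is capped by time span rather than by entry count.
      events.put(SEQUENCE.incrementAndGet(),
            new ChangeEvent("stock:RHT", "MODIFIED"),
            30, TimeUnit.MINUTES);
   }
}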
For a system that needs *all* the events from Infinispan since the dawn of time, the recommendation would be to plug something like Debezium that consumes ISPN data regularly (and retry when disconnected), and save the events to a more suitable long-term data backend, a time series database. > Emmanuel > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161215/cbbdac6d/attachment-0001.html From ttarrant at redhat.com Fri Dec 16 02:46:45 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 16 Dec 2016 08:46:45 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <20161209172556.GD48509@hibernate.org> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> Message-ID: <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> On 09/12/16 18:25, Emmanuel Bernard wrote: > The total order would not be global but per key. > Each node has a Debezium connector instance embedded that listens to the > operations happening (primary and replicas alike). > All of this process is happening async compared to the operation. > Per key, a log of operations is kept in memory (it contains the key, the > operation, the operation unique id and a ack status. > If on the key owner, the operation is written by the Debezium connector > to Kafka when it has been acked (whatever that means is where I'm less > knowledgable - too many bi-cache, tri-cache and quadri latency mixed in > my brain). > On a replica, the kafka partition is read regularly to clear the > in-memory log from operations stored in Kafka > If the replica becomes the owner, it reads the kafka partition to see Yes, the above design is what sprung to mind initially. Not sure about the need of keeping the log in memory, as we would probably need some form of persistent log for cache shutdown. Since this looks a lot like the append-log of the Artemis journal, maybe we could use that. One thing I missed, is whether Debezium is interested in transaction boundaries. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From slaskawi at redhat.com Fri Dec 16 03:22:36 2016 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Fri, 16 Dec 2016 09:22:36 +0100 Subject: [infinispan-dev] Infinispan Managment Console versioning and releases Message-ID: Hey guys, A while ago was been talking with Ryan and Tristan about automated releases for Infinispan Management Console. I would like to send the main point for wider audience. Long story short, we were considering different versioning schemes, such as X.Y.Z.SHA1 or using Z as an auto-increment counter for console releases. The main problem we were trying to solve was how to release the management console more often. I would like to propose different approach - Let's stick with a standard versioning (X.Y.Z.[Alpha|Beta|Fina] for releases and X.Y.Z-SNAPSHOT for ongoing work). Then we need to embed SHA1 into the MANIFEST.MF to increase tracability (in other words, here I have an Infinispan build and I need to know which SHA1 was used to build the console). SNAPSHOTs will be pushed into JBoss Repository [1] after each commit. 
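One possible way to get the SHA1 into MANIFEST.MF, sketched below and open to discussion (the plugin wiring and the manifest entry name are assumptions, not something already in the build, and it assumes the module's pom declares its <scm> connection): buildnumber-maven-plugin resolves the git revision into ${buildNumber}, and the jar plugin writes it into the manifest.

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>buildnumber-maven-plugin</artifactId>
  <executions>
    <execution>
      <phase>validate</phase>
      <goals>
        <goal>create</goal>
      </goals>
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifestEntries>
        <!-- ${buildNumber} is the SCM revision (the git SHA1) resolved above -->
        <Implementation-Build>${buildNumber}</Implementation-Build>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>

With something like that in place, the console jar shipped inside an Infinispan build can always be traced back to the exact commit, independently of the Maven version string.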
Infinispan master branch will have a SNAPSHOT dependency to the console. The tricky part are releases. Well, at first we need to release the console (I hope we will automate that in Team City). Then we can use the version plugin [2] to update the Infinispan source code to the latest version of the console. Finally, we can release the Infinispan. As a long-term goal, everything will happen inside a single staging repository in Nexus (but that's a long-term goal... first let get this running). If you agree to my proposal, please change the version in the console into 9.0.0-SNAPSHOT and retrigger [3] (automated builds are disabled at the moment). Next, I would kindly ask to look into the build logs [4][5] and give me a hint how to fix it. The NPM plugin is failing with some weird error. Once we are done with that, I will configure a Pull Request builder and release job. Thanks Sebastian [1] https://repository.jboss.org/nexus/content/repositories/snapshots/ [2] http://www.mojohaus.org/versions-maven-plugin/ [3] http://ci.infinispan.org/viewType.html?buildTypeId=Infinispan_ManagmentConsoleMasterHotspotJdk8 [4] http://ci.infinispan.org/viewLog.html?buildId=46542&buildTypeId=Infinispan_ManagmentConsoleMasterHotspotJdk8&tab=buildLog [5] http://ci.infinispan.org/viewLog.html?buildId=46543&buildTypeId=Infinispan_ManagmentConsoleMasterHotspotJdk8&tab=buildLog -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161216/3a125f9d/attachment.html From emmanuel at hibernate.org Fri Dec 16 03:34:48 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Fri, 16 Dec 2016 09:34:48 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> Message-ID: <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> > On 16 Dec 2016, at 08:46, Tristan Tarrant wrote: > > On 09/12/16 18:25, Emmanuel Bernard wrote: >> The total order would not be global but per key. >> Each node has a Debezium connector instance embedded that listens to the >> operations happening (primary and replicas alike). >> All of this process is happening async compared to the operation. >> Per key, a log of operations is kept in memory (it contains the key, the >> operation, the operation unique id and a ack status. >> If on the key owner, the operation is written by the Debezium connector >> to Kafka when it has been acked (whatever that means is where I'm less >> knowledgable - too many bi-cache, tri-cache and quadri latency mixed in >> my brain). >> On a replica, the kafka partition is read regularly to clear the >> in-memory log from operations stored in Kafka >> If the replica becomes the owner, it reads the kafka partition to see > > Yes, the above design is what sprung to mind initially. Not sure about > the need of keeping the log in memory, as we would probably need some > form of persistent log for cache shutdown. Since this looks a lot like > the append-log of the Artemis journal, maybe we could use that. Well, when the cache is shut down, don?t we have time to empty the in-memory log? > One thing I missed, is whether Debezium is interested in transaction > boundaries. 
It?s an information we capture in the RDBMS case. So yes Debezium would be interested if available. From ttarrant at redhat.com Fri Dec 16 03:48:34 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 16 Dec 2016 09:48:34 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> Message-ID: <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> On 16/12/16 09:34, Emmanuel Bernard wrote: >> Yes, the above design is what sprung to mind initially. Not sure about >> the need of keeping the log in memory, as we would probably need some >> form of persistent log for cache shutdown. Since this looks a lot like >> the append-log of the Artemis journal, maybe we could use that. > > Well, when the cache is shut down, don?t we have time to empty the in-memory log? Cache shutdown should not be deferred because there is a backlog of events that haven't been forwarded to Debezium, so we would want to pick up from where we were when we restart the cache. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From emmanuel at hibernate.org Fri Dec 16 07:12:54 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Fri, 16 Dec 2016 13:12:54 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> Message-ID: <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> > On 16 Dec 2016, at 09:48, Tristan Tarrant wrote: > > On 16/12/16 09:34, Emmanuel Bernard wrote: >>> Yes, the above design is what sprung to mind initially. Not sure about >>> the need of keeping the log in memory, as we would probably need some >>> form of persistent log for cache shutdown. Since this looks a lot like >>> the append-log of the Artemis journal, maybe we could use that. >> >> Well, when the cache is shut down, don?t we have time to empty the in-memory log? > > Cache shutdown should not be deferred because there is a backlog of > events that haven't been forwarded to Debezium, so we would want to pick > up from where we were when we restart the cache. But you?re willing to wait for the Artemis journal finish writing? I don?t quite see the difference. 
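For reference, the per-key bookkeeping from the Dec 9th proposal is small; roughly something like the sketch below (field and class names are made up, and the operation id could just as well be the per-segment monotonic counter discussed earlier in the thread). Whether that backlog lives purely in memory or in an Artemis-style journal is exactly the open question.

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the per-key backlog: one entry per operation, pruned once the
// embedded Debezium connector knows the change has reached Kafka.
public class KeyChangeLog {

   public enum Op { CREATE, MODIFY, REMOVE }

   public static final class Entry {
      final Object key;
      final Op operation;
      final long operationId;   // could be the per-segment monotonic id
      volatile boolean acked;   // set once the change is known to be in Kafka

      Entry(Object key, Op operation, long operationId) {
         this.key = key;
         this.operation = operation;
         this.operationId = operationId;
      }
   }

   private final Deque<Entry> backlog = new ArrayDeque<>();

   public synchronized void record(Object key, Op op, long id) {
      backlog.addLast(new Entry(key, op, id));
   }

   // Drop what Kafka already has; whatever remains is the window a replica
   // must replay if it becomes the primary owner, or flush at shutdown.
   public synchronized void pruneAcked() {
      while (!backlog.isEmpty() && backlog.peekFirst().acked) {
         backlog.removeFirst();
      }
   }
}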
From ttarrant at redhat.com Fri Dec 16 07:38:07 2016 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 16 Dec 2016 13:38:07 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> Message-ID: <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> On 16/12/16 13:12, Emmanuel Bernard wrote: > >> On 16 Dec 2016, at 09:48, Tristan Tarrant wrote: >> >> On 16/12/16 09:34, Emmanuel Bernard wrote: >>>> Yes, the above design is what sprung to mind initially. Not sure about >>>> the need of keeping the log in memory, as we would probably need some >>>> form of persistent log for cache shutdown. Since this looks a lot like >>>> the append-log of the Artemis journal, maybe we could use that. >>> >>> Well, when the cache is shut down, don?t we have time to empty the in-memory log? >> >> Cache shutdown should not be deferred because there is a backlog of >> events that haven't been forwarded to Debezium, so we would want to pick >> up from where we were when we restart the cache. > > But you?re willing to wait for the Artemis journal finish writing? I don?t quite see the difference. I'm thinking about the case where Debezium is temporarily not able to collect the changes. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Fri Dec 16 08:06:26 2016 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 16 Dec 2016 13:06:26 +0000 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> Message-ID: On 16 Dec 2016 12:39, "Tristan Tarrant" wrote: On 16/12/16 13:12, Emmanuel Bernard wrote: > >> On 16 Dec 2016, at 09:48, Tristan Tarrant wrote: >> >> On 16/12/16 09:34, Emmanuel Bernard wrote: >>>> Yes, the above design is what sprung to mind initially. Not sure about >>>> the need of keeping the log in memory, as we would probably need some >>>> form of persistent log for cache shutdown. Since this looks a lot like >>>> the append-log of the Artemis journal, maybe we could use that. >>> >>> Well, when the cache is shut down, don?t we have time to empty the in-memory log? >> >> Cache shutdown should not be deferred because there is a backlog of >> events that haven't been forwarded to Debezium, so we would want to pick >> up from where we were when we restart the cache. > > But you?re willing to wait for the Artemis journal finish writing? I don?t quite see the difference. I'm thinking about the case where Debezium is temporarily not able to collect the changes. +1 That's the crucial concern. 
We can have Infinispan attempt to transmit all updates to debezium on a best effort base, but we can't guarantee to send them all. We can resume the state replication stream as Randall suggested, providing in that case a squashed view which might work great to replicate the same state. Analysis of the stream of updates though shall not be able to rely on seeing *all* events. As Emmanuel seemed to agree previously, for that use case one would need a different product like Kafka. What I'm getting at is that if we agree that this granularity of events needs *just* to replicate state, then we can take advantage of that detail in various areas of the implementation, providing significant performance optimisation opportunities. Sanne Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat _______________________________________________ infinispan-dev mailing list infinispan-dev at lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20161216/37de5e4b/attachment-0001.html From emmanuel at hibernate.org Fri Dec 16 10:25:26 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Fri, 16 Dec 2016 16:25:26 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> Message-ID: > On 16 Dec 2016, at 13:38, Tristan Tarrant wrote: > > On 16/12/16 13:12, Emmanuel Bernard wrote: >> >>> On 16 Dec 2016, at 09:48, Tristan Tarrant wrote: >>> >>> On 16/12/16 09:34, Emmanuel Bernard wrote: >>>>> Yes, the above design is what sprung to mind initially. Not sure about >>>>> the need of keeping the log in memory, as we would probably need some >>>>> form of persistent log for cache shutdown. Since this looks a lot like >>>>> the append-log of the Artemis journal, maybe we could use that. >>>> >>>> Well, when the cache is shut down, don?t we have time to empty the in-memory log? >>> >>> Cache shutdown should not be deferred because there is a backlog of >>> events that haven't been forwarded to Debezium, so we would want to pick >>> up from where we were when we restart the cache. >> >> But you?re willing to wait for the Artemis journal finish writing? I don?t quite see the difference. > > I'm thinking about the case where Debezium is temporarily not able to > collect the changes. I?m thinking about the case where Artemis is not able to collect the changes. How is that different? 
:) From emmanuel at hibernate.org Fri Dec 16 10:30:18 2016 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Fri, 16 Dec 2016 16:30:18 +0100 Subject: [infinispan-dev] Infinispan and change data capture In-Reply-To: References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> Message-ID: <878B2B6D-5BD4-4256-98E6-5083C3E662F6@hibernate.org> > On 16 Dec 2016, at 16:25, Emmanuel Bernard wrote: > > >> On 16 Dec 2016, at 13:38, Tristan Tarrant wrote: >> >> On 16/12/16 13:12, Emmanuel Bernard wrote: >>> >>>> On 16 Dec 2016, at 09:48, Tristan Tarrant wrote: >>>> >>>> On 16/12/16 09:34, Emmanuel Bernard wrote: >>>>>> Yes, the above design is what sprung to mind initially. Not sure about >>>>>> the need of keeping the log in memory, as we would probably need some >>>>>> form of persistent log for cache shutdown. Since this looks a lot like >>>>>> the append-log of the Artemis journal, maybe we could use that. >>>>> >>>>> Well, when the cache is shut down, don?t we have time to empty the in-memory log? >>>> >>>> Cache shutdown should not be deferred because there is a backlog of >>>> events that haven't been forwarded to Debezium, so we would want to pick >>>> up from where we were when we restart the cache. >>> >>> But you?re willing to wait for the Artemis journal finish writing? I don?t quite see the difference. >> >> I'm thinking about the case where Debezium is temporarily not able to >> collect the changes. > > I?m thinking about the case where Artemis is not able to collect the changes. How is that different? :) So to answer my own questions. The Emmanuel Dec 9th proposal handles I think the case of topology changes and nodes going down. For a cluster-wide shutdown with no time to flush queues, I think both Artemis and the local debezium talking to remote Kafka will be in trouble. The main difference is I imagine that your Artemis log will be local to the node being shut down. But that would mean a complex system to restart with the unflushed queue from all node that were shut down. Are we there yet? From jholusa at redhat.com Fri Dec 16 10:36:16 2016 From: jholusa at redhat.com (Jiri Holusa) Date: Fri, 16 Dec 2016 10:36:16 -0500 (EST) Subject: [infinispan-dev] Accidental comments on some PRs In-Reply-To: <824741615.5257706.1481902566388.JavaMail.zimbra@redhat.com> Message-ID: <572780618.5257761.1481902576251.JavaMail.zimbra@redhat.com> Hi, some of you might noticed comments on some PRs: "Performance tests run successfully. Link to the results: ${report_url}". I'm sorry about that, it was a misconfiguration. Please ignore them, thanks. I apologize. 
Jiri

From ttarrant at redhat.com Fri Dec 16 11:00:48 2016
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Fri, 16 Dec 2016 17:00:48 +0100
Subject: [infinispan-dev] Infinispan and change data capture
In-Reply-To: <878B2B6D-5BD4-4256-98E6-5083C3E662F6@hibernate.org>
References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> <878B2B6D-5BD4-4256-98E6-5083C3E662F6@hibernate.org>
Message-ID: <62ac5f31-3222-10fa-3795-b35501d6e56a@redhat.com>

On 16/12/16 16:30, Emmanuel Bernard wrote:
> The Emmanuel Dec 9th proposal handles, I think, the case of topology changes and nodes going down.

We need a better name :)

> For a cluster-wide shutdown with no time to flush queues, I think both Artemis and the local Debezium talking to remote Kafka will be in trouble.
>
> The main difference is I imagine that your Artemis log will be local to the node being shut down. But that would mean a complex system to restart with the unflushed queues from all nodes that were shut down. Are we there yet?

I'm specifically thinking about when the Debezium embedded in Infinispan cannot talk to Kafka for whatever reason, and the user wants to shut down.

Tristan

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From emmanuel at hibernate.org Fri Dec 16 13:29:37 2016
From: emmanuel at hibernate.org (Emmanuel Bernard)
Date: Fri, 16 Dec 2016 19:29:37 +0100
Subject: [infinispan-dev] Infinispan and change data capture
In-Reply-To: <62ac5f31-3222-10fa-3795-b35501d6e56a@redhat.com>
References: <2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com> <57835BEF.4010902@redhat.com> <8EF34011-3667-495B-8191-F2ED4286F0FA@redhat.com> <0C471C3A-EB3E-4BC8-A080-5026DB036E50@redhat.com> <20161209172556.GD48509@hibernate.org> <9b004489-0a50-107d-bbba-6d848e877210@redhat.com> <9B4C3166-6FDE-4AB3-A526-525D0D600833@hibernate.org> <7b28bbe7-2825-979b-e99f-42447e91d137@redhat.com> <66984FD0-3574-4E38-81C2-2C2BFBC7C618@hibernate.org> <60d71eb8-15dd-8036-330e-b85f4111410f@redhat.com> <878B2B6D-5BD4-4256-98E6-5083C3E662F6@hibernate.org> <62ac5f31-3222-10fa-3795-b35501d6e56a@redhat.com>
Message-ID: <7669AAF1-19C9-49F2-A382-BFD2D41A62B0@hibernate.org>

Ok,

So proposal "Emmanuel Dec 9th" is hereby rechristened Deelog. I've captured it in the wiki https://github.com/infinispan/infinispan/wiki/Deelog:-direct-integration-with-Debezium

Randall and I discussed the case where more than two owners die too quickly (assuming a distribution factor of 2). We are in relatively deep trouble because we are breaking our eventual consistency rule (between Infinispan and the state in Kafka). I've written up the few options we have explored to compensate for that.

Emmanuel

> On 16 Dec 2016, at 17:00, Tristan Tarrant wrote:
>
>
> On 16/12/16 16:30, Emmanuel Bernard wrote:
>
>> The Emmanuel Dec 9th proposal handles, I think, the case of topology changes and nodes going down.
>
> We need a better name :)
>
>> For a cluster-wide shutdown with no time to flush queues, I think both Artemis and the local Debezium talking to remote Kafka will be in trouble.
>>
>> The main difference is I imagine that your Artemis log will be local to the node being shut down.
>> But that would mean a complex system to restart with the unflushed queues from all nodes that were shut down. Are we there yet?
>
> I'm specifically thinking about when the Debezium embedded in Infinispan
> cannot talk to Kafka for whatever reason, and the user wants to shut down.
>
> Tristan
>
> --
> Tristan Tarrant
> Infinispan Lead
> JBoss, a division of Red Hat
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

From ttarrant at redhat.com Mon Dec 19 11:49:38 2016
From: ttarrant at redhat.com (Tristan Tarrant)
Date: Mon, 19 Dec 2016 17:49:38 +0100
Subject: [infinispan-dev] Weekly IRC Meeting logs 2016-12-19
Message-ID: <52937abc-3ebc-70b2-2eff-7d290e48951b@redhat.com>

Last meeting of the year!
http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2016/infinispan.2016-12-19-15.02.log.html

Tristan
--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

From mudokonman at gmail.com Mon Dec 19 16:10:29 2016
From: mudokonman at gmail.com (William Burns)
Date: Mon, 19 Dec 2016 21:10:29 +0000
Subject: [infinispan-dev] Data Container Changes Part 1
Message-ID:

Check out some of the new changes to the Data Container in Infinispan 9.0 Beta 1 [1]. Part 2 will probably come after the Holiday break.

[1] http://blog.infinispan.org/2016/12/data-container-changes-part-1.html

From rvansa at redhat.com Tue Dec 20 04:16:27 2016
From: rvansa at redhat.com (Radim Vansa)
Date: Tue, 20 Dec 2016 10:16:27 +0100
Subject: [infinispan-dev] Data Container Changes Part 1
In-Reply-To:
References:
Message-ID:

Regarding another use of Weigher in Caffeine: would it be possible to guarantee that an object with weight 0 (or negative one) is never evicted?

R.

On 12/19/2016 10:10 PM, William Burns wrote:
> Check out some of the new changes to the Data Container in Infinispan
> 9.0 Beta 1 [1]. Part 2 will probably come after the Holiday break.
>
> [1] http://blog.infinispan.org/2016/12/data-container-changes-part-1.html
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss Performance Team

From mudokonman at gmail.com Tue Dec 20 09:33:30 2016
From: mudokonman at gmail.com (William Burns)
Date: Tue, 20 Dec 2016 14:33:30 +0000
Subject: [infinispan-dev] Data Container Changes Part 1
In-Reply-To:
References:
Message-ID:

Yes, definitely!

I made sure to check when I added Caffeine [1]

I was thinking we could add that later if we really need the feature.

 - Will

[1] https://github.com/ben-manes/caffeine/blob/master/caffeine/src/main/java/com/github/benmanes/caffeine/cache/Caffeine.java#L347

On Tue, Dec 20, 2016 at 4:16 AM Radim Vansa wrote:

Regarding another use of Weigher in Caffeine: would it be possible to
guarantee that an object with weight 0 (or negative one) is never
evicted?

R.

On 12/19/2016 10:10 PM, William Burns wrote:
> Check out some of the new changes to the Data Container in Infinispan
> 9.0 Beta 1 [1]. Part 2 will probably come after the Holiday break.
>
> [1] http://blog.infinispan.org/2016/12/data-container-changes-part-1.html
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss Performance Team

_______________________________________________
infinispan-dev mailing list
infinispan-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

From rvansa at redhat.com Tue Dec 20 11:10:57 2016
From: rvansa at redhat.com (Radim Vansa)
Date: Tue, 20 Dec 2016 17:10:57 +0100
Subject: [infinispan-dev] Data Container Changes Part 1
In-Reply-To:
References:
Message-ID: <0be1da11-1129-1f48-ac13-cb466fadcbcc@redhat.com>

Perfect! This would reduce the only limitation of dist/replicated caches in 2LC, while I am convinced that these have much better performance (there have been some changes recently, so I have to re-run the perf tests).

So, how should that feature be exposed to users?

1) Annotating the value class with @NonEvictable or @Evictable(false)
   - What if the user can't change the class definition? That would require us to provide an alternative way as well (listing classes in configuration)
   - Should we set this fixed for a value class? Inheritance?
   - Should we also support @NonEvictable keys?

2) Adding Flag.NON_EVICTABLE to the write
   - Flags are a bit controversial; we shouldn't add more user-facing flags
   - We need a Param for functional commands as well

3) Other ways? Should we create another type of CacheEntry (-1), add a flag to Metadata, or something else?

Radim

On 12/20/2016 03:33 PM, William Burns wrote:
> Yes, definitely!
>
> I made sure to check when I added Caffeine [1]
>
> I was thinking we could add that later if we really need the feature.
>
>  - Will
>
> [1] https://github.com/ben-manes/caffeine/blob/master/caffeine/src/main/java/com/github/benmanes/caffeine/cache/Caffeine.java#L347
>
> On Tue, Dec 20, 2016 at 4:16 AM Radim Vansa wrote:
>
> Regarding another use of Weigher in Caffeine: would it be possible to
> guarantee that an object with weight 0 (or negative one) is never
> evicted?
>
> R.
>
> On 12/19/2016 10:10 PM, William Burns wrote:
> > Check out some of the new changes to the Data Container in Infinispan
> > 9.0 Beta 1 [1]. Part 2 will probably come after the Holiday break.
> >
> > [1] http://blog.infinispan.org/2016/12/data-container-changes-part-1.html
> >
> >
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Radim Vansa
> JBoss Performance Team
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Radim Vansa
JBoss Performance Team
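
To illustrate the eviction guarantee discussed in the thread above: Caffeine only considers entries with a positive weight for size-based eviction, so a weigher that returns 0 effectively pins an entry (negative weights are not permitted). The sketch below is a minimal, self-contained example of that behaviour, assuming Caffeine is on the classpath; the Value wrapper and its "pinned" field are invented purely for illustration and are not Infinispan or Caffeine API, and mapping a pinned marker to weight 0 is only one possible way an annotation, flag, Param, or Metadata bit could be wired to the container.

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class PinnedWeightExample {

    // Hypothetical value wrapper carrying a "pinned" marker; in Infinispan this role could be
    // played by an annotation on the value class, a per-write flag, or a bit in Metadata,
    // as proposed in the thread above.
    static final class Value {
        final String payload;
        final boolean pinned;

        Value(String payload, boolean pinned) {
            this.payload = payload;
            this.pinned = pinned;
        }
    }

    public static void main(String[] args) {
        // Entries whose weigher returns 0 are never considered for size-based eviction in
        // Caffeine (they can still expire or be invalidated explicitly). Weights must be
        // non-negative, so "negative one" is not an option.
        Cache<String, Value> cache = Caffeine.newBuilder()
                .maximumWeight(1_000)
                .weigher((String key, Value value) -> value.pinned ? 0 : 1)
                .build();

        cache.put("schema", new Value("schema definition", true));   // pinned: exempt from eviction
        cache.put("session-42", new Value("user session", false));   // competes for the weight budget

        System.out.println(cache.getIfPresent("schema").payload);
    }
}

Whatever user-facing shape is eventually chosen (annotation, Flag, Param, or Metadata), the container-level mechanism would be the same: the weigher consults the marker and returns 0 for entries that must never be evicted.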