From galder at redhat.com Mon Nov 2 05:29:10 2015 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Mon, 2 Nov 2015 11:29:10 +0100 Subject: [infinispan-dev] Cache entry creation and modification events In-Reply-To: References: <5B578F7D-2A56-4110-A433-646A436873E5@redhat.com>

Message-ID: <1F7AE064-CFA4-449D-9E5B-97CF786D5697@redhat.com> -- Galder Zamarre?o Infinispan, Red Hat > On 20 Oct 2015, at 15:05, Dan Berindei wrote: > > On Mon, Oct 19, 2015 at 5:34 PM, Galder Zamarre?o wrote: >> >> >> -- >> Galder Zamarre?o >> Infinispan, Red Hat >> >>> On 14 Oct 2015, at 16:43, Dan Berindei wrote: >>> >>> There is one more thing to consider: the interface for acting on the >>> previous value of an event is very clunky even today. It basically >>> requires a thread-local to keep track of the previous value from the >>> pre=true invocation to the pre=false invocation, and even that won't >>> work once the reactive interceptor stack lands (so that pre and post >>> events are triggered on different threads). >>> >>> So I'm starting to think we should go with option 1 for now, and start >>> a bigger redesign of the cache notifications for 9.0: >>> * Remove the pre=true events >>> * Add explicit event properties for the previous value/metadata >> >> Why redesign cache notifications? As I mentioned in a previous reply, I see Cache and cache notifications being phased out in favour of: JCache (and its events), ConcurrentMap and Functional Map (and its events) :|. > > How is the ConcurrentMap API different from the Cache API? Cache is ConcurrentMap + a mixed bag of other things which includes all sorts of things, from put operations that take extra parameters, to ways to retrieve the Batch container :| ConcurrentMap is what is given to use by the JDK whereas the rest of Cache API is what we want to have, or legacy stuff we've had since day 1. Where possible, we should promote users to use standard APIs, so that's ConcurrentMap (as-is, no Cache) and JCache. If the users find limitations on those, we should try to fill those gaps with Functional Map. It's true that today, using ConcurrentMap is done by getting a Cache, which IMO is really confusing in how we expose our APIs. So, yeah, if people want to use ConcurrentMap, they need Cache, but that should not be the case. > >> >> So, no need to redesign Cache notifications IMO :| > > I was assuming JCache listeners are implemented on top of the Cache > listeners, so they would need fixes in the Cache listeners to work > properly. ^ Yeah, currently that is the case. > We could implement the JCache listeners directly in 9.0 and remove the > Cache listeners, I guess. But looking at the JCache API it looks like > their listeners need the previous value for all writes, unlike the > FunctionalMap listeners, so I think there will still be a need for a > lower-level implementation. Depends. JCache API impl could decide, based on whether any listeners are attached, whether those write-only operations should become read-write operations because listeners attach demand it, as opposed for the actual function requiring to read previous value. Cheers, > >> >>> >>> Without backwards compatibility requirements, we could even add a >>> skipPreviousValue parameter to all listener annotations -- except for >>> @CacheEntryCreated, I guess -- making the new listener type >>> superfluous. >>> >>> Cheers >>> Dan >>> >>> >>> On Mon, Oct 12, 2015 at 11:02 AM, Dan Berindei wrote: >>>> On Fri, Oct 9, 2015 at 4:39 PM, William Burns wrote: >>>>> >>>>> >>>>> On Thu, Oct 8, 2015 at 12:39 PM Dan Berindei wrote: >>>>>> >>>>>> I'm not sure about including removals/invalidations/expiration, >>>>> >>>>> >>>>> Invalidations to me don't quite fit, since it should be passivated in that >>>>> case. >>>> >>>> Passivations have a different listener, I didn't include >>>> @CacheEntryPassivated here :) >>>> >>>> Perhaps invalidation doesn't fit, because it's used to remove stale >>>> entries either after a rebalance, or after a write (for L1 entries, or >>>> in invalidation mode). >>>> >>>> But then I would say expiration also doesn't fit, because the all the >>>> others are direct actions by the user, and expiration happens in the >>>> background. >>>> >>>>> >>>>>> >>>>>> because there would be no way to say "I want to be notified on >>>>>> creation and modification, but no removals". On the other hand, adding >>>>> >>>>> >>>>> We could always add a parameter to the new annotation to say if you don't >>>>> care about removals maybe? >>>> >>>> Yes, that should work. >>>> >>>>> >>>>>> >>>>>> 3 more methods delegating to the same implementation, while not >>>>>> pretty, does allow you to listen to all changes. >>>>> >>>>> >>>>> Do we need 3 methods? Yes I think it would be nice for people. >>>> >>>> I'm assuming the listener methods are checked to receive the right >>>> event types. If we allow supertypes (and make CacheEntryWrittenEvent a >>>> supertype of the others) it could be just 1 method with 4 annotations. >>>> >>>>> >>>>>> >>>>>> >>>>>> Or are you thinking that the 3 additional listeners would add >>>>>> significant overhead when clustered? >>>>> >>>>> >>>>> I was thinking it would be 1 listener. CacheNotifierImpl could raise the >>>>> new event in addition to the existing ones. >>>> >>>> Right, if we include the removals in this annotation there would be >>>> just one listener. But would it be faster than having 4 listeners, one >>>> for create/update, and one each for remove/invalidation/expiration? >>>> >>>> Cheers >>>> Dan >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Tue Nov 3 09:38:46 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Tue, 3 Nov 2015 16:38:46 +0200 Subject: [infinispan-dev] Cache entry creation and modification events In-Reply-To: <1F7AE064-CFA4-449D-9E5B-97CF786D5697@redhat.com> References: <5B578F7D-2A56-4110-A433-646A436873E5@redhat.com>

<1F7AE064-CFA4-449D-9E5B-97CF786D5697@redhat.com> Message-ID: On Mon, Nov 2, 2015 at 12:29 PM, Galder Zamarre?o wrote: > > > -- > Galder Zamarre?o > Infinispan, Red Hat > >> On 20 Oct 2015, at 15:05, Dan Berindei wrote: >> >> On Mon, Oct 19, 2015 at 5:34 PM, Galder Zamarre?o wrote: >>> >>> >>> -- >>> Galder Zamarre?o >>> Infinispan, Red Hat >>> >>>> On 14 Oct 2015, at 16:43, Dan Berindei wrote: >>>> >>>> There is one more thing to consider: the interface for acting on the >>>> previous value of an event is very clunky even today. It basically >>>> requires a thread-local to keep track of the previous value from the >>>> pre=true invocation to the pre=false invocation, and even that won't >>>> work once the reactive interceptor stack lands (so that pre and post >>>> events are triggered on different threads). >>>> >>>> So I'm starting to think we should go with option 1 for now, and start >>>> a bigger redesign of the cache notifications for 9.0: >>>> * Remove the pre=true events >>>> * Add explicit event properties for the previous value/metadata >>> >>> Why redesign cache notifications? As I mentioned in a previous reply, I see Cache and cache notifications being phased out in favour of: JCache (and its events), ConcurrentMap and Functional Map (and its events) :|. >> >> How is the ConcurrentMap API different from the Cache API? > > Cache is ConcurrentMap + a mixed bag of other things which includes all sorts of things, from put operations that take extra parameters, to ways to retrieve the Batch container :| > > ConcurrentMap is what is given to use by the JDK whereas the rest of Cache API is what we want to have, or legacy stuff we've had since day 1. > > Where possible, we should promote users to use standard APIs, so that's ConcurrentMap (as-is, no Cache) and JCache. If the users find limitations on those, we should try to fill those gaps with Functional Map. > > It's true that today, using ConcurrentMap is done by getting a Cache, which IMO is really confusing in how we expose our APIs. So, yeah, if people want to use ConcurrentMap, they need Cache, but that should not be the case. > I always thought by ConcurrentMap you meant our current API, because it's ConcurrentMap-based. I don't think we ever adviced users to get a cache instance and cast it to ConcurrentMap... If we want to deprecate the Cache interface in 9.0, then we can forget about evolving its notification API, but I figured we'd want to keep it for stuff that's not covered by the standard. E.g. JCache forces us to fetch and deserialize the previous value for listeners, even if they might only need the key; our notification API could allow the listener to declare if it needs the value or not. >> >>> >>> So, no need to redesign Cache notifications IMO :| >> >> I was assuming JCache listeners are implemented on top of the Cache >> listeners, so they would need fixes in the Cache listeners to work >> properly. > > ^ Yeah, currently that is the case. Do you have plans to change that for 9.0? > >> We could implement the JCache listeners directly in 9.0 and remove the >> Cache listeners, I guess. But looking at the JCache API it looks like >> their listeners need the previous value for all writes, unlike the >> FunctionalMap listeners, so I think there will still be a need for a >> lower-level implementation. > > Depends. JCache API impl could decide, based on whether any listeners are attached, whether those write-only operations should become read-write operations because listeners attach demand it, as opposed for the actual function requiring to read previous value. I would very much like to skip reading the previous value when it's not needed by the command and/or listeners, but I think that logic should be common to all APIs. There will definitely be applications using both the FunctionalMap's write-only operations and JCache listeners on the same underlying cache instance... Cheers Dan From ttarrant at redhat.com Mon Nov 9 08:57:24 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 9 Nov 2015 14:57:24 +0100 Subject: [infinispan-dev] Weekly IRC meeting logs 2015-11-02 Message-ID: <5640A644.9060608@redhat.com> Forgot to send this last week, apologies ! http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-11-02-15.02.log.html Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Mon Nov 9 10:30:44 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 9 Nov 2015 15:30:44 +0000 Subject: [infinispan-dev] Weekly IRC meeting logs 2015-11-02 In-Reply-To: <5640A644.9060608@redhat.com> References: <5640A644.9060608@redhat.com> Message-ID: Great to see progress on ISPN-3351 ! Thanks On 9 November 2015 at 13:57, Tristan Tarrant wrote: > Forgot to send this last week, apologies ! > > http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-11-02-15.02.log.html > > Tristan > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Tue Nov 10 09:51:19 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 10 Nov 2015 15:51:19 +0100 Subject: [infinispan-dev] Weekly IRC meeting logs 2015-11-09 Message-ID: <56420467.5050509@redhat.com> Hi all, the logs for our IRC meeting are available at http://transcripts.jboss.org/meeting/irc.freenode.org/infinispan/2015/infinispan.2015-11-09-15.00.log.html Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From mudokonman at gmail.com Tue Nov 10 11:38:36 2015 From: mudokonman at gmail.com (William Burns) Date: Tue, 10 Nov 2015 16:38:36 +0000 Subject: [infinispan-dev] Infinispan 8.1.0.Beta1 is out! Message-ID: Hello everyone, The first beta is now available of Infinispan 8.1.0. The new management console has progressed even further. We have a few screenshots to highlight this. You can see them at [1]. We also have fixed quite a few bugs and added some minor performance improvements for various features which you can find in the list of issues completed [2]. We are still planning for the 8.1.0 Final release by the end of month. Cheers! Will [1] http://blog.infinispan.org/2015/11/infinispan-810beta1.html [2] https://issues.jboss.org/secure/ReleaseNote.jspa?version=12328071&projectId=12310799 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151110/58246293/attachment.html From ttarrant at redhat.com Fri Nov 13 07:57:53 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 13 Nov 2015 13:57:53 +0100 Subject: [infinispan-dev] Event log Message-ID: <5645DE51.6060608@redhat.com> Hi guys, in order to keep track of notable events (nodes joining, leaving, rebalances, availability, task execution) and present them in the console I was thinking of using specific logging categories. These logs would be then collected by dedicated appenders on each node. The idea would then be to have some distexec tasks exposed to the management interface (console/CLI) which can query and aggregate the data. Initially I was thinking about using a local cache per-node with a SFCS, eviction, expiration and indexing and creating a log4j appender which writes to this cache, but probably a simpler lucene-based appender would suffice. WDYT ? Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From slaskawi at redhat.com Fri Nov 13 08:32:30 2015 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Fri, 13 Nov 2015 14:32:30 +0100 Subject: [infinispan-dev] Event log In-Reply-To: <5645DE51.6060608@redhat.com> References: <5645DE51.6060608@redhat.com> Message-ID: I really like the idea! Would it be possible to forward raw data (not aggregated) to the ELK stack [1]? This aspect might be important for production systems with centralized logging. [1] http://wildfly.org/news/2015/07/25/Wildfly-And-ELK/ On Fri, Nov 13, 2015 at 1:57 PM, Tristan Tarrant wrote: > Hi guys, > > in order to keep track of notable events (nodes joining, leaving, > rebalances, availability, task execution) and present them in the > console I was thinking of using specific logging categories. These logs > would be then collected by dedicated appenders on each node. The idea > would then be to have some distexec tasks exposed to the management > interface (console/CLI) which can query and aggregate the data. > Initially I was thinking about using a local cache per-node with a SFCS, > eviction, expiration and indexing and creating a log4j appender which > writes to this cache, but probably a simpler lucene-based appender would > suffice. > > WDYT ? > > Tristan > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151113/2d5378b3/attachment.html From sanne at infinispan.org Fri Nov 13 08:42:42 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 13 Nov 2015 13:42:42 +0000 Subject: [infinispan-dev] Event log In-Reply-To: References: <5645DE51.6060608@redhat.com> Message-ID: The Hibernate Search project is now working on giving end users the option to write to either - Lucene (embedded/local and optionally into Infinispan's Lucene Directory) - Elastic Search -> would give you ELK integration - [later] Solr We have a working prototype, and just yesterday some of the first steps were merged in master. This implies that if you write those events into an ad-hoc Infinispan Cache, you inherit the functionality via Infinispan Query and benefit from the decoupling and pluggability of our indexing components via a stable API. I'd suggest you just write those events into a dedicated Cache; you could index the cache by default and I'm happy to help defining an appropriate "domain model" to record such events.. Clearly indexing such a cache has benefits, but assuming you don't need some Lucene-only statistics you'd have the option for non-indexed queries and in the near future for ELK connectors. Sanne On 13 November 2015 at 13:32, Sebastian Laskawiec wrote: > I really like the idea! > > Would it be possible to forward raw data (not aggregated) to the ELK stack > [1]? > This aspect might be important for production systems with centralized > logging. > > [1] http://wildfly.org/news/2015/07/25/Wildfly-And-ELK/ > > On Fri, Nov 13, 2015 at 1:57 PM, Tristan Tarrant > wrote: >> >> Hi guys, >> >> in order to keep track of notable events (nodes joining, leaving, >> rebalances, availability, task execution) and present them in the >> console I was thinking of using specific logging categories. These logs >> would be then collected by dedicated appenders on each node. The idea >> would then be to have some distexec tasks exposed to the management >> interface (console/CLI) which can query and aggregate the data. >> Initially I was thinking about using a local cache per-node with a SFCS, >> eviction, expiration and indexing and creating a log4j appender which >> writes to this cache, but probably a simpler lucene-based appender would >> suffice. >> >> WDYT ? >> >> Tristan >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Fri Nov 13 08:50:20 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 13 Nov 2015 14:50:20 +0100 Subject: [infinispan-dev] Event log In-Reply-To: References: <5645DE51.6060608@redhat.com> Message-ID: <5645EA9C.5090809@redhat.com> On 13/11/2015 14:42, Sanne Grinovero wrote: > The Hibernate Search project is now working on giving end users the > option to write to either > - Lucene (embedded/local and optionally into Infinispan's Lucene Directory) > - Elastic Search -> would give you ELK integration > - [later] Solr Nice ! > I'd suggest you just write those events into a dedicated Cache; you > could index the cache by default and I'm happy to help defining an > appropriate "domain model" to record such events.. Ok. Obviously the cache would not be clustered: we still need to simulate clustered queries on top of all the nodes using distexec. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From gustavo at infinispan.org Fri Nov 13 08:53:39 2015 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Fri, 13 Nov 2015 13:53:39 +0000 Subject: [infinispan-dev] Event log In-Reply-To: <5645DE51.6060608@redhat.com> References: <5645DE51.6060608@redhat.com> Message-ID: On Fri, Nov 13, 2015 at 12:57 PM, Tristan Tarrant wrote: > Hi guys, > > in order to keep track of notable events (nodes joining, leaving, > rebalances, availability, task execution) and present them in the > console I was thinking of using specific logging categories. These logs > would be then collected by dedicated appenders on each node. The idea > would then be to have some distexec tasks exposed to the management > interface (console/CLI) which can query and aggregate the data. > Initially I was thinking about using a local cache per-node with a SFCS, > eviction, expiration and indexing and creating a log4j appender which > writes to this cache, but probably a simpler lucene-based appender would > suffice. > I'd favor using an indexed cache instead simple Lucene appenders, since they take care of querying, indexing and after all they expose the Lucene API if needed. Not sure about local caches, what about log retention when the node dies? Gustavo > > WDYT ? > > Tristan > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151113/f0687480/attachment-0001.html From ttarrant at redhat.com Fri Nov 13 09:11:50 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 13 Nov 2015 15:11:50 +0100 Subject: [infinispan-dev] Event log In-Reply-To: References: <5645DE51.6060608@redhat.com> Message-ID: <5645EFA6.7050604@redhat.com> On 13/11/2015 14:53, Gustavo Fernandes wrote: > Not sure about local caches, what about log retention when the node dies? I wanted a solution which can suffer splits and not adversely affect the cluster otherwise. But I'm open for counterarguments. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Fri Nov 13 09:27:54 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 13 Nov 2015 14:27:54 +0000 Subject: [infinispan-dev] Event log In-Reply-To: <5645EFA6.7050604@redhat.com> References: <5645DE51.6060608@redhat.com> <5645EFA6.7050604@redhat.com> Message-ID: On 13 November 2015 at 14:11, Tristan Tarrant wrote: > On 13/11/2015 14:53, Gustavo Fernandes wrote: >> Not sure about local caches, what about log retention when the node dies? > > I wanted a solution which can suffer splits and not adversely affect the > cluster otherwise. But I'm open for counterarguments. The indexer can be a service which lives independently from Infinispan; ELK would be one of the options, but there are others already (JGroups, JMS). But obviously you'd have a problem when the grid is fine, but there's a split between the Infinispan grid and such service.. configuring an embedded local Lucene indexer would be better. When using the JMS backend we have the option to store the logs locally and replicate them to ELK when the connection to ELK is restored; you'd need JMS for that one to work, but it should be easy to implement a similar thing using a simplified disk journal. I guess what I'm staying is that the Search back-end is pluggable, and having one which tries remote first or logs to disk otherwise should be easy (and a welcome reusable backend for other circumstances). Sanne > > Tristan > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Fri Nov 13 09:53:21 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 13 Nov 2015 16:53:21 +0200 Subject: [infinispan-dev] Event log In-Reply-To: References: <5645DE51.6060608@redhat.com> <5645EFA6.7050604@redhat.com> Message-ID: On Fri, Nov 13, 2015 at 4:27 PM, Sanne Grinovero wrote: > On 13 November 2015 at 14:11, Tristan Tarrant wrote: >> On 13/11/2015 14:53, Gustavo Fernandes wrote: >>> Not sure about local caches, what about log retention when the node dies? >> >> I wanted a solution which can suffer splits and not adversely affect the >> cluster otherwise. But I'm open for counterarguments. > > The indexer can be a service which lives independently from > Infinispan; ELK would be one of the options, but there are others > already (JGroups, JMS). > But obviously you'd have a problem when the grid is fine, but there's > a split between the Infinispan grid and such service.. > configuring an embedded local Lucene indexer would be better. > > When using the JMS backend we have the option to store the logs > locally and replicate them to ELK when the connection to ELK is > restored; you'd need JMS for that one to work, but it should be easy > to implement a similar thing using a simplified disk journal. I guess > what I'm staying is that the Search back-end is pluggable, and having > one which tries remote first or logs to disk otherwise should be easy > (and a welcome reusable backend for other circumstances). +1 to use a local cache by default, and maybe support additional (async) replication to another service. Tristan, can the users configure WildFly/Infinispan Server to use another logging service instead of Log4J2? If yes, perhaps implementing this functionality with custom appenders is not a good idea... Cheers Dan From ttarrant at redhat.com Fri Nov 13 10:28:24 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 13 Nov 2015 16:28:24 +0100 Subject: [infinispan-dev] Event log In-Reply-To: References: <5645DE51.6060608@redhat.com> <5645EFA6.7050604@redhat.com>

Message-ID: <56460198.4030305@redhat.com> On 13/11/2015 15:53, Dan Berindei wrote: > > > Tristan, can the users configure WildFly/Infinispan Server to use > another logging service instead of Log4J2? If yes, perhaps > implementing this functionality with custom appenders is not a good > idea... The WildFly logging goes through the lgoging subsystem which is built on top of jboss-logging with direct support for wrapping to log4j 1.2.x (not log4j2). You can plugin additional handlers [1], but you cannot really change the underlying implementation. I'll ping James Perkins and ask him about the details. Tristan [1] http://wildfly.org/news/2015/07/25/Wildfly-And-ELK/ -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Fri Nov 13 12:02:56 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 13 Nov 2015 17:02:56 +0000 Subject: [infinispan-dev] Event log In-Reply-To: <56460198.4030305@redhat.com> References: <5645DE51.6060608@redhat.com> <5645EFA6.7050604@redhat.com>

<56460198.4030305@redhat.com> Message-ID: On 13 November 2015 at 15:28, Tristan Tarrant wrote: > On 13/11/2015 15:53, Dan Berindei wrote: >> >> >> Tristan, can the users configure WildFly/Infinispan Server to use >> another logging service instead of Log4J2? If yes, perhaps >> implementing this functionality with custom appenders is not a good >> idea... > > The WildFly logging goes through the lgoging subsystem which is built on > top of jboss-logging with direct support for wrapping to log4j 1.2.x > (not log4j2). You can plugin additional handlers [1], but you cannot > really change the underlying implementation. I'll ping James Perkins and > ask him about the details. Why bind this to log4j(2) ? Just write a JBoss-Logging backend, it's easier and can wrap any other actual backend. I've created one to test what specific stuff would be logged in the Hibernate ORM testsuite; it's quite nice and easy as JBoss Logging picks up extension points from classpath via the ServiceLoader pattern. Although, I'm surprised that you want to use an actual logging category. The API for that is awful for rich events; I do realize that "text" is a rather standard API, and stuff like ELK are great because they can analyze it, but their goal is to consume any Log format, while you are producing events so you can choose for a richer model. Structure entities, annotate them and dump them in a Cache. You'll get benefits like fields being separated already without having to regex them on the consumer side, and things like numbers being mapped correctly into optimal-for-range queries. It's always possible to transform to text at the end if you want to.. just add a toString()-like contract to your same entity so we could optionally dump them to a standard appender as well, but don't make it your primary API for logging rich events. Sanne > > Tristan > > > [1] http://wildfly.org/news/2015/07/25/Wildfly-And-ELK/ > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From emmanuel at hibernate.org Mon Nov 16 09:04:42 2015 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Mon, 16 Nov 2015 15:04:42 +0100 Subject: [infinispan-dev] Event log In-Reply-To: References: <5645DE51.6060608@redhat.com> <5645EFA6.7050604@redhat.com>

<56460198.4030305@redhat.com> Message-ID: <03F0D63A-DAC5-4D35-8636-8237E583F3B9@hibernate.org> > On 13 Nov 2015, at 18:02, Sanne Grinovero wrote: > > > Although, I'm surprised that you want to use an actual logging > category. The API for that is awful for rich events; I do realize that > "text" is a rather standard API, and stuff like ELK are great because > they can analyze it, but their goal is to consume any Log format, > while you are producing events so you can choose for a richer model. The L side of thing is all about taking the flat text and re-extract structure out of it for proper indexing. So if you can avoid losing the structure in the first place, that?s better. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151116/5c2f9357/attachment.html From emmanuel at hibernate.org Mon Nov 16 13:02:11 2015 From: emmanuel at hibernate.org (Emmanuel Bernard) Date: Mon, 16 Nov 2015 19:02:11 +0100 Subject: [infinispan-dev] Event log In-Reply-To: <03F0D63A-DAC5-4D35-8636-8237E583F3B9@hibernate.org> References: <5645DE51.6060608@redhat.com> <5645EFA6.7050604@redhat.com>

<56460198.4030305@redhat.com> <03F0D63A-DAC5-4D35-8636-8237E583F3B9@hibernate.org> Message-ID: > On 16 nov. 2015, at 15:04, Emmanuel Bernard wrote: > > >> On 13 Nov 2015, at 18:02, Sanne Grinovero wrote: >> >> >> Although, I'm surprised that you want to use an actual logging >> category. The API for that is awful for rich events; I do realize that >> "text" is a rather standard API, and stuff like ELK are great because >> they can analyze it, but their goal is to consume any Log format, >> while you are producing events so you can choose for a richer model. > > The L side of thing is all about taking the flat text and re-extract structure out of it for proper indexing. So if you can avoid losing the structure in the first place, that?s better. L in ELK -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151116/13ad1afa/attachment-0001.html From christian at sweazer.com Sun Nov 22 09:00:40 2015 From: christian at sweazer.com (Christian Beikov) Date: Sun, 22 Nov 2015 15:00:40 +0100 Subject: [infinispan-dev] Memory consumption of org.infinispan.marshall.core.JBossMarshaller Message-ID: <5651CA88.2000201@sweazer.com> Hello, In a recent heap dump analysis I found that org.infinispan.marshall.core.JBossMarshaller consumes a lot of memory(about 46 MB) that seems to be unused. This is due to PerThreadInstanceHolder having ExtendedRiverMarshaller objects that contain big IdentityIntMap objects. Some of those IdentityIntMap instances have a size of 2 million entries, but most of them have sizes of a few 100 thousands. When I look into these IdentityIntMap instances, it seems that the entries are all unused. Is that kind of memory consumption expected or does that indicate a possibly wrong configuration? I am using Infinispan 7.2.4.Final on Wildfly 9.0.1.Final. From rvansa at redhat.com Mon Nov 23 05:26:29 2015 From: rvansa at redhat.com (Radim Vansa) Date: Mon, 23 Nov 2015 11:26:29 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 Message-ID: <5652E9D5.4070609@redhat.com> Hi guys, I have noticed that even in library mode we use MurmurHash3 to find out the segment for particular key. For strings, this involves encoding into UTF-8 and computation of hashCode, instead of just reading the cached value in string. Common objects just remix the bits of hashCode. When user provides custom Equivalence with non-default hashCode, it is not used to determine the segment. I think that in library mode we should rather use Equivalence.hashCode, maybe XORed with some magic number so that there are less collisions in DataContainer. If we simply replaced the function in CH, we would break the case when user starts HR server on top of library mode, as the clients expect key location based on MurmurHash3. ATM user only has to set AnyServerEquivalence for keys in DC; we would need to detect configuration with server equivalence and set CH function to MH3, and probably also log some warning if the equivalence is set to unknown class and CH function is not specified. WDYT? Radim -- Radim Vansa JBoss Performance Team From sanne at infinispan.org Mon Nov 23 07:07:13 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 23 Nov 2015 12:07:13 +0000 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: <5652E9D5.4070609@redhat.com> References: <5652E9D5.4070609@redhat.com> Message-ID: +1 See also https://issues.jboss.org/browse/ISPN-3905 ; although I was mostly concerned on it allocating on a (very) hot path and didn't look at it in terms of compatibility modes. Rather than xorring with "magic numbers" don't you think we Equivalence implementation should be able to rule on that? On 23 November 2015 at 10:26, Radim Vansa wrote: > Hi guys, > > I have noticed that even in library mode we use MurmurHash3 to find out > the segment for particular key. For strings, this involves encoding into > UTF-8 and computation of hashCode, instead of just reading the cached > value in string. Common objects just remix the bits of hashCode. When > user provides custom Equivalence with non-default hashCode, it is not > used to determine the segment. > > I think that in library mode we should rather use Equivalence.hashCode, > maybe XORed with some magic number so that there are less collisions in > DataContainer. > > If we simply replaced the function in CH, we would break the case when > user starts HR server on top of library mode, as the clients expect key > location based on MurmurHash3. ATM user only has to set > AnyServerEquivalence for keys in DC; we would need to detect > configuration with server equivalence and set CH function to MH3, and > probably also log some warning if the equivalence is set to unknown > class and CH function is not specified. > > WDYT? > > Radim > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Mon Nov 23 08:16:04 2015 From: rvansa at redhat.com (Radim Vansa) Date: Mon, 23 Nov 2015 14:16:04 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: References: <5652E9D5.4070609@redhat.com> Message-ID: <56531194.6030109@redhat.com> On 11/23/2015 01:07 PM, Sanne Grinovero wrote: > +1 > > See also https://issues.jboss.org/browse/ISPN-3905 ; although I was > mostly concerned on it allocating on a (very) hot path and didn't look > at it in terms of compatibility modes. Yes, due to compatibility we cannot remove the UTF-8 encoding from MurmurHash3 since compatibility with clients (in other languages) depends on this as well, though, we could theoretically merge encoding and hashing into one function - UTF-8 encoder implementation looks quite simple (60 lines of code?) - could be worth it even if used only for server. However, my proposal was to remove that computation from hot-code path completely. > > Rather than xorring with "magic numbers" don't you think we > Equivalence implementation should be able to rule on that? We shouldn't require user to provide a pair of hashCode functions, I don't think that would work well in practice. Though, we could make the second function Java 8-default method (with return hashCode() ^ 0xWHATEVER), still allowing it to be overridable. Radim > > On 23 November 2015 at 10:26, Radim Vansa wrote: >> Hi guys, >> >> I have noticed that even in library mode we use MurmurHash3 to find out >> the segment for particular key. For strings, this involves encoding into >> UTF-8 and computation of hashCode, instead of just reading the cached >> value in string. Common objects just remix the bits of hashCode. When >> user provides custom Equivalence with non-default hashCode, it is not >> used to determine the segment. >> >> I think that in library mode we should rather use Equivalence.hashCode, >> maybe XORed with some magic number so that there are less collisions in >> DataContainer. >> >> If we simply replaced the function in CH, we would break the case when >> user starts HR server on top of library mode, as the clients expect key >> location based on MurmurHash3. ATM user only has to set >> AnyServerEquivalence for keys in DC; we would need to detect >> configuration with server equivalence and set CH function to MH3, and >> probably also log some warning if the equivalence is set to unknown >> class and CH function is not specified. >> >> WDYT? >> >> Radim >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From slaskawi at redhat.com Mon Nov 23 09:20:03 2015 From: slaskawi at redhat.com (Sebastian Laskawiec) Date: Mon, 23 Nov 2015 15:20:03 +0100 Subject: [infinispan-dev] How about dropping Spring 3 support? In-Reply-To: <56288A44.6020908@redhat.com> References: <5627B6DF.9090309@redhat.com> <56288A44.6020908@redhat.com> Message-ID: Done: https://github.com/infinispan/infinispan/pull/3839 Thanks Sebastian On Thu, Oct 22, 2015 at 9:03 AM, Tristan Tarrant wrote: > We have a section in the documentation about these things: > > http://infinispan.org/docs/8.1.x/upgrading/upgrading.html > > Tristan > > On 22/10/2015 07:54, Sebastian Laskawiec wrote: > > Sounds good to me. Shall I add some note to READ.ME > > file or something similar? > > > > On Wed, Oct 21, 2015 at 6:01 PM, Tristan Tarrant > > wrote: > > > > As others have said. Let's kill it gracefully. > > > > Tristan > > > > On 21/10/2015 09:51, Sebastian Laskawiec wrote: > > > Hey! > > > > > > I'm working on Spring and CDI package split (remote vs embedded) > > and I > > > would like to ask you about supporting Spring versions... > > > > > > Currently we support Spring 3 [1] and Spring 4 [2]. Spring 4 was > > release > > > a while ago (Dec 12 2013 [3]) and as far as I know it's been > widely > > > adopted in production. > > > > > > So the main question - how about dropping support for Spring 3 and > > > removing this module from ISPN distribution? > > > > > > Thanks > > > Sebastian > > > > > > [1] > > https://github.com/infinispan/infinispan/tree/master/spring/spring > > > [2] > > https://github.com/infinispan/infinispan/tree/master/spring/spring4 > > > [3] > > > > > > https://spring.io/blog/2013/12/12/announcing-spring-framework-4-0-ga-release > > > > > > > > > _______________________________________________ > > > infinispan-dev mailing list > > >infinispan-dev at lists.jboss.org infinispan-dev at lists.jboss.org> > > >https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > -- > > Tristan Tarrant > > Infinispan Lead > > JBoss, a division of Red Hat > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org infinispan-dev at lists.jboss.org> > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151123/84a11d01/attachment.html From dan.berindei at gmail.com Mon Nov 23 10:11:06 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 23 Nov 2015 16:11:06 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: <56531194.6030109@redhat.com> References: <5652E9D5.4070609@redhat.com> <56531194.6030109@redhat.com> Message-ID: On Mon, Nov 23, 2015 at 2:16 PM, Radim Vansa wrote: > On 11/23/2015 01:07 PM, Sanne Grinovero wrote: >> +1 >> >> See also https://issues.jboss.org/browse/ISPN-3905 ; although I was >> mostly concerned on it allocating on a (very) hot path and didn't look >> at it in terms of compatibility modes. > Yes, due to compatibility we cannot remove the UTF-8 encoding from > MurmurHash3 since compatibility with clients (in other languages) > depends on this as well, though, we could theoretically merge encoding > and hashing into one function - UTF-8 encoder implementation looks quite > simple (60 lines of code?) - could be worth it even if used only for > server. However, my proposal was to remove that computation from > hot-code path completely. > >> >> Rather than xorring with "magic numbers" don't you think we >> Equivalence implementation should be able to rule on that? > > We shouldn't require user to provide a pair of hashCode functions, I > don't think that would work well in practice. Though, we could make the > second function Java 8-default method (with return hashCode() ^ > 0xWHATEVER), still allowing it to be overridable. The JDK team learned long ago to use a spreader on top of user-supplied hashCode() implementations, as user-supplied hash codes are usually very clustered. In the case of strings, many times a common prefix makes up most of the key, and the hash codes of the keys are again clustered. A XOR with a magic value would definitely not help with the clustering issue, that's why java.util.HashMap doesn't use it. Note that our consistent hashes map adjacent keys to the same segment: we use hash / buckets, whereas HashMap uses hash % buckets. So we require a better spread across the hash space than HashMap does, and because of that I think we really need MurmurHash3. Still, we could change it to work on the result of Equivalence.hashCode(Object), instead of dealing with the contents of byte[] and String directly, but maintaining compatibility with old clients may not be possible. Regarding client-server divergences, I think we require compatibility mode to be enabled in order to access a cache both via HotRod and with the embedded API (because the server casts keys and values to byte[]). That means the distribution interceptor sees only the unmarshalled key, and getting the same hash code from the marshalled byte[] (on the client) and the unmarshalled Object (in the distribution interceptor) is going to be quite complex - either with a custom Object.hashCode() implementation, or with a custom Equivalence.hash(). I think the only way around this would be to change compatibility mode to store keys and values as byte[]. Cheers Dan > > Radim > >> >> On 23 November 2015 at 10:26, Radim Vansa wrote: >>> Hi guys, >>> >>> I have noticed that even in library mode we use MurmurHash3 to find out >>> the segment for particular key. For strings, this involves encoding into >>> UTF-8 and computation of hashCode, instead of just reading the cached >>> value in string. Common objects just remix the bits of hashCode. When >>> user provides custom Equivalence with non-default hashCode, it is not >>> used to determine the segment. >>> >>> I think that in library mode we should rather use Equivalence.hashCode, >>> maybe XORed with some magic number so that there are less collisions in >>> DataContainer. >>> >>> If we simply replaced the function in CH, we would break the case when >>> user starts HR server on top of library mode, as the clients expect key >>> location based on MurmurHash3. ATM user only has to set >>> AnyServerEquivalence for keys in DC; we would need to detect >>> configuration with server equivalence and set CH function to MH3, and >>> probably also log some warning if the equivalence is set to unknown >>> class and CH function is not specified. >>> >>> WDYT? >>> >>> Radim >>> >>> -- >>> Radim Vansa >>> JBoss Performance Team >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Mon Nov 23 10:26:24 2015 From: rvansa at redhat.com (Radim Vansa) Date: Mon, 23 Nov 2015 16:26:24 +0100 Subject: [infinispan-dev] Consolidating temporary per-key data Message-ID: <56533020.8090806@redhat.com> Hi again, examining some flamegraphs I've found out that recently the ExpirationInterceptor has been added, which registers ongoing write in a hashmap. So at this point we have a map for locks, map for writes used for expiration, another two key-addressed maps in L1ManagerImpl and one in L1NonTxInterceptor and maybe another maps elsewhere. This makes me think that we could spare map lookups and expensive writes by providing *single map for temporary per-key data*. A reference to the entry could be stored in the context to save the lookups. An extreme case would be to put this into DataContainer, but I think that this would prove too tricky in practice. A downside would be the loss of encapsulation (any component could theoretically access e.g. locks), but I don't find that too dramatic. WDYT? Radim -- Radim Vansa JBoss Performance Team From rvansa at redhat.com Mon Nov 23 10:59:48 2015 From: rvansa at redhat.com (Radim Vansa) Date: Mon, 23 Nov 2015 16:59:48 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: References: <5652E9D5.4070609@redhat.com> <56531194.6030109@redhat.com> Message-ID: <565337F4.9020204@redhat.com> On 11/23/2015 04:11 PM, Dan Berindei wrote: > On Mon, Nov 23, 2015 at 2:16 PM, Radim Vansa wrote: >> On 11/23/2015 01:07 PM, Sanne Grinovero wrote: >>> +1 >>> >>> See also https://issues.jboss.org/browse/ISPN-3905 ; although I was >>> mostly concerned on it allocating on a (very) hot path and didn't look >>> at it in terms of compatibility modes. >> Yes, due to compatibility we cannot remove the UTF-8 encoding from >> MurmurHash3 since compatibility with clients (in other languages) >> depends on this as well, though, we could theoretically merge encoding >> and hashing into one function - UTF-8 encoder implementation looks quite >> simple (60 lines of code?) - could be worth it even if used only for >> server. However, my proposal was to remove that computation from >> hot-code path completely. >> >>> Rather than xorring with "magic numbers" don't you think we >>> Equivalence implementation should be able to rule on that? >> We shouldn't require user to provide a pair of hashCode functions, I >> don't think that would work well in practice. Though, we could make the >> second function Java 8-default method (with return hashCode() ^ >> 0xWHATEVER), still allowing it to be overridable. > The JDK team learned long ago to use a spreader on top of > user-supplied hashCode() implementations, as user-supplied hash codes > are usually very clustered. In the case of strings, many times a > common prefix makes up most of the key, and the hash codes of the keys > are again clustered. A XOR with a magic value would definitely not > help with the clustering issue, that's why java.util.HashMap doesn't > use it. > > Note that our consistent hashes map adjacent keys to the same segment: > we use hash / buckets, whereas HashMap uses hash % buckets. So we > require a better spread across the hash space than HashMap does, and > because of that I think we really need MurmurHash3. Still, we could > change it to work on the result of Equivalence.hashCode(Object), > instead of dealing with the contents of byte[] and String directly, > but maintaining compatibility with old clients may not be possible. Okay, I don't mind using the int32-spreader, that's just a better version of XOR with a couple more bitshifts. But do we need two spreaders, one for CH and another for DC, or is it sufficient that CH will use division and DC modulo? > > Regarding client-server divergences, I think we require compatibility > mode to be enabled in order to access a cache both via HotRod and with > the embedded API (because the server casts keys and values to byte[]). > That means the distribution interceptor sees only the unmarshalled > key, and getting the same hash code from the marshalled byte[] (on the > client) and the unmarshalled Object (in the distribution interceptor) > is going to be quite complex - either with a custom Object.hashCode() > implementation, or with a custom Equivalence.hash(). I think the only > way around this would be to change compatibility mode to store keys > and values as byte[]. I was not talking about combined HR + embedded access, but HR-only access but without the server, therefore configuring & starting cache manager + HR server on their own. And I've not realized that in remote case the server deals only with byte[]s, so in non-compatibility mode we would just use o instanceof byte[] ? MH3.hash((byte[]) o) : MH3.hash(equivalence.hashCode(o)) (there is no point in letting Equivalence compute the hashcode from byte[] and then just spread it). In compatibility case, we need to keep the existing behaviour, because they expect us to do the unmarshall -> encode to UTF-8 -> MH3. But that's already a configuration switch, so no heuristics are needed. Radim > > Cheers > Dan > >> Radim >> >>> On 23 November 2015 at 10:26, Radim Vansa wrote: >>>> Hi guys, >>>> >>>> I have noticed that even in library mode we use MurmurHash3 to find out >>>> the segment for particular key. For strings, this involves encoding into >>>> UTF-8 and computation of hashCode, instead of just reading the cached >>>> value in string. Common objects just remix the bits of hashCode. When >>>> user provides custom Equivalence with non-default hashCode, it is not >>>> used to determine the segment. >>>> >>>> I think that in library mode we should rather use Equivalence.hashCode, >>>> maybe XORed with some magic number so that there are less collisions in >>>> DataContainer. >>>> >>>> If we simply replaced the function in CH, we would break the case when >>>> user starts HR server on top of library mode, as the clients expect key >>>> location based on MurmurHash3. ATM user only has to set >>>> AnyServerEquivalence for keys in DC; we would need to detect >>>> configuration with server equivalence and set CH function to MH3, and >>>> probably also log some warning if the equivalence is set to unknown >>>> class and CH function is not specified. >>>> >>>> WDYT? >>>> >>>> Radim >>>> >>>> -- >>>> Radim Vansa >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From dan.berindei at gmail.com Mon Nov 23 13:57:09 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 23 Nov 2015 19:57:09 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: <565337F4.9020204@redhat.com> References: <5652E9D5.4070609@redhat.com> <56531194.6030109@redhat.com> <565337F4.9020204@redhat.com> Message-ID: On Mon, Nov 23, 2015 at 4:59 PM, Radim Vansa wrote: > On 11/23/2015 04:11 PM, Dan Berindei wrote: >> On Mon, Nov 23, 2015 at 2:16 PM, Radim Vansa wrote: >>> On 11/23/2015 01:07 PM, Sanne Grinovero wrote: >>>> +1 >>>> >>>> See also https://issues.jboss.org/browse/ISPN-3905 ; although I was >>>> mostly concerned on it allocating on a (very) hot path and didn't look >>>> at it in terms of compatibility modes. >>> Yes, due to compatibility we cannot remove the UTF-8 encoding from >>> MurmurHash3 since compatibility with clients (in other languages) >>> depends on this as well, though, we could theoretically merge encoding >>> and hashing into one function - UTF-8 encoder implementation looks quite >>> simple (60 lines of code?) - could be worth it even if used only for >>> server. However, my proposal was to remove that computation from >>> hot-code path completely. >>> >>>> Rather than xorring with "magic numbers" don't you think we >>>> Equivalence implementation should be able to rule on that? >>> We shouldn't require user to provide a pair of hashCode functions, I >>> don't think that would work well in practice. Though, we could make the >>> second function Java 8-default method (with return hashCode() ^ >>> 0xWHATEVER), still allowing it to be overridable. >> The JDK team learned long ago to use a spreader on top of >> user-supplied hashCode() implementations, as user-supplied hash codes >> are usually very clustered. In the case of strings, many times a >> common prefix makes up most of the key, and the hash codes of the keys >> are again clustered. A XOR with a magic value would definitely not >> help with the clustering issue, that's why java.util.HashMap doesn't >> use it. >> >> Note that our consistent hashes map adjacent keys to the same segment: >> we use hash / buckets, whereas HashMap uses hash % buckets. So we >> require a better spread across the hash space than HashMap does, and >> because of that I think we really need MurmurHash3. Still, we could >> change it to work on the result of Equivalence.hashCode(Object), >> instead of dealing with the contents of byte[] and String directly, >> but maintaining compatibility with old clients may not be possible. > > Okay, I don't mind using the int32-spreader, that's just a better > version of XOR with a couple more bitshifts. But do we need two > spreaders, one for CH and another for DC, or is it sufficient that CH > will use division and DC modulo? I don't know, I've never tried it :) > >> >> Regarding client-server divergences, I think we require compatibility >> mode to be enabled in order to access a cache both via HotRod and with >> the embedded API (because the server casts keys and values to byte[]). >> That means the distribution interceptor sees only the unmarshalled >> key, and getting the same hash code from the marshalled byte[] (on the >> client) and the unmarshalled Object (in the distribution interceptor) >> is going to be quite complex - either with a custom Object.hashCode() >> implementation, or with a custom Equivalence.hash(). I think the only >> way around this would be to change compatibility mode to store keys >> and values as byte[]. > > I was not talking about combined HR + embedded access, but HR-only > access but without the server, therefore configuring & starting cache > manager + HR server on their own. Just curious, what's the use case? > And I've not realized that in remote case the server deals only with > byte[]s, so in non-compatibility mode we would just use > > o instanceof byte[] ? MH3.hash((byte[]) o) : > MH3.hash(equivalence.hashCode(o)) But if we say we don't support non-byte[] keys in a server without compatibility mode enabled (and I believe we do), we don't need the second part at all. > > (there is no point in letting Equivalence compute the hashcode from > byte[] and then just spread it). In compatibility case, we need to keep > the existing behaviour, because they expect us to do the unmarshall -> > encode to UTF-8 -> MH3. But that's already a configuration switch, so no > heuristics are needed. Looks like targetting in compatibility mode is already broken [1], so keeping the existing behaviour may not be necessary after all. [1] https://issues.jboss.org/browse/ISPN-5981 Dan From ttarrant at redhat.com Tue Nov 24 05:10:24 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Tue, 24 Nov 2015 11:10:24 +0100 Subject: [infinispan-dev] GitHub protected branches Message-ID: <56543790.5030502@redhat.com> Hi all, I have used GitHub's new(ish) feature to protect the following projects' branches from force pushes and removal: infinispan infinispan-management-console infinispan.github.io cpp-client dotnet-client infinispan-hadoop infinispan-spark infinispan-cachestore-redis Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Wed Nov 25 05:27:37 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Wed, 25 Nov 2015 10:27:37 +0000 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency Message-ID: At our last face to face meeting we were discussing about possible strategies to reduce the latency of a Put operation. I wasn't part of that conversation, but see in our notes that the idea would be to have the Originator to wait for first ACK from a backup, rather than having the backups confirm with the Owner first, although this would need for Gets to have to go to the primary owner exclusively to keep the current consistency semantics. An alternative is to wait for all ACKs, but I think this could still be optimised in "triangle shape" too by having the Originator only wait for the ACKs from the non-primary replicas? So backup owners have to send a confirmation message to the Originator, while the Primary owner isn't expecting to do so. I suspect the tricky part is what happens when the Primary owner rules +1 to apply the change, but then the backup owners (all or some of them) somehow fail before letting the Originator know. The Originator in this case should seek confirmation about its operation state (success?) with the Primary owner; this implies that the Primary owner needs to keep track of what it's applied and track failures too, and this log needs to be pruned. Sounds pretty nice, or am I missing other difficulties? Thanks, Sanne From pedro at infinispan.org Wed Nov 25 05:48:12 2015 From: pedro at infinispan.org (Pedro Ruivo) Date: Wed, 25 Nov 2015 10:48:12 +0000 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: References: Message-ID: <565591EC.4090907@infinispan.org> Comment inline. On 11/25/2015 10:27 AM, Sanne Grinovero wrote: > At our last face to face meeting we were discussing about possible > strategies to reduce the latency of a Put operation. > > I wasn't part of that conversation, but see in our notes that the idea > would be to have the Originator to wait for first ACK from a backup, > rather than having the backups confirm with the Owner first, although > this would need for Gets to have to go to the primary owner > exclusively to keep the current consistency semantics. > > An alternative is to wait for all ACKs, but I think this could still > be optimised in "triangle shape" too by having the Originator only > wait for the ACKs from the non-primary replicas? > So backup owners have to send a confirmation message to the > Originator, while the Primary owner isn't expecting to do so. IMO, we should wait for all ACKs to keep our read design. However, the Originator needs to wait for the ACK from Primary because of conditional operations and functional API. In this first case, if the conditional operation fail, the Backups are not bothered. The latter case, we may need the return value from the function. > > I suspect the tricky part is what happens when the Primary owner rules > +1 to apply the change, but then the backup owners (all or some of > them) somehow fail before letting the Originator know. The Originator > in this case should seek confirmation about its operation state > (success?) with the Primary owner; this implies that the Primary owner > needs to keep track of what it's applied and track failures too, and > this log needs to be pruned. > > Sounds pretty nice, or am I missing other difficulties? > > Thanks, > Sanne > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > From sanne at infinispan.org Wed Nov 25 06:07:09 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Wed, 25 Nov 2015 11:07:09 +0000 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <565591EC.4090907@infinispan.org> References: <565591EC.4090907@infinispan.org> Message-ID: On 25 November 2015 at 10:48, Pedro Ruivo wrote: > Comment inline. > > On 11/25/2015 10:27 AM, Sanne Grinovero wrote: >> At our last face to face meeting we were discussing about possible >> strategies to reduce the latency of a Put operation. >> >> I wasn't part of that conversation, but see in our notes that the idea >> would be to have the Originator to wait for first ACK from a backup, >> rather than having the backups confirm with the Owner first, although >> this would need for Gets to have to go to the primary owner >> exclusively to keep the current consistency semantics. >> >> An alternative is to wait for all ACKs, but I think this could still >> be optimised in "triangle shape" too by having the Originator only >> wait for the ACKs from the non-primary replicas? >> So backup owners have to send a confirmation message to the >> Originator, while the Primary owner isn't expecting to do so. > > IMO, we should wait for all ACKs to keep our read design. However, the > Originator needs to wait for the ACK from Primary because of conditional > operations and functional API. If the operation is successful, Primary will have to let the secondaries know so these can reply to the Originator directly: still saves an hop. > In this first case, if the conditional operation fail, the Backups are > not bothered. The latter case, we may need the return value from the > function. Right, for a failed or rejected operation the secondaries won't even know about it, so the Primary is in charge of letting the Originator know. Essentially you're highlighting that the Originator needs to wait for either the response from secondaries (all of them?) or from the Primary. > >> >> I suspect the tricky part is what happens when the Primary owner rules >> +1 to apply the change, but then the backup owners (all or some of >> them) somehow fail before letting the Originator know. The Originator >> in this case should seek confirmation about its operation state >> (success?) with the Primary owner; this implies that the Primary owner >> needs to keep track of what it's applied and track failures too, and >> this log needs to be pruned. >> >> Sounds pretty nice, or am I missing other difficulties? >> >> Thanks, >> Sanne >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Wed Nov 25 08:20:44 2015 From: rvansa at redhat.com (Radim Vansa) Date: Wed, 25 Nov 2015 14:20:44 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: References: <565591EC.4090907@infinispan.org> Message-ID: <5655B5AC.6070504@redhat.com> On 11/25/2015 12:07 PM, Sanne Grinovero wrote: > On 25 November 2015 at 10:48, Pedro Ruivo wrote: >> Comment inline. >> >> On 11/25/2015 10:27 AM, Sanne Grinovero wrote: >>> At our last face to face meeting we were discussing about possible >>> strategies to reduce the latency of a Put operation. >>> >>> I wasn't part of that conversation, but see in our notes that the idea >>> would be to have the Originator to wait for first ACK from a backup, >>> rather than having the backups confirm with the Owner first, although >>> this would need for Gets to have to go to the primary owner >>> exclusively to keep the current consistency semantics. >>> >>> An alternative is to wait for all ACKs, but I think this could still >>> be optimised in "triangle shape" too by having the Originator only >>> wait for the ACKs from the non-primary replicas? >>> So backup owners have to send a confirmation message to the >>> Originator, while the Primary owner isn't expecting to do so. >> IMO, we should wait for all ACKs to keep our read design. What exactly is our 'read design'? I think that the source of optimization is that once primary decides to backup the operation, he can forget about it and unlock the entry. So, we don't need any ACK from primary unless it's an exception/noop notification (as with conditional ops). If primary waited for ACK from backup, we wouldn't save anything. The gains are: * less hops (3 instead of 4 if O != P && O != B) * less messages (primary ACK is transitive based on ack from B) * shorter lock times (not locking during P -> B RPC) >> However, the >> Originator needs to wait for the ACK from Primary because of conditional >> operations and functional API. > If the operation is successful, Primary will have to let the > secondaries know so these can reply to the Originator directly: still > saves an hop. > >> In this first case, if the conditional operation fail, the Backups are >> not bothered. The latter case, we may need the return value from the >> function. > Right, for a failed or rejected operation the secondaries won't even > know about it, > so the Primary is in charge of letting the Originator know. > Essentially you're highlighting that the Originator needs to wait for > either the response from secondaries (all of them?) > or from the Primary. > >>> I suspect the tricky part is what happens when the Primary owner rules >>> +1 to apply the change, but then the backup owners (all or some of >>> them) somehow fail before letting the Originator know. The Originator >>> in this case should seek confirmation about its operation state >>> (success?) with the Primary owner; this implies that the Primary owner >>> needs to keep track of what it's applied and track failures too, and >>> this log needs to be pruned. Currently, in case of lost (timed out) ACK from B to P, we just report exception and don't care about synchronizing P and B - B can already store updated value. So we don't have to care about rollback on P if replication to B fails either - we just report that it's broken, sorry. Better consolidation API would be nice, though, something like cache.getAllVersions(). Radim >>> >>> Sounds pretty nice, or am I missing other difficulties? >>> >>> Thanks, >>> Sanne >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From pedro at infinispan.org Wed Nov 25 09:24:15 2015 From: pedro at infinispan.org (Pedro Ruivo) Date: Wed, 25 Nov 2015 14:24:15 +0000 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <5655B5AC.6070504@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> Message-ID: <5655C48F.1000804@infinispan.org> On 11/25/2015 01:20 PM, Radim Vansa wrote: > On 11/25/2015 12:07 PM, Sanne Grinovero wrote: >> On 25 November 2015 at 10:48, Pedro Ruivo wrote: >>>> >>>> An alternative is to wait for all ACKs, but I think this could still >>>> be optimised in "triangle shape" too by having the Originator only >>>> wait for the ACKs from the non-primary replicas? >>>> So backup owners have to send a confirmation message to the >>>> Originator, while the Primary owner isn't expecting to do so. >>> IMO, we should wait for all ACKs to keep our read design. > > What exactly is our 'read design'? If we don't wait for all the ACKs, then we have to go to the primary owner for reads, even if the originator is a Backup owner. > > I think that the source of optimization is that once primary decides to > backup the operation, he can forget about it and unlock the entry. So, > we don't need any ACK from primary unless it's an exception/noop > notification (as with conditional ops). If primary waited for ACK from > backup, we wouldn't save anything. About the iteration between P -> B, you're right. We don't need to wait for the ACKs if the messages are sent in FIFO (and JGroups guarantee that) About the O -> P, IMO, the Originator should wait for the reply from Backup. At least, the Primary would be the only one who needs to return the previous value (if needed) and it can return if the operation succeed or not. This way, it would avoid forking the code for each type of command without any benefit (I'm thinking sending the reply to originator in parallel with the update message to the backups). > > The gains are: > * less hops (3 instead of 4 if O != P && O != B) > * less messages (primary ACK is transitive based on ack from B) > * shorter lock times (not locking during P -> B RPC) > >>> However, the >>> Originator needs to wait for the ACK from Primary because of conditional >>> operations and functional API. >> If the operation is successful, Primary will have to let the >> secondaries know so these can reply to the Originator directly: still >> saves an hop. As I said above: "I'm thinking sending the reply to originator in parallel with the update message to the backups" >> >>> In this first case, if the conditional operation fail, the Backups are >>> not bothered. The latter case, we may need the return value from the >>> function. >> Right, for a failed or rejected operation the secondaries won't even >> know about it, >> so the Primary is in charge of letting the Originator know. >> Essentially you're highlighting that the Originator needs to wait for >> either the response from secondaries (all of them?) >> or from the Primary. >> >>>> I suspect the tricky part is what happens when the Primary owner rules >>>> +1 to apply the change, but then the backup owners (all or some of >>>> them) somehow fail before letting the Originator know. The Originator >>>> in this case should seek confirmation about its operation state >>>> (success?) with the Primary owner; this implies that the Primary owner >>>> needs to keep track of what it's applied and track failures too, and >>>> this log needs to be pruned. > > Currently, in case of lost (timed out) ACK from B to P, we just report > exception and don't care about synchronizing P and B - B can already > store updated value. > So we don't have to care about rollback on P if replication to B fails > either - we just report that it's broken, sorry. > Better consolidation API would be nice, though, something like > cache.getAllVersions(). > > Radim > > >>>> >>>> Sounds pretty nice, or am I missing other difficulties? >>>> >>>> Thanks, >>>> Sanne >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > From rvansa at redhat.com Wed Nov 25 10:15:14 2015 From: rvansa at redhat.com (Radim Vansa) Date: Wed, 25 Nov 2015 16:15:14 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <5655C48F.1000804@infinispan.org> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5655C48F.1000804@infinispan.org> Message-ID: <5655D082.8020608@redhat.com> On 11/25/2015 03:24 PM, Pedro Ruivo wrote: > > On 11/25/2015 01:20 PM, Radim Vansa wrote: >> On 11/25/2015 12:07 PM, Sanne Grinovero wrote: >>> On 25 November 2015 at 10:48, Pedro Ruivo wrote: >>>>> An alternative is to wait for all ACKs, but I think this could still >>>>> be optimised in "triangle shape" too by having the Originator only >>>>> wait for the ACKs from the non-primary replicas? >>>>> So backup owners have to send a confirmation message to the >>>>> Originator, while the Primary owner isn't expecting to do so. >>>> IMO, we should wait for all ACKs to keep our read design. >> What exactly is our 'read design'? > If we don't wait for all the ACKs, then we have to go to the primary > owner for reads, even if the originator is a Backup owner. I don't think so, but we probably have som miscom. If O = B, we still wait for reply from B (which is local) which is triggered by receiving an update from P (after applying the change locally). So it goes OB(application thread) [cache.put()] -(unordered)-> P(worker thread) [applies update] -(ordered)-> OB(worker thread) [applies update] -(in-VM)-> OB(application thread) [continues] > >> I think that the source of optimization is that once primary decides to >> backup the operation, he can forget about it and unlock the entry. So, >> we don't need any ACK from primary unless it's an exception/noop >> notification (as with conditional ops). If primary waited for ACK from >> backup, we wouldn't save anything. > About the iteration between P -> B, you're right. We don't need to wait > for the ACKs if the messages are sent in FIFO (and JGroups guarantee that) > > About the O -> P, IMO, the Originator should wait for the reply from > Backup. I was never claiming otherwise, O always needs to wait for ACK from Bs - only then it can successfully report that value has been written on all owners. What does this have to do with O -> P? > At least, the Primary would be the only one who needs to return > the previous value (if needed) and it can return if the operation > succeed or not. Simple success: no P -> O, B -> O (success) Simple failure/non-modifying operation (as with putIfAbsent/functional call): P -> O (failure/custom value), no B -> O previous/custom value (as with replace() or functional call): P -> O (previous/custom value), B -> O (success); alternative is P -> B (previous/custom value, new value) and B -> O (previous/custom value) Exception on either P or B: send the exception to O Lost/timed out P -> B: O times out waiting for ack from B, throws exception > This way, it would avoid forking the code for each type > of command without any benefit (I'm thinking sending the reply to > originator in parallel with the update message to the backups). What forking of code for each type do you mean? I see that there are two branches whether the command is going to be replicated to B or not. Radim > >> The gains are: >> * less hops (3 instead of 4 if O != P && O != B) >> * less messages (primary ACK is transitive based on ack from B) >> * shorter lock times (not locking during P -> B RPC) >> >>>> However, the >>>> Originator needs to wait for the ACK from Primary because of conditional >>>> operations and functional API. >>> If the operation is successful, Primary will have to let the >>> secondaries know so these can reply to the Originator directly: still >>> saves an hop. > As I said above: "I'm thinking sending the reply to originator in > parallel with the update message to the backups" > >>>> In this first case, if the conditional operation fail, the Backups are >>>> not bothered. The latter case, we may need the return value from the >>>> function. >>> Right, for a failed or rejected operation the secondaries won't even >>> know about it, >>> so the Primary is in charge of letting the Originator know. >>> Essentially you're highlighting that the Originator needs to wait for >>> either the response from secondaries (all of them?) >>> or from the Primary. >>> >>>>> I suspect the tricky part is what happens when the Primary owner rules >>>>> +1 to apply the change, but then the backup owners (all or some of >>>>> them) somehow fail before letting the Originator know. The Originator >>>>> in this case should seek confirmation about its operation state >>>>> (success?) with the Primary owner; this implies that the Primary owner >>>>> needs to keep track of what it's applied and track failures too, and >>>>> this log needs to be pruned. >> Currently, in case of lost (timed out) ACK from B to P, we just report >> exception and don't care about synchronizing P and B - B can already >> store updated value. >> So we don't have to care about rollback on P if replication to B fails >> either - we just report that it's broken, sorry. >> Better consolidation API would be nice, though, something like >> cache.getAllVersions(). >> >> Radim >> >> >>>>> Sounds pretty nice, or am I missing other difficulties? >>>>> >>>>> Thanks, >>>>> Sanne >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From dan.berindei at gmail.com Wed Nov 25 10:43:34 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Wed, 25 Nov 2015 17:43:34 +0200 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <5655D082.8020608@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5655C48F.1000804@infinispan.org> <5655D082.8020608@redhat.com> Message-ID: On Wed, Nov 25, 2015 at 5:15 PM, Radim Vansa wrote: > On 11/25/2015 03:24 PM, Pedro Ruivo wrote: >> >> On 11/25/2015 01:20 PM, Radim Vansa wrote: >>> On 11/25/2015 12:07 PM, Sanne Grinovero wrote: >>>> On 25 November 2015 at 10:48, Pedro Ruivo wrote: >>>>>> An alternative is to wait for all ACKs, but I think this could still >>>>>> be optimised in "triangle shape" too by having the Originator only >>>>>> wait for the ACKs from the non-primary replicas? >>>>>> So backup owners have to send a confirmation message to the >>>>>> Originator, while the Primary owner isn't expecting to do so. >>>>> IMO, we should wait for all ACKs to keep our read design. >>> What exactly is our 'read design'? >> If we don't wait for all the ACKs, then we have to go to the primary >> owner for reads, even if the originator is a Backup owner. > > I don't think so, but we probably have som miscom. If O = B, we still > wait for reply from B (which is local) which is triggered by receiving > an update from P (after applying the change locally). So it goes > > OB(application thread) [cache.put()] -(unordered)-> P(worker thread) > [applies update] -(ordered)-> OB(worker thread) [applies update] > -(in-VM)-> OB(application thread) [continues] In your example, O still has to receive a message from P with the previous value. The previous value may be included in the update sent by the primary, or it may be sent in a separate message, but O still has to receive the previous value somehow. Including the previous value in the backup update command is not necessary in general (except for FunctionalMap's commands, maybe?), so I'd rather use a separate message. > >> >>> I think that the source of optimization is that once primary decides to >>> backup the operation, he can forget about it and unlock the entry. So, >>> we don't need any ACK from primary unless it's an exception/noop >>> notification (as with conditional ops). If primary waited for ACK from >>> backup, we wouldn't save anything. >> About the iteration between P -> B, you're right. We don't need to wait >> for the ACKs if the messages are sent in FIFO (and JGroups guarantee that) >> >> About the O -> P, IMO, the Originator should wait for the reply from >> Backup. > > I was never claiming otherwise, O always needs to wait for ACK from Bs - > only then it can successfully report that value has been written on all > owners. What does this have to do with O -> P? Right, this is the thing I should have brought up during the meeting... if we only wait for the Ack from one B, then P can crash after we confirmed to the application but before all Bs have received the update message, and there will be nobody to retransmit/retry the command => inconsistency. > >> At least, the Primary would be the only one who needs to return >> the previous value (if needed) and it can return if the operation >> succeed or not. > > Simple success: no P -> O, B -> O (success) > Simple failure/non-modifying operation (as with putIfAbsent/functional > call): P -> O (failure/custom value), no B -> O > previous/custom value (as with replace() or functional call): P -> O > (previous/custom value), B -> O (success); alternative is P -> B > (previous/custom value, new value) and B -> O (previous/custom value) > Exception on either P or B: send the exception to O > Lost/timed out P -> B: O times out waiting for ack from B, throws exception > Like I said above, I would prefer it if P would send the previous value directly to O (if necessary). Otherwise yeah, I don't see any problem with O waiting for replies from P and Bs in parallel. We've talked several times about removing the replication timeout and assuming that a node will always reply in a timely manner to a command, unless it's not available. Maybe this time we'll really do it :) > >> This way, it would avoid forking the code for each type >> of command without any benefit (I'm thinking sending the reply to >> originator in parallel with the update message to the backups). > > What forking of code for each type do you mean? I see that there are two > branches whether the command is going to be replicated to B or not. I believe Pedro was talking about having P send the previous value directly to O, and so having different handling of replies on O based on whether we expect a previous value or not. I'm not that worried about it, one way to handle the difference would be to use ResponseMode.GET_ALL when the previous value is needed, and GET_NONE otherwise. Anyway, I think instead of jumping into implementation and fixing bugs as they pop up, this time it may be better to build a model and validate it first... then we can discuss changing details on the model, and checking them as well. I volunteered to do this with Knossos, we'll see how that goes (and when I'll have the time to actually work on it...) Dan > > Radim > >> >>> The gains are: >>> * less hops (3 instead of 4 if O != P && O != B) >>> * less messages (primary ACK is transitive based on ack from B) >>> * shorter lock times (not locking during P -> B RPC) >>> >>>>> However, the >>>>> Originator needs to wait for the ACK from Primary because of conditional >>>>> operations and functional API. >>>> If the operation is successful, Primary will have to let the >>>> secondaries know so these can reply to the Originator directly: still >>>> saves an hop. >> As I said above: "I'm thinking sending the reply to originator in >> parallel with the update message to the backups" >> >>>>> In this first case, if the conditional operation fail, the Backups are >>>>> not bothered. The latter case, we may need the return value from the >>>>> function. >>>> Right, for a failed or rejected operation the secondaries won't even >>>> know about it, >>>> so the Primary is in charge of letting the Originator know. >>>> Essentially you're highlighting that the Originator needs to wait for >>>> either the response from secondaries (all of them?) >>>> or from the Primary. >>>> >>>>>> I suspect the tricky part is what happens when the Primary owner rules >>>>>> +1 to apply the change, but then the backup owners (all or some of >>>>>> them) somehow fail before letting the Originator know. The Originator >>>>>> in this case should seek confirmation about its operation state >>>>>> (success?) with the Primary owner; this implies that the Primary owner >>>>>> needs to keep track of what it's applied and track failures too, and >>>>>> this log needs to be pruned. >>> Currently, in case of lost (timed out) ACK from B to P, we just report >>> exception and don't care about synchronizing P and B - B can already >>> store updated value. >>> So we don't have to care about rollback on P if replication to B fails >>> either - we just report that it's broken, sorry. >>> Better consolidation API would be nice, though, something like >>> cache.getAllVersions(). >>> >>> Radim >>> >>> >>>>>> Sounds pretty nice, or am I missing other difficulties? >>>>>> >>>>>> Thanks, >>>>>> Sanne >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Wed Nov 25 11:12:12 2015 From: rvansa at redhat.com (Radim Vansa) Date: Wed, 25 Nov 2015 17:12:12 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5655C48F.1000804@infinispan.org> <5655D082.8020608@redhat.com> Message-ID: <5655DDDC.6020507@redhat.com> On 11/25/2015 04:43 PM, Dan Berindei wrote: > On Wed, Nov 25, 2015 at 5:15 PM, Radim Vansa wrote: >> On 11/25/2015 03:24 PM, Pedro Ruivo wrote: >>> On 11/25/2015 01:20 PM, Radim Vansa wrote: >>>> On 11/25/2015 12:07 PM, Sanne Grinovero wrote: >>>>> On 25 November 2015 at 10:48, Pedro Ruivo wrote: >>>>>>> An alternative is to wait for all ACKs, but I think this could still >>>>>>> be optimised in "triangle shape" too by having the Originator only >>>>>>> wait for the ACKs from the non-primary replicas? >>>>>>> So backup owners have to send a confirmation message to the >>>>>>> Originator, while the Primary owner isn't expecting to do so. >>>>>> IMO, we should wait for all ACKs to keep our read design. >>>> What exactly is our 'read design'? >>> If we don't wait for all the ACKs, then we have to go to the primary >>> owner for reads, even if the originator is a Backup owner. >> I don't think so, but we probably have som miscom. If O = B, we still >> wait for reply from B (which is local) which is triggered by receiving >> an update from P (after applying the change locally). So it goes >> >> OB(application thread) [cache.put()] -(unordered)-> P(worker thread) >> [applies update] -(ordered)-> OB(worker thread) [applies update] >> -(in-VM)-> OB(application thread) [continues] > In your example, O still has to receive a message from P with the > previous value. The previous value may be included in the update sent > by the primary, or it may be sent in a separate message, but O still > has to receive the previous value somehow. Including the previous > value in the backup update command is not necessary in general (except > for FunctionalMap's commands, maybe?), so I'd rather use a separate > message. All right, in case that we need the previous value it really makes sense to send it to O directly. > >>>> I think that the source of optimization is that once primary decides to >>>> backup the operation, he can forget about it and unlock the entry. So, >>>> we don't need any ACK from primary unless it's an exception/noop >>>> notification (as with conditional ops). If primary waited for ACK from >>>> backup, we wouldn't save anything. >>> About the iteration between P -> B, you're right. We don't need to wait >>> for the ACKs if the messages are sent in FIFO (and JGroups guarantee that) >>> >>> About the O -> P, IMO, the Originator should wait for the reply from >>> Backup. >> I was never claiming otherwise, O always needs to wait for ACK from Bs - >> only then it can successfully report that value has been written on all >> owners. What does this have to do with O -> P? > Right, this is the thing I should have brought up during the > meeting... if we only wait for the Ack from one B, then P can crash > after we confirmed to the application but before all Bs have received > the update message, and there will be nobody to retransmit/retry the > command => inconsistency. We've been mixing the N-owners and 2-owners case here a bit, so let me clarify; anytime I've written that an ack is expected from B, I meant from all backups (but not necessarily from primary). The case with more backups also shows that when a return value other than 'true/false=applied/did not apply the update' is needed, we should send the response directly from P, because we don't want to send relay it through all Bs (or pick one 'special'). > >>> At least, the Primary would be the only one who needs to return >>> the previous value (if needed) and it can return if the operation >>> succeed or not. >> Simple success: no P -> O, B -> O (success) >> Simple failure/non-modifying operation (as with putIfAbsent/functional >> call): P -> O (failure/custom value), no B -> O >> previous/custom value (as with replace() or functional call): P -> O >> (previous/custom value), B -> O (success); alternative is P -> B >> (previous/custom value, new value) and B -> O (previous/custom value) >> Exception on either P or B: send the exception to O >> Lost/timed out P -> B: O times out waiting for ack from B, throws exception >> > Like I said above, I would prefer it if P would send the previous > value directly to O (if necessary). Otherwise yeah, I don't see any > problem with O waiting for replies from P and Bs in parallel. Agreed. > > We've talked several times about removing the replication timeout and > assuming that a node will always reply in a timely manner to a > command, unless it's not available. Maybe this time we'll really do it > :) That would make sense to me once we have true async calls implemented - then, if you want to have timeout-able operation, you would just do cache.putAsync().get(my timeout). But I don't promote async calls when these consume thread from limited threadpool. > >>> This way, it would avoid forking the code for each type >>> of command without any benefit (I'm thinking sending the reply to >>> originator in parallel with the update message to the backups). >> What forking of code for each type do you mean? I see that there are two >> branches whether the command is going to be replicated to B or not. > I believe Pedro was talking about having P send the previous value > directly to O, and so having different handling of replies on O based > on whether we expect a previous value or not. I'm not that worried > about it, one way to handle the difference would be to use > ResponseMode.GET_ALL when the previous value is needed, and GET_NONE > otherwise. If the first implementation does not support omitting simple ack P -> O, that's fine. But when designing, please don't block the path for a nice optimization. > > Anyway, I think instead of jumping into implementation and fixing bugs > as they pop up, this time it may be better to build a model and > validate it first... then we can discuss changing details on the > model, and checking them as well. I volunteered to do this with > Knossos, we'll see how that goes (and when I'll have the time to > actually work on it...) No objections :) If you get any interesting results from model checking, I am one big ear. Radim > > Dan > > >> Radim >> >>>> The gains are: >>>> * less hops (3 instead of 4 if O != P && O != B) >>>> * less messages (primary ACK is transitive based on ack from B) >>>> * shorter lock times (not locking during P -> B RPC) >>>> >>>>>> However, the >>>>>> Originator needs to wait for the ACK from Primary because of conditional >>>>>> operations and functional API. >>>>> If the operation is successful, Primary will have to let the >>>>> secondaries know so these can reply to the Originator directly: still >>>>> saves an hop. >>> As I said above: "I'm thinking sending the reply to originator in >>> parallel with the update message to the backups" >>> >>>>>> In this first case, if the conditional operation fail, the Backups are >>>>>> not bothered. The latter case, we may need the return value from the >>>>>> function. >>>>> Right, for a failed or rejected operation the secondaries won't even >>>>> know about it, >>>>> so the Primary is in charge of letting the Originator know. >>>>> Essentially you're highlighting that the Originator needs to wait for >>>>> either the response from secondaries (all of them?) >>>>> or from the Primary. >>>>> >>>>>>> I suspect the tricky part is what happens when the Primary owner rules >>>>>>> +1 to apply the change, but then the backup owners (all or some of >>>>>>> them) somehow fail before letting the Originator know. The Originator >>>>>>> in this case should seek confirmation about its operation state >>>>>>> (success?) with the Primary owner; this implies that the Primary owner >>>>>>> needs to keep track of what it's applied and track failures too, and >>>>>>> this log needs to be pruned. >>>> Currently, in case of lost (timed out) ACK from B to P, we just report >>>> exception and don't care about synchronizing P and B - B can already >>>> store updated value. >>>> So we don't have to care about rollback on P if replication to B fails >>>> either - we just report that it's broken, sorry. >>>> Better consolidation API would be nice, though, something like >>>> cache.getAllVersions(). >>>> >>>> Radim >>>> >>>> >>>>>>> Sounds pretty nice, or am I missing other difficulties? >>>>>>> >>>>>>> Thanks, >>>>>>> Sanne >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev at lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev at lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa JBoss Performance Team From ttarrant at redhat.com Thu Nov 26 02:00:46 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Thu, 26 Nov 2015 08:00:46 +0100 Subject: [infinispan-dev] Infinispan 8.1.0.CR1 (and 8.0.2.Final) Message-ID: <5656AE1E.90802@redhat.com> Dear all, we have two releases for you today: Infinispan 8.1.0.CR1 brings more refinement to the server management console, many improvements to query, statistics, management, security improvements and more. Infinispan 8.0.2.Final brings a number of stabilization bug fixes. Upgrading is highly recommended. Everything is available, as usual, from http://infinispan.org -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From galder at redhat.com Thu Nov 26 09:56:23 2015 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Thu, 26 Nov 2015 15:56:23 +0100 Subject: [infinispan-dev] Memory consumption of org.infinispan.marshall.core.JBossMarshaller In-Reply-To: <5651CA88.2000201@sweazer.com> References: <5651CA88.2000201@sweazer.com> Message-ID: <1BFA84C8-A69E-45A8-889B-5E6C7896E9AD@redhat.com> Those IdentityIntMap are caches meant to speed up serialization if the same objects or types are marshalled again. It's normal for them to be populated as marshalling operations are executed. We don't currently have a way to clear these caches. Cheers, -- Galder Zamarre?o Infinispan, Red Hat > On 22 Nov 2015, at 15:00, Christian Beikov wrote: > > Hello, > > In a recent heap dump analysis I found that > org.infinispan.marshall.core.JBossMarshaller consumes a lot of > memory(about 46 MB) that seems to be unused. > This is due to PerThreadInstanceHolder having ExtendedRiverMarshaller > objects that contain big IdentityIntMap objects. Some of those > IdentityIntMap instances have a size of 2 million entries, but most of > them have sizes of a few 100 thousands. > When I look into these IdentityIntMap instances, it seems that the > entries are all unused. > > Is that kind of memory consumption expected or does that indicate a > possibly wrong configuration? > > I am using Infinispan 7.2.4.Final on Wildfly 9.0.1.Final. > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From galder at redhat.com Thu Nov 26 11:04:51 2015 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Thu, 26 Nov 2015 17:04:51 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: References: <5652E9D5.4070609@redhat.com> <56531194.6030109@redhat.com> Message-ID: -- Galder Zamarre?o Infinispan, Red Hat > On 23 Nov 2015, at 16:11, Dan Berindei wrote: > > On Mon, Nov 23, 2015 at 2:16 PM, Radim Vansa wrote: >> On 11/23/2015 01:07 PM, Sanne Grinovero wrote: >>> +1 >>> >>> See also https://issues.jboss.org/browse/ISPN-3905 ; although I was >>> mostly concerned on it allocating on a (very) hot path and didn't look >>> at it in terms of compatibility modes. >> Yes, due to compatibility we cannot remove the UTF-8 encoding from >> MurmurHash3 since compatibility with clients (in other languages) >> depends on this as well, though, we could theoretically merge encoding >> and hashing into one function - UTF-8 encoder implementation looks quite >> simple (60 lines of code?) - could be worth it even if used only for >> server. However, my proposal was to remove that computation from >> hot-code path completely. >> >>> >>> Rather than xorring with "magic numbers" don't you think we >>> Equivalence implementation should be able to rule on that? >> >> We shouldn't require user to provide a pair of hashCode functions, I >> don't think that would work well in practice. Though, we could make the >> second function Java 8-default method (with return hashCode() ^ >> 0xWHATEVER), still allowing it to be overridable. > > The JDK team learned long ago to use a spreader on top of > user-supplied hashCode() implementations, as user-supplied hash codes > are usually very clustered. In the case of strings, many times a > common prefix makes up most of the key, and the hash codes of the keys > are again clustered. A XOR with a magic value would definitely not > help with the clustering issue, that's why java.util.HashMap doesn't > use it. > > Note that our consistent hashes map adjacent keys to the same segment: > we use hash / buckets, whereas HashMap uses hash % buckets. So we > require a better spread across the hash space than HashMap does, and > because of that I think we really need MurmurHash3. Still, we could > change it to work on the result of Equivalence.hashCode(Object), > instead of dealing with the contents of byte[] and String directly, > but maintaining compatibility with old clients may not be possible. > > Regarding client-server divergences, I think we require compatibility > mode to be enabled in order to access a cache both via HotRod and with > the embedded API (because the server casts keys and values to byte[]). > That means the distribution interceptor sees only the unmarshalled > key, and getting the same hash code from the marshalled byte[] (on the > client) and the unmarshalled Object (in the distribution interceptor) > is going to be quite complex - either with a custom Object.hashCode() > implementation, or with a custom Equivalence.hash(). I think the only > way around this would be to change compatibility mode to store keys > and values as byte[]. The current compatibility mode set up favours embedded mode since it keeps POJOs in memory, but based on feedback I've got it seems like the use case of compatibility where embedded is involve is not as common. Hence, moving to storing byte[] would be beneficial particularly for those using Hot Rod + REST compatibility modes. Adrian already requested having the possibility to do this [1] and it's currently assigned to him :) Cheers, [1] https://issues.jboss.org/browse/ISPN-3663 > > Cheers > Dan > >> >> Radim >> >>> >>> On 23 November 2015 at 10:26, Radim Vansa wrote: >>>> Hi guys, >>>> >>>> I have noticed that even in library mode we use MurmurHash3 to find out >>>> the segment for particular key. For strings, this involves encoding into >>>> UTF-8 and computation of hashCode, instead of just reading the cached >>>> value in string. Common objects just remix the bits of hashCode. When >>>> user provides custom Equivalence with non-default hashCode, it is not >>>> used to determine the segment. >>>> >>>> I think that in library mode we should rather use Equivalence.hashCode, >>>> maybe XORed with some magic number so that there are less collisions in >>>> DataContainer. >>>> >>>> If we simply replaced the function in CH, we would break the case when >>>> user starts HR server on top of library mode, as the clients expect key >>>> location based on MurmurHash3. ATM user only has to set >>>> AnyServerEquivalence for keys in DC; we would need to detect >>>> configuration with server equivalence and set CH function to MH3, and >>>> probably also log some warning if the equivalence is set to unknown >>>> class and CH function is not specified. >>>> >>>> WDYT? >>>> >>>> Radim >>>> >>>> -- >>>> Radim Vansa >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Radim Vansa >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Thu Nov 26 11:51:38 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Thu, 26 Nov 2015 17:51:38 +0100 Subject: [infinispan-dev] Equivalence vs. MurmurHash3 In-Reply-To: References: <5652E9D5.4070609@redhat.com> <56531194.6030109@redhat.com>

Message-ID: <5657389A.9000003@redhat.com> Indeed, I see HotRod + REST as far more practical, at least for Infinispan Server. For embedded servers (aka grafted-mode) obviously this is not the case. Tristan On 26/11/2015 17:04, Galder Zamarre?o wrote: > > > -- > Galder Zamarre?o > Infinispan, Red Hat > >> On 23 Nov 2015, at 16:11, Dan Berindei wrote: >> >> On Mon, Nov 23, 2015 at 2:16 PM, Radim Vansa wrote: >>> On 11/23/2015 01:07 PM, Sanne Grinovero wrote: >>>> +1 >>>> >>>> See also https://issues.jboss.org/browse/ISPN-3905 ; although I was >>>> mostly concerned on it allocating on a (very) hot path and didn't look >>>> at it in terms of compatibility modes. >>> Yes, due to compatibility we cannot remove the UTF-8 encoding from >>> MurmurHash3 since compatibility with clients (in other languages) >>> depends on this as well, though, we could theoretically merge encoding >>> and hashing into one function - UTF-8 encoder implementation looks quite >>> simple (60 lines of code?) - could be worth it even if used only for >>> server. However, my proposal was to remove that computation from >>> hot-code path completely. >>> >>>> >>>> Rather than xorring with "magic numbers" don't you think we >>>> Equivalence implementation should be able to rule on that? >>> >>> We shouldn't require user to provide a pair of hashCode functions, I >>> don't think that would work well in practice. Though, we could make the >>> second function Java 8-default method (with return hashCode() ^ >>> 0xWHATEVER), still allowing it to be overridable. >> >> The JDK team learned long ago to use a spreader on top of >> user-supplied hashCode() implementations, as user-supplied hash codes >> are usually very clustered. In the case of strings, many times a >> common prefix makes up most of the key, and the hash codes of the keys >> are again clustered. A XOR with a magic value would definitely not >> help with the clustering issue, that's why java.util.HashMap doesn't >> use it. >> >> Note that our consistent hashes map adjacent keys to the same segment: >> we use hash / buckets, whereas HashMap uses hash % buckets. So we >> require a better spread across the hash space than HashMap does, and >> because of that I think we really need MurmurHash3. Still, we could >> change it to work on the result of Equivalence.hashCode(Object), >> instead of dealing with the contents of byte[] and String directly, >> but maintaining compatibility with old clients may not be possible. >> >> Regarding client-server divergences, I think we require compatibility >> mode to be enabled in order to access a cache both via HotRod and with >> the embedded API (because the server casts keys and values to byte[]). >> That means the distribution interceptor sees only the unmarshalled >> key, and getting the same hash code from the marshalled byte[] (on the >> client) and the unmarshalled Object (in the distribution interceptor) >> is going to be quite complex - either with a custom Object.hashCode() >> implementation, or with a custom Equivalence.hash(). I think the only >> way around this would be to change compatibility mode to store keys >> and values as byte[]. > > The current compatibility mode set up favours embedded mode since it keeps POJOs in memory, but based on feedback I've got it seems like the use case of compatibility where embedded is involve is not as common. Hence, moving to storing byte[] would be beneficial particularly for those using Hot Rod + REST compatibility modes. > > Adrian already requested having the possibility to do this [1] and it's currently assigned to him :) > > Cheers, > > [1] https://issues.jboss.org/browse/ISPN-3663 > >> >> Cheers >> Dan >> >>> >>> Radim >>> >>>> >>>> On 23 November 2015 at 10:26, Radim Vansa wrote: >>>>> Hi guys, >>>>> >>>>> I have noticed that even in library mode we use MurmurHash3 to find out >>>>> the segment for particular key. For strings, this involves encoding into >>>>> UTF-8 and computation of hashCode, instead of just reading the cached >>>>> value in string. Common objects just remix the bits of hashCode. When >>>>> user provides custom Equivalence with non-default hashCode, it is not >>>>> used to determine the segment. >>>>> >>>>> I think that in library mode we should rather use Equivalence.hashCode, >>>>> maybe XORed with some magic number so that there are less collisions in >>>>> DataContainer. >>>>> >>>>> If we simply replaced the function in CH, we would break the case when >>>>> user starts HR server on top of library mode, as the clients expect key >>>>> location based on MurmurHash3. ATM user only has to set >>>>> AnyServerEquivalence for keys in DC; we would need to detect >>>>> configuration with server equivalence and set CH function to MH3, and >>>>> probably also log some warning if the equivalence is set to unknown >>>>> class and CH function is not specified. >>>>> >>>>> WDYT? >>>>> >>>>> Radim >>>>> >>>>> -- >>>>> Radim Vansa >>>>> JBoss Performance Team >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev at lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev at lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> >>> -- >>> Radim Vansa >>> JBoss Performance Team >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From christian at sweazer.com Fri Nov 27 03:26:25 2015 From: christian at sweazer.com (Christian Beikov) Date: Fri, 27 Nov 2015 09:26:25 +0100 Subject: [infinispan-dev] Memory consumption of org.infinispan.marshall.core.JBossMarshaller In-Reply-To: <1BFA84C8-A69E-45A8-889B-5E6C7896E9AD@redhat.com> References: <5651CA88.2000201@sweazer.com> <1BFA84C8-A69E-45A8-889B-5E6C7896E9AD@redhat.com> Message-ID: <565813B1.1040502@sweazer.com> Are you going to do something about this memory consumption or is there at least some kind of minimum expected memory usage you can give me? I ran into an OOMEs the other day and the cluster was unable to recover from that by restarting single nodes. The nodes couldn't synchronize because of the OOMEs. I had to (jgroups-)disconnect all nodes from the cluster and start a separate cluster which of course lead to data loss. All of this happened because of some wrong memory consumption estimations I made so in order to avoid that in the future I would like to plan better ahead. Is there any other way to avoid such a cluster death? Regards, Christian Am 26.11.2015 um 15:56 schrieb Galder Zamarre?o: > Those IdentityIntMap are caches meant to speed up serialization if the same objects or types are marshalled again. It's normal for them to be populated as marshalling operations are executed. We don't currently have a way to clear these caches. > > Cheers, > -- > Galder Zamarre?o > Infinispan, Red Hat > >> On 22 Nov 2015, at 15:00, Christian Beikov wrote: >> >> Hello, >> >> In a recent heap dump analysis I found that >> org.infinispan.marshall.core.JBossMarshaller consumes a lot of >> memory(about 46 MB) that seems to be unused. >> This is due to PerThreadInstanceHolder having ExtendedRiverMarshaller >> objects that contain big IdentityIntMap objects. Some of those >> IdentityIntMap instances have a size of 2 million entries, but most of >> them have sizes of a few 100 thousands. >> When I look into these IdentityIntMap instances, it seems that the >> entries are all unused. >> >> Is that kind of memory consumption expected or does that indicate a >> possibly wrong configuration? >> >> I am using Infinispan 7.2.4.Final on Wildfly 9.0.1.Final. >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From rvansa at redhat.com Fri Nov 27 04:28:47 2015 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 27 Nov 2015 10:28:47 +0100 Subject: [infinispan-dev] Consolidating temporary per-key data In-Reply-To: <56533020.8090806@redhat.com> References: <56533020.8090806@redhat.com> Message-ID: <5658224F.7060404@redhat.com> No thoughts at all? @wburns, could I have your view on this? Thanks Radim On 11/23/2015 04:26 PM, Radim Vansa wrote: > Hi again, > > examining some flamegraphs I've found out that recently the > ExpirationInterceptor has been added, which registers ongoing write in a > hashmap. So at this point we have a map for locks, map for writes used > for expiration, another two key-addressed maps in L1ManagerImpl and one > in L1NonTxInterceptor and maybe another maps elsewhere. > > This makes me think that we could spare map lookups and expensive writes > by providing *single map for temporary per-key data*. A reference to the > entry could be stored in the context to save the lookups. An extreme > case would be to put this into DataContainer, but I think that this > would prove too tricky in practice. > > A downside would be the loss of encapsulation (any component could > theoretically access e.g. locks), but I don't find that too dramatic. > > WDYT? > > Radim > -- Radim Vansa JBoss Performance Team From bban at redhat.com Fri Nov 27 04:34:23 2015 From: bban at redhat.com (Bela Ban) Date: Fri, 27 Nov 2015 10:34:23 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <5655B5AC.6070504@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> Message-ID: <5658239F.4050101@redhat.com> Adding to what Radim wrote (below), would the following make sense (conditions: non-TX, P != O && O != B)? The lock we acquire on P is actually used to establish an ordering for updates to the Bs. So this is very similar to SEQUENCER, expect that we have a sequencer (P) *per key*. Phase 1 ------- - O sends a PUT(x) message to P Phase 2 ------- - P adds PUT(x) to a queue and returns (freeing the up-thread) - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs (possible optimization: send updates to the same key sets as batches) - PUT(x) is applied locally and an ACK is sent back to O O times out and throws an exception if it doesn't receive the ack from P. This would reduce the current 4 phases (for the above conditions) to 2, plus the added latency of processing PUT(x) in the queue. However, we'd get rid of the put-while-holding-the-lock issue. P's updates to the Bs are FIFO ordered, therefore all we need to do is send the update down into UNICAST3 (or NAKACK2, if we use multicasts) which guarantees ordering. Subsequent updates are ordered according to send order. The updates are guaranteed to be retransmitted as long as P is alive. If P crashes before returning the ack to O, or while updating the Bs, then O will time out and throw an exception. And, yes, there can be inconsistencies, but we're talking about the non-TX case. Perhaps O could resubmit PUT(x) to the new P. I don't know how this behaves wrt rebalancing: are we flushing pending updates before installing the new CH? Thoughts? > I think that the source of optimization is that once primary decides to > backup the operation, he can forget about it and unlock the entry. So, > we don't need any ACK from primary unless it's an exception/noop > notification (as with conditional ops). If primary waited for ACK from > backup, we wouldn't save anything. -- Bela Ban, JGroups lead (http://www.jgroups.org) From rvansa at redhat.com Fri Nov 27 05:12:14 2015 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 27 Nov 2015 11:12:14 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <5658239F.4050101@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5658239F.4050101@redhat.com> Message-ID: <56582C7E.60003@redhat.com> The update needs to be applied to *all* owners before the call returns on O. With your strategy, P could apply update, send ACK but the async backup updates would not be delivered on Bs; so an ACKed update would get completely lost. I don't say that these async Bs are not possible, but not in the basic case - for default configuration, we need to keep the guarantees. Radim On 11/27/2015 10:34 AM, Bela Ban wrote: > Adding to what Radim wrote (below), would the following make sense > (conditions: non-TX, P != O && O != B)? > > The lock we acquire on P is actually used to establish an ordering for > updates to the Bs. So this is very similar to SEQUENCER, expect that we > have a sequencer (P) *per key*. > > Phase 1 > ------- > - O sends a PUT(x) message to P > > Phase 2 > ------- > - P adds PUT(x) to a queue and returns (freeing the up-thread) > - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs > (possible optimization: send updates to the same key sets as batches) > - PUT(x) is applied locally and an ACK is sent back to O > > O times out and throws an exception if it doesn't receive the ack from P. > > This would reduce the current 4 phases (for the above conditions) to 2, > plus the added latency of processing PUT(x) in the queue. However, we'd > get rid of the put-while-holding-the-lock issue. > > P's updates to the Bs are FIFO ordered, therefore all we need to do is > send the update down into UNICAST3 (or NAKACK2, if we use multicasts) > which guarantees ordering. Subsequent updates are ordered according to > send order. The updates are guaranteed to be retransmitted as long as P > is alive. > > If P crashes before returning the ack to O, or while updating the Bs, > then O will time out and throw an exception. And, yes, there can be > inconsistencies, but we're talking about the non-TX case. Perhaps O > could resubmit PUT(x) to the new P. > > I don't know how this behaves wrt rebalancing: are we flushing pending > updates before installing the new CH? > > Thoughts? > > >> I think that the source of optimization is that once primary decides to >> backup the operation, he can forget about it and unlock the entry. So, >> we don't need any ACK from primary unless it's an exception/noop >> notification (as with conditional ops). If primary waited for ACK from >> backup, we wouldn't save anything. > -- Radim Vansa JBoss Performance Team From rvansa at redhat.com Fri Nov 27 05:18:33 2015 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 27 Nov 2015 11:18:33 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <56582C7E.60003@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5658239F.4050101@redhat.com> <56582C7E.60003@redhat.com> Message-ID: <56582DF9.5010005@redhat.com> On 11/27/2015 11:12 AM, Radim Vansa wrote: > The update needs to be applied to *all* owners before the call returns > on O. With your strategy, P could apply update, send ACK but the async > backup updates would not be delivered on Bs; so an ACKed update would > get completely lost. Err, I forgot to say that P crashes after sending the ACK, but before making sure that updates are delivered to B. One thing about relying on JGroups FIFO is that as it's not per-key, single message lost on network will delay *all* other updates before the resend kicks in and sorts it out. That's not a very appealing characteristics, and as we seem to be in need for versions on entries for different purposes as well, I would rather do the ordering in Infinispan and let JGroups handle only the reliability. R. > I don't say that these async Bs are not possible, but not in the basic > case - for default configuration, we need to keep the guarantees. > > Radim > > On 11/27/2015 10:34 AM, Bela Ban wrote: >> Adding to what Radim wrote (below), would the following make sense >> (conditions: non-TX, P != O && O != B)? >> >> The lock we acquire on P is actually used to establish an ordering for >> updates to the Bs. So this is very similar to SEQUENCER, expect that we >> have a sequencer (P) *per key*. >> >> Phase 1 >> ------- >> - O sends a PUT(x) message to P >> >> Phase 2 >> ------- >> - P adds PUT(x) to a queue and returns (freeing the up-thread) >> - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs >> (possible optimization: send updates to the same key sets as >> batches) >> - PUT(x) is applied locally and an ACK is sent back to O >> >> O times out and throws an exception if it doesn't receive the ack >> from P. >> >> This would reduce the current 4 phases (for the above conditions) to 2, >> plus the added latency of processing PUT(x) in the queue. However, we'd >> get rid of the put-while-holding-the-lock issue. >> >> P's updates to the Bs are FIFO ordered, therefore all we need to do is >> send the update down into UNICAST3 (or NAKACK2, if we use multicasts) >> which guarantees ordering. Subsequent updates are ordered according to >> send order. The updates are guaranteed to be retransmitted as long as P >> is alive. >> >> If P crashes before returning the ack to O, or while updating the Bs, >> then O will time out and throw an exception. And, yes, there can be >> inconsistencies, but we're talking about the non-TX case. Perhaps O >> could resubmit PUT(x) to the new P. >> >> I don't know how this behaves wrt rebalancing: are we flushing pending >> updates before installing the new CH? >> >> Thoughts? >> >> >>> I think that the source of optimization is that once primary decides to >>> backup the operation, he can forget about it and unlock the entry. So, >>> we don't need any ACK from primary unless it's an exception/noop >>> notification (as with conditional ops). If primary waited for ACK from >>> backup, we wouldn't save anything. >> > > -- Radim Vansa JBoss Performance Team From bban at redhat.com Fri Nov 27 05:48:32 2015 From: bban at redhat.com (Bela Ban) Date: Fri, 27 Nov 2015 11:48:32 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <56582C7E.60003@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5658239F.4050101@redhat.com> <56582C7E.60003@redhat.com> Message-ID: <56583500.7020900@redhat.com> You're talking about the case where P applies the PUT, and sends an ACK back to O, but the async updates to the Bs are received by only a subset (or none) of the Bs, and then P crashes. As I was referring about the non-transactional case, wouldn't this be fine? Or do we want the *non-transactional* case to be an atomic update of P and all Bs? IMO, the latter should be done as part of a TX, not for the non-transactional case. So I think we need to come up with a concise definition of what the transactional versus non-transaction semantics are. But even if we go with a design where O waits for ACKs from *all* Bs, we can still end up with inconsistencies; e.g. when not all Bs received the updates. O will fail the PUT, but the question is what do we do in such a case? Re-submit the PUT? On 27/11/15 11:12, Radim Vansa wrote: > The update needs to be applied to *all* owners before the call returns > on O. With your strategy, P could apply update, send ACK but the async > backup updates would not be delivered on Bs; so an ACKed update would > get completely lost. > I don't say that these async Bs are not possible, but not in the basic > case - for default configuration, we need to keep the guarantees. > > Radim > > On 11/27/2015 10:34 AM, Bela Ban wrote: >> Adding to what Radim wrote (below), would the following make sense >> (conditions: non-TX, P != O && O != B)? >> >> The lock we acquire on P is actually used to establish an ordering for >> updates to the Bs. So this is very similar to SEQUENCER, expect that we >> have a sequencer (P) *per key*. >> >> Phase 1 >> ------- >> - O sends a PUT(x) message to P >> >> Phase 2 >> ------- >> - P adds PUT(x) to a queue and returns (freeing the up-thread) >> - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs >> (possible optimization: send updates to the same key sets as batches) >> - PUT(x) is applied locally and an ACK is sent back to O >> >> O times out and throws an exception if it doesn't receive the ack from P. >> >> This would reduce the current 4 phases (for the above conditions) to 2, >> plus the added latency of processing PUT(x) in the queue. However, we'd >> get rid of the put-while-holding-the-lock issue. >> >> P's updates to the Bs are FIFO ordered, therefore all we need to do is >> send the update down into UNICAST3 (or NAKACK2, if we use multicasts) >> which guarantees ordering. Subsequent updates are ordered according to >> send order. The updates are guaranteed to be retransmitted as long as P >> is alive. >> >> If P crashes before returning the ack to O, or while updating the Bs, >> then O will time out and throw an exception. And, yes, there can be >> inconsistencies, but we're talking about the non-TX case. Perhaps O >> could resubmit PUT(x) to the new P. >> >> I don't know how this behaves wrt rebalancing: are we flushing pending >> updates before installing the new CH? >> >> Thoughts? >> >> >>> I think that the source of optimization is that once primary decides to >>> backup the operation, he can forget about it and unlock the entry. So, >>> we don't need any ACK from primary unless it's an exception/noop >>> notification (as with conditional ops). If primary waited for ACK from >>> backup, we wouldn't save anything. >> > > -- Bela Ban, JGroups lead (http://www.jgroups.org) From bban at redhat.com Fri Nov 27 05:51:10 2015 From: bban at redhat.com (Bela Ban) Date: Fri, 27 Nov 2015 11:51:10 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <56582DF9.5010005@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5658239F.4050101@redhat.com> <56582C7E.60003@redhat.com> <56582DF9.5010005@redhat.com> Message-ID: <5658359E.6080805@redhat.com> On 27/11/15 11:18, Radim Vansa wrote: > On 11/27/2015 11:12 AM, Radim Vansa wrote: >> The update needs to be applied to *all* owners before the call returns >> on O. With your strategy, P could apply update, send ACK but the async >> backup updates would not be delivered on Bs; so an ACKed update would >> get completely lost. > > Err, I forgot to say that P crashes after sending the ACK, but before > making sure that updates are delivered to B. > > One thing about relying on JGroups FIFO is that as it's not per-key, > single message lost on network will delay *all* other updates before the > resend kicks in and sorts it out. That's not a very appealing > characteristics, and as we seem to be in need for versions on entries > for different purposes as well, I would rather do the ordering in > Infinispan and let JGroups handle only the reliability. Unless I'm mistaken, this is already the case: Infinispan's internal thread pool performs ordering and delivery afair. > R. > >> I don't say that these async Bs are not possible, but not in the basic >> case - for default configuration, we need to keep the guarantees. >> >> Radim >> >> On 11/27/2015 10:34 AM, Bela Ban wrote: >>> Adding to what Radim wrote (below), would the following make sense >>> (conditions: non-TX, P != O && O != B)? >>> >>> The lock we acquire on P is actually used to establish an ordering for >>> updates to the Bs. So this is very similar to SEQUENCER, expect that we >>> have a sequencer (P) *per key*. >>> >>> Phase 1 >>> ------- >>> - O sends a PUT(x) message to P >>> >>> Phase 2 >>> ------- >>> - P adds PUT(x) to a queue and returns (freeing the up-thread) >>> - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs >>> (possible optimization: send updates to the same key sets as >>> batches) >>> - PUT(x) is applied locally and an ACK is sent back to O >>> >>> O times out and throws an exception if it doesn't receive the ack >>> from P. >>> >>> This would reduce the current 4 phases (for the above conditions) to 2, >>> plus the added latency of processing PUT(x) in the queue. However, we'd >>> get rid of the put-while-holding-the-lock issue. >>> >>> P's updates to the Bs are FIFO ordered, therefore all we need to do is >>> send the update down into UNICAST3 (or NAKACK2, if we use multicasts) >>> which guarantees ordering. Subsequent updates are ordered according to >>> send order. The updates are guaranteed to be retransmitted as long as P >>> is alive. >>> >>> If P crashes before returning the ack to O, or while updating the Bs, >>> then O will time out and throw an exception. And, yes, there can be >>> inconsistencies, but we're talking about the non-TX case. Perhaps O >>> could resubmit PUT(x) to the new P. >>> >>> I don't know how this behaves wrt rebalancing: are we flushing pending >>> updates before installing the new CH? >>> >>> Thoughts? >>> >>> >>>> I think that the source of optimization is that once primary decides to >>>> backup the operation, he can forget about it and unlock the entry. So, >>>> we don't need any ACK from primary unless it's an exception/noop >>>> notification (as with conditional ops). If primary waited for ACK from >>>> backup, we wouldn't save anything. >>> >> >> > > -- Bela Ban, JGroups lead (http://www.jgroups.org) From ttarrant at redhat.com Fri Nov 27 06:05:44 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 27 Nov 2015 12:05:44 +0100 Subject: [infinispan-dev] CI build failures Message-ID: <56583908.7050606@redhat.com> Hi, the hibernate test-jar debacle has caused many builds on CI to fail half-way through. I've cleared all repos hoping that this will cause re-download. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From rvansa at redhat.com Fri Nov 27 06:45:37 2015 From: rvansa at redhat.com (Radim Vansa) Date: Fri, 27 Nov 2015 12:45:37 +0100 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <56583500.7020900@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5658239F.4050101@redhat.com> <56582C7E.60003@redhat.com> <56583500.7020900@redhat.com> Message-ID: <56584261.3000805@redhat.com> On 11/27/2015 11:48 AM, Bela Ban wrote: > You're talking about the case where P applies the PUT, and sends an ACK > back to O, but the async updates to the Bs are received by only a subset > (or none) of the Bs, and then P crashes. > > As I was referring about the non-transactional case, wouldn't this be > fine? Or do we want the *non-transactional* case to be an atomic update > of P and all Bs? IMO, the latter should be done as part of a TX, not for > the non-transactional case. You're not talking about non-transactional mode but mongo mode :) Non-transactional mode still guarantees that the data will be reliably stored, but it does not allow any consistency between two keys. Transactional mode allows you to change all keys or none of them. The atomicity is rather discutable. All writes are atomic with respect to writes, but reads just come and read something, and there's no way to make sure that two transactional reads read the same value. Due to the two-armies problem, in case that an error is encountered, it's possible that the cluster will end up in inconsistent state - in non-tx mode this is the updated B and P not applying the update. In transactional case, if the second phase (CommitCommand) gets executed on a subset of nodes and the others don't reply, the rollback sent cannot undo the already committed transactions. In that case, Infinispan is obliged to throw an exception to the user (tx mode requires useSynchronizations=false to do this) but it cannot prevent/resolve it. > > So I think we need to come up with a concise definition of what the > transactional versus non-transaction semantics are. > > But even if we go with a design where O waits for ACKs from *all* Bs, we > can still end up with inconsistencies; e.g. when not all Bs received the > updates. O will fail the PUT, but the question is what do we do in such > a case? Re-submit the PUT? Throw exception and provide API to asses the situation. Radim > > On 27/11/15 11:12, Radim Vansa wrote: >> The update needs to be applied to *all* owners before the call returns >> on O. With your strategy, P could apply update, send ACK but the async >> backup updates would not be delivered on Bs; so an ACKed update would >> get completely lost. >> I don't say that these async Bs are not possible, but not in the basic >> case - for default configuration, we need to keep the guarantees. >> >> Radim >> >> On 11/27/2015 10:34 AM, Bela Ban wrote: >>> Adding to what Radim wrote (below), would the following make sense >>> (conditions: non-TX, P != O && O != B)? >>> >>> The lock we acquire on P is actually used to establish an ordering for >>> updates to the Bs. So this is very similar to SEQUENCER, expect that we >>> have a sequencer (P) *per key*. >>> >>> Phase 1 >>> ------- >>> - O sends a PUT(x) message to P >>> >>> Phase 2 >>> ------- >>> - P adds PUT(x) to a queue and returns (freeing the up-thread) >>> - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs >>> (possible optimization: send updates to the same key sets as batches) >>> - PUT(x) is applied locally and an ACK is sent back to O >>> >>> O times out and throws an exception if it doesn't receive the ack from P. >>> >>> This would reduce the current 4 phases (for the above conditions) to 2, >>> plus the added latency of processing PUT(x) in the queue. However, we'd >>> get rid of the put-while-holding-the-lock issue. >>> >>> P's updates to the Bs are FIFO ordered, therefore all we need to do is >>> send the update down into UNICAST3 (or NAKACK2, if we use multicasts) >>> which guarantees ordering. Subsequent updates are ordered according to >>> send order. The updates are guaranteed to be retransmitted as long as P >>> is alive. >>> >>> If P crashes before returning the ack to O, or while updating the Bs, >>> then O will time out and throw an exception. And, yes, there can be >>> inconsistencies, but we're talking about the non-TX case. Perhaps O >>> could resubmit PUT(x) to the new P. >>> >>> I don't know how this behaves wrt rebalancing: are we flushing pending >>> updates before installing the new CH? >>> >>> Thoughts? >>> >>> >>>> I think that the source of optimization is that once primary decides to >>>> backup the operation, he can forget about it and unlock the entry. So, >>>> we don't need any ACK from primary unless it's an exception/noop >>>> notification (as with conditional ops). If primary waited for ACK from >>>> backup, we wouldn't save anything. >> -- Radim Vansa JBoss Performance Team From rory.odonnell at oracle.com Fri Nov 27 07:36:25 2015 From: rory.odonnell at oracle.com (Rory O'Donnell) Date: Fri, 27 Nov 2015 12:36:25 +0000 Subject: [infinispan-dev] Early Access b93 is available for JDK 9 on java.net Message-ID: <56584E49.8080604@oracle.com> Hi Galder, Since my last message about JDK 9 build b88, a number of new JEPs have been integrated into JDK 9 b93 available here . I'd like to point you to a few that are now available for testing in this JDK 9 Early Access build: JEP 254: Compact Strings (http://openjdk.java.net/jeps/254) This JEP adopts a more space-efficient internal representation for strings. We propose to change the internal representation of the String class from a UTF-16 char array to a byte array plus an encoding-flag field. The new String class will store characters encoded either as ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes per character), based upon the contents of the string. The encoding flag will indicate which encoding is used. JEP 165: Compiler Control (http://openjdk.java.net/jeps/165) This JEP proposes an improved way to control the JVM compilers. It enables runtime manageable, method dependent compiler flags. (Immutable for the duration of a compilation.) Method-context dependent control of the compilation process is a powerful tool for writing small contained JVM compiler tests that can be run without restarting the entire JVM. It is also very useful for creating workarounds for bugs in the JVM compilers. JEP 243: Java-Level JVM Compiler Interface (http://openjdk.java.net/jeps/243) This JEP instruments the data flows within the JVM which are used by the JIT compiler to allow Java code to observe, query, and affect the JVM's compilation process and its associated metadata. JEP 268: XML Catalog API (http://openjdk.java.net/jeps/268) This JEP develops a standard XML Catalog API that supports the OASIS XML Catalogs standard, v1.1. The API will define catalog and catalog-resolver abstractions which can be used with the JAXP processors that accept resolvers. Existing libraries or applications that use the internal API will need to migrate to the new API in order to take advantage of the new features. Rgds, Rory -- Rgds,Rory O'Donnell Quality Engineering Manager Oracle EMEA , Dublin, Ireland -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151127/4a9e4ed1/attachment.html From sanne at infinispan.org Fri Nov 27 07:45:26 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 27 Nov 2015 12:45:26 +0000 Subject: [infinispan-dev] CI build failures In-Reply-To: <56583908.7050606@redhat.com> References: <56583908.7050606@redhat.com> Message-ID: What happened? On 27 November 2015 at 11:05, Tristan Tarrant wrote: > Hi, > > the hibernate test-jar debacle has caused many builds on CI to fail > half-way through. I've cleared all repos hoping that this will cause > re-download. > > Tristan > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Fri Nov 27 08:18:06 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 27 Nov 2015 14:18:06 +0100 Subject: [infinispan-dev] CI build failures In-Reply-To: References: <56583908.7050606@redhat.com> Message-ID: <5658580E.4070305@redhat.com> It is still happening, but that's because I didn't clean all of the teamcity local maven repos. It cached some broken poms: [Step 2/5] org.infinispan:infinispan-directory-provider [11:14:59][org.infinispan:infinispan-directory-provider] Importing data from '/mnt/persistent_storage/cloud-user/ispn/buildAgent/work/c9523b7cef7aa565/lucene/directory-provider/target/surefire-reports/TEST-*.xml' (not existing file) with 'surefire' processor [11:14:59][org.infinispan:infinispan-directory-provider] Importing data from '/mnt/persistent_storage/cloud-user/ispn/buildAgent/work/c9523b7cef7aa565/lucene/directory-provider/target/failsafe-reports/TEST-*.xml' (not existing file) with 'surefire' processor [11:14:59][org.infinispan:infinispan-directory-provider] [WARNING] The POM for org.hibernate:hibernate-search-engine:jar:5.5.1.Final is missing, no dependency information available [11:14:59][org.infinispan:infinispan-directory-provider] [WARNING] The POM for org.hibernate:hibernate-search-engine:jar:tests:5.5.1.Final is missing, no dependency information available [11:14:59][org.infinispan:infinispan-directory-provider] Failed to execute goal on project infinispan-directory-provider: Could not resolve dependencies for project org.infinispan:infinispan-directory-provider:jar:8.1.0-SNAPSHOT: The following artifacts could not be resolved: org.hibernate:hibernate-search-engine:jar:5.5.1.Final, org.hibernate:hibernate-search-engine:jar:tests:5.5.1.Final: Failure to find org.hibernate:hibernate-search-engine:jar:5.5.1.Final in http://maven.repository.redhat.com/earlyaccess/all/ was cached in the local repository, resolution will not be reattempted until the update interval of redhat-earlyaccess-repository-group has elapsed or updates are forced Tristan On 27/11/2015 13:45, Sanne Grinovero wrote: > What happened? > > On 27 November 2015 at 11:05, Tristan Tarrant wrote: >> Hi, >> >> the hibernate test-jar debacle has caused many builds on CI to fail >> half-way through. I've cleared all repos hoping that this will cause >> re-download. >> >> Tristan >> -- >> Tristan Tarrant >> Infinispan Lead >> JBoss, a division of Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From ttarrant at redhat.com Fri Nov 27 08:19:36 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Fri, 27 Nov 2015 14:19:36 +0100 Subject: [infinispan-dev] Early Access b93 is available for JDK 9 on java.net In-Reply-To: <56584E49.8080604@oracle.com> References: <56584E49.8080604@oracle.com> Message-ID: <56585868.8070108@redhat.com> On 27/11/2015 13:36, Rory O'Donnell wrote: > JEP 254: Compact Strings (http://openjdk.java.net/jeps/254) > > This JEP adopts a more space-efficient internal representation for strings. > > We propose to change the internal representation of the String class > from a UTF-16 char array to a byte array plus an encoding-flag field. > The new String class will store characters encoded either as > ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes per > character), based upon the contents of the string. The encoding flag > will indicate which encoding is used. This one sounds very promising, in the context of our last talks. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From sanne at infinispan.org Fri Nov 27 08:23:36 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Fri, 27 Nov 2015 13:23:36 +0000 Subject: [infinispan-dev] Early Access b93 is available for JDK 9 on java.net In-Reply-To: <56585868.8070108@redhat.com> References: <56584E49.8080604@oracle.com> <56585868.8070108@redhat.com> Message-ID: On 27 November 2015 at 13:19, Tristan Tarrant wrote: > > > On 27/11/2015 13:36, Rory O'Donnell wrote: > >> JEP 254: Compact Strings (http://openjdk.java.net/jeps/254) >> >> This JEP adopts a more space-efficient internal representation for strings. >> >> We propose to change the internal representation of the String class >> from a UTF-16 char array to a byte array plus an encoding-flag field. >> The new String class will store characters encoded either as >> ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes per >> character), based upon the contents of the string. The encoding flag >> will indicate which encoding is used. > > > This one sounds very promising, in the context of our last talks. Yes, but we also need a way to Marshall these efficiently. I haven't looked yet, but I'd assume we'd need to be able to read that internal flag somehow, to adjust wire encoding strategies. Hopefully David will get interested? Sanne From dan.berindei at gmail.com Fri Nov 27 09:06:43 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Fri, 27 Nov 2015 16:06:43 +0200 Subject: [infinispan-dev] The "Triangle" pattern for reducing Put latency In-Reply-To: <56584261.3000805@redhat.com> References: <565591EC.4090907@infinispan.org> <5655B5AC.6070504@redhat.com> <5658239F.4050101@redhat.com> <56582C7E.60003@redhat.com> <56583500.7020900@redhat.com> <56584261.3000805@redhat.com> Message-ID: On Fri, Nov 27, 2015 at 1:45 PM, Radim Vansa wrote: > On 11/27/2015 11:48 AM, Bela Ban wrote: >> You're talking about the case where P applies the PUT, and sends an ACK >> back to O, but the async updates to the Bs are received by only a subset >> (or none) of the Bs, and then P crashes. >> >> As I was referring about the non-transactional case, wouldn't this be >> fine? Or do we want the *non-transactional* case to be an atomic update >> of P and all Bs? IMO, the latter should be done as part of a TX, not for >> the non-transactional case. > > You're not talking about non-transactional mode but mongo mode :) Indeed! There's a reason we don't recommend using DIST_ASYNC, which is quite similar. In fact, I wouldn't be opposed to changing DIST_ASYNC so that the O -> P communication is synchronous, and only the P -> B communication is asynchronous. > > Non-transactional mode still guarantees that the data will be reliably > stored, but it does not allow any consistency between two keys. > Transactional mode allows you to change all keys or none of them. > > The atomicity is rather discutable. All writes are atomic with respect > to writes, but reads just come and read something, and there's no way to > make sure that two transactional reads read the same value. > > Due to the two-armies problem, in case that an error is encountered, > it's possible that the cluster will end up in inconsistent state - in > non-tx mode this is the updated B and P not applying the update. In > transactional case, if the second phase (CommitCommand) gets executed on > a subset of nodes and the others don't reply, the rollback sent cannot > undo the already committed transactions. In that case, Infinispan is > obliged to throw an exception to the user (tx mode requires > useSynchronizations=false to do this) but it cannot prevent/resolve it. Right, the biggest source of inconsistencies right now is replication timeouts, because those are not retried. As Radim says, transactional mode is not necessarily safer when timeouts are involved. I've argued before that we should remove the replication timeout and rely on FD exclusively, because suspected nodes are handled much better. There are also problems when the originator crashes after sending the commit command to some owners, but before all of them got it [1]. With pessimistic locking, the same can happen if the originator crashes after only some of the owners received the prepare command. We've made some progress handling crashed originators when partition handling is enabled, but we aren't covering all the cases yet. Still, there's a difference between having an inconsistency after reporting an error to the application, and having an inconsistency after reporting that everything's A-OK. [1] https://issues.jboss.org/browse/ISPN-3421 > >> >> So I think we need to come up with a concise definition of what the >> transactional versus non-transaction semantics are. >> >> But even if we go with a design where O waits for ACKs from *all* Bs, we >> can still end up with inconsistencies; e.g. when not all Bs received the >> updates. O will fail the PUT, but the question is what do we do in such >> a case? Re-submit the PUT? > > Throw exception and provide API to asses the situation. The "API to assess the situation" part is kind of lacking at the moment, but yeah, that's the idea. I doubt we'll ever get to a point where we can say the cache will *never* become inconsistent, but the application should at least be notified when something goes wrong. Cheers Dan From galder at redhat.com Mon Nov 30 11:57:22 2015 From: galder at redhat.com (=?utf-8?Q?Galder_Zamarre=C3=B1o?=) Date: Mon, 30 Nov 2015 17:57:22 +0100 Subject: [infinispan-dev] Memory consumption of org.infinispan.marshall.core.JBossMarshaller In-Reply-To: <565813B1.1040502@sweazer.com> References: <5651CA88.2000201@sweazer.com> <1BFA84C8-A69E-45A8-889B-5E6C7896E9AD@redhat.com> <565813B1.1040502@sweazer.com> Message-ID: <2CA3DE92-00A8-4789-B4F2-9A3F82451AAA@redhat.com> We're actively working to reduce our memory footprint. I can't really provide guidance on memory requirements since it's very dependant on the types stored and the amount of instances that are stored, which is specific to each use case. It's worth investing some time estimating loads and running load tests to adjust memory parameters before going to production. Cheers, -- Galder Zamarre?o Infinispan, Red Hat > On 27 Nov 2015, at 09:26, Christian Beikov wrote: > > Are you going to do something about this memory consumption or is there > at least some kind of minimum expected memory usage you can give me? > I ran into an OOMEs the other day and the cluster was unable to recover > from that by restarting single nodes. The nodes couldn't synchronize > because of the OOMEs. I had to (jgroups-)disconnect all nodes from the > cluster and start a separate cluster which of course lead to data loss. > All of this happened because of some wrong memory consumption > estimations I made so in order to avoid that in the future I would like > to plan better ahead. Is there any other way to avoid such a cluster death? > > Regards, > Christian > > Am 26.11.2015 um 15:56 schrieb Galder Zamarre?o: >> Those IdentityIntMap are caches meant to speed up serialization if the same objects or types are marshalled again. It's normal for them to be populated as marshalling operations are executed. We don't currently have a way to clear these caches. >> >> Cheers, >> -- >> Galder Zamarre?o >> Infinispan, Red Hat >> >>> On 22 Nov 2015, at 15:00, Christian Beikov wrote: >>> >>> Hello, >>> >>> In a recent heap dump analysis I found that >>> org.infinispan.marshall.core.JBossMarshaller consumes a lot of >>> memory(about 46 MB) that seems to be unused. >>> This is due to PerThreadInstanceHolder having ExtendedRiverMarshaller >>> objects that contain big IdentityIntMap objects. Some of those >>> IdentityIntMap instances have a size of 2 million entries, but most of >>> them have sizes of a few 100 thousands. >>> When I look into these IdentityIntMap instances, it seems that the >>> entries are all unused. >>> >>> Is that kind of memory consumption expected or does that indicate a >>> possibly wrong configuration? >>> >>> I am using Infinispan 7.2.4.Final on Wildfly 9.0.1.Final. >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev at lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From dan.berindei at gmail.com Mon Nov 30 14:17:28 2015 From: dan.berindei at gmail.com (Dan Berindei) Date: Mon, 30 Nov 2015 21:17:28 +0200 Subject: [infinispan-dev] Consolidating temporary per-key data In-Reply-To: <5658224F.7060404@redhat.com> References: <56533020.8090806@redhat.com> <5658224F.7060404@redhat.com> Message-ID: The first problem that comes to mind is that context entries are also stored in a map, at least in transactional mode. So access through the context would only be faster in non-tx caches, in tx caches it would not add any benefits. I also have some trouble imagining how these temporary entries would be released, since locks, L1 requestors, L1 synchronizers, and write registrations all have their own rules for cleaning up. Finally, I'm not sure how much this would help. I actually removed the write registration for everything except RemoveExpiredCommand when testing the HotRod server performance, but I didn't get any significant improvement on my machine. Which was kind of expected, since the benchmark doesn't seem to be CPU-bound, and JFR was showing it with < 1.5% of CPU. Cheers Dan On Fri, Nov 27, 2015 at 11:28 AM, Radim Vansa wrote: > No thoughts at all? @wburns, could I have your view on this? > > Thanks > > Radim > > On 11/23/2015 04:26 PM, Radim Vansa wrote: >> Hi again, >> >> examining some flamegraphs I've found out that recently the >> ExpirationInterceptor has been added, which registers ongoing write in a >> hashmap. So at this point we have a map for locks, map for writes used >> for expiration, another two key-addressed maps in L1ManagerImpl and one >> in L1NonTxInterceptor and maybe another maps elsewhere. >> >> This makes me think that we could spare map lookups and expensive writes >> by providing *single map for temporary per-key data*. A reference to the >> entry could be stored in the context to save the lookups. An extreme >> case would be to put this into DataContainer, but I think that this >> would prove too tricky in practice. >> >> A downside would be the loss of encapsulation (any component could >> theoretically access e.g. locks), but I don't find that too dramatic. >> >> WDYT? >> >> Radim >> > > > -- > Radim Vansa > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From ttarrant at redhat.com Mon Nov 30 14:58:52 2015 From: ttarrant at redhat.com (Tristan Tarrant) Date: Mon, 30 Nov 2015 20:58:52 +0100 Subject: [infinispan-dev] Cache Aliases Message-ID: <565CAA7C.7040206@redhat.com> Hi everybody, to address the needs of Teiid to implement materialized views on Infinispan caches, I have written a small design document describing "alias caches" [1]. To be able to fully support the use-case in remote configurations, we need to perform the switching operation remotely. Since remote admin functionality is also required by the JCache integration code, I have created a separate page [2] describing what the remote admin client would look like in terms of functionality and packaging. As usual, comments are welcome [1] https://github.com/infinispan/infinispan/wiki/Alias-Caches [2] https://github.com/infinispan/infinispan/wiki/Remote-Admin-Client -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat From gustavo at infinispan.org Mon Nov 30 15:20:36 2015 From: gustavo at infinispan.org (Gustavo Fernandes) Date: Mon, 30 Nov 2015 20:20:36 +0000 Subject: [infinispan-dev] Cache Aliases In-Reply-To: <565CAA7C.7040206@redhat.com> References: <565CAA7C.7040206@redhat.com> Message-ID: On 30 Nov 2015 19:58, "Tristan Tarrant" wrote: > > Hi everybody, > > to address the needs of Teiid to implement materialized views on > Infinispan caches, I have written a small design document describing > "alias caches" [1]. > To be able to fully support the use-case in remote configurations, we > need to perform the switching operation remotely. Since remote admin > functionality is also required by the JCache integration code, I have > created a separate page [2] describing what the remote admin client > would look like in terms of functionality and packaging. +1 This is needed as well for [1], and it could be usefull in the test suite to manipulate the server at runtime rather than at build time. How do you plan to represent the cache configuration? Maybe reuse the embedded builders and XML? [1] https://issues.jboss.org/browse/ISPRK-2 > > As usual, comments are welcome > > > [1] https://github.com/infinispan/infinispan/wiki/Alias-Caches > [2] https://github.com/infinispan/infinispan/wiki/Remote-Admin-Client > -- > Tristan Tarrant > Infinispan Lead > JBoss, a division of Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151130/3f21a154/attachment.html From mudokonman at gmail.com Mon Nov 30 16:54:07 2015 From: mudokonman at gmail.com (William Burns) Date: Mon, 30 Nov 2015 21:54:07 +0000 Subject: [infinispan-dev] Consolidating temporary per-key data In-Reply-To: References: <56533020.8090806@redhat.com> <5658224F.7060404@redhat.com> Message-ID: I am not sure there is an easy way to consolidate these into a single map, since some of these are written to on reads, some on writes and sometimes conditionally written to. And then as Dan said they are cleaned up at different times possibly. We could do something like states (based on which ones would have written to the map), but I think it will get quite complex, especially if we ever add more of these map type requirement. On a similar note, I had actually thought of possibly moving the expiration check out of the data container and into the entry wrapping interceptor or the likes. This would allow for us to remove the expiration map completely since we could only raise the extra expiration commands on a read and not writes. But this would change the API and I am thinking we can only do this for 9.0. On Mon, Nov 30, 2015 at 2:18 PM Dan Berindei wrote: > The first problem that comes to mind is that context entries are also > stored in a map, at least in transactional mode. So access through the > context would only be faster in non-tx caches, in tx caches it would > not add any benefits. > > I also have some trouble imagining how these temporary entries would > be released, since locks, L1 requestors, L1 synchronizers, and write > registrations all have their own rules for cleaning up. > > Finally, I'm not sure how much this would help. I actually removed the > write registration for everything except RemoveExpiredCommand when > testing the HotRod server performance, but I didn't get any > significant improvement on my machine. Which was kind of expected, > since the benchmark doesn't seem to be CPU-bound, and JFR was showing > it with < 1.5% of CPU. > Cheers > Dan > > > On Fri, Nov 27, 2015 at 11:28 AM, Radim Vansa wrote: > > No thoughts at all? @wburns, could I have your view on this? > > > > Thanks > > > > Radim > > > > On 11/23/2015 04:26 PM, Radim Vansa wrote: > >> Hi again, > >> > >> examining some flamegraphs I've found out that recently the > >> ExpirationInterceptor has been added, which registers ongoing write in a > >> hashmap. So at this point we have a map for locks, map for writes used > >> for expiration, another two key-addressed maps in L1ManagerImpl and one > >> in L1NonTxInterceptor and maybe another maps elsewhere. > >> > >> This makes me think that we could spare map lookups and expensive writes > >> by providing *single map for temporary per-key data*. A reference to the > >> entry could be stored in the context to save the lookups. An extreme > >> case would be to put this into DataContainer, but I think that this > >> would prove too tricky in practice. > >> > >> A downside would be the loss of encapsulation (any component could > >> theoretically access e.g. locks), but I don't find that too dramatic. > >> > >> WDYT? > >> > >> Radim > >> > > > > > > -- > > Radim Vansa > > JBoss Performance Team > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151130/903aa9e8/attachment-0001.html From sanne at infinispan.org Mon Nov 30 17:45:34 2015 From: sanne at infinispan.org (Sanne Grinovero) Date: Mon, 30 Nov 2015 22:45:34 +0000 Subject: [infinispan-dev] Consolidating temporary per-key data In-Reply-To: References: <56533020.8090806@redhat.com> <5658224F.7060404@redhat.com>

Message-ID: Wouldn't it be an interesting compromise to make sure we calculate things like the key's hash only once? On 30 November 2015 at 21:54, William Burns wrote: > I am not sure there is an easy way to consolidate these into a single map, > since some of these are written to on reads, some on writes and sometimes > conditionally written to. And then as Dan said they are cleaned up at > different times possibly. > > We could do something like states (based on which ones would have written to > the map), but I think it will get quite complex, especially if we ever add > more of these map type requirement. > > On a similar note, I had actually thought of possibly moving the expiration > check out of the data container and into the entry wrapping interceptor or > the likes. This would allow for us to remove the expiration map completely > since we could only raise the extra expiration commands on a read and not > writes. But this would change the API and I am thinking we can only do this > for 9.0. > > On Mon, Nov 30, 2015 at 2:18 PM Dan Berindei wrote: >> >> The first problem that comes to mind is that context entries are also >> stored in a map, at least in transactional mode. So access through the >> context would only be faster in non-tx caches, in tx caches it would >> not add any benefits. >> >> I also have some trouble imagining how these temporary entries would >> be released, since locks, L1 requestors, L1 synchronizers, and write >> registrations all have their own rules for cleaning up. >> >> Finally, I'm not sure how much this would help. I actually removed the >> write registration for everything except RemoveExpiredCommand when >> testing the HotRod server performance, but I didn't get any >> significant improvement on my machine. Which was kind of expected, >> since the benchmark doesn't seem to be CPU-bound, and JFR was showing >> it with < 1.5% of CPU. >> >> >> Cheers >> Dan >> >> >> On Fri, Nov 27, 2015 at 11:28 AM, Radim Vansa wrote: >> > No thoughts at all? @wburns, could I have your view on this? >> > >> > Thanks >> > >> > Radim >> > >> > On 11/23/2015 04:26 PM, Radim Vansa wrote: >> >> Hi again, >> >> >> >> examining some flamegraphs I've found out that recently the >> >> ExpirationInterceptor has been added, which registers ongoing write in >> >> a >> >> hashmap. So at this point we have a map for locks, map for writes used >> >> for expiration, another two key-addressed maps in L1ManagerImpl and one >> >> in L1NonTxInterceptor and maybe another maps elsewhere. >> >> >> >> This makes me think that we could spare map lookups and expensive >> >> writes >> >> by providing *single map for temporary per-key data*. A reference to >> >> the >> >> entry could be stored in the context to save the lookups. An extreme >> >> case would be to put this into DataContainer, but I think that this >> >> would prove too tricky in practice. >> >> >> >> A downside would be the loss of encapsulation (any component could >> >> theoretically access e.g. locks), but I don't find that too dramatic. >> >> >> >> WDYT? >> >> >> >> Radim >> >> >> > >> > >> > -- >> > Radim Vansa >> > JBoss Performance Team >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev at lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev at lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev From mudokonman at gmail.com Mon Nov 30 22:54:02 2015 From: mudokonman at gmail.com (William Burns) Date: Tue, 01 Dec 2015 03:54:02 +0000 Subject: [infinispan-dev] Consolidating temporary per-key data In-Reply-To: References: <56533020.8090806@redhat.com> <5658224F.7060404@redhat.com>

Message-ID: Actually looking into this closer. I have found a way to completely remove the expiration interceptor with minimal drawback. The drawback is that a remove expired command will be generated if a read finds the entry gone is concurrently fired with a write for the same key. But I would say this should happen so infrequently that it probably shouldn't matter. I have put it all on [1] and all the tests seem to pass fine. I want to double check a few things but this should be pretty good. [1] https://github.com/wburns/infinispan/commits/expiration_listener On Mon, Nov 30, 2015 at 5:46 PM Sanne Grinovero wrote: > Wouldn't it be an interesting compromise to make sure we calculate > things like the key's hash only once? > > On 30 November 2015 at 21:54, William Burns wrote: > > I am not sure there is an easy way to consolidate these into a single > map, > > since some of these are written to on reads, some on writes and sometimes > > conditionally written to. And then as Dan said they are cleaned up at > > different times possibly. > > > > We could do something like states (based on which ones would have > written to > > the map), but I think it will get quite complex, especially if we ever > add > > more of these map type requirement. > > > > On a similar note, I had actually thought of possibly moving the > expiration > > check out of the data container and into the entry wrapping interceptor > or > > the likes. This would allow for us to remove the expiration map > completely > > since we could only raise the extra expiration commands on a read and not > > writes. But this would change the API and I am thinking we can only do > this > > for 9.0. > > > > On Mon, Nov 30, 2015 at 2:18 PM Dan Berindei > wrote: > >> > >> The first problem that comes to mind is that context entries are also > >> stored in a map, at least in transactional mode. So access through the > >> context would only be faster in non-tx caches, in tx caches it would > >> not add any benefits. > >> > >> I also have some trouble imagining how these temporary entries would > >> be released, since locks, L1 requestors, L1 synchronizers, and write > >> registrations all have their own rules for cleaning up. > >> > >> Finally, I'm not sure how much this would help. I actually removed the > >> write registration for everything except RemoveExpiredCommand when > >> testing the HotRod server performance, but I didn't get any > >> significant improvement on my machine. Which was kind of expected, > >> since the benchmark doesn't seem to be CPU-bound, and JFR was showing > >> it with < 1.5% of CPU. > >> > >> > >> Cheers > >> Dan > >> > >> > >> On Fri, Nov 27, 2015 at 11:28 AM, Radim Vansa > wrote: > >> > No thoughts at all? @wburns, could I have your view on this? > >> > > >> > Thanks > >> > > >> > Radim > >> > > >> > On 11/23/2015 04:26 PM, Radim Vansa wrote: > >> >> Hi again, > >> >> > >> >> examining some flamegraphs I've found out that recently the > >> >> ExpirationInterceptor has been added, which registers ongoing write > in > >> >> a > >> >> hashmap. So at this point we have a map for locks, map for writes > used > >> >> for expiration, another two key-addressed maps in L1ManagerImpl and > one > >> >> in L1NonTxInterceptor and maybe another maps elsewhere. > >> >> > >> >> This makes me think that we could spare map lookups and expensive > >> >> writes > >> >> by providing *single map for temporary per-key data*. A reference to > >> >> the > >> >> entry could be stored in the context to save the lookups. An extreme > >> >> case would be to put this into DataContainer, but I think that this > >> >> would prove too tricky in practice. > >> >> > >> >> A downside would be the loss of encapsulation (any component could > >> >> theoretically access e.g. locks), but I don't find that too dramatic. > >> >> > >> >> WDYT? > >> >> > >> >> Radim > >> >> > >> > > >> > > >> > -- > >> > Radim Vansa > >> > JBoss Performance Team > >> > > >> > _______________________________________________ > >> > infinispan-dev mailing list > >> > infinispan-dev at lists.jboss.org > >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev at lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev at lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev at lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20151201/ba1ec70f/attachment.html