Fwd: [concurrency-interest] ConcurrentHashMap bulk parallel operations
by Galder Zamarreño
Interesting functional programming additions for concurrent collections (i.e. caches) coming up in JDK8 too.
Begin forwarded message:
> From: Doug Lea <dl(a)cs.oswego.edu>
> Subject: [concurrency-interest] ConcurrentHashMap bulk parallel operations
> Date: August 13, 2012 6:08:50 PM GMT+02:00
> To: "Concurrency-interest(a)cs.oswego.edu" <Concurrency-interest(a)cs.oswego.edu>
>
>
> As most of you know, lambda-based bulk operations with optional
> parallelism are planned for JDK8. We've been working with JSR335 folks
> on these. However, in addition we plan to offer access to bulk
> operations designed for use on truly concurrent data structures, in
> particular ConcurrentHashMaps. These methods are designed for use on
> "live" data, registries, caches, in-memory data stores, chunks of
> Hadoop-style "big data" sets, and so on that might be concurrently
> updated during the course of computations. The form and style of this
> API are targeted to support operations that can be sensibly applied in
> such contexts. This boils down to only three basic forms: forEach,
> search, and reduce (each with multiple variations), expressed in an
> imperative style -- no fluency, stateful Streams, etc that are planned
> for the JDK8 java.util-based framework. There is no sense in
> compromising support for either of these kinds of target usages, so we
> won't. However, the functionality coverage is essentially identical
> for those operations that do apply, so we anticipate that the JDK8
> java.util-based framework will be able to layer on top of this when
> applicable.
>
> The API is in two layers. Nested (static) class "ForkJoinTasks"
> returns task objects that, when invoked, provide the given
> functionality, but may also be used in other ways. Nested (inner)
> class "Parallel" provides an API for using them with a given
> ForkJoinPool. The class-level javadocs for CHM(V8).Parallel are pasted
> below. There will surely be some further API changes in the course of
> JDK8 integration. However, in the meantime, we are releasing a
> stand-alone form, intended to be usable by both current
> ConcurrentHashMapV8 users running JDK7, as well as those experimenting
> with current JDK8 preview lambda builds (at
> http://jdk8.java.net/lambda/). The current javadocs don't have any
> usage examples, because they look vastly different in JDK7 vs JDK8.
>
> Doing this forces a bit of disruption on everyone though.
>
> 1. To avoid FJ version mismatches, the current jsr166y FJ classes are
> duplicated into jsr166e.
>
> 2. To avoid JDK version mismatches, the j.u.c version (plain
> "ConcurrentHashMap" without the "V8") is committed in main repository,
> while keeping its "V8" in package jsr166e. (This also required an
> initial merge of jsr166e.LongAdder and related classes.)
>
> 3. To avoid current and future naming problems, a set of function
> interfaces are nested within ConcurrentHashMap, with names
> intentionally different than those currently used in JDK8 previews
> (for example "Action" instead of "Block"). For lambda-enabled
> JDK8-preview users, this won't much matter because lambda expressions
> will still match as expected. However, others tediously using this
> with emulated-lambdas via static instances of classes implementing the
> interfaces will have to bear with future name changes of these
> interfaces. This forbearance starts immediately, because the
> previously named nested MappingFunction and RemappingFunction are
> already changed so as to be applicable across the extended
> APIs. Sorry.
>
>
> ... pasting from
> http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/ConcurrentHas...
>
>
> public class ConcurrentHashMapV8.Parallel
>
>
> An extended view of a ConcurrentHashMap supporting bulk parallel operations. These operations are designed to be safely, and often sensibly, applied even with maps that are being concurrently updated by other threads; for example, when computing a snapshot summary of the values in a shared registry. There are three kinds of operation, each with four forms, accepting functions with Keys, Values, Entries, and (Key, Value) arguments and/or return values. Because the elements of a ConcurrentHashMap are not ordered in any particular way, and may be processed in different orders in different parallel executions, the correctness of supplied functions should not depend on any ordering, or on any other objects or values that may transiently change while computation is in progress; and except for forEach actions, should ideally be side-effect-free.
>
> * forEach: Perform a given action on each element. A variant form applies a given transformation on each element before performing the action.
> * search: Return the first available non-null result of applying a given function on each element; skipping further search when a result is found.
> * reduce: Accumulate each element. The supplied reduction function cannot rely on ordering (more formally, it should be both associative and commutative). There are five variants:
> o Plain reductions. (There is not a form of this method for (key, value) function arguments since there is no corresponding return type.)
> o Mapped reductions that accumulate the results of a given function applied to each element.
> o Reductions to scalar doubles, longs, and ints, using a given basis value.
>
> The concurrency properties of the bulk operations follow from those of ConcurrentHashMap: Any non-null result returned from get(key) and related access methods bears a happens-before relation with the associated insertion or update. The result of any bulk operation reflects the composition of these per-element relations (but is not necessarily atomic with respect to the map as a whole unless it is somehow known to be quiescent). Conversely, because keys and values in the map are never null, null serves as a reliable atomic indicator of the current lack of any result. To maintain this property, null serves as an implicit basis for all non-scalar reduction operations. For the double, long, and int versions, the basis should be one that, when combined with any other value, returns that other value (more formally, it should be the identity element for the reduction). Most common reductions have these properties; for example, computing a sum with basis 0 or a minimum with basis MAX_VALUE.
>
> Search and transformation functions provided as arguments should similarly return null to indicate the lack of any result (in which case it is not used). In the case of mapped reductions, this also enables transformations to serve as filters, returning null (or, in the case of primitive specializations, the identity basis) if the element should not be combined. You can create compound transformations and filterings by composing them yourself under this "null means there is nothing there now" rule before using them in search or reduce operations.
>
> Methods accepting and/or returning Entry arguments maintain key-value associations. They may be useful for example when finding the key for the greatest value. Note that "plain" Entry arguments can be supplied using new AbstractMap.SimpleEntry(k,v).
>
> Bulk operations may complete abruptly, throwing an exception encountered in the application of a supplied function. Bear in mind when handling such exceptions that other concurrently executing functions could also have thrown exceptions, or would have done so if the first exception had not occurred.
>
> Parallel speedups compared to sequential processing are common but not guaranteed. Operations involving brief functions on small maps may execute more slowly than sequential loops if the underlying work to parallelize the computation is more expensive than the computation itself. Similarly, parallelization may not lead to much actual parallelism if all processors are busy performing unrelated tasks.
>
> All arguments to all task methods must be non-null.
>
> jsr166e note: During transition, this class uses nested functional interfaces with different names but the same forms as those expected for JDK8.
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest(a)cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
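For readers on a released JDK 8, here is a minimal sketch of the three forms described above (forEach, search, reduce). Note the assumptions: it uses the final java.util.concurrent.ConcurrentHashMap API, where the bulk methods sit directly on the map and take a parallelismThreshold argument, rather than the nested "Parallel" view of jsr166e described in the message. The class name, method names and sample data are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class BulkOpsSketch {
    // Hypothetical sample data, only for illustration.
    static ConcurrentHashMap<String, Long> sample() {
        ConcurrentHashMap<String, Long> m = new ConcurrentHashMap<>();
        m.put("a", 1L);
        m.put("b", 2L);
        m.put("c", 3L);
        m.put("d", 40L);
        return m;
    }

    // forEach: the only form where side effects are expected.
    // LongAdder tolerates concurrent updates from parallel subtasks.
    static long sumViaForEach(ConcurrentHashMap<String, Long> m) {
        LongAdder total = new LongAdder();
        m.forEach(1, (k, v) -> total.add(v)); // threshold 1 = maximal parallelism
        return total.sum();
    }

    // search: returns a non-null result for a matching element, or null if
    // no element produced one ("null means there is nothing there now").
    static String findKeyWithValueAbove(ConcurrentHashMap<String, Long> m, long limit) {
        return m.search(1, (k, v) -> v > limit ? k : null);
    }

    // Plain reduction: the reducer is associative and commutative.
    // Returns null for an empty map (null is the implicit basis).
    static Long maxValue(ConcurrentHashMap<String, Long> m) {
        return m.reduceValues(1, Long::max);
    }

    // Mapped reduction used as a filter: the transformer returns null
    // for elements that should not be combined.
    static Long sumOfEvenValues(ConcurrentHashMap<String, Long> m) {
        return m.reduce(1, (k, v) -> v % 2 == 0 ? v : null, Long::sum);
    }

    // Scalar reduction: the basis 0 is the identity element for addition.
    static long sumValues(ConcurrentHashMap<String, Long> m) {
        return m.reduceValuesToLong(1, Long::longValue, 0L, Long::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> m = sample();
        System.out.println(sumViaForEach(m));
        System.out.println(findKeyWithValueAbove(m, 10L));
        System.out.println(maxValue(m));
        System.out.println(sumOfEvenValues(m));
        System.out.println(sumValues(m));
    }
}
```

If the map is updated by other threads while these run, the results reflect some mix of old and new entries; as the javadoc above says, they are not atomic snapshots unless the map is known to be quiescent.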
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
12 years, 4 months
ISPN-2164 Why does a conditional remove (on its own), end up with a write skew check?
by Galder Zamarreño
Hi,
Re: https://issues.jboss.org/browse/ISPN-2164 and https://github.com/infinispan/infinispan/pull/1221
It could be argued that a remove, regardless of whether it's conditional or not, should not need a write skew check unless there has really been a read before.
IOW, the test now passes because the commit throws an exception in 4 out of 5 threads. But the test could/should maybe work in such a way that no exception is thrown and 4 out of 5 invocations return false to the checks made (conditional remove returns false, or normal remove returns null).
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
12 years, 4 months
Deadlock on state transfer
by Sanne Grinovero
Hi all,
I'm getting deadlocks during state transfer; I've attached 2 sample
thread dumps from different test runs.
I can't complete the testsuite of core either. Does this ring a bell?
Sorry I haven't followed latest news, I'm wondering if I'm on
something new.
Cheers,
Sanne
12 years, 4 months
LocalQueryInterceptor misunderstanding
by Israel Lacerra
Guys,
I think there is a misunderstanding in the way that LocalQueryInterceptor
works (or, at least, in my understanding). LocalQueryInterceptor indexes
data from calls originated locally. That is different from indexing data on
the node that is responsible for the data, and I think we expect
LocalQueryInterceptor to work like that, right?
At least, in ISPN 200, we have this item:
"Each node maintains local indexes for state it is responsible
for (-Dinfinispan.query.indexLocalOnly=true)."
What do you think about this? Should I change LocalQueryInterceptor? Should I
create another interceptor just to make clustered queries work?
cheers!
Israel
12 years, 4 months
question on read committed semantics
by Mircea Markus
Hi,
Ales raised a rather interesting problem around read committed semantics.
We have a read_committed cache, and two concurrent transactions running as follows:
1. tx1:: cache.put(k,v1);
2. tx2:: cache.put(k, v2);
3. tx2:: commit(); //the entry is now (k, v2)
4. tx1 :: cache.get(k);
Now what should the get return at step 4?
a) v1 -> the value in the current transaction scope? (current behaviour)
b) v2, as the isolation level is read committed and v2 is the last committed value
c) would it make sense to be able to switch between a) and b) on a per cache (config) / per invocation (flag) basis?
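To make options a) and b) concrete, here is a toy, single-threaded sketch of the four steps. It is not Infinispan code and makes no claim about how Infinispan implements transactions: the Store and Tx classes and their methods are hypothetical, and each transaction simply buffers its writes until commit.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadCommittedSketch {
    // Hypothetical shared store holding only committed values.
    static class Store {
        final Map<String, String> committed = new HashMap<>();
    }

    // Hypothetical transaction that buffers writes until commit.
    static class Tx {
        private final Store store;
        private final Map<String, String> writes = new HashMap<>();

        Tx(Store store) {
            this.store = store;
        }

        void put(String k, String v) {
            writes.put(k, v);
        }

        void commit() {
            store.committed.putAll(writes);
            writes.clear();
        }

        // Option a): the transaction sees its own uncommitted write first.
        String getOwnWritesFirst(String k) {
            return writes.containsKey(k) ? writes.get(k) : store.committed.get(k);
        }

        // Option b): the transaction always sees the last committed value.
        String getLastCommitted(String k) {
            return store.committed.get(k);
        }
    }

    // Replays steps 1-4 and returns {result under a), result under b)}.
    static String[] runScenario() {
        Store store = new Store();
        Tx tx1 = new Tx(store);
        Tx tx2 = new Tx(store);

        tx1.put("k", "v1");   // step 1
        tx2.put("k", "v2");   // step 2
        tx2.commit();         // step 3: the committed entry is now (k, v2)

        // step 4: tx1 reads k under each interpretation
        return new String[] { tx1.getOwnWritesFirst("k"), tx1.getLastCommitted("k") };
    }

    public static void main(String[] args) {
        String[] r = runScenario();
        System.out.println("a) " + r[0] + ", b) " + r[1]);
    }
}
```

In this model option c) would amount to choosing between the two read methods per cache or per call.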
Any opinion much appreciated!
Cheers,
Mircea
12 years, 4 months
X-S replication configuration
by Mircea Markus
Hi,
This[1] is the first draft of the cross-site (x-s)[2] replication configuration Bela and I came up with.
Any feedback is more than welcome!
Cheers,
Mircea
[1] https://gist.github.com/3059621
[2] we've also thought of using the term 'site' instead of 'datacenter'. Datacenter is a bit too specific, not as concise and also inconsistent with the term 'site' we've been using for the TopologyAwareConsistentHash.
12 years, 4 months
MapReduce on REPL and Mapper behaviour
by Thomas Fromm
Hi,
I've just started to use MapReduce, so I have some questions:
1) To avoid specific handling I'd like to have the ability to execute
MapReduceTasks also on REPL/LOCAL caches. Feature request, or too
expensive to implement?
2) In case I do not have a Reduce job, I'd like to avoid providing a
Reducer. Should I create a feature request to make reducedWith(...) optional?
What do you think?
One more: Will the mapper be executed on _every_ entry, regardless of whether
it's currently in the cache or just held in the store? I can test it but
maybe you can give me a quick answer on this.
Thanks in advance,
tf
12 years, 4 months
Re: [infinispan-dev] upgrade to 5.2.x
by Ales Justin
> FYI - I found the underlying issue:
> https://issues.jboss.org/browse/ISPN-2182
Ah, yes, the legacy --> current config transformation looks like the biggest non-tested stuff ever. ;-)
I've previously had to fix a few similar, if not almost identical, issues wrt indexing.
> A workaround for the issue, as well as the complete 5.2 upgrade can be
> found in this branch:
> https://github.com/pferraro/jboss-as/tree/infinispan
Shouldn't this be fixed in Infinispan directly?
(as that's where I had to fix my indexing config issues)
> I haven't actually used infinispan-query, so I don't know if
> infinispan-core needs to have its classes visible. Can you comment?
https://github.com/capedwarf/capedwarf-jboss-as/blob/master/build/src/mai...
Afaik, it's needed for Infinispan to pick-up Infinispan-Query's extension/integration hooks;
e.g. org.infinispan.query.impl.LifecycleManager in META-INF/services ServiceLoader pattern
As you can see, the module is marked as optional, as I mentioned,
we should be fine if it's not there, so, imo, this could be added by default.
-Ales
12 years, 5 months