Interesting functional programming additions for concurrent collections (i.e. caches)
coming up in JDK8 too.
Begin forwarded message:
From: Doug Lea <dl(a)cs.oswego.edu>
Subject: [concurrency-interest] ConcurrentHashMap bulk parallel operations
Date: August 13, 2012 6:08:50 PM GMT+02:00
To: "Concurrency-interest(a)cs.oswego.edu"
<Concurrency-interest(a)cs.oswego.edu>
As most of you know, lambda-based bulk operations with optional
parallelism are planned for JDK8. We've been working with JSR335 folks
on these. However, in addition we plan to offer access to bulk
operations designed for use on truly concurrent data structures, in
particular ConcurrentHashMaps. These methods are designed for use on
"live" data, registeries, caches, in-memory data stores, chunks of
Hadoop-style "big data" sets. and so on that might be concurrently
updated during the course of computations. The form and style of this
API are targeted to support operations that can be sensibly applied in
such contexts. This boils down to only three basic forms: forEach,
search, and reduce (each with multiple variations), expressed in an
imperative style -- no fluency, stateful Streams, etc that are planned
for the JDK8 java.util-based framework. There is no sense in
compromising support for either of these kinds of target usages, so we
won't. However, the functionality coverage is essentially identical
for those operations that do apply, so we anticipate that the JDK8
java.util-based framework will be able to layer on top of this when
applicable.
The API is in two layers. Nested (static) class "ForkJoinTasks"
returns task objects that, when invoked, provide the given
functionality, but may also be used in other ways. Nested (inner)
class "Parallel" provides an API for using them with a given
ForkJoinPool. The class-level javadocs for CHM(V8).Parallel are pasted
below. There will surely be some further API changes in the course of
JDK8 integration. However, in the mean time, we are releasing a
stand-alone form, intended to be usable by both current
ConcurrentHashMapV8 users running JDK7, as well as those experimenting
with current JDK8 preview lambda builds (at
http://jdk8.java.net/lambda/) The current javadocs don't have any
usage examples, because they look vastly different in JDK7 vs JDK8.
Doing this forces a bit of disruption on everyone though.
1. To avoid FJ version mismatches, the current jsr166y FJ classes are
duplicated into jsr166e.
2. To avoid JDK version mismatches, the j.u.c version (plain
"ConcurrentHashMap" without the "V8") is committed in main
repository,
while keeping its "V8" in package jsr166e. (This also required an
initial merge of jsr166e.LongAdder and related classes.)
3. To avoid current and future naming problems, a set of function
interfaces are nested within ConcurrentHashMap, with names
intentionally different than those currently used in JDK8 previews
(for example "Action" instead of "Block"). For lambda-enabled
JDK8-preview users, this won't much matter because lambda expressions
will still match as expected. However, others tediously using this
with emulated-lambdas via static instances of classes implementing the
interfaces will have to bear with future name changes of these
interfaces. This forbearance starts immediately, because the
previously named nested MappingFunction and RemappingFunction are
already changed so as to be applicable across the extended
APIs. Sorry.
... pasting from
http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/ConcurrentHas...
public class ConcurrentHashMapV8.Parallel
An extended view of a ConcurrentHashMap supporting bulk parallel operations. These
operations are designed to be be safely, and often sensibly, applied even with maps that
are being concurrently updated by other threads; for example, when computing a snapshot
summary of the values in a shared registry. There are three kinds of operation, each with
four forms, accepting functions with Keys, Values, Entries, and (Key, Value) arguments
and/or return values. Because the elements of a ConcurrentHashMap are not ordered in any
particular way, and may be processed in different orders in different parallel executions,
the correctness of supplied functions should not depend on any ordering, or on any other
objects or values that may transiently change while computation is in progress; and except
for forEach actions, should ideally be side-effect-free.
* forEach: Perform a given action on each element. A variant form applies a given
transformation on each element before performing the action.
* search: Return the first available non-null result of applying a given function on
each element; skipping further search when a result is found.
* reduce: Accumulate each element. The supplied reduction function cannot rely on
ordering (more formally, it should be both associative and commutative). There are five
variants:
o Plain reductions. (There is not a form of this method for (key, value)
function arguments since there is no corresponding return type.)
o Mapped reductions that accumulate the results of a given function applied to
each element.
o Reductions to scalar doubles, longs, and ints, using a given basis value.
The concurrency properties of the bulk operations follow from those of ConcurrentHashMap:
Any non-null result returned from get(key) and related access methods bears a
happens-before relation with the associated insertion or update. The result of any bulk
operation reflects the composition of these per-element relations (but is not necessarily
atomic with respect to the map as a whole unless it is somehow known to be quiescent).
Conversely, because keys and values in the map are never null, null serves as a reliable
atomic indicator of the current lack of any result. To maintain this property, null serves
as an implicit basis for all non-scalar reduction operations. For the double, long, and
int versions, the basis should be one that, when combined with any other value, returns
that other value (more formally, it should be the identity element for the reduction).
Most common reductions have these properties; for example, computing a sum with basis 0 or
a minimum with basis MAX_VALUE.
Search and transformation functions provided as arguments should similarly return null to
indicate the lack of any result (in which case it is not used). In the case of mapped
reductions, this also enables transformations to serve as filters, returning null (or, in
the case of primitive specializations, the identity basis) if the element should not be
combined. You can create compound transformations and filterings by composing them
yourself under this "null means there is nothing there now" rule before using
them in search or reduce operations.
Methods accepting and/or returning Entry arguments maintain key-value associations. They
may be useful for example when finding the key for the greatest value. Note that
"plain" Entry arguments can be supplied using new AbstractMap.SimpleEntry(k,v).
Bulk operations may complete abruptly, throwing an exception encountered in the
application of a supplied function. Bear in mind when handling such exceptions that other
concurrently executing functions could also have thrown exceptions, or would have done so
if the first exception had not occurred.
Parallel speedups compared to sequential processing are common but not guaranteed.
Operations involving brief functions on small maps may execute more slowly than sequential
loops if the underlying work to parallelize the computation is more expensive than the
computation itself. Similarly, parallelization may not lead to much actual parallelism if
all processors are busy performing unrelated tasks.
All arguments to all task methods must be non-null.
jsr166e note: During transition, this class uses nested functional interfaces with
different names but the same forms as those expected for JDK8.
_______________________________________________
Concurrency-interest mailing list
Concurrency-interest(a)cs.oswego.edu
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache