[infinispan-dev] Lambda Serialization

William Burns mudokonman at gmail.com
Thu Mar 3 10:19:51 EST 2016


I now have a working branch that is using this for the new CacheStream
interface [1].

This lets users work with a stream without needing any casts on the
intermediate or terminal operations.  Note that I completely revamped the
BaseStreamTest [2].  As a result, pretty much every example a user can find
online can be copy-pasted without additional changes, which to me is
HUGE.
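
To make the mechanism concrete, here is a minimal, self-contained sketch of
the overload trick from the original proposal quoted further down.  MiniStream
is a hypothetical stand-in invented for illustration, not the real CacheStream;
it only reports which overload the compiler picked and whether the argument is
Serializable.

```java
import java.io.Serializable;
import java.util.function.Function;

public class LambdaOverloadSketch {

    /** Same shape as the SerializableFunction described in the thread. */
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {
    }

    /** Hypothetical stand-in for CacheStream; it only records which overload ran. */
    public static final class MiniStream<T> {

        public <R> String map(Function<? super T, ? extends R> mapper) {
            return "Function, serializable=" + (mapper instanceof Serializable);
        }

        public <R> String map(SerializableFunction<? super T, ? extends R> mapper) {
            return "SerializableFunction, serializable=" + (mapper instanceof Serializable);
        }
    }

    /** A bare lambda: the compiler selects the more specific overload. */
    public static String withLambda() {
        return new MiniStream<String>().map(s -> s.length());
    }

    /** A value already typed as Function can only match the plain overload. */
    public static String withPlainFunction() {
        Function<String, Integer> f = s -> s.length();
        return new MiniStream<String>().map(f);
    }

    public static void main(String[] args) {
        System.out.println(withLambda());
        System.out.println(withPlainFunction());
    }
}
```

The second call shows the flip side: once a lambda has been assigned to a
plain Function variable it is no longer Serializable, so only lambdas passed
directly (or typed as SerializableFunction) benefit.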

Unfortunately this bloats the API quite a bit, and I had to add a bunch of
Serializable* classes (e.g. [3]).  The API bloat seems acceptable to me; I
had thought about making a new separate API, but that seems unnecessary.
As for the extra classes, I tried defining the generics on the method
itself instead, but the compiler still can't figure out which method to
invoke [4].

I am still planning on adding CacheIntStream, CacheDoubleStream and
CacheLongStream interfaces as well.  Without those, users would need to do
casts on the subsequent primitive stream if they used any of the
mapTo<Int|Double|Long> or flatMapTo<Int|Double|Long> methods.
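
As a rough illustration of the kind of cast meant here, assuming it is the
usual intersection cast on a lambda: a plain java.util.stream.IntStream only
accepts IntPredicate, so the lambda has to be cast to become Serializable.
The class and method names below are made up for the example.

```java
import java.io.Serializable;
import java.util.function.IntPredicate;
import java.util.stream.Stream;

public class PrimitiveStreamCastSketch {

    /** A lambda targeted at plain IntPredicate is not Serializable. */
    public static IntPredicate plainPredicate() {
        return i -> i > 3;
    }

    /** The intersection cast makes the same lambda Serializable. */
    public static IntPredicate castPredicate() {
        return (IntPredicate & Serializable) i -> i > 3;
    }

    /** Counts words longer than three characters via a primitive IntStream. */
    public static long countLongWords(IntPredicate longerThanThree) {
        return Stream.of("a", "bb", "cccc", "ddddd")
                .mapToInt(String::length)
                .filter(longerThanThree)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(plainPredicate() instanceof Serializable);
        System.out.println(castPredicate() instanceof Serializable);
        System.out.println(countLongWords(castPredicate()));
    }
}
```

A dedicated primitive interface with Serializable overloads would let the
lambda be passed bare, the same way as on the object stream.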

Another side benefit of this refactoring is that we can easily add new
operations to the stream interfaces.  For example, we could add
approximation methods that return after a certain timeout, or
histogram-specific support.  I am open to whatever people think they would
want added here.  Unfortunately, we can't easily add a Map.Entry stream
(similar to Spark's PairRDD) without redoing a bunch more of the APIs, and I
don't know if we have time to support that.

Any feedback would be great, hoping to get this ironed out soon before API
freeze :)

Cheers,

 - Will

[1] https://github.com/wburns/infinispan/tree/ISPN-6272
[2]
https://github.com/wburns/infinispan/commit/09734d533a445df23df94f7a053b11bc496422ec#diff-170c50a8f618af028f238109f0f1392a
[3]
https://github.com/wburns/infinispan/blob/ISPN-6272/core/src/main/java/org/infinispan/util/SerializableFunction.java
[4] https://gist.github.com/wburns/dffe4f7543f68215f74b


On Wed, Feb 17, 2016 at 8:39 AM William Burns <mudokonman at gmail.com> wrote:

> Actually I have a PR that will go in before the 8.2 Final release that
> uses this [1].  Specifically check out the ClusterExecutor interface.  It
> doesn't have the issues of streams with overloading existing methods,
> however it adds both overloaded variants and you can see how the tests
> invoke those.
>
> [1] https://github.com/infinispan/infinispan/pull/4008
>
>
> On Wed, Feb 17, 2016 at 3:23 AM Galder Zamarreño <galder at redhat.com>
> wrote:
>
>> Hey Will,
>>
>> A very interesting discovery!
>>
>> Do you have a branch where you've tried this out? I'd like to play with it
>> to see it in action and analyse the downsides more closely.
>>
>> Cheers,
>> --
>> Galder Zamarreño
>> Infinispan, Red Hat
>>
>> > On 9 Feb 2016, at 17:36, William Burns <mudokonman at gmail.com> wrote:
>> >
>> > I wanted to propose a pretty simple way of making the lambdas
>> > serializable by default that I stumbled upon while working on another
>> > issue.
>> >
>> > I noticed that the compiler does some nice things during method
>> > resolution [1].  To be more specific, when you have 2 methods with the
>> > same name that differ by argument types, it will attempt to pick the
>> > most "specific" one.  You can think of "specific" here as: if this
>> > argument type can be cast to the other one, but the other one can't be
>> > cast to this type, then this one is the most specific.
>> >
>> > Here is an example, given the following class
>> >
>> > interface SerializableFunction<T, R> extends Serializable, Function<T, R> {}
>> >
>> > The stream interface already defines:
>> >
>> >    <R> Stream<R> map(Function<? super T, ? extends R> mapper);
>> >
>> > But we could add this to the CacheStream interface
>> >
>> >    <R> CacheStream<R> map(SerializableFunction<? super T, ? extends R> mapper);
>> >
>> > In this case you have 2 different map methods accessible from your
>> > CacheStream instance.  When passing a lambda, the Java compiler will
>> > automatically choose the most specific one (in this case the
>> > SerializableFunction one, since Function can't be cast to
>> > SerializableFunction).  This will then make the lambda automatically
>> > Serializable.  In this way nothing special has to be done (i.e. an
>> > explicit cast) to make the instance Serializable.
>> >
>> > This allows anyone using our Cache interface to immediately get lambdas
>> > that are Serializable when using Streams.
>> >
>> > The main problem however would be ambiguity, because the serialization
>> > would only be applied if you are working with a reference declared as
>> > CacheStream etc.  Also this means there are 2 methods (but that seems
>> > fine to me), so it could cause a bit of confusion.  The
>> > non-serializable method is still helpful if people want to use their
>> > own Externalizer, since their implementation doesn't have to implement
>> > Serializable then.
>> >
>> > What do you guys think?  It seems like a decent compromise to me.
>> >
>> >  - Will
>> >
>> >
>> >
>> >
>> >
>> > [1]
>> https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.2.5
>> >
>> >
>> > _______________________________________________
>> > infinispan-dev mailing list
>> > infinispan-dev at lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>
>