[infinispan-dev] Lambda Serialization

William Burns mudokonman at gmail.com
Thu Mar 3 10:40:07 EST 2016


On Thu, Mar 3, 2016 at 10:26 AM Sanne Grinovero <sanne at infinispan.org>
wrote:

> On 3 March 2016 at 15:19, William Burns <mudokonman at gmail.com> wrote:
> > I now have a working branch that is using this for the new CacheStream
> > interface [1].
> >
> > With this it allows users to use a stream without needing any casts for any
> > of the intermediate or terminal operations.  Note I completely revamped the
> > BaseStreamTest [2].  So in that case every example a user can find online can
> > be pretty much copy pasted without additional changes, which to me is HUGE.
>
> I agree, it's HUGE! Great work!
>

Oh, I forgot to mention the one caveat with this approach.  If the user
declares their Cache, or any of the collections returned from it, as a base
type (i.e. Map, ConcurrentMap, Set, Collection), the automatic serialization
is lost and manual casting is required again, because the compiler then only
sees the base type's non-serializable overloads.  Normal method chaining
keeps the benefit, though.  This seems like an acceptable and unavoidable
issue to me.
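
To make the caveat concrete, here is a minimal sketch.  It does not use the
actual Infinispan types: BaseOps, CacheOps and Impl are hypothetical stand-ins
for Stream, CacheStream and its implementation, and each map overload simply
reports whether the lambda it received is Serializable.

```java
import java.io.Serializable;
import java.util.function.Function;

public class LambdaSerializationSketch {

    // Same shape as the SerializableFunction discussed in this thread.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Hypothetical stand-in for a base type such as Stream.
    interface BaseOps<T> {
        boolean map(Function<? super T, ?> mapper);
    }

    // Hypothetical stand-in for CacheStream: adds the more specific overload.
    interface CacheOps<T> extends BaseOps<T> {
        boolean map(SerializableFunction<? super T, ?> mapper);
    }

    // Both overloads only report whether the received lambda is Serializable.
    static class Impl<T> implements CacheOps<T> {
        @Override
        public boolean map(Function<? super T, ?> mapper) {
            return mapper instanceof Serializable;
        }

        @Override
        public boolean map(SerializableFunction<? super T, ?> mapper) {
            return mapper instanceof Serializable;
        }
    }

    // Declared as the cache type: the compiler picks the SerializableFunction overload.
    static boolean viaCacheType() {
        CacheOps<String> ops = new Impl<>();
        return ops.map(s -> s.length());
    }

    // Declared as the base type: only the plain Function overload is visible.
    static boolean viaBaseType() {
        BaseOps<String> ops = new Impl<>();
        return ops.map(s -> s.length());
    }

    // Declared as the base type, with the manual intersection cast.
    static boolean viaBaseTypeWithCast() {
        BaseOps<String> ops = new Impl<>();
        return ops.map((Function<String, Integer> & Serializable) s -> s.length());
    }

    public static void main(String[] args) {
        System.out.println(viaCacheType());        // true
        System.out.println(viaBaseType());         // false
        System.out.println(viaBaseTypeWithCast()); // true
    }
}
```

The choice of overload depends only on the declared (static) type of the
variable, which is why chaining directly off the cache keeps the benefit while
assigning to a base type loses it.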


>
> >
> > Unfortunately this causes the API to bloat quite a bit and I had to add a
> > bunch of Serializable* classes (ex. [3]).  The former bloat issue seems
> > acceptable to me, I had thought about making a new separate API, but it
> > seems like it is unneeded to me.  The latter issue I had tried defining the
> > generics on the method itself but the compiler can't quite figure out which
> > method to invoke still [4].
>
> Rather than making many things serializable, did you consider to extend
> our collection of JBoss Marshallers?
> Maybe support for marshalling many of JDK's stream components could
> be contributed directly to the Marshaller project.
>

I personally haven't looked at this aspect.  To be honest, I was leaning on
Galder a bit more here, since he is much more familiar with the Marshalling
code.  If we can do this instead I think it would be even bigger.
Unfortunately I don't know how feasible it is.


>
> >
> > I am still planning on adding a CacheIntStream, CacheDoubleStream and
> > CacheLongStream interfaces as well.  Without those users would need to do
> > casts on the subsequent primitive stream if they used any of the
> > mapTo<Int|Double|Long> or flatMapTo<Int|Double|Long> methods.
> >
> > Another side benefit of this refactoring is we can easily add new operations
> > to the stream interfaces.  We could add approximation methods maybe that
> > return after a certain timeout, histogram specific support among others.  I
> > am open to whatever people think they would want added here.  Unfortunately,
> > we can't easily add in a Map.Entry stream (similar to spark PairRDD) without
> > redoing a bunch more of the APIs and I don't know if we have time to support
> > that.
> >
> > Any feedback would be great, hoping to get this ironed out soon before API
> > freeze :)
> >
> > Cheers,
> >
> >  - Will
> >
> > [1] https://github.com/wburns/infinispan/tree/ISPN-6272
> > [2] https://github.com/wburns/infinispan/commit/09734d533a445df23df94f7a053b11bc496422ec#diff-170c50a8f618af028f238109f0f1392a
> > [3] https://github.com/wburns/infinispan/blob/ISPN-6272/core/src/main/java/org/infinispan/util/SerializableFunction.java
> > [4] https://gist.github.com/wburns/dffe4f7543f68215f74b
> >
> >
> >
> > On Wed, Feb 17, 2016 at 8:39 AM William Burns <mudokonman at gmail.com> wrote:
> >>
> >> Actually I have a PR that will go in before the 8.2 Final release that
> >> uses this [1].  Specifically check out the ClusterExecutor interface.  It
> >> doesn't have the issues of streams with overloading existing methods,
> >> however it adds both overloaded variants and you can see how the tests
> >> invoke those.
> >>
> >> [1] https://github.com/infinispan/infinispan/pull/4008
> >>
> >>
> >> On Wed, Feb 17, 2016 at 3:23 AM Galder Zamarreño <galder at redhat.com>
> >> wrote:
> >>>
> >>> Hey Will,
> >>>
> >>> A very interesting discovery!
> >>>
> >>> Do you have a branch where you've tried this out? I'd like to play with
> >>> it to see it in action and analyse the downsides more closely.
> >>>
> >>> Cheers,
> >>> --
> >>> Galder Zamarreño
> >>> Infinispan, Red Hat
> >>>
> >>> > On 9 Feb 2016, at 17:36, William Burns <mudokonman at gmail.com> wrote:
> >>> >
> >>> > I wanted to propose a pretty simple way of making the lambdas
> >>> > serializable by default that I stumbled upon while working on another
> >>> > issue.
> >>> >
> >>> > I noticed that the compiler's method resolution does some nice
> >>> > things [1].  To be more specific, when you have 2 methods with the same
> >>> > name that vary only by argument types, it will attempt to pick the most
> >>> > "specific" one.  You can think of it this way: if one argument type can
> >>> > be cast to the other, but the other can't be cast back to it, then the
> >>> > first one is the most specific.
> >>> >
> >>> > Here is an example, given the following class
> >>> >
> >>> > interface SerializableFunction<T, R> extends Serializable, Function<T, R>
> >>> >
> >>> > The stream interface already defines:
> >>> >
> >>> >    <R> Stream<R> map(Function<? super T, ? extends R> mapper);
> >>> >
> >>> > But we could add this to the CacheStream interface
> >>> >
> >>> >    <R> CacheStream<R> map(SerializableFunction<? super T, ? extends R> mapper);
> >>> >
> >>> > In this case you have 2 different map methods accessible from your
> >>> > CacheStream instance.  When passing a lambda the Java compiler will
> >>> > automatically choose the most specific one (in this case the
> >>> > SerializableFunction one since Function can't be cast to
> >>> > SerializableFunction).  This will then make the lambda automatically
> >>> > Serializable.  In this way nothing special has to be done (ie. explicit
> >>> > cast) to make the instance Serializable.
> >>> >
> >>> > This allows anyone using our Cache interface to immediately get lambdas
> >>> > that are Serializable when using Streams.
> >>> >
> >>> > The main problem however would be ambiguity, because the serialization
> >>> > would only be applied when your variable is declared as CacheStream
> >>> > etc.  Also this means there are 2 methods (but that seems fine to me),
> >>> > so it could cause a bit of confusion.  The non-serializable method is
> >>> > still helpful if people want to use their own Externalizer, since their
> >>> > implementation doesn't have to implement Serializable then.
> >>> >
> >>> > What do you guys think?  It seems like a decent compromise to me.
> >>> >
> >>> >  - Will
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > [1] https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.2.5
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > infinispan-dev mailing list
> >>> > infinispan-dev at lists.jboss.org
> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev