[infinispan-dev] StoreByValueTest tck test
Dan Berindei
dan.berindei at gmail.com
Mon Jan 28 09:01:33 EST 2013
On Mon, Jan 28, 2013 at 2:43 PM, Manik Surtani <msurtani at redhat.com> wrote:
>
> On 28 Jan 2013, at 12:35, Dan Berindei <dan.berindei at gmail.com> wrote:
>
>
>
> On Mon, Jan 28, 2013 at 1:56 PM, Manik Surtani <msurtani at redhat.com>wrote:
>
>> Let me clarify a few things on this thread. THere seems to be a bit of
>> confusion here. :)
>>
>> storeAsBinary in Infinispan was designed with the following purposes in
>> mind, in order of importance:
>>
>> 1) Performance. Prevent serialising/deserializing an entry multiple
>> times (e.g., to write through to disk, to replicate over the network,
>> concurrent threads needing to read the object representation).
>>
>>
> TBH I don't think storeAsBinary as it works now is that good for
> performance, because MarshalledValueInterceptor compacts keys/values after
> every operation (see MarshalledValueInterceptor.java:320 and its callers).
> Once a key/value is deserialized, its serialized form is deleted, and it
> has to be serialized again if a remote node asks for it.
>
>
> That is only correct on the node where you're running the operation. The
> remote node has different characteristics. The byte array is never
> deserialized when reading off the wire, always kept as a byte array, and
> when asked for again (via a remote GET) it just needs to do a buffer copy.
> Now this is breaks the moment a thread local to that remote node looks up
> an entry, but if you have some form of key affinity then you really see
> this benefit.
>
>
Correct, except if you ever do a local GET on the "remote" node, it will
deserialize the object and from that moment on it will have to serialize it
again for each remote GET.
Most of the time nodes are treated as interchangeable, and the probability
of a key being accessed from the "remote" node is the same as from any
other node.
> So it would save at most one serialization compared to storing the entries
> as references (and only if the entry also needs to be written to a cache
> store). Instead it adds a bit of overhead on each operation to keep track
> of the marshalled value status.
>
>
> Well, if you have > 1 cache store enabled, etc etc.
>
>
Or you could have no cache store...
> 2) Classloader isolation (as Galder mentioned). This became a secondary
>> purpose of this feature (originally observed as a side-effect). Enhanced
>> by allowing storeKeyAsBinary and storeValueAsBinary options for more
>> fine-grained control of this behaviour.
>>
>>
> I'd say this has become the reason most people use this feature now, even
> though a regular cache should work fine in AS7 there are still other
> environments where it is needed.
>
>
> Yes, they are both as important today; I was just stating what the
> original intentions were. :)
>
>
>
>
>> Now lets consider what JSR 107 needs. Similarly named, the feature in
>> JSR 107 serves a completely different purpose, and this is referential
>> integrity. Think database-style isolation (repeatable read, etc) where
>> concurrent threads holding object references to the same value, and
>> mutating the same value, are not visible until a commit.
>>
>> I originally thought that Infinispan's storeAsBinary can be used for
>> this, but apparently not without some additional changes/tweaks. Maybe we
>> need:
>>
>> 1) A new config option for this behaviour. <storeAsBinary
>> defensive="true" /> ?
>>
> 2) If enabled, maybe use a subclass of MarshalledValue
>> (DefensiveMarshalledValue?) that *always* stores a byte[] and never caches
>> the object representation?
>>
>>
> I think we'd still need to cache the object instance while the command is
> executing, otherwise we'll have too many deserializations. But perhaps the
> new setting could control whether MarshalledValueInterceptor calls
> MarshalledValue.compact with preferSerializedRepresentation == true instead
> of false, as it does now.
>
>
> Well, you will want eager serialisation too, even in local mode. So that
> would have to be built in. So maybe rather than a MarshalledValue
> subclass, we really need a MarshalledValueInterceptor subclass. Even
> easier/better encapsulated. :)
>
>
That sounds good to me.
>
>
>
>> What do you think?
>>
>> Cheers
>> Manik
>>
>> On 28 Jan 2013, at 10:00, Sanne Grinovero <sanne at infinispan.org> wrote:
>>
>> > I remember Manik and me pair-programming on that class to simplify it
>> > a bit - especially as there are some performance complexities - but we
>> > ended up not touching it as any change would have violated some
>> > expectations of one feature or another.
>> >
>> > Let's put this on the list of cleanups to be performed for 6.0?
>> >
>> > On 28 January 2013 09:14, Galder Zamarreño <galder at redhat.com> wrote:
>> >>
>> >> On Jan 25, 2013, at 11:37 AM, Sanne Grinovero <sanne at infinispan.org>
>> wrote:
>> >>
>> >>> On 25 January 2013 11:11, Galder Zamarreño <galder at redhat.com> wrote:
>> >>>>
>> >>>> On Jan 24, 2013, at 4:26 PM, Sanne Grinovero <sanne at infinispan.org>
>> wrote:
>> >>>>
>> >>>>> It's important to note that Infinispan's implementation of storing
>> as
>> >>>>> binary isn't guaranteeing different instances of objects are
>> returned
>> >>>>> to different get() invocations (especially when they happen in
>> >>>>> parallel).
>> >>>>
>> >>>> ^ Do you have a test for this?
>> >>>
>> >>> No, it's self-evident by reading the code. I'd venture saying it's a
>> >>> design choice: the option was not designed to provide isolation,
>> >>> people should not abuse of it for a different purpose.
>> >>>
>> >>>> Could this be related to the fact that a get(), unless it had
>> received that entry from another node, will held as reference?
>> >>>>
>> >>>> It'd be interesting if that test works if after a put() you call
>> compact()...
>> >>>>
>> >>>>> This is the reason for example that Hibernate OGM can't use this
>> flag
>> >>>>> to have safe and independent instances, but needs to make defensive
>> >>>>> copies if returned values. As I read in your first post, you want to
>> >>>>> use this for defensive copies: that doesn't work, especially if the
>> >>>>> TCK is performing concurrent requests.
>> >>>>
>> >>>> ^ As I said, the storeAsBinary feature is heavily optimised for
>> performance, hence why it initially keeps instances as references, so that
>> if another thread requests the entry soon later, a reference is sent back
>> (no need to serialize/deserialize the entry just put)
>> >>>
>> >>> As you say "the reference is sent back", even if it's the same
>> >>> instance as a previous request. I have no doubt that's for performance
>> >>> reasons: I patched that code myself and have carefully kept that
>> >>> "feature" of instance reuse available.
>> >>> I'm not sure it can provide much of a benefit generally speaking, but
>> >>> this has always been like that and I guess there could be specific
>> >>> access patterns in which this is very useful.
>> >>
>> >> The reason we have storeAsBinary is due to lazyDeserialization. The
>> latter was a solution we designed to get around deserialization issues on
>> app server environments where JGroups would attempt to deserialize data
>> with the wrong classloader.
>> >>
>> >> The idea at the time was that deserialization would be delayed until a
>> thread with the correct classloader in context would come and deserialize
>> data, hence the name: lazy deserialization. This was needed in AS4/5/6.
>> >>
>> >> The design has always been the same, make sure data is kept in binary
>> format in the receiver and only deserialize when needed.
>> >>
>> >> This lazy deserialization is no longer needed in AS7 cos a particular
>> plugin is set in JBoss Marshaller which adds modular classloader info to
>> serialized data. So, when data arrives in the receiver, it can be
>> deserialized directly cos the classloader info allows for the correct
>> classloader to be found.
>> >>
>> >> The naming change of lazyDeserialization to storeAsBinary was on
>> purpouse and precisely with the aim of it becoming a way to provide
>> store-as-value capabilities. The problem is that as you and Vladimir have
>> spotted, this doesn't really work like that.
>> >>
>> >>>
>> >>> Cheers,
>> >>> Sanne
>> >>>
>> >>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On 24 January 2013 16:09, Manik Surtani <manik at jboss.org> wrote:
>> >>>>>>
>> >>>>>> On 24 Jan 2013, at 15:39, Vladimir Blagojevic <vblagoje at redhat.com>
>> wrote:
>> >>>>>>
>> >>>>>>> No valid reason Manik. In summary I thought I would have gotten
>> our keys/values serialized even in local VM if I turn on storeAsBinary but
>> that does not seem to be the case.
>> >>>>>>
>> >>>>>> Is it not? Perhaps it is only serialised the first time a serial
>> form is necessary. You can get around this by calling compact()
>> >>>>>>
>> >>>>>>
>> http://docs.jboss.org/infinispan/5.1/apidocs/org/infinispan/Cache.html#compact()
>> >>>>>>
>> >>>>>> But this definitely isn't the most optimal way of doing things.
>> Perhaps a new config option for eager serialisation might be necessary,
>> but for now calling compact() should work.
>> >>>>>>
>> >>>>>>> I need to use storeAsBinary to complete a feature of JSR 107 that
>> allows storing of key/value pairs as serialized values rather than simple
>> references.
>> >>>>>>
>> >>>>>> Yup, I realise.
>> >>>>>>
>> >>>>>>>
>> >>>>>>> TBH, I am not sure how can we do this given mechanisms we have in
>> place. I would have to implement serialization/deserialization in our jsr
>> 107 project but that would be a wrong path if we can somehow turn on our
>> own existing storeAsBinary for in VM stored objects (see Galder's email on
>> what is currently done)
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Regards,
>> >>>>>>> Vladimir
>> >>>>>>> On 13-01-24 7:09 AM, Manik Surtani wrote:
>> >>>>>>>> JSR 107's storeAsBinary and our storeAsBinary are conceptually
>> the same. You get a defensive copy and this should work.
>> >>>>>>>>
>> >>>>>>>> But see my comment below:
>> >>>>>>>>
>> >>>>>>>> Also adding Mircea in cc. Any reason why you're not using
>> infinispan-dev for this?
>> >>>>>>>>
>> >>>>>>>> On 24 Jan 2013, at 12:00, Galder Zamarreño <galder at redhat.com>
>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hey Vladimir,
>> >>>>>>>>>
>> >>>>>>>>> IIRC, for performance reasons, even with storeAsBinary,
>> Infinispan keeps the data as normal instance locally. When data is
>> serialized and sent to other nodes, again for performance reasons, it keeps
>> it as raw or byte[] format.
>> >>>>>>>>>
>> >>>>>>>>> So, storing objects by value only happens in counted occassions
>> when storeAsBinary is enabled.
>> >>>>>>>>>
>> >>>>>>>>> You can track it by using a debugger and see how the the
>> MarshalledValue instances are created.
>> >>>>>>>>>
>> >>>>>>>>> Not sure how to fix this without some extra configuration
>> option.
>> >>>>>>>>>
>> >>>>>>>>> Cheers,
>> >>>>>>>>>
>> >>>>>>>>> On Jan 23, 2013, at 5:38 PM, Vladimir Blagojevic <
>> vblagoje at redhat.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Galder,
>> >>>>>>>>>>
>> >>>>>>>>>> A quick search of help from you beacuse you are more familiar
>> with this area (storeAsBinary) than I am. There is a tck test that checks
>> storing of objects by value not by reference in the cache [1]. I thought
>> that if we set our underlying cache to be storeAsBinary we would handle
>> this tck requirement (store by value if neeed rather than by reference).
>> However, StoreByValueTest fails although I set our underlying Infinispan
>> cache to be storeAsBinary. I am using local cache athough I tried with
>> transport and dist_async setup as well - same result. Any ideas what is
>> going on?
>> >>>>>>>>>>
>> >>>>>>>>>> Have a look at the test [1] , result I get are below:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> -------------------------------------------------------
>> >>>>>>>>>> Running org.jsr107.tck.StoreByValueTest
>> >>>>>>>>>> Jan 23, 2013 12:35:29 PM org.jsr107.tck.util.ExcludeList <init>
>> >>>>>>>>>> INFO: ===== ExcludeList
>> url=file:/Users/vladimir/workspace/jsr107/jsr107tck/implementation-tester/target/test-classes/ExcludeList
>> >>>>>>>>>> Defined org.jsr107.tck.StoreByValueTest config
>> StoreAsBinaryConfiguration{enabled=true, storeKeysAsBinary=true,
>> storeValuesAsBinary=true}
>> >>>>>>>>>> Tests run: 6, Failures: 6, Errors: 0, Skipped: 0, Time
>> elapsed: 21.852 sec <<< FAILURE!
>> >>>>>>>>>>
>> >>>>>>>>>> Results :
>> >>>>>>>>>>
>> >>>>>>>>>> Failed tests:
>> get_Existing_MutateValue(org.jsr107.tck.StoreByValueTest): expected:
>> java.util.Date<Wed Jan 23 12:35:34 EST 2013> but was: java.util.Date<Wed
>> Jan 23 12:35:34 EST 2013>
>> >>>>>>>> ?? These seem the same to me? How is the TCK testing for these
>> two values? By reference? Or using .equals()?
>> >>>>>>>>
>> >>>>>>>>>> get_Existing_MutateKey(org.jsr107.tck.StoreByValueTest):
>> expected:<Wed Jan 23 12:35:38 EST 2013> but was:<null>
>> >>>>>>>> This seems a bigger issue. You might want to look at Infinispan
>> logs here?
>> >>>>>>>>
>> >>>>>>>>>> getAndPut_NotThere(org.jsr107.tck.StoreByValueTest): expected:
>> java.util.Date<Wed Jan 23 12:35:41 EST 2013> but was: java.util.Date<Wed
>> Jan 23 12:35:41 EST 2013>
>> >>>>>>>> Again, see my first comment.
>> >>>>>>>>
>> >>>>>>>>>>
>> getAndPut_Existing_MutateValue(org.jsr107.tck.StoreByValueTest): expected:
>> java.util.Date<Wed Jan 23 12:35:45 EST 2013> but was: java.util.Date<Wed
>> Jan 23 12:35:45 EST 2013>
>> >>>>>>>> Again, see my first comment.
>> >>>>>>>>
>> >>>>>>>>>>
>> getAndPut_Existing_NonSameKey_MutateValue(org.jsr107.tck.StoreByValueTest):
>> expected: java.util.Date<Wed Jan 23 12:35:48 EST 2013> but was:
>> java.util.Date<Wed Jan 23 12:35:48 EST 2013>
>> >>>>>>>> Again, see my first comment.
>> >>>>>>>>
>> >>>>>>>>>>
>> getAndPut_Existing_NonSameKey_MutateKey(org.jsr107.tck.StoreByValueTest):
>> expected:<Wed Jan 23 12:35:51 EST 2013> but was:<null>
>> >>>>>>>>>>
>> >>>>>>>>>> Tests run: 6, Failures: 6, Errors: 0, Skipped: 0
>> >>>>>>>>>>
>> >>>>>>>>>> [1]
>> https://github.com/jsr107/jsr107tck/blob/master/cache-tests/src/test/java/org/jsr107/tck/StoreByValueTest.java
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Galder Zamarreño
>> >>>>>>>>> galder at redhat.com
>> >>>>>>>>> twitter.com/galderz
>> >>>>>>>>>
>> >>>>>>>>> Project Lead, Escalante
>> >>>>>>>>> http://escalante.io
>> >>>>>>>>>
>> >>>>>>>>> Engineer, Infinispan
>> >>>>>>>>> http://infinispan.org
>> >>>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Manik Surtani
>> >>>>>>>> manik at jboss.org
>> >>>>>>>> twitter.com/maniksurtani
>> >>>>>>>>
>> >>>>>>>> Platform Architect, JBoss Data Grid
>> >>>>>>>> http://red.ht/data-grid
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Manik Surtani
>> >>>>>> manik at jboss.org
>> >>>>>> twitter.com/maniksurtani
>> >>>>>>
>> >>>>>> Platform Architect, JBoss Data Grid
>> >>>>>> http://red.ht/data-grid
>> >>>>>>
>> >>>>>>
>> >>>>>> _______________________________________________
>> >>>>>> infinispan-dev mailing list
>> >>>>>> infinispan-dev at lists.jboss.org
>> >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> infinispan-dev mailing list
>> >>>>> infinispan-dev at lists.jboss.org
>> >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Galder Zamarreño
>> >>>> galder at redhat.com
>> >>>> twitter.com/galderz
>> >>>>
>> >>>> Project Lead, Escalante
>> >>>> http://escalante.io
>> >>>>
>> >>>> Engineer, Infinispan
>> >>>> http://infinispan.org
>> >>>>
>> >>>>
>> >>>> _______________________________________________
>> >>>> infinispan-dev mailing list
>> >>>> infinispan-dev at lists.jboss.org
>> >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>
>> >>> _______________________________________________
>> >>> infinispan-dev mailing list
>> >>> infinispan-dev at lists.jboss.org
>> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>
>> >>
>> >> --
>> >> Galder Zamarreño
>> >> galder at redhat.com
>> >> twitter.com/galderz
>> >>
>> >> Project Lead, Escalante
>> >> http://escalante.io
>> >>
>> >> Engineer, Infinispan
>> >> http://infinispan.org
>> >>
>> >>
>> >> _______________________________________________
>> >> infinispan-dev mailing list
>> >> infinispan-dev at lists.jboss.org
>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >
>> > _______________________________________________
>> > infinispan-dev mailing list
>> > infinispan-dev at lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> --
>> Manik Surtani
>> manik at jboss.org
>> twitter.com/maniksurtani
>>
>> Platform Architect, JBoss Data Grid
>> http://red.ht/data-grid
>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> --
> Manik Surtani
> manik at jboss.org
> twitter.com/maniksurtani
>
> Platform Architect, JBoss Data Grid
> http://red.ht/data-grid
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20130128/c4d71ec1/attachment-0001.html
More information about the infinispan-dev
mailing list