[infinispan-dev] Providing a context for object de-serialization

Sanne Grinovero sanne at infinispan.org
Wed Jul 11 19:20:52 EDT 2012


On 10 July 2012 12:09, Galder Zamarreño <galder at redhat.com> wrote:
>
> On Jul 6, 2012, at 11:48 PM, Sanne Grinovero wrote:
>
>> On 6 July 2012 15:06, Galder Zamarreño <galder at redhat.com> wrote:
>>>
>>> On Jun 26, 2012, at 6:13 PM, Sanne Grinovero wrote:
>>>
>>>> Imagine I have a value object which needs to be stored in Infinispan:
>>>>
>>>> class Person {
>>>>  final String nationality;
>>>>  final String fullName;
>>>>  Person(String nationality, String fullName) {
>>>>   this.nationality = nationality;
>>>>   this.fullName = fullName;
>>>>  }
>>>> }
>>>>
>>>> And now let's assume that - as you could expect - most Person
>>>> instances have the same value for the nationality String, but a
>>>> different name.
>>>>
>>>> I want to define a custom Externalizer for my type, but the current
>>>> Externalizer API doesn't allow referring to some common application
>>>> context, which might be extremely useful to deserialize this Person
>>>> instance:
>>>>
>>>> we could avoid filling the memory of my Grid with multiple copies
>>>> of the nationality String repeated all over, when a single String [1]
>>>> could be reused.
>>>>
>>>> Would it be a good idea to have the Externalizer instances have an
>>>> initialization phase receiving a ComponentRegistry, so I could look up
>>>> some custom service to de-duplicate or otherwise optimize my in-memory
>>>> data representation?
>>>> Personally I'd prefer to receive it injected via the constructor so
>>>> that I could use a final field when my custom Externalizer is
>>>> constructed.
>>>>
>>>> This is OGM related.
>>>
>>> ^ Makes sense, but only solves one part of the problem.
>>>
>>> String is probably a bad example here [as you already said, due to 1], but a better example is if you have a Nationality class with country name, timezone, etc. in it.
>>>
>>> My point is, your suggestion works for nodes to which data is replicated, but on the original node, where you've created 100 Person instances with Spanish nationality, you'd still potentially have 100 instances.
>>>
>>> Did you have anything in mind for this?
>>
>> That's where the ComponentRegistry's role kicks in: the user
>> application created these object instances before storing them in the
>> original node, and if it is a bit cleverly designed it will have
>> something like a Map of immutable Nationality instances, so that every
>> time it needs Spanish it looks up the same instance.
>
> ^ Yeah. The problem I was trying to highlight is what happens to the original instances.
>
> I guess the problem of the original object instances goes away if the rest of the JVM no longer keeps references to them, so they can be garbage collected. This of course depends on the client application, but hints would need to be provided for the OGM case so that users avoid such an anti-pattern, right?

I intend to use this optimisation only on selected types, specifically
objects which are not exposed to the application: as you say that
would be tricky.

There is no such "original instance", as all instances would be
created by the same factory; that's why I want to share the factory
instance between the Externalizer and the application: for it to work,
they should not have two different pools.
This implies there will always be a single Nationality instance with
value "Spanish", both in the Infinispan internals and in the app using it.
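
To make the idea concrete, here is a minimal sketch in plain Java, under
some loud assumptions: NationalityPool, PersonExternalizer and roundTrip
are made-up names for illustration only, Person holds a Nationality
rather than a String (as Galder suggested), and PersonExternalizer merely
mimics the shape of an externalizer on top of JDK streams. It does not
implement Infinispan's Externalizer interface, which currently offers no
such injection point.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Immutable value type shared by many Person instances.
final class Nationality {
    final String name;
    Nationality(String name) { this.name = name; }
}

// Hypothetical factory shared by the application and the externalizer:
// it hands out exactly one canonical Nationality per name.
final class NationalityPool {
    private final ConcurrentMap<String, Nationality> pool = new ConcurrentHashMap<>();

    Nationality get(String name) {
        return pool.computeIfAbsent(name, Nationality::new);
    }
}

final class Person {
    final Nationality nationality;
    final String fullName;

    Person(Nationality nationality, String fullName) {
        this.nationality = nationality;
        this.fullName = fullName;
    }
}

// Hypothetical externalizer (NOT the Infinispan interface): the pool is
// injected via the constructor, so it can be kept in a final field.
final class PersonExternalizer {
    private final NationalityPool pool;

    PersonExternalizer(NationalityPool pool) { this.pool = pool; }

    void writeObject(DataOutputStream out, Person p) throws IOException {
        out.writeUTF(p.nationality.name);
        out.writeUTF(p.fullName);
    }

    Person readObject(DataInputStream in) throws IOException {
        // Resolve the name through the shared pool instead of allocating
        // a fresh Nationality for every deserialized Person.
        Nationality nationality = pool.get(in.readUTF());
        return new Person(nationality, in.readUTF());
    }

    // Serializes to a byte array and reads it back, simulating a copy
    // arriving from another node.
    Person roundTrip(Person p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                writeObject(out, p);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return readObject(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the application also obtains its Nationality instances from the same
NationalityPool, a deserialized Person ends up referencing the very same
Nationality object the application already holds, which is the whole
point of sharing the factory.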

>
> So, assuming no other refs are kept, you're left with the references that the cache has, which are reduced by the process you explained.
>
>> Consequently the custom externalizer implementation needs access to
>> the same service instance as used by the application, so that it can
>> make use of the same pool rather than having to create its own pool
>> instance: the essence of my proposal is really to have the user
>> application and the Externalizer framework share the same Factory.
>>
>>> Btw, not sure about the need of ComponentRegistry here. IMO, this kind of feature should work for Hot Rod clients too, where Externalizers might be used in the future, and where there's no ComponentRegistry (unless it's a RemoteCacheStore...)
>>
>> It doesn't need to be literally a ComponentRegistry interface
>> implementation, just anything which allows the Externalizer to be
>> initialized using some externally provided service as in the above
>> example.
>>
>> This optimisation should have no functional impact; it is just an
>> optionally implementable trick which saves some memory. So if we can
>> think of a way to do the same for Hot Rod that's very cool, but it doesn't
>> necessarily have to use the same components and (internal) interfaces.
>>
>> I'm thinking of this as an "optionality" similar to the one we have when
>> choosing between Serializable vs. custom Externalizers: people can
>> plug one in if they know what they're doing (like these instances
>> should definitely be immutable) but everything just works fine if you
>> don't.
>> I'm not really sure if there is a wide range of applications, nor do I
>> have any idea of the amount of memory it could save in practice... just an
>> idea I wanted to sketch.
>> I suspect it might allow me to do some cool things with both OGM and
>> Lucene Directory, as you can rehydrate complex object graphs from
>> different cache entries, reassembling them with direct references...
>> dreaming?
>
> Not dreaming :). For sure we should focus on the most important use case here, which is OGM. We can always work on extending it to other bits at a later stage.
>
>>
>>>
>>>>
>>>> Cheers,
>>>> Sanne
>>>>
>>>>
>>>> 1 - or any immutable object: I'm using String as an example, so let's
>>>> forget about the static String pool optimizations the JVM might
>>>> enable.
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> --
>>> Galder Zamarreño
>>> Sr. Software Engineer
>>> Infinispan, JBoss Cache


