[jbosscache-dev] migrating data stored in 1.x format to VAM format

Galder Zamarreno galder.zamarreno at redhat.com
Wed Mar 7 14:10:16 EST 2007


I tried that and it worked :D

Mircea, thanks for your collaboration!
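
For anyone following along, here is a rough, self-contained sketch of
the idea discussed further down: read each node the 1.x way, put it
back the 2.x way, and treat a DB null as a node without attributes.
It does not use the JBoss Cache API. MigrationSketch, readOld, writeOld,
writeNew, the in-memory Map standing in for the cache store and the
version marker 20 are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a "transforming" loader: reads old format, writes new format.
class MigrationSketch {
    // Made-up version marker; not the real VAM version short.
    static final short NEW_FORMAT_VERSION = 20;

    // Old format (assumed): a plain Java-serialized Map. A null column means "no attributes".
    @SuppressWarnings("unchecked")
    static Map<String, Object> readOld(byte[] stored) {
        if (stored == null) {
            return new HashMap<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (Map<String, Object>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Helper that produces sample data in the assumed old format.
    static byte[] writeOld(Map<String, Object> attrs) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new HashMap<>(attrs));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // New format (assumed): a version short followed by the serialized Map.
    // Empty attribute maps are stored as null, mirroring the "SQL null" root node.
    static byte[] writeNew(Map<String, Object> attrs) {
        if (attrs.isEmpty()) {
            return null;
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(NEW_FORMAT_VERSION);
            ObjectOutputStream objects = new ObjectOutputStream(out);
            objects.writeObject(new HashMap<>(attrs));
            objects.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // One-off pass: read every node the old way and put it back the new way.
    // Returns the number of nodes whose attributes were rewritten.
    static int migrate(Map<String, byte[]> store) {
        int rewritten = 0;
        for (Map.Entry<String, byte[]> node : store.entrySet()) {
            Map<String, Object> attrs = readOld(node.getValue());
            node.setValue(writeNew(attrs));
            if (!attrs.isEmpty()) {
                rewritten++;
            }
        }
        return rewritten;
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("name", "value");
        Map<String, byte[]> store = new HashMap<>();
        store.put("/", null);                  // root node stored as a null, no attributes
        store.put("/a", writeOld(attrs));      // node with old-format data
        System.out.println("rewritten nodes: " + migrate(store));
    }
}
```

Running the pass a second time over the same store fails in readOld,
since the data is no longer plain Java serialization. That is the same
kind of breakage described below for a root node created in 2.x format
and then read back the 1.x way.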

Galder Zamarreno wrote:
> That could work. I'll give it a try later today.
> 
> Thanks Mircea :)
> 
> Mircea Markus wrote:
>> Hi,
>>
>> On start the root node is created with an empty map. I've changed it
>> to be created with a SQL null value rather than an empty map.
>> This way we stay consistent with the nodes added indirectly (as they
>> are parents of nodes that are specifically added).
>> I also hope this will solve the problem with extending JDBCCacheLoader,
>> as I think that for deserialization the TransformingJDBCCacheLoader
>> knows how to handle DB nulls.
>>
>> Cheers,
>> Mircea
>>
>> On 3/4/07, Galder Zamarreno <galder.zamarreno at redhat.com> wrote:
>>
>>     I've got this working, including some basic unit tests and manual
>>     examples to transform entire cache stores from 1.x data to 2.x.
>>     These manual examples include source code, 1.x cache stores (file
>>     and jdbc derby db) and sample cache configurations.
>>
>>     Apart from the MV issue referred to earlier, I have realised that
>>     TransformingJDBCCacheLoader will have to extend JDBCCacheLoaderOld
>>     instead of JDBCCacheLoader.
>>
>>     The reason is that when JDBCCacheLoader starts, if the root does not
>>     exist, it'll create it, which in the TransformingJDBCCacheLoader will
>>     mean creating it in 2.x format.
>>
>>     This wouldn't be a problem if the root node didn't need querying
>>     again, but if customers want to migrate their data, they will start
>>     looping from the root node (cache.getRoot()) and the first thing
>>     they'll get is its children. This results in trying to load the root
>>     node from the cache store, which breaks, as we're reading from the db
>>     in 1.x format.
>>
>>     Remember that the TransformingJDBCCacheLoader reads in 1.x format and
>>     stores in 2.x format.
>>
>>     This has a very easy resolution, which is extending
>>     JDBCCacheLoaderOld. After that, it works a treat :).
>>
>>     Manik, assuming you're happy with the original idea, would extending
>>     JDBCCacheLoaderOld for this one off cache loader be ok with you?
>>
>>     Galder Zamarreno wrote:
>>      > I haven't touched this issue for a couple of weeks, and over the
>>      > last couple of days I had the chance to get back into it.
>>      >
>>      > After discussions with Brian we came up with a different approach
>>      > for this.
>>      >
>>      > My previous approach (no need to read all of it below) relied on
>>      > introducing legacy code into the main source code that would be
>>      > able to read 1.x serialization. As I started doing it, I realised
>>      > that it would need a lot of changes and would clutter the 2.0
>>      > codebase.
>>      >
>>      > Instead, with the help of Brian, we came up with a different idea,
>>      > which is creating two one-off cache loaders,
>>      > TransformingJDBCCacheLoader and TransformingFileCacheLoader. They
>>      > extend the existing cache loaders, but differ in that they
>>      > unmarshall data the 1.x way.
>>      >
>>      > This way, we have cache loaders that can read the 1.x way and write
>>      > the 2.x way. Now, a customer just needs to write a program that uses
>>      > a cache configured with either of these two cache loaders, and all
>>      > it has to do is loop through the tree reading all nodes and putting
>>      > them back, and voila! You have your data store format changed (I'll
>>      > be writing an example of this).
>>      >
>>      > It's a pretty clean solution to transforming data without making
>>      > changes to the main o.j.cache tree.
>>      >
>>      > But, there's always a but :), 1.4.x used
>>      > org.jboss.invocation.MarshalledValue, so there's no way of getting
>>      > around the need to have this class available. This is because
>>      > JDBCCacheLoader stored instances of MarshalledValue, so even the MV
>>      > class in AOP would not work because it's in a different package
>>      > (it'd result in a CCE).
>>      >
>>      > One thing Brian suggested is that these two cache loaders and
>>      > jboss-minimal are kept in a directory structure separate from the
>>      > main one, and when we distribute, we provide an extra jar containing
>>      > these that can be used to transform data and that's it. After that,
>>      > you get rid of it and go back to the standard libraries.
>>      >
>>      > It's pretty hard to find a neater way of dealing with this, but the
>>      > benefits are worth it: customers' data stays alive!
>>      >
>>      > Manik and the rest, thoughts?
>>      >
>>      > Galder Zamarreno wrote:
>>      >  > Manik Surtani wrote:
>>      >  >> On 5 Feb 2007, at 19:57, Galder Zamarreno wrote:
>>      >  >>
>>      >  >>> Quick (but a bit lengthy :( ) update on this:
>>      >  >>>
>>      >  >>> - I've created a new Marshaller called Legacy1xMarshaller
>>      >  >>> (anyone got a better name?) which extends
>>      >  >>> o.j.c.m.AbstractMarshaller and does the job of marshalling
>>      >  >>> stuff in the 1.x fashion. This is to be used by JDBCCacheLoader
>>      >  >>> and FileCacheLoader if configured to use 1.x marshalling. This
>>      >  >>> has the benefit that the code in these cache loaders only has
>>      >  >>> to do getMarshaller().whatever..., making it very simple to
>>      >  >>> switch from VAM to the legacy marshaller.
>>      >  >>
>>      >  >> I presume the VAM would transparently flip between marshallers,
>>      >  >> based on the version short at the head of the stream?
>>      >  >
>>      >  > The problem is that 1.x marshalling for cache loaders did not
>>      >  > have version numbers at the start; it was plain Java
>>      >  > serialization. Can you expect VAM to detect that? That's why I
>>      >  > thought of a Marshaller instance in AbstractCacheLoader that
>>      >  > would use either VAM or the legacy one. We could, however,
>>      >  > assume that if VAM does not find a version number, it tries to
>>      >  > use the legacy one.
>>      >  >
>>      >  > As you said later in the email, it seems like 1.4.x dealt with
>>      >  > a similar situation. I'll look at it.
>>      >  >
>>      >  >>
>>      >  >>>
>>      >  >>> - In order to do this, I need to add a new method to
>>      >  >>> o.j.c.m.Marshaller called objectToStream(OutputStream). The
>>      >  >>> reason for doing so is that FileCacheLoader then just needs to
>>      >  >>> call getMarshaller().objectToStream() when it's trying to store
>>      >  >>> data. This avoids having an if statement in storeAttributes()
>>      >  >>> checking which Marshaller is used and calling
>>      >  >>> objectToObjectStream with the corresponding ObjectOutputStream.
>>      >  >>
>>      >  >> Again, isn't this already in the VAM?
>>      >  >
>>      >  > Not for OutputStream. You have objectToObjectStream(Object obj,
>>      >  > ObjectOutputStream out) and objectFromStream(InputStream is),
>>      >  > but not objectToStream for OutputStreams such as
>>      >  > FileOutputStream.
>>      >  >
>>      >  >>
>>      >  >>>
>>      >  >>> - The decision about which Marshaller to use is made in
>>      >  >>> AbstractCacheLoader, which will store the Marshaller used by
>>      >  >>> the CacheLoader. getMarshaller() would decide, based on
>>      >  >>> configuration, which Marshaller to use: either the default
>>      >  >>> cache.getMarshaller(), which is VAM, or the legacy one, making
>>      >  >>> it quite clean to switch from one to another.
>>      >  >>
>>      >  >> Look at the VAM in the 1.4.x tree - it has "legacy support" to
>>      >  >> deal with JBC 1.2.x and 1.3.x for RPC calls (removed in 2.x
>>      >  >> since the legacy support was no longer needed). Could easily be
>>      >  >> re-introduced if needed to support legacy marshalling for CLs.
>>      >  >
>>      >  > Ok, i'll definitely have a look at that.
>>      >  >
>>      >  >>
>>      >  >>>
>>      >  >>> - Configuration-wise, I created
>>      >  >>> Legacy1xMarshallingCacheLoaderConfig (I couldn't come up with a
>>      >  >>> better name!) which extends IndividualCacheLoaderConfig.
>>      >  >>> JDBCCacheLoaderConfig and FileCacheLoaderConfig will extend
>>      >  >>> Legacy1xMarshallingCacheLoaderConfig instead.
>>      >  >>
>>      >  >> Could drop the 1x in the name, I suppose?  :-)
>>      >  >
>>      >  > No probs :)
>>      >  >
>>      >  >>
>>      >  >>>
>>      >  >>> - Inside Legacy1xMarshallingCacheLoaderConfig, I search for the
>>      >  >>> cache.loader.marshalling.1.x (name again!) boolean property in
>>      >  >>> the <properties> section. If true, it uses legacy marshalling;
>>      >  >>> if false, which is the default value, VAM.
>>      >  >>>
>>      >  >>> - I have extended CacheLoaderTestsBase to create
>>      >  >>> FileCacheLoaderLegacyMarshallingTest, which tests the
>>      >  >>> FileCacheLoader with legacy marshalling. I'll be doing the same
>>      >  >>> for JDBCCacheLoader.
>>      >  >>>
>>      >  >>> - Finally, and one of the most important aspects, the previous
>>      >  >>> marshalling relies on these classes:
>>      >  >>>
>>      >  >>> org.jboss.invocation.MarshalledValue;
>>      >  >>> org.jboss.invocation.MarshalledValueInputStream;
>>      >  >>>
>>      >  >>> These used to be located in jboss-minimal.jar in 1.x. There are
>>      >  >>> very similar classes in AOP, but not the same ones, so I'm going
>>      >  >>> to create a legacy directory in lib with this library. To avoid
>>      >  >>> a compile-time dependency, Legacy1xMarshaller will be
>>      >  >>> instantiated via reflection, so only people who actually use
>>      >  >>> this will need this library. The library has no conflicts with
>>      >  >>> existing 2.x libraries.
>>      >  >>
>>      >  >> Look at the jboss-common-core jar and particularly JBCOMMON-8
>>      >  >> in JIRA.
>>      >  >>
>>      >  >
>>      >  > So, did you test whether you could read data written by a
>>      >  > JDBCCacheLoader with MV classes using a JDBCCacheLoader without
>>      >  > MV classes? That's one of the tests I wanted to do to see
>>      >  > whether these classes were necessary.
>>      >  >
>>      >  > jboss-common-core.jar contains MarshalledValueOutputStream and
>>      >  > MarshalledValueInputStream, so that wouldn't be a problem for
>>      >  > FCL. JDBCCacheLoader, on the contrary, wrapped the node in a
>>      >  > MarshalledValue and then wrote it to an ObjectOutputStream. I'll
>>      >  > look at the commons code to see whether it's the same, which I
>>      >  > guess it might be.
>>      >  >
>>      >  > There's a MarshalledValue in the aop libraries, but a quick
>>      >  > glance at the code showed that it's slightly different.
>>      >  >
>>      >  >>>
>>      >  >>> The last problem is that these two classes access
>>      >  >>> org.jboss.logging.Logger, which used to be in jboss-common.jar.
>>      >  >>> Now this jar certainly clashes with jboss-common-core.jar in
>>      >  >>> 2.x, so what I've done is get jboss-logging-spi.jar 2.0.2.GA
>>      >  >>> and put it in the legacy directory.
>>      >  >>>
>>      >  >>> So, we end up having two legacy libraries in lib/legacy, but
>>      >  >>> they're only needed at runtime if using 1.x marshalling. I guess
>>      >  >>> it's the price to pay to make customers' lives a bit easier.
>>      >  >>>
>>      >  >>
>>      >  >> Trying to avoid a legacy jar dir ... like I said, see if the MV
>>      >  >> and MVIS can be in jboss-common-core (without JBoss Logging deps!)
>>      >  >
>>      >  > Yeah defo, we wanna avoid any legacy jars.
>>      >  >
>>      >  >>
>>      >  >>> The other alternative would be for the 1.x marshaller not to use
>>      >  >>> these org.jboss.invocation.* classes and just write to object
>>      >  >>> streams, but I think these classes have an impact on the format
>>      >  >>> of the marshalled data. Brian, do you know a bit more about the
>>      >  >>> role of these classes?
>>      >  >>>
>>      >  >>> A bit more complicated than initially expected, but I can't see
>>      >  >>> any easier way of providing backwards compatibility. Hopefully
>>      >  >>> we should be able to phase it out asap, 3.x? :)
>>      >  >>>
>>      >  >>> What this has shown as well is how different CacheLoaders
>>      >  >>> marshalled things in slightly different ways, which makes
>>      >  >>> having a common framework for this even more necessary, i.e.
>>      >  >>> VAM. :D
>>      >  >>>
>>      >  >>> Hope you're not snoring by now ;)
>>      >  >>>
>>      >  >>> If you have better ideas for the naming I used, speak up :)
>>      >  >>>
>>      >  >>> Galder Zamarreño
>>      >  >>> Sr. Software Maintenance Engineer
>>      >  >>> JBoss, a division of Red Hat
>>      >  >>>
>>      >  >>> -----Original Message-----
>>      >  >>> From: jbosscache-dev-bounces at lists.jboss.org
>>      >  >>> [mailto:jbosscache-dev-bounces at lists.jboss.org] On Behalf Of
>>      >  >>> Galder Zamarreno
>>      >  >>> Sent: 31 January 2007 01:01
>>      >  >>> To: Manik Surtani
>>      >  >>> Cc: jbosscache-dev at lists.jboss.org
>>      >  >>> Subject: RE: [jbosscache-dev] migrating data stored in 1.x format
>>      >  >>> to VAM format
>>      >  >>>
>>      >  >>> +1, VAM should be the default.
>>      >  >>>
>>      >  >>> Only people who are reluctant to change their existing stores to
>>      >  >>> VAM should use the 1.x option, which would need to be defined
>>      >  >>> explicitly.
>>      >  >>>
>>      >  >>> Galder Zamarreño
>>      >  >>> Sr. Software Maintenance Engineer
>>      >  >>> JBoss, a division of Red Hat
>>      >  >>>
>>      >  >>>
>>      >  >>> -----Original Message-----
>>      >  >>> From: Manik Surtani [mailto:manik at jboss.org]
>>      >  >>> Sent: 30 January 2007 22:55
>>      >  >>> To: Galder Zamarreno
>>      >  >>> Cc: jbosscache-dev at lists.jboss.org
>>      >  >>> Subject: Re: [jbosscache-dev] migrating data stored in 1.x format
>>      >  >>> to VAM format
>>      >  >>>
>>      >  >>> I see what you mean, although I would like the default to be to
>>      >  >>> use the VAM.
>>      >  >>>
>>      >  >>> --
>>      >  >>> Manik Surtani
>>      >  >>>
>>      >  >>> Lead, JBoss Cache
>>      >  >>> JBoss, a division of Red Hat
>>      >  >>>
>>      >  >>> Email: manik at jboss.org
>>      >  >>> Telephone: +44 7786 702 706
>>      >  >>> MSN: manik at surtani.org
>>      >  >>> Yahoo/AIM/Skype: maniksurtani
>>      >  >>>
>>      >  >>>
>>      >  >>>
>>      >  >>> On 30 Jan 2007, at 20:45, Galder Zamarreno wrote:
>>      >  >>>
>>      >  >>>> Actually, the more I think about this, the less I like the idea
>>      >  >>>> of switching the marshalling from 1.x to 2.x at the CacheLoader
>>      >  >>>> level, or at least forcing them to do so.
>>      >  >>>>
>>      >  >>>> Customers that want to use JBossCache 2.x might be reluctant to
>>      >  >>>> migrate their data from one format to the other. I can see how
>>      >  >>>> an existing customer might think this is a proper pain in the
>>      >  >>>> ass, independent of the benefits, and it might reduce adoption
>>      >  >>>> among them.
>>      >  >>>>
>>      >  >>>> We want to remove barriers to upgrading, but at the same time we
>>      >  >>>> want new customers to use the new marshalling, so I'd actually
>>      >  >>>> implement the possibility to use 1.x marshalling, which is
>>      >  >>>> plain Java serialization, at the CacheLoader level. This could
>>      >  >>>> easily be achieved by adding a property to the <properties>
>>      >  >>>> section.
>>      >  >>>>
>>      >  >>>> Just note that this does not apply to the marshalling done at
>>      >  >>>> the replication level, as there's no hard data that needs
>>      >  >>>> migrating.
>>      >  >>>>
>>      >  >>>> Galder Zamarreño
>>      >  >>>> Sr. Software Maintenance Engineer
>>      >  >>>> JBoss, a division of Red Hat
>>      >  >>>>
>>      >  >>>> -----Original Message-----
>>      >  >>>> From: jbosscache-dev-bounces at lists.jboss.org
>>      >  >>>> [mailto:jbosscache-dev-bounces at lists.jboss.org] On Behalf Of
>>      >  >>>> Galder Zamarreno
>>      >  >>>> Sent: 25 January 2007 13:07
>>      >  >>>> To: jbosscache-dev at lists.jboss.org
>>      >  >>>> Subject: [jbosscache-dev] migrating data stored in 1.x format to
>>      >  >>>> VAM format
>>      >  >>>>
>>      >  >>>> Hi all,
>>      >  >>>>
>>      >  >>>> I'm deferring http://jira.jboss.com/jira/browse/JBCACHE-879 to
>>      >  >>>> BETA2 because I still need to write this:
>>      >  >>>> http://jira.jboss.com/jira/browse/JBCACHE-882
>>      >  >>>>
>>      >  >>>> The reason I'm deferring it is that I can't see a
>>      >  >>>> straightforward way of doing such a thing right now. Ideally,
>>      >  >>>> you should be able to run a 1.x version (cache1) and a 2.x
>>      >  >>>> version (cache2) of JBC in the same VM so that you can do a
>>      >  >>>> loop of cache1.get() and call cache2.put(). However, I have
>>      >  >>>> doubts that this approach will be free of class loading
>>      >  >>>> issues. What do you think?
>>      >  >>>>
>>      >  >>>> I was wondering whether Region based could help here, but I
>>      >  >>>> can't see right now how this could be done.
>>      >  >>>>
>>      >  >>>> Something I had in mind is having the capability to start a
>>      >  >>>> cache with either 1.x marshalling or VAM marshalling, but
>>      >  >>>> oriented at being used only at the cache loader level. It
>>      >  >>>> wouldn't make much sense for replication because there's no
>>      >  >>>> hard data there.
>>      >  >>>>
>>      >  >>>>
>>      >  >>>> I thought that you could start two instances of cache 2.x, the
>>      >  >>>> first with 1.x marshalling and the other one with VAM, both
>>      >  >>>> pointing to different JDBCCacheLoader stores. You could then
>>      >  >>>> get from the first using normal marshalling and put in the
>>      >  >>>> second one, which has VAM marshalling active. What do you think?
>>      >  >>>>
>>      >  >>>> If you like the approach, I should have it ready by BETA2.
>>      >  >>>>
>>      >  >>>> This last approach looks simpler to me, what do you think?
>>      >  >>>>
>>      >  >>>> Galder Zamarreño
>>      >  >>>> Sr. Software Maintenance Engineer
>>      >  >>>> JBoss, a division of Red Hat
>>      >  >>>>
>>      >  >>>>
>>      >  >>>>
>>      >  >>>> _______________________________________________
>>      >  >>>> jbosscache-dev mailing list
>>      >  >>>> jbosscache-dev at lists.jboss.org
>>      >  >>>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>      >  >>>
>>      >  >>>
>>      >  >>>
>>      >  >>> _______________________________________________
>>      >  >>> jbosscache-dev mailing list
>>      >  >>> jbosscache-dev at lists.jboss.org
>>      >  >>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>      >  >>
>>      >  >
>>      >  >
>>      >
>>
>>     --
>>     Galder Zamarreño
>>     Sr. Software Maintenance Engineer
>>     JBoss, a division of Red Hat
>>
>>     _______________________________________________
>>     jbosscache-dev mailing list
>>     jbosscache-dev at lists.jboss.org
>>     https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>
>>
> 
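
On the question quoted above of whether a reader could tell 1.x data
apart without a version number: data written through a plain
java.io.ObjectOutputStream starts with the stream magic 0xACED
(ObjectStreamConstants.STREAM_MAGIC), so one possible check is to peek
at the first two bytes. A minimal sketch, assuming the newer format
starts with something else (the raw version short 20 below is made up,
and whether real VAM output differs in its first two bytes would need
checking). FormatSniffSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper: guesses whether stored bytes are plain Java serialization.
class FormatSniffSketch {
    // True if the data starts with the standard Java serialization stream magic.
    static boolean looksLikePlainJavaSerialization(byte[] stored) {
        if (stored == null || stored.length < 2) {
            return false;
        }
        // Combine the first two bytes into a big-endian short.
        short head = (short) (((stored[0] & 0xFF) << 8) | (stored[1] & 0xFF));
        return head == ObjectStreamConstants.STREAM_MAGIC;
    }

    // Produces sample "legacy" bytes with a plain ObjectOutputStream.
    static byte[] javaSerialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] legacy = javaSerialize("some attribute");
        byte[] versioned = {0, 20, 1, 2, 3}; // made-up data starting with version short 20
        System.out.println(looksLikePlainJavaSerialization(legacy));    // true
        System.out.println(looksLikePlainJavaSerialization(versioned)); // false
    }
}
```

The check is only a heuristic: a version short that happened to equal
0xACED, or a newer format that itself begins with an object stream
header, would defeat it.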

-- 
Galder Zamarreño
Sr. Software Maintenance Engineer
JBoss, a division of Red Hat



