I've got this working, including some basic unit tests and manual
examples to transform entire cache stores from 1.x data to 2.x. These
last manual examples include source code, 1.x cache stores (file and
jdbc derby db) and sample cache configurations.
Apart from the MV issue referred to earlier, I have realised that
TransformingJDBCCacheLoader will have to extend JDBCCacheLoaderOld
instead of JDBCCacheLoader.
The reason is that when JDBCCacheLoader starts, if the root does not
exist, it'll create it, which in the TransformingJDBCCacheLoader will
mean creating it in 2.x format.
This wouldn't be a problem if the root node didn't need querying again,
but if customers want to migrate their data, they will start looping
from the root node (cache.getRoot()) and the first thing they'll do is
get its children. This results in trying to load the root node from the
cache store, which breaks, as we're reading from the db in 1.x format.
Remember that the TransformingJDBCCacheLoader reads in 1.x format and
stores in 2.x format.
This has a very easy resolution, which is extending JDBCCacheLoaderOld.
After that, it works a treat :).
Manik, assuming you're happy with the original idea, would extending
JDBCCacheLoaderOld for this one off cache loader be ok with you?
Galder Zamarreno wrote:
I haven't touched this issue for a couple of weeks and over the last
couple of days I had the chance to get back into it.
After discussions with Brian we came up with a different approach for this.
My previous approach (don't need to read all below) relied on
introducing legacy code into the main source code that would be able to
read 1.x serialization. As I started doing it, I realised that it would
need a lot of changes and it would clutter the 2.0 codebase.
Instead, with the help of Brian, we came up with a different idea, which
is creating two one-off cache loaders, TransformingJDBCCacheLoader and
TransformingFileCacheLoader. They extend the existing cache loaders, but
they differ in that they unmarshall data the 1.x way.
This way, we have cache loaders that can read the 1.x way and write the
2.x way. Now, a customer just needs to write a program that uses a cache
configured with either of the two cache loaders above, and all it has to
do is loop through the tree reading all nodes and putting them back, and
voila! You have your data store format changed (I'll be writing an
example of this).
It's a pretty clean solution for transforming data without making changes
to the main o.j.cache tree.
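
As a rough illustration of the "read every node and put it back" loop described above, here is a minimal sketch. It does not use the real JBoss Cache API: NodeStore, InMemoryStore and StoreMigrator are hypothetical stand-ins, and the transforming cache loader is only imagined as sitting behind get() (reading 1.x) and put() (writing 2.x).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a cache backed by a transforming cache loader. */
interface NodeStore {
    Map<String, Object> get(String fqn);             // imagined: loads in 1.x format
    List<String> getChildrenNames(String fqn);
    void put(String fqn, Map<String, Object> data);  // imagined: stores in 2.x format
}

/** Toy in-memory store so the sketch runs without JBoss Cache on the classpath. */
class InMemoryStore implements NodeStore {
    final Map<String, Map<String, Object>> nodes = new TreeMap<>();
    final List<String> rewritten = new ArrayList<>();

    public Map<String, Object> get(String fqn) {
        Map<String, Object> data = nodes.get(fqn);
        return data == null ? new HashMap<>() : new HashMap<>(data);
    }

    public List<String> getChildrenNames(String fqn) {
        String prefix = fqn.equals("/") ? "/" : fqn + "/";
        List<String> children = new ArrayList<>();
        for (String key : nodes.keySet()) {
            // Keep direct children only: no further '/' after the prefix.
            if (key.startsWith(prefix) && key.length() > prefix.length()
                    && key.indexOf('/', prefix.length()) < 0) {
                children.add(key.substring(prefix.length()));
            }
        }
        return children;
    }

    public void put(String fqn, Map<String, Object> data) {
        nodes.put(fqn, new HashMap<>(data));
        rewritten.add(fqn);
    }
}

class StoreMigrator {
    /** Walks the tree from the root, reading each node and putting it straight back. */
    static int migrate(NodeStore store) {
        Deque<String> pending = new ArrayDeque<>();
        pending.push("/");
        int count = 0;
        while (!pending.isEmpty()) {
            String fqn = pending.pop();
            store.put(fqn, store.get(fqn)); // read (old format), write (new format)
            count++;
            for (String child : store.getChildrenNames(fqn)) {
                pending.push(fqn.equals("/") ? "/" + child : fqn + "/" + child);
            }
        }
        return count;
    }
}
```

The traversal uses an explicit stack rather than recursion so that deep trees do not exhaust the call stack; with a real cache the same loop would go through the cache's own node API instead of string paths.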
But, there's always a but :). 1.4.x used
org.jboss.invocation.MarshalledValue, so there's no way of getting around
the need for this class. This is because JDBCCacheLoader stored
instances of MarshalledValue, so even the MV class in AOP would not work
because it's in a different package (it'd result in a
ClassCastException).
One thing Brian suggested is that these two cache loaders and
jboss-minimal are kept in a separate dir structure to the main one and
when we distribute, we provide an extra jar containing these that can be
used to transform data and that's it. After that, you get rid of it, you
go back to the standard libraries.
It's pretty hard to find a neater way of dealing with this, but the
benefits are worth it: customers' data stays alive!
Manik and the rest, thoughts?
Galder Zamarreno wrote:
> Manik Surtani wrote:
>> On 5 Feb 2007, at 19:57, Galder Zamarreno wrote:
>>
>>> Quick (but a bit lengthy :( ) update on this:
>>>
>>> - I've created a new Marshaller called Legacy1xMarshaller (anyone's
>>> got a better name?) which extends o.j.c.m.AbstractMarshaller that
>>> would do the job of marshalling stuff in the 1.x fashion. This is to
>>> be used by JDBCCacheLoader and FileCacheLoader if configured to use
>>> 1.x marshalling. This has the benefit that the code in these cache
>>> loaders only has to do getMarshaller().whatever... , making it very
>>> simple to switch from VAM to Legacy Marshaller.
>>
>> I presume the VAM would transparently flip between marshallers, based
>> on the version short at the head of the stream?
>
> The problem is that 1.x marshalling for cache loaders did not have
> version numbers at the start, it was plain java serialization. Can you
> expect VAM to detect that? That's why I thought of a Marshaller instance
> in AbstractCacheLoader that would either use VAM or the Legacy one. We
> could however assume that if VAM does not find version number, it tries
> to use Legacy one.
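
One way to picture the "no version number means legacy" fallback is the sketch below. It relies on one fact about plain Java serialization: an ObjectOutputStream always begins with the magic number 0xACED (ObjectStreamConstants.STREAM_MAGIC). Everything else is an assumption made for illustration: the FormatSniffer class, the version value 20, and the framing where a raw version short precedes the payload. It is not a description of how VAM actually frames its data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;

class FormatSniffer {
    // Hypothetical version id; must simply differ from STREAM_MAGIC.
    static final short HYPOTHETICAL_VERSION = 20;

    /** Legacy style: plain Java serialization, no version header. */
    static byte[] writeLegacy(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Assumed new style: a raw version short, then the serialized payload. */
    static byte[] writeVersioned(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream header = new DataOutputStream(bytes);
        header.writeShort(HYPOTHETICAL_VERSION);
        header.flush();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** True when the data starts with the Java serialization magic number. */
    static boolean isLegacy(byte[] data) {
        if (data.length < 2) {
            return false;
        }
        short head = (short) (((data[0] & 0xFF) << 8) | (data[1] & 0xFF));
        return head == ObjectStreamConstants.STREAM_MAGIC;
    }

    /** Reads either format, skipping the version short when one is present. */
    static Object read(byte[] data) throws IOException, ClassNotFoundException {
        InputStream in = new ByteArrayInputStream(data);
        if (!isLegacy(data)) {
            new DataInputStream(in).readShort(); // consume the version header
        }
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            return ois.readObject();
        }
    }
}
```

This kind of sniffing only works if the versioned format can never start with 0xACED, which is why the explicit configuration switch discussed in the thread may still be the safer choice.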
>
> As you said later in the email, it seems like 1.4.x dealt with a
> similar situation. I'll look at it.
>
>>
>>>
>>> - In order to do this, I need to add a new method to
>>> o.j.c.m.Marshaller called objectToStream(OutputStream). The reason
>>> for doing this is so that FileCacheLoader just needs to call
>>> getMarshaller().objectToStream() when it's trying to store data. This
>>> will avoid having an if statement in storeAttributes() checking which
>>> Marshaller is used, and calling objectToObjectStream with the
>>> corresponding ObjectOutputStream.
>>
>> Again, isn't this already in the VAM?
>
> Not for OutputStream. You have objectToObjectStream(Object obj,
> ObjectOutputStream out) and objectFromStream(InputStream is), but not
> objectToStream for OutputStreams such as FileOutputStream.
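
A minimal sketch of what such an objectToStream method could look like follows. Only the three method names come from the discussion above; the SketchMarshaller interface, the PlainSerializationMarshaller implementation and the two-argument objectToStream(Object, OutputStream) signature are assumptions for illustration, not the actual o.j.c.m.Marshaller contract.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.Map;

/** Hypothetical marshaller interface; only the method names come from the thread. */
interface SketchMarshaller {
    void objectToObjectStream(Object obj, ObjectOutputStream out) throws IOException;

    void objectToStream(Object obj, OutputStream out) throws IOException;

    Object objectFromStream(InputStream in) throws IOException, ClassNotFoundException;
}

/** Assumed legacy-style implementation: plain Java serialization. */
class PlainSerializationMarshaller implements SketchMarshaller {
    public void objectToObjectStream(Object obj, ObjectOutputStream out) throws IOException {
        out.writeObject(obj);
    }

    public void objectToStream(Object obj, OutputStream out) throws IOException {
        // The marshaller picks the stream wrapper, so callers need no if statement.
        ObjectOutputStream oos = new ObjectOutputStream(out);
        objectToObjectStream(obj, oos);
        oos.flush(); // flush only; the caller owns and closes the underlying stream
    }

    public Object objectFromStream(InputStream in) throws IOException, ClassNotFoundException {
        return new ObjectInputStream(in).readObject();
    }
}

class StoreAttributesSketch {
    /** What a file-based loader's store step might reduce to. */
    static void storeAttributes(SketchMarshaller marshaller, Map<String, Object> attrs,
                                OutputStream target) throws IOException {
        marshaller.objectToStream(attrs, target);
    }
}
```

With this shape, the code that stores attributes stays identical whichever marshaller is configured; in a file-based loader the target would be a FileOutputStream.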
>
>>
>>>
>>> - The decision about which Marshaller to use is to be made in
>>> AbstractCacheLoader, which will store the Marshaller used by the
>>> CacheLoader. getMarshaller() would decide, based on configuration,
>>> which Marshaller to use, whether the default cache.getMarshaller(),
>>> which is VAM, or the legacy one, making it quite clean to switch from
>>> one to another.
>>
>> Look at the VAM in the 1.4.x tree - it has "legacy support" to
>> deal with JBC 1.2.x and 1.3.x for RPC calls (removed in 2.x since
>> the legacy support was no longer needed). Could easily be
>> re-introduced if needed to support legacy marshalling for CLs.
>
> Ok, I'll definitely have a look at that.
>
>>
>>>
>>> - Configuration wise, I created Legacy1xMarshallingCacheLoaderConfig
>>> (I couldn't come up with a better name!) which extends
>>> IndividualCacheLoaderConfig. JDBCCacheLoaderConfig and
>>> FileCacheLoaderConfig will extend
>>> Legacy1xMarshallingCacheLoaderConfig instead.
>>
>> Could drop the 1x in the name, I suppose? :-)
>
> No probs :)
>
>>
>>>
>>> - Inside Legacy1xMarshallingCacheLoaderConfig, I search for
>>> cache.loader.marshalling.1.x (name again!) boolean property in the
>>> <properties> section. If true, it uses legacy marshalling, and if
>>> false, which is default value, VAM.
>>>
>>> - I have extended CacheLoaderTestsBase to create
>>> FileCacheLoaderLegacyMarshallingTest which tests the FileCacheLoader
>>> with legacy marshalling. I'll be doing the same for JDBCCacheLoader.
>>>
>>> - Finally, and one of the most important aspects: the previous
>>> marshalling relies on these classes:
>>>
>>> org.jboss.invocation.MarshalledValue;
>>> org.jboss.invocation.MarshalledValueInputStream;
>>>
>>> Which used to be located in jboss-minimal.jar in 1.x. There are very
>>> similar classes in AOP but not the same, so I'm gonna be creating a
>>> legacy directory in lib with this library. To avoid compile time
>>> dependency, Legacy1xMarshaller will be instantiated via reflection,
>>> so only people who actually use this will need this library. The
>>> library has no conflicts with existing 2.x libraries.
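
The configuration switch and the reflective instantiation described in the points above could be combined roughly as follows. The property name cache.loader.marshalling.1.x and its default of false are taken from the text; the MarshallerSelector class, its method names and the use of Object for the marshaller type are hypothetical, chosen so the sketch compiles without any JBoss Cache classes.

```java
import java.util.Properties;

class MarshallerSelector {
    // Property name as proposed in the thread; it defaults to false (use VAM).
    static final String LEGACY_PROPERTY = "cache.loader.marshalling.1.x";

    /** True only when the property is explicitly set to "true". */
    static boolean useLegacyMarshalling(Properties props) {
        return Boolean.parseBoolean(props.getProperty(LEGACY_PROPERTY, "false"));
    }

    /**
     * Returns the default marshaller unless legacy marshalling is requested,
     * in which case the legacy class is loaded by name. Loading by name
     * avoids a compile-time dependency on the legacy classes, so they are
     * only needed at runtime by people who switch the option on.
     */
    static Object selectMarshaller(Properties props, Object defaultMarshaller,
                                   String legacyClassName)
            throws ReflectiveOperationException {
        if (!useLegacyMarshalling(props)) {
            return defaultMarshaller;
        }
        return Class.forName(legacyClassName).getDeclaredConstructor().newInstance();
    }
}
```

The legacy class needs a no-argument constructor for this to work; if the legacy jar is missing, Class.forName fails with a ClassNotFoundException only when the option is actually enabled.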
>>
>> Look at the jboss-common-core jar and particularly JBCOMMON-8 in JIRA.
>>
>
> So, did you test whether you could read data written with
> JDBCCacheLoader with MV classes with a JDBCCacheLoader not using MV
> classes? That's one of the tests I wanted to do to see whether these
> classes were necessary.
>
> jboss-common-core.jar contains MarshalledValueOutputStream and
> MarshalledValueInputStream so that wouldn't be a problem for FCL.
> JDBCCacheLoader, on the contrary, wrapped the node in MarshalledValue and
> then wrote it to an ObjectOutputStream. I'll look at the commons code to
> see whether it's the same, which I guess it might be.
>
> There's a MarshalledValue in the aop libraries, but a quick glance at the
> code showed that it's slightly different.
>
>>>
>>> The last problem is that these two classes access
>>> org.jboss.logging.Logger that used to be in jboss-common.jar. Now
>>> this jar certainly clashes with jboss-common-core.jar in 2.x, so
>>> what I've done is get jboss-logging-spi.jar 2.0.2.GA and put it in
>>> the legacy directory.
>>>
>>> So, we end up having two legacy libraries in lib/legacy but they're
>>> only needed at runtime if using 1.x marshalling. I guess it's the
>>> price to pay to make customers' lives a bit easier.
>>>
>>
>> Trying to avoid a legacy jar dir ... like I said, see if the MV and
>> MVIS can be in jboss-common-core (without JBoss Logging deps!)
>
> Yeah defo, we wanna avoid any legacy jars.
>
>>
>>> The other alternative would be for the 1.x marshaller not to use these
>>> org.jboss.invocation.* classes and just write to Object streams, but I
>>> think these classes have an impact on the format of the marshalled
>>> data. Brian, do you know a bit more about the role of these classes?
>>>
>>> A bit more complicated than initially expected but I can't see any
>>> easier way of providing backwards compatibility. Hopefully we should
>>> be able to phase it out asap, 3.x? :)
>>>
>>> What this has shown as well is how different CacheLoaders marshalled
>>> things in a slightly different way which makes having a common
>>> framework for this even more necessary, i.e. VAM. :D
>>>
>>> Hope you're not snoring by now ;)
>>>
>>> If you have better ideas for the naming I used, speak up :)
>>>
>>> Galder Zamarreño
>>> Sr. Software Maintenance Engineer
>>> JBoss, a division of Red Hat
>>>
>>> -----Original Message-----
>>> From: jbosscache-dev-bounces(a)lists.jboss.org
>>> [mailto:jbosscache-dev-bounces@lists.jboss.org] On Behalf Of Galder
>>> Zamarreno
>>> Sent: 31 January 2007 01:01
>>> To: Manik Surtani
>>> Cc: jbosscache-dev(a)lists.jboss.org
>>> Subject: RE: [jbosscache-dev] migrating data stored in 1.x format to
>>> VAM format
>>>
>>> +1, VAM should be the default.
>>>
>>> Only people who are reluctant to change their existing stores to VAM
>>> should use the 1.x option, which would need explicit definition.
>>>
>>> Galder Zamarreño
>>> Sr. Software Maintenance Engineer
>>> JBoss, a division of Red Hat
>>>
>>>
>>> -----Original Message-----
>>> From: Manik Surtani [mailto:manik@jboss.org]
>>> Sent: 30 January 2007 22:55
>>> To: Galder Zamarreno
>>> Cc: jbosscache-dev(a)lists.jboss.org
>>> Subject: Re: [jbosscache-dev] migrating data stored in 1.x format to
>>> VAM format
>>>
>>> I see what you mean, although I would like the default to be to use
>>> the VAM.
>>>
>>> --
>>> Manik Surtani
>>>
>>> Lead, JBoss Cache
>>> JBoss, a division of Red Hat
>>>
>>> Email: manik(a)jboss.org
>>> Telephone: +44 7786 702 706
>>> MSN: manik(a)surtani.org
>>> Yahoo/AIM/Skype: maniksurtani
>>>
>>>
>>>
>>> On 30 Jan 2007, at 20:45, Galder Zamarreno wrote:
>>>
>>>> Actually, the more I think about this, the less I like the idea of
>>>> switching the marshalling from 1.x to 2.x at the CacheLoaders
>>>> level, or at least forcing them to do so.
>>>>
>>>> Customers that want to use JBossCache 2.x might be reluctant to
>>>> migrate their data from one format to the other. I can see how an
>>>> existing customer might think this is a proper pain in the ass,
>>>> independent of the benefits, and might reduce adoption among them.
>>>>
>>>> We want to remove barriers to upgrading, but at the same time, we want
>>>> new customers to use the new marshalling, so I'd actually implement the
>>>> possibility to use 1.x marshalling, which is plain Java serialization,
>>>> at the CacheLoader level. This could easily be achieved by adding a
>>>> property to the <properties> section.
>>>>
>>>> Just note that this does not apply to the marshalling done at
>>>> replication level as there's no hard data that needs migrating.
>>>>
>>>> Galder Zamarreño
>>>> Sr. Software Maintenance Engineer
>>>> JBoss, a division of Red Hat
>>>>
>>>> -----Original Message-----
>>>> From: jbosscache-dev-bounces(a)lists.jboss.org [mailto:jbosscache-dev-
>>>> bounces(a)lists.jboss.org] On Behalf Of Galder Zamarreno
>>>> Sent: 25 January 2007 13:07
>>>> To: jbosscache-dev(a)lists.jboss.org
>>>> Subject: [jbosscache-dev] migrating data stored in 1.x format to
>>>> VAM format
>>>>
>>>> Hi all,
>>>>
>>>> I'm deferring http://jira.jboss.com/jira/browse/JBCACHE-879 to
>>>> BETA2 because I still need to write this:
>>>> http://jira.jboss.com/jira/browse/JBCACHE-882
>>>>
>>>> The reason I'm deferring it is that I can't see a
>>>> straightforward way of doing such a thing right now. Ideally, you
>>>> should be able to run a 1.x version (cache1) and a 2.x version
>>>> (cache2) of JBC in the same VM so that you can do a loop of
>>>> cache1.get() and call cache2.put(). However, I have doubts that
>>>> this approach will be free of class loading issues. What do
>>>> you think?
>>>>
>>>> I was wondering whether Region based could help here, but I can't
>>>> see right now how this could be done.
>>>>
>>>> Something I had in mind is having the capability to start a
>>>> cache with either 1.x marshalling or VAM marshalling, but oriented
>>>> at being used only at the cache loader level. It wouldn't make much
>>>> sense for replication because there's no hard data there.
>>>>
>>>>
>>>> I thought that you could start two instances of cache 2.x, the first
>>>> with 1.x marshalling and the other one with VAM, both pointing to
>>>> different JDBCCacheLoader stores. You could then get from the first
>>>> using normal (1.x) marshalling and put in the second one, which has
>>>> VAM marshalling active. What do you think?
>>>>
>>>> If you like the approach, I should have it ready by BETA2.
>>>>
>>>> This last approach looks simpler to me, what do you think?
>>>>
>>>> Galder Zamarreño
>>>> Sr. Software Maintenance Engineer
>>>> JBoss, a division of Red Hat
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> jbosscache-dev mailing list
>>>> jbosscache-dev(a)lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>>
>>
>
>
--
Galder Zamarreño
Sr. Software Maintenance Engineer
JBoss, a division of Red Hat