[jboss-as7-dev] Modularity is the spawn of Lucifer and a stinking donkey
Scott Marlow
smarlow at redhat.com
Thu May 10 11:24:55 EDT 2012
On 05/09/2012 02:01 PM, David M. Lloyd wrote:
> OK I admit I LOL'ed.
>
> On 05/09/2012 11:50 AM, Emmanuel Bernard wrote:
>> Now that I have your attention, I'd like to discuss issues we are experiencing when trying to modularize the Hibernate portfolio and make it work in AS 7.1.
>>
>> ## Disclaimer
>>
>> I perfectly understand all the coolness about modularity (speed, easier dependency isolation, etc.). I have also carefully read:
>>
>> - https://community.jboss.org/wiki/ModuleCompatibleClassloadingGuide
>> - https://community.jboss.org/wiki/ModularSerialization
>>
>> But these tend to avoid the more complex cases of portable libraries that ought to run even outside AS 7 but have a wide variety of class and resource loading needs.
>> I am not a complete modularity bozo but I am definitely not familiar with JBoss Modules or similar solutions.
>>
>> ## Requirements / Landscape
>>
>> Hibernate ORM uses the notion of a service registry and integrator objects that help third-party frameworks integrate with or customize the engine's behavior.
>> Integrators are enlisted via the service locator pattern (a service file in META-INF/services/ that is looked up and contains the implementation class(es) at stake).
>>
>> Hibernate Envers is one of those customizers that depend on Hibernate ORM. Note that the core of Hibernate ORM does not depend on Hibernate Envers. The service locator file is contained in the Hibernate Envers JAR.
>> Hibernate OGM likewise heavily customizes ORM and depends on Hibernate ORM classes; the reverse is not true. The service locator file is contained in the Hibernate OGM JAR.
>> Hibernate Search optionally depends on Hibernate ORM and JPA. The core of Hibernate Search is independent, but a Hibernate Search ORM module has an integrator implementation. On top of that, Hibernate Search optionally depends on some JPA classes and behaves differently if they are present: we look them up on the classpath by reflection.
>>
>> On top of that, these projects do load resources (config files, classes):
>>
>> - from what Jason calls a Deployment classloader (the user application classes and resources really) - entities, custom analyzer implementations, resource files, etc. A user could even write a custom Integrator and use the service locator pattern from his application.
>> - from direct dependencies (Lucene is a declared dependency of Hibernate Search)
>> - from dependencies of the deployment: for example an app developer adds the phonetic analyzer as a dependency of his application and asks Hibernate Search to use it
>> - from modules that use these projects. Modeshape and Capedwarf are being modularized and are making use of Hibernate Search as a module. Properly loading the necessary classes located in Modeshape or Capedwarf's module but from Hibernate Search's engine proves to be very hard in our current approach.
>>
>> All of these projects should be able to run outside JBoss AS 7, so a modular-friendly solution should somehow be abstracted and generic enough.
>>
>> ## What solution?
>>
>> More and more projects are being modularized, including ones with complex resource loading dependencies like the ones I have described. AFAIK Infinispan is in an even worse situation, as clustering and late class binding are at stake, but let's put this one aside.
>> I'd love to get a reference design outcome from this thread that could be copied across all these projects and future ones like Bean Validation.
>>
>> Today, we mostly use the good old and simple TCCL model, which works fine if the jars are directly embedded in the app but fails the minute we start to move these dependencies into modules. Sanne, Strong, Scott Marlow and I are using a dangerous amount of Advil to try and make everything work as expected. Some help would be awesome.
>>
>> To sum up:
>>
>> - can the Hibernate portfolio be supported within JBoss Modules and how?
>> - what kind of ClassloaderService contract should we use within these projects to be modular-friendly (JBoss Modules and others)?
>> - would such a contract be generic enough to be extrapolated to JSRs in need of modular friendliness?
>> - how to solve the chicken and egg issue of the bootstrapping: if we need to pass a ClassloaderService impl, how do we do that best in a modular environment without forcing the application developer to implement such a godforsaken ClassloaderService contract or, even worse, pass directly to us the right classloader for each call?
>
> I'll just start at the beginning and you can skip over the background if
> you like.
>
> The key starting concept is that a class' (or package's) identity is not
> just its name but also its class loader. This is the underlying
> (existing) truth that modularity brings to the fore. Corollaries to this
> are the fact that a single VM may have more than one class or package
> with the same name, as well as the fact that not all classes/packages
> are always directly visible from a given class loader.
>
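To make the identity point above concrete, here is a minimal sketch in Java. It defines the same class bytes through a second class loader and shows that the two resulting classes share a name but are distinct. The names `ClassIdentityDemo`, `Payload` and `IsolatingLoader` are made up for illustration, and the sketch assumes the compiled class file is reachable as a resource from the parent loader.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassIdentityDemo {
    /** Payload class that ends up defined twice, once per loader. */
    public static class Payload {}

    /** Hypothetical loader that defines Payload itself instead of delegating. */
    static final class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) { super(parent); }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            // Everything except Payload follows normal parent delegation.
            if (!name.equals(Payload.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Assumption: the .class file is visible to the parent loader.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) throw new ClassNotFoundException(name);
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    /** Loads a second, independent definition of Payload. */
    public static Class<?> loadIsolated() throws ClassNotFoundException {
        ClassLoader parent = ClassIdentityDemo.class.getClassLoader();
        return new IsolatingLoader(parent).loadClass(Payload.class.getName());
    }

    public static void main(String[] args) throws Exception {
        Class<?> isolated = loadIsolated();
        System.out.println("same name:  "
                + isolated.getName().equals(Payload.class.getName()));
        System.out.println("same class: " + (isolated == Payload.class));
    }
}
```

An instance of the isolated `Payload` would fail an `instanceof` check against the original `Payload`, which is the kind of surprise that shows up once libraries move into separate modules.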
> This problem (as you've seen) manifests itself primarily when you're
> locating a class or a resource by name. You basically have two options.
> You can search *relative* to a class loader (most commonly, TCCL,
> though using a single explicit class loader or the caller's class loader
> also falls into this category). Or, you can use the *absolute* identity
> of a class.
>
> Using relative resolution is often a perfectly adequate solution when
> you're loading a single class or resource; in fact for some cases (like
> ServiceLoader for example) it's a perfect fit in conjunction with TCCL
> (in its capacity as an identifier for the "current" application). You
> want the user to be able to specify their implementation of something,
> and you want it to be reasonably transparent; ServiceLoader+TCCL does
> this fairly well.
>
> ServiceLoader also does well from the perspective of APIs with a static,
> fixed number of implementations. In this case, it is appropriate for a
> framework to have ServiceLoader use the class loader of the framework
> itself. The framework would then be sure to import the implementations
> in question (including their service descriptors); in our modular
> environment, we call this a "service import". Note that this often
> means there is a circular dependency between API and implementation:
> that's OK!
We currently use this for Envers, but it seems less desirable for
other members of the Hibernate portfolio that may be on a separate
lifecycle. For example, Hibernate OGM is a persistence provider
that depends on Hibernate ORM. If we have Hibernate ORM depend on OGM,
that limits the number of OGM versions that can be in use on AS7.
Would it be possible to add an MSC enhancement that allows an inverse
service loader dependency to be expressed, so that it would be enough
to have only OGM depend on ORM (with an inverse service dependency
specified)? I'm thinking that the OGM module would need to exchange
the service dependency information with the ORM module and clear it
when OGM goes away.
If this is possible, it would make a nice future enhancement IMO.
>
> A third ServiceLoader option is of course to simply accept a class
> loader argument when looking up an implementation. This grants the most
> flexibility and tends to work well in just about any environment, though
> it may be somewhat lacking aesthetically, if you care about such things.
I'm not sure how OGM would currently make its presence known to ORM.
Probably via a custom SPI that allows the inverse service loader
dependency to be expressed (so that
https://github.com/hibernate/hibernate-orm/blob/master/hibernate-core/src/main/java/org/hibernate/service/classloading/internal/ClassLoaderServiceImpl.java
can know about Hibernate Search/Envers/OGM/...).
If we cannot have an MSC way to express the inverse service loader
dependency, this sounds like the next best option.
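As a rough illustration of what such a custom SPI could look like, here is a minimal Java sketch. An add-on module registers its class loader with the core, the core runs ServiceLoader against each registered loader, and the add-on unregisters when it goes away. Everything here (`IntegratorRegistryDemo`, `ClassLoaderRegistry`, the `Integrator` stand-in) is hypothetical and is not the Hibernate or MSC API. A temp directory holding a service descriptor plays the role of the add-on module.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

public class IntegratorRegistryDemo {
    /** Stand-in for an ORM extension point; not the real Hibernate interface. */
    public interface Integrator { String name(); }

    /** Stand-in for an add-on (think OGM) that depends on the core. */
    public static class AddOnIntegrator implements Integrator {
        public String name() { return "add-on"; }
    }

    /** Hypothetical SPI: add-on modules announce their class loader to the core. */
    public static final class ClassLoaderRegistry {
        private final Set<ClassLoader> loaders = new CopyOnWriteArraySet<>();

        public void register(ClassLoader cl) { loaders.add(cl); }

        public void unregister(ClassLoader cl) { loaders.remove(cl); }

        /** Runs a ServiceLoader lookup against every registered loader. */
        public List<Integrator> integrators() {
            List<Integrator> found = new ArrayList<>();
            for (ClassLoader cl : loaders) {
                for (Integrator i : ServiceLoader.load(Integrator.class, cl)) {
                    found.add(i);
                }
            }
            return found;
        }
    }

    /**
     * Builds a loader that sees a service descriptor the core loader lacks.
     * The temp directory is not cleaned up; this is only a sketch.
     */
    public static URLClassLoader addOnLoader() throws IOException {
        Path root = Files.createTempDirectory("addon-module");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        Files.write(services.resolve(Integrator.class.getName()),
                AddOnIntegrator.class.getName().getBytes(StandardCharsets.UTF_8));
        return new URLClassLoader(new URL[] { root.toUri().toURL() },
                IntegratorRegistryDemo.class.getClassLoader());
    }

    public static void main(String[] args) throws IOException {
        ClassLoaderRegistry registry = new ClassLoaderRegistry();
        try (URLClassLoader addOn = addOnLoader()) {
            registry.register(addOn);      // add-on shows up
            System.out.println(registry.integrators().size());
            registry.unregister(addOn);    // add-on goes away
            System.out.println(registry.integrators().size());
        }
    }
}
```

The core never names the add-on; only the add-on depends on the core, which is the direction of dependency asked for above.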
>
> In any case, ServiceLoader already fits in very well to modularity; it's
> just a question of understanding your use case to know the appropriate
> way to apply it. The key characteristic however of such a fit is that
> it is trying to load a single resource of some kind. Once you move into
> object graph territory, things become a hell of a lot more complex when
> it comes to the relative resolution game.
>
> A good example is serialization. Having a single class loader for all
> resolution needs often simply doesn't cut it. It works to an extent,
> iff the object graph in question never "escapes" what the application
> (or current class loader) is cognizant of. However it may well be the
> case that an implementation class isn't "visible" to the single class
> loader. In this case, especially if more than one class with a given
> name exists in the system, there's no unambiguous solution to load
> a class relative to an application, unless you explicitly add the
> desired class loaders to the resolution path of the application's class
> loader.
>
> The two solutions to this problem are to either enforce (at serialize
> time) a policy which prevents serializing objects of non-visible
> classes, or to give up relative resolution and go to "absolute" identities.
>
> When you use an absolute identity, you're persisting not only the class'
> name but also the identity of its initiating class loader. Back in the
> RMI and applet days, this would have been done (rather clumsily) by
> extension name or perhaps code source URL. In our container environment
> we tend to use module identifier. But either way, this identity is
> useless unless you have a mechanism to resolve it back to a class
> loader. This mechanism (at least as of today) is going to vary
> substantially from one runtime environment to another though. Thus
> being able to plug in to the process is critical [1].
>
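The absolute-versus-relative distinction above can be sketched in a few lines of Java. A class reference carries an optional module identifier; when it is present, the resolver maps it back to a class loader, and when it is absent the lookup falls back to a loader supplied by the reader. `AbsoluteResolverDemo`, `ClassRef`, `Resolver` and the module id strings are all invented for this sketch, and it is not the JBoss Marshalling ClassResolver interface linked in [1].

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AbsoluteResolverDemo {
    /** A class name plus an optional (hypothetical) module identifier. */
    public static final class ClassRef {
        final String moduleId;   // null means "resolve relative to the reader"
        final String className;

        public ClassRef(String moduleId, String className) {
            this.moduleId = moduleId;
            this.className = className;
        }
    }

    /** Maps module identifiers back to class loaders; environment specific. */
    public static final class Resolver {
        private final Map<String, ClassLoader> modules = new ConcurrentHashMap<>();

        public void addModule(String id, ClassLoader loader) {
            modules.put(id, loader);
        }

        /**
         * Absolute resolution when the reference names a module,
         * relative resolution against the given loader otherwise.
         */
        public Class<?> resolve(ClassRef ref, ClassLoader relative)
                throws ClassNotFoundException {
            ClassLoader loader =
                    ref.moduleId == null ? relative : modules.get(ref.moduleId);
            if (ref.moduleId != null && loader == null) {
                throw new ClassNotFoundException(
                        "Unknown module " + ref.moduleId + " for " + ref.className);
            }
            return Class.forName(ref.className, false, loader);
        }
    }

    public static void main(String[] args) throws Exception {
        Resolver resolver = new Resolver();
        // Assumption: this demo's own loader stands in for a real module.
        resolver.addModule("example.core",
                AbsoluteResolverDemo.class.getClassLoader());

        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        System.out.println(
                resolver.resolve(new ClassRef("example.core", "java.util.ArrayList"), tccl));
        System.out.println(
                resolver.resolve(new ClassRef(null, "java.lang.String"), tccl));
    }
}
```

A reader with a flat class loader could simply ignore `moduleId` and always take the relative branch, while a modular reader would populate the map from whatever its runtime uses to identify modules.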
> There's another tricky dimension to this problem though. Say you want
> everything - the ability to use the "current" application to resolve a
> class (i.e. to disambiguate a class of a given name from a neighbor
> application which might want to execute the exact same code but get its
> own relatively visible class), but also the ability to absolutely
> resolve "invisible" classes. This would be common in the case where you
> have two EAR deployments in an app server, each bundling their own copy
> of an EJB JAR (for whatever reason) but which both link against common
> modules.
>
> There's no silver bullet here, but this problem can be solved to a
> significant extent if you know which class loaders are candidates for
> relative resolution and mark them as such in the target externalized
> class data. In AS for example, we could use information about the
> deployment class-path graph to distinguish what was bundled with the app
> and what is external to it, because we know that deployment classes are
> very likely to be visible from the TCCL (and if not, we would not have a
> very large class loader landscape over which to search).
>
> Of course as an end user you don't really want to think about any of
> this crap, you just want it to work. The user can't avoid having some
> knowledge of this though - they have to know what policy they apply to
> the data they're accessing. Sometimes it's simple - if they have a flat
> class loader, and they're reading modularized serial data, they can just
> discard the class loader identification because if a class isn't found
> in their class loader, it's not going to be found anywhere else either.
>
> Sometimes it's more complex though. Writing from one environment and
> reading from a completely differently structured environment can get
> extremely hairy when using absolute resolution, requiring various
> degrees of translation. The best advice would be to have users strive
> to use the same kind of environment when dealing with serialized data.
> For example, modular writers should be consumed by modular readers.
>
> Note that though I use serialization as my main example, these concepts
> should apply as well (at a certain level) to Hibernate and friends,
> Infinispan, etc.
>
> [1] As an example, see
> https://github.com/dmlloyd/jboss-marshalling/blob/master/api/src/main/java/org/jboss/marshalling/ClassResolver.java#L34