[hibernate-dev] Scanner contract

Thu Mar 21 14:21:52 EDT 2013

I am integrating this into master at the moment.

On Wed 20 Mar 2013 10:27:31 PM CDT, Steve Ebersole wrote:
> I should point out that for most cases, writing a new Scanner should
> be as simple as extending
> org.hibernate.jpa.boot.scan.spi.AbstractScannerImpl and passing in a
> custom org.hibernate.jpa.boot.archive.spi.ArchiveDescriptorFactory
> that handles specific archive types (by URL protocol, etc).
>
> For OSGi, we will need a tiny bit more work for environments like we
> have seen that only give us the root url.  There, assuming we extend
> AbstractScannerImpl, we will need the custom ArchiveDescriptorFactory
> and somehow it will need to account for the single url -> multiple
> artifacts deal (probably the ArchiveDescriptors it returns will need
> to do some delegation to sub-ArchiveDescriptors).
>
>
> On Wed 20 Mar 2013 07:39:51 PM CDT, Steve Ebersole wrote:
>> Here is the initial work I did on this.
>>
>> https://github.com/sebersole/hibernate-core/tree/HHH-8088
>>
>> Unfortunately it includes a lot of fixes to bad test code, especially
>> in org.hibernate.jpa.test.packaging.PackagingTestCase and children
>> (please no more relative file references in tests).
>>
>> At a high level, scanning involves mainly collaboration between stuff
>> in 2 packages:
>> 1) org.hibernate.jpa.boot.archive - essentially what used to be
>> JarVisitor and friends
>> 2) org.hibernate.jpa.boot.scan - uses the archive "walking" provided
>> by org.hibernate.jpa.boot.archive and applies filtering to the "entries"
>>
>> The ultimate goal of the scan is to build a
>> org.hibernate.jpa.boot.scan.spi.ScanResult which essentially collects
>> together the org.hibernate.jpa.boot.spi.ClassDescriptor,
>> org.hibernate.jpa.boot.spi.PackageDescriptor and
>> org.hibernate.jpa.boot.spi.MappingFileDescriptor references.
>>
>>
>> On Tue 19 Mar 2013 11:02:38 AM CDT, Steve Ebersole wrote:
>>> The problem is what Hibernate does once it returns from the Scanner
>>> calls.  It applies filters.  And the filters is applies are different
>>> based on whether the url being scanned is the PU root url or a
>>> non-root.  You simply don't have that info; so I don't think it will
>>> work.
>>>
>>>
>>> On Tue 19 Mar 2013 10:49:56 AM CDT, Brett Meyer wrote:
>>>> It's probably too early in the process to know for sure, but I'm
>>>> already headed down the path of using the existing Scanner contract
>>>> for OSGi scanning in 4.2.  I provide a custom Scanner that ignores
>>>> the "jar URL" completely -- all scans tap into the OSGi
>>>> BundleWiring.  The scanning methods shouldn't be called repeatedly
>>>> since PersistenceUnitInfo#getJarFileUrls is empty (at least in the
>>>> Karaf container).  Correct me if that won't work...
>>>>
>>>> Brett Meyer
>>>> Red Hat Software Engineer, Hibernate
>>>> +1 260.349.5732
>>>>
>>>> ----- Original Message -----
>>>> From: "Steve Ebersole" <steve at hibernate.org>
>>>> To: "Scott Marlow" <smarlow at redhat.com>
>>>> Cc: "Hibernate hibernate-dev" <hibernate-dev at lists.jboss.org>
>>>> Sent: Tuesday, March 19, 2013 11:19:09 AM
>>>> Subject: Re: [hibernate-dev] Scanner contract
>>>>
>>>> Yes, part of the redesign was to return class names (and streams)
>>>> rather than Class instances, but thats just part of the reasoning.
>>>>
>>>> The problem is that we really can't continue to use the Scanner
>>>> contract as-is; it is not great for OSGi environments.  Could we hack
>>>> up the OSGi stuff to work with Scanner?  I don't think so.  When OSGi
>>>> PersistenceUnitInfo simply returns you a root URL, how can we possibly
>>>> apply different filters for root/non-root using the existing Scanner
>>>> contract?
>>>>
>>>> Anyway, I am well down the path of implementing this.  I'll push to my
>>>> fork when done and everyone can chime in their on real concrete code.
>>>> One thing I can tell you that will be a huge bugbear here in terms of
>>>> refactoring is that this existing Scanner and JarVisitor code has
>>>> *zero* unit tests.
>>>>
>>>>
>>>> On Tue 19 Mar 2013 10:06:42 AM CDT, Scott Marlow wrote:
>>>>> On 03/18/2013 09:15 AM, Steve Ebersole wrote:
>>>>>> On Mon 18 Mar 2013 05:14:01 AM CDT, Emmanuel Bernard wrote:
>>>>>>>
>>>>>>> JBoss AS does use this contract so if you break it, we will need
>>>>>>> some
>>>>>>> kind of compatibility matrix between Hibernate and JBoss AS and
>>>>>>> EAP.
>>>>>>> Not unsurmountable but always a small annoyance.
>>>>>>> Maybe other environments also make use of this interface but I am
>>>>>>> not
>>>>>>> aware of them.
>>>>>>
>>>>>> As far as JBoss AS, Scott has been involved in this design from the
>>>>>> beginning.
>>>>>
>>>>> Given all of the discussion so far, and feedback from Ales/Emmanuel
>>>>> who originally created the Scanner, I want to back track and reassess
>>>>> before we get too far ahead on changing the scanner (from the point
>>>>> ofo view of the AS side, just to confirm that the new design would
>>>>> work).  My initial observation was that the AS side is returning
>>>>> Class
>>>>> instances that we are only getting the name from.  Just returning the
>>>>> name directly might be better.  However, if Hibernate does need to
>>>>> access to the classes, I'm also fine with continuing to return the
>>>>> classes.
>>>>>
>>>>> Sorry that I have been absent from this thread.  My queue of other
>>>>> stuff is building up.
>>>>>
>>>>> Scott
>>>>>
>>>>>>
>>>>>>> I'm surprised getUnqualifiedJarName is no longer needed. I
>>>>>>> thought we
>>>>>>> used it as the default PU name but the current code does not use
>>>>>>> getUnqualifiedJarName
>>>>>>> anymore.
>>>>>>
>>>>>> I have never seen that #getUnqualifiedJarName used aside from tests.
>>>>>>
>>>>>>> We initially designed the Scanner interface to minimize the work
>>>>>>> the
>>>>>>> Scanner implementor has to do and keep as much of the JPA
>>>>>>> knowledge to
>>>>>>> HEM's code. Your design seems to require the Scanner to understand
>>>>>>> more of
>>>>>>> JPA including the notion of root jar and additional jar files.
>>>>>>
>>>>>> There is actually very very very little "JPA knowledge" being
>>>>>> asked of
>>>>>> the Scanner in my proposal.  Keep in mind that in both the cases
>>>>>> that
>>>>>> have surfaced so far where we actually need "custom Scanner" both
>>>>>> are
>>>>>> cases where the Scanner provider is also the thing that is
>>>>>> handing us
>>>>>> the root/additional jars.  For EE JPA thats actually part of the
>>>>>> PersistenceUnitInfo contract; no magic there.  So for JBoss AS (or
>>>>>> another AS) to hand us both the PersistenceUnitInfo (with jar urls)
>>>>>> and
>>>>>> the Scanner (knowning how to scan said url protocols) is not
>>>>>> unreasonable.  In the case of Enterprise OSGi (at least based on our
>>>>>> initial target environment), we have a PersistenceUnitInfo that only
>>>>>> points us to the root url (#getJarFileUrls returns nothing), but
>>>>>> this is
>>>>>> the kind of "environment specifics" the current implementation
>>>>>> forces
>>>>>> Hibernate to understand.  And then, in both cases it forces
>>>>>> Hibernate to
>>>>>> import and use non-standard APIs just to do the scanning (JBoss's
>>>>>> VirtualFile contract and quite a few OSGi contracts).  The important
>>>>>> point I think you are missing is that it is far more difficult
>>>>>> asking
>>>>>> Hibernate to understand all the url protocol schemes in play then
>>>>>> for
>>>>>> the environments using those protocols to tell use how to scan them.
>>>>>>
>>>>>>
>>>>>>> Things around:
>>>>>>>
>>>>>>> - getMappingFileNames to return the stream for these files,
>>>>>> Not at all following here.  Do you mean getMappingFileNames on the
>>>>>> PersistenceUnitInfo?  Well that does *not* return streams, it
>>>>>> returns
>>>>>> Strings.  And the spec specifically says that the Strings are
>>>>>> supposed
>>>>>> to be the resource names of the mapping files (aka, they should be
>>>>>> loadable by that name through ClassLoader).  So what exactly is the
>>>>>> point here?
>>>>>>
>>>>>>> - isExcludeUnlistedClasses to not scan classes in the root JAR,
>>>>>> Exactly.  This "option" only has bearing on the root jar.  For all
>>>>>> other
>>>>>> jars Hibernate tries to be friendly and load everything.  But,
>>>>>> that is
>>>>>> hardly "deep JPA knowledge".  The option in terms of the root jars
>>>>>> maps
>>>>>> directly to an explicit JPA discussion.  Nothing deep about the
>>>>>> knowledge there.  And for the non-root jars, there is nothing JPA
>>>>>> specific in this option; its purely a Hibernate *choice*.
>>>>>>
>>>>>>> - getJarFileUrls
>>>>>> Again, I think you are missing the point that generally speaking the
>>>>>> PersistenceUnitInfo provider and the Scanner provider are
>>>>>> one-in-the-same.
>>>>>>
>>>>>>> - look for META-INF/orm.xml in the root JAr (only) and exclude it
>>>>>>> if it
>>>>>>> is already listed explicitly in the getMappingFileNames to not
>>>>>>> process
>>>>>>> it twice.
>>>>>> Not sure how this is classified as "deep JPA knowledge".
>>>>>>
>>>>>>> - getManagedClassNames depending on how much you delegate to the
>>>>>>> scanner
>>>>>> Again, not sure how this is classified as "deep JPA knowledge".  I
>>>>>> assume you mean because of PUI#excludeUnlistedClasses, but see that
>>>>>> discussion above.
>>>>>>
>>>>>>> That makes me concerned about code duplication and bugs unless
>>>>>>> someone
>>>>>>> deep in JPA immplement all of these Scanner implementations.
>>>>>>
>>>>>> So, I am really not seeing this "need for deep knowledge of JPA" on
>>>>>> the
>>>>>> Scanner implementor in what I propose.
>>>>>> _______________________________________________
>>>>>> hibernate-dev mailing list
>>>>>> hibernate-dev at lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>>>>>
>>>>>
>>>> _______________________________________________
>>>> hibernate-dev mailing list
>>>> hibernate-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev