[hibernate-dev] Scanner contract

Wed Mar 20 23:27:31 EDT 2013

I should point out that for most cases, writing a new Scanner should be 
as simple as extending 
org.hibernate.jpa.boot.scan.spi.AbstractScannerImpl and passing in a 
custom org.hibernate.jpa.boot.archive.spi.ArchiveDescriptorFactory that 
handles specific archive types (by URL protocol, etc).
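
To illustrate the shape of that, here is a minimal, self-contained sketch. 
It is not the actual SPI: every type below (SketchArchiveDescriptor, 
SketchArchiveDescriptorFactory, SketchAbstractScanner, ProtocolAwareFactory, 
BundleScanner) is a hypothetical stand-in, and the real AbstractScannerImpl 
and ArchiveDescriptorFactory signatures on the branch may well differ. The 
"bundle" protocol is just an assumed example, and the sketch uses 
java.net.URI rather than URL so it runs without a registered protocol 
handler.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for an archive descriptor: something that can walk its entries.
interface SketchArchiveDescriptor {
    void visitEntries(Consumer<String> entryNameVisitor);
}

// Hypothetical stand-in for the factory that the scanner base class is given.
interface SketchArchiveDescriptorFactory {
    SketchArchiveDescriptor buildArchiveDescriptor(URI archiveLocation);
}

// Handles one custom protocol and hands every other location to a fallback factory.
final class ProtocolAwareFactory implements SketchArchiveDescriptorFactory {
    private final String protocol;
    private final SketchArchiveDescriptorFactory custom;
    private final SketchArchiveDescriptorFactory fallback;

    ProtocolAwareFactory(String protocol,
                         SketchArchiveDescriptorFactory custom,
                         SketchArchiveDescriptorFactory fallback) {
        this.protocol = protocol;
        this.custom = custom;
        this.fallback = fallback;
    }

    @Override
    public SketchArchiveDescriptor buildArchiveDescriptor(URI archiveLocation) {
        // getScheme() may be null for relative URIs; equalsIgnoreCase(null) is simply false.
        boolean handled = protocol.equalsIgnoreCase(archiveLocation.getScheme());
        return (handled ? custom : fallback).buildArchiveDescriptor(archiveLocation);
    }
}

// Stand-in for the scanner base class: all archive access goes through the factory.
abstract class SketchAbstractScanner {
    private final SketchArchiveDescriptorFactory factory;

    protected SketchAbstractScanner(SketchArchiveDescriptorFactory factory) {
        this.factory = factory;
    }

    public List<String> scan(URI archiveLocation) {
        List<String> entryNames = new ArrayList<>();
        factory.buildArchiveDescriptor(archiveLocation).visitEntries(entryNames::add);
        return entryNames;
    }
}

// The "new Scanner": nothing beyond extending the base and passing in the custom factory.
final class BundleScanner extends SketchAbstractScanner {
    BundleScanner(SketchArchiveDescriptorFactory bundleFactory,
                  SketchArchiveDescriptorFactory fallback) {
        super(new ProtocolAwareFactory("bundle", bundleFactory, fallback));
    }
}
```

The point of the split is that the scanner subclass carries no scanning 
logic of its own; the environment-specific part lives entirely in the 
factory it passes up.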

For OSGi, we will need a tiny bit more work for environments like we 
have seen that only give us the root url.  There, assuming we extend 
AbstractScannerImpl, we will need the custom ArchiveDescriptorFactory 
and somehow it will need to account for the single url -> multiple 
artifacts deal (probably the ArchiveDescriptors it returns will need to 
do some delegation to sub-ArchiveDescriptors).
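
One way that delegation could look, purely as a sketch: the types below 
(WalkableArchive, CompositeArchive, FixedEntriesArchive) are hypothetical 
and are not the real ArchiveDescriptor contract. How an OSGi environment 
would actually discover the sub-archives behind the single root url is 
left out entirely; the sketch assumes they are already known.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for an archive descriptor that can walk its entries.
interface WalkableArchive {
    void visitEntries(Consumer<String> entryNameVisitor);
}

// Presents several sub-archives as one archive by delegating the walk to each in order.
final class CompositeArchive implements WalkableArchive {
    private final List<WalkableArchive> delegates;

    CompositeArchive(List<? extends WalkableArchive> delegates) {
        this.delegates = new ArrayList<>(delegates); // defensive copy
    }

    @Override
    public void visitEntries(Consumer<String> entryNameVisitor) {
        for (WalkableArchive delegate : delegates) {
            delegate.visitEntries(entryNameVisitor);
        }
    }
}

// Trivial sub-archive with a fixed entry list, used only to exercise the composite.
final class FixedEntriesArchive implements WalkableArchive {
    private final List<String> entries;

    FixedEntriesArchive(String... entries) {
        this.entries = Arrays.asList(entries);
    }

    @Override
    public void visitEntries(Consumer<String> entryNameVisitor) {
        entries.forEach(entryNameVisitor);
    }
}
```

The caller still sees a single descriptor for the single url, so nothing 
upstream of the factory has to know that more than one artifact sits 
behind it.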

On Wed 20 Mar 2013 07:39:51 PM CDT, Steve Ebersole wrote:
> Here is the initial work I did on this.
>
> https://github.com/sebersole/hibernate-core/tree/HHH-8088
>
> Unfortunately it includes a lot of fixes to bad test code, especially
> in org.hibernate.jpa.test.packaging.PackagingTestCase and children
> (please no more relative file references in tests).
>
> At a high level, scanning involves mainly collaboration between stuff
> in 2 packages:
> 1) org.hibernate.jpa.boot.archive - essentially what used to be
> JarVisitor and friends
> 2) org.hibernate.jpa.boot.scan - uses the archive "walking" provided
> by org.hibernate.jpa.boot.archive and applies filtering to the "entries"
>
> The ultimate goal of the scan is to build a
> org.hibernate.jpa.boot.scan.spi.ScanResult which essentially collects
> together the org.hibernate.jpa.boot.spi.ClassDescriptor,
> org.hibernate.jpa.boot.spi.PackageDescriptor and
> org.hibernate.jpa.boot.spi.MappingFileDescriptor references.
>
>
> On Tue 19 Mar 2013 11:02:38 AM CDT, Steve Ebersole wrote:
>> The problem is what Hibernate does once it returns from the Scanner
>> calls.  It applies filters.  And the filters it applies are different
>> based on whether the url being scanned is the PU root url or a
>> non-root.  You simply don't have that info; so I don't think it will
>> work.
>>
>>
>> On Tue 19 Mar 2013 10:49:56 AM CDT, Brett Meyer wrote:
>>> It's probably too early in the process to know for sure, but I'm
>>> already headed down the path of using the existing Scanner contract
>>> for OSGi scanning in 4.2.  I provide a custom Scanner that ignores
>>> the "jar URL" completely -- all scans tap into the OSGi
>>> BundleWiring.  The scanning methods shouldn't be called repeatedly
>>> since PersistenceUnitInfo#getJarFileUrls is empty (at least in the
>>> Karaf container).  Correct me if that won't work...
>>>
>>> Brett Meyer
>>> Red Hat Software Engineer, Hibernate
>>> +1 260.349.5732
>>>
>>> ----- Original Message -----
>>> From: "Steve Ebersole" <steve at hibernate.org>
>>> To: "Scott Marlow" <smarlow at redhat.com>
>>> Cc: "Hibernate hibernate-dev" <hibernate-dev at lists.jboss.org>
>>> Sent: Tuesday, March 19, 2013 11:19:09 AM
>>> Subject: Re: [hibernate-dev] Scanner contract
>>>
>>> Yes, part of the redesign was to return class names (and streams)
>>> rather than Class instances, but that's just part of the reasoning.
>>>
>>> The problem is that we really can't continue to use the Scanner
>>> contract as-is; it is not great for OSGi environments.  Could we hack
>>> up the OSGi stuff to work with Scanner?  I don't think so.  When OSGi
>>> PersistenceUnitInfo simply returns you a root URL, how can we possibly
>>> apply different filters for root/non-root using the existing Scanner
>>> contract?
>>>
>>> Anyway, I am well down the path of implementing this.  I'll push to my
>>> fork when done and everyone can chime in there on real concrete code.
>>> One thing I can tell you that will be a huge bugbear here in terms of
>>> refactoring is that this existing Scanner and JarVisitor code has
>>> *zero* unit tests.
>>>
>>>
>>> On Tue 19 Mar 2013 10:06:42 AM CDT, Scott Marlow wrote:
>>>> On 03/18/2013 09:15 AM, Steve Ebersole wrote:
>>>>> On Mon 18 Mar 2013 05:14:01 AM CDT, Emmanuel Bernard wrote:
>>>>>>
>>>>>> JBoss AS does use this contract so if you break it, we will need some
>>>>>> kind of compatibility matrix between Hibernate and JBoss AS and EAP.
>>>>>> Not unsurmountable but always a small annoyance.
>>>>>> Maybe other environments also make use of this interface but I am not
>>>>>> aware of them.
>>>>>
>>>>> As far as JBoss AS, Scott has been involved in this design from the
>>>>> beginning.
>>>>
>>>> Given all of the discussion so far, and feedback from Ales/Emmanuel
>>>> who originally created the Scanner, I want to back track and reassess
>>>> before we get too far ahead on changing the scanner (from the point
>>>> of view of the AS side, just to confirm that the new design would
>>>> work).  My initial observation was that the AS side is returning Class
>>>> instances that we are only getting the name from.  Just returning the
>>>> name directly might be better.  However, if Hibernate does need
>>>> access to the classes, I'm also fine with continuing to return the
>>>> classes.
>>>>
>>>> Sorry that I have been absent from this thread.  My queue of other
>>>> stuff is building up.
>>>>
>>>> Scott
>>>>
>>>>>
>>>>>> I'm surprised getUnqualifiedJarName is no longer needed. I thought we
>>>>>> used it as the default PU name but the current code does not use
>>>>>> getUnqualifiedJarName anymore.
>>>>>
>>>>> I have never seen #getUnqualifiedJarName used aside from tests.
>>>>>
>>>>>> We initially designed the Scanner interface to minimize the work the
>>>>>> Scanner implementor has to do and keep as much of the JPA knowledge in
>>>>>> HEM's code. Your design seems to require the Scanner to understand more
>>>>>> of JPA including the notion of root jar and additional jar files.
>>>>>
>>>>> There is actually very very very little "JPA knowledge" being asked of
>>>>> the Scanner in my proposal.  Keep in mind that in both of the cases that
>>>>> have surfaced so far where we actually need a "custom Scanner", the
>>>>> Scanner provider is also the thing that is handing us the
>>>>> root/additional jars.  For EE JPA that's actually part of the
>>>>> PersistenceUnitInfo contract; no magic there.  So for JBoss AS (or
>>>>> another AS) to hand us both the PersistenceUnitInfo (with jar urls) and
>>>>> the Scanner (knowing how to scan said url protocols) is not
>>>>> unreasonable.  In the case of Enterprise OSGi (at least based on our
>>>>> initial target environment), we have a PersistenceUnitInfo that only
>>>>> points us to the root url (#getJarFileUrls returns nothing), but this is
>>>>> the kind of "environment specifics" the current implementation forces
>>>>> Hibernate to understand.  And then, in both cases it forces Hibernate to
>>>>> import and use non-standard APIs just to do the scanning (JBoss's
>>>>> VirtualFile contract and quite a few OSGi contracts).  The important
>>>>> point I think you are missing is that it is far more difficult asking
>>>>> Hibernate to understand all the url protocol schemes in play than for
>>>>> the environments using those protocols to tell us how to scan them.
>>>>>
>>>>>
>>>>>> Things around:
>>>>>>
>>>>>> - getMappingFileNames to return the stream for these files,
>>>>> Not at all following here.  Do you mean getMappingFileNames on the
>>>>> PersistenceUnitInfo?  Well that does *not* return streams, it returns
>>>>> Strings.  And the spec specifically says that the Strings are supposed
>>>>> to be the resource names of the mapping files (aka, they should be
>>>>> loadable by that name through ClassLoader).  So what exactly is the
>>>>> point here?
>>>>>
>>>>>> - isExcludeUnlistedClasses to not scan classes in the root JAR,
>>>>> Exactly.  This "option" only has bearing on the root jar.  For all other
>>>>> jars Hibernate tries to be friendly and load everything.  But, that is
>>>>> hardly "deep JPA knowledge".  The option in terms of the root jars maps
>>>>> directly to an explicit JPA discussion.  Nothing deep about the
>>>>> knowledge there.  And for the non-root jars, there is nothing JPA
>>>>> specific in this option; it's purely a Hibernate *choice*.
>>>>>
>>>>>> - getJarFileUrls
>>>>> Again, I think you are missing the point that generally speaking the
>>>>> PersistenceUnitInfo provider and the Scanner provider are
>>>>> one and the same.
>>>>>
>>>>>> - look for META-INF/orm.xml in the root JAR (only) and exclude it if it
>>>>>> is already listed explicitly in the getMappingFileNames to not process
>>>>>> it twice.
>>>>> Not sure how this is classified as "deep JPA knowledge".
>>>>>
>>>>>> - getManagedClassNames depending on how much you delegate to the scanner
>>>>> Again, not sure how this is classified as "deep JPA knowledge".  I
>>>>> assume you mean because of PUI#excludeUnlistedClasses, but see that
>>>>> discussion above.
>>>>>
>>>>>> That makes me concerned about code duplication and bugs unless someone
>>>>>> deep in JPA implements all of these Scanner implementations.
>>>>>
>>>>> So, I am really not seeing this "need for deep knowledge of JPA" on the
>>>>> Scanner implementor in what I propose.
>>>>> _______________________________________________
>>>>> hibernate-dev mailing list
>>>>> hibernate-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>>>>
>>>>