[wildfly-dev] WFLY-508, JBeret initial review and integration issues

Wed Jul 24 13:42:18 EDT 2013

----- Original Message -----
> From: "Cheng Fang" <cfang at redhat.com>
> To: wildfly-dev at lists.jboss.org, "Kevin Conner" <kconner at redhat.com>, "Pete Muir" <pmuir at redhat.com>
> Sent: Wednesday, July 24, 2013 11:05:36 AM
> Subject: Re: [wildfly-dev] WFLY-508,	JBeret initial review and integration issues
> 
> David,
> 
> Thanks for sharing your comments and observations.  More inline...
> On 7/24/13 12:09 PM, David M. Lloyd wrote:
> > On initial review of JBeret we have noticed a number of issues that
> > need to be addressed.  The culmination amounts to a series of
> > questions and observations here:
> >
> > #1) Why did we not choose to just use the RI?  In other words, what
> > benefit do we get from JBeret that is not also in the RI?  In
> > other,
> > other words, why should we *use* this code instead of the RI at
> > this
> > point in time?
> Batch RI (http://java.net/projects/jbatch from IBM) was created
> solely
> for the purpose of a reference implementation, and is a subset of
> IBM's
> batch offering.  The RI code base is refreshed periodically by IBM
> contributors and it doesn't seem to open to community contribution.
>  I
> haven't done a deep technical comparison between the 2 yet, but I
> guess
> there are areas that one is better than the other and vise versa.
> Looking a bit longer term, batch has been an area Java EE and JBoss
> haven't paid much attention to, and I believe is an area that can
> offer
> future growth potential.  Having our own impl would give us more
> flexibility when it comes to integration with the rest of the stack,
> design choices, and community building.  I'm also adding Kev and Pete
> for their perspectives.
> >
> > #2) Why does JBeret duplicate facilities already present in the
> > WildFly code base and deployer chain - e.g. annotation indexing,
> > reflection indexing, thread management, parsing facilities, etc.?
> Batch spec require an impl to be run in either Java EE or Java SE
> environment.  So inevitably certain services have to reside in JBeret
> itself to satisfy the SE runtime.  Since we started the impl as a
> standalone first, there may be certain aspects that do not fit nicely
> in
> WildFly.  It is in the plan to better align with the appserver by
> leveraging existing services when running inside WildFly.  For
> example,
> use the concurrency utils in EE.
> 
> >
> > #3) Specific to algorithmic complexity - it appears that jobs are
> > keyed by ID, yet accessed using a sequential search [1] - this does
> > not scale well to large numbers of jobs.  Is there no better
> > approach?
> The expectation is there is large amount of data, but the number of
> jobs
> are not that large.  Say we run a reporting job every day, it is
> still
> one single job with many JobInstance and JobExecution.  So I think
> the
> sequential access is acceptable.  I guess another reason I didn't
> want
> to maintain a mapping is I really don't want to duplicate the job id
> as
> the key.
> 

I think the assumption that there will be a small number of jobs, with a large amount of data is a bad one.  I did large scale batch applications when I worked at Sprint, and the number of jobs was very large indeed.  Many thousands, in fact.  I Java EE batch is going to be anything more than a toy, then this is fine, but if customers try to adopt this under this current design, I think they will quickly give up on it.

Andy

> >
> > #4) JAXB seems to be being used to parse XML, which is a departure
> > from all of our other services which expect parsing to be done
> > during
> > deployment processing in a more efficient manner.  Is there any
> > better
> > way we can integrate this, preferably not using JAXB?
> It works well so far in standalone distro, but I'm open to
> alternative
> mechanism in either standalone or EE.
> >
> > #5) There are a number of resources present that seem inappropriate
> > for the production JAR [2] [3].  Is this intentional?
> These are work in progress.  sql files are for implementing a jdbc
> job
> repository.  Why are they inappropriate?
> >
> > #6) This code base makes extensive use of static state, including
> > static fields that seem not to be adequately protected for
> > thread-safety, and at least one static thread pool [4].  This needs
> > to
> > be fixed, as these kinds of things make embedding difficult or
> > impossible.
> 
> In EE environment, thread pool will switch to the managed service
> provided by WildFly, preferably the new concurrency utils.  Can you
> list
> other places you've noticed that make bad use of static state?
> 
> Appreciate your feedback.
> 
> Cheng
> >
> > [1]
> > https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/repository/AbstractRepository.java#L50
> > [2]
> > https://github.com/jberet/jsr352/tree/master/jberet-core/src/main/resources/sql
> > [3]
> > https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/resources/jobXML.xjb
> > [4]
> > https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/util/ConcurrencyService.java#L22
> >
> 
> _______________________________________________
> wildfly-dev mailing list
> wildfly-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>