[wildfly-dev] WFLY-508, JBeret initial review and integration issues

Wed Jul 24 19:46:09 EDT 2013

JBeret currently only supports in-VM multi-threaded job executions. We 
are aware of this requirement and is planned in future iterations, at 
least after it's well integrated into WildFly.

I'm glad you brought up that point.

Cheng

On 7/24/13 5:34 PM, Stuart Douglas wrote:
> Something else I thought I should ask, has any thought been given to 
> how this would work in a clustered environment?
>
> I would assume that most customers that would want this would also 
> want some form of HA for the jobs, if a single node goes down you 
> would not want all you batch jobs to grind to a halt.
>
> Stuart
>
>
> On Wed, Jul 24, 2013 at 11:25 PM, Stuart Douglas 
> <stuart.w.douglas at gmail.com <mailto:stuart.w.douglas at gmail.com>> wrote:
>
>
>
>
>     On Wed, Jul 24, 2013 at 8:14 PM, Cheng Fang <cfang at redhat.com
>     <mailto:cfang at redhat.com>> wrote:
>
>
>         On 7/24/13 12:52 PM, Stuart Douglas wrote:
>>         I have also been looking at this today, and there are quite a
>>         few things in the code base that worry me about its quality.
>>
>>         1) JdbcRepository seems to save jobs to the database but
>>         never seems to actually load them again or remove them? [1]
>         JdbcRepository is still work in progress, and is currently not
>         used yet.
>
>>
>>         2) Some things seem to be implemented in a very inefficient
>>         manner, using lists when a map or a set would be more
>>         appropriate. For example i
>>         <https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/repository/JdbcRepository.java#L36>n
>>         AbstractJobRepository all jobs are stored in a list, and as a
>>         result every operation on the repo is O(n) on the number of
>>         jobs. A map would be a far more suitable data structure here.
>>         This will be a real problem is a customer is ever trying to
>>         scale to even a moderately sized number of jobs and job
>>         instances.
>         See my reply to previous message.  Initially I did implement
>         it as a map, but didn't like duplicating id as the key so
>         changed it to list.  I don't expect the number of jobs to be
>         that large, or access to jobs to be a hot spot. But I'm open
>         to switch it since the feedback so far has favored a mapping
>         lookup.
>
>>
>>         3) Thread safety
>>         Almost all objects in the code base are mutable (i.e. no use
>>         of final), and with a few exceptions most of the code is not
>>         synchronized. From what I can see not much thought has been
>>         given to thread safety, and looking through the code I think
>>         there are quite a few places where there are the potential to
>>         have threading issues. e.g. In [2], where a list that is
>>         being modified concurrently is returned to the caller. The
>>         caller cannot safely use the list, as it may be modified by
>>         another thread as it is being iterated. There are other
>>         places where I think there are the potential for races,
>>         however I don't know the code well enough to be sure.
>         If most of the code is synchronized, I would also be
>         worried;-) . But I agreeed, thread safety is the area we need
>         to look more closely as we integrate to WildFly.  In {2],
>         what's your recommendation? to always return a new list the
>         the caller, which seems a bit wasteful.
>
>
>
>     You must either return a new list or use a concurrent list as the
>     backing data structure.
>
>     Stuart
>
>
>>
>>         4) It looks like it has been designed as a standalone project
>>         to be embedded into a deployment, and no thought has been
>>         given to how to actually integrate it into Wildfly. I know
>>         David already mentioned the statics issue, but this is a big
>>         problem. e.g. only one jberet.properties will be loaded, so
>>         if two applications have different properties files then one
>>         will leak into the other app, depending on the current TCCL
>>         when the BatchConfig class is first accessed.
>         jberet.properties is only for standalone distro.  For running
>         in WildFly, all configuration will be included in subsystem
>         configuration.
>
>         Appreciate all the feedback!
>
>         Cheng
>
>>
>>         Stuart
>>
>>         [1]
>>         https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/repository/JdbcRepository.java#L36
>>         [2]
>>         https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/runtime/JobExecutionImpl.java#L134
>>
>>
>>
>>         On Wed, Jul 24, 2013 at 6:09 PM, David M. Lloyd
>>         <david.lloyd at redhat.com <mailto:david.lloyd at redhat.com>> wrote:
>>
>>             On initial review of JBeret we have noticed a number of
>>             issues that need
>>             to be addressed.  The culmination amounts to a series of
>>             questions and
>>             observations here:
>>
>>             #1) Why did we not choose to just use the RI?  In other
>>             words, what
>>             benefit do we get from JBeret that is not also in the RI?
>>              In other,
>>             other words, why should we *use* this code instead of the
>>             RI at this
>>             point in time?
>>
>>             #2) Why does JBeret duplicate facilities already present
>>             in the WildFly
>>             code base and deployer chain - e.g. annotation indexing,
>>             reflection
>>             indexing, thread management, parsing facilities, etc.?
>>
>>             #3) Specific to algorithmic complexity - it appears that
>>             jobs are keyed
>>             by ID, yet accessed using a sequential search [1] - this
>>             does not scale
>>             well to large numbers of jobs.  Is there no better approach?
>>
>>             #4) JAXB seems to be being used to parse XML, which is a
>>             departure from
>>             all of our other services which expect parsing to be done
>>             during
>>             deployment processing in a more efficient manner.  Is
>>             there any better
>>             way we can integrate this, preferably not using JAXB?
>>
>>             #5) There are a number of resources present that seem
>>             inappropriate for
>>             the production JAR [2] [3].  Is this intentional?
>>
>>             #6) This code base makes extensive use of static state,
>>             including static
>>             fields that seem not to be adequately protected for
>>             thread-safety, and
>>             at least one static thread pool [4].  This needs to be
>>             fixed, as these
>>             kinds of things make embedding difficult or impossible.
>>
>>             [1]
>>             https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/repository/AbstractRepository.java#L50
>>             [2]
>>             https://github.com/jberet/jsr352/tree/master/jberet-core/src/main/resources/sql
>>             [3]
>>             https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/resources/jobXML.xjb
>>             [4]
>>             https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/util/ConcurrencyService.java#L22
>>
>>             --
>>             - DML
>>             _______________________________________________
>>             wildfly-dev mailing list
>>             wildfly-dev at lists.jboss.org
>>             <mailto:wildfly-dev at lists.jboss.org>
>>             https://lists.jboss.org/mailman/listinfo/wildfly-dev
>>
>>
>>
>>
>>         _______________________________________________
>>         wildfly-dev mailing list
>>         wildfly-dev at lists.jboss.org  <mailto:wildfly-dev at lists.jboss.org>
>>         https://lists.jboss.org/mailman/listinfo/wildfly-dev
>
>
>         _______________________________________________
>         wildfly-dev mailing list
>         wildfly-dev at lists.jboss.org <mailto:wildfly-dev at lists.jboss.org>
>         https://lists.jboss.org/mailman/listinfo/wildfly-dev
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/wildfly-dev/attachments/20130724/af4fb6ac/attachment-0001.html