[wildfly-dev] WFLY-508, JBeret initial review and integration issues

Thu Jul 25 06:56:08 EDT 2013

Good point. I met recently large customer who not only need to run in a clustered environment but also large number of tasks. They are currently using Spring batch and would prefer to go with what JBoss can ship and support. They like the idea of using standards rather than depending on non standards impl /libraries. 

Sent from my iPhone

On 25 Jul, 2013, at 5:35, Stuart Douglas <stuart.w.douglas at gmail.com> wrote:

> Something else I thought I should ask, has any thought been given to how this would work in a clustered environment? 
> 
> I would assume that most customers that would want this would also want some form of HA for the jobs, if a single node goes down you would not want all you batch jobs to grind to a halt. 
> 
> Stuart
> 
> 
> On Wed, Jul 24, 2013 at 11:25 PM, Stuart Douglas <stuart.w.douglas at gmail.com> wrote:
>> 
>> 
>> 
>> On Wed, Jul 24, 2013 at 8:14 PM, Cheng Fang <cfang at redhat.com> wrote:
>>> 
>>> On 7/24/13 12:52 PM, Stuart Douglas wrote:
>>>> I have also been looking at this today, and there are quite a few things in the code base that worry me about its quality.
>>>> 
>>>> 1) JdbcRepository seems to save jobs to the database but never seems to actually load them again or remove them? [1]
>>> JdbcRepository is still work in progress, and is currently not used yet.
>>> 
>>>> 
>>>> 2) Some things seem to be implemented in a very inefficient           manner, using lists when a map or a set would be more appropriate. For example in AbstractJobRepository all jobs are stored in a list, and as a result every operation on the repo is O(n) on the number of jobs. A map would be a far more suitable data structure here. This will be a real problem is a customer is ever trying to scale to even a moderately sized number of jobs and job instances.
>>> See my reply to previous message.  Initially I did implement it as a map, but didn't like duplicating id as the key so changed it to list.  I don't expect the number of jobs to be that large, or access to jobs to be a hot spot.  But I'm open to switch it since the feedback so far has favored a mapping lookup.
>>> 
>>>> 
>>>> 3) Thread safety
>>>> Almost all objects in the code base are mutable (i.e. no use of final), and with a few exceptions most of the code is not synchronized. From what I can see not much thought has been given to thread safety, and looking through the code I think there are quite a few places where there are the potential to have threading issues. e.g. In [2], where a list that is being modified concurrently is returned to the caller. The caller cannot safely use the list, as it may be modified by another thread as it is being iterated. There are other           places where I think there are the potential for races, however I don't know the code well enough to be sure.
>>> If most of the code is synchronized, I would also be worried ;-)     .  But I agreeed, thread safety is the area we need to look more closely as we integrate to WildFly.  In {2], what's your recommendation? to always return a new list the the caller, which seems a bit wasteful.
>> 
>> 
>> You must either return a new list or use a concurrent list as the backing data structure.
>> 
>> Stuart
>>  
>>> 
>>>> 
>>>> 4) It looks like it has been designed as a standalone project to be embedded into a deployment, and no thought has           been given to how to actually integrate it into Wildfly. I know David already mentioned the statics issue, but this is a big problem. e.g. only one jberet.properties will be loaded, so if two applications have different properties files then one will leak into the other app, depending on the current TCCL when the BatchConfig class is first accessed.
>>> jberet.properties is only for standalone distro.  For running in WildFly, all configuration will be included in subsystem configuration.
>>> 
>>> Appreciate all the feedback!
>>> 
>>> Cheng
>>> 
>>>> 
>>>> Stuart
>>>> 
>>>> [1] https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/repository/JdbcRepository.java#L36
>>>> [2] https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/runtime/JobExecutionImpl.java#L134
>>>> 
>>>> 
>>>> 
>>>> On Wed, Jul 24, 2013 at 6:09 PM, David M. Lloyd <david.lloyd at redhat.com> wrote:
>>>>> On initial review of JBeret we have noticed a number of issues that need
>>>>> to be addressed.  The culmination amounts to a series of questions and
>>>>> observations here:
>>>>> 
>>>>> #1) Why did we not choose to just use the RI?  In other words, what
>>>>> benefit do we get from JBeret that is not also in the RI?  In other,
>>>>> other words, why should we *use* this code instead of the RI at this
>>>>> point in time?
>>>>> 
>>>>> #2) Why does JBeret duplicate facilities already present in the WildFly
>>>>> code base and deployer chain - e.g. annotation indexing, reflection
>>>>> indexing, thread management, parsing facilities, etc.?
>>>>> 
>>>>> #3) Specific to algorithmic complexity - it appears that jobs are keyed
>>>>> by ID, yet accessed using a sequential search [1] - this does not scale
>>>>> well to large numbers of jobs.  Is there no better approach?
>>>>> 
>>>>> #4) JAXB seems to be being used to parse XML, which is a             departure from
>>>>> all of our other services which expect parsing to be done during
>>>>> deployment processing in a more efficient manner.  Is there any better
>>>>> way we can integrate this, preferably not using JAXB?
>>>>> 
>>>>> #5) There are a number of resources present that seem inappropriate for
>>>>> the production JAR [2] [3].  Is this intentional?
>>>>> 
>>>>> #6) This code base makes extensive use of static state, including static
>>>>> fields that seem not to be adequately protected for thread-safety, and
>>>>> at least one static thread pool [4].  This needs to be fixed, as these
>>>>> kinds of things make embedding difficult or impossible.
>>>>> 
>>>>> [1]
>>>>> https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/repository/AbstractRepository.java#L50
>>>>> [2]
>>>>> https://github.com/jberet/jsr352/tree/master/jberet-core/src/main/resources/sql
>>>>> [3]
>>>>> https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/resources/jobXML.xjb
>>>>> [4]
>>>>> https://github.com/jberet/jsr352/blob/master/jberet-core/src/main/java/org/jberet/util/ConcurrencyService.java#L22
>>>>> 
>>>>> --
>>>>> - DML
>>>>> _______________________________________________
>>>>> wildfly-dev mailing list
>>>>> wildfly-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> wildfly-dev mailing list
>>>> wildfly-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>>> 
>>> 
>>> _______________________________________________
>>> wildfly-dev mailing list
>>> wildfly-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/wildfly-dev
> 
> _______________________________________________
> wildfly-dev mailing list
> wildfly-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/wildfly-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/wildfly-dev/attachments/20130725/0bd08784/attachment-0001.html