[rules-users] Bugs in Drools 5.3.0 break Fusion event processing
Edson Tirelli
ed.tirelli at gmail.com
Wed Nov 9 08:53:00 EST 2011
Richard,
This is great info. Yes, please open JIRA's for all 3 issues and we will
make sure this is fixed for the next release.
Thank you,
Edson
2011/11/9 Richard Calmbach <rcalmbac at gmail.com>
> I am making extensive use of the event processing features of the Drools
> rule engine. Upgrading from Drools 5.2.0.Final to Drools 5.3.0.Final broke
> 47 of my unit tests and also broke my functional tests. There seem to be
> multiple changes in Drools 5.3.0 that cause incorrect behavior and/or break
> backward compatibility. Here are the results of my investigation so far.
>
> Issue F1:
> While tracking down the failures of my unit tests, I experimented with the
> Broker example, and while probably not finding the root cause of the broken
> unit tests, I nonetheless came across what clearly seems to be incorrect
> behavior in the DefaultTimerJobInstance.call() method. The bug only reveals
> itself after all input has been processed, so it is masked by the fact that
> running the Broker demo through the entire sequence in stocktickstream.dat
> (1100 lines) takes a long time. In order to reveal the problem more
> quickly, run the demo with only the first 14 lines in stocktickstream.dat.
> Do so for both the 5.2.0 Broker demo (against Drools 5.2.0) and the 5.3.0
> Broker demo (against Drools 5.3.0). The Broker example code in both
> versions is identical, so only the Drools-internal code changes matter.
>
> When running the 5.2.0 Broker demo to the end, you get one
> java.text.ParseException (given the structure of the example code, that's
> expected, albeit not elegant, and not the focus of our investigation). In
> particular, no matter how many lines stocktickstream.dat contains, you
> always get exactly one ParseException at the end.
>
> Contrast this with running the 5.3.0 Broker demo: At the end you get N
> occurrences of java.text.ParseException, where N is the number of lines in
> stocktickstream.dat. So for 14 lines you get 14 occurrences of
> ParseException. Looking at two specific methods shows us why:
>
> Method
> org.drools.examples.broker.events.EventFeeder.FeedJob.execute(JobContext):
>
> public void execute(JobContext context) {
> this.sink.receive( ((FeedContext) context).event );
> if ( this.source.hasNext() ) {
> ((FeedContext) context).setEvent( this.source.getNext() );
> this.trigger.setNextFireTime( ((FeedContext)
> context).getEvent().getDate() );
> clock.scheduleJob( this,
> context,
> trigger );
> }
> }
>
> Note in particular how this method already takes care of scheduling the
> next job execution by updating the next fire time of the job's existing
> FeedTrigger instance. Unfortunately, in Drools 5.3.0,
> DefaultTimerJobInstance.call() does a duplicate scheduling of the same job:
>
> Method org.drools.time.impl.DefaultTimerJobInstance.call():
>
> public Void call() throws Exception {
> this.trigger.nextFireTime(); // need to pop
> if ( handle.isCancel() ) {
> return null;
> }
> this.job.execute( this.ctx );
> if ( handle.isCancel() ) {
> return null;
> }
>
> // our triggers allow for flexible rescheduling
> Date date = this.trigger.hasNextFireTime();
> if ( date != null ) {
> scheduler.internalSchedule( this );
> }
>
> return null;
> }
>
>
> So, every job is duplicated and that's why there are 2*N calls to
> org.drools.examples.broker.events.StockTickPersister.load() instead of N.
>
> I think the rescheduling inside DefaultTimerJobInstance.call() is
> incorrect. For one, it breaks backward compatibility, and it is unexpected.
> The job should be in charge of deciding whether there is another job to
> schedule or what to do. Implicitly scheduling the next job just by updating
> the trigger time is a little too much magic.
>
> Issue F2:
> This is the bug that causes my unit tests to fail. I have not pinpointed
> the root cause, but it seems to have to do with the event scheduling Drools
> does as part of its job execution mechanism. Its symptom is a
> NullPointerException during insertion of an event. What makes it so tricky
> is that with the out-of-the-box configuration, Drools catches such
> exceptions in org.drools.time.impl.PseudoClockScheduler.runCallBacks() and
> passes them to the aptly named "DoNothingSystemEventListener", which
> literally does nothing, not so much as logging (the methods are empty). So
> you don't actually know that the event insertion failed, you only wonder
> why your mock WorkingMemoryEventListener is telling you that your
> expectations are not met. The stack trace (as copied from the Eclipse Debug
> view, hence the unusual formatting) inside Drools is:
>
> Date.getMillisOf(Date) line: 939
> Date.compareTo(Date) line: 959
> DefaultTimerJobInstance.compareTo(DefaultTimerJobInstance) line: 38
> DefaultTimerJobInstance.compareTo(Object) line: 13
> PriorityQueue<E>.siftUpComparable(int, E) line: 582
> PriorityQueue<E>.siftUp(int, E) line: 574
> PriorityQueue<E>.offer(E) line: 274
> PriorityQueue<E>.add(E) line: 251
> PseudoClockScheduler.internalSchedule(TimerJobInstance) line: 136
> PseudoClockScheduler.scheduleJob(Job, JobContext, Trigger) line: 126
> ObjectTypeNode.assertObject(InternalFactHandle, PropagationContext,
> InternalWorkingMemory) line: 230
> EntryPointNode.assertObject(InternalFactHandle, PropagationContext,
> ObjectTypeConf, InternalWorkingMemory) line: 244
> NamedEntryPoint.insert(InternalFactHandle, Object, Rule, Activation,
> ObjectTypeConf) line: 330
> NamedEntryPoint.insert(Object, boolean, boolean, Rule, Activation) line:
> 291
> NamedEntryPoint.insert(Object) line: 116
> NamedEntryPoint.insert(Object) line: 48
> <my code calling into Drools>
>
> Here is method
> org.drools.time.impl.DefaultTimerJobInstance.compareTo(DefaultTimerJobInstance):
>
> public int compareTo(DefaultTimerJobInstance o) {
> return this.trigger.hasNextFireTime().compareTo(
> o.getTrigger().hasNextFireTime() );
> }
>
>
> Essentially, this method calls java.util.Date.compareTo(Date) with a null
> argument, which, as documented, causes a NullPointerException. Sometimes,
> this.trigger.hasNextFireTime() already returns null, and then the NPE gets
> thrown in DefaultTimerJobInstance.compareTo() itself.
>
> I've seen different stack traces leading to this NPE, so this must be
> affecting scheduling and job execution quite broadly.
>
> Issue F3:
> It is debatable whether this is a bug, but it is a backward-compatibility
> breaking change. Previously, when scheduling a job with
> org.drools.time.TimerService.scheduleJob(Job job, JobContext ctx, Trigger
> trigger) (for both real-time and pseudo clock), you could pass a null
> JobContext (say, because you didn't need one), and it would work. However,
> in Drools 5.3.0, this causes a NullPointerException at:
>
> org.drools.time.impl.DefaultTimerJobFactoryManager.createTimerJobInstance(Job,
> JobContext, Trigger, JobHandle, InternalSchedulerService) line: 25
>
> I realize that if it's not in knowledge-api-<version>.jar, it's not an
> official API, but the available interfaces and classes in
> org.drools.time.** (as used in the Broker example) are *very* useful for
> test harnesses *and* for production code (for implementing dynamic timers,
> for instance). So, this is more of a heads-up: If you are suddenly getting
> an NPE, this might be the cause.
>
>
> Please let me know whether I should create JIRA bug reports for issues F1
> and F2. Also, I'd be interested to hear whether others have run into issues
> with Fusion in Drools 5.3.0.
>
> _______________________________________________
> rules-users mailing list
> rules-users at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-users
>
>
--
Edson Tirelli
JBoss Drools Core Development
JBoss by Red Hat @ www.jboss.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/rules-users/attachments/20111109/73873e53/attachment.html
More information about the rules-users
mailing list