Dear Drools Experts,


Short version

  1. Cron-based rules triggers (more than once) for a full second, rather than once every defined period;
  2. After some time, event creation from the drl seems to hang. The “time to hang” decreases as we increase the throughput of incoming events;
  3. The two issues seem to be somehow related (well, maybe).

Full version


We are testing Drools Fusion to implement CEP functionalities in our platform. We are performing these tests using Drools 6.0.1.Final.

As a basic test case, we set up a synthetic stream of events, and defined a couple of rules to implement a simple throughput metric. Here’s the complete drl:


package it.intext.unity.test

 

import it.intext.unity.test.SynthEvent; 

import java.util.Date;

 

global org.slf4j.Logger logger;

 

declare SynthEvent

    @role( event )

    @timestamp( timestamp )

end

 

declare EventCounter

      @role( event )

      @timestamp( timestamp )

      id          : long

      key         : String

      timestamp   : Date

end

 

 

// Business rules

rule "Create counter"

when

$e : SynthEvent() from entry-point "synth"

then

entryPoints["counters"].insert(new EventCounter( $e.getId(), "event", $e.getTimestamp() ) );

end

 

// Metrics

rule "Count epm"

      // Emit count every 10s, accumulate over 1m

timer ( cron: 0/10 * * * * ? )

      when

            Number( $count : intValue ) from accumulate(

              EventCounter( key == "event" ) over window:time( 60s )from entry-point "counters"count(1) )

      then

logger.debug("epm = {}", $count );

end


The SynthEvent class is very basic:


public class SynthEvent {

 

      long id;

      Date timestamp;

      List<String> meta;

      … // Getters/Setters and constructors omitted

}


Now, the test is performed running the session in one thread (code below)


      private void process(final KieSession session) {

            new Thread(){

                  public void run() {

                        session.fireUntilHalt();

                  };

            }.start();

      }

 

while feeding the events on a second thread (code below)


      private void feed(final KieSession session) {

            new Thread(){

                  @SuppressWarnings("unchecked")

                  public void run() {

                        try {

                             int counter = 0;

                             while(true) {

                                   counter++;

                                   session.execute(CommandFactory.newInsert(createEvent(), null, false, "synth"));

                                   Thread.sleep(getSleepRate());

                                   if ((counter % 1000) == 0) {

                                         logger.debug("Total events: {}", counter);

                                   }

                             }

                        } catch (InterruptedException e) {

                             logger.warn("{}", e);

                        }

                  };

            }.start();

      }

I expected the cron rule “count epm” to trigger once every 10 seconds. But here’s an extract from the log (running at 5 events per second +/- 20%):

[DEBUG] 2014-03-25 04:56:10.008 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 1
[DEBUG] 2014-03-25 04:56:10.075 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2
[DEBUG] 2014-03-25 04:56:10.262 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 3
[DEBUG] 2014-03-25 04:56:10.507 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 4
[DEBUG] 2014-03-25 04:56:10.678 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 5
[DEBUG] 2014-03-25 04:56:10.871 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 6
[DEBUG] 2014-03-25 04:56:20.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 50
[DEBUG] 2014-03-25 04:56:20.042 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 51
[DEBUG] 2014-03-25 04:56:20.231 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 52
[DEBUG] 2014-03-25 04:56:20.405 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 53
[DEBUG] 2014-03-25 04:56:20.647 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 54
[DEBUG] 2014-03-25 04:56:20.823 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 55
[DEBUG] 2014-03-25 04:56:20.992 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 56

 

As you can see, the “count epm” rules fires once per incoming event for a full second (e.g. for the second cycle we see the first activation at 04:56:20.001 and the last one at 04:56:20.992). 

This is not a major issue, since it can be solved with a sort-of publish/subscribe pattern which, BTW, increases the system stability (if anyone is interested I can post more details on the list). 

Anyhow, the test ran for more than 20’ consistently until we decided it was enough.


Now, the problem is that as we increase the throughput (e.g. 50 events per second), the test runs for a few minutes before it becomes inconsistent. The evidence is that the first rule stops to trigger, and EventCounter aren’t created anymore. Here’s a log extract:

[DEBUG] 2014-03-25 07:37:40.008 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 199
[DEBUG] 2014-03-25 07:37:40.017 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 200
[DEBUG] 2014-03-25 07:37:40.034 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 201
[DEBUG] 2014-03-25 07:37:40.058 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 202
[DEBUG] 2014-03-25 07:37:40.076 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 203
… here the system is stable as we get about 3K events per minute as expected
[DEBUG] 2014-03-25 07:38:36.022 (StreamTester.java:run:70) Total events: 3000
[DEBUG] 2014-03-25 07:38:40.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2999
[DEBUG] 2014-03-25 07:38:40.016 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2998
[DEBUG] 2014-03-25 07:38:40.024 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2999
[DEBUG] 2014-03-25 07:38:40.034 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2998
[DEBUG] 2014-03-25 07:38:40.047 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2999
[DEBUG] 2014-03-25 07:38:40.057 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2998

[DEBUG] 2014-03-25 07:39:50.974 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 3008
[DEBUG] 2014-03-25 07:39:50.988 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 3007
[DEBUG] 2014-03-25 07:39:50.996 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 3008
[DEBUG] 2014-03-25 07:39:55.842 (StreamTester.java:run:70) Total events: 7000
… Somewhere in between the system becomes inconsistent and the number of events decreases to 0
 [DEBUG] 2014-03-25 07:40:00.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2734
 [DEBUG] 2014-03-25 07:40:00.017 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2733
 [DEBUG] 2014-03-25 07:40:00.037 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 2732

[DEBUG] 2014-03-25 07:40:50.937 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 181
[DEBUG] 2014-03-25 07:40:50.953 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 180
[DEBUG] 2014-03-25 07:40:50.975 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 179
[DEBUG] 2014-03-25 07:40:50.997 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 178
[DEBUG] 2014-03-25 07:40:55.438 (StreamTester.java:run:70) Total events: 10000
 [DEBUG] 2014-03-25 07:41:00.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 0
[DEBUG] 2014-03-25 07:41:10.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 0
 
We added a quick rule to count “live EventCounter”

rule "Count live counters"

    timer ( cron: 0/60 * * * * ? )

      when

          Number( $count : intValue ) from accumulate(

              EventCounter( key == "event" ) from entry-point "counters", count(1) )

      then

          logger.debug("Live counters = {}", $count );

end

 

and, again, the evidence is that when things go wrong the EventCounter events aren’t created anymore.

[DEBUG] 2014-03-25 07:40:55.438 (StreamTester.java:run:70) Total events: 10000
[DEBUG] 2014-03-25 07:41:00.001 (Rule_Count_live_counters1625367465.java:defaultConsequence:14) Live counters = 0
[DEBUG] 2014-03-25 07:41:00.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 0
[DEBUG] 2014-03-25 07:41:10.001 (Rule_Count_epm442808096.java:defaultConsequence:14) epm = 0

Adding a debug line in the "Create counter" rule confirm this hypotesis (after a while the rule won't trigger anymore). Modified rule is:


rule "Create counter"

    when

      $e : SynthEvent() from entry-point "synth"

    then

      entryPoints["counters"].insert( new EventCounter( $e.getId(), "event", $e.getTimestamp() ) );

      if (Math.random() < 0.01) logger.debug("New event: {}", $e.getId());

end


When the system becomes inconsistent, we stop seeing "New event" log lines.


What are we missing? Any suggestion? Anyone experienced the same problems we have?

Your help will be greatly appreciated.

Thanks in advance,

Vieri