Thanks Wolfgang,

Yes, we do have a lot of events/hour, because it is a complex network we're monitoring. Our system has been running for some time, but the Drools rules engine is a new addition to attempt to manage some of the complexity.

Perhaps I should clarify events and alarms. Our main system tracks alarms within the network, but each alarm may have several events: for example, one event when the alarm is first raised, another when its status goes from major to critical, and another when the alarm is cleared. So the main entity in our rules is an Alarm. Whenever we get an event, we insert a new Alarm into the knowledge base if we've never seen that Alarm before, or update the existing Alarm if we have.
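
To make the insert-or-update step concrete, here is a minimal sketch in plain Java. It does not call Drools at all: AlarmState and AlarmRegistry are hypothetical names used only for illustration, and the comments mark where a real session would presumably be told about the new or changed fact.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical alarm fact: one instance per alarm ID, mutated as events arrive. */
final class AlarmState {
    final String alarmId;
    String severity;
    long eventTimeMillis;

    AlarmState(String alarmId, String severity, long eventTimeMillis) {
        this.alarmId = alarmId;
        this.severity = severity;
        this.eventTimeMillis = eventTimeMillis;
    }
}

/** Hypothetical registry that decides between "insert" and "update" for each event. */
final class AlarmRegistry {
    private final Map<String, AlarmState> byId = new HashMap<>();

    /** Returns true if the event created a new alarm, false if it updated a known one. */
    boolean onEvent(String alarmId, String severity, long eventTimeMillis) {
        AlarmState existing = byId.get(alarmId);
        if (existing == null) {
            byId.put(alarmId, new AlarmState(alarmId, severity, eventTimeMillis));
            // Assumption: a real system would insert the new fact into the session here.
            return true;
        }
        existing.severity = severity;
        existing.eventTimeMillis = eventTimeMillis;
        // Assumption: a real system would notify the session of the changed fact here.
        return false;
    }

    AlarmState get(String alarmId) {
        return byId.get(alarmId);
    }

    int size() {
        return byId.size();
    }
}
```

With this shape, three events for the same alarm (raised, escalated, cleared) leave exactly one Alarm in the registry, which is why the fact count stays far below the event count.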

We have one other rule that removes all Alarms whose status hasn't changed for 24 hours, regardless of whether they have cleared or not. Its syntax is very similar to the rule from my previous email. We have this rule specifically to try to keep the fact count in the rules engine manageable.

  rule "Old, Inactive Alarm?"
  timer(int: 30m 30m)
  salience -10
  when
      $a : Alarm(severity != "cleared")
  then
      double lastUpdate = minutesSince($a.getEventTime());
      if (lastUpdate > 24 * 60) {
          retract($a);
      }
  end

So what you said would explain the memory usage: all Alarms end up in the queue of "Old, Inactive Alarm?", waiting there for 24 hours.
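
For comparison, here is a rough sketch of the same 24-hour clean-up done as one periodic sweep over the tracked alarms instead of one timer activation per Alarm. It is plain Java and deliberately does not call Drools. TrackedAlarm and AlarmSweeper are hypothetical names, java.time is assumed to be available, and in a real session each expired alarm would still have to be retracted from the knowledge base by the caller.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for an Alarm fact; only the fields the sweep needs. */
final class TrackedAlarm {
    final String alarmId;
    final Instant eventTime;

    TrackedAlarm(String alarmId, Instant eventTime) {
        this.alarmId = alarmId;
        this.eventTime = eventTime;
    }
}

/** Hypothetical sweeper: one pass over all alarms, no per-alarm timers. */
final class AlarmSweeper {
    private final Map<String, TrackedAlarm> alarms = new ConcurrentHashMap<>();
    private final Duration maxAge;

    AlarmSweeper(Duration maxAge) {
        this.maxAge = maxAge;
    }

    /** Adds the alarm, or replaces the previous entry with the same ID. */
    void track(TrackedAlarm alarm) {
        alarms.put(alarm.alarmId, alarm);
    }

    /** Removes and returns every alarm whose last event is older than maxAge at 'now'. */
    List<TrackedAlarm> sweep(Instant now) {
        List<TrackedAlarm> expired = new ArrayList<>();
        Iterator<TrackedAlarm> it = alarms.values().iterator();
        while (it.hasNext()) {
            TrackedAlarm alarm = it.next();
            if (Duration.between(alarm.eventTime, now).compareTo(maxAge) > 0) {
                it.remove();
                // Assumption: the caller retracts each returned alarm from the session.
                expired.add(alarm);
            }
        }
        return expired;
    }

    int size() {
        return alarms.size();
    }
}
```

A scheduler could call sweep(Instant.now()) every 30 minutes, which would keep the number of pending timers at one regardless of how many Alarms are being tracked. Whether that fits our setup is something I would still have to test.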

I'm going to disable this rule "Old, Inactive Alarm?" for the time being. Unfortunately the nature of the problem means that I'll have to monitor it for a day or two before I can draw any conclusions.

It seems that the proper solution to this problem would be to get more memory. 

Thank you,
Werner

On Tue, May 29, 2012 at 9:35 AM, Wolfgang Laun <wolfgang.laun@gmail.com> wrote:
On 29/05/2012, Werner Stoop <wstoop@gmail.com> wrote:
> Hi, thank you for your response.
>
> We use Drools 5.3.1 through Maven. When I invoke Drools, for each event I
> receive I do the following:
>
>   ksession.insert(obj);
>   ksession.fireAllRules();
>
This is OK.

>
> Yes, we do use timers. In one case we want to remove alarms that have been
> cleared for more than an hour from the knowledgebase. We don't remove them
> immediately because some alarms clear briefly and then come back. The rule
> I've written to handle this situation is the following:
>
> rule "Old Cleared Alarm?"
> timer(int: 10m 10m)
> salience -10
> when
>     $a : Alarm(severity == "cleared")
> then
>     double lastUpdate = minutesSince($a.getEventTime());
>     if (lastUpdate > 60) {
>         logger.debug("Alarm " + $a.getAlarmId() + " is old. Removing...");
>         retract($a);
>     }
> end
>
> Is there any other way to write this? I've found that I can't put the
> minutesSince($a.getEventTime()) in the rule's when-clause.

It's fine as you have it; it would not be evaluated correctly on the LHS.

But considering 2000000 events, if they were all Alarm, you'd have a
rate of 17800 events/hour, and so you'd have that many scheduled
agenda items.
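
As a quick check of that rate, using the figures reported earlier in the thread (a little over 2 000 000 events in a little over 112 hours of uptime), the arithmetic can be sketched as below. EventRate is a hypothetical helper name, and both inputs are rounded figures.

```java
/** Back-of-the-envelope event rate from the figures quoted in this thread. */
final class EventRate {
    /** Average events per hour over the given uptime. */
    static double eventsPerHour(long totalEvents, double uptimeHours) {
        if (uptimeHours <= 0) {
            throw new IllegalArgumentException("uptimeHours must be positive");
        }
        return totalEvents / uptimeHours;
    }
}
```

EventRate.eventsPerHour(2000000L, 112.0) comes to roughly 17 857, which agrees with the figure of about 17800 events/hour.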

What about the other timer rules for other Event types? Are there
similar scenarios?

-W

>
> Thank you,
> Werner
>
> On Tue, May 29, 2012 at 8:10 AM, Wolfgang Laun
> <wolfgang.laun@gmail.com> wrote:
>
>> Just to make sure: How do you invoke the Engine? (I suppose you don't
>> call with a limit for rule firings.)
>>
>> Unless it's a bug (BTW: your Drools version is?), it's due to one or
>> more of your rules.
>>
>> Are you using timers? How?
>>
>> A detailed investigation of the whereabouts of these
>> ScheduledAgendaItem objects might be done by investigating (via the
>> unstable API) the Agenda and its various components.
>>
>> -W
>>
>> On 28/05/2012, Werner Stoop <wstoop@gmail.com> wrote:
>> > Hi,
>> >
>> > We're using Drools with a StatefulKnowledgeSession to process events
>> > coming from equipment in our network. The system draws conclusions
>> > about the state of the equipment and writes those conclusions to a
>> > table in our database. All our rules work as we expected and the
>> > system produces the correct results.
>> >
>> > However, the memory usage of the JVM steadily goes up when the system
>> > runs for extended periods of time until we start getting
>> > OutOfMemoryErrors and the server has to be restarted. This is in spite
>> > of the fact that the fact count reported by
>> > StatefulKnowledgeSession.getFactCount() stays reasonably stable, with
>> > around 30 000 facts (give or take) at any point in time.
>> >
>> > I have run the Eclipse Memory Analyzer tool
>> > (http://www.eclipse.org/mat/) against heap dumps from the JVM several
>> > times now, and every time it reports more and more instances of
>> > org.drools.common.ScheduledAgendaItem referenced from one instance of
>> > java.lang.Object[].
>> >
>> > To be concrete, as of this morning the uptime is more than 112 hours in
>> > total, during which the system has processed a little over 2 000 000
>> > events from the network. It has 29 000 facts in the knowledge session,
>> > yet in the heap dump we see 829 632 instances of
>> > org.drools.common.ScheduledAgendaItem.
>> >
>> > What is the ScheduledAgendaItem for? Is there something wrong with my
>> > rules that causes this many instances to be held? Is there something I
>> > should do to release these instances or the Object[] holding on to them?
>> >
>> > Thanks,
>> > Werner Stoop
>> >
>> _______________________________________________
>> rules-users mailing list
>> rules-users@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/rules-users
>>
>