Hi Wolfgang,<div><br></div><div>Thank you for your help. This sounds like a much better idea than what I have at the moment. </div><div><br></div><div>I'll have to read up on queries in Drools first, though, because I've never used them before.</div>
<div><br><div class="gmail_quote">On Tue, May 29, 2012 at 12:21 PM, Wolfgang Laun <span dir="ltr"><<a href="mailto:wolfgang.laun@gmail.com" target="_blank">wolfgang.laun@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
For this kind of clean-up (getting rid of events that have been around<br>
for more than 24h) you can insert a single event, let's call it EveryHour,<br>
and write a rule on it with a timer, timer(int: 1h 1h), so that it fires hourly. (If this is<br>
too coarse, use 15m 15m or whatever.) On the RHS, run a query to select<br>
all the facts you want to discard, and retract them. The cutoff (current time - 24h)<br>
would have to be a parameter to the query.<br>
<br>
This should reduce the number of scheduled activations, at the cost of<br>
running the query; that cost depends on the number of Alarm facts in the<br>
system.<br>
<br>
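As a rough illustration of the sweep described above, outside of Drools: the plain-Java sketch below makes one pass over all alarms and keeps only those at or after a cutoff passed in as a parameter, which is the role the query parameter would play. The Alarm class, its fields and the sweep method are hypothetical stand-ins, not Drools API and not code from this thread.<br>

```java
import java.util.Arrays;

class AlarmSweep {
    static final long HOUR_MS = 60L * 60L * 1000L;

    // Hypothetical stand-in for the Alarm fact discussed in the thread.
    static final class Alarm {
        final String id;
        final long eventTimeMs; // time of the last update, in epoch milliseconds

        Alarm(String id, long eventTimeMs) {
            this.id = id;
            this.eventTimeMs = eventTimeMs;
        }
    }

    // One pass over all alarms: keep those updated at or after the cutoff,
    // drop the rest. The cutoff is supplied by the caller on every run.
    static Alarm[] sweep(Alarm[] alarms, long cutoffMs) {
        return Arrays.stream(alarms)
                .filter(a -> a.eventTimeMs >= cutoffMs)
                .toArray(Alarm[]::new);
    }

    public static void main(String[] args) {
        long now = 100L * HOUR_MS; // fixed "current time" to keep the example deterministic
        Alarm[] alarms = {
            new Alarm("a", now - 30L * HOUR_MS), // older than 24h: dropped
            new Alarm("b", now - 2L * HOUR_MS),  // recent: kept
            new Alarm("c", now - 25L * HOUR_MS)  // older than 24h: dropped
        };
        Alarm[] kept = sweep(alarms, now - 24L * HOUR_MS);
        System.out.println(kept.length + " alarm(s) kept");
    }
}
```

In the rules engine the same selection would be made by the query, with the timer rule computing the cutoff each time it fires.<br>
<br>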
Other techniques I can think of might require some additional<br>
bookkeeping, so as to have all uncleared Alarms in some Collection.<br>
This could be tricky, depending on the number of state transitions,<br>
etc.<br>
<span class="HOEnZb"><font color="#888888"><br>
-W<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
<br>
On 29/05/2012, Werner Stoop <<a href="mailto:wstoop@gmail.com">wstoop@gmail.com</a>> wrote:<br>
> Thanks Wolfgang,<br>
><br>
> Yes, we do have a lot of events/hour, because it is a complex network we're<br>
> monitoring. Our system has been running for some time, but the Drools rules<br>
> engine is a new addition to attempt to manage some of the complexity.<br>
><br>
> Perhaps I should clarify events and alarms: Our main system tracks alarms<br>
> within the network, but each alarm may have several events, like an event<br>
> when the alarm is first raised, an event when its status goes from major to<br>
> critical and an event when the alarm is cleared. So the main entity in our<br>
> rules is an Alarm, and whenever we get an event we insert a new Alarm into<br>
> the knowledge base if we've never seen the Alarm before, or update the<br>
> Alarm if we have.<br>
><br>
> We have one other rule that removes all Alarms whose status hasn't changed<br>
> for 24 hours, regardless of whether they have cleared or not. This rule's<br>
> syntax is very similar to the one from my previous email. We specifically<br>
> have this rule to try and keep the fact count in the rules engine<br>
> manageable.<br>
><br>
> rule "Old, Inactive Alarm?"<br>
> timer(int: 30m 30m)<br>
> salience -10<br>
> when<br>
> $a : Alarm(severity != "cleared")<br>
> then<br>
> double lastUpdate = minutesSince($a.getEventTime());<br>
> if(lastUpdate > 24 * 60) {<br>
> retract($a);<br>
> }<br>
> end<br>
><br>
> So what you said would explain the memory usage. All Alarms end up in "Old,<br>
> Inactive Alarm?"'s queue waiting for 24 hours.<br>
><br>
> I'm going to disable this rule "Old, Inactive Alarm?" for the time being.<br>
> Unfortunately the nature of the problem means that I'll have to monitor it<br>
> for a day or two before I can draw any conclusions.<br>
><br>
> It seems that the proper solution to this problem would be to get more<br>
> memory.<br>
><br>
> Thank you,<br>
> Werner<br>
><br>
> On Tue, May 29, 2012 at 9:35 AM, Wolfgang Laun<br>
> <<a href="mailto:wolfgang.laun@gmail.com">wolfgang.laun@gmail.com</a>>wrote:<br>
><br>
>> On 29/05/2012, Werner Stoop <<a href="mailto:wstoop@gmail.com">wstoop@gmail.com</a>> wrote:<br>
>> > Hi, thank you for your response.<br>
>> ><br>
>> > We use Drools 5.3.1 through Maven. When I invoke Drools, I do the following<br>
>> > for each event I receive:<br>
>> ><br>
>> > ksession.insert(obj);<br>
>> > ksession.fireAllRules();<br>
>> ><br>
>> This is OK.<br>
>><br>
>> ><br>
>> > Yes, we do use timers. In one case we want to remove alarms that have been<br>
>> > cleared for more than an hour from the knowledgebase. We don't remove them<br>
>> > immediately because some alarms clear briefly and then come back. The rule<br>
>> > I've written to handle this situation is the following:<br>
>> ><br>
>> > rule "Old Cleared Alarm?"<br>
>> > timer(int: 10m 10m)<br>
>> > salience -10<br>
>> > when<br>
>> > $a : Alarm(severity == "cleared")<br>
>> > then<br>
>> > double lastUpdate = minutesSince($a.getEventTime());<br>
>> > if(lastUpdate > 60) {<br>
>> > logger.debug("Alarm " + $a.getAlarmId() + " is old. Removing...");<br>
>> > retract($a);<br>
>> > }<br>
>> > end<br>
>> ><br>
>> > Is there any other way to write this? I've found that I can't put the<br>
>> > minutesSince($a.getEventTime()) in the rule's when-clause.<br>
>><br>
>> It's fine as you have it; it would not be evaluated correctly on the LHS.<br>
>><br>
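>> The minutesSince helper is not shown anywhere in this thread. A minimal guess at its shape, with the current time passed in explicitly so the example is deterministic, might look like the sketch below. The class name and the two-argument signature are assumptions. One reading of the remark above is that the result depends on the clock as well as on the fact, so the same unchanged Alarm gives a different answer later on.<br>

```java
import java.util.Date;

class AlarmAge {
    // Hypothetical helper: minutes elapsed between eventTime and nowMs.
    static double minutesSince(Date eventTime, long nowMs) {
        return (nowMs - eventTime.getTime()) / 60000.0;
    }

    public static void main(String[] args) {
        Date eventTime = new Date(0L); // the fact itself never changes

        // Same fact, two different clock readings, two different answers.
        System.out.println(minutesSince(eventTime, 30L * 60000L)); // prints 30.0
        System.out.println(minutesSince(eventTime, 90L * 60000L)); // prints 90.0
    }
}
```

>><br>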
>> But considering 2,000,000 events in 112 hours, if they were all Alarms, you'd<br>
>> have a rate of about 17,800 events/hour, and so you'd have that many scheduled<br>
>> agenda items.<br>
>><br>
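>> The rate quoted above follows from the figures in the original report (a little over 2 000 000 events in a little over 112 hours). A small sketch of the arithmetic, with a hypothetical class and method name:<br>

```java
class EventRate {
    // Average events per hour over the observed uptime.
    static double eventsPerHour(long events, double hours) {
        return events / hours;
    }

    public static void main(String[] args) {
        // Figures taken from the original report in this thread.
        double rate = eventsPerHour(2_000_000L, 112.0);
        System.out.printf("%.0f events/hour%n", rate); // prints "17857 events/hour"
    }
}
```

>> Both inputs are rounded in the report, so the result is only a ballpark figure.<br>
>><br>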
>> What about the other timer rules for other Event types? Are there<br>
>> similar scenarios?<br>
>><br>
>> -W<br>
>><br>
>> ><br>
>> > Thank you,<br>
>> > Werner<br>
>> ><br>
>> > On Tue, May 29, 2012 at 8:10 AM, Wolfgang Laun<br>
>> > <<a href="mailto:wolfgang.laun@gmail.com">wolfgang.laun@gmail.com</a>>wrote:<br>
>> ><br>
>> >> Just to make sure: How do you invoke the Engine? (I suppose you don't<br>
>> >> call with a limit for rule firings.)<br>
>> >><br>
>> >> Unless it's a bug (BTW: your Drools version is?), it's due to one or<br>
>> >> more of your rules.<br>
>> >><br>
>> >> Are you using timers? How?<br>
>> >><br>
>> >> A detailed investigation of the whereabouts of these<br>
>> >> ScheduledAgendaItem objects might be done by investigating (via the<br>
>> >> unstable API) the Agenda and its various components.<br>
>> >><br>
>> >> -W<br>
>> >><br>
>> >> On 28/05/2012, Werner Stoop <<a href="mailto:wstoop@gmail.com">wstoop@gmail.com</a>> wrote:<br>
>> >> > Hi,<br>
>> >> ><br>
>> >> > We're using Drools with a StatefulKnowledgeSession to process events coming<br>
>> >> > from equipment in our network. The system draws conclusions about the state<br>
>> >> > of the equipment and writes those conclusions to a table in our<br>
>> >> > database. All our rules work as we expected and the system produces the<br>
>> >> > correct results.<br>
>> >> ><br>
>> >> > However, the memory usage of the JVM steadily goes up when the system runs<br>
>> >> > for extended periods of time, until we start getting OutOfMemoryErrors<br>
>> >> > and the server has to be restarted. This is in spite of the fact that the<br>
>> >> > fact count reported by StatefulKnowledgeSession.getFactCount() stays<br>
>> >> > reasonably stable, with around 30 000 facts (give or take) at any point in time.<br>
>> >> ><br>
>> >> > I have run the Eclipse Memory Analyzer tool<br>
>> >> > (<a href="http://www.eclipse.org/mat/" target="_blank">http://www.eclipse.org/mat/</a>)<br>
>> >> > against heap dumps from the JVM several times now, and every time it<br>
>> >> > reports more and more instances of org.drools.common.ScheduledAgendaItem<br>
>> >> > referenced from one instance of java.lang.Object[].<br>
>> >> ><br>
>> >> > To be concrete, as of this morning the uptime is more than 112 hours in<br>
>> >> > total, during which the system has processed a little over 2 000 000 events<br>
>> >> > from the network. It has 29 000 facts in the knowledge session, yet in the<br>
>> >> > heap dump we see 829 632 instances of<br>
>> >> > org.drools.common.ScheduledAgendaItem.<br>
>> >> ><br>
>> >> > What is the ScheduledAgendaItem for? Is there something wrong with my rules<br>
>> >> > that causes this many instances to be held? Is there something I should do<br>
>> >> > to release these instances or the Object[] holding on to them?<br>
>> >> ><br>
>> >> > Thanks,<br>
>> >> > Werner Stoop<br>
>> >> ><br>
>> >> _______________________________________________<br>
>> >> rules-users mailing list<br>
>> >> <a href="mailto:rules-users@lists.jboss.org">rules-users@lists.jboss.org</a><br>
>> >> <a href="https://lists.jboss.org/mailman/listinfo/rules-users" target="_blank">https://lists.jboss.org/mailman/listinfo/rules-users</a><br>
>> >><br>
>> ><br>
>><br>
><br>
</div></div></blockquote></div><br></div>