[rules-users] 14GB of NotNodeLeftTuples produced by one rule?

Svenja Brunstein svenja.brunstein at gmail.com
Mon Jan 7 07:55:57 EST 2013


>
> The system will create network nodes even when only one pattern matches.
> 150,000/50,000 = 3 exactly, or average?

 3 exactly.

> If you have 3 events A, B, C with identical ids and different users,
> you'll get the following candidates for an activation: (A,B), (B,A),
> (A,C), (C,A), (B,C), (C,B)
> and this increases O(n^2). - Since you know the exact distribution of
> your data, you might compute this precisely.
>
Okay. But if I always have only 3 events with the same id, the next three
events D, E, F, that might have other users and another id, would not be
combined with A, B or C, right?
I would get the six combinations you defined plus (D,E), (E,D), (D,F),
(F,D), (E,F), and (F,E).
Continuing like that, I would only see it grow linearly, by about 2n, which
for these 150,000 events would mean 300,000 activations. Where am I going wrong?
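
As a quick sanity check of that count, here is a minimal sketch. It assumes
every id occurs on exactly 3 events and that the events sharing an id all
have different users; the class and method names are made up for
illustration and are not part of our application or of Drools.

```java
// Sketch: count ordered event pairs (e1, e2) that share an id.
// Assumption: each id occurs on exactly `groupSize` events, and the
// events within one id group all have different users.
public class PairCount {

    // Ordered pairs inside one group of k events: k * (k - 1).
    public static long orderedPairs(long groupSize) {
        return groupSize * (groupSize - 1);
    }

    // Total over all id groups, for a fixed group size.
    public static long totalPairs(long events, long groupSize) {
        long groups = events / groupSize;
        return groups * orderedPairs(groupSize);
    }

    public static void main(String[] args) {
        // 150,000 events in groups of 3: 50,000 groups * 6 pairs each
        System.out.println(totalPairs(150_000, 3)); // prints 300000
    }
}
```

This only counts the pairs that pass the join on id. It says nothing about
how many tuples the engine allocates internally for them.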


> Is the distribution of id/user combinations realistic?

What do you mean by realistic? In our test scenario, we always have 3
events with the same id, and about 1,000 users, which are assigned to the
events at random.


> What else do
> you need to do with Event type "a"? Similar? Completely different? -
> There would be a simple solution to significantly reduce the memory
> requirements, but it may not be feasible due to these answers.

At the moment we are just designing a generic solution, which might be
extended by rules afterwards, so that "old" events might need to be reused.
In a real environment, of course, we would retract some events not needed
any longer. But for now we are doing some performance testing and were
surprised that we could "crash" the system with one single rule. Of course,
with a lot of events ;-)

2013/1/7 Wolfgang Laun <wolfgang.laun at gmail.com>

> On 07/01/2013, Svenja Brunstein <svenja.brunstein at gmail.com> wrote:
> > Thanks for the input. For 150,000 type "a" events we had about 50,000
> > different ids and 1,000 user values.
> > After all, combinations possible for type "b" were only 1,000,000 (1,000
> > users * 1,000 users), which is why I am surprised to have 88 million
> > instances.
>
> The system will create network nodes even when only one pattern matches.
> 150,000/50,000 = 3 exactly, or average?
>
> If you have 3 events A, B, C with identical ids and different users,
> you'll get the following candidates for an activation: (A,B), (B,A),
> (A,C), (C,A), (B,C), (C,B)
> and this increases O(n^2). - Since you know the exact distribution of
> your data, you might compute this precisely.
>
> Is the distribution of id/user combinations realistic? What else do
> you need to do with Event type "a"? Similar? Completely different? -
> There would be a simple solution to significantly reduce the memory
> requirements, but it may not be feasible due to these answers.
>
> >
> > Yes, it is intentional to have the rule fire twice for each combination :-)
> > Unfortunately, retracting events is not an option right now.
>
> Then, at least, generate both in a single rule.
>
> >
> > I started another round, where I ensured to insert a lot more "b" events:
> > The memory used by NotNodeLeftTuples is a lot less, even though these
> > nodes still use most of the memory.
> > Concluding from all that, I guess it is possible that the nodes take that
> > much space (up to many GB), and the more events are inserted which
> > invalidate the NOT nodes, the less memory is used by them?
>
> Well, you don't need the NOT node, and their number depends on the
> distribution of your data.
>
> -W
>
> >
> > 2013/1/7 Wolfgang Laun <wolfgang.laun at gmail.com>
> >
> >> The amount of memory required for 150K type "a" depends on the actual
> >> distribution of this data w.r.t. fields id and user, and other
> >> circumstances; it is not only the rule that is to blame.
> >>
> >> There is one flaw, though: The rule would fire twice for a matching
> >> pair of events of type "a". It's possible that you do want to have a
> >> type "b" for both combinations of user and friendid, but you could
> >> create both in a single rule, which should halve your memory
> >> requirements. If there is no ordered attribute, use the timestamp to
> >> restrict a pair to only one combination (hint: "after").
> >>
> >> This will still generate a lot of network nodes.
> >>
> >> Other ideas for reduction may have to take the entire application
> >> scenario into account, e.g., can you retract events after they have
> >> been paired, or how do you do inserts and calls to fireAllRules, etc.
> >> Most important, however, is the actual frequency of id and user
> >> values in relation to type "a" events.
> >>
> >> -W
> >>
> >>
> >>
> >> On 07/01/2013, Svenja Brunstein <svenja.brunstein at gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > we observe a strange behavior with one of our rules. After deployment
> >> > and sending lots of events (~150,000 of type "a"), the server slows
> >> > down rapidly until it runs out of memory.
> >> > We checked with VisualVM which objects are filling the memory: In one
> >> > moment there were almost 14GB of NotNodeLeftTuples (88,933,186
> >> > Instances)!
> >> >
> >> > This is our rule:
> >> >
> >> > rule "example"
> >> > when
> >> >     $evt1:EventObject(type=='a', $id:data['id'], $user:user)
> >> >         from entry-point internalstream
> >> >     $evt2:EventObject(type=='a', data['id']==$id, user!=$user, $user2:user)
> >> >         from entry-point internalstream
> >> >     not(EventObject(type=='b', user==$user, data['friendid']==$user2)
> >> >         from entry-point internalstream)
> >> > then
> >> >     EventObject evt = new EventObject();
> >> >     evt.setType('b');
> >> >     evt.setUser($evt1.getUser());
> >> >     evt.put('friendid', $evt2.getUser());
> >> >     entryPoints['internalstream'].insert(evt);
> >> > end
> >> >
> >> > Is that behavior correct for such a size of event combinations when
> >> > using a NOT in the rule?
> >> >
> >> > Thanks,
> >> > Svenja
> >> >
> >> _______________________________________________
> >> rules-users mailing list
> >> rules-users at lists.jboss.org
> >> https://lists.jboss.org/mailman/listinfo/rules-users
> >>
> >