On 07/01/2013, Svenja Brunstein <svenja.brunstein(a)gmail.com> wrote:
> Thanks for the input. For 150,000 type "a" events we had about 50,000
> different ids and 1,000 user values.
> After all, the possible combinations for type "b" were only 1,000,000
> (1,000 users * 1,000 users), which is why I am surprised to have 88
> million instances.
The system will create network nodes even when only one pattern matches.
Is 150,000/50,000 = 3 events per id exact, or only an average?
If you have 3 events A, B, C with identical ids and different users,
you'll get the following candidates for an activation: (A,B), (B,A),
(A,C), (C,A), (B,C), (C,B),
and this grows as O(n^2): n such events produce n*(n-1) ordered pairs.
Since you know the exact distribution of your data, you might compute
this precisely.
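
To make the quadratic growth concrete, here is a minimal Java sketch
(plain Java, not Drools code) that counts the ordered pairs for a given
number of events per id. The class and method names are made up for
illustration, the skewed distribution is invented, and the sketch assumes
that all events sharing an id come from different users, so the result is
an upper bound on the candidate pairs.

```java
import java.util.Arrays;

public class PairCount {
    // Ordered pairs (x, y) with x != y among n events sharing one id.
    static long orderedPairs(long n) {
        return n * (n - 1);
    }

    // eventsPerId[i] is the number of type "a" events carrying the i-th id.
    static long totalOrderedPairs(long[] eventsPerId) {
        return Arrays.stream(eventsPerId).map(PairCount::orderedPairs).sum();
    }

    public static void main(String[] args) {
        // The A, B, C example: 3 events with one id.
        System.out.println(orderedPairs(3)); // 6

        // Uniform case: 50,000 ids with exactly 3 events each (150,000 events).
        long[] uniform = new long[50_000];
        Arrays.fill(uniform, 3);
        System.out.println(totalOrderedPairs(uniform)); // 300000

        // Invented skewed case, still 150,000 events over 50,000 ids:
        // 100 ids with 1,000 events, 100 ids with 2 events, the rest with 1.
        long[] skewed = new long[50_000];
        Arrays.fill(skewed, 1);
        Arrays.fill(skewed, 0, 100, 1_000);
        Arrays.fill(skewed, 100, 200, 2);
        System.out.println(totalOrderedPairs(skewed)); // 99900200
    }
}
```

With the same totals, a few heavily shared ids are enough to move the
pair count from hundreds of thousands to tens of millions, which is why
the exact distribution matters more than the averages.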
Is the distribution of id/user combinations realistic? What else do
you need to do with Event type "a"? Something similar, or completely
different? There would be a simple solution to significantly reduce the
memory requirements, but whether it is feasible depends on these answers.
> Yes, it is intentional to have the rule fire twice for each combination :-)
> Unfortunately, retracting events is not an option right now.
Then, at least, generate both in a single rule.
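
As a rough illustration of why this halves the work, the following plain
Java simulation (not Drools code) compares matching every ordered pair
against matching each pair once, ordered by timestamp, and emitting both
"b" combinations per match. The Event class, its fields and the sample
data are hypothetical stand-ins, not the EventObject from the rule.
Events with identical timestamps are never paired in this sketch, so a
real rule would need a tie-breaker for that case.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SingleRulePairing {
    // Hypothetical stand-in for a type "a" event.
    static final class Event {
        final String id;
        final String user;
        final long timestamp;

        Event(String id, String user, long timestamp) {
            this.id = id;
            this.user = user;
            this.timestamp = timestamp;
        }
    }

    static boolean sameIdOtherUser(Event x, Event y) {
        return x.id.equals(y.id) && !x.user.equals(y.user);
    }

    // Original shape: every ordered pair matches and yields one "b".
    static int pairEveryOrder(List<Event> events, Set<String> bEvents) {
        int matches = 0;
        for (Event x : events) {
            for (Event y : events) {
                if (sameIdOtherUser(x, y)) {
                    matches++;
                    bEvents.add(x.user + "->" + y.user);
                }
            }
        }
        return matches;
    }

    // Restricted shape: y must be strictly later than x; each match yields both "b"s.
    static int pairOnce(List<Event> events, Set<String> bEvents) {
        int matches = 0;
        for (Event x : events) {
            for (Event y : events) {
                if (sameIdOtherUser(x, y) && y.timestamp > x.timestamp) {
                    matches++;
                    bEvents.add(x.user + "->" + y.user);
                    bEvents.add(y.user + "->" + x.user);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("42", "alice", 1),
                new Event("42", "bob", 2),
                new Event("42", "carol", 3));
        Set<String> fromEveryOrder = new TreeSet<>();
        Set<String> fromOnce = new TreeSet<>();
        System.out.println(pairEveryOrder(events, fromEveryOrder)); // 6
        System.out.println(pairOnce(events, fromOnce));             // 3
        System.out.println(fromEveryOrder.equals(fromOnce));        // true
    }
}
```

Both variants end up with the same set of "b" combinations, but the
restricted one needs only half as many matches to get there.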
> I started another round, where I made sure to insert a lot more "b" events:
> The memory used by NotNodeLeftTuples is a lot less, even though these nodes
> still use most of the memory.
> Concluding from all that, I guess it is possible that the nodes take that
> much space (up to many GB), and the more events are inserted which
> invalidate the NOT nodes, the less memory is used by them?
Well, you don't need the NOT node, and the number of its left tuples
depends on the distribution of your data.
-W
2013/1/7 Wolfgang Laun <wolfgang.laun(a)gmail.com>
> The amount of memory required for 150K type "a" depends on the actual
> distribution of this data w.r.t. fields id and user, and other
> circumstances; it is not only the rule that is to blame.
>
> There is one flaw, though: The rule would fire twice for a matching
> pair of events of type "a". It's possible that you do want to have a
> type "b" for both combinations of user and friendid, but you could
> create both in a single rule, which should halve your memory
> requirements. If there is no ordered attribute, use the timestamp to
> restrict a pair to only one combination (hint: "after").
>
> This will still generate a lot of network nodes.
>
> Other ideas for reduction may have to take the entire application
> scenario into account, e.g., can you retract events after they have
> been paired, or how do you do inserts and calls to fireAllRules, etc.
> Most important, however, is the actual frequency of id and user
> values in relation to type "a" events.
>
> -W
>
>
>
> On 07/01/2013, Svenja Brunstein <svenja.brunstein(a)gmail.com> wrote:
> > Hi all,
> >
> > we observe a strange behavior with one of our rules. After deployment
> > and sending lots of events (~150,000 of type "a"), the server slows
> > down rapidly until it runs out of memory.
> > We checked with VisualVM which objects are filling the memory: at one
> > point there were almost 14GB of NotNodeLeftTuples (88,933,186
> > instances)!
> >
> > This is our rule:
> >
> > rule "example"
> > when
> >     $evt1:EventObject(type=='a', $id:data['id'], $user:user)
> >         from entry-point internalstream
> >     $evt2:EventObject(type=='a', data['id']==$id, user!=$user, $user2:user)
> >         from entry-point internalstream
> >     not(EventObject(type=='b', user==$user, data['friendid']==$user2)
> >         from entry-point internalstream)
> > then
> >     EventObject evt = new EventObject();
> >     evt.setType('b');
> >     evt.setUser($evt1.getUser());
> >     evt.put('friendid', $evt2.getUser());
> >     entryPoints['internalstream'].insert(evt);
> > end
> >
> > Is that behavior correct for such a size of event combinations when
> > using a NOT in the rule?
> >
> > Thanks,
> > Svenja
> >
> _______________________________________________
> rules-users mailing list
> rules-users(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-users
>