[rules-dev] Left and Right Unlinking - Community Project Proposal

Wed Oct 27 21:21:11 EDT 2010

Hi Mark,

Shame on me... I didn't actually do a lot of progress, but I'm still willing
to do it.

I was sort of waiting for the whole RuleFest thing finish, so that I could
bother you guys with questions on IRC.

I will try to get in touch with you tomorrow.

Leo.

On Wed, Oct 27, 2010 at 11:46 PM, Mark Proctor <mproctor at codehaus.org>wrote:

>  Anyone made progress on this?
>
> Mark
>
> On 20/08/2010 17:02, Mark Proctor wrote:
>
> In an effort to help encourage those thinking of learning more about the
> internals of rule engines. I have made a document on implementating left and
> right unlinking. I describe the initial paper in terms relevant to Drools
> users, and then how that can be implemented in Drools and a series of
> enhancements over the original paper. The task is actually surprisingly
> simple and you only need to learn a very small part of the Drools
> implementation to do it, as such it's a great getting started task. For
> really large stateful systems of hundreds or even thousands of rules and
> hundreds of thousands of facts it should save significant amounts of memory.
> http://blog.athico.com/2010/08/left-and-right-unlinking-community.html
>
> Any takers?
>
> Mark
>
> Introduction
> The following paper describes Left and Right unlinking enhancements for
> Rete based networks:
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.6246
>
> A rete based rule engine consists of two parts of the network, the alpha
> nodes and the beta nodes. When an object is first inserted into the engine
> it is discriminated against by the object type Node, this is a one input and
> one output node. From there it may be further discriminated against by alpha
> nodes that constrain on literal values before reaching the right input of a
> join node in the beta part of the network. Join nodes have two inputs, left
> and right. The right input receives propagations consisting of a single
> object from the alpha part of the network. The left input receives
> propagations consisting of 1 or more objects, from the parent beta node. We
> refer to these propagating objects as LeftTuple and RightTuple, other
> engines also use the terms tokens or partial matches. When a tuple
> propagation reaches a left or right input it's stored in that inputs memory
> and it attempts to join with all possible tuples on the opposite side. If
> there are no tuples on the opposite side then no join can happen and the
> tuple just waits in the node's memory until a propagation from the opposite
> side attempts to join with it. If a given. It would be better if the engine
> could avoid populating that node's memory until both sides have tuples. Left
> and right unlinking are solutions to this problem.
>
> The paper proposes that a node can either be left unlinked or right
> unlinked, but not both, as then the rule would be completely disconnected
> from the network. Unlinking an input means that it will not receive any
> propagations and that the node's memory for that input is not populated,
> saving memory space. When the opposite side, which is still linked, receives
> a propagation the unlinked side is linked back in and receives all the none
> propagated tuples. As both sides cannot be unlinked, the paper describes a
> simple heuristic for choosing which side to unlink. Which ever side becomes
> empty first, then unlink the other. It says that on start up just
> arbitrarily chose to unlink one side as default. The initial hit from
> choosing the wrong side will be negligible, as the heuristic corrects this
> after the first set of propagations.
>
> If the left input becomes empty the right input is unlink, thus clearing
> the right input's memory too. The moment the left input receives a
> propagation it re-attaches the right input fully populating it's memory. The
> node can then attempt joins as normal. Vice-versa if the right input becomes
> empty it unlinks the left input. The moment the right input receives a
> propagation it re-attaches the left input fully populating it's memory so
> that the node can attempt to join as normal.
>
> Implementing Left and Right Unlinking for shared Knowledge Bases
>
> The description of unlinking in the paper won't work for Drools or for
> other rule engines that share the knowledge base between multiple sessions.
> In Drools the session data is decoupled from the main knowledge base and
> multiple sessions can share the same knowledge base. The paper above
> describes systems where the session data is tightly coupled to the knowledge
> base and the knowledge base has only a single session. In shared systems a
> node input that is empty for one session might not be empty for another.
> Instead of physically unlinking the nodes, as described in the paper, an
> integer value can be used on the session's node memory that indicates if the
> node is unlinked for left, right or both inputs. When the propagating node
> attempts to propagate instead of just creating a left or right tuple and
> pushing it into the node. It'll first retrieve the node's memory and only
> create the tuple and propagate if it's linked.
>
> This is great as it also avoids creating tuple objects that would just be
> discarded afterwards as there would be nothing to join with, making things
> lighter on the GC. However it means the engine looks up the node memory
> twice, once before propagating to the node and also inside of the node as it
> attempt to do joins. Instead the node memory should be looked up once, prior
> to propagating and then passed as an argument, avoiding the double lookup.
>
> Traditional Rete has memory per alpha node, for each literal constraint, in
> the network. Drools does not have alpha memory, instead facts are pulled
> from the object type node. This means that facts may needlessly evaluate in
> the alpha part of the network, only to be refused addition to the node
> memory afterwards. Rete supports something called “node sharing”, where
> multiple rules with similar constructs use the same nodes in the network.
> For this reason shared nodes cannot easily be unlinked. As a compromise when
> the alpha node is no longer shared, the network can do a node memory lookup,
> prior to doing the evaluation and check if that section of the network is
> unlinked and avoid attempting the evaluation if it is. This allows for left
> and right unlinking to be used in a engine such as Drools.
>
> Using Left and Right Unlinking at the Same Time
>
> The original paper describes an implantation in which a node cannot have
> both the left and right inputs unlinked for the same node. Building on the
> extension above to allow unlinking to work with a shared knowledge base the
> initial linking status value can be set to both left and right being
> unlinked. However in this initial state, where both sides are unlinked, the
> leaf node's right input isn't just waiting for a left propagation so the
> right can re-link itself (which it can't as the left is unlinked too). It's
> also waiting to receive it's first propagation, when it does it will link
> the left input back in. This will then tell it's parent node's right input
> to also do the same, i.e. wait for it's first right input propagation and
> link in the left when it happens. If it already has a right propagation
> it'll just link in the left anyway. This will trickle up until the root is
> finally linked in and propagations can happen as normally, and the rule's
> nodes return to the above heuristics for when to link and unlink the nodes.
>
> Avoid Unnecessary Eager Propagations
>
> A rule always eagerly propagates all joins, regardless of whether the child
> node can undertake joins too, for instance of there is no propagates for the
> leaf node then no rules can fire, and the eager propagations are wasted
> work. Unlinking can be extended to try to prevent some level of eager
> propagations. Should the leaf node become right unlinked and that right
> input also become empty it will unlink the left too (so both sides are
> unlinked) and go back to waiting for the first right propagation, at which
> point it'll re-link the left. If the parent node also has it's right input
> unlinked at the point that it's child node unlinks the left it will do this
> too. It will repeat this up the chain until it reaches a node that has both
> left and right linked in. This stops any further eager matching from
> occurring that we know can't result in an activation until the leaf node has
> at least one right input.
>
> Heuristics to Avoid Churn from Excessive and Unnecessary Unlinking
>
> The only case where left and right linking would be a bad idea is in
> situations that would cause a "churn”. Churn is when a node with have a
> large amount of right input memory is continually caused to be linked in and
> linked out, forcing those nodes to be repeatedly populated which causes a
> slow down. However heuristics can be used here too, to avoid unnecessary
> unlinking. The first time an input becomes empty unlink the opposite and
> store a time stamp (integer counter for fact handles from the WM). Then have
> a minimum delta number, say 100. The next time it attempts to unlink,
> calculate the delta of the current time stamp (integer counter on fact
> handle) and the time stamp of the node which last unlinked (which was
> recorded at the point of unlinking) if it's less than 100 then do nothing
> and don't unlink until it's 100 or more. If it's 100 or more then unlink and
> as well as storing the unlink time stamp, then take the delta of 100 or more
> and apply a multiple (2, 3, 4 etc depending on how steep you want it to
> rise, 3 is a good starting number) and store it. Such as if the delta is 100
> then store 300. The next time the node links and attempts to unlink it must
> be a delta of 300 or more, the time after that 900 the time after that 2700.
>
>
> _______________________________________________
> rules-dev mailing listrules-dev at lists.jboss.orghttps://lists.jboss.org/mailman/listinfo/rules-dev
>
>
>
> _______________________________________________
> rules-dev mailing list
> rules-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/rules-dev/attachments/20101028/5eee1d49/attachment.html