http://www.dzone.com/links/r/rip_rete_time_to_get_phreaky.html
—
I've just done some high level documentation for the new rule algorithm I've
called PHREAK, a word play on Hybrid Reasoning. It's still a bit rough and high level,
but hopefully still interesting. It builds on ReteOO, so good to read that bit first.
RETEOO ALGORITHM
The ReteOO was developed throughout the 3, 4 and 5 series releases. It takes the RETE
algorithm and applies well known enhancements, all of which are covered by existing
academic literature:
Node sharingSharing is applied to both the alpha and beta network. The beta network
sharing is always from the root pattern.
Alpha indexingAlpha Nodes with many children use a hash lookup mechanism, to avoid testing
each result.
Beta indexingJoin, Not and Exist nodes indexing their memories using a hash. This reduces
the join attempts for equal checks. Recently range indexing was added to Not and Exists.
Tree based graphsJoin matches did not contain any references to their parent or children
matches. Deletions would have to recalculate all join matches again, which involves
recreating all those join match objects, to be able to find the parts of the network where
the tuples should be deleted. This is called symmetrical propagation. A tree graph
provides parent and children references, so a deletion is just a matter of following those
references. This is asymmetrical propagation. The result is faster and less impact on the
GC, and more robust because changes in values will not cause memory leaks if they happen
without the engine being notified.
Modify-in-placeTraditional RETE implements a modify as a delete + insert. This causes all
join tuples to be GC'd, many of which are recreated again as part of the insert.
Modify-in-place instead propagates as a single pass, every node is inspected
Property reactiveAlso called "new trigger condition". Allows more fine grained
reactivity to updates. A Pattern can react to changes to specific properties and ignore
others. This alleviates problems of recursion and also helps with performance.
Sub-networksNot, Exists and Accumulate can each have nested conditional elements, which
forms sub-networks.
Backward ChainingProlog style derivation trees for backward chaining are supported. The
implementation is stack based, so does not have method recursion issues for large graphs.
Lazy Truth MaintenanceTruth maintenance has a runtime cost, which is incurred whether TMS
is used or not. Lazy TMS only turns it on, on first use. Further it's only turned on
for that object type, so other object types do not incur the runtime cost.
Heap based agendaThe agenda uses a binary heap queue to sort rule matches by salience,
rather than any linear search or maintenance approach.
PHREAK ALGORITHM
Drools 6 introduces a new algorithm, that attempts to address some of the core issues of
RETE. The algorithm is not a rewrite form scratch and incorporates all of the existing
code from ReteOO, and all it's enhancements. While PHREAK is an evolution of the RETE
algorithm, it is no longer classified as a RETE implementation. In the same way that once
an animal evolves beyond a certain point and key characteristics are changed, the animal
becomes classified as new species. There are two key RETE characteristics that strongly
identify any derivative strains, regardless of optimizations. That it is an eager, data
oriented algorithm. Where all work is doing during the insert, update or delete actions;
eagerly produce all partial matches for all rules. PHREAK in contrast is characterised as
a lazy, goal oriented algorithm; where partial matching is aggressively delayed.
This eagerness of RETE can lead to a lot of churn in large systems, and much wasted work.
Where wasted work is classified as matching efforts that do not result in a rule firing.
PHREAK was heavily inspired by a number of algorithms; including (but not limited to)
LEAPS, RETE/UL and Collection-Oriented Match. PHREAK has all enhancements listed in the
ReteOO, in addition it adds the following set of enhancements, which are explained in more
detail in the following paragraphs.
Lazy rule evaluation
Set oriented propagations
Three layers of contextual memory; Node, Segment and Rule memories.
Rule, segment and node based linking.
Stack based evaluations, with pause and resume.
Lazy (delayed) rule evaluation.
Isolated rule evaluation.
Set oriented propagations.
When the PHREAK engine is started all rules are said to be unlinked, no rule evaluation
can happen while rules are unlinked. The insert, update and deletes actions are queued
before entering the beta network. A simple heuristic is to select the rule to evaluate
that is most likely to result in rule firings, and thus delay the evaluation and firing of
the other rules. Only once a rule has all right inputs populated will the rule be
considered linked in, although no work is yet done. Instead a goal is created, that
represents the rule, and placed into a priority queue; which is ordered by salience. Each
queue itself is associated with an AngendaGroup. Only the active AgendaGroup will inspect
it's queue, popping the goal for the rule with the highest salience and submitting it
for evaluation. So the work done shifts from the insert, update, delete phase to the
fireAllRules phase. Only the rule for which the goal was created is evaluated, other
potential rule evaluations from those facts are delayed. While individual rules are
evaluated, node sharing is still achieved through the process of segmentation, which is
explained later.
Each successful join attempt in RETE produces a tuple (or token, or partial match) that
will be propagated to the child nodes. For this reason it is characterised as a tuple
oriented algorithm. For each child node that it reaches it will attempt to join with the
other side of the node, again each successful join attempt will be propagated straight
away. This creates a descent recursion effect. Thrashing the network of nodes as it
ripples up and down, left and right from the point of entry into the beta network to all
the reachable leaf nodes.
PHREAK propagation is set oriented (or collection-oriented), instead of tuple oriented.
For the rule being evaluated it will visit the first node and process all queued insert,
update and deletes. The results are added to a set and the set is propagated to the child
node. In the child node all queued inset, update and deletes are processed, adding the
results to the same set. Once finished that set is propagated to the next child node, and
so on until the terminal node is reached. This creates a single pass, pipeline type
effect, that is isolated to the current rule being evaluated. This creates a batch process
effect which can provide performance advantages for certain rule constructs; such as
sub-networks with accumulates. In the future it will leans itself to being able to exploit
multi-core machines in a number of ways.
The Linking and Unlinking uses a layered bit mask system, based on a network segmentation.
When the rule network is built segments are created for nodes that are shared by the same
set of rules. A rule itself is made up from a path of segments, although if there is no
sharing that will be a single segment. A bit-mask offset is assigned to each node in the
segment. Also another bit-mask (the layering) is assignment to each segment in the
rule's path. When there is at least one input (data propagation) the node's bit is
set on. When each node has it's bit set to on the segment's bit is also set to on.
Conversely if any not bit is set to off, the segment is then also set to off. If each
segment in the rule's path is set to on, the rule is said to be linked in and a goal
is created to schedule the rule for evaluation.
This is a fairly simple heuristic, using at least one items of data per node. Future work
would attempt to delay the linking even further; using techniques such as arc consistency
to determine that no matching will rule in rule instance firings.
Where as RETE has just a singe unit of memory, the node memory, PHREAK has 3 levels of
memory. This allows for much more contextual understanding during evaluation of a Rule.
PHREAK 3 Layered memory system
Example 1 shows a single rule, with three patterns; A, B and C. It forms a single segment,
with bits 1, 2 and 4 for the nodes.
Example 1: Single rule, no sharing
Example 2 demonstrates what happens when another rule is added that shares the pattern A.
A is placed in it's own segment, resulting in two segments per rule. Those two
segments form a path, for their respective rules. The first segment is shared by both
paths. When A is linked the segment becomes linked, it then iterates each path the segment
is shared by, setting the bit 1 to on. If B and C are later turned on, the second segment
for path R1 is linked in; this causes bit 2 to be turned on for R1. With bit 1 and bit 2
set to on for R1, the rule is now linked and a goal created to schedule the rule for later
evaluation and firing.
When a rule is evaluated it is the segments that allow the results of matching to be
shared. Each segment has a staging memory to queue all insert, update and deletes for that
segment. If R1 was to evaluated it would process A and result in a set of tuples. The
algorithm detects that there is a segmentation split and will create peers tuples for each
insert, update and delete in the set and add them to R2's staging memory. Those tuples
will be merged with any existing merged tuples and wait for R2 to eventually be
evaluated.
Example 2: Two rules, with sharing
Example 3 add a third rule and demonstrates what happens when A and B are shared. Only the
bits for the segments are shown this time, demonstrating that R4 has 3 segments, R3 has 3
segments and R1 has 2 segments. A and B and shared by R1, R3 and R4. While D is shared by
R3 and R4.
Example 3: Three rules, with sharing
Sub-networks are formed when a Not, Exists or Accumulate node contain more than one
element. In Example 4 "B not( C )" forms the sub-network, note that
"not(C)" is a single element and does not require a sub network and is merged
inside of the Not node.
The sub-network gets it's own segment. R1 still has a path of two segments. The
sub-network forms another "inner" path. When the sub-network is linked in, it
will link in the outer segment.
Example 4 : Single rule, with sub-network and no sharing
Example 5 shows that the sub-network nodes can be shard by a rule that does not have a
sub-network. This results in the subnetwork segment being split into two.
Example 5: Two rules, one with a sub-network and sharing
Not nodes with constraints and accumulate nodes have special behaviour and can never
unlink a segment, and are always considered to have their bits on.
All rule evaluations are incremental, and will not waste work recomputing matches that it
has already produced.
The evaluation algorithm is stack based, instead of method recursion. Evaluation can be
paused and resumed at any time, via the use of a StackEntry to represent current node
being evaluated.
When a rule evaluation reaches a sub-network a StackEntry is created for the outer path
segment and the sub-network segment. The sub-network segment is evaluated first, when the
set reaches the end of the sub-network path it is merged into a staging list for the outer
node it feeds into. The previous StackEntry is then resumed where it can process the
results of the sub network. This has the added benefit that all work is processed in a
batch, before propagating to the child node; which is much more efficient for accumulate
nodes.
The same stack system can be used for efficient backward chaining. When a rule evaluation
reaches a query node it again pauses the current evaluation, by placing it on the stack.
The query is then evaluated which produces a result set, which is saved in a memory
location for the resumed StackEntry to pick up and propagate to the child node. If the
query itself called other queries the process would repeat, with the current query being
pause and a new evaluation setup for the current query node.