[rules-users] [Drools Planner] Hard constraint appears to be ignored - How to isolate a working memory corruption in drools expert

Geoffrey De Smet ge0ffrey.spam at gmail.com
Sun Mar 4 08:56:38 EST 2012



Op 04-03-12 10:43, Reinis schreef:
> Hi all,
>
> I localized the bug and have testcase that fails in one move, has only two rules and used fact data is reduced to couple of entries (just enough to make it break down).
>
> Drools config:  drools-core-5.4.0.Beta2 and drools-planner-core-5.3.1.Final
>
> Now I'd like to do one clarification before issuing a pull request for the testcase:
>
> The error appears to be in the PRESUMED wm and not in the REAL wm:
>
> Presumed workingMemory:
>     Score rule (penalize late Foos) has count (4) and weight total (4).<- OK
>     Score rule (reward Bar consolidation) has count (1) and weight total (1).<- WRONG
> Real workingMemory:
>     Score rule (penalize late Foos) has count (4) and weight total (4).<- OK
>     Score rule (reward Bar consolidation) has count (2) and weight total (2).<- OK
>
> Based on logging the real wm contains then correct score.
The presumed workingMemory is the state full, long-lived workingMemory 
which is carried from step to step, from move to move and therefor does 
delta's.
The real workingMemory is a short-lived, new workingMemory just to 
verify during errors just to find out what's ScoreOccurrences are wrong.

Naming is indeed confusing. I 'll fix that :)
> Question: If I strip the testcase off of the drools planner code and rewrite it to only manipulate wm. How do I assert if the error is there?
Simply assert that reward Bar cosolidation has 2 total weight :)
> I mean, Geoffrey told me to do simple assertion of the score, but wouldn't the assertion be correct, since the REAL wm is correct?
> What I wanted to do was to:
> 1. Simplify the testcase as Geoffrey described;
> 2. retrieve scoreCalculator after firing all rules;
> 3. assert scoreCalculator.getSoftConstraintsBroken() == 6 (<- and this would be correct)
>
> Thank you for your hints, guys!
> Reinis
>
>
> On 02/28/2012 11:52 AM, Geoffrey De Smet wrote:
>
>> Hi Reinis,
>>
>> So drools-core 5.4.0.Beta2 doesn't fix it?
>> Then there might be another working memory corruption bug left in drools
>> expert.
>>
>> Read the manual section about "Delta score calculation" to understand
>> why there's a presumed and a real workingMemory.
>>
>> Isolating a working memory corruption is a PITA... but if you want to go
>> for it, here's how:
>>
>>> here's how I usually isolate such a bug to a small reproducable unit
>>> test:
>>>
>>> - Turn on environmentMode TRACE so it crashes as fast as possible
>>> - Turn on TRACE logging.
>>> http://docs.jboss.org/drools/release/5.4.0.Beta1/drools-planner-docs/html_single/index.html#d0e2450
>>>
>>> So you see something like this:
>>> INFO  Solver started: time spend (0), score (-6), new best score (-6),
>>> random seed (0).
>>> TRACE         Ignoring not doable move (col0 at row0 =>    row0).
>>> TRACE         Move score (-4), accepted (true) for move (col0 at row0 =>
>>> row1).
>>> TRACE         Move score (-4), accepted (true) for move (col0 at row0 =>
>>> row2).
>>> TRACE         Move score (-4), accepted (true) for move (col0 at row0 =>
>>> row3).
>>> ...
>>> TRACE         Move score (-3), accepted (true) for move (col1 at row0 =>
>>> row3).
>>> ...
>>> TRACE         Move score (-3), accepted (true) for move (col2 at row0 =>
>>> row3).
>>> ...
>>> TRACE         Move score (-4), accepted (true) for move (col3 at row0 =>
>>> row3).
>>> DEBUG     Step index (0), time spend (6), score (-3), new best score
>>> (-3), accepted move size (12) for picked step (col1 at row0 =>    row3).
>>> ...
>>> DEBUG     Step index (1), time spend (10), score (-2), new best score
>>> (-2), accepted move size (12) for picked step (...).
>>> ...
>>> - If there's a construction heuristic that doesn't crash: remove it by:
>>> -- run only the construction heuristic config
>>> -- take the output solution as the new input solution
>>> -- remove the construction heuristic config
>>>
>>> - If it doesn't crash in step 0: remove non-crashing steps before it
>>> -- run it just before the step it crashes and take the output solution
>>> as the input solution
>>> -- verify that it now crashes in step 0 (*)
>>>
>>> - If it doesn't crash in the first move: remove non-crashing move's
>>> before it
>>> -- Open DefaultDecider.decideNextStep(LocalSearchStepScope) and add
>>> moveIterator.next().
>>> For example, if move 6 crashes:
>>>
>>> public void decideNextStep(LocalSearchStepScope localSearchStepScope) {
>>> WorkingMemory workingMemory = localSearchStepScope.getWorkingMemory();
>>> Iterator<Move>    moveIterator =
>>> selector.moveIterator(localSearchStepScope);
>>> // BEGIN
>>> moveIterator.next();
>>> moveIterator.next();
>>> moveIterator.next();
>>> moveIterator.next();
>>> moveIterator.next(); // 5 next()'s
>>> // END
>>> while (moveIterator.hasNext()) {
>>> Move move = moveIterator.next();
>>> ...
>>>
>>> -- verify that it now crashes in move 0 (*)
>>>
>>> (*) If it doesn't crash, that means that an earlier step or move
>>> helped cause this issue.
>>> Remove as many steps and moves as you can.
>>> This means that the final unit test will have multiple fireAllRules():
>>> ... do move 1 ...
>>> ... fireAllRules
>>> ... undo move 1 ...
>>> (... fireAllRules)
>>> ... do move 2 ...
>>> ... fireAllRules
>>> // Now the WM is corrupted
>>>
>>>
>>> - Now (and not sooner!), remove rules by commented out all the rules
>>> in the DRL that doesn't stop the crashing
>>> - Remove all the problem facts and planning entities that doesn't stop
>>> the crashing.
>>> - Now the problem should be small enough to rewrite it in a simple
>>> unit test for Drools Expert.
>> Op 27-02-12 23:43, Reinis schreef:
>>> Hello,
>>>
>>> I did all the suggestions from Geoffrey (thanks sir) and now the TRACE is failing with an additional rule:
>>>
>>> java.lang.IllegalStateException: The presumedScore (0hard/0soft) is corrupted because it is not the realScore  (0hard/-1soft).
>>> Presumed workingMemory:
>>>       Score rule (penalize late assignments) has count (11) and weight total (8).
>>>       Score rule (reward Resource consolidation) has count (7) and weight total (7).
>>> Real workingMemory:
>>>       Score rule (penalize late assignments) has count (11) and weight total (*7*).
>>>       Score rule (reward Resource consolidation) has count (*8*) and weight total (*8*).
>>>
>>> The offending rules look like this:
>>>
>>> rule "penalize late Assignments"
>>>         when
>>>             $project : Project(kickOff != null, $kickOff : kickOff)
>>>             $assignment : Assignment(project == $project)
>>>             eval($assignment.getInterval().getStart().isAfter($kickOff))
>>>         then
>>>             insertLogical(new IntConstraintOccurrence("penalize late assignments", ConstraintType.NEGATIVE_SOFT,
>>> 			new Duration($kickOff, $assignment.getInterval().getStart()).toStandardDays().getDays(), $assignment, $project));
>>> end
>>>
>>> rule "reward Resource consolidation"
>>> 	when
>>> 		$leftAssignment : Assignment($project : project, $leftResource : resource != null, $leftId : id, $leftIntervalStartMillis : intervalStartMillis)
>>> 		$rightAssignment : Assignment(project == $project, resource == $leftResource, id != $leftId, intervalStartMillis<= $leftIntervalStartMillis)
>>> 		eval(new Duration($leftAssignment.getIntervalStartMillis() - $rightAssignment.getIntervalStartMillis()).getStandardDays() == 0)
>>> 	then
>>> 		insertLogical(new IntConstraintOccurrence("reward Resource consolidation", ConstraintType.POSITIVE
>>> 		, 1, $leftAssignment, $rightAssignment));
>>> end
>>> *
>>> If I comment the "reward Resource consolidation" rule out, the test case completes successfully.
>>> *
>>> Is there anything else I can try?
>>>
>>> br
>>> Reinis
>>>
>>> On 02/27/2012 09:46 AM, Geoffrey De Smet wrote:
>>>
>>>> Op 26-02-12 19:47, Reinis schreef:
>>>>> Hello,
>>>>>
>>>>>> 1) TRACE
>>>>> It did actually crash!
>>>> Good :)
>>>>> java.lang.IllegalStateException: The presumedScore (0hard/-860soft) is corrupted because it is not the realScore  (0hard/-858soft).
>>>> It's not crashing on the hard constraint difference, but that probably
>>>> means it doesn't get there yet.
>>>>> Presumed workingMemory:
>>>>>         Score rule (reward Resource consolidation) has count (44) and weight total (44).
>>>>> Real workingMemory:
>>>>>         Score rule (reward Resource consolidation) has count (46) and weight total (46).
>>>>>
>>>>> But the rule causing it looks "normal" to me:
>>>>> rule "reward Resource consolidation"
>>>>> 	when
>>>>> 		$leftAssignment : Assignment($project : project, $leftResource : resource != null, $leftId : id)
>>>>> 		$rightAssignment : Assignment(project == $project, resource == $leftResource, id != $leftId)
>>>>> 		eval(new Duration(Math.abs($leftAssignment.getInterval().getStartMillis() - $rightAssignment.getInterval().getStartMillis())).getStandardDays() == 0)
>>>> Looks good to me.
>>>>
>>>> What you could do is make a Assignment.getIntervalStartMillis() and
>>>> Assignment( ..., $leftStartMillis : intervalStartMillis)
>>>> and add that rightAssignment's startMillis should be>     then
>>>> leftAssignment's $leftStartMillis to avoid the Math.abs and double
>>>> counting of this constraint.
>>>>> 	then
>>>>> 		insertLogical(new IntConstraintOccurrence("reward Resource consolidation", ConstraintType.POSITIVE
>>>>> 		, 1, $leftAssignment, $rightAssignment));
>>>>> end
>>>>>
>>>>>> 2) drools version and planner version
>>>>> drools-core-5.3.1.Final
>>>> Can you try TRACE with drools-core, drools-compiler and knowledge-api
>>>> 5.4.0.Beta2?
>>>> There was a working memory corruption bug fixed in drools 5.4 over 5.3,
>>>> not sure if it made 5.3.1.
>>>>> drools-planner-core-5.3.1.Final
>>>> (No need to upgrade planner itself at this time - as that will force you
>>>> down the UpgradeRecipe.txt).
>>>>> Thank you for the hints! Is the "eval" causing all this trouble?
>>>>>
>>>>> br
>>>>> Reinis
>>>>>
>>>>>
>>>>> On 02/26/2012 06:45 PM, Geoffrey De Smet wrote:
>>>>>
>>>>>> 1) If you haven't already, can you turn on<environmentMode>TRACE</...>
>>>>>> and let it run a while (it will be much slower)?
>>>>>> Report here if that crashes or not.
>>>>>>
>>>>>> 2) What drools version and what planner version are you using?
>>>>>>
>>>>>> Op 26-02-12 17:51, Reinis schreef:
>>>>>>> Hello, it appears that hard constraint I have defined is being ignored
>>>>>>> in some rare cases. Here's an example:
>>>>>>>
>>>>>>> This is what my application logs after getting best solution after
>>>>>>> solver has returned from solve():
>>>>>>>
>>>>>>> 2012-02-23 21:28:54,595 [main] INFO  Project Project1 assignment
>>>>>>> 563:[2012-02-27T13:00:00.000/2012-02-27T17:00:00.000] was assigned to John
>>>>>>>
>>>>>>> 2012-02-23 21:28:54,597 [main] INFO  Project Project2 assignment
>>>>>>> 508:[2012-02-27T13:00:00.000/2012-02-27T17:00:00.000] was assigned to John
>>>>>>>
>>>>>>> (The log says that John was assigned twice in same time period to two
>>>>>>> different assignments (assignment 563 and assignment 508)
>>>>>>>
>>>>>>> Here short explanation of the relevant domain data
>>>>>>> assignment.id = 563
>>>>>>> assignment.interval = 2012-02-27T13:00:00.000/2012-02-27T17:00:00.000
>>>>>>> assignment.project = Project1
>>>>>>> assignment.resource = John
>>>>>>>
>>>>>>> And this is rule that, to my opinion should've prohibited that two
>>>>>>> assignments get assigned to the same resource:
>>>>>>>
>>>>>>> rule "only one assignment within interval for the same resource"
>>>>>>>             when
>>>>>>>                 $leftAssignment : Assignment($leftId : id, $interval :
>>>>>>> interval, resource != null, $resource : resource)
>>>>>>>                 $rightAssignment : Assignment(interval == $interval, resource
>>>>>>> == $resource, id != $leftId)
>>>>>>>             then
>>>>>>>                 insertLogical(new IntConstraintOccurrence("only one assignment
>>>>>>> within interval for the same resource", ConstraintType.NEGATIVE_HARD, 1,
>>>>>>> $leftAssignment, $rightAssignment));
>>>>>>> end
>>>>>>>
>>>>>>> The score looks like this:
>>>>>>>
>>>>>>> 2012-02-23 21:28:54,594 [main] DEBUG The hard score is: 0 and the soft
>>>>>>> score is: -532
>>>>>>>
>>>>>>> So no hard constraints are broken according to planner.
>>>>>>>
>>>>>>> My question is not what is wrong with this particular example (unless it
>>>>>>> is obvious for you), but if you can think of any anti-pattern that would
>>>>>>> cause this behavior?
>>>>>>>
>>>>>>> What I already checked is:
>>>>>>> - this happens really sporadically (most of the time the result is
>>>>>>> CORRECT but in 1 from 5 runs I get this kind of error);
>>>>>>> - I checked, re-checked and re-re-checked all the hashCode, compareTo,
>>>>>>> solutionHashCode, solutionCompare, clone, ... methods - they appear to
>>>>>>> be correct;
>>>>>>> - moves are simple and I tried to keep them consistent with the way
>>>>>>> moves in drools-planner examples are constructed;
>>>>>>>
>>>>>>> Would be great if you could give me a hint to what direction I should
>>>>>>> look next.
>>>>>>>
>>>>>>> br
>>>>>>> Reinis
>>>>>>> _______________________________________________
>>>>>>> rules-users mailing list
>>>>>>> rules-users at lists.jboss.org
>>>>>>> https://lists.jboss.org/mailman/listinfo/rules-users
>>>>>>>
>>>>> _______________________________________________
>>>>> rules-users mailing list
>>>>> rules-users at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/rules-users
>>>>>
>>> _______________________________________________
>>> rules-users mailing list
>>> rules-users at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/rules-users
>>>
> _______________________________________________
> rules-users mailing list
> rules-users at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-users
>

-- 
With kind regards,
Geoffrey De Smet





More information about the rules-users mailing list