Hi all,
update. I created JBRULES-3416 now and added unit test to replicate this:
If anyone would find the time to cross-check it, would be great!
On 03/04/2012 02:56 PM, Geoffrey De Smet wrote:
Op 04-03-12 10:43, Reinis schreef:
> Hi all,
>
> I localized the bug and have testcase that fails in one move, has only two rules and
used fact data is reduced to couple of entries (just enough to make it break down).
>
> Drools config: drools-core-5.4.0.Beta2 and drools-planner-core-5.3.1.Final
>
> Now I'd like to do one clarification before issuing a pull request for the
testcase:
>
> The error appears to be in the PRESUMED wm and not in the REAL wm:
>
> Presumed workingMemory:
> Score rule (penalize late Foos) has count (4) and weight total (4).<- OK
> Score rule (reward Bar consolidation) has count (1) and weight total (1).<-
WRONG
> Real workingMemory:
> Score rule (penalize late Foos) has count (4) and weight total (4).<- OK
> Score rule (reward Bar consolidation) has count (2) and weight total (2).<-
OK
>
> Based on logging the real wm contains then correct score.
The presumed workingMemory is the state full, long-lived workingMemory
which is carried from step to step, from move to move and therefor does
delta's.
The real workingMemory is a short-lived, new workingMemory just to
verify during errors just to find out what's ScoreOccurrences are wrong.
Naming is indeed confusing. I 'll fix that :)
> Question: If I strip the testcase off of the drools planner code and rewrite it to
only manipulate wm. How do I assert if the error is there?
Simply assert that reward Bar cosolidation has 2 total weight :)
> I mean, Geoffrey told me to do simple assertion of the score, but wouldn't the
assertion be correct, since the REAL wm is correct?
> What I wanted to do was to:
> 1. Simplify the testcase as Geoffrey described;
> 2. retrieve scoreCalculator after firing all rules;
> 3. assert scoreCalculator.getSoftConstraintsBroken() == 6 (<- and this would be
correct)
>
> Thank you for your hints, guys!
> Reinis
>
>
> On 02/28/2012 11:52 AM, Geoffrey De Smet wrote:
>
>> Hi Reinis,
>>
>> So drools-core 5.4.0.Beta2 doesn't fix it?
>> Then there might be another working memory corruption bug left in drools
>> expert.
>>
>> Read the manual section about "Delta score calculation" to understand
>> why there's a presumed and a real workingMemory.
>>
>> Isolating a working memory corruption is a PITA... but if you want to go
>> for it, here's how:
>>
>>> here's how I usually isolate such a bug to a small reproducable unit
>>> test:
>>>
>>> - Turn on environmentMode TRACE so it crashes as fast as possible
>>> - Turn on TRACE logging.
>>>
http://docs.jboss.org/drools/release/5.4.0.Beta1/drools-planner-docs/html...
>>>
>>> So you see something like this:
>>> INFO Solver started: time spend (0), score (-6), new best score (-6),
>>> random seed (0).
>>> TRACE Ignoring not doable move (col0@row0 => row0).
>>> TRACE Move score (-4), accepted (true) for move (col0@row0 =>
>>> row1).
>>> TRACE Move score (-4), accepted (true) for move (col0@row0 =>
>>> row2).
>>> TRACE Move score (-4), accepted (true) for move (col0@row0 =>
>>> row3).
>>> ...
>>> TRACE Move score (-3), accepted (true) for move (col1@row0 =>
>>> row3).
>>> ...
>>> TRACE Move score (-3), accepted (true) for move (col2@row0 =>
>>> row3).
>>> ...
>>> TRACE Move score (-4), accepted (true) for move (col3@row0 =>
>>> row3).
>>> DEBUG Step index (0), time spend (6), score (-3), new best score
>>> (-3), accepted move size (12) for picked step (col1@row0 => row3).
>>> ...
>>> DEBUG Step index (1), time spend (10), score (-2), new best score
>>> (-2), accepted move size (12) for picked step (...).
>>> ...
>>> - If there's a construction heuristic that doesn't crash: remove it
by:
>>> -- run only the construction heuristic config
>>> -- take the output solution as the new input solution
>>> -- remove the construction heuristic config
>>>
>>> - If it doesn't crash in step 0: remove non-crashing steps before it
>>> -- run it just before the step it crashes and take the output solution
>>> as the input solution
>>> -- verify that it now crashes in step 0 (*)
>>>
>>> - If it doesn't crash in the first move: remove non-crashing move's
>>> before it
>>> -- Open DefaultDecider.decideNextStep(LocalSearchStepScope) and add
>>> moveIterator.next().
>>> For example, if move 6 crashes:
>>>
>>> public void decideNextStep(LocalSearchStepScope localSearchStepScope) {
>>> WorkingMemory workingMemory = localSearchStepScope.getWorkingMemory();
>>> Iterator<Move> moveIterator =
>>> selector.moveIterator(localSearchStepScope);
>>> // BEGIN
>>> moveIterator.next();
>>> moveIterator.next();
>>> moveIterator.next();
>>> moveIterator.next();
>>> moveIterator.next(); // 5 next()'s
>>> // END
>>> while (moveIterator.hasNext()) {
>>> Move move = moveIterator.next();
>>> ...
>>>
>>> -- verify that it now crashes in move 0 (*)
>>>
>>> (*) If it doesn't crash, that means that an earlier step or move
>>> helped cause this issue.
>>> Remove as many steps and moves as you can.
>>> This means that the final unit test will have multiple fireAllRules():
>>> ... do move 1 ...
>>> ... fireAllRules
>>> ... undo move 1 ...
>>> (... fireAllRules)
>>> ... do move 2 ...
>>> ... fireAllRules
>>> // Now the WM is corrupted
>>>
>>>
>>> - Now (and not sooner!), remove rules by commented out all the rules
>>> in the DRL that doesn't stop the crashing
>>> - Remove all the problem facts and planning entities that doesn't stop
>>> the crashing.
>>> - Now the problem should be small enough to rewrite it in a simple
>>> unit test for Drools Expert.
>> Op 27-02-12 23:43, Reinis schreef:
>>> Hello,
>>>
>>> I did all the suggestions from Geoffrey (thanks sir) and now the TRACE is
failing with an additional rule:
>>>
>>> java.lang.IllegalStateException: The presumedScore (0hard/0soft) is corrupted
because it is not the realScore (0hard/-1soft).
>>> Presumed workingMemory:
>>> Score rule (penalize late assignments) has count (11) and weight total
(8).
>>> Score rule (reward Resource consolidation) has count (7) and weight
total (7).
>>> Real workingMemory:
>>> Score rule (penalize late assignments) has count (11) and weight total
(*7*).
>>> Score rule (reward Resource consolidation) has count (*8*) and weight
total (*8*).
>>>
>>> The offending rules look like this:
>>>
>>> rule "penalize late Assignments"
>>> when
>>> $project : Project(kickOff != null, $kickOff : kickOff)
>>> $assignment : Assignment(project == $project)
>>> eval($assignment.getInterval().getStart().isAfter($kickOff))
>>> then
>>> insertLogical(new IntConstraintOccurrence("penalize late
assignments", ConstraintType.NEGATIVE_SOFT,
>>> new Duration($kickOff,
$assignment.getInterval().getStart()).toStandardDays().getDays(), $assignment,
$project));
>>> end
>>>
>>> rule "reward Resource consolidation"
>>> when
>>> $leftAssignment : Assignment($project : project, $leftResource : resource
!= null, $leftId : id, $leftIntervalStartMillis : intervalStartMillis)
>>> $rightAssignment : Assignment(project == $project, resource ==
$leftResource, id != $leftId, intervalStartMillis<= $leftIntervalStartMillis)
>>> eval(new Duration($leftAssignment.getIntervalStartMillis() -
$rightAssignment.getIntervalStartMillis()).getStandardDays() == 0)
>>> then
>>> insertLogical(new IntConstraintOccurrence("reward Resource
consolidation", ConstraintType.POSITIVE
>>> , 1, $leftAssignment, $rightAssignment));
>>> end
>>> *
>>> If I comment the "reward Resource consolidation" rule out, the test
case completes successfully.
>>> *
>>> Is there anything else I can try?
>>>
>>> br
>>> Reinis
>>>
>>> On 02/27/2012 09:46 AM, Geoffrey De Smet wrote:
>>>
>>>> Op 26-02-12 19:47, Reinis schreef:
>>>>> Hello,
>>>>>
>>>>>> 1) TRACE
>>>>> It did actually crash!
>>>> Good :)
>>>>> java.lang.IllegalStateException: The presumedScore (0hard/-860soft)
is corrupted because it is not the realScore (0hard/-858soft).
>>>> It's not crashing on the hard constraint difference, but that
probably
>>>> means it doesn't get there yet.
>>>>> Presumed workingMemory:
>>>>> Score rule (reward Resource consolidation) has count (44)
and weight total (44).
>>>>> Real workingMemory:
>>>>> Score rule (reward Resource consolidation) has count (46)
and weight total (46).
>>>>>
>>>>> But the rule causing it looks "normal" to me:
>>>>> rule "reward Resource consolidation"
>>>>> when
>>>>> $leftAssignment : Assignment($project : project, $leftResource :
resource != null, $leftId : id)
>>>>> $rightAssignment : Assignment(project == $project, resource ==
$leftResource, id != $leftId)
>>>>> eval(new
Duration(Math.abs($leftAssignment.getInterval().getStartMillis() -
$rightAssignment.getInterval().getStartMillis())).getStandardDays() == 0)
>>>> Looks good to me.
>>>>
>>>> What you could do is make a Assignment.getIntervalStartMillis() and
>>>> Assignment( ..., $leftStartMillis : intervalStartMillis)
>>>> and add that rightAssignment's startMillis should be> then
>>>> leftAssignment's $leftStartMillis to avoid the Math.abs and double
>>>> counting of this constraint.
>>>>> then
>>>>> insertLogical(new IntConstraintOccurrence("reward Resource
consolidation", ConstraintType.POSITIVE
>>>>> , 1, $leftAssignment, $rightAssignment));
>>>>> end
>>>>>
>>>>>> 2) drools version and planner version
>>>>> drools-core-5.3.1.Final
>>>> Can you try TRACE with drools-core, drools-compiler and knowledge-api
>>>> 5.4.0.Beta2?
>>>> There was a working memory corruption bug fixed in drools 5.4 over 5.3,
>>>> not sure if it made 5.3.1.
>>>>> drools-planner-core-5.3.1.Final
>>>> (No need to upgrade planner itself at this time - as that will force you
>>>> down the UpgradeRecipe.txt).
>>>>> Thank you for the hints! Is the "eval" causing all this
trouble?
>>>>>
>>>>> br
>>>>> Reinis
>>>>>
>>>>>
>>>>> On 02/26/2012 06:45 PM, Geoffrey De Smet wrote:
>>>>>
>>>>>> 1) If you haven't already, can you turn
on<environmentMode>TRACE</...>
>>>>>> and let it run a while (it will be much slower)?
>>>>>> Report here if that crashes or not.
>>>>>>
>>>>>> 2) What drools version and what planner version are you using?
>>>>>>
>>>>>> Op 26-02-12 17:51, Reinis schreef:
>>>>>>> Hello, it appears that hard constraint I have defined is
being ignored
>>>>>>> in some rare cases. Here's an example:
>>>>>>>
>>>>>>> This is what my application logs after getting best solution
after
>>>>>>> solver has returned from solve():
>>>>>>>
>>>>>>> 2012-02-23 21:28:54,595 [main] INFO Project Project1
assignment
>>>>>>> 563:[2012-02-27T13:00:00.000/2012-02-27T17:00:00.000] was
assigned to John
>>>>>>>
>>>>>>> 2012-02-23 21:28:54,597 [main] INFO Project Project2
assignment
>>>>>>> 508:[2012-02-27T13:00:00.000/2012-02-27T17:00:00.000] was
assigned to John
>>>>>>>
>>>>>>> (The log says that John was assigned twice in same time
period to two
>>>>>>> different assignments (assignment 563 and assignment 508)
>>>>>>>
>>>>>>> Here short explanation of the relevant domain data
>>>>>>> assignment.id = 563
>>>>>>> assignment.interval =
2012-02-27T13:00:00.000/2012-02-27T17:00:00.000
>>>>>>> assignment.project = Project1
>>>>>>> assignment.resource = John
>>>>>>>
>>>>>>> And this is rule that, to my opinion should've prohibited
that two
>>>>>>> assignments get assigned to the same resource:
>>>>>>>
>>>>>>> rule "only one assignment within interval for the same
resource"
>>>>>>> when
>>>>>>> $leftAssignment : Assignment($leftId : id,
$interval :
>>>>>>> interval, resource != null, $resource : resource)
>>>>>>> $rightAssignment : Assignment(interval ==
$interval, resource
>>>>>>> == $resource, id != $leftId)
>>>>>>> then
>>>>>>> insertLogical(new
IntConstraintOccurrence("only one assignment
>>>>>>> within interval for the same resource",
ConstraintType.NEGATIVE_HARD, 1,
>>>>>>> $leftAssignment, $rightAssignment));
>>>>>>> end
>>>>>>>
>>>>>>> The score looks like this:
>>>>>>>
>>>>>>> 2012-02-23 21:28:54,594 [main] DEBUG The hard score is: 0 and
the soft
>>>>>>> score is: -532
>>>>>>>
>>>>>>> So no hard constraints are broken according to planner.
>>>>>>>
>>>>>>> My question is not what is wrong with this particular example
(unless it
>>>>>>> is obvious for you), but if you can think of any anti-pattern
that would
>>>>>>> cause this behavior?
>>>>>>>
>>>>>>> What I already checked is:
>>>>>>> - this happens really sporadically (most of the time the
result is
>>>>>>> CORRECT but in 1 from 5 runs I get this kind of error);
>>>>>>> - I checked, re-checked and re-re-checked all the hashCode,
compareTo,
>>>>>>> solutionHashCode, solutionCompare, clone, ... methods - they
appear to
>>>>>>> be correct;
>>>>>>> - moves are simple and I tried to keep them consistent with
the way
>>>>>>> moves in drools-planner examples are constructed;
>>>>>>>
>>>>>>> Would be great if you could give me a hint to what direction
I should
>>>>>>> look next.
>>>>>>>
>>>>>>> br
>>>>>>> Reinis
>>>>>>> _______________________________________________
>>>>>>> rules-users mailing list
>>>>>>> rules-users(a)lists.jboss.org
>>>>>>>
https://lists.jboss.org/mailman/listinfo/rules-users
>>>>>>>
>>>>> _______________________________________________
>>>>> rules-users mailing list
>>>>> rules-users(a)lists.jboss.org
>>>>>
https://lists.jboss.org/mailman/listinfo/rules-users
>>>>>
>>> _______________________________________________
>>> rules-users mailing list
>>> rules-users(a)lists.jboss.org
>>>
https://lists.jboss.org/mailman/listinfo/rules-users
>>>
> _______________________________________________
> rules-users mailing list
> rules-users(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/rules-users
>