[rules-users] GC Overhead Limit Exceeded and 1B JoinLeftNode Objects

Wolfgang Laun wolfgang.laun at gmail.com
Mon Feb 25 01:58:02 EST 2013


On 24/02/2013, Julian Klein <julianklein at gmail.com> wrote:

> 2) I see what you mean regarding the literal "." in the regular
> expressions.  It should be a literal ("366\.\\d+|743\.3\\d?) .

"366\\.\\d+|743\\.3\\d?"
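
To see what the escaped dot buys you, here is a small Java sketch. As far as I know, Drools evaluates "matches" with java.util.regex against the whole string, so plain Java shows the same behaviour. The class and method names are made up for illustration, and only the first alternative of the pattern is used.

```java
import java.util.regex.Pattern;

final class FaultCodePatternDemo {
    // "\\." in a Java string literal is the regex \. and matches only a literal dot.
    static final Pattern ESCAPED = Pattern.compile("366\\.\\d+");
    // An unescaped "." matches any single character.
    static final Pattern UNESCAPED = Pattern.compile("366.\\d+");

    static boolean matchesEscaped(String code) {
        return ESCAPED.matcher(code).matches();   // whole-string match
    }

    static boolean matchesUnescaped(String code) {
        return UNESCAPED.matcher(code).matches(); // whole-string match
    }

    public static void main(String[] args) {
        for (String code : new String[] {"366.12", "366012", "366x12"}) {
            System.out.println(code
                    + "  escaped=" + matchesEscaped(code)
                    + "  unescaped=" + matchesUnescaped(code));
        }
    }
}
```

With the unescaped dot, strings such as "366012" slip through; with the escaped dot they don't.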

> 3) Regarding the four activations with SiteVisits.  I believe this is
> desired as the second site visit may or may not be the same visit as the
> first.

[Hmm, words like "first" and "second" imply two distinct events, so they can't
be "the same". You may have a mess in your data so that there could be two
reports (!) of the *same* visit.]

>  We want to capture both scenarios.  This is also potentially a
> complex data cleansing issue.  I don't have many options here AFAICT.

I'd say it's a "get-your-specs-cleansed" issue. Consider the rule
(after eliminating the useless binding of Inspector and some
simplifications):
when
      $sv1 : SiteVisit( yearRecorded == durationCycleYear, !annualVisit )
      FaultCode( svID == $sv1.ID, code matches "366.12" )
      $sv2 : SiteVisit( yearRecorded == durationCycleYear, !annualVisit,
                        inspectorID == $sv1.inspectorID )   // ensure same Inspector
      FaultCode( svID == $sv2.ID, code matches "45.61" )
then

Now, not knowing how SiteVisit and FaultCode can be related, I can
only guess that possible data could be:

(1)
SiteVisit( 2013, "13/42", false )  // year, ID, annualVisit
  FaultCode( 366.12, "13/42" )
SiteVisit( 2013, "13/55", false )
  FaultCode( 45.61, "13/55" )

Two distinct visits, fires once.

(2)
SiteVisit( 2013, "13/42", false )
  FaultCode( 366.12, "13/42" )
  FaultCode(  45.61, "13/42" )

There is only one visit, but fires once nonetheless.

(3)
SiteVisit( 2013, "13/42", false )
  FaultCode( 366.12, "13/42" )
  FaultCode(  45.61, "13/42" )

SiteVisit( 2013, "13/55", false )
  FaultCode( 366.12, "13/55" )
  FaultCode(  45.61, "13/55" )

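Two visits, fires twice if the inspectors differ, four times if they
are the same (each visit then also pairs with the other one).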

It may be that a SiteVisit can never have more than one FaultCode,
but the data model (AFAIK) suggests otherwise, and if that is so, you
should be prepared for every possible situation. I can't imagine
handling all three situations adequately with this single rule.
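
One way to check such counts without running the engine is to enumerate the join by hand. The Java sketch below is not Drools code. It assumes every SiteVisit already passes the yearRecorded and annualVisit constraints, compares codes with plain equality instead of a regex, and uses invented class, field and inspector names. Note that for (3) the result depends on whether the two visits share an inspector, which the sample data doesn't show.

```java
import java.util.Arrays;
import java.util.List;

final class ActivationCountSketch {

    static final class SiteVisit {
        final String id;
        final String inspectorId;   // hypothetical: not part of the sample data
        SiteVisit(String id, String inspectorId) {
            this.id = id;
            this.inspectorId = inspectorId;
        }
    }

    static final class FaultCode {
        final String code;
        final String svId;
        FaultCode(String code, String svId) {
            this.code = code;
            this.svId = svId;
        }
    }

    // One activation per distinct tuple (sv1, f1, sv2, f2) satisfying the four patterns.
    static int countActivations(List<SiteVisit> visits, List<FaultCode> faults) {
        int activations = 0;
        for (SiteVisit sv1 : visits) {
            for (FaultCode f1 : faults) {
                if (!f1.svId.equals(sv1.id) || !f1.code.equals("366.12")) continue;
                for (SiteVisit sv2 : visits) {   // sv2 may be the same fact as sv1
                    if (!sv2.inspectorId.equals(sv1.inspectorId)) continue;
                    for (FaultCode f2 : faults) {
                        if (f2.svId.equals(sv2.id) && f2.code.equals("45.61")) activations++;
                    }
                }
            }
        }
        return activations;
    }

    // (1) two visits, one code each
    static int scenario1() {
        return countActivations(
                Arrays.asList(new SiteVisit("13/42", "A"), new SiteVisit("13/55", "A")),
                Arrays.asList(new FaultCode("366.12", "13/42"), new FaultCode("45.61", "13/55")));
    }

    // (2) one visit carrying both codes
    static int scenario2() {
        return countActivations(
                Arrays.asList(new SiteVisit("13/42", "A")),
                Arrays.asList(new FaultCode("366.12", "13/42"), new FaultCode("45.61", "13/42")));
    }

    // (3) two visits, each carrying both codes; inspectors are passed in
    static int scenario3(String inspector42, String inspector55) {
        return countActivations(
                Arrays.asList(new SiteVisit("13/42", inspector42), new SiteVisit("13/55", inspector55)),
                Arrays.asList(new FaultCode("366.12", "13/42"), new FaultCode("45.61", "13/42"),
                              new FaultCode("366.12", "13/55"), new FaultCode("45.61", "13/55")));
    }

    public static void main(String[] args) {
        System.out.println("(1): " + scenario1());
        System.out.println("(2): " + scenario2());
        System.out.println("(3), different inspectors: " + scenario3("A", "B"));
        System.out.println("(3), same inspector: " + scenario3("A", "A"));
    }
}
```

Because nothing in the rule forces $sv1 and $sv2 to be different facts, a single visit can fill both roles, which is why (2) still produces an activation.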

> 4) Finding the same inspector, good point.  This is a data cleansing issue.
>  I'll think about options upstream.

No, it is definitely not "data cleansing" - it is an issue of writing
rules well. (Sorry for the harsh wording.)

-W

