[rules-users] simple rule takes long time

Scott Reed sreed at avacoda.com
Mon Jun 30 10:47:52 EDT 2008


1) Finding all the duplicate pairs is causing a combinatorial explosion. 
With 30000 objects the rule compares every object against every 
other object, which means evaluating the rule about 450 MILLION times 
(30000 * 29999 / 2 pairs). If you must discover every pair that is a 
duplicate, then I think you have a hard problem. However, if you can 
stop considering an object once you see it is a duplicate of at least 
one other datum, then you can add a field to your object to record that 
it is already a duplicate and remove it from further consideration, 
something like the following:

public class Data {
    private int id =0;
    private boolean isDuplicated = false;
    public Data(int id) {
        this.id = id;
    }
    public int getId() {
        return id;
    }
    public boolean getIsDuplicated() {
       return isDuplicated;
    }
    public void setIsDuplicated( boolean isDuplicated ) {
       this.isDuplicated = isDuplicated;
    }
}

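rule "Unique data"
    when
        data : Data()
        old : Data( isDuplicated == false, this != data, this.id == data.id )
    then
        log.log("Following data are not unique: " + data.getId() + " and " + old.getId());
        // Assumption: the engine has to be told about the change (update() in
        // Drools 4.x) before it re-checks the isDuplicated constraint.
        data.setIsDuplicated( true );
        old.setIsDuplicated( true );
end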
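
For reference, the 450 million figure is just the number of unordered pairs among 30000 objects. A minimal sketch of the arithmetic in plain Java (PairCount and unorderedPairs are names made up for this illustration, not part of Drools):

```java
// Hypothetical helper, not part of the original post or of Drools.
class PairCount {
    // Number of unordered pairs among n objects: n * (n - 1) / 2.
    static long unorderedPairs(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        long n = 30000;
        // Unordered pairs: each pair of objects counted once.
        System.out.println(unorderedPairs(n));      // 449985000
        // Ordered pairs: (a, b) and (b, a) counted separately.
        System.out.println(2 * unorderedPairs(n));  // 899970000
    }
}
```

The ordered count is shown because a rule with two symmetric patterns can match the same two objects in both orders.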

--------------------
2) Keep the test this != $data1. By default Drools lets the same fact 
match both patterns, so without it every object would pair with itself 
(the test is only redundant if the rule base is configured to remove 
identities). Tagging the id field (binding it to $id1) will make the 
rule a little more efficient and easier to read. There's no need to 
write 'this.id'; it's understood that 'id' is the id field of the 
object being matched.

rule "Unique data"
    when
        $data1: Data( $id1: id )
        $data2: Data( isDuplicated == false, this != $data1, id == $id1 )
    then
        log.log( "Duplicated id: " + $id1 );
        $data1.setIsDuplicated( true );
        $data2.setIsDuplicated( true );
end

( I like to use dollar signs at the front of tag names because it makes 
it easier to distinguish what is what. You don't have to. )
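
To sanity-check which objects the rule should end up flagging, a small plain-Java cross-check can help. This is only a sketch and it is not Drools: it swaps the pairwise comparison for a HashMap lookup by id, and DuplicateCheck and flagDuplicates are names invented for the illustration. The nested Data class mirrors the one above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java cross-check (not Drools): flags every Data whose id occurs more than once.
class DuplicateCheck {
    // Same shape as the Data class in this post.
    static class Data {
        private final int id;
        private boolean isDuplicated = false;
        Data(int id) { this.id = id; }
        int getId() { return id; }
        boolean getIsDuplicated() { return isDuplicated; }
        void setIsDuplicated(boolean isDuplicated) { this.isDuplicated = isDuplicated; }
    }

    // Hypothetical helper: a single pass that remembers the first object seen
    // for each id and flags both objects whenever an id shows up again.
    static void flagDuplicates(List<Data> all) {
        Map<Integer, Data> firstSeen = new HashMap<>();
        for (Data d : all) {
            Data earlier = firstSeen.putIfAbsent(d.getId(), d);
            if (earlier != null) {
                earlier.setIsDuplicated(true);
                d.setIsDuplicated(true);
            }
        }
    }

    public static void main(String[] args) {
        List<Data> all = new ArrayList<>();
        for (int id : new int[] {1, 2, 2, 3}) {
            all.add(new Data(id));
        }
        flagDuplicates(all);
        for (Data d : all) {
            System.out.println(d.getId() + " duplicated=" + d.getIsDuplicated());
        }
    }
}
```

On a small sample, the objects flagged here should be the same ones the rule marks as duplicated.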


ygaurav wrote:
> Hi All 
>
> I am new to drools and I am trying to see if we can use it. I have a simple
> file
>
> public class Data {
> 	private int id =0;
> 	public Data(int id) {
> 		this.id = id;
> 	}
> 	public int getId() {
> 		return id;
> 	}
> }
>
> rule "Unique data"
>     when
>         data : Data()
>         old : Data(this != data, this.id == data.id)
>     then 
>         log.log("Following data are not unique: " + data.getId() + " and " +
> old.getId());
> end
>
>
> When I try to load 30,000 of data in memory it takes long time ( around 12
> hours )  Can anybody suggests a better way of doing it. 
>
>
> Thanks
> Gaurav
>   



