I forgot the updates and added a couple of $'s (see below).
Scott Reed wrote:
1) Finding all the duplicate pairs is causing combinatorial
explosion.
With 30000 objects, the rule compares every object against every
other object, which requires running the test about 450 MILLION times.
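For reference, the 450 million figure matches the number of unordered pairs among 30000 objects, n * (n - 1) / 2. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration and are not part of Drools:

```java
class PairCount {
    // Number of distinct unordered pairs that can be formed from n objects.
    // Uses long because the product overflows int for n around 30000 and up.
    static long unorderedPairs(long n) {
        return n * (n - 1) / 2;
    }
}
```

Calling PairCount.unorderedPairs(30000) gives 449985000, which is where "about 450 million" comes from.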
If you must discover every pair that is a duplicate, then I think you
have a hard problem. However, if you can stop considering an object
once you see it is a duplicate of at least one other datum, then you
can add a field to your object to record that it is already a
duplicate and remove it from further consideration, something like the
following:
public class Data {
    private int id = 0;
    private boolean isDuplicated = false;

    public Data(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }

    public boolean getIsDuplicated() {
        return isDuplicated;
    }

    public void setIsDuplicated( boolean isDuplicated ) {
        this.isDuplicated = isDuplicated;
    }
}
rule "Unique data"
when
    data : Data()
    old : Data( isDuplicated == false, this != data, this.id == data.id )
then
    log.log("Following data are not unique: " + data.getId() + " and " + old.getId());
    // Corrected: the tags here have no '$' prefix, and update() lets the
    // engine see the changed flag.
    data.setIsDuplicated( true );
    update(data);
    old.setIsDuplicated( true );
    update(old);
end
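To make the flag-and-skip idea concrete outside the rule engine, here is a rough plain-Java analogy. It is only a sketch: it does not model how Drools actually evaluates patterns, and the class name DuplicateFlagSketch and the method flagDuplicates are hypothetical. It compares each unflagged object with the unflagged objects after it and counts how many comparisons were made.

```java
import java.util.List;

class DuplicateFlagSketch {
    // Same shape as the Data class above, nested here so the sketch is self-contained.
    static class Data {
        private final int id;
        private boolean isDuplicated = false;

        Data(int id) { this.id = id; }
        int getId() { return id; }
        boolean getIsDuplicated() { return isDuplicated; }
        void setIsDuplicated(boolean isDuplicated) { this.isDuplicated = isDuplicated; }
    }

    // Flags every object whose id occurs more than once and returns the
    // number of id comparisons performed. Objects already flagged are
    // skipped, which is the saving the isDuplicated field is meant to give.
    static long flagDuplicates(List<Data> items) {
        long comparisons = 0;
        for (int i = 0; i < items.size(); i++) {
            Data data = items.get(i);
            if (data.getIsDuplicated()) {
                continue; // already known to be a duplicate
            }
            for (int j = i + 1; j < items.size(); j++) {
                Data old = items.get(j);
                if (old.getIsDuplicated()) {
                    continue; // removed from further consideration
                }
                comparisons++;
                if (old.getId() == data.getId()) {
                    data.setIsDuplicated(true);
                    old.setIsDuplicated(true);
                }
            }
        }
        return comparisons;
    }
}
```

Note that the saving depends on how many duplicates there are. If nearly every id is unique, almost nothing gets flagged and the number of comparisons stays close to the full pairwise count.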
--------------------
2) I'm pretty sure the test, this != data, is redundant since it's
understood that the condition is relating two *different* objects of
the same class. Tagging the id fields will make the rule a little more
efficient and easier to read. There's no need to use 'this.id'; it's
understood that 'id' is the id field of old.
rule "Unique data"
when
    $data1: Data( $id1: id )
    $data2: Data( isDuplicated == false, id == $id1 )
then
    log.log( "Duplicated id: " + $id1 );
    $data1.setIsDuplicated( true );
    update( $data1 );
    $data2.setIsDuplicated( true );
    update( $data2 );
end
( I like to use dollar signs at the front of tag names because it
makes it easier to distinguish what is what. You don't have to. )
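If you want something to check the rule's output against, a plain-Java pass over the ids is one option. To be clear, this is a different technique from the rules above (counting ids in a hash map rather than comparing pairs), and the names DuplicateIdCheck and duplicatedIds are hypothetical. A minimal sketch, assuming the ids have already been collected into a list:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateIdCheck {
    // Returns the ids that occur more than once, in ascending order.
    // Each id is visited once, so no pairwise comparison is needed.
    static List<Integer> duplicatedIds(List<Integer> ids) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int id : ids) {
            counts.merge(id, 1, Integer::sum); // count occurrences per id
        }
        List<Integer> duplicated = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicated.add(entry.getKey());
            }
        }
        Collections.sort(duplicated);
        return duplicated;
    }
}
```

The ids it reports should be the same ones the rule logs as duplicated, which makes it handy for a quick sanity check on a small data set.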
ygaurav wrote:
> Hi All
> I am new to drools and I am trying to see if we can use it. I have a
> simple file
>
> public class Data {
> private int id =0;
> public Data(int id) {
> this.id = id;
> }
> public int getId() {
> return id;
> }
> }
>
> rule "Unique data"
> when
> data : Data()
> old : Data(this != data, this.id == data.id)
> then
> log.log("Following data are not unique: " + data.getId() + " and " + old.getId());
> end
>
>
> When I try to load 30,000 of data in memory it takes a long time
> (around 12 hours). Can anybody suggest a better way of doing it?
>
> Thanks
> Gaurav
>
_______________________________________________
rules-users mailing list
rules-users(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users