I'm (very) new to Drools and am trying to determine the best way to use it for our
scenario.
I have a domain model that has millions of object instances representing Documents, People
and Incidents.
- Incidents are associated with People (1-100+ per person)
- Documents are associated with Incidents (1-50+ per incident)
- Documents are also associated directly with People (1-100+)
- Typical number of documents directly or indirectly associated with
a person is 20-40.
- An incident has a start and end date
- The documents associated with the incident will be created
approximately within the time bounds of the incident
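To make the shape of the model concrete, here is a rough sketch in plain Java. All class, field and method names are made up for illustration and are not our real domain classes; it only shows the associations listed above.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes; names and fields are assumptions for illustration.
class Document {
    final String id;
    final LocalDate created;

    Document(String id, LocalDate created) {
        this.id = id;
        this.created = created;
    }
}

class Incident {
    final String id;
    final LocalDate start;
    final LocalDate end;
    // Documents attached to this incident (1-50+ in practice).
    final List<Document> documents = new ArrayList<>();

    Incident(String id, LocalDate start, LocalDate end) {
        this.id = id;
        this.start = start;
        this.end = end;
    }
}

class Person {
    final String id;
    final List<Incident> incidents = new ArrayList<>();
    // Documents attached directly to the person rather than via an incident.
    final List<Document> documents = new ArrayList<>();

    Person(String id) {
        this.id = id;
    }

    // All documents reachable from this person, directly or through an incident.
    List<Document> allDocuments() {
        List<Document> all = new ArrayList<>(documents);
        for (Incident incident : incidents) {
            all.addAll(incident.documents);
        }
        return all;
    }
}
```

The `allDocuments()` helper corresponds to the "directly or indirectly associated" set, which is typically 20-40 documents per person.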
I need a workflow, of which there will be thousands of instances, that waits for
certain documents to be associated with an Incident (or Person) and then continues with
a user interaction before terminating. An instance may have to exist for days or weeks.
The simple solution is to have a single session that contains all the documents, incidents
and people and to which new documents, incidents and people are added as they arrive at
the system. Then I can have a workflow that runs with rules over those facts that will
wait until all the necessary documents are available and continue on with the human
interaction part.
Of course this won't work. There are too many facts and most of them are not relevant
to any particular workflow instance anyway.
So, I'm thinking of doing this with a session per workflow instance, where the session
is populated with the relevant documents on startup and has any new relevant documents
added when they arrive (using a JMS event pipeline).
I can easily filter the event stream to just insert new facts into the session's
working memory for documents that are associated with the person and I can pre-populate
the session's working memory with a selection of existing documents before starting
the workflow. A typical number of documents required for a workflow instance is in the
order of 6-50.
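As a minimal sketch of the filtering I have in mind (plain Java, no Drools or JMS dependencies; `DocumentEvent`, `DocumentEventRouter` and their members are hypothetical names, not an existing API), each workflow instance registers a sink for the person it is watching, and incoming events are handed only to the matching sinks:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical event: a new document has arrived for a person.
final class DocumentEvent {
    final String documentId;
    final String personId;

    DocumentEvent(String documentId, String personId) {
        this.documentId = documentId;
        this.personId = personId;
    }
}

// Routes each incoming event only to the workflow instances watching that person.
final class DocumentEventRouter {
    private final Map<String, List<Consumer<DocumentEvent>>> sinksByPerson =
            new ConcurrentHashMap<>();

    // Called when a workflow instance starts. In the real system the sink
    // would presumably insert the fact into that instance's session.
    void register(String personId, Consumer<DocumentEvent> sink) {
        sinksByPerson
                .computeIfAbsent(personId, key -> new CopyOnWriteArrayList<>())
                .add(sink);
    }

    // Called when a workflow instance completes and its session is deleted.
    void unregister(String personId, Consumer<DocumentEvent> sink) {
        List<Consumer<DocumentEvent>> sinks = sinksByPerson.get(personId);
        if (sinks != null) {
            sinks.remove(sink);
        }
    }

    // Delivers the event to every registered sink for its person and
    // returns how many sinks received it (0 means nobody is waiting).
    int dispatch(DocumentEvent event) {
        List<Consumer<DocumentEvent>> sinks =
                sinksByPerson.getOrDefault(event.personId, Collections.emptyList());
        for (Consumer<DocumentEvent> sink : sinks) {
            sink.accept(event);
        }
        return sinks.size();
    }
}
```

The JMS listener would call `dispatch` for each message, so events for people with no active workflow are dropped without touching any session.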
So I would create a knowledge base preloaded with the workflow and rules, and create a new
session each time I need to run the workflow. At any one time I could have thousands of
sessions. I would delete each session (and reclaim any associated resources) when its
workflow completes, so I wouldn't need to clean up individual facts. The sessions would
need to be stateful and persistent (to a database).
Does this make sense or is there a better way to approach this? Will Drools scale OK when
used like this?
thanks,
Brian Wallis
InfoMedix | Architect
p: 3 8615 4553 | f: 3 8615 4501
e: brian.wallis(a)infomedix.com.au
Level 5, 451 Little Bourke Street, Melbourne VIC 3000