[jbpm-dev] [Design of JBoss jBPM] - Re: Process/task search based on variables

Fri Oct 3 18:02:08 EDT 2008

anonymous wrote : Performant? How performant? Can you be concrete? (Total number of process instances/task instances/new task instances per day/number of concurrent users...)
Well, certainly performant enough for anything that I would consider a typical task management system.  Consider this simple query:
--TASK AGING
select distinct
    pd.name_ as process,
    ti.id_ as task_id,
    tsk.name_ as task,
    ti.actorid_ as assignee,
    ti.create_ as create_date,
    ti.duedate_ as due_date
from
    jbpm_taskinstance ti,
    jbpm_task tsk,
    jbpm_token t,
    jbpm_processinstance pi,
    jbpm_processdefinition pd
where
    ti.end_ is null
    and ti.task_ = tsk.id_
    and ti.token_ = t.id_
    and pi.id_ = t.processinstance_
    and pd.id_ = pi.processdefinition_
;
This is a very simple query, and limiting the results to tasks assigned to a particular actor, or to tasks from a particular processInstance, is obviously trivial.  There are only four inner joins, all on columns that can (or should) be indexed.  Add one more inner join to a "Business Keys" table, and you can return all tasks based on any of your business keys.
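
As a rough sketch of that extra join, the snippet below runs the query above against an in-memory SQLite database with one more table joined in. The business_key table, its columns (processinstance_, key_name, key_value), its index, and the sample rows are all assumptions made up for illustration; they are not part of the jBPM schema, and the jBPM tables are cut down to the columns the query touches.

```python
import sqlite3

# Cut-down stand-ins for the jBPM tables used in the query above, plus a
# hypothetical business_key table (name and columns are assumptions).
SCHEMA = """
create table jbpm_processdefinition (id_ integer primary key, name_ text);
create table jbpm_processinstance (id_ integer primary key, processdefinition_ integer);
create table jbpm_token (id_ integer primary key, processinstance_ integer);
create table jbpm_task (id_ integer primary key, name_ text);
create table jbpm_taskinstance (
    id_ integer primary key, task_ integer, token_ integer,
    actorid_ text, create_ text, duedate_ text, end_ text);
create table business_key (
    processinstance_ integer, key_name text, key_value text);
create index idx_bk_lookup on business_key (key_name, key_value);
"""

# Same four joins as the task aging query, plus one join to business_key.
OPEN_TASKS_BY_KEY = """
select distinct
    pd.name_ as process,
    ti.id_ as task_id,
    tsk.name_ as task,
    ti.actorid_ as assignee
from
    jbpm_taskinstance ti,
    jbpm_task tsk,
    jbpm_token t,
    jbpm_processinstance pi,
    jbpm_processdefinition pd,
    business_key bk
where
    ti.end_ is null
    and ti.task_ = tsk.id_
    and ti.token_ = t.id_
    and pi.id_ = t.processinstance_
    and pd.id_ = pi.processdefinition_
    and bk.processinstance_ = pi.id_
    and bk.key_name = ?
    and bk.key_value = ?
order by ti.id_
"""


def open_tasks_by_key(conn, key_name, key_value):
    """Return open tasks whose process instance carries the given business key."""
    return conn.execute(OPEN_TASKS_BY_KEY, (key_name, key_value)).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Invented sample data: two process instances, one finished task.
    conn.executescript("""
    insert into jbpm_processdefinition values (1, 'order fulfillment');
    insert into jbpm_processinstance values (10, 1), (11, 1);
    insert into jbpm_token values (100, 10), (101, 11);
    insert into jbpm_task values (5, 'verify address');
    insert into jbpm_taskinstance values
        (1000, 5, 100, 'alice', '2008-10-01', '2008-10-05', null),
        (1001, 5, 101, 'bob', '2008-10-01', '2008-10-05', null),
        (1002, 5, 100, 'alice', '2008-09-01', '2008-09-05', '2008-09-02');
    insert into business_key values
        (10, 'order_number', 'A-1'),
        (11, 'order_number', 'B-2');
    """)
    return open_tasks_by_key(conn, "order_number", "A-1")
```

Filtering by assignee instead would just be one more predicate on ti.actorid_ in the where clause.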

If a query this simple is a performance concern, it is because you're truly dealing with many millions of rows or you're returning too many rows at once.  I haven't seen a use case for a task management system that would really have many millions of rows, and data returned to a UI from the database should be paginated as part of the query.  A user can't visually process more than 20 or 30 rows at a time, so don't force the user or the UI to handle much more than that.
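
To illustrate paginating as part of the query, here is a minimal sketch that pushes the page window into the SQL itself. It uses SQLite's limit/offset syntax, which is an assumption about the database: other databases spell this differently. The fetch_page and build_demo_db helpers, the index, and the sample data are hypothetical, and the table is cut down to the columns needed.

```python
import sqlite3

# Open tasks for one actor, one page at a time. The ordering includes id_
# as a tie-breaker so that pages are stable between requests.
PAGE_QUERY = """
select ti.id_, ti.actorid_, ti.duedate_
from jbpm_taskinstance ti
where ti.end_ is null
  and ti.actorid_ = ?
order by ti.duedate_, ti.id_
limit ? offset ?
"""


def fetch_page(conn, actor_id, page, page_size=10):
    """Return one page (0-based) of open tasks assigned to actor_id."""
    params = (actor_id, page_size, page * page_size)
    return conn.execute(PAGE_QUERY, params).fetchall()


def build_demo_db(task_count=25):
    """Create an in-memory table with invented open tasks for 'alice'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table jbpm_taskinstance ("
        "id_ integer primary key, actorid_ text, duedate_ text, end_ text)"
    )
    conn.executemany(
        "insert into jbpm_taskinstance values (?, 'alice', ?, null)",
        [(i, "2008-10-%02d" % (i + 1)) for i in range(1, task_count + 1)],
    )
    # Index covering the filter and sort columns (an illustrative choice).
    conn.execute(
        "create index idx_ti_actor on jbpm_taskinstance (actorid_, end_, duedate_)"
    )
    return conn
```

With this shape, the UI only ever asks for the rows it is about to display, for example fetch_page(conn, "alice", 0) for the first ten and fetch_page(conn, "alice", 1) for the next ten.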

I don't really know how many "transactions" we've got going on in the course of a day, but I do know that with the simple data relationships in jBPM and our business key table, and with the relatively small data sets we're consuming, any performance degradation will be the fault of our own business/UI tier design.  For a modern database, stored row count won't have a practical impact.

Our current application is a Seam/JPA app.  We have about 70 group queues with maybe 300 users.  We have a particular view in which users can see a series of tabs, one for each group queue they belong to or manage.  They can select one of the tabs and will see the tasks in that queue.  When a manager selects a group queue to view, the body of the tab presents two lists: the first 10 assigned tasks and the first 10 unassigned tasks in that queue (each list is sorted by task priority and due date).  If necessary, they can page through the tasks 10 at a time (going back to the database each time).  Each row shows the relevant task data from the task table in addition to customer name, account number, order number, and any associated jeopardy data.

For myself, when I look at this screen (being a universal manager in the system) I see all 70 group queues listed in 70 tabs down the left, each displaying how many unassigned tasks are in that queue.  Clicking on a queue to view its tasks (the two tables of assigned and unassigned tasks) takes right around 2 seconds regardless of how many underlying tasks there are in the queue.

The view of an individual's personal queue (all tasks assigned to that actor) takes right around a second to present.

Of course any "report view" of the data is exactly that--a report--and the user can be expected to wait several seconds.

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4180373#4180373
