[jbpm-dev] Help with ht problem? (Re: BZ 853019)

Maciej Swiderski mswiders at redhat.com
Thu Sep 13 07:18:15 EDT 2012


Great, now I fully understand the behavior and consequences of changes 
we discussed. Thanks a lot Marco!!!

In fact, the idea behind putting the transaction boundary inside the session 
was exactly the same: to enclose TaskServerHandler operations in a transaction. 
But I went too far with the simplifications (without considering LocalTaskService).

Looking forward to the wiki :)

P.S.
Great discussion btw, I hope more people will get involved soon (as Eric 
did).

Maciej

On 13.09.2012 12:44, Marco Rietveld wrote:
> Hi Maciej,
>
> Thanks for reading the whole explanation  (It probably could have been 
> a little shorter. :) )
>
> Remote users will not have the same problem as local users:
>
> 1. Local users are in the same JVM as the (Local)TaskService, which 
> means that they can still access (lazy-load) the properties.
> 2. Remote users will never be able to lazy-load properties: the Task 
> instance that they access has been serialized and the remote user is 
> also in a different JVM.
>
> Serialization of the Task object (which happens when the Task object 
> is sent from the server to the remote user) forces all of the (task) 
> properties to be accessed and, of course, serialized -- so that when 
> the remote user receives the Task instance, all properties are there.
>
> In short, the serialization required to send the Task instance to the 
> remote user forces a preload.
>
>
> You have a good point about having too (2) many tx's in one operation: 
> what we can do to avoid that is simply have the TaskServerHandler open 
> a tx before the (taskServiceSession) operation and close it after the 
> write. The tx logic that happens in the operation will then recognize 
> that there's an active tx and do nothing.
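
For readers following along, here is a minimal sketch of the idea above: the handler opens a tx, and the tx logic inside the operation sees the active tx and does nothing. All names (SketchTx, HandlerTxSketch, beginIfNone, commitIfOwner) are hypothetical stand-ins, not the jBPM, JTA or TaskServerHandler API, and rollback handling is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a transaction manager; not the jBPM or JTA API. */
class SketchTx {
    private boolean active = false;
    final List<String> log = new ArrayList<>();

    /** Begins a tx only if none is active; returns true if the caller now owns it. */
    boolean beginIfNone(String who) {
        if (active) {
            log.add(who + ": joined active tx");
            return false;
        }
        active = true;
        log.add(who + ": began tx");
        return true;
    }

    /** Commits only when the caller owns the tx; otherwise does nothing. */
    void commitIfOwner(boolean owner, String who) {
        if (owner) {
            active = false;
            log.add(who + ": committed tx");
        }
    }

    boolean isActive() {
        return active;
    }
}

public class HandlerTxSketch {
    /** Inner operation that carries its own tx logic, as a session operation might. */
    static void operation(SketchTx tx) {
        boolean owner = tx.beginIfNone("operation");
        tx.log.add("operation: work");
        tx.commitIfOwner(owner, "operation");
    }

    /** The handler wraps the operation and the write in one outer tx. */
    static List<String> handle(SketchTx tx) {
        boolean owner = tx.beginIfNone("handler");
        operation(tx);                          // joins the handler's tx
        tx.log.add("handler: session.write");   // the write happens inside the tx
        tx.commitIfOwner(owner, "handler");     // a real version would roll back on failure
        return tx.log;
    }

    public static void main(String[] args) {
        handle(new SketchTx()).forEach(System.out::println);
    }
}
```

The point of the sketch is that only one tx is begun and committed per request, even though both the handler and the operation contain tx logic.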
>
> Of course, this is /yet another/ band-aid on the human-task structure, 
> but it's the best one -- any other changes would impact the 
> LocalTaskService as we've discussed.
>
>
> One thing I'm glad about is that it seems that I've been able to 
> communicate just how painful the current human-task code is for me -- 
> and why we need to change it.
>
> I have a bunch of code and text that shows my ideas about how 
> Human-Task should work -- I'll make sure to push that to a git 
> repository/wiki as soon as possible so that other people can 
> contribute and advise.
>
>
> Some of the problems/issues that I fix or try to fix are:
>
> 1. non-normalized data model
> 2. "Ingrown" API -- problem domain was taken as the solution domain
> 3. transformation/business logic is not centralized
> 4. the human-task thread structure is not at all enterprise friendly
> 5. badly defined API
> 6. Unnecessary use of a notification architecture for logging 
> (Human-task events are logs, not events).
>
>
> Thanks,
> Marco
>
>
> 13-09-12 12:27, Maciej Swiderski:
>> Thanks Marco for the explanation. I would say that making 
>> LocalTaskService transactional on every operation is the right way to 
>> do it; that will ensure we are consistent for all cases (regardless 
>> of remote or local).
>>
>> As for lazy-loaded properties of a task, we will have that problem 
>> for both remote and local, don't you think? Even if session.write is 
>> transactional users on the other side of the wire won't be able to 
>> access properties that are lazy loaded (difference could be that 
>> exception will not be thrown but null/empty list will be returned) - 
>> assuming we are not going to preload everything in advance before 
>> writing to the session.
>>
>> All comes down to the issue we expose entities to the outside world, 
>> so to say.
>>
>> I agree that making session.write transactional will resolve the issue 
>> we currently have with remote task services, but it could in fact 
>> slightly affect performance, as in some cases it will mean two 
>> transactions for one operation, correct?
>>
>> Thanks
>> Maciej
>>
>> On 13.09.2012 09:57, Marco Rietveld wrote:
>>> Hi Maciej,
>>>
>>> Just so we're on the same page, (and for clarification to others 
>>> reading along), this is what we're talking about (I think  :) ):
>>>
>>> 1. Changing the TaskServiceSession so that instantiation starts a 
>>> transaction and disposal ends the transaction. (Currently, tx's in 
>>> human-task are started at various different points depending on the 
>>> operation requested).
>>> 2. Changing the LocalTaskService so that a TaskServiceSession is 
>>> instantiated and disposed with every operation.
>>>
>>>
>>> The main reason to do 2 is because otherwise, programs that are 
>>> already written that use the LocalTaskService might break. At this 
>>> point, users currently using the LocalTaskService expect that the 
>>> transaction (whether it's a local or JTA tx) will be ended by the 
>>> LocalTaskService at the end of an operation.
>>>
>>> If we only do 1 (change tx behaviour) but not 2, then a tx will be 
>>> opened when the LocalTaskService is initiated and a tx will only be 
>>> closed when the LocalTaskService is closed. (All the tx logic 
>>> in between will not fire, because the tx mgr will see that there's an 
>>> active tx and not do anything to modify the status of the active tx.).
>>>
>>> Except, for JTA tx's -- and probably also for Spring tx's -- this 
>>> isn't true. Something else that the user is doing could then end 
>>> those tx's, and that would break the LocalTaskService instance 
>>> (which expects to be able to close a tx when it disposes -- but 
>>> can't, because the user already has. ) True, in this situation 
>>> everything would work (because of the inner tx logic) until the 
>>> LocalTaskService was disposed.
>>>
>>> Besides the technical consideration above, there's also the fact 
>>> that users now expect the behaviour of the LocalTaskService to be 
>>> transactional. That means that if they're using the 
>>> LocalTaskService, and an exception is thrown halfway through, that 
>>> the things that have been already done using the LocalTaskService 
>>> will .. well, be done.
>>>
>>> If we don't do 2 (ensure similar tx behaviour), then the following 
>>> situation can occur, and users will definitely be angry about this:
>>>
>>> 1. User initializes LocalTaskService
>>> 2. User starts process (whereby 5 tasks are created).
>>> 3. User completes task 1 (of 5) via LocalTaskService
>>> 4. User completes task 2 (of 5) via LocalTaskService
>>> 5. Exception is thrown by something, and we exit the stack.
>>>
>>> If we don't ensure similar tx behavior, then a. none of the 5 
>>> tasks will have been saved and b. 2 of the 5 tasks (which won't even 
>>> exist) won't have the status "Completed".
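
A toy sketch of the scenario above, contrasting one tx per operation with one tx for the whole lifetime of the service. The Store class and the run method are hypothetical stand-ins written for illustration; they are not LocalTaskService code and only model "changes become durable on commit".

```java
import java.util.ArrayList;
import java.util.List;

public class TxScopeSketch {
    /** Toy store: pending changes become durable only on commit. */
    static class Store {
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        void change(String c) { pending.add(c); }
        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
    }

    /** Runs the scenario and returns what ended up durable. */
    static List<String> run(boolean txPerOperation) {
        Store store = new Store();
        String[] ops = {"create 5 tasks", "complete task 1", "complete task 2"};
        try {
            for (String op : ops) {
                store.change(op);
                if (txPerOperation) {
                    store.commit();   // each operation ends its own tx
                }
            }
            // Step 5: something fails before the service is disposed,
            // so a commit at dispose time is never reached.
            throw new IllegalStateException("failure in user code");
        } catch (IllegalStateException e) {
            store.rollback();         // only uncommitted work is lost
        }
        return store.committed;
    }

    public static void main(String[] args) {
        System.out.println("tx per service lifetime: " + run(false));
        System.out.println("tx per operation:        " + run(true));
    }
}
```

With a lifetime-scoped tx nothing survives the exception; with a tx per operation the three completed operations do.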
>>>
>>>
>>> --------------------
>>>
>>> On another note, I'm realizing that what I'm proposing above is not 
>>> something we can do anyways.
>>>
>>> The problem, of course, comes back to the fact that the API/DTO 
>>> object is our entity. That means that if we go through with the 1,2 
>>> (tx by init/dispose and new TaskServiceSession per op), then we can 
>>> have the following:
>>>
>>> 1. LocalTaskService initiated, etc..
>>> 2. User calls LocalTaskService and gets a Task object back.
>>>    - which means: a. entitymanager opened, b. tx opened, c. retrieve 
>>> task d. tx closed e. em closed.
>>> 3. User does something else with LocalTaskService
>>>   - which means.. (see above)
>>> 4. User tries to access something in the Task object -- but 
>>> something in a (lazily-loaded) collection that of course hasn't been 
>>> loaded yet.
>>> 5. Proxy instance of collection element tries to retrieve the 
>>> element using the em.. that was closed in 2e.
>>> 6. "Boom!" as they say, or in other words, exception and User 
>>> doesn't understand wtf is going on.
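
The failure in steps 1-6 can be imitated in plain Java without Hibernate. This is only a sketch: Em, Task, getComments and getTask are hypothetical stand-ins for an entity manager, the Task entity, a lazily-loaded collection and one service operation. The forceLoad flag imitates Option 2 (touching the collection before the em is closed).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class LazyLoadSketch {
    /** Stand-in for an entity manager: usable only while open. */
    static class Em {
        boolean open = true;

        List<String> loadComments() {
            if (!open) {
                throw new IllegalStateException("entity manager is closed");
            }
            return Arrays.asList("comment-1", "comment-2");
        }
    }

    /** Stand-in for a Task whose collection is loaded on first access. */
    static class Task {
        private final Supplier<List<String>> loader;
        private List<String> comments;

        Task(Supplier<List<String>> loader) { this.loader = loader; }

        List<String> getComments() {
            if (comments == null) {
                comments = loader.get();   // lazy load goes back to the em
            }
            return comments;
        }
    }

    /** One service operation: open em, fetch the task, close em. */
    static Task getTask(boolean forceLoad) {
        Em em = new Em();
        Task task = new Task(em::loadComments);
        if (forceLoad) {
            task.getComments().size();     // touch the collection while em is open
        }
        em.open = false;                   // step 2e: em closed
        return task;
    }

    public static void main(String[] args) {
        System.out.println(getTask(true).getComments());
        try {
            getTask(false).getComments();  // steps 4-6
        } catch (IllegalStateException e) {
            System.out.println("Boom: " + e.getMessage());
        }
    }
}
```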
>>>
>>> So it looks like we're  back to my original Option 2 or 3:
>>> 2. Run through object tree in order to force loading
>>> 3. tx around session.write().
>>>
>>> I'm favoring option 3, mostly because it's the least work and 
>>> probably the most robust. Obviously, neither option involves 
>>> changing the LocalTaskService.
>>>
>>>
>>> Thanks,
>>> Marco
>>>
>>>
>>> 13-09-12 09:15, Maciej Swiderski:
>>>> Marco, why do we need to do that? Can't we just use it as is, meaning 
>>>> that several operations will be included in the same transaction, like 
>>>> start, complete for example? Will this break on query level or ...
>>>> I am not sure how often it is used like that - two task service 
>>>> operations in a single task service session?
>>>>
>>>> I can see that being beneficial in some cases (like all or nothing) 
>>>> and in some cases not really welcome (inserting users/groups - one 
>>>> fails and rolls back all the others).
>>>>
>>>> Thanks
>>>> Maciej
>>>>
>>>> On 12.09.2012 23:56, Marco Rietveld wrote:
>>>>> Maciej,
>>>>>
>>>>> I was thinking about that -- but doing that breaks the 
>>>>> LocalTaskService (or otherwise, we have to rewrite 
>>>>> LocalTaskService so that it opens a new TaskServiceSession for 
>>>>> every operation, just the way the TaskServerHandler handles that).
>>>>>
>>>>> Actually, the more I think about that, the better it sounds. It 
>>>>> might impact the performance of LocalTaskService slightly, but it 
>>>>> will be worth it, I think.
>>>>>
>>>>> Thanks,
>>>>> Marco
>>>>>
>>>>> 12-09-12 17:16, Maciej Swiderski:
>>>>>> Marco, another way could be to ensure transaction is started when 
>>>>>> taskservicesession is created and closed (committed/rolled back) 
>>>>>> when taskservicesession is disposed, I did that for a fix on 
>>>>>> https://issues.jboss.org/browse/JBPM-3763 which is on postgresql 
>>>>>> and worked fine. So that way we ensure that session.write is in 
>>>>>> transaction as well. Of course not tested all possible cases but 
>>>>>> worked for main ones.
>>>>>>
>>>>>> Wdyt?
>>>>>>
>>>>>> Maciej
>>>>>>
>>>>>> On 12.09.2012 12:22, Marco Rietveld wrote:
>>>>>>> Hi Maciej and Mauricio,
>>>>>>>
>>>>>>> I'm struggling to find a good solution for a problem and was 
>>>>>>> hoping to get your advice about the following.
>>>>>>>
>>>>>>>
>>>>>>> The human-task service uses its entities as DTO's, namely the 
>>>>>>> Task class/instances.
>>>>>>>
>>>>>>> However, we use Hibernate, which uses lazy-loading, which means 
>>>>>>> that Hibernate substitutes proxy instances in collections until 
>>>>>>> the actual collection elements are needed.
>>>>>>>
>>>>>>> With Hibernate 3, we miraculously were able to avoid any large 
>>>>>>> problems. However, testing with EAP 6 has uncovered situations, 
>>>>>>> primarily with postgresql, in which this strategy (entity as 
>>>>>>> DTO) just won't work.
>>>>>>>
>>>>>>> The problem is that even if all the "persistence" work is done 
>>>>>>> in one tx, the collections are still lazily-loaded. That means 
>>>>>>> if a task service operation has to return a Task instance, that 
>>>>>>> the serialization of the Task object (when it's being sent) 
>>>>>>> triggers the loading of entities. Due to postgresql's Large 
>>>>>>> Object facility, this means that there needs to be a transaction 
>>>>>>> around this action. Because we don't surround the 
>>>>>>> session.write(resultsCmnd); operation with a tx, we get an 
>>>>>>> exception.
>>>>>>>
>>>>>>> (To tell the truth, I don't understand why this worked with 
>>>>>>> Hibernate 3.. )
>>>>>>>
>>>>>>> As I've been writing this, I've come up with a couple of solutions:
>>>>>>>
>>>>>>> 1. Turn off lazy-loading for all entities.
>>>>>>> 2. Force the loading of all relevant entities by going through 
>>>>>>> the object tree (task.getPeopleAssignments().size(), etc.. )
>>>>>>> 3. Put a transaction around session.write(resultsCmnd);
>>>>>>>
>>>>>>> Option 1 has a big impact on performance, especially if we start 
>>>>>>> talking about high-volumes.
>>>>>>> Option 2 has a slightly larger impact on performance but Option 
>>>>>>> 3 seems a little bit ugly to me.
>>>>>>>
>>>>>>>
>>>>>>> Are there any options I missed? Any advice or comments?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Marco
>>>>>>>
>>>>>>> PS. This is (IMHO) one of the reasons we need to rewrite 
>>>>>>> human-task. I've been working on a proposal/POC, but the 
>>>>>>> important thing is that certain problems that we have now aren't 
>>>>>>> also present in the rewritten version.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
> -- 
> jBPM/Drools developer
> Utrecht, the Netherlands
