[jboss-jira] [JBoss JIRA] (DROOLS-1030) Nullpointer in JpaTimerJobInstance.call on startup

Thu Jan 21 11:48:00 EST 2016

    [ https://issues.jboss.org/browse/DROOLS-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152040#comment-13152040 ] 

Artur Kronenberg commented on DROOLS-1030:
------------------------------------------

Ok,

I think I have the issue (or some issue) reproduced. I will attach this in a second. Here is what I do:

1. The standalone has 2 methods; create and load. Load takes the id from the create run to load the thing. I made the code as close as I could to where I found the bug/feature in the first place. The exception is not 100% the same, but I think I may have been hunting a red herring. I think those exceptions are all related, but they might happen at a slightly different point in time - so different symptoms with maybe the same root. 

2. First run: create 

Just run the main, wait until the thing is executed and then shut down the run. This is important as I think it has something to do with persistence + expiry which means the run needs to terminate so that drools can't update the session anymore (timerjobs and friends I believe). 

3. Wait for 1-2 minutes. I recommend 9gag, cracks me up :D 

4. Change load to take the ID of the run in step 2. Comment out the create method, and run the same thing again. 

This has worked for me 3 out of 3 times with an exception talking about transaction and stuff: http://pastebin.com/RqKr3FcE

Observations:

1. Load under 1 minute wait time. (so immediately) 
Everything works fine. It appears no timer jobs are started and no exceptions are thrown.

2. Load after 1 minute wait
The 1k models that event 1 loads are set to expire after 1 minute. 
When loading the session, it appears that for each fact that ought to be expired, 1 JpaTimerJob is started. The logs show heeeaaaapps of hibernate operations going on. I suspect that all the jobs try to do their thing and there may or may not be a bunch of retrying happening in the background. 

Conclusion is that I am still not sure what is happening exactly. But the signs stand in a way that I suspect loading a session which has facts/events that are supposed to be expired seems to be what the problem is. Please let me know if there's anything else I can do. If I manage to reproduce the NPE properly I will do that. Sadly the exception of my actual system vs my standalone example are slightly wrong.

Note: I believe it is essential for this exception that:
* Facts/events are inserted in a transaction - see TransactionHelper class 
* fireAllRules is called in a separate thread 

> Nullpointer in JpaTimerJobInstance.call on startup 
> ---------------------------------------------------
>
>                 Key: DROOLS-1030
>                 URL: https://issues.jboss.org/browse/DROOLS-1030
>             Project: Drools
>          Issue Type: Feature Request
>          Components: core engine
>         Environment: Mac OS 10.10.5, Eclipse Mars Release, Java 1.8, Drools 6.3.0.Final
>            Reporter: Artur Kronenberg
>            Assignee: Mario Fusco
>
> Hi,
> I have noticed Nullpointer exceptions when I try and reload a persisted session on startup. It is a bit hard to recreate (I am actually not managing it now since I deleted all old sessions to attempt recreation) but I figured maybe someone has an idea. Essentially I am getting this NPE twice:
> java.lang.NullPointerException: null
>         at org.drools.persistence.jpa.JpaTimerJobInstance.call(JpaTimerJobInstance.java:50) [drools-persistence-jpa-6.3.0.Final.jar:6.3.0.Final]
>         at org.drools.persistence.jpa.JpaTimerJobInstance.call(JpaTimerJobInstance.java:30) [drools-persistence-jpa-6.3.0.Final.jar:6.3.0.Final]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_51]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_51]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_51]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_51]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_51]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> When debugging I noticed the following behaviour which points to a race condition: 
> If I start my server and try to recreate the session, those NPEs happen 2x.
> If I start my server and put a breakpoint BEFORE creating the CommandService for execution, I can wait for a few seconds and the service can be found.
> It appears that the scheduler's timerJobFactoryManager is not fully there at the time the sessions is being loaded? Or something else is racing with the service creation. 
> Is there a way for me to make sure everything is fully instantiated before I use drools? Does a workaround exist? Is this even a bug? 
> Thanks and let me know your thoughts. If I run into this again I will attempt to try and reproduce it. Let me know if more info is needed!
> UPDATE:
> I noticed that the reason I can't reproduce it at the moment is that the JpaTimerJobInstance is never called with the newly persisted session. It appears that that timer job depends on something else to happen so that it thinks it needs to start the timerJob 

--
This message was sent by Atlassian JIRA
(v6.4.11#64026)