[jboss-dev] AS5 testsuite failures

Brian Stansberry brian.stansberry at redhat.com
Thu Oct 25 10:59:27 EDT 2007


Tim Fox wrote:
> Are you sure getState() isn't being called? I need to check, but JBM 
> should always be calling getState() after joining a channel.
> 
> (Also.. the erroneous code comes from here 
> http://www.jboss.com/index.html?module=bb&op=viewtopic&t=118300 where 
> Brian explained to us how to use a mux channel, I think it was just copy 
> and pasted ;) )
> 

Ai, hoist by my own petard. :)

> Brian Stansberry wrote:
>> I've added a workaround to the AS[1] that should allow it to start 
>> properly until JBMESSAGING-1120 is fixed.
>>
>> [1] http://jira.jboss.com/jira/browse/JBAS-4909
>>
>> Brian Stansberry wrote:
>>> The cause of this is 
>>> http://jira.jboss.com/jira/browse/JBMESSAGING-1120. Vladimir's 
>>> written a unit test that shows the combination of the way JBM uses a 
>>> shared channel and the other AS services use it leads to the behavior 
>>> we're seeing.
>>>
>>> The JBM guys fixing JBMESSAGING-1120 will make the problem go away. 
>>> I've asked Vladimir look into why the JBM usage caused 
>>> Channel.getState() to incorrectly return true, which is what led to 
>>> the hang.
>>>
>>> Brian Stansberry wrote:
>>>> Adrian wrote:
>>>>> On Tue, 2007-10-23 at 09:58 -0500, Brian Stansberry wrote:
>>>>>> Adrian wrote:
>>>>>>> When I look at this code, it looks like it is doing an infinte wait,
>>>>>>> shouldn't this have a timeout?
>>>>>>>
>>>>>>> http://fisheye.jboss.org/browse/JBossCache/core/tags/2.0.0.GA/src/org/jboss/cache/CacheImpl.java?r=4237 
>>>>>>>
>>>>>>>
>>>>>> Arguably yes, although only as a second line of defense against a 
>>>>>> bug. Which there appears to be here. You only get to the wait call 
>>>>>> if channel.getState() returns true (line 1270 in the above linked 
>>>>>> rev.) The getState call has a timeout.  In your case it should 
>>>>>> *not* have returned true, as your node is the only cluster member.
>>>>>>
>>>>>> Vladimir/Manik, does anything here ring a bell?  E.g. was there 
>>>>>> any change in the Channel.getState() behavior in JG 2.5.0?  
>>>>>> Adrian's seeing this in AS trunk, which is using JBC 2.0.0.GA and 
>>>>>> JG 2.5.0.GA.
>>>>>>
>>>>>
>>>>> Make sure you do an rm -rf thirdparty or use a clean checkout
>>>>> so you can be sure you have all the latest jars.
>>>>>
>>>>
>>>> Yeah, I did.
>>>>
>>>>>> Adrian, has some kind of parallelization of deployment been 
>>>>>> introduced? 
>>>>>
>>>>> No. But each service is free to do some startup asynch by forking
>>>>> a thread.
>>>>>
>>>>>> Your logging seems to show JBoss Messaging deployment proceeding 
>>>>>> in parallel with deployment of cluster-beans.xml.  When I run 
>>>>>> "./run.sh -b localhost -c all" I don't see this problem (and the 
>>>>>> JBM logging occurs after the cluster-beans.xml stuff.)
>>>>>
>>>>> So you think this could be a race? I don't see anything in the
>>>>> DEBUG messages (or the thread dump) that suggests both are
>>>>> active at the same time.
>>>>>
>>>>> Here's the DEBUG from the last JBoss Messaging logging:
>>>>>
>>>>
>>>> Thanks for this.  I'd misread your earlier log; saw two sets of 
>>>> ASCII art showing channels starting, and thought one was JBM 
>>>> deployment, one from JBC.  Actually, they were both from JBM, which 
>>>> creates 2 channels.
>>>>
>>>> The JBC that's hanging actually shares the 2nd channel with JBM.  
>>>> This is likely where the problem is; some defect in the state 
>>>> transfer handling with shared channels. When I launch AS, the JBC 
>>>> deployment is happening before JBM; for whatever reason for you it's 
>>>> JBM then JBC. I'll add a depends or something to force JBM to go 
>>>> first on my setup and I bet I'll see the same thing you do.
>>>>
>>>>> 2007-10-23 14:28:34,335 DEBUG
>>>>> [org.jboss.jms.server.connectionfactory.ConnectionFactory] Started
>>>>> jboss.messaging.connectionfactory:service=ConnectionFactory
>>>>> 2007-10-23 14:28:34,336 DEBUG
>>>>> [org.jboss.system.microcontainer.jmx.ServiceControllerLifecycleCallback] 
>>>>>
>>>>> Registered MBean jboss.jgroups:service=MultiplexerChannelFactory
>>>>> 2007-10-23 14:28:34,337 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring create call;
>>>>> current state is Started
>>>>> 2007-10-23 14:28:34,337 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring start call;
>>>>> current state is Started
>>>>> 2007-10-23 14:28:34,362 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>>> Registered in JMX under jboss.cache:service=EJB3SFSBClusteredCache
>>>>> 2007-10-23 14:28:34,362 DEBUG [org.jboss.system.ServiceController]
>>>>> starting service jboss.cache:service=EJB3SFSBClusteredCache
>>>>> 2007-10-23 14:28:34,362 DEBUG [org.jboss.system.ServiceController]
>>>>> Registering service jboss.cache:service=EJB3SFSBClusteredCache
>>>>> 2007-10-23 14:28:34,373 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>>> Registered in JMX under
>>>>> jboss.cache:partitionName=DefaultPartition,service=HAPartitionCache
>>>>> 2007-10-23 14:28:34,374 DEBUG [org.jboss.system.ServiceController]
>>>>> starting service
>>>>> jboss.cache:service=HAPartitionCache,partitionName=DefaultPartition
>>>>> 2007-10-23 14:28:34,374 DEBUG [org.jboss.system.ServiceController]
>>>>> Registering service
>>>>> jboss.cache:service=HAPartitionCache,partitionName=DefaultPartition
>>>>> 2007-10-23 14:28:34,384 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>>> Registered in JMX under jboss.cache:service=ClusteredSSOCache
>>>>> 2007-10-23 14:28:34,384 DEBUG [org.jboss.system.ServiceController]
>>>>> starting service jboss.cache:service=ClusteredSSOCache
>>>>> 2007-10-23 14:28:34,384 DEBUG [org.jboss.system.ServiceController]
>>>>> Registering service jboss.cache:service=ClusteredSSOCache
>>>>> 2007-10-23 14:28:34,390 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>>> Constructing Cache
>>>>> 2007-10-23 14:28:34,797 DEBUG
>>>>> [org.jboss.cache.factories.InterceptorChainFactory] interceptor chain
>>>>> is:
>>>>> class org.jboss.cache.interceptors.CallInterceptor
>>>>> class org.jboss.cache.interceptors.UnlockInterceptor
>>>>> class org.jboss.cache.interceptors.PessimisticLockInterceptor
>>>>> class org.jboss.cache.interceptors.ReplicationInterceptor
>>>>> class org.jboss.cache.interceptors.NotificationInterceptor
>>>>> class org.jboss.cache.interceptors.TxInterceptor
>>>>> class org.jboss.cache.interceptors.CacheMgmtInterceptor
>>>>> class org.jboss.cache.interceptors.InvocationContextInterceptor
>>>>> 2007-10-23 14:28:34,806 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Not 
>>>>> using
>>>>> an EvictionPolicy
>>>>> 2007-10-23 14:28:34,815 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] cache
>>>>> mode is REPL_ASYNC
>>>>> 2007-10-23 14:28:34,815 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Created
>>>>> Multiplexer Channel for cache cluster DefaultPartition-ClusteredSS
>>>>> OCache using stack udp
>>>>> 2007-10-23 14:28:34,844 DEBUG
>>>>> [org.jboss.cache.marshall.VersionAwareMarshaller] Initialised with
>>>>> version 2.0.0 and versionInt 20
>>>>> 2007-10-23 14:28:34,844 DEBUG
>>>>> [org.jboss.cache.marshall.VersionAwareMarshaller] Using default
>>>>> marshaller class org.jboss.cache.marshall.CacheMarshaller200
>>>>> 2007-10-23 14:28:34,844 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Using
>>>>> marshaller org.jboss.cache.marshall.VersionAwareMarshaller
>>>>> 2007-10-23 14:28:34,845 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Block
>>>>> received at 127.0.0.1:32787
>>>>> 2007-10-23 14:28:34,899 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Invoked on
>>>>> cache instance [127.0.0.1:32787] and InvocationContext [Invocation
>>>>> Context{methodCall=MethodName: _block; MethodIdInteger: 43; Args:
>>>>> ()transaction=null, globalTransaction=null,
>>>>> optionOverrides=Option{failSilently=false, cacheModeLocal
>>>>> =false, dataVersion=null, suppressLocking=false,
>>>>> forceDataGravitation=false, skipDataGravitation=false},
>>>>> originLocal=true, txHasMods=false}]
>>>>> 2007-10-23 14:28:34,899 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting up
>>>>> transactional context.
>>>>> 2007-10-23 14:28:34,899 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting tx
>>>>> as null and gtx as null
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.ReplicationInterceptor]
>>>>> isLocalCommitOrRollback? false; gtx = null
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor]
>>>>> PessimisticLockInterceptor invoked for method MethodName: _block;
>>>>> MethodIdInteg
>>>>> er: 43; Args: ()
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor] bypassed
>>>>> locking as method _block() doesn't require locking
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.CallInterceptor] Passing up method
>>>>> MethodName: _block; MethodIdInteger: 43; Args: () so it gets 
>>>>> invoked on cache.
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.UnlockInterceptor] Attempting to release
>>>>> locks on current thread.  Lock table is {}
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.ReplicationInterceptor] Non-tx and non
>>>>> crud meth
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Resetting
>>>>> invocation-scope options
>>>>> 2007-10-23 14:28:34,900 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Block
>>>>> processed at 127.0.0.1:32787
>>>>> 2007-10-23 14:28:34,902 INFO
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache]
>>>>> viewAccepted(): [127.0.0.1:32787|0] [127.0.0.1:32787]
>>>>> 2007-10-23 14:28:34,902 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] UnBlock
>>>>> received at 127.0.0.1:32787
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Invoked on
>>>>> cache instance [127.0.0.1:32787] and InvocationContext [Invocation
>>>>> Context{methodCall=MethodName: _unblock; MethodIdInteger: 44; Args:
>>>>> ()transaction=null, globalTransaction=null,
>>>>> optionOverrides=Option{failSilently=false, cacheModeLoc
>>>>> al=false, dataVersion=null, suppressLocking=false,
>>>>> forceDataGravitation=false, skipDataGravitation=false},
>>>>> originLocal=true, txHasMods=false}]
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting up
>>>>> transactional context.
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting tx
>>>>> as null and gtx as null
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.ReplicationInterceptor]
>>>>> isLocalCommitOrRollback? false; gtx = null
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor]
>>>>> PessimisticLockInterceptor invoked for method MethodName: _unblock;
>>>>> MethodIdInt
>>>>> eger: 44; Args: ()
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor] bypassed
>>>>> locking as method _unblock() doesn't require locking
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.CallInterceptor] Passing up method
>>>>> MethodName: _unblock; MethodIdInteger: 44; Args: () so it gets 
>>>>> invoked o
>>>>> n cache.
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.UnlockInterceptor] Attempting to release
>>>>> locks on current thread.  Lock table is {}
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.ReplicationInterceptor] Non-tx and non
>>>>> crud meth
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Resetting
>>>>> invocation-scope options
>>>>> 2007-10-23 14:28:34,903 DEBUG
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] UnBlock
>>>>> processed at 127.0.0.1:32787
>>>>> 2007-10-23 14:28:34,903 INFO
>>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] 
>>>>> CacheImpl
>>>>> local address is 127.0.0.1:32787
>>>>>
>>>>> Got bored waiting here. ;-)
>>>>>
>>>>> 2007-10-23 14:32:30,094 INFO
>>>>> [org.jboss.bootstrap.microcontainer.ServerImpl] JBoss SHUTDOWN
>>>>>
>>>>>
>>>>> There's no JBoss Cache logging before this, only some generic
>>>>> clustering logging (I've included where jms is using clustering
>>>>> but otherwise stripped its logging):
>>>>>
>>>>>
>>>>> 2007-10-23 14:28:27,360 DEBUG [org.jboss.system.ServiceController]
>>>>> starting service jboss.jgroups:service=MultiplexerChannelFactory
>>>>> 2007-10-23 14:28:27,360 DEBUG [org.jboss.system.ServiceController]
>>>>> Registering service jboss.jgroups:service=MultiplexerChannelFactory
>>>>> 2007-10-23 14:28:27,364 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring create call;
>>>>> current state is Stopped
>>>>>
>>>>> ...
>>>>>
>>>>> 2007-10-23 14:28:27,367 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Starting 
>>>>> JChannelFactory
>>>>> 2007-10-23 14:28:27,367 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Started 
>>>>> JChannelFactory
>>>>> 2007-10-23 14:28:27,368 DEBUG
>>>>> [org.jboss.messaging.core.jmx.MessagingPostOfficeService] Starting
>>>>> jboss.messaging:service=PostOffice
>>>>> 2007-10-23 14:28:27,378 DEBUG
>>>>> [org.jboss.messaging.core.jmx.MessagingPostOfficeService]
>>>>> org.jboss.messaging.core.jmx.MessagingPostOfficeService at 14b7042 uses
>>>>> Multiplexe
>>>>> rJChannelFactory
>>>>>
>>>>> 2007-10-23 14:28:28,110 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Passing unique node id
>>>>> 127.0.0.1:1099 to the channel as additional data
>>>>> 2007-10-23 14:28:28,121 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Passing unique node id
>>>>> 127.0.0.1:1099 to the channel as additional data
>>>>> 2007-10-23 14:28:28,214 INFO  [STDOUT] 
>>>>> -------------------------------------------------------
>>>>> GMS: address is 127.0.0.1:32786
>>>>> -------------------------------------------------------
>>>>> 2007-10-23 14:28:30,306 DEBUG
>>>>> [org.jboss.messaging.core.impl.postoffice.GroupMember]
>>>>> org.jboss.messaging.core.impl.postoffice.GroupMember
>>>>> $ControlMembershipListener at 334
>>>>> cb9 got new view [127.0.0.1:32786|0] [127.0.0.1:32786], old view is 
>>>>> null
>>>>> 2007-10-23 14:28:30,386 DEBUG
>>>>> [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice]
>>>>> org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at 1c86116:
>>>>> 127.
>>>>> 0.0.1:32786 joined
>>>>> 2007-10-23 14:28:30,386 DEBUG
>>>>> [org.jboss.messaging.core.impl.postoffice.GroupMember] First view
>>>>> arrived
>>>>> 2007-10-23 14:28:30,387 DEBUG
>>>>> [org.jboss.messaging.core.impl.postoffice.GroupMember] We are the 
>>>>> first
>>>>> member of the group so no need to wait for state
>>>>> 2007-10-23 14:28:30,391 INFO  [STDOUT] 
>>>>> -------------------------------------------------------
>>>>> GMS: address is 127.0.0.1:32787
>>>>> -------------------------------------------------------
>>>>> 2007-10-23 14:28:32,412 DEBUG
>>>>> [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice] Updated
>>>>> failover map:
>>>>>
>>>>>              0->0
>>>>>
>>>>> 2007-10-23 14:28:32,427 DEBUG
>>>>> [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice]
>>>>> org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at 1c86116
>>>>> puts replicant locally: JVMID->6-fnu8e48f-1-yvy7e48f-0gpx6x-d24o4c5
>>>>>
>>>>> ...
>>>>>
>>>>> 2007-10-23 14:28:32,872 DEBUG
>>>>> [org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint]
>>>>> updateClusteredClients being called!!! clientFactoriesToUpdate.size 
>>>>> = 0
>>>>>
>>>>> ,,,
>>>>>
>>>>> 2007-10-23 14:28:34,335 DEBUG
>>>>> [org.jboss.jms.server.connectionfactory.ConnectionFactory] Started
>>>>> jboss.messaging.connectionfactory:service=ConnectionFactory
>>>>> 2007-10-23 14:28:34,336 DEBUG
>>>>> [org.jboss.system.microcontainer.jmx.ServiceControllerLifecycleCallback] 
>>>>>
>>>>> Registered MBean jboss.jgroups:service=MultiplexerChannelFactory
>>>>> 2007-10-23 14:28:34,337 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring create call;
>>>>> current state is Started
>>>>> 2007-10-23 14:28:34,337 DEBUG
>>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring start call;
>>>>> current state is Started
>>>>>
>>>>
>>>
>>
> 

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com



More information about the jboss-development mailing list