[jboss-dev] AS5 testsuite failures

Tim Fox tim.fox at jboss.com
Thu Oct 25 05:59:47 EDT 2007


Are you sure getState() isn't being called? I need to check, but JBM 
should always be calling getState() after joining a channel.

(Also.. the erroneous code comes from here 
http://www.jboss.com/index.html?module=bb&op=viewtopic&t=118300 where 
Brian explained to us how to use a mux channel, I think it was just copy 
and pasted ;) )

Brian Stansberry wrote:
> I've added a workaround to the AS[1] that should allow it to start 
> properly until JBMESSAGING-1120 is fixed.
>
> [1] http://jira.jboss.com/jira/browse/JBAS-4909
>
> Brian Stansberry wrote:
>> The cause of this is 
>> http://jira.jboss.com/jira/browse/JBMESSAGING-1120. Vladimir's 
>> written a unit test that shows the combination of the way JBM uses a 
>> shared channel and the other AS services use it leads to the behavior 
>> we're seeing.
>>
>> The JBM guys fixing JBMESSAGING-1120 will make the problem go away. 
>> I've asked Vladimir look into why the JBM usage caused 
>> Channel.getState() to incorrectly return true, which is what led to 
>> the hang.
>>
>> Brian Stansberry wrote:
>>> Adrian wrote:
>>>> On Tue, 2007-10-23 at 09:58 -0500, Brian Stansberry wrote:
>>>>> Adrian wrote:
>>>>>> When I look at this code, it looks like it is doing an infinte wait,
>>>>>> shouldn't this have a timeout?
>>>>>>
>>>>>> http://fisheye.jboss.org/browse/JBossCache/core/tags/2.0.0.GA/src/org/jboss/cache/CacheImpl.java?r=4237 
>>>>>>
>>>>>>
>>>>> Arguably yes, although only as a second line of defense against a 
>>>>> bug. Which there appears to be here. You only get to the wait call 
>>>>> if channel.getState() returns true (line 1270 in the above linked 
>>>>> rev.) The getState call has a timeout.  In your case it should 
>>>>> *not* have returned true, as your node is the only cluster member.
>>>>>
>>>>> Vladimir/Manik, does anything here ring a bell?  E.g. was there 
>>>>> any change in the Channel.getState() behavior in JG 2.5.0?  
>>>>> Adrian's seeing this in AS trunk, which is using JBC 2.0.0.GA and 
>>>>> JG 2.5.0.GA.
>>>>>
>>>>
>>>> Make sure you do an rm -rf thirdparty or use a clean checkout
>>>> so you can be sure you have all the latest jars.
>>>>
>>>
>>> Yeah, I did.
>>>
>>>>> Adrian, has some kind of parallelization of deployment been 
>>>>> introduced? 
>>>>
>>>> No. But each service is free to do some startup asynch by forking
>>>> a thread.
>>>>
>>>>> Your logging seems to show JBoss Messaging deployment proceeding 
>>>>> in parallel with deployment of cluster-beans.xml.  When I run 
>>>>> "./run.sh -b localhost -c all" I don't see this problem (and the 
>>>>> JBM logging occurs after the cluster-beans.xml stuff.)
>>>>
>>>> So you think this could be a race? I don't see anything in the
>>>> DEBUG messages (or the thread dump) that suggests both are
>>>> active at the same time.
>>>>
>>>> Here's the DEBUG from the last JBoss Messaging logging:
>>>>
>>>
>>> Thanks for this.  I'd misread your earlier log; saw two sets of 
>>> ASCII art showing channels starting, and thought one was JBM 
>>> deployment, one from JBC.  Actually, they were both from JBM, which 
>>> creates 2 channels.
>>>
>>> The JBC that's hanging actually shares the 2nd channel with JBM.  
>>> This is likely where the problem is; some defect in the state 
>>> transfer handling with shared channels. When I launch AS, the JBC 
>>> deployment is happening before JBM; for whatever reason for you it's 
>>> JBM then JBC. I'll add a depends or something to force JBM to go 
>>> first on my setup and I bet I'll see the same thing you do.
>>>
>>>> 2007-10-23 14:28:34,335 DEBUG
>>>> [org.jboss.jms.server.connectionfactory.ConnectionFactory] Started
>>>> jboss.messaging.connectionfactory:service=ConnectionFactory
>>>> 2007-10-23 14:28:34,336 DEBUG
>>>> [org.jboss.system.microcontainer.jmx.ServiceControllerLifecycleCallback] 
>>>>
>>>> Registered MBean jboss.jgroups:service=MultiplexerChannelFactory
>>>> 2007-10-23 14:28:34,337 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring create call;
>>>> current state is Started
>>>> 2007-10-23 14:28:34,337 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring start call;
>>>> current state is Started
>>>> 2007-10-23 14:28:34,362 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>> Registered in JMX under jboss.cache:service=EJB3SFSBClusteredCache
>>>> 2007-10-23 14:28:34,362 DEBUG [org.jboss.system.ServiceController]
>>>> starting service jboss.cache:service=EJB3SFSBClusteredCache
>>>> 2007-10-23 14:28:34,362 DEBUG [org.jboss.system.ServiceController]
>>>> Registering service jboss.cache:service=EJB3SFSBClusteredCache
>>>> 2007-10-23 14:28:34,373 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>> Registered in JMX under
>>>> jboss.cache:partitionName=DefaultPartition,service=HAPartitionCache
>>>> 2007-10-23 14:28:34,374 DEBUG [org.jboss.system.ServiceController]
>>>> starting service
>>>> jboss.cache:service=HAPartitionCache,partitionName=DefaultPartition
>>>> 2007-10-23 14:28:34,374 DEBUG [org.jboss.system.ServiceController]
>>>> Registering service
>>>> jboss.cache:service=HAPartitionCache,partitionName=DefaultPartition
>>>> 2007-10-23 14:28:34,384 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>> Registered in JMX under jboss.cache:service=ClusteredSSOCache
>>>> 2007-10-23 14:28:34,384 DEBUG [org.jboss.system.ServiceController]
>>>> starting service jboss.cache:service=ClusteredSSOCache
>>>> 2007-10-23 14:28:34,384 DEBUG [org.jboss.system.ServiceController]
>>>> Registering service jboss.cache:service=ClusteredSSOCache
>>>> 2007-10-23 14:28:34,390 DEBUG [org.jboss.cache.jmx.CacheJmxWrapper]
>>>> Constructing Cache
>>>> 2007-10-23 14:28:34,797 DEBUG
>>>> [org.jboss.cache.factories.InterceptorChainFactory] interceptor chain
>>>> is:
>>>> class org.jboss.cache.interceptors.CallInterceptor
>>>> class org.jboss.cache.interceptors.UnlockInterceptor
>>>> class org.jboss.cache.interceptors.PessimisticLockInterceptor
>>>> class org.jboss.cache.interceptors.ReplicationInterceptor
>>>> class org.jboss.cache.interceptors.NotificationInterceptor
>>>> class org.jboss.cache.interceptors.TxInterceptor
>>>> class org.jboss.cache.interceptors.CacheMgmtInterceptor
>>>> class org.jboss.cache.interceptors.InvocationContextInterceptor
>>>> 2007-10-23 14:28:34,806 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Not 
>>>> using
>>>> an EvictionPolicy
>>>> 2007-10-23 14:28:34,815 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] cache
>>>> mode is REPL_ASYNC
>>>> 2007-10-23 14:28:34,815 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Created
>>>> Multiplexer Channel for cache cluster DefaultPartition-ClusteredSS
>>>> OCache using stack udp
>>>> 2007-10-23 14:28:34,844 DEBUG
>>>> [org.jboss.cache.marshall.VersionAwareMarshaller] Initialised with
>>>> version 2.0.0 and versionInt 20
>>>> 2007-10-23 14:28:34,844 DEBUG
>>>> [org.jboss.cache.marshall.VersionAwareMarshaller] Using default
>>>> marshaller class org.jboss.cache.marshall.CacheMarshaller200
>>>> 2007-10-23 14:28:34,844 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Using
>>>> marshaller org.jboss.cache.marshall.VersionAwareMarshaller
>>>> 2007-10-23 14:28:34,845 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Block
>>>> received at 127.0.0.1:32787
>>>> 2007-10-23 14:28:34,899 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Invoked on
>>>> cache instance [127.0.0.1:32787] and InvocationContext [Invocation
>>>> Context{methodCall=MethodName: _block; MethodIdInteger: 43; Args:
>>>> ()transaction=null, globalTransaction=null,
>>>> optionOverrides=Option{failSilently=false, cacheModeLocal
>>>> =false, dataVersion=null, suppressLocking=false,
>>>> forceDataGravitation=false, skipDataGravitation=false},
>>>> originLocal=true, txHasMods=false}]
>>>> 2007-10-23 14:28:34,899 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting up
>>>> transactional context.
>>>> 2007-10-23 14:28:34,899 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting tx
>>>> as null and gtx as null
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.ReplicationInterceptor]
>>>> isLocalCommitOrRollback? false; gtx = null
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor]
>>>> PessimisticLockInterceptor invoked for method MethodName: _block;
>>>> MethodIdInteg
>>>> er: 43; Args: ()
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor] bypassed
>>>> locking as method _block() doesn't require locking
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.CallInterceptor] Passing up method
>>>> MethodName: _block; MethodIdInteger: 43; Args: () so it gets 
>>>> invoked on cache.
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.UnlockInterceptor] Attempting to release
>>>> locks on current thread.  Lock table is {}
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.ReplicationInterceptor] Non-tx and non
>>>> crud meth
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Resetting
>>>> invocation-scope options
>>>> 2007-10-23 14:28:34,900 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] Block
>>>> processed at 127.0.0.1:32787
>>>> 2007-10-23 14:28:34,902 INFO
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache]
>>>> viewAccepted(): [127.0.0.1:32787|0] [127.0.0.1:32787]
>>>> 2007-10-23 14:28:34,902 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] UnBlock
>>>> received at 127.0.0.1:32787
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Invoked on
>>>> cache instance [127.0.0.1:32787] and InvocationContext [Invocation
>>>> Context{methodCall=MethodName: _unblock; MethodIdInteger: 44; Args:
>>>> ()transaction=null, globalTransaction=null,
>>>> optionOverrides=Option{failSilently=false, cacheModeLoc
>>>> al=false, dataVersion=null, suppressLocking=false,
>>>> forceDataGravitation=false, skipDataGravitation=false},
>>>> originLocal=true, txHasMods=false}]
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting up
>>>> transactional context.
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Setting tx
>>>> as null and gtx as null
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.ReplicationInterceptor]
>>>> isLocalCommitOrRollback? false; gtx = null
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor]
>>>> PessimisticLockInterceptor invoked for method MethodName: _unblock;
>>>> MethodIdInt
>>>> eger: 44; Args: ()
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.PessimisticLockInterceptor] bypassed
>>>> locking as method _unblock() doesn't require locking
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.CallInterceptor] Passing up method
>>>> MethodName: _unblock; MethodIdInteger: 44; Args: () so it gets 
>>>> invoked o
>>>> n cache.
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.UnlockInterceptor] Attempting to release
>>>> locks on current thread.  Lock table is {}
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.ReplicationInterceptor] Non-tx and non
>>>> crud meth
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.interceptors.InvocationContextInterceptor] Resetting
>>>> invocation-scope options
>>>> 2007-10-23 14:28:34,903 DEBUG
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] UnBlock
>>>> processed at 127.0.0.1:32787
>>>> 2007-10-23 14:28:34,903 INFO
>>>> [org.jboss.cache.CacheImpl.DefaultPartition-ClusteredSSOCache] 
>>>> CacheImpl
>>>> local address is 127.0.0.1:32787
>>>>
>>>> Got bored waiting here. ;-)
>>>>
>>>> 2007-10-23 14:32:30,094 INFO
>>>> [org.jboss.bootstrap.microcontainer.ServerImpl] JBoss SHUTDOWN
>>>>
>>>>
>>>> There's no JBoss Cache logging before this, only some generic
>>>> clustering logging (I've included where jms is using clustering
>>>> but otherwise stripped its logging):
>>>>
>>>>
>>>> 2007-10-23 14:28:27,360 DEBUG [org.jboss.system.ServiceController]
>>>> starting service jboss.jgroups:service=MultiplexerChannelFactory
>>>> 2007-10-23 14:28:27,360 DEBUG [org.jboss.system.ServiceController]
>>>> Registering service jboss.jgroups:service=MultiplexerChannelFactory
>>>> 2007-10-23 14:28:27,364 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring create call;
>>>> current state is Stopped
>>>>
>>>> ...
>>>>
>>>> 2007-10-23 14:28:27,367 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Starting 
>>>> JChannelFactory
>>>> 2007-10-23 14:28:27,367 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Started 
>>>> JChannelFactory
>>>> 2007-10-23 14:28:27,368 DEBUG
>>>> [org.jboss.messaging.core.jmx.MessagingPostOfficeService] Starting
>>>> jboss.messaging:service=PostOffice
>>>> 2007-10-23 14:28:27,378 DEBUG
>>>> [org.jboss.messaging.core.jmx.MessagingPostOfficeService]
>>>> org.jboss.messaging.core.jmx.MessagingPostOfficeService at 14b7042 uses
>>>> Multiplexe
>>>> rJChannelFactory
>>>>
>>>> 2007-10-23 14:28:28,110 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Passing unique node id
>>>> 127.0.0.1:1099 to the channel as additional data
>>>> 2007-10-23 14:28:28,121 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Passing unique node id
>>>> 127.0.0.1:1099 to the channel as additional data
>>>> 2007-10-23 14:28:28,214 INFO  [STDOUT] 
>>>> -------------------------------------------------------
>>>> GMS: address is 127.0.0.1:32786
>>>> -------------------------------------------------------
>>>> 2007-10-23 14:28:30,306 DEBUG
>>>> [org.jboss.messaging.core.impl.postoffice.GroupMember]
>>>> org.jboss.messaging.core.impl.postoffice.GroupMember
>>>> $ControlMembershipListener at 334
>>>> cb9 got new view [127.0.0.1:32786|0] [127.0.0.1:32786], old view is 
>>>> null
>>>> 2007-10-23 14:28:30,386 DEBUG
>>>> [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice]
>>>> org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at 1c86116:
>>>> 127.
>>>> 0.0.1:32786 joined
>>>> 2007-10-23 14:28:30,386 DEBUG
>>>> [org.jboss.messaging.core.impl.postoffice.GroupMember] First view
>>>> arrived
>>>> 2007-10-23 14:28:30,387 DEBUG
>>>> [org.jboss.messaging.core.impl.postoffice.GroupMember] We are the 
>>>> first
>>>> member of the group so no need to wait for state
>>>> 2007-10-23 14:28:30,391 INFO  [STDOUT] 
>>>> -------------------------------------------------------
>>>> GMS: address is 127.0.0.1:32787
>>>> -------------------------------------------------------
>>>> 2007-10-23 14:28:32,412 DEBUG
>>>> [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice] Updated
>>>> failover map:
>>>>
>>>>              0->0
>>>>
>>>> 2007-10-23 14:28:32,427 DEBUG
>>>> [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice]
>>>> org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at 1c86116
>>>> puts replicant locally: JVMID->6-fnu8e48f-1-yvy7e48f-0gpx6x-d24o4c5
>>>>
>>>> ...
>>>>
>>>> 2007-10-23 14:28:32,872 DEBUG
>>>> [org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint]
>>>> updateClusteredClients being called!!! clientFactoriesToUpdate.size 
>>>> = 0
>>>>
>>>> ,,,
>>>>
>>>> 2007-10-23 14:28:34,335 DEBUG
>>>> [org.jboss.jms.server.connectionfactory.ConnectionFactory] Started
>>>> jboss.messaging.connectionfactory:service=ConnectionFactory
>>>> 2007-10-23 14:28:34,336 DEBUG
>>>> [org.jboss.system.microcontainer.jmx.ServiceControllerLifecycleCallback] 
>>>>
>>>> Registered MBean jboss.jgroups:service=MultiplexerChannelFactory
>>>> 2007-10-23 14:28:34,337 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring create call;
>>>> current state is Started
>>>> 2007-10-23 14:28:34,337 DEBUG
>>>> [org.jboss.ha.framework.server.JChannelFactory] Ignoring start call;
>>>> current state is Started
>>>>
>>>
>>
>

-- 
Tim Fox
Messaging lead
JBoss - a division of Red Hat




More information about the jboss-development mailing list