[jboss-user] [JBoss Messaging] - Re: JBM exception
xxbrandonoxx
do-not-reply at jboss.com
Mon Sep 15 14:58:40 EDT 2008
I am having the exact same issue when I try to cluster the application. It is hard to build a simple test case for, since it involves MDBs, datasources, databases, clusters, JBoss Messaging, etc.
I think it comes down to a failure of the cluster. I have 2 boxes, A & B. Both have JBoss Messaging installed according to the documentation, and both use a shared Oracle data store. I am using a shared (configured) queue for requests and temporary queues for responses (a basic pattern, I think). Things seem to work well under light load, but when I use FunkLoad to drive a sample app under high load, the cluster seems to fail.
I am using the system like this.
* I have a simple one-page, one-form JSF app that fires a JMS request on my common request queue. The app creates a temporary response queue, listens on it with a Java "Future" object, and puts a message on the globally configured queue with the appropriate reply-to. An MDB echoes back a hard-coded string on the reply-to, and the app echoes the response to the screen.
* I have my app configured such that if the request goes to A, then we use A's 1099 port for JMS lookups, and similarly on B.
* I am only hitting box A in this test
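A minimal sketch of the request/reply flow described in the list above, to make the moving parts concrete. It is an in-memory model built on java.util.concurrent only, not the JMS API: the BlockingQueue objects stand in for the configured request queue and the per-request temporary queue (which the real app would presumably obtain via Session.createTemporaryQueue()), and the echo thread stands in for the MDB. The class name, method names, and the reply string "echo" are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class RequestReplySketch {

    // A request carries its payload plus the queue the reply should be sent to (the "reply-to").
    static final class Request {
        final String payload;
        final BlockingQueue<String> replyTo;

        Request(String payload, BlockingQueue<String> replyTo) {
            this.payload = payload;
            this.replyTo = replyTo;
        }
    }

    // Stand-in for the shared, configured request queue.
    static final BlockingQueue<Request> REQUEST_QUEUE = new LinkedBlockingQueue<>();

    // Stand-in for the MDB: echoes a hard-coded string to each request's reply-to.
    static void startEchoConsumer(ExecutorService pool) {
        pool.execute(() -> {
            try {
                while (true) {
                    Request request = REQUEST_QUEUE.take();
                    request.replyTo.put("echo");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when the pool shuts down
            }
        });
    }

    // Client side: one temporary reply queue per request, awaited through a Future with a timeout.
    static String call(ExecutorService pool, String payload, long timeoutMs) throws Exception {
        BlockingQueue<String> tempReplyQueue = new LinkedBlockingQueue<>();
        Callable<String> waitForReply = tempReplyQueue::take;
        Future<String> reply = pool.submit(waitForReply);
        REQUEST_QUEUE.put(new Request(payload, tempReplyQueue));
        try {
            return reply.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            reply.cancel(true); // no-op if the reply already arrived
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        startEchoConsumer(pool);
        System.out.println(call(pool, "hello", 2000));
        pool.shutdownNow();
    }
}
```

The point to notice is that every single request creates its own reply queue. In the in-memory model that is free; in a cluster, each temporary destination has to be registered with the other nodes.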
What I see
The application seems to work OK for the first few seconds, but then fails 5 to 20 seconds into the load test. As I said above, I am only hitting A.
I see a lot of the following on A
2008-09-15 14:21:36,192 ERROR [org.jboss.messaging.util.ExceptionUtil] SessionEndpoint[1hn-dln9f5lf-1-b8q6e5lf-np8q9k-t1ce1a]
| addTemporaryDestination [pko-ctaaf5lf-1-b8q6e5lf-np8q9k-t1ce1a]
| java.lang.IllegalStateException: org.jboss.messaging.core.impl.postoffice.GroupMember at 1ad11ec response not received from 10.50.12.136:56010 - there may be others
| at org.jboss.messaging.core.impl.postoffice.GroupMember.multicastControl(GroupMember.java:253)
| at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.internalAddBinding(MessagingPostOffice.java:1886)
|
The more informative output (IMHO) actually comes from B:
| 2008-09-15 14:22:51,660 DEBUG [org.jboss.messaging.core.impl.postoffice.GroupMember] org.jboss.messaging.core.impl.postoffice.GroupMember$ControlMembershipListener at 1aec0d1 got new view [10.50.12.136:56010|2] [10.50.12.136:56010], old view is [10.50.12.136:56010|1] [10.50.12.136:56010, 10.50.12.137:35992]
| 2008-09-15 14:22:51,660 DEBUG [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice] Updated failover map:
|
| 1->1
|
| 2008-09-15 14:22:51,660 DEBUG [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice] org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at fbf107: 10.50.12.137:35992 left
| 2008-09-15 14:22:51,660 DEBUG [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice] org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at fbf107: node 2 has crashed
| 2008-09-15 14:22:51,661 DEBUG [org.jboss.messaging.core.impl.postoffice.MessagingPostOffice] org.jboss.messaging.core.impl.postoffice.MessagingPostOffice at fbf107 the failover node for the crashed node is 1
| 2008-09-15 14:22:58,558 INFO [org.jboss.cache.TreeCache] viewAccepted(): [10.50.12.136:55867|32] [10.50.12.136:55867]
| 2008-09-15 14:22:58,566 DEBUG [org.jboss.cache.buddyreplication.BuddyManager] Instance 10.50.12.136:55867 broadcasting membership in buddy pool default to recipients []
| 2008-09-15 14:22:58,566 DEBUG [org.jboss.cache.buddyreplication.BuddyManager] Data owner address 10.50.12.136:55867
| 2008-09-15 14:22:58,566 DEBUG [org.jboss.cache.buddyreplication.BuddyManager] Entering updateGroup. Current group: BuddyGroup: (dataOwner: 10.50.12.136:55867, groupName: 10.50.12.136_55867, buddies: [10.50.12.137:35817]). Current View membership: [10.50.12.136:55867]
| 2008-09-15 14:22:58,566 INFO [org.jboss.cache.buddyreplication.NextMemberBuddyLocator] Expected to look for 1 buddies but could only find 0 suitable candidates - trying with colocated buddies as well.
| 2008-09-15 14:22:58,566 INFO [org.jboss.cache.buddyreplication.NextMemberBuddyLocator] Expected to look for 1 buddies but could only find 0 suitable candidates - trying again, ignoring buddy pool hints.
| 2008-09-15 14:22:58,566 INFO [org.jboss.cache.buddyreplication.NextMemberBuddyLocator] Expected to look for 1 buddies but could only find 0 suitable candidates - trying with colocated buddies as well.
| 2008-09-15 14:22:58,566 INFO [org.jboss.cache.buddyreplication.NextMemberBuddyLocator] Expected to look for 1 buddies but could only find 0 suitable candidates!
| 2008-09-15 14:22:58,572 INFO [org.jboss.cache.buddyreplication.BuddyManager] Removing obsolete buddies from buddy group [10.50.12.136_55867]. Obsolete buddies are [10.50.12.137:35817]
| 2008-09-15 14:22:58,572 INFO [org.jboss.cache.buddyreplication.BuddyManager] New buddy group: BuddyGroup: (dataOwner: 10.50.12.136:55867, groupName: 10.50.12.136_55867, buddies: [])
|
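To read the view change in the first DEBUG line without eyeballing it, here is a small sketch that pulls the member lists out of a "got new view ... old view is ..." message and reports which members dropped out. The regular expression is only an assumption based on the one line shown above, not a documented JGroups or JBoss Messaging log format, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ViewDiff {

    // Captures the member list of the new view (group 1) and of the old view (group 2).
    private static final Pattern VIEW_CHANGE = Pattern.compile(
        "got new view \\[[^\\]]*\\] \\[([^\\]]*)\\], old view is \\[[^\\]]*\\] \\[([^\\]]*)\\]");

    // Returns the members present in the old view but missing from the new one.
    static List<String> departed(String logLine) {
        Matcher m = VIEW_CHANGE.matcher(logLine);
        if (!m.find()) {
            return Collections.emptyList();
        }
        Set<String> current = new LinkedHashSet<>(members(m.group(1)));
        List<String> gone = new ArrayList<>(members(m.group(2)));
        gone.removeAll(current);
        return gone;
    }

    // Splits a comma-separated member list such as "a:1, b:2" into trimmed entries.
    private static List<String> members(String csv) {
        List<String> out = new ArrayList<>();
        for (String part : csv.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "got new view [10.50.12.136:56010|2] [10.50.12.136:56010], "
            + "old view is [10.50.12.136:56010|1] [10.50.12.136:56010, 10.50.12.137:35992]";
        System.out.println(departed(line)); // prints [10.50.12.137:35992]
    }
}
```

For the line above this reports 10.50.12.137:35992 as departed, which matches the "10.50.12.137:35992 left" and "node 2 has crashed" messages that follow it. The JBoss Cache group loses its 10.50.12.137 member about seven seconds later.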
#'s
I am currently running a load test of 256 concurrent users, each sending requests continuously with a 0.1 second delay between requests. Note: this is the same configuration that ran fairly well on my laptop in a non-clustered setup (using Hypersonic).
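A rough back-of-the-envelope estimate of what these numbers mean for the cluster, assuming a closed-loop load model in which each simulated user waits for its response, pauses 0.1 seconds, and then sends the next request. The response times used below are purely illustrative, not measurements. Since every request creates a temporary queue, and the stack trace on A shows addTemporaryDestination going through MessagingPostOffice.internalAddBinding and GroupMember.multicastControl, the request rate is also roughly the rate of cluster-wide control messages.

```java
public class LoadEstimate {

    // Closed-loop model: each user completes one request per (delay + response time) seconds.
    static double requestsPerSecond(int users, double delaySec, double responseTimeSec) {
        return users / (delaySec + responseTimeSec);
    }

    public static void main(String[] args) {
        int users = 256;     // concurrent users in the load test
        double delay = 0.1;  // seconds between requests

        // Hypothetical response times, chosen only to show the range.
        for (double responseTime : new double[] {0.0, 0.1, 0.9}) {
            System.out.printf("response time %.1f s: about %.0f requests/s%n",
                responseTime, requestsPerSecond(users, delay, responseTime));
        }
    }
}
```

With an instantaneous response the upper bound is 256 / 0.1 = 2560 requests per second, so potentially that many temporary queue creations per second, each of which may have to wait for an acknowledgement from the other node.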
Versions
JBoss: 4.2.1.GA
JBoss Messaging
Implementation-Version: 1.4.0.SP3 (build: CVSTag=JBossMessaging_1_4_0_SP3 date=200712131418)
Remoting
Implementation-Version: 2.2.2.SP4
Any help or guidance would be appreciated.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4176599#4176599