[Design of Messaging on JBoss (Messaging/JBoss)] - Re: JBMESSAGING-519 - Failover(HA) design discussion
by clebert.suconic@jboss.com
public void testSimpleWithOneProducerTransacted() throws Exception
| {
| JBossConnection conn = (JBossConnection)this.factoryServer1.createConnection();
| Session session = conn.createSession(true,Session.AUTO_ACKNOWLEDGE);
| Destination destination = (Destination)getCtx1().lookup("queue/testQueue");
| MessageProducer producer = session.createProducer(destination);
|
| Message message = session.createTextMessage("Hello Before");
| producer.send(message);
|
| ClientConnectionDelegate delegate = (ClientConnectionDelegate)conn.getDelegate();
| ConnectionState state = (ConnectionState)delegate.getState();
|
| JBossConnection conn2 = (JBossConnection)this.factoryServer2.createConnection();
| conn.getDelegate().failOver(conn2.getDelegate());
|
| message = session.createTextMessage("Hello After");
| producer.send(message);
| session.commit();
| }
|
Just completing my last post: if I use the same testcase in a transacted scenario (with pending transactions), I get this exception:
| org.jboss.aop.NotFoundInDispatcherException: Object with oid: 47 was not found in the Dispatcher
| at org.jboss.aop.Dispatcher.invoke(Dispatcher.java:85)
| at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:127)
| at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:1008)
| at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:857)
| at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:454)
| at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:541)
| at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:261)
| at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:172)
| at org.jboss.remoting.Client.invoke(Client.java:589)
| at org.jboss.remoting.Client.invoke(Client.java:581)
| at org.jboss.jms.client.delegate.DelegateSupport.invoke(DelegateSupport.java:111)
| at org.jboss.jms.client.delegate.ClientConnectionDelegate$sendTransaction_N4986868250254447300.invokeNext(ClientConnectionDelegate$sendTransaction_N4986868250254447300.java)
|
I guess I should throw an exception if a failover happens while there are pending transactions. I think that would be the expected behavior. (Does anyone disagree?)
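A minimal sketch of that idea, under stated assumptions: TransactedSessionSketch, onFailover() and the use of IllegalStateException are all hypothetical names chosen for illustration, not the JBoss Messaging client classes. A real JMS client would more likely surface something like javax.jms.TransactionRolledBackException. The point is only that work buffered before the failover is discarded and the next send or commit fails until the application rolls back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side state for one transacted session (illustration only).
class TransactedSessionSketch {
    // Messages sent inside the current, not yet committed, transaction.
    private final List<String> pendingSends = new ArrayList<>();
    // Set when a failover happened while the transaction had pending work.
    private boolean invalidatedByFailover = false;

    void send(String body) {
        checkNotInvalidated();
        pendingSends.add(body);
    }

    // Assumed hook, called when the connection is switched to another node.
    void onFailover() {
        if (!pendingSends.isEmpty()) {
            // The server-side objects the transaction referenced are gone.
            invalidatedByFailover = true;
            pendingSends.clear();
        }
    }

    // Returns the number of messages committed.
    int commit() {
        checkNotInvalidated();
        int committed = pendingSends.size();
        pendingSends.clear();
        return committed;
    }

    // Rolling back clears the failure flag so the session is usable again.
    void rollback() {
        pendingSends.clear();
        invalidatedByFailover = false;
    }

    private void checkNotInvalidated() {
        if (invalidatedByFailover) {
            throw new IllegalStateException(
                "Failover occurred with a pending transaction; roll back and retry");
        }
    }
}
```

With this shape, a failover with no pending sends stays transparent, which matches the non-transacted testcase.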
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3977688#3977688
[Design of Messaging on JBoss (Messaging/JBoss)] - Re: JBMESSAGING-519 - Failover(HA) design discussion
by clebert.suconic@jboss.com
I'm adding a method called failOver(ConnectionDelegate newDelegate) to ConnectionDelegate.
I already have a simple producer failing over to another node once failOver is called.
Looking at this testcase you will get an idea of how this is supposed to work. Once failOver(conn2.getDelegate()) is called, the producer is connected to the new node.
public void testSimpleWithOneProducer() throws Exception
| {
| JBossConnection conn = (JBossConnection)this.factoryServer1.createConnection();
| Session session = conn.createSession(false,Session.AUTO_ACKNOWLEDGE);
| Destination destination = (Destination)getCtx1().lookup("queue/testQueue");
| MessageProducer producer = session.createProducer(destination);
|
| Message message = session.createTextMessage("Hello Before");
| producer.send(message);
|
| ClientConnectionDelegate delegate = (ClientConnectionDelegate)conn.getDelegate();
| ConnectionState state = (ConnectionState)delegate.getState();
|
| JBossConnection conn2 = (JBossConnection)this.factoryServer2.createConnection();
| conn.getDelegate().failOver(conn2.getDelegate());
|
| message = session.createTextMessage("Hello After");
| producer.send(message);
| }
|
|
I have a question about what should happen to producers when dealing with transactions: once the producer has failed over to the new connection, none of the object IDs referenced by the current TransactionContext will exist on the new node.
Should we throw an exception the next time a message is sent or commit is called?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3977684#3977684
[Design of Messaging on JBoss (Messaging/JBoss)] - Summary of approach to HA/Failover
by timfox
Following our meeting today with Bela, Brian, Clebert, Ovidiu and myself, here are my thoughts on what we need to do:
We need to maintain a cluster-wide consistent mapping of node -> List<FailoverInfo>
where FailoverInfo contains the JGroups address of the failover node (actually the async stack) and the remoting invoker locator.
Whether the actual remoting invoker locator is used depends on whether we use remoting, but in any case it is some kind of address the client can use to make a connection to a server.
i.e.
| struct FailoverInfo
| {
| Address address;
| InvokerLocator locator;
| }
|
We need to maintain a list per node, since any particular node can have 1 or more failover nodes.
We should look at using the Distributed Replicant Manager code from JBoss AS clustering, which Brian has separated into its own jar, for actually maintaining this state across the cluster.
This has nice functionality such as being able to register listeners to respond to changes.
If we can't use this then it's not a big deal just to use JGroups state directly as we already do for managing the binding state.
When a node joins the group we should generate the failover info list for that node before registering it with the group.
A simple algorithm would be to treat the JGroups view list as a circular buffer and choose as failover nodes the items that follow the current node in the view list. We would have to be careful not to choose addresses on the current node as failovers. We should make this pluggable.
(Question: Given a JGroups address how do we know if it is on the same physical box as another JGroups address? If the machine has multiple NICs this may be tricky. Maybe we need to propagate some other kind of machine id in the state too? Or we propagate a list of NICs per box in the state.)
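The circular-buffer selection could be sketched roughly as follows. This assumes each view entry carries some machine id, which is exactly the open question above; Member, machineId and chooseFailovers are hypothetical names, and plain strings stand in for JGroups addresses.

```java
import java.util.ArrayList;
import java.util.List;

class FailoverSelectionSketch {

    /** Hypothetical view entry: a node id plus an id for the physical box it runs on. */
    static final class Member {
        final String nodeId;
        final String machineId;

        Member(String nodeId, String machineId) {
            this.nodeId = nodeId;
            this.machineId = machineId;
        }
    }

    /**
     * Walks the view as a circular buffer, starting just after selfIndex,
     * and picks up to 'count' nodes that do not live on the same machine.
     */
    static List<String> chooseFailovers(List<Member> view, int selfIndex, int count) {
        List<String> failovers = new ArrayList<>();
        Member self = view.get(selfIndex);
        // step < view.size() guarantees we never wrap around to ourselves.
        for (int step = 1; step < view.size() && failovers.size() < count; step++) {
            Member candidate = view.get((selfIndex + step) % view.size());
            if (!candidate.machineId.equals(self.machineId)) {
                failovers.add(candidate.nodeId);
            }
        }
        return failovers;
    }
}
```

Making this pluggable would then amount to putting chooseFailovers behind an interface so other policies can be swapped in.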
The JGroups address would be needed when doing in memory message persistence for replicating messages from one server node to another.
When changes in the failover map occur due to nodes joining or leaving the group, this needs to be propagated to clients, and can probably be done on some kind of special control channel that we can multiplex on top of the transport, assuming we use our own multiplexing transport.
When a failover occurs, the first failover node detects the failure of the node by a change in view and then takes over responsibility for the failed node's partial queues.
If there are any persistent messages to load, they are then loaded from storage. Loading from storage won't be necessary if in-memory message replication is being used, since the messages will already be in memory in the failover node.
Around the same time, the client will receive a connection exception in the connection listener and assume the server has failed (should we retry on the current server first in case the failure was transitory and the server hasn't really failed? E.g. if someone temporarily disconnected the network cable).
If the client determines the server really has failed, it then tries another server based on its client-side load balancing policy. This server may not be the correct failover server for the failed node due to difficulties in synchronizing the client-side and server-side failover mappings.
In this case the server tells the client the locator of the correct server, and the client tries to connect there. This process continues until the client finds the correct server or a maximum number of tries is reached at which point it will give up.
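As a rough illustration of that redirect loop (a sketch only: the Cluster interface, redirectFor and the string locators are hypothetical stand-ins for whatever the real connect handshake returns):

```java
class FailoverRedirectSketch {

    /** Hypothetical view of the servers as seen by a reconnecting client. */
    interface Cluster {
        /**
         * Returns null if 'contacted' is the correct failover server for
         * 'failedNode', otherwise the locator of the server to try next.
         */
        String redirectFor(String contacted, String failedNode);
    }

    /**
     * Follows redirects starting from the server chosen by the client-side
     * load balancing policy. Returns the locator of the correct failover
     * server, or null if it was not found within maxTries attempts.
     */
    static String findFailoverServer(Cluster cluster, String firstGuess,
                                     String failedNode, int maxTries) {
        String current = firstGuess;
        for (int attempt = 0; attempt < maxTries; attempt++) {
            String redirect = cluster.redirectFor(current, failedNode);
            if (redirect == null) {
                return current; // this server owns the failed node's state
            }
            current = redirect; // the server told us where to go instead
        }
        return null; // give up after the maximum number of tries
    }
}
```

The cap on tries matters because the server-side mapping can itself change while the client is chasing redirects.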
When the client connects to the failover server the server failover process (reloading of queues) may not have completed yet. In this case the server will block the invocation until the failover is complete. (I.e. it won't send a response to the connect until it is complete).
Once the client has successfully connected to the correct server it then recreates any session, consumer, producer and browser objects that existed before failure for the failed connection.
It then sends a list of <message_id, destination> pairs corresponding to the unacked messages in each of the sessions on the connection that failed. Based on this, the server recreates the delivery list in the server consumer delegates.
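One way the server side of this could look, purely as a sketch: UnackedRef and rebuildDeliveries are hypothetical names, and grouping the pairs by destination is an assumption about how the delivery lists would be keyed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DeliveryRecoverySketch {

    /** Hypothetical <message_id, destination> pair sent by the client. */
    static final class UnackedRef {
        final long messageId;
        final String destination;

        UnackedRef(long messageId, String destination) {
            this.messageId = messageId;
            this.destination = destination;
        }
    }

    /**
     * Groups the unacked message ids by destination, preserving the order in
     * which the client reported them, so each destination's consumer can
     * rebuild its delivery list.
     */
    static Map<String, List<Long>> rebuildDeliveries(List<UnackedRef> unacked) {
        Map<String, List<Long>> byDestination = new LinkedHashMap<>();
        for (UnackedRef ref : unacked) {
            byDestination
                .computeIfAbsent(ref.destination, d -> new ArrayList<>())
                .add(ref.messageId);
        }
        return byDestination;
    }
}
```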
If the failed node is subsequently resurrected, it is not a simple matter to just move the connections back to the original node, since there may be unacknowledged messages in live sessions. If we move the connections, then any non-persistent messages might get redelivered.
Therefore we can only safely move back connections if there are no unacked messages in any sessions.
This is probably part of a bigger question of how we redistribute connections over many nodes when we suddenly add a lot of nodes to the cluster.
For the first pass we should probably not bother since this is tricky. However if we want to be able to automatically spread load smoothly and get benefits when adding new nodes to cluster with already created connections we should consider this.
We should also consider being able to bring down a node smoothly from the management console without losing sessions - i.e. move them transparently to another node. Again this is not a high priority but something to think about.
Any more thoughts?
(BTW We should probably put all of this in a wiki...)
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3977644#3977644
[Design of POJO Server] - Re: Need more metadata for the JARStructure deployer
by scott.stark@jboss.org
Another aspect that should be applied to the recognized classpath independent of the structure deployer is the VFSUtils.addManifestLocations behavior. The base classpath should be augmented via the classpath manifest after the structure is determined.
What if the StructureDeployer is updated to support an explicit StructureMetaData notion:
| public interface StructureDeployer
| {
| /**
| * Determine the structure of a deployment
| *
| * @param context the context
| * @param metaData the structure metadata to fill in
| * @return true when it is recognised
| */
| boolean determineStructure(DeploymentContext context, StructureMetaData metaData);
| ...
|
| public interface StructureMetaData
| {
| /** The paths of subdeployments relative to the root DeploymentContext */
| public String[] getDeploymentPaths();
| /** The paths of classpath entries relative to the root DeploymentContext */
| public String[] getClasspathFiles();
| }
|
and instead of the first structure deployer determining the structure, all deployers can interact to augment the StructureMetaData. Creating DeploymentContexts and filling out the classpath based on manifests, etc would be aspects of the MainDeployer rather than structural deployer behaviors.
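To make the "all deployers augment the metadata" idea concrete, here is a small sketch. Everything in it is hypothetical: the *Sketch names, the add* methods (the proposed StructureMetaData above only has getters), and the use of a plain String root path in place of DeploymentContext.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical mutable counterpart of the proposed StructureMetaData.
class StructureMetaDataSketch {
    private final Set<String> deploymentPaths = new LinkedHashSet<>();
    private final Set<String> classpathFiles = new LinkedHashSet<>();

    void addDeploymentPath(String path) { deploymentPaths.add(path); }
    void addClasspathFile(String path) { classpathFiles.add(path); }

    /** The paths of subdeployments relative to the root. */
    String[] getDeploymentPaths() { return deploymentPaths.toArray(new String[0]); }
    /** The paths of classpath entries relative to the root. */
    String[] getClasspathFiles() { return classpathFiles.toArray(new String[0]); }
}

// Stand-in for StructureDeployer; a String replaces DeploymentContext here.
interface StructureDeployerSketch {
    boolean determineStructure(String rootPath, StructureMetaDataSketch metaData);
}

class StructureChainSketch {
    /** Lets every deployer contribute instead of stopping at the first match. */
    static StructureMetaDataSketch determine(String rootPath,
                                             List<StructureDeployerSketch> deployers) {
        StructureMetaDataSketch metaData = new StructureMetaDataSketch();
        for (StructureDeployerSketch deployer : deployers) {
            deployer.determineStructure(rootPath, metaData);
        }
        return metaData;
    }
}
```

Using sets keeps the result stable when two deployers report the same classpath entry.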
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3977638#3977638