[Design of Messaging on JBoss (Messaging/JBoss)] - Re: Client failover redeliveries discussion
by timfox
I thought we had gone over all this already, but here goes again....
"ovidiu.feodorov(a)jboss.com" wrote :
| A client may happen to be sending a message when the failure occurs. If the message is sent individually (not in the context of a transaction), then the on-going synchronous invocation is going to fail, the client code will catch the exception, and most likely will re-try to send the message,
|
What do you mean the "client code" will catch the exception?
Do you mean the application code?
If so, this is incorrect - JBoss Messaging failover is supposed to be automatic - this is one of our selling points.
Applications shouldn't have to catch connection exceptions and retry like in JBossMQ.
"Ovidiu" wrote :
| If the client happens to send a message in the context of a transaction when the failure occurs, we could either throw an exception, and discard everything, or go for the more elegant solution of transparently copying the transactional state (the corresponding TxState instance) into the new ResourceManager and send the messages over the new connection when the transaction commits. We're probably not doing this right now, but this is what we should be doing.
|
Same reasoning applies as previous comment. It should be transparent.
"Ovidiu" wrote :
| It is interesting to consider also what happens when the failure occurs right in the middle of "sentTransaction()" invocation.
|
Again same reasoning applies.
There is a problem here: the client may receive an exception and retry even though the transaction/send actually went through on the previous node.
Please see discussion on duplicated message detection in a previous thread for more information on this.
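As a rough illustration of the retry problem above, here is a minimal sketch of id-based duplicate detection. It is one common approach, not necessarily what the earlier thread settled on. The class `DuplicateDetectingQueue` and its methods are hypothetical, and the sketch assumes the set of seen ids is available on the node that takes over (for example, rebuilt from persisted messages).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a server-side queue; not a JBoss Messaging class.
final class DuplicateDetectingQueue {
    // Ids of messages already accepted. Assumed to survive failover.
    private final Set<String> seenIds = new HashSet<>();
    private int stored;

    // Returns true if the message was stored, false if the id was already
    // seen, which is what a client retry after failover would look like.
    synchronized boolean send(String messageId) {
        if (!seenIds.add(messageId)) {
            return false;
        }
        stored++;
        return true;
    }

    synchronized int stored() {
        return stored;
    }
}
```

With this in place a retried send carrying the same id is accepted once and ignored the second time, so the client can retry transparently without producing a duplicate.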
"Ovidiu" wrote :
|
| What is more interesting is what happens with the messages that are already in the MessageCallbackHandler's buffer.
|
| For a seamless fail-over, they will need to be transferred into the new MessageCallbackHandler's buffer. Also important, immediately after the failover condition is detected, any in-progress read should be completed, and no further reads should be accepted until the client-side fail-over is complete ("client side failover lockdown"). The next post-failover read should be done from the new MessageCallbackHandler's buffer.
|
There is no need to copy anything since Clebert is re-using the same connection, consumer, buffer objects before and after failover - he is just changing the ids.
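A minimal sketch of the "same objects, new ids" idea, under the assumption that the client-side consumer holds its buffer and a server-side id as plain fields. `ClientConsumer` and its methods are hypothetical names for illustration, not the actual JBoss Messaging classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical client-side consumer; not the real JBoss Messaging class.
final class ClientConsumer {
    // Messages already delivered to the client but not yet read by the application.
    private final Deque<String> buffer = new ArrayDeque<>();
    // Id of the server-side endpoint this consumer currently talks to.
    private int serverConsumerId;

    ClientConsumer(int serverConsumerId) {
        this.serverConsumerId = serverConsumerId;
    }

    synchronized void buffer(String message) {
        buffer.addLast(message);
    }

    // Returns the next buffered message, or null if the buffer is empty.
    synchronized String receive() {
        return buffer.pollFirst();
    }

    // On failover only the server-side id changes; the buffer is left untouched.
    synchronized void failover(int newServerConsumerId) {
        this.serverConsumerId = newServerConsumerId;
    }

    synchronized int serverConsumerId() {
        return serverConsumerId;
    }
}
```

Because the application keeps its reference to the same object, buffered messages stay readable across the failover without any copy step.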
"Ovidiu" wrote :
| Contrary to what has been discussed so far on this thread, I think we can also salvage non-persistent messages, with a minimum of effort. I'll address this issue again later. The acknowledgments for these messages (persistent and non-persistent) will be sent by the new Connection Delegate.
|
| We also have the acknowledgments accumulated in a transaction on the client-side. The case should be dealt with similarly to the way we handle transacted messages (copy the TxState instance).
|
Again, no copying is necessary - just re-use the same object.
"Ovidiu" wrote :
|
| Tim wrote :
| | Yes - we should send the ids of every persistent message as part of the failover protocol - the server then repopulates the delivery list in the server consumer endpoint
| |
|
| I think we can go a step further and also send the IDs of non-persistent messages that have been "failed-over" on the client side. This way, the client will continue to receive (and successfully acknowledge) non-persistent messages that otherwise would have been lost.
|
This makes no sense. When server A fails and server B takes over, only the persistent messages are resurrected into server B's queues.
The non persistent messages are lost.
Therefore it's not possible that the non persistent messages can be successfully acknowledged on server B, since server B won't know about them.
This is why I said the non persistent messages should be removed from the client state so they don't attempt to be acked.
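The client-side step described here can be sketched as follows, assuming each session keeps a list of unacknowledged messages tagged with a persistence flag. `UnackedMessage`, `SessionState` and `prepareForFailover` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical record of a delivered but not yet acknowledged message.
final class UnackedMessage {
    final long id;
    final boolean persistent;

    UnackedMessage(long id, boolean persistent) {
        this.id = id;
        this.persistent = persistent;
    }
}

// Hypothetical client-side session state; not a JBoss Messaging class.
final class SessionState {
    private final List<UnackedMessage> unacked = new ArrayList<>();

    void delivered(UnackedMessage message) {
        unacked.add(message);
    }

    // Drops non-persistent entries (the new server cannot know them) and
    // returns the ids of the persistent ones, to be sent to the new server.
    List<Long> prepareForFailover() {
        List<Long> persistentIds = new ArrayList<>();
        for (Iterator<UnackedMessage> it = unacked.iterator(); it.hasNext();) {
            UnackedMessage message = it.next();
            if (message.persistent) {
                persistentIds.add(message.id);
            } else {
                it.remove();
            }
        }
        return persistentIds;
    }

    int unackedCount() {
        return unacked.size();
    }
}
```

After this step the session can only ever acknowledge messages that server B is able to find.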
"Ovidiu" wrote :
| Tim wrote :
| | Clebert wrote :
| | | - Should we ignore ACKs for non existent messages on the server?
| | |
| | Non existent messages on the server will be non persistent messages that didn't survive the failover.
| | They should be removed from the client side list on failover so the acks will never get sent.
| |
|
| Not necessarily. See my above comment. We could also include the ids of non-persistent messages with the list of message IDs sent to the server as part of the failover protocol, and thus be able to "salvage" those messages as well. I don't see any problem if we do that. We get better fault tolerance.
|
How can you salvage a message that doesn't exist any more? The non persistent messages will have been lost when server A failed.
"Ovidiu" wrote :
| What about the "fail-over protocol"? Your statement above seems to assume that the new server node is called into without any "preparation", as would a completely new client that creates a new connection, session and consumer endpoints. This is not going to work, those server-side objects need to undergo a "post-failover" preparation phase, where deliveries for the client-side failed over messages are created and so forth.
|
Correct.
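A sketch of what that post-failover preparation could look like on the server side, assuming the client sends the ids of its unacknowledged messages and the new node has already reloaded the persistent messages. `ServerConsumerEndpointSketch` and its methods are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical server-side consumer endpoint on the node taking over.
final class ServerConsumerEndpointSketch {
    // Persistent messages reloaded from storage by the new node, keyed by id.
    private final Map<Long, String> queue = new LinkedHashMap<>();
    // Ids the new node considers "delivered, awaiting acknowledgment".
    private final List<Long> deliveries = new ArrayList<>();

    void load(long id, String body) {
        queue.put(id, body);
    }

    // Recreates deliveries for the ids the client reports as unacknowledged.
    // Returns the ids this node does not know, e.g. lost non-persistent messages.
    List<Long> recreateDeliveries(List<Long> clientUnackedIds) {
        List<Long> unknown = new ArrayList<>();
        for (Long id : clientUnackedIds) {
            if (queue.containsKey(id)) {
                deliveries.add(id);
            } else {
                unknown.add(id);
            }
        }
        return unknown;
    }

    // An acknowledgment only succeeds for a recreated delivery.
    boolean acknowledge(long id) {
        if (!deliveries.remove(Long.valueOf(id))) {
            return false;
        }
        queue.remove(id);
        return true;
    }
}
```

An id that is not in the reloaded queue has nothing to attach a delivery to, which is why sending non-persistent ids gains nothing.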
"Ovidiu" wrote :
| Tim wrote :
| | So to summarise:
| | ...
| | 3) Let the server "stall" you until server failover has completed
| | ...
| |
|
| What exactly does this mean?
|
When a server fails, the server side failover kicks in, and the server loads those queues which it is taking over responsibility for.
This may take several seconds, during which time we do not want a failed over client connection to start sending/consuming from those queues since they might receive a partial state.
Hence we need to stall the connection at reconnection until the server completes its failover protocol. I.e. a "valve".
This is covered in the wiki page I believe (like most of this stuff).
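The "valve" can be sketched with a latch, assuming the server closes it when failover starts and opens it once the queues are loaded. `FailoverValve` is a hypothetical name; the wiki design may differ in detail.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical valve that stalls reconnecting clients during server failover.
final class FailoverValve {
    // A latch with count 0 never blocks, so the valve starts open.
    private volatile CountDownLatch gate = new CountDownLatch(0);

    // Called when server-side failover starts: new arrivals will block.
    void close() {
        gate = new CountDownLatch(1);
    }

    // Called when the queues have been fully loaded: releases all waiters.
    void open() {
        gate.countDown();
    }

    // Blocks until the valve is open or the timeout elapses.
    // Returns true if the caller may proceed, false on timeout or interrupt.
    boolean awaitOpen(long timeoutMillis) {
        try {
            return gate.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A reconnecting client connection would call `awaitOpen` before creating consumers or sending, so it never sees a partially loaded queue.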
"Ovidiu" wrote :
| Tim wrote :
| | So to summarise:
| | ...
| | 5) Delete any non persistent messages from the client list of unacked messages in any sessions in the failed connection.
| | ...
| |
|
| Why? See my comment above. Why do you think "salvaging" non-persistent messages too isn't going to work?
|
Because the new server knows nothing about the non persistent messages, since they won't have survived the server failure.
"Ovidiu" wrote :
|
| Non-persistent message ids too.
|
No point doing that, for the reasons explained twice already in this thread.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3980937#3980937
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3980937
19 years, 5 months
[Design of Messaging on JBoss (Messaging/JBoss)] - Re: Server side HA and failover
by timfox
"clebert.suconic(a)jboss.com" wrote : I have *tried* to drive fail over from client side.
|
| Last week I could make a state transfer from one node to another, moving all the messages from a failed queue to a new queue.
|
As already discussed, moving messages is not an option since there may be 10s of millions of messages in each partial queue.
Also fully client driven failover won't work since we need to ensure JMS semantics with durable subscriptions - especially when doing in memory persistent message replication - this we have already discussed too.
"Clebert" wrote :
| There are implications of doing it this way, as we could have more than one local queue when a failover occurs.
|
| Say if you have two nodes A and B, each node with a client and a durable subscription on that client.
|
| Now say node A fails. The clients from node A will be redirected to node B, and the Routers will have to treat those connections as if they were still on node A, although they are now local on node B.
|
Why is this a problem?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3980934#3980934
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3980934
[Design of POJO Server] - Re: Structure deployer changes comitted to trunk
by scott.stark@jboss.org
I just did a clean build and only the webservice deployer is failing (as expected). Let me know if you're seeing other problems. You will need to update your thirdparty microcontainer snapshot to pick up the latest jboss-deployers.jar, as well as updating the server module to pick up the EARStructure changes.
| [starksm@succubus build]$ cd output/jboss-5.0.0.Beta/bin
| [starksm@succubus bin]$ run.sh
| =========================================================================
|
| JBoss Bootstrap Environment
|
| JBOSS_HOME: /home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta
|
| JAVA: /home/starksm/java/jrockit-jdk1.5.0_06/bin/java
|
| JAVA_OPTS: -Dprogram.name=run.sh -Xms128m -Xmx512m -Dorg.jboss.resolver.warning=true -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000
|
| CLASSPATH: /home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/bin/run.jar:/home/starksm/java/jrockit-jdk1.5.0_06/lib/tools.jar
|
| =========================================================================
|
| 17:00:29,482 INFO [ServerImpl] Starting JBoss (Microcontainer)...
| 17:00:29,498 INFO [ServerImpl] Release ID: JBoss [Morpheus] 5.0.0.Beta (build: CVSTag=HEAD date=200610251648)
| 17:00:29,498 INFO [ServerImpl] Home Dir: /home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta
| 17:00:29,499 INFO [ServerImpl] Home URL: file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/
| 17:00:29,503 INFO [ServerImpl] Library URL: file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/lib/
| 17:00:29,503 INFO [ServerImpl] Patch URL: null
| 17:00:29,504 INFO [ServerImpl] Server Name: default
| 17:00:29,506 INFO [ServerImpl] Server Home Dir: /home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default
| 17:00:29,507 INFO [ServerImpl] Server Home URL: file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/
| 17:00:29,508 INFO [ServerImpl] Server Data Dir: /home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/data
| 17:00:29,508 INFO [ServerImpl] Server Temp Dir: /home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/tmp
| 17:00:29,508 INFO [ServerImpl] Server Config URL: file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/conf/
| 17:00:29,509 INFO [ServerImpl] Server Library URL: file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/lib/
| 17:00:29,509 INFO [ServerImpl] Root Deployment Filename: jboss-service.xml
| 17:00:29,534 INFO [ServerImpl] Starting Microcontainer, bootstrapURL=file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/conf/deployer-beans.xml
| 17:00:30,710 INFO [ProfileImpl] Using profile root:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/bin/file:/home/svn/JBossHead/jboss-head/build/output/jboss-5.0.0.Beta/server/default/profile
| 17:00:31,348 INFO [ServerInfo] Java VM: BEA JRockit(R) R26.4.0-63-63688-1.5.0_06-20060626-2259-linux-x86_64,BEA Systems, Inc.
| 17:00:31,349 INFO [ServerInfo] OS-System: Linux 2.6.9-42.0.2.ELsmp,amd64
| 17:00:31,465 INFO [JMXKernel] Legacy JMX core initialized
| 17:00:33,549 INFO [WebService] Using RMI server codebase: http://succubus.starkinternational.com:8083/
| 17:00:33,855 INFO [NamingService] JNDI bootstrap JNP=/0.0.0.0:1099, RMI=/0.0.0.0:1098, backlog=50, no client SocketFactory, Server SocketFactory=class org.jboss.net.sockets.DefaultSocketFactory
| 17:00:34,343 INFO [SocketServerInvoker] Invoker started for locator: InvokerLocator [socket://192.168.2.101:4446/?dataType=invocation&enableTcpNoDelay=true&marshaller=org.jboss.invocation.unified.marshall.InvocationMarshaller&socketTimeout=600000&unmarshaller=org.jboss.invocation.unified.marshall.InvocationUnMarshaller]
| 17:00:35,183 INFO [Embedded] Catalina naming disabled
| 17:00:35,207 INFO [ClusterRuleSetFactory] Unable to find a cluster rule set in the classpath. Will load the default rule set.
| 17:00:35,210 INFO [ClusterRuleSetFactory] Unable to find a cluster rule set in the classpath. Will load the default rule set.
| 17:00:35,328 INFO [AprLifecycleListener] The Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /home/starksm/java/jrockit-jdk1.5.0_06/jre/lib/amd64/jrockit:/home/starksm/java/jrockit-jdk1.5.0_06/jre/lib/amd64:/home/starksm/java/jrockit-jdk1.5.0_06/jre/../lib/amd64
| 17:00:35,373 INFO [Http11Protocol] Initializing Coyote HTTP/1.1 on http-0.0.0.0-8080
| 17:00:35,374 INFO [AjpProtocol] Initializing Coyote AJP/1.3 on ajp-0.0.0.0-8009
| 17:00:35,374 INFO [Catalina] Initialization processed in 164 ms
| 17:00:35,376 INFO [StandardService] Starting service jboss.web
| 17:00:35,381 INFO [StandardEngine] Starting Servlet Engine: Apache Tomcat/2.0.0.dev
| 17:00:35,405 INFO [StandardHost] XML validation disabled
| 17:00:35,422 INFO [Catalina] Server startup in 48 ms
| 17:00:38,324 INFO [TomcatDeployment] deploy, ctxPath=/invoker, warUrl=.../deploy/http-invoker.sar/invoker.war/
| 17:00:38,561 INFO [WebappLoader] Dual registration of jndi stream handler: factory already defined
| 17:00:39,158 INFO [StandardContext] Container org.apache.catalina.core.ContainerBase.[jboss.web].[localhost].[/invoker] has already been started
| 17:00:39,175 INFO [SocketServerInvoker] Invoker started for locator: InvokerLocator [socket://192.168.2.101:3873/]
| 17:00:39,226 INFO [MailService] Mail Service bound to java:/Mail
| 17:00:39,707 INFO [TomcatDeployment] deploy, ctxPath=/web-console, warUrl=.../deploy/management/console-mgr.sar/web-console.war/
| 17:00:40,115 INFO [StandardContext] Container org.apache.catalina.core.ContainerBase.[jboss.web].[localhost].[/web-console] has already been started
| 17:00:40,856 INFO [TomcatDeployment] deploy, ctxPath=/jbossws, warUrl=.../deploy/jbossws.sar/jbossws-context.war
| 17:00:40,901 INFO [StandardContext] Container org.apache.catalina.core.ContainerBase.[jboss.web].[localhost].[/jbossws] has already been started
| 17:00:40,908 INFO [TomcatDeployment] deploy, ctxPath=/jmx-console, warUrl=.../deploy/jmx-console.war/
| 17:00:40,959 INFO [StandardContext] Container org.apache.catalina.core.ContainerBase.[jboss.web].[localhost].[/jmx-console] has already been started
| 17:00:40,976 INFO [RARDeployment] Required license terms exist, view .../deploy/jboss-ha-local-jdbc.rar!/META-INF/ra.xml
| 17:00:40,978 INFO [RARDeployment] Required license terms exist, view .../deploy/mail-ra.rar!/META-INF/ra.xml
| 17:00:40,996 INFO [RARDeployment] Required license terms exist, view .../deploy/jms/jms-ra.rar!/META-INF/ra.xml
| 17:00:41,019 INFO [RARDeployment] Required license terms exist, view .../deploy/quartz-ra.rar!/META-INF/ra.xml
| 17:00:41,101 INFO [SimpleThreadPool] Job execution threads will use class loader of thread: main
| 17:00:41,120 INFO [QuartzScheduler] Quartz Scheduler v.1.5.2 created.
| 17:00:41,124 INFO [RAMJobStore] RAMJobStore initialized.
| 17:00:41,125 INFO [StdSchedulerFactory] Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
| 17:00:41,125 INFO [StdSchedulerFactory] Quartz scheduler version: 1.5.2
| 17:00:41,128 INFO [QuartzScheduler] Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
| 17:00:41,129 INFO [RARDeployment] Required license terms exist, view .../deploy/jboss-xa-jdbc.rar!/META-INF/ra.xml
| 17:00:41,130 INFO [RARDeployment] Required license terms exist, view .../deploy/jboss-local-jdbc.rar!/META-INF/ra.xml
| 17:00:41,228 INFO [WrapperDataSourceService] Bound ConnectionManager 'jboss.jca:service=DataSourceBinding,name=DefaultDS' to JNDI name 'java:DefaultDS'
| 17:00:41,603 INFO [SimpleThreadPool] Job execution threads will use class loader of thread: main
| 17:00:41,604 INFO [QuartzScheduler] Quartz Scheduler v.1.5.2 created.
| 17:00:41,607 INFO [JobStoreCMT] Using db table-based data access locking (synchronization).
| 17:00:41,725 INFO [JobStoreCMT] Removed 0 Volatile Trigger(s).
| 17:00:41,725 INFO [JobStoreCMT] Removed 0 Volatile Job(s).
| 17:00:41,732 INFO [JobStoreCMT] JobStoreCMT initialized.
| 17:00:41,732 INFO [StdSchedulerFactory] Quartz scheduler 'JBossEJB3QuartzScheduler' initialized from an externally provided properties instance.
| 17:00:41,732 INFO [StdSchedulerFactory] Quartz scheduler version: 1.5.2
| 17:00:41,740 INFO [JobStoreCMT] Freed 0 triggers from 'acquired' / 'blocked' state.
| 17:00:41,751 INFO [JobStoreCMT] Recovering 0 jobs that were in-progress at the time of the last shut-down.
| 17:00:41,751 INFO [JobStoreCMT] Recovery complete.
| 17:00:41,751 INFO [JobStoreCMT] Removed 0 'complete' triggers.
| 17:00:41,752 INFO [JobStoreCMT] Removed 0 stale fired job entries.
| 17:00:41,755 INFO [QuartzScheduler] Scheduler JBossEJB3QuartzScheduler_$_NON_CLUSTERED started.
| 17:00:41,951 INFO [RARDeployment] Required license terms exist, view .../deploy/jboss-ha-xa-jdbc.rar!/META-INF/ra.xml
| 17:00:41,999 INFO [Ejb3Deployment] EJB3 deployment time took: 20
| 17:00:42,071 INFO [Ejb3Deployment] EJB3 deployment time took: 71
| 17:00:42,071 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,074 INFO [Ejb3Deployment] EJB3 deployment time took: 2
| 17:00:42,075 INFO [Ejb3Deployment] EJB3 deployment time took: 1
| 17:00:42,085 INFO [Ejb3Deployment] EJB3 deployment time took: 9
| 17:00:42,096 INFO [Ejb3Deployment] EJB3 deployment time took: 10
| 17:00:42,098 INFO [Ejb3Deployment] EJB3 deployment time took: 2
| 17:00:42,099 INFO [Ejb3Deployment] EJB3 deployment time took: 1
| 17:00:42,118 INFO [Ejb3Deployment] EJB3 deployment time took: 19
| 17:00:42,121 INFO [Ejb3Deployment] EJB3 deployment time took: 3
| 17:00:42,121 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,122 INFO [Ejb3Deployment] EJB3 deployment time took: 1
| 17:00:42,159 INFO [Ejb3Deployment] EJB3 deployment time took: 37
| 17:00:42,195 INFO [Ejb3Deployment] EJB3 deployment time took: 36
| 17:00:42,405 INFO [Ejb3Deployment] EJB3 deployment time took: 210
| 17:00:42,405 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,772 INFO [Ejb3Deployment] EJB3 deployment time took: 367
| 17:00:42,783 INFO [Ejb3Deployment] EJB3 deployment time took: 11
| 17:00:42,791 INFO [Ejb3Deployment] EJB3 deployment time took: 8
| 17:00:42,797 INFO [Ejb3Deployment] EJB3 deployment time took: 6
| 17:00:42,814 INFO [Ejb3Deployment] EJB3 deployment time took: 17
| 17:00:42,814 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,816 INFO [Ejb3Deployment] EJB3 deployment time took: 2
| 17:00:42,816 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,818 INFO [Ejb3Deployment] EJB3 deployment time took: 1
| 17:00:42,818 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,821 INFO [Ejb3Deployment] EJB3 deployment time took: 3
| 17:00:42,838 INFO [Ejb3Deployment] EJB3 deployment time took: 17
| 17:00:42,846 INFO [Ejb3Deployment] EJB3 deployment time took: 7
| 17:00:42,871 INFO [Ejb3Deployment] EJB3 deployment time took: 25
| 17:00:42,871 INFO [Ejb3Deployment] EJB3 deployment time took: 0
| 17:00:42,873 INFO [Ejb3Deployment] EJB3 deployment time took: 2
| 17:00:42,883 ERROR [ProfileServiceBootstrap] Failed to load profile: Summary of incomplete deployments (SEE PREVIOUS ERRORS FOR DETAILS):
|
| *** CONTEXTS MISSING DEPENDENCIES: Name -> Dependency{Required State:Actual State}
|
| jboss.ws:service=DeployerInterceptorEJB21
| -> jboss.ejb:service=EJBDeployer{Create:** NOT FOUND **}
| -> jboss.ejb:service=EJBDeployer{Start:** NOT FOUND **}
|
| jboss.ws:service=DeployerInterceptorEJB3
| -> jboss.ejb3:service=EJB3Deployer{Create:** NOT FOUND **}
| -> jboss.ejb3:service=EJB3Deployer{Start:** NOT FOUND **}
|
| jboss.ws:service=DeployerInterceptorNestedJSE
| -> jboss.ws:service=WebServiceDeployerJSE{Start:Configured}
| -> jboss.ws:service=WebServiceDeployerJSE{Create:Configured}
|
| jboss.ws:service=WebServiceDeployerJSE
| -> jboss.web:service=WebServer{Create:** NOT FOUND **}
| -> jboss.web:service=WebServer{Start:** NOT FOUND **}
|
|
| *** CONTEXTS IN ERROR: Name -> Error
|
| jboss.web:service=WebServer -> ** NOT FOUND **
|
| jboss.ejb3:service=EJB3Deployer -> ** NOT FOUND **
|
| jboss.ejb:service=EJBDeployer -> ** NOT FOUND **
|
|
| 17:00:42,897 INFO [Http11Protocol] Starting Coyote HTTP/1.1 on http-0.0.0.0-8080
| 17:00:42,920 INFO [AjpProtocol] Starting Coyote AJP/1.3 on ajp-0.0.0.0-8009
| 17:00:42,924 INFO [ServerImpl] JBoss (Microcontainer) [5.0.0.Beta (build: CVSTag=HEAD date=200610251648)] Started in 13s:405ms
|
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3980874#3980874
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3980874
[Design of POJO Server] - Structure deployer changes comitted to trunk
by scott.stark@jboss.org
I just updated the EARStructure deployer to match the refactored StructureDeployer interface. I also split this into an EARStructure, AppParsingDeployer(application.xml) and JBossAppParsingDeployer(jboss-app.xml).
| <bean name="EARStructureDeployer" class="org.jboss.deployment.EARStructure">
|    <install bean="MainDeployer" method="addStructureDeployer">
|       <parameter>
|          <this/>
|       </parameter>
|    </install>
|    <uninstall bean="MainDeployer" method="removeStructureDeployer">
|       <parameter>
|          <this/>
|       </parameter>
|    </uninstall>
| </bean>
|
| <bean name="AppParsingDeployer" class="org.jboss.deployment.AppParsingDeployer">
|    <install bean="MainDeployer" method="addDeployer">
|       <parameter>
|          <this/>
|       </parameter>
|    </install>
|    <uninstall bean="MainDeployer" method="removeDeployer">
|       <parameter>
|          <this/>
|       </parameter>
|    </uninstall>
| </bean>
|
| <bean name="JBossAppParsingDeployer" class="org.jboss.deployment.JBossAppParsingDeployer">
|    <install bean="MainDeployer" method="addDeployer">
|       <parameter>
|          <this/>
|       </parameter>
|    </install>
|    <uninstall bean="MainDeployer" method="removeDeployer">
|       <parameter>
|          <this/>
|       </parameter>
|    </uninstall>
| </bean>
|
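For readers unfamiliar with the install/uninstall elements, the following sketch shows roughly what they amount to in plain Java: the bean passes itself (`<this/>`) to a method on MainDeployer when it is installed, and to the matching remove method when it is uninstalled. All types here are hypothetical stand-ins, not the real org.jboss.deployment classes or the microcontainer API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a structure deployer.
interface StructureDeployerSketch {
    String name();
}

// Hypothetical stand-in for the MainDeployer bean.
final class MainDeployerSketch {
    private final List<StructureDeployerSketch> structureDeployers = new ArrayList<>();

    // Corresponds to <install bean="MainDeployer" method="addStructureDeployer">.
    void addStructureDeployer(StructureDeployerSketch deployer) {
        structureDeployers.add(deployer);
    }

    // Corresponds to <uninstall bean="MainDeployer" method="removeStructureDeployer">.
    void removeStructureDeployer(StructureDeployerSketch deployer) {
        structureDeployers.remove(deployer);
    }

    int structureDeployerCount() {
        return structureDeployers.size();
    }
}

// Hypothetical stand-in for the EARStructure bean.
final class EarStructureSketch implements StructureDeployerSketch {
    @Override
    public String name() {
        return "EARStructure";
    }
}
```

In the XML the container makes these two calls at the appropriate lifecycle points, so the deployer beans do not need to reference MainDeployer themselves.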
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3980870#3980870
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3980870