[Design of Messaging on JBoss (Messaging/JBoss)] - Replication
by timfox
Currently JBM 2.0 uses server replication between a live and a backup server to keep the backup in a (quasi-)identical state, so that when a client fails over from the live to the backup server, the client's session(s) can be found in exactly the state they were left in on the live server and can be reattached, letting the client continue its operation 100% transparently.
The current implementation in TRUNK uses a single thread for replication which makes it easier to guarantee that any state changes on the backup are applied in the same order as live, but has a down side in that forcing everything to be single threaded destroys concurrency on the server, effectively pushing everything to a single core.
Recently I have been working on multi-threaded replication. This allows state changes to be applied on backup by many different threads so we solve the concurrency problem. However, we still have to ensure that state changes are applied globally in the same order on backup as live. This is tricky with multiple threads. The technique used is to note the acquisition of mutexes around shared data on the live node and when replicating we replicate this list of acquires too. On the backup node we create a special mutex which forces locks to be obtained in the same order as the list.
This is a complex problem to solve/implement. The current status is it's "more or less working" but not ready yet, and would probably take a significant amount of time to complete/debug fully etc.
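The ordered-mutex idea described above can be sketched roughly as follows. This is only an illustration of the technique under simplifying assumptions (a single lock, one acquire per thread), not the code in the branch; the names SequencedLock, appendOrder and SequencedLockDemo are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch, not the JBM code: a lock that is granted on the backup
// only in the order in which it was acquired on the live node.
final class SequencedLock {
    // Acquirer ids in live order, appended as replicated records arrive.
    private final Queue<Long> order = new ArrayDeque<>();
    private boolean held = false;

    synchronized void appendOrder(long acquirerId) {
        order.add(acquirerId);
        notifyAll();
    }

    // Blocks until the lock is free and acquirerId is next in the recorded order.
    synchronized void lock(long acquirerId) throws InterruptedException {
        while (held || order.isEmpty() || order.peek() != acquirerId) {
            wait();
        }
        order.remove();
        held = true;
    }

    synchronized void unlock() {
        held = false;
        notifyAll();
    }
}

final class SequencedLockDemo {
    // Starts one thread per acquirer in reverse order, then feeds the recorded
    // order to the lock and returns the order in which threads actually got it.
    static List<Long> replay(long[] recordedOrder) {
        SequencedLock lock = new SequencedLock();
        List<Long> applied = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = recordedOrder.length - 1; i >= 0; i--) {
            final long id = recordedOrder[i];
            Thread t = new Thread(() -> {
                try {
                    lock.lock(id);
                    try {
                        applied.add(id); // the "state change" guarded by the lock
                    } finally {
                        lock.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (long id : recordedOrder) {
            lock.appendOrder(id);
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while replaying", e);
            }
        }
        return applied;
    }
}
```

Calling SequencedLockDemo.replay(new long[] {3, 1, 2}) applies the changes in the order 3, 1, 2 regardless of how the threads are scheduled. A real implementation would have to deal with many locks, re-entrancy and records arriving late, which is where most of the difficulty lies.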
The replication code significantly complicates the server code, and all replication comes at a cost of latency - since you need to make sure each packet is replicated and received on the backup before returning results to the user.
Let's take a look at what other messaging systems do:
1) Weblogic JMS - they don't use server replication
2) Websphere MQ - they don't use server replication
3) Tibco EMS - they don't use server replication
4) ActiveMQ - has slow synchronous single threaded replication
5) SonicMQ - *does* have full server replication.
Really, only one of our significant competitors (SonicMQ) actually does server replication.
Most of them do failover via a shared store on a shared filesystem; any session state is lost.
Since most users use one of the above systems, which typically don't have server replication, it seems to me it can't be a critically important feature that's worth the cost (latency, performance). A non-replicating server is likely to be faster than a replicating server.
Therefore, what I am proposing is we remove full server replication from the JBM 2.0 server, since it's not worth the cost in terms of
a) Performance overhead
b) Maintainability difficulties
c) Hard work in implementing and debugging it.
These costs outweigh the small benefit of having 100% transparent failover.
If we remove full server replication, a client that detects server failure can still automatically fail over to the backup server and reconnect. The only difference is that the session state won't be there, so in a non-transacted session any messages or acks sent might not have actually reached the server, which could result in sent messages being lost or duplicates being delivered.
For a transacted session that has already sent messages or acks before failover occurs, we just need to flag the session as rollback-only; on commit, the commit will fail with a TransactionRolledBackException. The user would need to catch this and restart the transaction. In this way we can maintain the once-and-only-once delivery guarantee and never lose messages or get duplicates with transacted sessions.
AIUI, this is pretty much how the majority of the other messaging systems handle failover.
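From the user's side, the transacted case could look roughly like the sketch below. To keep it compilable without a JMS jar, it uses stand-in types: RolledBackException plays the role of javax.jms.TransactionRolledBackException, TxSession is a hypothetical minimal view of a transacted session, and FailoverOnceSession is a fake that simulates a single failover in the middle of a transaction. None of these names come from JBM.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for javax.jms.TransactionRolledBackException (assumption for this sketch).
class RolledBackException extends Exception {
    RolledBackException(String message) {
        super(message);
    }
}

// Hypothetical minimal view of a transacted session.
interface TxSession {
    void send(String message);
    void commit() throws RolledBackException;
}

final class RetryingSender {
    // Sends the batch in one transaction and redoes the whole transaction if
    // commit reports a rollback. Returns the number of attempts used.
    static int sendInTransaction(TxSession session, List<String> batch, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RolledBackException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            for (String message : batch) {
                session.send(message);
            }
            try {
                session.commit();
                return attempt;
            } catch (RolledBackException e) {
                last = e; // nothing was committed, so resending cannot duplicate
            }
        }
        throw new IllegalStateException("transaction kept rolling back", last);
    }
}

// Fake session: the first commit fails as if a failover had flagged the
// session as rollback-only; later commits succeed.
final class FailoverOnceSession implements TxSession {
    final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private boolean rollbackOnly = true;

    @Override
    public void send(String message) {
        pending.add(message);
    }

    @Override
    public void commit() throws RolledBackException {
        if (rollbackOnly) {
            rollbackOnly = false;
            pending.clear();
            throw new RolledBackException("failover occurred during the transaction");
        }
        committed.addAll(pending);
        pending.clear();
    }
}
```

With the fake session, sendInTransaction(session, List.of("a", "b"), 3) takes two attempts and leaves exactly "a" and "b" committed once each, which is the no-loss, no-duplicate behaviour described above.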
With no full server replication the user can choose two modes of HA:
1) Failover via a shared store residing on a shared file system. When the live fails the backup loads the journal, and clients can connect to it.
2) Replicated data store. We can replicate the data store from the live to the backup node so there's no need for a shared file system. Replicating the data store is a lot easier than replicating the entire server.
I'll park the MT replication code in a branch in case we want to revisit it in the future.
Thoughts?
View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4248387#4248387
16 years, 5 months
[Design of JBoss Web Services] - Re: JBossWS Deployers Integration
by richard.opalka@jboss.com
"alessio.soldano(a)jboss.com" wrote :
| Basically, we had an abstraction to support AS4.2 deployment in a similar way to AS 5.x; now most of that additional abstraction is gone, so why are the DAs still there? Can't we simply move each of them into a different deployer? (considering the WSDeploymentDeployer seems to me just something like a wrapper of the DAs)
|
Yes, DA abstraction is still there and will probably stay there ;)
There are two main reasons why we're wrapping JBossWS DAs:
1) We can't replace DAs with Deployers, otherwise our stacks would be tightly coupled with AS internals. To separate all supported SOAP stacks from the different AS versions we have the AS IL (JBoss AS Integration Layer), which does that abstraction job. Note that there are DAs specified in stack config files, and we want to keep them AS-agnostic.
2) We will need to implement Endpoint.publish() in the near future (this will use DAs via DAManager; DAManager will be used as the replacement for the deployers chain on the Java client side).
"alessio.soldano(a)jboss.com" wrote :
| I saw the comment, perhaps a brief summary of the infinite discussion with Ales should be added to the jira description :-) It's not that clear otherwise. Btw, what's the status of JBDEPLOY-201? I see it's marked as in progress, but it doesn't seem to me somebody is working on it...
| Thanks
IMHO the JBDEPLOY-201 title is self-descriptive. Regarding that issue, Ales Justin just wrote me:
Currently there is no rush for it.
Hence I don't see the point of cutting new release,
as I have already done them too many.
I rather wait and get any additional fixes in (e.g. recent 208, 209).
I'll at least try to port the changes to Branch_2_0 asap,
so you can easily run a 2.0.9-SNAPSHOT version to test your stuff.
I just asked him to port the changes to the JBDeploy Branch_2_0 branch so we can test the fix using 2.0.9-SNAPSHOT.
View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4248386#4248386
16 years, 5 months
[Design of JBoss Web Services] - Re: JBossWS Deployers Integration
by alessio.soldano@jboss.com
"richard.opalka(a)jboss.com" wrote : "alessio.soldano(a)jboss.com" wrote :
| | So, what's actually the reason for doing this, instead of having an actual separate deployer for each of the current deployment aspect, instead of doing the wrap process into WSDeploymentAspectDeployer? Wouldn't it be clearer?
| |
| The main reason is to enable other AS deployers to hook into WS deployers pipeline, see this.
Perhaps you're not getting me right; what I propose would make that even easier (as we'll end up with N real deployers instead of 4 deployers + 1 being instantiated multiple times according to the DeploymentAspect provided in the configuration).
Basically, we had an abstraction to support AS4.2 deployment in a similar way to AS 5.x; now most of that additional abstraction is gone, so why are the DAs still there? Can't we simply move each of them into a different deployer? (considering the WSDeploymentDeployer seems to me just something like a wrapper of the DAs)
anonymous wrote :
| "alessio.soldano(a)jboss.com" wrote :
| | So my question is, why is relativeOrder required for all the deployment aspects turning into WSDeploymentAspectDeployer instances?
| Comment in jboss-beans config files is self descriptive ;)
|
| <!-- [JBDEPLOY-201] workaround -->
|
| In short about JBDEPLOY-201.
| There's a bug in domino ordering (the deployers ordering implementation in AS) that we have to work around until a fix is available in AS. This bug appears if there are multiple inputs/outputs specified on deployers (this affects all WS deployers, except WSDescriptorDeployer).
|
I saw the comment, perhaps a brief summary of the infinite discussion with Ales should be added to the jira description :-) It's not that clear otherwise. Btw, what's the status of JBDEPLOY-201? I see it's marked as in progress, but it doesn't seem to me somebody is working on it...
Thanks
View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4248365#4248365
16 years, 5 months