I did some work to make sure that LargeMessage sends are replicated to the backup
nodes.
In my initial work, I had a lot of problems getting the LargeMessage replicated and
sent on both nodes. I was always getting an out-of-credits condition on the backup
when replicating the message, which caused problems replicating the ACKs and failing
over properly to the backup.
Changes I had to make to fix the issues with LargeMessage and failover:
1) A simple fix to sendLargeMessage
| On ServerConsumerImpl::
|
| private void sendLargeMessage(final MessageReference ref, final ServerMessage message)
| {
|    // TODO: Should we block until the replication is done?
|    channel.replicatePacket(new SessionReplicateDeliveryMessage(id, message.getMessageID(), message.getDestination()));
|
|    // SendLargeMessage has to be done on the same thread used on the QueueImpl or
|    // we would have problems with ordering and flow control
|    largeMessageSender = new LargeMessageSender((LargeServerMessage)message, ref);
|    largeMessageSender.sendLargeMessage();
|
| }
|
The code above used to wait for the replication to finish before sending the LargeMessage,
which meant the send ran on another thread. The queue would continue its work asynchronously,
and the consumer would eventually handle another message while sendLargeMessage was still
processing, which caused problems on the backup node (not enough credits).
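To illustrate the ordering problem described above, here is a minimal sketch. It is not the JBoss Messaging code: DeliveryOrderSketch, DeferredExecutor and deliver are hypothetical names, and the deferred executor only models "another thread that gets scheduled later" in a deterministic way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Executor;

class DeliveryOrderSketch {
    /** Hypothetical executor: queues tasks and runs them only when asked,
     *  standing in for a second thread that is scheduled later. */
    static final class DeferredExecutor implements Executor {
        private final Queue<Runnable> pending = new ArrayDeque<>();

        @Override
        public void execute(Runnable task) {
            pending.add(task);
        }

        void runPending() {
            Runnable task;
            while ((task = pending.poll()) != null) {
                task.run();
            }
        }
    }

    /** Delivers one large message followed by one regular message
     *  and returns the order in which they were actually sent. */
    static List<String> deliver(boolean sendInline) {
        List<String> sent = new ArrayList<>();
        DeferredExecutor otherThread = new DeferredExecutor();
        Runnable sendLargeMessage = () -> sent.add("large message");

        if (sendInline) {
            // Same thread as the queue: the large message goes out first.
            sendLargeMessage.run();
        } else {
            // Handed to another thread: the queue carries on immediately.
            otherThread.execute(sendLargeMessage);
        }
        sent.add("next message");   // the queue handles the following message
        otherThread.runPending();   // the other thread finally gets to run
        return sent;
    }

    public static void main(String[] args) {
        System.out.println("other thread: " + deliver(false));
        System.out.println("same thread:  " + deliver(true));
    }
}
```

In the deferred case the next message overtakes the large message, which is the kind of reordering that left the backup without enough credits.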
2) When the backup takes credits, it will eventually resume sending the LargeMessage.
That has to happen synchronously, while the credit from the live node is being received.
If we play the commands in a different order, we would eventually reject messages during
replication, either because of missing credits or because we would still be processing a
LargeMessage.
promptDelivery now calls resumeLargeMessage if largeMessageSender != null, and this
is resumeLargeMessage:
| private void resumeLargeMessage()
| {
|    if (messageQueue.isBackup())
|    {
|       // We are supposed to finish largeMessageSender, or use all the possible
|       // credits, before we return from this method.
|       // If we play the commands in a different order than how they were generated
|       // on the live node, we will eventually still be running this largeMessage
|       // when the next message comes, which would reject messages from the cluster
|       largeMessageSender.resumeLargeMessageRunnable.run();
|    }
|    else
|    {
|       executor.execute(largeMessageSender.resumeLargeMessageRunnable);
|    }
| }
|
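The effect of running the resume inline on the backup can be sketched roughly as follows. This is an illustration under assumptions, not the real ServerConsumerImpl: the Consumer class, the one-credit-per-chunk accounting and the DeferredExecutor are all hypothetical. Passing backup = false here simply shows what a replay would look like if the resume were handed to an executor that runs later.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

class BackupReplaySketch {
    /** Hypothetical executor that runs queued tasks only on request. */
    static final class DeferredExecutor implements Executor {
        private final Queue<Runnable> pending = new ArrayDeque<>();

        @Override
        public void execute(Runnable task) {
            pending.add(task);
        }

        void runPending() {
            Runnable task;
            while ((task = pending.poll()) != null) {
                task.run();
            }
        }
    }

    /** Hypothetical consumer holding a half-sent large message. */
    static final class Consumer {
        private final boolean backup;
        private final Executor executor;
        private int credits;
        private int chunksLeft;

        Consumer(boolean backup, Executor executor, int chunksLeft) {
            this.backup = backup;
            this.executor = executor;
            this.chunksLeft = chunksLeft;
        }

        /** Replays a credit command and resumes the pending large message. */
        void receiveCredits(int amount) {
            credits += amount;
            if (chunksLeft > 0) {
                Runnable resume = this::sendChunksWhileCreditsLast;
                if (backup) {
                    resume.run();             // finish before the next command
                } else {
                    executor.execute(resume); // finish at some later point
                }
            }
        }

        // Assumption for the sketch: each chunk costs exactly one credit.
        private void sendChunksWhileCreditsLast() {
            while (chunksLeft > 0 && credits > 0) {
                chunksLeft--;
                credits--;
            }
        }

        /** Replays a delivery; rejected while a large message is in flight. */
        boolean deliverNext() {
            return chunksLeft == 0;
        }
    }

    /** Replays "credits, then next delivery" and reports if it was accepted. */
    static boolean nextDeliveryAccepted(boolean backup) {
        DeferredExecutor executor = new DeferredExecutor();
        Consumer consumer = new Consumer(backup, executor, 2);
        consumer.receiveCredits(2);
        boolean accepted = consumer.deliverNext();
        executor.runPending();
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("inline resume:   " + nextDeliveryAccepted(true));
        System.out.println("deferred resume: " + nextDeliveryAccepted(false));
    }
}
```

With the inline resume the large message is finished before the next replicated command is played, so the delivery is accepted; with the deferred resume it is rejected.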
3) While receiving LargeMessages, the client sends credits back as soon as each chunk
is received.
To do that, handleLargeMessageContinuation calls flowControl.
While flowControl is being called, the caller of handleLargeMessageContinuation holds a
lock on the ClientRemoteConsumer.
As a result, we get a deadlock if failover happens while flowControl is being
called from within handleLargeMessageContinuation.
I am now using an executor for flowControl when receiving LargeMessages. This way the
lock on the ClientRemoteConsumer is released while the flow control is sent back, so
failover can proceed outside of the locks.
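A rough sketch of that hand-off, assuming a ReentrantLock stands in for the consumer lock and a single-thread executor for the one used in the client. FlowControlHandoffSketch and its fields are hypothetical names; only handleLargeMessageContinuation and flowControl echo the method names mentioned above, and their bodies here are invented for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class FlowControlHandoffSketch {
    private final ReentrantLock consumerLock = new ReentrantLock();
    private final ExecutorService flowControlExecutor =
        Executors.newSingleThreadExecutor();

    /** Records whether credits were sent by a thread holding the lock. */
    final AtomicBoolean sentUnderLock = new AtomicBoolean(true);
    final AtomicInteger creditsSent = new AtomicInteger();

    /** Called with each chunk; the consumer lock is held while it runs. */
    void handleLargeMessageContinuation(byte[] chunk) {
        consumerLock.lock();
        try {
            final int credits = chunk.length;
            // Do not call flowControl here: hand it to the executor so the
            // credits go back from a thread that does not own the lock.
            flowControlExecutor.execute(() -> flowControl(credits));
        } finally {
            consumerLock.unlock();
        }
    }

    private void flowControl(int credits) {
        sentUnderLock.set(consumerLock.isHeldByCurrentThread());
        creditsSent.addAndGet(credits);
    }

    /** Stops the executor and waits for queued flow-control tasks. */
    void close() {
        flowControlExecutor.shutdown();
        try {
            flowControlExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Receives a single 10-byte chunk and returns the finished sketch. */
    static FlowControlHandoffSketch demo() {
        FlowControlHandoffSketch consumer = new FlowControlHandoffSketch();
        consumer.handleLargeMessageContinuation(new byte[10]);
        consumer.close();
        return consumer;
    }

    public static void main(String[] args) {
        FlowControlHandoffSketch consumer = demo();
        System.out.println("credits sent: " + consumer.creditsSent.get());
        System.out.println("sent under lock: " + consumer.sentUnderLock.get());
    }
}
```

Because flowControl runs on the executor thread, a failover that needs the consumer lock is not blocked by a credit packet that is still being written.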